Sensitivity of Calibrated Parameters and Water Resource Estimates on Different Objective Functions and Optimization Algorithms

Abstract: The successful application of hydrological models relies on careful calibration and uncertainty analysis. However, there are many different calibration/uncertainty analysis algorithms, and each can be run with different objective functions. In this paper, we highlight the fact that each combination of optimization algorithm and objective function may lead to a different set of optimum parameters while achieving the same performance; this makes the interpretation of the dominant hydrological processes in a watershed highly uncertain. We used three different optimization algorithms (SUFI-2, GLUE, and PSO) and eight different objective functions (R², bR², NSE, MNS, RSR, SSQR, KGE, and PBIAS) in a SWAT model to calibrate the monthly discharges of two watersheds in Iran. The results show that all three algorithms, using the same objective function, produced acceptable calibration results, however with significantly different parameter ranges. Similarly, a single algorithm using different objective functions also produced acceptable calibration results, but with different parameter ranges. The different calibrated parameter ranges consequently resulted in significantly different water resource estimates. Hence, the parameters and the outputs that they produce in a calibrated model are "conditioned" on the choices of the optimization algorithm and objective function. This adds another level of non-negligible uncertainty to watershed models, calling for more attention and investigation in this area.


Introduction
Distributed hydrologic models are useful tools for the simulation of hydrologic processes, the planning and management of water resources, the investigation of water quality, and the prediction of the impacts of climate and landuse changes worldwide [1][2][3][4][5]. The successful application of hydrologic models, however, depends on proper calibration/validation and uncertainty analysis [6].
Process-based distributed hydrologic models are generally characterized by a large number of parameters, which are often not measurable and must be calibrated. Calibration is performed by carefully selecting values for the model input parameters (within their respective uncertainty ranges) and by comparing the model simulations (outputs) for a given set of assumed conditions with observed data for the same conditions [7].
Hydrological model predictions are affected by four sources of error, leading to uncertainties in the results of the model. These are: (1) input errors (e.g., errors in rainfall, the landuse map, and pollutant source inputs); (2) model structure/model hypothesis errors (e.g., errors and simplifications in the description of physical processes); (3) errors in the observations used to calibrate/validate the model (e.g., errors in measured discharge and sediment); and (4) errors in the parameters, which arise from a lack of knowledge of the parameters at the scale of interest (e.g., hydraulic conductivity, Soil Conservation Service (SCS) curve number). These sources of error are commonly acknowledged in many studies (e.g., Montanari et al. [8]).
Over the years, a variety of optimization algorithms have been developed for calibration and uncertainty analysis, such as the Generalized Likelihood Uncertainty Estimation method (GLUE) [9], the Sequential Uncertainty Fitting procedure (SUFI-2) [10], Parameter Solution (ParaSol) [11], and Particle Swarm Optimization (PSO) [12,13]. Although these algorithms differ in their search strategies, their goal is to find the set of best parameter ranges satisfying a desired threshold assigned to an objective function. Furthermore, many objective functions have also been developed and are in common usage, such as the Nash-Sutcliffe efficiency (NSE) [14], the ratio of the root mean square error (RMSE) to the standard deviation of the observations (RSR) [15], and the Kling-Gupta efficiency (KGE) [16], to name just a few.
A comparison of the performance of hydrological models under different optimization algorithms [17][18][19][20] and objective functions [21,22] has been the subject of some scrutiny in the literature. Examples of this are the work of Arsenault et al. [19], who compared ten optimization algorithms in terms of method performance with respect to model complexity, basin type, convergence speed, and computing power for three hydrological models. Wu and Chen [20] compared three calibration methods (SUFI-2, GLUE, and ParaSol) within the same modeling framework and showed that SUFI-2 was able to provide more reasonable and balanced predictive results than GLUE and ParaSol. Wu and Liu [21] examined four potential objective functions and suggested SAR as a reasonable choice. In a more comprehensive study, Muleta [22] examined the sensitivity of model performance to nine widely used objective functions in an automated calibration procedure. Less attention, however, has been paid to the optimized parameter values obtained under different optimization algorithms and objective functions, and to their impact on the interpretation of the hydrological processes in the studied watersheds.
In this study, we examine the sensitivity of optimized model parameters to different optimization algorithms and objective functions, as well as their impacts on the calculation of water resources in two different watersheds in Iran. The current paper focuses on the GLUE, SUFI-2, and PSO algorithms and the objective functions R², bR², NSE, MNS, RSR, SSQR, KGE, and PBIAS (see Table 1 for definitions of these functions). To achieve our objectives, we used the Soil and Water Assessment Tool (SWAT) [23] in the Salman Dam Basin (SDB) and the Karkheh River Basin (KRB). For model calibration, we used SWAT-CUP [24], which couples five optimization algorithms to SWAT and allows the use of different objective functions for the SUFI-2 and PSO algorithms.
Table 1 notes: * R is the correlation coefficient between the observed and simulated data; b is the slope of the regression line between the observed and simulated data; Q_i,o and Q_i,s are the ith observed and simulated values, respectively; Q̄_o and Q̄_s are the mean observed and simulated values, respectively; n is the total number of observations; α = σ_s/σ_m and β = µ_s/µ_m, where σ_m and σ_s are the standard deviations of the observed and simulated data, respectively, and µ_m and µ_s are the means of the observed and simulated data, respectively.

Hydrologic Model SWAT
SWAT is a process-based, spatially distributed, and time-continuous model. The program is open source and is commonly applied to quantify the impacts of landuse and climate change, as well as the impacts of different watershed management activities, on hydrology, sediment movement, and water quality. SWAT operates by spatially dividing the watershed into multiple sub-basins using digital elevation data. Each sub-basin is further discretized into hydrologic response units (HRUs), which consist of uniform soil, landuse, management, and topographical classes. More information on SWAT can be found in Arnold et al. [7] and Neitsch et al. [28].

Calibration/Uncertainty Analysis Programs
SUFI-2 is a semi-automated approach used for calibration, validation, and sensitivity and uncertainty analysis. In SUFI-2, all sources of uncertainty are assigned to the parameters. The uncertainty in the input parameters is described by uniform distributions, while the model output uncertainty is quantified by the 95% prediction uncertainty (95PPU), determined at the 2.5% and 97.5% levels of the cumulative distribution of the output variables obtained through Latin hypercube sampling.
Two indices quantify the model's goodness-of-fit and uncertainty: the p-factor and the d-factor. The p-factor is the percentage of observed data bracketed by the 95% prediction uncertainty (95PPU), while the d-factor is the average thickness of the 95PPU band divided by the standard deviation of the observed data. In the ideal situation, where the simulation exactly matches the observed data, the p-factor and d-factor tend to 100% and 0, respectively, but these values cannot be achieved in real cases due to errors from different sources. A wide uncertainty band (large d-factor) can lead to a large p-factor, but SUFI-2 seeks to bracket most of the measured data with the smallest possible uncertainty band (d-factor) [24].
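The two indices above can be sketched directly from their definitions. This is an illustrative implementation, not code from SWAT-CUP; the variable names and the synthetic ensemble (rows = model runs, columns = time steps) are assumptions for the example.

```python
import numpy as np

def p_and_d_factor(sims, obs):
    # 95PPU band: 2.5% and 97.5% percentiles of the simulated ensemble
    lower = np.percentile(sims, 2.5, axis=0)
    upper = np.percentile(sims, 97.5, axis=0)
    # p-factor: fraction of observations bracketed by the 95PPU band
    p = np.mean((obs >= lower) & (obs <= upper))
    # d-factor: mean band thickness normalized by the std. dev. of observations
    d = np.mean(upper - lower) / np.std(obs)
    return p, d

rng = np.random.default_rng(0)
obs = rng.normal(10.0, 2.0, size=120)                # e.g., 120 monthly discharges
sims = obs + rng.normal(0.0, 1.0, size=(500, 120))   # 500 sampled model runs
p, d = p_and_d_factor(sims, obs)
```

A good calibration corresponds to a p-factor close to 1 together with a d-factor well below the band width implied by the prior parameter ranges.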
Unlike methods in which a global optimum parameter set is sought and any assessment of parameter uncertainty is made with respect to that global optimum, GLUE relies on the output of numerous Monte Carlo simulations. In GLUE, all sources of uncertainty (i.e., input uncertainty, structural uncertainty, and response uncertainty) are also accounted for by the parameter uncertainty. The method is based on the concept of non-uniqueness, which means that different parameter sets can produce equally good and acceptable model predictions due to the interactions of different parameters. This concept rejects the idea of a unique global optimum parameter set. The objective of GLUE is to identify a set of behavioral models within the universe of possible model/parameter combinations. The term "behavioral" signifies models that are judged to be "acceptable" on the basis of the available data. In this method, a large number of model runs are performed with different randomly chosen parameter values selected from prior parameter distributions. To quantify how well a parameter combination simulates the real system, a likelihood value is assigned to each set of parameter values by comparing the predicted simulation with the observed data. This value is then compared to an arbitrarily selected cutoff threshold, and each parameter set that leads to a likelihood value less than the threshold is discarded from further consideration [29].
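The GLUE procedure above can be sketched with a toy model in place of the hydrological model. The linear model, the uniform priors, and the use of NSE as the informal likelihood measure are all assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
obs = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=x.size)  # synthetic "observations"

def nse(sim, obs):
    # Nash-Sutcliffe efficiency, used here as the informal likelihood measure
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

threshold = 0.5   # arbitrary behavioral cutoff
n_runs = 5000

# Monte Carlo sampling from prior (uniform) parameter distributions
a = rng.uniform(0.0, 4.0, n_runs)
b = rng.uniform(-2.0, 4.0, n_runs)

likelihoods = np.array([nse(ai * x + bi, obs) for ai, bi in zip(a, b)])
behavioral = likelihoods >= threshold          # keep "acceptable" parameter sets
a_range = (a[behavioral].min(), a[behavioral].max())
```

The spread of `a_range` illustrates non-uniqueness: many distinct parameter sets clear the behavioral threshold, rather than a single global optimum.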
PSO is a population-based (swarm-based) statistical optimization technique inspired by the social behavior of bird flocking and fish schooling. PSO is initialized with a group of random particles (solutions) that move through the search space in search of optima. During the optimization process, PSO generates the positions of the particles (coordinates in the parameter space) and their velocities (step changes in that space), and then updates the velocity of each particle using information from the best solution the particle has achieved so far and the global best solution obtained by all particles. The new position of each particle is calculated by updating its current position with the velocity vector [12].
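The PSO update rule described above can be sketched as follows, here minimizing a toy objective f(x) = Σx². The coefficients w, c1, and c2 are common textbook defaults, not values from SWAT-CUP.

```python
import numpy as np

rng = np.random.default_rng(2)
n_particles, n_dims = 30, 2
w, c1, c2 = 0.7, 1.5, 1.5          # inertia, cognitive, and social coefficients

pos = rng.uniform(-5.0, 5.0, (n_particles, n_dims))  # particle positions
vel = np.zeros_like(pos)                             # particle velocities

def f(x):
    return np.sum(x ** 2, axis=-1)

pbest = pos.copy()                   # best position each particle has seen
pbest_val = f(pos)
gbest = pbest[np.argmin(pbest_val)]  # best position seen by the whole swarm

for _ in range(100):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    # velocity update: inertia + pull toward personal best + pull toward global best
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    val = f(pos)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmin(pbest_val)]
```

In a calibration setting, `f` would be replaced by a model run returning, e.g., 1 − NSE, and the positions would be the model parameters.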

Objective Function
To assess the impact of the objective function on the calibration results and the final sensitive parameter ranges, we used the following widely used functions: the coefficient of determination (R²), modified R² (bR²), Nash-Sutcliffe efficiency (NSE), modified Nash-Sutcliffe efficiency (MNS), ratio of the root mean square error to the standard deviation of the observations (RSR), ranked sum of squares (SSQR), Kling-Gupta efficiency (KGE), and percent bias (PBIAS). The formulations of these eight objective functions are presented in Table 1.
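As a sketch, four of these criteria (NSE, RSR, PBIAS, and KGE) can be implemented from their standard formulations; these are textbook definitions (with one common sign convention for PBIAS), not code from SWAT-CUP, and the remaining functions follow analogously.

```python
import numpy as np

def nse(sim, obs):
    # Nash-Sutcliffe efficiency: 1 minus normalized sum of squared errors
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rsr(sim, obs):
    # RMSE divided by the standard deviation of the observations
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    return rmse / np.std(obs)

def pbias(sim, obs):
    # percent bias; positive values indicate underestimation in this convention
    return 100.0 * np.sum(obs - sim) / np.sum(obs)

def kge(sim, obs):
    # Kling-Gupta efficiency from correlation, variability ratio, and bias ratio
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([3.0, 5.0, 9.0, 6.0, 4.0, 7.0])  # illustrative discharge series
```

A perfect simulation gives NSE = KGE = 1 and RSR = PBIAS = 0, which is why the behavioral thresholds later in the paper are phrased as NSE ≥ 0.5 and similar.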

Case Studies
The first case study is the Salman Dam Basin (SDB), located in the arid regions of south-central Iran. This region includes the watershed upstream of the Salman Farsi Dam (Figure 1a). The area of the SDB is approximately 13,000 km², with geographic coordinates of 28°26′ N to 29°47′ N and 51°55′ E to 54°19′ E. The elevation of the basin ranges from less than 800 m above sea level in the south to more than 3100 m in the north. The main river in the SDB is the Ghareh-Aghaj, with an annual average discharge of 18 m³ s⁻¹. The average annual precipitation is less than 250 mm year⁻¹ in the central and southern parts of the watershed and more than 750 mm year⁻¹ in the northwest.
The second case study is the Karkheh River Basin (KRB), a highly studied basin of the Challenge Program on Water and Food [30], located in western Iran. The KRB covers an area of 51,000 km² and lies between 30° N to 35° N and 46° E to 49° E, with elevations ranging from below mean sea level to more than 3600 m (Figure 1b). The Karkheh River is the third longest river in Iran, with an annual average discharge of 188 m³ s⁻¹ [31]. The climate is semi-arid in the uplands (north) and arid in the lowlands (south). Precipitation exhibits large spatial and temporal variability. The mean annual precipitation is about 450 mm year⁻¹, ranging from 150 mm year⁻¹ in the lower arid plains to 750 mm year⁻¹ in the upper mountainous parts [31]. A large multi-purpose earthen embankment dam, the Karkheh Dam, was built on the river and has been in operation since 2001 to supply irrigation water to the Khuzestan plains (in the lower Karkheh region) and to provide hydropower generation and flood control. Management information on the Karkheh reservoir operation (i.e., the minimum and maximum daily outflow, reservoir surface area, and spillway conditions) was incorporated in the SWAT model of the KRB [3].

SDB and KRB Models
For this study, we used ArcSWAT 2012 with ArcGIS (ESRI, version 10.2.2). The input data and their sources are listed in Table 2. The SDB and KRB watersheds were discretized into 184 and 333 sub-basins, respectively. The sub-basins were further subdivided into 1115 and 3002 homogeneous hydrologic response units (HRUs), respectively, by fixing threshold values of 5% for landuse and 10% for soil type. With these thresholds, soils and landuses covering smaller areas than their respective thresholds were integrated into the larger soil and landuse classes by an area-weighted scheme.

Water 2017, 9, 384

Table 2. Data description and sources used in the SWAT projects.

Digital Elevation Map (DEM) (90 m resolution): Shuttle Radar Topography Mission (SRTM, NASA) [32]
Soil data (10 km resolution): Food and Agriculture Organization of the United Nations [33]
Landuse data: satellite images (IRS-P6 LISS-IV and IRS-P5 Pan satellite images, and Landsat ETM+ 2001)
Weather data (minimum and maximum daily air temperature and daily precipitation): Iranian Ministry of Energy, the Iranian Meteorological Organization, and WFDEI_CRU data (0.5° × 0.5°)

The Hargreaves method [34] was used to simulate the potential evapotranspiration (PET). The maximum transpiration and soil evaporation values were then calculated in SWAT using an approach similar to Ritchie [35], where soil evaporation is estimated using exponential functions of soil depth and water content based on the PET and a soil cover index based on the aboveground biomass. Plant transpiration is simulated as a linear function of the PET, leaf area index, root depth, and soil water content [36]. The modified SCS curve number method was used to calculate surface runoff, and the variable storage routing method was used for flow routing.
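The Hargreaves PET estimate mentioned above can be sketched in its Hargreaves-Samani form, PET = 0.0023 · Ra · (Tmean + 17.8) · √(Tmax − Tmin), with Ra the extraterrestrial radiation expressed as equivalent evaporation (mm day⁻¹). The Ra value below is an illustrative placeholder, not computed from latitude and day of year as SWAT does internally.

```python
import math

def hargreaves_pet(t_max, t_min, ra_mm_per_day):
    # temperatures in deg C; Ra as equivalent evaporation in mm/day
    t_mean = 0.5 * (t_max + t_min)
    return 0.0023 * ra_mm_per_day * (t_mean + 17.8) * math.sqrt(t_max - t_min)

# a hot summer day in an arid basin (values are assumptions for illustration)
pet = hargreaves_pet(t_max=32.0, t_min=18.0, ra_mm_per_day=15.0)  # mm/day
```

The method's appeal here is that it needs only daily minimum and maximum air temperature, matching the weather inputs listed in Table 2.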

Monthly discharges in the SDB model were calibrated from 1990 to 2008 and validated from 1977 to 1989; in the KRB, the corresponding periods were 1988-2012 and 1980-1987, respectively. Daily observed flows from four (SDB) and eight (KRB) river discharge stations were used (Figure 1). A three-year warm-up period was used to account for the initial conditions.
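The curve number method used above for surface runoff can be sketched in its standard (unmodified) form: Q = (P − Ia)² / (P − Ia + S) for P > Ia, with retention S = 25400/CN − 254 (mm) and initial abstraction Ia = 0.2 S. SWAT's "modified" variant additionally adjusts CN for antecedent soil moisture; that step is omitted in this sketch.

```python
def scs_runoff(p_mm, cn):
    # potential maximum retention (mm) from the curve number
    s = 25400.0 / cn - 254.0
    ia = 0.2 * s                  # initial abstraction (mm)
    if p_mm <= ia:
        return 0.0                # all rainfall abstracted, no runoff
    return (p_mm - ia) ** 2 / (p_mm - ia + s)

# 50 mm storm on a moderately permeable catchment (CN = 75); values illustrative
q = scs_runoff(p_mm=50.0, cn=75)
```

This is why CN2 appears as a calibration parameter throughout the paper: the runoff response is strongly nonlinear in CN.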
To calibrate the model, we initially selected 15 parameters for SDB based on a preliminary one-at-a-time sensitivity analysis, and nine parameters for KRB based on previous research [3,37]. Ashraf Vaghefi et al. [3] used SUFI-2 to model the KRB and identified nine parameters based on their global sensitivity analysis (Table 3). As one-at-a-time analysis is quite limited due to the large interactions between parameters, we initially used a large number of parameters for further analysis. To evaluate the impact of the various optimization algorithms and objective functions on the final calibrated parameter ranges, we calibrated each model using the same initial parameter ranges (Table 3) and followed the calibration protocol presented by Abbaspour et al. [6]. Initially, the snow parameters were fitted separately and their values were fixed to avoid identifiability problems with the other parameters. Like rainfall, snow melt is a driving variable, and its parameters should not be calibrated simultaneously with the other model parameters. (Table 3 notes: * r_ refers to a relative change, where the current parameter value is multiplied by (1 plus a factor in the given range); ** v_ refers to the substitution of a parameter value by another value in the given range [24].)
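The r_ and v_ conventions defined above can be expressed compactly; the parameter names and values below are illustrative, not taken from Table 3.

```python
def apply_change(current, kind, value):
    # r_ : relative change -- multiply the current value by (1 + factor)
    if kind == "r":
        return current * (1.0 + value)
    # v_ : value substitution -- replace the current value outright
    if kind == "v":
        return value
    raise ValueError(f"unknown change kind: {kind}")

cn2 = apply_change(72.0, "r", -0.1)       # e.g., r__CN2 with factor -0.1
gw_revap = apply_change(0.02, "v", 0.1)   # e.g., v__GW_REVAP set to 0.1
```

The relative (r_) form is useful for spatially distributed parameters such as CN2, because it preserves the spatial pattern of the default values while scaling them.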
Monthly discharges in both watersheds were calibrated separately using the eight different efficiency criteria in Table 1. Then, the goodness of the calibration results was compared for the different objective functions using the criteria in Table 4. For bR² and MNS, we introduced measures based on the results of similar studies [22,38,39], where the satisfactory threshold values for bR² and MNS were taken as greater than or equal to 0.4. No such threshold could be specified for SSQR, as the measured and simulated variables are ranked independently and its value depends on the magnitude of the variables being investigated.

Table 4. General performance ratings for a monthly time step [15,40].

Optimization Algorithms
To compare the parameters obtained by each optimization method, we used similar conditions in terms of behavioral and non-behavioral parameter values, objective function type, calibration parameters and their prior ranges, number of runs, and statistical criteria. NSE was selected as the common objective function for all three optimization algorithms. The behavioral threshold was set at NSE ≥ 0.5. The criteria NSE, p-factor, and d-factor were used to evaluate the model performance.
In SDB, we used three iterations of 480 simulation runs (1440 simulation runs in total) for SUFI-2, and 1440 simulation runs for GLUE and PSO. In KRB, we used five iterations of 480 simulation runs (2400 simulation runs in total) for SUFI-2, and 2400 simulation runs for GLUE and PSO. The parallel processing option of SWAT-CUP [41] was used to run SUFI-2 with the different objective functions. GLUE and PSO usually require a large number of simulations; however, because of our relatively good initial parameter values and reasonable parameter ranges (based on the previous works mentioned above), fewer runs were needed to produce satisfactory results.

Statistical Analysis
We used the non-parametric Kruskal-Wallis test to assess whether the ranges of the sensitive parameters obtained by the different objective functions were significantly different from each other. The test is based on an analysis of variance using the ranks of the data values rather than the data values themselves. When the Kruskal-Wallis test was significant, we used Tukey's post-hoc test to determine which objective functions produced similar or different parameters.
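The test above can be illustrated with `scipy.stats.kruskal` on synthetic behavioral parameter samples; the three groups below stand in for CN2 factors obtained under three hypothetical objective functions and are not data from the study.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(3)
# behavioral CN2 change factors under three hypothetical objective functions:
cn2_nse = rng.uniform(-0.15, 0.05, 200)    # NSE
cn2_kge = rng.uniform(-0.12, 0.08, 200)    # KGE (similar range to NSE)
cn2_pbias = rng.uniform(0.10, 0.30, 200)   # PBIAS (clearly shifted range)

# rank-based test of the null hypothesis that all groups share one distribution
h_stat, p_value = kruskal(cn2_nse, cn2_kge, cn2_pbias)
different = p_value < 0.05   # reject H0: at least one parameter range differs
```

A significant result only says that at least one group differs, which is why a post-hoc pairwise comparison is still needed to say which objective functions agree.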

Sensitivity of Model Performance to the Objective Functions Used in SUFI-2 Algorithm
Based on the criteria of Table 4, all objective functions performed better than satisfactory in the calibration stage, except for PBIAS (Table 5). In the validation, the Barak station did not have satisfactory results for six of the objective functions. This could be due to extensive water management and human activities upstream of Barak during the validation period. As an illustration, we plotted the best and the worst calibration results for the T. Karzin sub-basin in SDB in Figure 2. The discharges based on NSE were quite similar and close to the observations, while PBIAS showed a systematic delay in the recession limb of the discharge. In KRB, all objective functions performed better than satisfactory for all sub-basins except Payepol (Table 5). For KRB, we obtained results similar to those of Ashraf Vaghefi et al. [3], who also modeled this watershed with SWAT. They reported larger uncertainties in the southern parts of the Karkheh Dam (i.e., the Payepol station) because of higher water management activities, while in the northern part of the dam (i.e., the Afarine and Jologir stations), the uncertainties were smaller and, in general, the model performance was better [3].
At the Payepol station, the validation results were better than the calibration results because the Karkheh Dam was constructed after the validation period.However, the results from some objective functions like NSE and RSR were still unsatisfactory in Payepol.
To compare the closeness of the final discharges across all objective functions, we calculated the correlation coefficient table (Table 6). The high correlation coefficients among the best simulated discharges in KRB show that most objective functions led to similar results. As in SDB, PBIAS displayed the worst correlation with the other methods in KRB. We conclude that the final results of the monthly discharges in our two case studies are not very sensitive to the objective function used in the SUFI-2 algorithm: except for PBIAS, the objective functions produced equally acceptable simulation results. However, this is not a general conclusion, because in other regions where, for example, snow melt is dominant, a certain objective function that targets a specific feature of the discharge may perform better and be more desirable.

Sensitivity of Model Parameters to Objective Functions
In SUFI-2, parameters are always expressed as distributions, beginning with a wider distribution and ending with a narrower distribution after calibration. In this study, we used uniform distributions to express the parameter uncertainty. The parameters obtained by each objective function in the SDB and KRB study sites showed significantly different ranges (Figure 3), even though the simulated discharges were not significantly different. This illustrates the concepts of parameter "non-uniqueness" and of the "conditionality" of the calibrated parameters. An unconditional parameter range is a range that is independent of the objective function used in calibration. By this definition, the unconditional parameter range of CN2 for B. Bahman would be the range indicated by the broken line in Figure 3. However, this translates into a very large parameter uncertainty. This indicates that there is a significant uncertainty associated with the choice of objective function with respect to the parameter ranges. Using the Kruskal-Wallis test, we determined which parameter ranges were significantly different from the others (Table 7). As an example, the CN2 ranges for the upstream sub-basins of the B. Bahman outlet were not significantly different for NSE, SSQR, and KGE, while they were significantly different for all other objective functions. A careful analysis of the results in Table 7 reveals no clear pattern of similarity or difference between the objective functions. However, NSE clearly has the most parameters in common with the other objective functions, followed by RSR and KGE.


Sensitivity of Water Resources Components to the Objective Functions
Next, we calculated the water resource components for the parameters obtained by the different objective functions. To show this, we calculated the actual evapotranspiration (AET), soil water (SW), and water yield (WYLD) (Figure 4). The long-term annual averages of these variables in SDB, based on the best parameter values given by the different objective functions, show significant differences. Furthermore, the regional water resource maps of AET, SW, and WYLD exhibit significant differences in their spatial distributions (Figure 5). Faramarzi et al. [2] reported a range of 120-300 mm year⁻¹ for AET in their national model for the same region. In the current study, the minimum and maximum values of the annual average AET, obtained by RSR and KGE, were 191 and 295 mm year⁻¹, respectively (Figure 4a). These values are within the uncertainty ranges reported by Faramarzi et al. [2]. The results for SW and WYLD in SDB (Figure 4b,c) also corresponded well with the values reported by Faramarzi et al. [2].

Sensitivity of Calibration Performance and Model Parameters to Optimization Algorithms Using NSE
In SDB, the maximum NSE values for all three optimization techniques were higher than 0.6; hence, they all achieved satisfactory results (Table 8). The p-factor values verify that most of the observed discharges were bracketed by the 95PPU of the simulations by SUFI-2, followed by GLUE and PSO, during the calibration and validation periods. Using a threshold value of NSE ≥ 0.5, the SUFI-2 algorithm found 214 behavioral solutions in 480 simulations, while PSO and GLUE achieved 477 and 283 behavioral solutions in 1440 simulations, respectively. Although PSO and GLUE used a larger number of simulations, the p-factor and d-factor of SUFI-2 show a better performance than GLUE, followed by PSO. This is to be expected, as the latter two algorithms were not allowed to fully exploit the parameter spaces due to the limited number of runs. However, in this study, we used relatively good initial parameter values and uncertainty ranges, and all of the methods obtained quite similar and satisfactory results. In KRB, GLUE and PSO were not successful in calibrating the SWAT model under the defined conditions (i.e., initial parameter ranges, number of simulation runs, and behavioral threshold value), as there were no behavioral parameter sets. The SUFI-2 algorithm achieved satisfactory simulations of discharge, with NSE = 0.53 and NSE = 0.51 for the calibration and validation periods, respectively. The p-factor was 55% and the d-factor was around 1, indicating a reasonable uncertainty in the calibration and validation results (Table 8). More than 100 behavioral solutions were found in 480 simulations with NSE ≥ 0.5, while only three behavioral solutions were found by GLUE, and none by PSO, in the 2400 simulations (Table 8). Yang et al. [17] calibrated the Chaohe Basin in China and showed that SUFI-2, based on the Nash-Sutcliffe coefficient, used the smallest number of model runs to achieve prediction results similar to GLUE. Similarly, in the current study, the SUFI-2 algorithm used the smallest number of runs in both watersheds to achieve results similar to GLUE and PSO. As already mentioned, GLUE and PSO in KRB were not allowed to fully explore the parameter spaces, which is the reason for their relatively poor performances here.
Although all three algorithms underestimated the monthly discharge at SDB, they obtained similarly good results based on the performance criteria given by Moriasi et al. [15] (Figure 6 and Table 9). The calibrated parameters estimated by the three algorithms have larger overlaps than those obtained with different objective functions (Figure 7). PSO provided the widest ranges of parameter uncertainty, followed by GLUE and SUFI-2. Based on multiple comparison tests, half of the calibrated parameter ranges obtained by SUFI-2, GLUE, and PSO were significantly different in SDB. For the GLUE-PSO, SUFI-2-GLUE, and SUFI-2-PSO pairs, five, four, and four parameters out of 18, respectively, were found not to be significantly different from each other. Overall, the sensitivity of the parameters to different objective functions was found to be larger than the sensitivity to optimization algorithms. This is expected, because different objective functions solve different problems, while calibration methods basically solve the same problem.

Conclusions
We investigated the sensitivity of the parameters, model calibration performance, and water resource components to different objective functions (R², bR², NSE, MNS, RSR, SSQR, KGE, and PBIAS) and optimization algorithms (SUFI-2, GLUE, and PSO) using SWAT in two watersheds. The following conclusions can be drawn:

1) In most cases, different objective functions combined with one optimization algorithm (here, SUFI-2) led to satisfactory calibration/validation results for river discharges in both case studies. However, the calibrated parameters were significantly different in each case, leading to different water resource estimates.
2) Different optimization algorithms combined with one objective function (here, NSE) also produced satisfactory calibration/validation results for river discharges in both case studies. However, the calibrated parameters were significantly different in each case, resulting in significantly different water resource estimates.
Finally, the important message of this work is that the calibration/validation performance may not be sensitive to the choice of optimization algorithm and objective function, but the parameters obtained may be significantly different. As parameters represent processes, the choice of calibration algorithm and objective function may be critical in interpreting the model results in terms of important watershed processes.

Figure 3. Uncertainty ranges of calibrated parameters using different objective functions in (top) SDB and (bottom) KRB. The points in each line show the best value of the parameters; r_ refers to a relative change, in which the current value is multiplied by (1 + a factor from the given parameter range), and v_ refers to substitution by a value from the given parameter range [24].
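The r_ and v_ qualifiers in the caption can be illustrated with a short sketch; the parameter values below are made up, and `apply_change` is our own helper, not a SWAT-CUP function:

```python
# Sketch of the r_ / v_ parameter-change conventions from the caption:
# r_ multiplies the existing value by (1 + factor); v_ replaces it outright.

def apply_change(current_value, change):
    kind, factor = change  # ("r", x) for relative, ("v", x) for substitution
    if kind == "r":
        return current_value * (1.0 + factor)  # relative change
    if kind == "v":
        return factor  # direct substitution
    raise ValueError(f"unknown change type: {kind!r}")

# e.g., an r_ change of -0.1 reduces a curve number of 80 by 10%,
# while a v_ change of 45 sets a delay parameter to 45 regardless of its
# prior value (parameter names and numbers here are illustrative).
print(apply_change(80.0, ("r", -0.1)))  # 72.0
print(apply_change(31.0, ("v", 45.0)))  # 45.0
```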

Figure 6. Calibration (1990-2008) and validation (1977-1989) results of the monthly simulated discharges using the three optimization algorithms (SUFI-2, GLUE, and PSO), with NSE as the objective function, at the T. Karzin station in Salman Dam Basin (SDB).

Figure 7. Uncertainty ranges of the parameters based on all three methods applied in Salman Dam Basin (SDB). The points in each line show the best value of the parameters; r_ refers to a relative change, in which the current value is multiplied by (1 + a factor from the given parameter range), and v_ refers to substitution by a value from the given parameter range [24].

Table 1. Formulation of the objective functions.

Table 2. Data description and sources used in the SWAT projects.

Table 3. Initial ranges and descriptions of the parameters used for calibrating the SWAT models in the Salman Dam Basin (SDB) and Karkheh River Basin (KRB).

Table 5. Calibration and validation (in parentheses) results for eight different objective functions using the SUFI-2 optimization algorithm.

Table 6. Correlation coefficients of the objective functions based on the best simulation in the calibration period using the SUFI-2 algorithm.

Table 7. Results of Tukey's post-hoc test to determine whether parameters obtained by different objective functions were statistically different or similar.

Table 8. Performance of the optimization algorithms and the number of behavioral parameter ranges for the calibration and validation periods in Salman Dam Basin (SDB) and Karkheh River Basin (KRB).

Table 9. Correlation coefficients among the best simulations of discharge obtained by all optimization techniques at all stations in Salman Dam Basin (SDB).