Estimation of Fuzzy Parameters in the Linear Muskingum Model with the Aid of Particle Swarm Optimization

The Muskingum method is one of the widely used methods for lumped flood routing in natural rivers. Calibration of its parameters remains an active challenge for the researchers. The task has been mostly addressed by using crisp numbers, but fuzzy seems a reasonable alternative to account for parameter uncertainty. In this work, a fuzzy Muskingum model is proposed where the assessment of the outflow as a fuzzy quantity is based on the crisp linear Muskingum method but with fuzzy parameters as inputs. This calculation can be achieved based on the extension principle of the fuzzy sets and logic. The critical point is the calibration of the proposed fuzzy extension of the Muskingum method. Due to complexity of the model, the particle swarm optimization (PSO) method is used to enable the use of a simulation process for each possible solution that composes the swarm. A weighted sum of several performance criteria is used as the fitness function of the PSO. The function accounts for the inclusive constraints (the property that the data must be included within the produced fuzzy band) and for the magnitude of the fuzzy band, since large uncertainty may render the model non-functional. Four case studies from the references are used to benchmark the proposed method, including smooth, double, and non-smooth data and a complex, real case study that shows the advantages of the approach. The use of fuzzy parameters is closer to the uncertain nature of the problem. The new methodology increases the reliability of the prediction. Furthermore, the produced fuzzy band can include, to a significant degree, the observed data and the output of the existent crisp methodologies even if they include more complex assumptions.


Introduction
Flood risk management is a key component in sustainable water resources management. Floods impact both individuals and communities, producing harmful consequences of social, economic, and environmental implications. Negative impacts of flooding include loss of human life, destruction of crops, damage to essential infrastructure, and disruption of the value chain. Many flood risk management strategies are based on operational early waring schemes that predict the arrival of the flood wave and enable civil protection actions to safeguard life and property. The prediction of the outflow hydrograph for a river's reach given a specific inflow hydrograph is, therefore, an important issue in water resources management. Hydrological forecasting requires a sound understanding of the physical mechanisms that control the propagation of flood waves along rivers. Mathematical models for flood routing are based on the unsteady flow Saint-Venant equations. Numerical solutions of the flow equations are complex, and thus, simplified versions are preferred in operational flood forecasting schemes. Various methods have been developed for this purpose, which may be categorized into three distinct groups [1,2]: (1) the distributed (or hydraulic) methods (e.g., dynamic wave, diffusion wave, and kinematic wave models), which are based on the mass conservation and momentum transport equations; (2) the lumped (or hydrologic) methods (in which the linear and nonlinear Muskingum models [3] are widely used); and (3) the semi-distributed or hybrid methods (e.g., Muskingum-Cunge family models [4][5][6]). This work is focused on enhancing models of the third category using fuzzy sets and logic.
Lumped models estimate the flood hydrograph at a downstream section of the river from the flood hydrograph at the upstream section without considering explicitly the river characteristics between the upstream and the downstream sections. These models use the one-dimensional continuity equation along with a storage equation in the calibration and verification steps.
The Muskingum method for natural streamflow routing, first proposed by McCarthy (1938) [3] in the Muskingum River basin in Ohio, is a widely applied hydrologic method [7]. It is a lumped method which cannot provide the outflow at the intermediate sections.
However, these models are very popular since, in most cases, the scarcity of field data prevents the use of the Saint-Venant equations to route floods in rivers, as occurs in the second category of river-reach routing [8].
The widely-used linear Muskingum method depends on two parameters: K and x [9]. Originally, the graphical method was developed to determine the parameters of the widely-used Muskingum linear equation. As is widely known, the graphical solution is based on the graphical representation of the storage versus the weighted discharge as a function of x. The preferred value of x is the one that produces the narrowest loop (e.g., Wilson, 1974 [10]). Next, some techniques based either on linear programming or on linear regression were developed [11]. However, the use of the regression-based techniques can lead to unreasonable values for K, x (e.g., [12]), due to an overtraining behavior.
The use of the nonlinear storage equation within the Muskingum method increases the number of model parameters, but it is closer to reality. However, the nonlinear storage equation increases the difficulty of the calibration process. Various mathematical-hydrological methodologies are developed to calibrate the river-routing models by using a nonlinear approach regarding the relation of the reach storage. For instance, according to the segmented least-squares method (S-LSM) (Gill 1978) [13], the whole available data are subdivided into several groups based on the storage values. It is then assumed that K and x remain constant in each subgroup, and hence, for each subgroup, the least square analysis is used. Therefore, the optimization procedure is based on the minimization between the estimated storage and the existent storage.
Next, a directly (global) nonlinear formulation of the Muskingum method (regarding the reach storage) was proposed by several authors with the use of a nonlinear form for the storage. For instance, [14] used several mathematical techniques to minimize the sum of the squares of deviations between observed channel storage and computed channel storage.
Similarly, the nonlinear least squares method (NL-LSM) [15], the Lagrange multiplier method (LMM) [7], and the Broyden-Fletcher-Goldfarb-Shanno (BFGS) technique [16] have been exploited for assessing the parameter values of the models. Das (2004) [7] and Geem (2006) [15] proposed an optimization model whose objective function emphasizes the determination of a set of parameter values that minimizes the error between modelpredicted and observed outflows. Das (2004) [7] considered the validation of the storagedischarge relation by using the Lagrange multipliers.
In order to improve the optimum solution, several authors proposed hybrid optimization algorithms that combine the heuristic and the derivatives-based algorithms from a mathematical point of view. Two steps are mainly considered. In the first step, one of search-based, phenomenon-mimicking algorithms, which requires no initial guess, is exploited to assess the Muskingum parameters. Afterwards, the obtained values for nonlinear Muskingum parameters are utilized as the first initial vector for the second step in which a deterministic, derivatives-based method continues the routing simulation optimization (Niazkar and Afzali 2016) [25]. For instance, in Karahan et al., 2015 [18], the HS algorithm (global search) searched the optimum solution with multiple solution vectors, and then the BFGS algorithm (local search) adjusted the results of the HS algorithm by getting its results as the new initial solution [18]. A critical point is that most of the suitability measures are based on the comparison between the observed and the predicted outflow, whilst the other older works, such as [13,14], are focused on the storage equation (as can be seen from the objective function used).
In addition, several authors proposed other innovations in the nonlinear Muskingum relation. Karahan et al., 2015 [18] proposed the use of the cuckoo search algorithm (which also uses an initial population) during the calibration of the model parameters by also including the lateral flow. Easa (2015) [26] proposed to increase performance using continuous and discontinuous parameters expressed as a function of a dimensionless inflow variable. In addition, Farzin et al., 2018 [27] proposed the multi-reach Muskingum method to enhance the accuracy of the Muskingum method. The river under study was divided into several smaller reaches, and hence, for each reach, routing was applied separately. However, if there are no data in intermediate sections, it is not obvious that the intermediate outcomes can be used to estimate the outflow at these sections, and therefore, the method could be characterized as a black box model rather than a conceptual model.
Regarding the application of the fuzzy sets and logic methods in these types of problems, most of the proposed models are based on the adaptive network-based fuzzy inference systems (ANFIS) implementation. ANFIS is a neuro-adaptive learning technique which provides a hybrid method for the fuzzy modeling procedure to learn information based on a data set. Therefore, it can be seen as a neuro-fuzzy approach. It is very popular because of the corresponding toolbox made available in MATLAB. In many hydraulic applications, the ANFIS method is widely used, providing very good results; however, it seems that sometimes it was solely utilized as a black-box model without any specific utilization of the produced fuzzy rules. Chu, 2009 [28] proposed the combined application of fuzzy inference system (FIS) and Muskingum model in flood routing. The implementation is based on the ANFIS toolbox of MATLAB. Three points of this methodology must be noted. The first one is that the Muskingum method is used indirectly in the structure of the model (neither the evaluation of K or x takes place). The second point is that the output (outflow) for each time step is a crisp number and not a fuzzy number. Thirdly, based on the ANFIS method, a large amount of data is required.
Another idea is to use the fuzzy linear regression to calibrate the linear Muskingum method instead of the crisp linear regression as O'Donnel (1985) [11] proposed. Spiliotis and Garrote (2017) [29] proposed the calibration of the Muskingum parameters based on fuzzy linear regression. Therefore, instead of the crisp coefficient of the O'Donnel (1985) method [11], the coefficients are proposed to be fuzzy symmetric triangular numbers. This implies that, in contrast with the widely used ANFIS, the output (outflow) of the proposed model is a fuzzy number. Furthermore, based on the Tanaka (1987) [30] fuzzy regression implementation, all the observed data (regarding the outflow) are forced to be included in the produced fuzzy band aiming at its minimum width. Instead of the identification of the parameters K, x, the problem is oriented to the determination of the corresponding regression coefficient. Another point of view is that, based on the problem itself, the constant term of the usual regression model must be removed [29]. The relevant feature of articles mentioned in this paragraph is that they contain methodologies that produce a fuzzy outflow as a final output for the channel-routing problem.
However, the fuzzy linear regression of Tanaka (1987) [30] is heavily influenced by the existence of outliers, which can extremely enlarge the spread of the model. Hence, Spiliotis and Garrote, 2017 [29] proposed the use of a fuzzy linear regression model that is modified by incorporating ideas from the field of goal programming. Spiliotis et al., 2018 [12] expanded the methodology by including the lateral flow on the basis of the O'Donnel article [11]. As it was noted by the authors, an interesting point is that sometimes the produced fuzzy coefficients cannot correspond to the physical meaning of the storage equation, and hence, the model leads to a rather metric or black-box method.
In this work, a hybrid, fuzzy-based methodology is proposed. The proposed method consists of two main ideas. Firstly, the linear Muskingum model is adopted. The parameters K, x and α (where lateral flow occurs) are selected to be fuzzy symmetrical triangular numbers to avoid any irrational training as may occur in the case of the fuzzy linear regression. Hence, the methods produce a fuzzy estimation of the outflow based on the (crisp) Muskingum function to predict the outflow. Mathematically, the application of a crisp function using fuzzy parameters can be treated based on the extension principle of the fuzzy sets and logic. Secondly, the calibration is addressed by using a hybrid optimization procedure. In this article, the particle swarm optimization method (PSO) is used. Hence, there is no need to predict an initial solution. For each possible solution based on the (considered) values of the fuzzy parameters K, x and α, the outflow is calculated. The outflow will be a fuzzy number, whilst its determination is based on the extension principle. Unfortunately, the implementation of the extension principle leads to some sub-optimization problems but without difficult constraints. The solvability of these suboptimization problems is discussed in the methodology section.
Each possible solution is evaluated based on the degree of inclusion of the real data within the produced fuzzy band. This criterion should be included in the formulation of the fitness function of the heuristic optimization method. However, other criteria will be added which will be discussed in the methodology section. For instance, a fuzzy solution with a very large band will contain all the data, but this information is non-functional. Therefore, the fitness function must contain not only the inclusion degree but the magnitude of the fuzzy band, etc. In PSO, the values of the fitness function of the swarm and the use of random numbers determine the movement of the swarm, that is, the positions of the new possible solutions. The procedure is finished by considering a maximum number of iterations.
Three classical case studies from the references are widely used to benchmark the variations of the Muskingum method. These three case studies include: (1) smooth outflow (Wilson 1974 [10]), (2) double-peak outflow (Viessman and Lewis 2003) [9], and (3) nonsmooth outflow with lateral flow (O'Donnel 1985) data sets [11]. The implementation of the proposed methodology is validated using the three benchmark examples. The validation exercise is focused on the following questions: (1) Do the achieved fuzzy values of K, x have physical meaning? (2) Is the maximum outflow included satisfactorily within the produced fuzzy band? (3) Does the produced fuzzy band include the data with a rational width? (4) Does the produced fuzzy band include the aforementioned results of the crisp (deterministic, without uncertainty) methods?
A fourth case study taken from real-life data in a basin in Spain was used to validate the approach.

Principles of Fuzz Set and Logic
From a mathematical point of view, a fuzzy set can be described as a mapping from a general set X to the closed interval [0, 1] [31]. The membership function (MF) is a key concept which describes the fuzzy sets. Hence, the membership function declares to what degree an element belongs to the specific set. A membership degree of 0 denotes that the element does not belong to the set, and a membership degree of 1 denotes that the element belongs fully to the set. Subsequently, an element with a membership degree between 0 and 1 will partially belong to the examined set. A classical (crisp or precise or deterministic) set can be considered as a special case of a fuzzy set with a MF that only takes values 0 and 1.
A fuzzy number is a special case of the fuzzy set satisfying additionally the properties of convexity and normality. It is defined in the axis of real numbers, and its MF is a piecewise continuous function ( Figure 1). In general, the definition of fuzzy numbers can be found in Klir and Yuan, 1995. It is proven (Klir and Yuan, 1995) [31] that the MF of a fuzzy number (Figure 1) follows the mathematical expression (Equation (1)): 1] are the left and right parts of the membership function regarding the fuzzy number A. In addition, A L is increasing and continuous from the right, and A R is decreasing and continuous from the left. The interval [α 1 , α 2 ] can be an interval or a point, but it cannot be an empty set [31].
A simple fuzzy number for representing the parameters K and x is the fuzzy symmetrical triangular number (Figure 2), which is a kind of fuzzy number [12]. The symbols α, w denote the central value (a single point where µ = 1) and the width of the fuzzy number. The h-cut set of the fuzzy number A (with 0 < h ≤ 1) is the key idea to move from the fuzzy to the precise (crisp) sets, and it is defined as follows [31,32] (Equation (2)): The h-cut set is a crisp set determined from the fuzzy set according to a selected value of the membership function, α, and, reciprocally, a fuzzy set can be derived from a significant number of α-cut sets. In the case of a fuzzy triangular number, the α-cut is (Equation (3)): or equivalently: The crisp set including all the elements with non-zero membership function is the 0-strongcut, which is defined as follows [33,34] (Equation (4)): The following symbols are used for the zero-cut (Equation (5)): More analytically, according to Equation (5), the 0-cut is an open interval and does not contain the boundaries. For this reason (and to have a closed interval containing the boundaries), Hanss (2005) [35] proposed the phrase "worst-case interval W", which is the union of the 0-strongcut and the boundaries [35,36].
We can now extend the operation of the usual crisp functions in cases where the inputs are fuzzy sets, based on the extension principle that is briefly presented below.
Let X be a Cartesian product of universe X = X 1 × X 2 × . . . × X n and A 1 , A 2 , . . . , A n be defined in the universe sets X 1 , X 2 , . . . , X n , respectively. Let f be a (crisp) mapping from X to a universe Y, y = f (x 1 , x 2 , . . . , x n ). The mapping f for these particular input sets can now be defined as B = y, µ B (y) |y = f (x 1 , x 2 , . . . , x n ), (x 1 , x 2 , . . . , x n ) ∈ X in which the membership function of the image B can be defined (Zimmermann 1991) [32] by (Equation (6)): where f −1 is the inverse image of f. The above principle is known as the extension principle. Its implementation provides the means to use a crisp function even if the inputs are fuzzy numbers. With most input variables, it is preferable to use a number of h-cuts in fuzzy analysis instead of using a MF based directly on the above definition ( [37]). If f is a continuous function in the extension principle, the use of h-cuts can be also extended by determining the h-cuts of the function f, as follows ( [33,34]) (Equation (7)): It is customary to use fuzzy numbers as inputs (here, the parameters K and x) to the crisp function (here, the Muskingum equation, which calculates the outflow, Q), and hence, the boundaries of the decision space will be closed (because of Equation (1)). Then, from the theorem of global existence for maxima and minima of functions with many variables, it is known that if the domain of a real function is closed and bounded, and the real function is continuous, then the function will have its absolute minimum and maximum values at some points in the domain [38]. Based on this theorem, in cases where fuzzy triangular numbers appear as inputs [37], the h-cut for any real continuous function with real variables in this domain can be determined. The determination of each h-cut is concluded to a double sub-optimization problem. Hence, the fuzzy output can be described by determining several representative h-cuts. Summarizing, in the case of fuzzy parameters, the simulation problem (that is, the determination of the outflow based on the Muskingum method) leads to the problem of determining several h-cuts, which is implemented through a double optimization procedure. In case of lateral flow, then a fuzzy estimation of the parameter α with the use of the h-cuts can be added.

Formulation of the Muskingum Method
The lumped hydrological model of Muskingum is based on the mass balance equation (Equation (8)) and the storage equation (Equation (9)).
where I is the inflow rate to the reach, Q is the outflow rate from the reach, and S the storage in the reach. In the case where the relationship between storage and flow through a reach is linear, the Muskingum storage relationship can be written as (Equation (9)): where I is the inflow rate to the reach, Q is the outflow rate from the reach, K is the storage time constant for the reach, and x is a weighting factor that varies between 0 and 0.5 [9]. The mass balance equation can be expressed in discrete form as follows (Equation (10)): By combining Equations (8) and (9), the Muskingum routing equation is obtained (Equation (11)): The linear Muskingum model can be modified to include the lateral flow. O'Donnell, 1985 [11] suggested a simple approach: "Possibly, the simplest model is one which assumes that the rate at which lateral inflow enters the reach is directly proportional to the rate of inflow I into the reach, with a proportionality factor α." Following this approach, the mass balance equation and the empirical storage equations can be written as (Equations (12) and (13)): After some algebraic operations, the following equation can be used (Equation (14)) [11]: Or equivalently [12] (Equation (15)): The focus of this work is the crisp function that determines the current outflow when the parameters K and x are fuzzy numbers. Whereas in the case of no lateral flow, the Equation (11) describes this function, in the case of lateral flow, the Equation (15) is applied.

Implementation of the Muskingum Method with Fuzzy Parameters
In this study, the parameters K and x (and α in case of lateral flow) are selected by assuming them as fuzzy triangular symmetrical numbers ( Figure 2). The simple linear Muskingum equation is applied with fuzzy parameters. As aforementioned, the application of a crisp function with fuzzy parameters is achieved by applying the extension principle. In practice, several h-cuts can be determined in order to describe the values of the examined function (in this application, the function f describes the outflow). In case of no lateral flow (based on Equation (11)), the boundaries of each h-cut of the outflow can be determined as follows (Equation (16)): where I is the inflow rate to the reach, Q is the outflow rate from the reach, K is the storage time constant for the reach, and x is a weighting factor. The index j is referred to the time step, and the indexes L and R are referred to the left and the right hand of the produced h-cut regarding the examined fuzzy number. In cases of lateral flow, based on Equation (15) it holds (Equation (17)): For the case of non-lateral flow, to calibrate the Muskingum method, the parameters K and x need to be estimated. The central values (h = 1) and the left and the right hands of the zero-cut (or the worst-case interval W) regarding the outflow are exploited. The zero-cut can be used to express the concept of inclusion, that is, to check the property that the fuzzy band contains the observed data at least to a high degree. This is set as a key idea in order to calibrate the proposed fuzzy Muskingum model. The determination of the boundaries of the worst-case interval W is finally achieved by following an optimization procedure, which, in the examined case, will have a solution since the inputs are fuzzy numbers and the used crisp function Q is continuous. The determination of the worst-case interval W composes the sub-optimization problem that arises during the calibration of the model.
The problem of the calibration process is to assess the central values and the semiwidth of the parameters K, x, and α so that a selected criterion is minimized (including the inclusion constraints). Because of the model complexity, the calibration is achieved by applying a heuristic optimization model. In this article, the particle swarm optimization (PSO) method is selected. This method can be coupled with a simulation model; in this study, the Muskingum equation enhanced with the use of the extension principle for fuzzy parameters is applied.

Particle Swarm Optimization Method
Particle swarm optimization (PSO) is a stochastic global optimization method based on the simulation of the swarm. As in genetic algorithms (GA), PSO exploits a population of possible candidate solutions to probe the search area. The PSO algorithm can be characterized as one of the population-based algorithms (Parsopoulos and Vrahatis, 2002) [39]. Although PSO is an effective and widely used method, it is simpler than other heuristic optimization algorithms (e.g., GA) because the crossover and mutation operations in the original version of the GA [40,41] are not used.
PSO can deal with nonlinear optimization problems in non-convex domains [42]. Each candidate solution is called a particle, and the set of potential possible solutions in each iteration creates the swarm [41,43]. A swarm has a dimension N , in which N is the number of potential solutions. Each potential solution is comprised of D variables, in which D is the dimension of the problem [41,44]. In this article, in case of non-lateral flow, D = 4 K, w K , x, w x , where K is the central value of K, w K the semi-width of K, x is the central value of x, w x is the semi-width of x.
Analytically, the population dynamics in PSO simulates the behavior of a bird flock, where social sharing of information takes place and individuals benefit from the discoveries and previous experience of all other companions during their search for food. Thus, two variants of the PSO algorithm were developed considering either a local neighborhood or a global neighborhood. In the former, the partial optimum of the particle is usually applied [40]. In the latter, each particle moves towards its best previous position and towards the best particle in the whole swarm [39,41,45].
Several modifications to the original version of the PSO method of Kennedy and Eberhart (1995) [45] have been proposed, such as the adaptation of inertia term and the consideration of the maximum velocity (e.g., [46]).
The basic PSO algorithm is detailed below (e.g., [44,46,47]: 1: Initialize a population array of particles with random positions and velocities on D dimensions in the search area.

2:
Loop 3: For each particle, evaluate the desired optimization fitness function in D variables.

4:
Compare particle fitness evaluation with its best previously visited position (p i ). If the current value is better than p i , then set p i equal to the current value. 5: Identify the particle with the best fitness function value of the swarm p g . 6: Change the velocity and position of the particle (x i ) according to the Equation (18):

7:
If a criterion is met (usually a sufficiently good fitness or a maximum number of iterations), exit loop. 8: End loop where ρ() is a vector of random numbers uniformly distributed in the open interval 0, 1 that is generated at each iteration, and for each particle, p i is the best previously visited position of the ith particle (partial optimum), and p g is the global best previously visited position of all particles (global optimum). Furthermore, the term c 1 ρ 1 · → p i − → x i that associates the particle's own experience with its current position is weighted by the constant c 1 and is called individuality (cognitive acceleration). The term c 2 ρ 2 · → p g − → x i is associated with the social interaction between the particles of the swarm and weighted by the constant c 2 , and is called sociality (social acceleration). As υ the velocity is meant. The velocity, by supposing that ∆t = 1, modulates the new position x i (t + 1).
For the basic PSO, the coefficients c 1 and c 2 were allowed to take values in the interval 1.5 to 2.5 [41].
Clerc and Kennedy's analysis [48] based on the initial PSO method proposed the following modified equation for the new positions (Equation (19)): where χ is a parameter called constriction coefficient or constriction factor (Equation (20)): On the other hand, based on the literature, the following parameters are proposed ( [48]) (Equation (21)): In this study, the proposals of Clerc & Kennedy, 2002 are adopted in the implementation of the PSO.
It is worth noting that, in some cases, the PSO method converges rapidly to a local optimum. This problem can be overcome by the strategy of Salehizadeh et al. (2009) [49] by dividing the population or by using other hybrid methods, e.g., by combining the PSO together with simulated annealing behavior. For instance, [41] applied a modified PSO to achieve the identification of optimal hedging rules for complex real systems of operating reservoirs with many parameters, seeking to mitigate the drought impacts.

Proposed Calibration and Performance Measures
A procedure is followed to implement the model calibration (PSO) method together with the extension principle. First, a swarm of possible solutions is randomly created within the decision space. For non-lateral flow, each possible optimum solution (member of the swarm) contains the values of the parameters K, x, w K , w x that describe the membership functions of K, x. Then, a fuzzy simulation method is applied based on the extension principle. For each possible solution, the frontiers of the 0-cut and the central values are calculated. Finally, we apply the PSO methodology, and the objective is calculated. The objective function is identical with the fitness function, since by using the simulation stage, the use of constraints could be avoided.
The key question is about the formulation of the objective function (fitness function), which will be used during the calibration process. The objective function will differ from those selected by the crisp calibrations, since the output of the model will be a fuzzy band that expresses the outflow hydrograph. In this article, an objective function that contains a weighted sum of four performance measures is proposed (Equations (22)-(24)): where The first term expresses the divergence of the produced fuzzy band to include all data. In other words, the squared penalty term, E 1 , is activated only whether the observed data are not included within the produced fuzzy band. This procedure was initially proposed by Ishibuchi et al. (1993) [50] as a cost function to be minimized in the learning process regarding a neural network with interval weights, whilst in this work, it is used as a term of the fitness function (Equation (24)): The above measure represents the idea of inclusion of the fuzzy linear regression of Tanaka [30]. According to this model, all the data must be included within the produced fuzzy band [51,52].
The second term, E 2 , expresses the distance between the central values (α = 1) and the observed data (Equation (25)): The third term, E 3 , expresses the magnitude of the produced fuzzy width. A large width leads to an unfunctional fuzzy band, which cannot be exploited in real applications (Equation (26)): The use of this distance comes from the fuzzy regression of Tanaka where the problem of the fuzzy linear regression concludes to a constrained optimization problem aiming to minimize the total width of the produced fuzzy band (e.g., [51,52]) Finally, the last term, E 4 , expresses the distance between the maximum value of the outflow and the right hand of the estimated outflow (Equations (27) and (28)): where a max = 1 i f q max ≥ q + p 0 otherwise , p : time when q max = max(Q 1 , . . . , Q j , . . . , Q M ) (28) This distance is activated only if the observed data are above the produced fuzzy band as it is indicated by the coefficient α max where p is the time, which is referred to as the time where the maximum value occurs. This criterion is useful in river routing. For flood protection, it is critical that the produced fuzzy band (from the proposed model) includes the observed maximum value even in real time. Therefore, since this criterion expresses the aforementioned property, it is obviously useful. This measure is proposed by [44] where a fuzzified version of the fuzzy unit hydrograph was developed.
The terms Q observed j , Q − j , and Q + j express the observed value at time j, the left, and the right-hand boundary of the zero-cut, respectively.
These behavioral parameters are user defined and control the optimization process. The adopted values were taken mainly from the literature because they are shown to lead to good performance. Their effectiveness is later analyzed and compared to other alternatives in the validation section.
The weight w 1 is selected to give more emphasis on the first term of the fitness function. Indeed, the main advantage of the fuzzy model, similarly with the fuzzy regression model (e.g., [30]), is the inclusion of the observed data into the produced fuzzy band. However, in the case of a precise satisfaction of the inclusion constraints, then an outlier can significantly enlarge the size of the produced fuzzy band. Hence, the larger the w 1 , the closer to the absolute satisfaction of the inclusion constraints the model becomes. If a non-useful fuzzy band is produced, then the value of w 1 is reduced. In fact, the first measure strengthens the satisfaction of the inclusion constraints. E 1 conflicts with E 3 ; however, an unfunctional fuzzy band must be rejected. E 2 is similar but not identical with the crisp measures SSQ, since apart from the central value, the output of the fuzzy model is the entire fuzzy number.

Results and Discussion
In order to check the proposed methodology, four case studies from the references are used to benchmark the proposed method. The first three case studies include: (1) the routing of a smooth hydrograph (Wilson 1974 [10]), (2) a double peak hydrograph (Viessman and Lewis 2003 [9]), and (3) non-smooth hydrograph with lateral flow (O'Donnel 1985) data sets [11]. The last case study is located in the Ebro River Basin with more data available. Therefore, the validation of the solution achieved is studied in the last case study.

Smooth Hydrograph
The first data set is the data set presented by [10]. For these kinds of heuristic optimization problems, a maximum of 100 iterations were selected. Furthermore, the swarm consists of 50 members, whilst the expression of Clerc and Kennedy's is adopted for training (Equations (19)-(21)). The results were not practically changed even if the SWARM was activated again. The fitness or objective function consists of the four performance measures (Equation (22)). These selections were adopted for all the examined examples.
Since no lateral flow occurs, the Equation (16) is used to determine the boundary of the worst-case interval W, which is used to check the width of the produced fuzzy band and the degree of inclusion. Because of the sub-optimization problems (according to extension principle, Equation (16)) a significant computational time is required for the calibration process. Initially, a significant value of the w 1 is considered (w 1 = M 2 ). By following the aforementioned calibration procedure, the following results are achieved: where M is the number of data, whilst in the bracket, the first term means the central value and the second term the semi-width. The brackets symbolize the fuzzy symmetrical triangular numbers. Wilson, 1974 [10] proposed the following crisp values for the examined parameters: K = 1.5 days and x = 0.25, which are included from the produced corresponding quantities.
Therefore, the produced solution has physical sense and incorporates all the data to a high degree. However, the above suggestion holds only if the width of the produced fuzzy band is functional. For flood protection, the most important factor is the uncertainty within the neighborhood of the peak flow. Figure 3 shows that a range between the left and the right boundaries of the outflow is lower than 10 m 3 /s, which seems a reasonable range. However, the greater values have a high dispersion in time according to the proposed fuzzy solution. , which cannot be compared directly with the performance evaluation measures, since according to the proposed methodology, the output of the fuzzy Muskingum simulation will be a fuzzy number in contrast with the crisp formulation where the output is a crisp number. In case a significantly very small weight w 1 is selected, an almost crisp solution is obtained. In this case, the value of E 1 , E 2 , and SSQ are practically identical. However, the comparison between the SSQ and E 1 indicates that E 1 takes the smallest value ( Table 1 in Niazkar and Afzali, 2016) of the SSQ. P P P P P P P P P P P P P P P P P P P P P P GA P P P P P P P P P P P P P P P P P P P P P P BFGS P P P P P P P P P P F P P P P P P P P P P P BFGS-HS P P P P P P P P P P F P P P P P P P P P P P NLMM-L P P P P P P P P P P P P P P P P P P P F F F NLI (SSQ) P P P P P P P P P P P P P P P P P P P P F F NLII (SSQ) P P P P P P P P P P P P P P P P P P P P P P NLIII (SSQ) P P P P P P P P P P P P P P P P P P P P P F NLI (MARE) P P P P P P P P P F P P P P P P P P P P P P NLII (MARE) P P P P P P P P P F P P P P P P P P P P P P NLIII (MARE) P P P P P P P P P F P P P P P P P P P P P F To summarize, it is proposed that, in the case E 1 has a small value (preferably smaller than the SSQ of the crisp simulation) and a logical width, then the fuzzy calibration can be accepted. The time-step of Wilson 1974 [10], 6 h, can be characterized as sufficient, since in 6 h, some emergency measures can be implemented.
Analytically, for w 1 = M 2 , the proposed measures, for which their weighted sum composes the objective function, have the following values: central values near to data = 344.9 including the maximum value = 0.02 Two additional calibrations are presented in Figure 4 for (a) w 1 = 1 and (b) w 1 = 1/M. In cases where smaller weights are selected, the produced fuzzy band becomes thinner and smoother (Figure 4a,b), but it cannot contain a significant number of data (especially Figure 4b, where a smaller weight is selected). However, the maximum discharge is practically included in all cases. In all figures, the blue line expresses the central value, that is, the value which corresponds to unit membership function. For illustrative purposes, Figure 5 shows the evolution of the member of the swarm. Since there are four decision variables, only the parameters central values of K and x are represented. The blue circles indicate the swarm of the possible solutions and the diamond the global optimum solution for each generation. After the 50th generation, the swarm converts rapidly near the global optimum, and the optimum solution remains stable after the 89th iteration. After the end of the calibration procedure (and hence, the PSO algorithm), for each time step, the fuzzy number which corresponds to the outflow can be separately calculated by considering a very large number of h-cuts. For each h-cut, the extension principle is used to determine the boundaries of each h-cut. Figure 6 represents the produced outflow when the maximum outflow occurs in the observed data. As it can be seen, the output remains a fuzzy number (since it satisfies Equation (1)), but the linearity and the symmetry (which appears in case of the parameters K and x) were lost because the used crisp functions (Equation (16)) do not remain linear. By comparing the proposed fuzzy solution with many (crisp) methodologies, it is concluded that the fuzzy solution contains the majority of the described solutions (see the literature session). Indicatively, the Wilson-trial approach [10], the regression model, the NL-LSM (Yoon and Padmanabhan 1993) [15], the S-LSM (Gill 1978) [13], the LMM (Das 2004) [7], the HJ+DFP (Tung 1985) [14], the GA, the BFGS (Geem 2006) [16], the BFGS-HS  [18], and the CM (Easa 2015) [26] models are figured together with the fuzzy solution. The fuzzy solution is depicted by the left-hand and the right-hand bound of the zero-cut and the central values (Figure 7).
The Table 1 contains the comparison between the fuzzy solution with the aforementioned crisp models. The logical test is passed when the previous crisp solution for the outflow is included within the produced fuzzy band. A tolerance of 1 m 3 /s is permitted. From Table 1, it is shown that the majority of the crisp solutions are included within the produced fuzzy band apart from the last (decreasing) part of the hydrograph. The yellow lines indicate the time steps near the time where the maximum occurs.
In addition, the produced fuzzy band is separately compared with the graphical solution of Wilson, 1974 [10] (Figure 8). The graphical solution is based on the graphical representation of the storage versus the weighted discharge and with x as the preferred value, which produces the narrowest loop. An important point is that by comparing the central values of the fuzzy solution and the graphical solution, these two solutions are close, and moreover, the initial region of the outflows has a similar decreasing behavior.

Two-Peak Hydrograph
The second case study is a multi-peak flood hydrograph [9]. By selecting w 1 = M 2 , K = (3.5744, 1.5672), and x = (0.3460, 0.1399), whilst E 1 = 0.51 (Figure 9), where M is the number of data for the examined set. By selecting w 1 = 1, a functional approach with a smaller fuzzy band but with some points out of the fuzzy band is produced ( Figure 10). The value E 1 = 8974.5, which is sufficient. K = (3.6975, 0.3954) hr and x = (0.3134, 0.1067). An interesting perspective is that the two solutions are very close regarding the central values. Furthermore, the produced fuzzy band is compared with BFGS (Karahan 2014) [54] and MHBMO (Niazkar and Afzali 2015) [25] (Figure 11). Although the crisp models are rather complex with many parameters, the fuzzy model based on the Muskingum linear method contains practically all the values for w 1 = M 2 (Figure 11a) and the majority of the values when w 1 = 1 (Figure 11b), whilst the fuzziness is within a rational value. The method of BFGS seems to be away from the data. Figure 11. Other crisp simulations, the observed data ,and the produced fuzzy band in case of the [9] data for (a) w 1 = M 2 (b) w 1 = 1.

Non-Smooth Hydrograph with Lateral Flow
A non-smooth outflow hydrograph with lateral flow, previously presented by O'Donnel (1985) [11], was analyzed. The data are based on the event on the River Wyre, 20-21 October 1982. The river flows into the Irish Sea at Fleetwood. It is approximately 28 miles (45 km) in length. The river is a county Biological Heritage Site. According to O' Donnel (1985), there was a considerable increase in the flood volume between the inflow and outflow sections (some 25 km apart), and furthermore, it also had a multi-peaked inflow. Initially, a significant value of the w 1 is considered (w 1 = M 2 ). Since lateral flow occurs, Equation (17) is used to determine the worst-case interval. By selecting w 1 = M 2 , it is concluded that K = (4.9405, 2.1945), x = (0.0593, 0.1035), and a = (2.5960, 0.1035) ( Figure 12). However, the values of x approach the value corresponding to the reservoir, whilst small negative values are expected to be irrational. By selecting w 1 = 1, (Figure 13) it is concluded that K = (6.7211, 2.6006), x = (0.0997, 0.0002), and a = (2.9648, 0.1848) and E = 23.4438. These values approximate the crisp values that are provided by [11]. By comparing the observed data and the produced fuzzy band, it seems that although a new parameter was added, the performance of the fuzzy solution can be characterized as sufficient since the produced fuzzy band includes to a high degree the observed data of the outflow without an irrational width. Based on the three examples analyzed, results show advantages of the proposed methodology to be highlighted: (1) the use of a fuzzy estimation with the aim of fuzzy parameters is closer to observed; (2) the new methodology increases the safety of the prediction including the uncertainty; (3) the produced fuzzy band include, to a significant degree, the observed data; and (4) the output of the successful existent crisp methodologies even if they include more complex assumptions.

Validation with Real-Life Data
The validation case study is located in the Ebro River Basin (Spain, Figure 14). The gauge stations (automatic flow measures) belong to the SAIH network (Automatic Hydrologic Information System) managed by the Ebro Basin Water Authority. The reach under analysis is located in the Aragón River between stations 9271 (elevation 1040 m, basin area 101 km 2 ) and 9018 (elevation 793 m, basin area 238 km 2 ). The reach length is 18.09 km, and the mean slope of the reach is 1.365%. The method proposed in this study is validated with four hydrographs recently recorded in the reach. The time step of the original data is equal to 15 min. In order to reduce computational time and for stability reasons, the original data were resampled at ∆t = 1 h time step. The selected hydrographs can be characterized as rather complex since they include important lateral flow, and their shapes are not smooth. The fuzzy bands obtained for the outflow hydrographs are presented in Figure 15. In both cases, the most relevant values of the outflow hydrograph (near the peak discharge) are included within the produced fuzzy band. The wider fuzzy band for w 1 = 1 includes a larger fraction of values, but the narrower fuzzy band for w 1 = 0.1 reduces the uncertainty of the forecast. Afterwards, the fuzzy parameter values obtained for the two solutions were validated by applying them on three other real events, named Hydrograph 2, 3, and 4. The results are shown in Figure 16. For the fuzzy parameters obtained with hydrograph 1 and w 1 = 1, the stronger emphasis on the inclusion during the calibration process leads to results for other events where the observed outflow is almost included within the fuzzy band produced. This aspect is quantified with the following modified measure E 1 : The values of the modified measure obtained for Hydrographs 2, 3, and 4 are 0.6556, 0.3304, and 0.2526. These values are small, indicating a good coverage of the observations, although most of the observed discharges are below the central values of the fuzzy band.
For the fuzzy parameters obtained with Hydrograph 1 and w 1 = 0.1, the observed outflow is not included within the produced fuzzy band in most cases (Figure 16a). However, the shape of the observed outflow is similar to the produced fuzzy bands, suggesting that the deviation may be due to the different contribution of incremental flow in each event. The values of the modified measure are 6.8537, 10.8548, and 4.8134 ( Table 2). As aforementioned, the property of inclusion (which is expressed by the modified measure E 1 ) is in conflict with the goal of low uncertainty, which is expressed by the measure E 3 ( Table 2). Table 2. Modified measure E 1 (property of inclusion) and E 3 (uncertainty) by testing weights w 1 = 0.1 and w 1 = 1 and applied to Hydrographs 1 (training) and 2, 3, and 4 (validation). These fuzzy parameters showed a good skill to produce operational forecasts based only on inflow to the reach. The results were encouraging, particularly considering that incremental flow was important in comparison with the inflow to the reach. The parameter w 1 could be used to control whether the emphasis should be placed on inclusion of the observations within the fuzzy band or in narrowing the uncertainty of the predictions. It should be noted that important hydrograph's characteristics were simulated with a high degree of accuracy, for example, the volume of the hydrograph, the peak flow, the lag time, and the duration of the hydrographs, among others.
In this validation exercise, the effectiveness of behavioral parameters was tested against an alternative parameter configuration proposed by Pedersen and Chipperfield [55], who suggested to simplify the approach by eliminating the use of the particle's previous best-known position by setting c 1 = 0. The performance of this option was lower than that of the traditional values for behavioral parameters in the examined case. The small improvement in computation time was not found to be relevant because the large computation time is caused by the simulation (based on the extension principle) and not by the topology of the network and the corresponding update of the positions of the swarm. Therefore, the simplification of the topology of the swarm in this problem cannot win computational time and leads to poorer results.
A critical point of which these kind of studies should be aware is the number of data. For instance, the approach of the ANFIS system based on the Wilson, 1974 data [10] is not a safe choice since the ANFIS requires several variables. Furthermore, the use of sophisticated non-linear Muskingum models with either 10 [54] or significantly large [56] calibrated parameters, with only 18 sets of data available, have the risk of overtraining.
A disadvantage of the proposed methodology is the high computational time consumption required and secondly, the selection of the proper weight w 1 , which differentiates the solution. However, (1) the production of a functional uncertainty (2) with parameters that make physical sense and (3) the inclusion of the observed data to a high degree, especially near the maximum value, will be the determinant criteria in the selection of the weight. For instance, in the case of the third example, a large quantity of w 1 leads to an unfunctional uncertainty as well as an irrational values for the parameter x.
An extension of the proposed methodology could be the application of a nonlinear form regarding the Muskingum model. However, two limitations must be taken into account. The first one is that a significant number of data is required for calibration, and the second is the difficulty in the incorporation of the fuzziness. Indeed, by using the linear Muskingum theory, an explicit relation exists to determine the outflow with respect to the parameters K and x (and α for lateral flow), while in the case of the nonlinear relation, a process with many steps is required. Hence, in the first case, the extension principle can be directly applied to determine the selected h-cuts.The more traditional optimization problems are gradient-based and local search algorithms, and hence the final optimum solutions, may depend on the consideration of the initial point. The evolutionary algorithms, such as genetic algorithm (GA) and swarm intelligence, has an advantage that overcomes this difficulty. Such global optimizers are in most cases simple, flexible, and efficient. However, it lacks in-depth understanding of how such algorithms may converge and how quickly they can converge to the global optimum, and hence, this point is a topic for further investigation [57]. The PSO method is selected because of its simplicity and since it has been applied in the examined problem of river routing with satisfactory results ( [21,58,59]). Indeed, [60] suggested that their outcomes confirmed that the PSO algorithm estimated the parameters in a complex nonlinear Muskingum model with high accuracy along with a fast rate of convergence. In addition, in the examined case, which is a calibration problem, we can suppose a possible range of the parameters, and hence, a randomly created initial swarm can be easily constructed. The use of other heuristic algorithms is challenging especially in more complex simulation where a lateral flow exist, and the no linear Muskingum model may be adopted. For instance, the use of hybrid simulated annealing-PSO methods [61]) seem promising alternatives for further work. As it is written in [61], many algorithms introduce ideas that could be easily exported to other methods with varying degrees of compatibility. In [62], these hybrid models were proposed in the case of a water-based algorithm (e.g., [62]). For instance, by a similar way with the PSO, rain-fall optimization algorithm (RFO) has been applied as a new, naturally inspired algorithm based on behavior of raindrops with an effective approach [63].
The utility of the method might be extended in the cases of the rain-runoff models since some conceptual models use the well-known Muskingum method to express the quick flow development (instead of the unit hydrograph theory) and, with other parameters, the slow flow development [1].

Concluding Remarks
A general methodology to assess the parameter of the linear Muskingum model for river routing by considering parameters as fuzzy symmetrical triangular numbers is presented in this study. The expanded linear Muskingum storage method presented by O'Donnel (1985) [11] is used also in cases where a lateral flow occurs. Since the calibration model is an optimization of a case model, the PSO is used since it enables us to use a simulation process for each possible solution that composes the swarm. Hence, for each candidate solution, the extension principle of fuzzy sets and logic is activated in order to determine a fuzzy band of the outflow.
A fitness function is established that expresses the divergence of the produced fuzzy band to include all data. It takes into account the distance between the central values and the observed data, the total width of the fuzzy band and the distance between the maximum value of the outflow, and the right hand of the estimated outflow. A critical point is that the aim of divergence of the produced fuzzy band to include all data (property of inclusion) is in conflict with the aim of minimization of the total width of the produced fuzzy band. A weighted sum of the above goals composes the fitness function, whilst a functional width that contains most of the observed data is applied to select the weights of the fitness function.
Four case studies from the references are used to benchmark the proposed method, including smooth and non-smooth hydrographs with lateral flow and a double peak hydrograph. The last case study includes a complicated, real case study, and it is used for validation purposes. The proposed methodology improve results with the use of a fuzzy estimation with the aim of fuzzy parameters closer to nature; the new methodology increases the safety of the prediction; and furthermore, the produced fuzzy band can include, to a significant degree, the observed data and the output of the existent crisp methodologies even if they include a more complex formulation. Regarding the PSO model, results suggest that 100 iterations of a swarm with 50 members is sufficient to approach the final solution, while the candidates solutions converge to the total optimum.   [11]. The data can be found in the references. In addition, the validation case study is located in the Ebro River Basin (Spain, Figure 14). The gauge stations (automatic flow measures) belong to the SAIH network (Automatic Hydrologic Information System) managed by the Ebro Basin Water Authority. Data from the SAIH network of the Ebro basin can be requested in the following web page: http://www.saihebro.com/saihebro/index.php?url= /autoservicio/inicio.