Optimization of Multi-Quality Water Networks: Can Simple Optimization Heuristics Compete with Nonlinear Solvers?

: Optimal management of water systems tends to be very complex, especially when water quality aspects are included. This paper addresses the management of multi-quality water networks over a ﬁxed time horizon. The problem is formulated as an optimization program that minimizes cost by determining the optimal ﬂow distribution that satisﬁes the water quantity and quality requirement in the demand nodes. The resulted model is nonlinear and non-convex due to bilinear terms in the mass balance equations of blending multi-quality ﬂow. This results in several local optima, making the process of solving large-scale problems to global optimality very challenging. One classical approach to deal with this challenge is to use a multi-start procedure in which off-the-shelf local optimization solvers are initialized with several random initial points. Then the ﬁnal optimal solution is considered as the lowest objective value over the different runs. This will lead to a cumbersome and slow solution process for large-scale problems. In light of the above, this study supports using ultra-fast simple optimization heuristics, which despite their moderate accuracy, can still reach the optimum solution when run many times using a multi-start procedure. As such, the ﬁnal solution from simple optimization heuristics can compete with off-the-shelf nonlinear solvers in terms of accuracy and efﬁciency. The paper presents a simple optimization heuristic, which is specially tailored for the problem and compares its performance with a state-of-the-art nonlinear solver on large-scale systems. The results from the suggested approach compete with nonlinear solvers in terms of both accuracy and efﬁciency. The results show that for large-scale problems (ﬁve-year horizon) the nonlinear solver will require ~4.8 h, making it an impractical tool for near-real-time interactions with the model, while the suggested heuristic can solve the same problem in ~15 minutes. If we further increase the horizon to eight years (i.e., 9528 decision variables), the nonlinear solver will reach the maximum time limit of 1800 s in 80% of the runs, and the runtime will be as high as ~11 h which is compared to ~0.7 h when using the SOH. This efﬁciency of the SOH is obtained by harnessing ultra-fast optimization procedures within an iterative process. Moreover, the results show that SOH can reach good accuracy with maximum relative error in the objective function of 0.45% and relative infeasibility within the prescribed tolerance. Still, the SOH is a local solver (as well as the nonlinear solver, IPOPT), and thus, it cannot guarantee a globally optimal solution. In this study, we compared the performance of the nonlinear solver and the SOH using the same settings of the multi-start procedure (i.e., the same number of initial points). However, owing to the fast solution process of the SOH, we can consider a larger number of initial points, which will increase the chance to reach a globally optimal solution. Further research is needed to examine the performance of the proposed approach in reaching the global optimum compared to state-of-the-art global optimization solvers.


Introduction
In recent years, water scarcity is being acknowledged across the world as an immediate threat to current and future sustainability. The threat is even more relevant in arid and semi-arid regions such as Middle East-North Africa (MENA) region. The MENA region is considered one of the most water-scarce regions in the world [1]. Thus, efficient planning and management for the water resources systems are crucial. Optimization models for water systems planning and management are considered an effective tool to aid in the decision-making process for managing water resources and their related infrastructure [2]. These models can be classified by the function they solve and the constraints they include. Simpler models only include quantity (Q) variables and only solve the quantitative aspect of the system. Some models are quantity-driven (with hydraulic considerations), called Q-H models. These models solve the problem with quantity variables while taking into account hydraulic consideration. When used in operational problems the Q-H models usually solve real-time systems with low temporal resolution such as hours or part of the hour [3]. Other models include the quality aspect of the water system, and these are called Q-C models. These models have two sets of decision variables (flow and concentration), but they do not include hydraulics explicitly. That is, they rely on the inherent assumption that the network is hydraulically feasible [4,5]. Additionally, some models deal with all three aspects; these are called Q-H-C models. Cohen et al. [6][7][8] developed an optimization Modeling the MMWSS problem will result in nonlinear and non-convex optimization models. This is because the quality (concentration) in the network edges, which is unknown, depends on the flow, which is also unknown, and on the quality of the water sources. Specifically, nonlinearity and non-convexity are due to the use of bilinear terms in the solute mass balance equations. For example, when salinity aspects are introduced, the mathematical model turns nonlinear due to the bilinear equation of salt mass balance in each mixing junction of the conveyance network. That is, to calculate the mass of salt in a junction, the concentration variables have to be multiplied by the flow variables causing the bilinear system of equations [16,17]. These characteristics of nonlinearity and non-convexity introduce several local optima, making the process of solving large-scale problems to global optimality very challenging [19]. Therefore, developing efficient solution algorithms for large-scale MMWSS is very desirable.
In previous studies, such as [20][21][22][23], the quality variables were simulated outside the optimization program so that a simpler optimization problem is obtained. For example, in [23] an external module that consists of an Artificial Neural Network (ANN) is used to estimate the water quality in the system, while in [22] water quality requirements are represented by flow constraints. Cai et al.,in [20], developed a multi-quality model for a river basin where salinity presents a major environmental problem. The model has a time horizon for a single year with monthly time steps. To deal with the complexity of the model, a decomposition approach was adapted. The model is broken up into four sub-problems; the first one is the flow balance with fixed salinity. The second is the first, plus salt mass balance. The third is the second, plus the relationships defining the effect of soil salinity on crop yield. The fourth problem is actually like the primary problem with no infeasibilities. Clearly, these approaches that do not simultaneously consider quality and quantity variables result in suboptimal solutions.
Other studies consider the joint optimization of quality and quantity of multi-quality water networks, but many of them were solved using local nonlinear solvers [4,6,16,17,[24][25][26]. For example, [6] solved a model of a multi-quality water supply system that minimizes the cost of operation composed of several components: pumping from sources, transportation, treatment, and yield reduction cost depending on water quality. The model is nonlinear and non-convex and it was solved using a local solver based on the projected gradient method [27]. Ostfeld and Shamir, in [24], solved a model for the optimal operation policy of a multi-quality water supply system, which consists of the following elements: sources, reservoirs, pipes, pumping stations, treatment plants, and consumption nodes. The objective function was to minimize the total operational cost. The solution was obtained Water 2021, 13, 2209 3 of 18 using a local solver of GAMS/MINOS [28]. In a later study, Ostfeld and Shamir [25] developed an integrated model for joint design and management of multi-quality water supply systems. Due to non-smoothness, the r-algorithm [29] was adapted to solve the optimization problem. Zaide [4] solved the MMWSS problem by nonlinear optimization solver using the Generalized Reduced Gradient (GRG) algorithm [30]. The author notes that using this direct approach is limited to small or mid-scale problems. As such, to deal with the large size of the problem, [4] performed a relaxation in the model to reduce its size. It is obvious that this kind of relaxation will often lead to a suboptimal solution for the problem. As opposed to [4], [16] suggested a different reduction scheme that does not change the original model. In [16] all the salinity variables are eliminated using algebraic manipulations such that a smaller optimization problem was obtained. The reduced problem is then solved using a local nonlinear optimization solver which is based on the Interior-Point algorithm [31]. While this reduction scheme is good in reducing the model size (i.e., number of decision variables), it was found to introduce more nonlinearity in the model. Thus, the nonlinear solver might converge to a locally optimal solution.
The studies in [5,32] followed a different approach in which computationally expensive search methods such as Genetic Algorithms are used to solve the nonlinear non-convex mathematical model. In [5] a quality-quantity flow model to optimize water distribution in a regional water supply system with multi-quality sources is developed. The model is a nonlinear optimization model with monthly operation units and a six-month management horizon. The nonlinear mathematical model is solved by a hybrid Genetic Algorithm (GA), in which a GRG algorithm is embedded in the GA to optimize the objective function for each fitness evaluation. Cai et al., [32] adopted a similar hybrid GA approach for solving large-scale nonlinear water management models. The proposed approach solves a linear program for each fitness evaluation. These hybrid approaches are computationally expensive, especially because an inner optimization problem must be solved for each fitness function evaluation of the search algorithm.
Solution efficiency and accuracy are crucial properties when solving MMWSS models. A local nonlinear solver may converge to different solutions depending on different initial points; this makes it untrusted for real-life decision-making. One classical approach to deal with this challenge is to use a multi-start procedure in which off-the-shelf local optimization solvers are initialized with several random initial points. Then the final optimal solution is considered as the lowest objective value over the different runs. However, this will lead to a cumbersome and slow solution process for large-scale problems, which prevents real-time interaction with the model results. Solution efficiency is a desired property for model users since it facilitates real-time discussion of the model's results between multiple stakeholders.
In light of the above, this study supports using ultra-fast, simple optimization heuristics, which despite their moderate accuracy, can still reach the optimum solution when run many times using a multi-start procedure. Namely, since a multi-start procedure should be used for local solution approaches, it might be advantageous to use many fast trials of simple solvers as opposed to using sophisticated slow solvers that have a higher likelihood to reach good solutions. Running a large number of trials of simple local solvers can compensate for its moderate accuracy in reaching good solutions. As such, the final solution from simple optimization heuristics can compete with off-the-shelf nonlinear solvers in terms of accuracy and efficiency. The paper presents a simple optimization heuristic, which is specially tailored for the MMWSS problem, and compares its performance with a state-of-the-art nonlinear solver (IPOPT) on large-scale systems. IPOPT is an open-source nonlinear optimization solver, which uses the interior-point algorithm [31]. Our local optimization heuristic builds on tools of convex optimization to find local optima of the MMWSS problem. The solution approach leverages the ability to efficiently solve linear programs. It iterates between solving a linear program that approximates the original problem and solving a single variable nonlinear optimization problem that reduces the infeasibility of the linear approximation. This local heuristic depends on the initial point that is used to build the first linear approximation. It is therefore typical to initialize the Water 2021, 13, 2209 4 of 18 algorithm with several randomly chosen initial points and take the final point as the one with the lowest objective value over the different runs.
The idea of using sequential convex sub-problems is not new. The famous Sequential Linear Programming (SLP) algorithm [33,34] and Sequential Quadratic Programming (SQP) algorithm [35], utilize this concept. However, these approaches require trust regions to ensure convergence and to limit the iterations to a region where the approximation is valid [36]. Unlike these procedures, our suggested heuristic does not require a trust region. In the suggested approach, the MMWSS problem characteristics are utilized to iterate between the linear approximations and a new (easy-to-solve) nonlinear problem that directs the solution process toward the feasible region of the original nonlinear optimization problem.

The Nonlinear MMWSS Formulation
The water network is comprised of a set of nodes that represent sources, mixing junctions and demand zones, and a set of links that represent the flow of water from one node to the other. Water flow and quality (concentration) in each link are modeled using mass balance equations in each node. For concentration in the links, we consider a nonreactive solute assuming complete mixing at the nodes. The objective is to select the optimal flow distribution in the network that minimizes the overall cost of the management horizon subject to bounding constraints on flow and concentration.
A water network with j ∈ J mixing junctions and l ∈ L links can be represented using nodes-links incidence matrix, A ∈ R |J|×|L| . The matrix has a row for each node and a column for each link such that the only nonzero elements in each row are +1 and −1 for incoming and outgoing links respectively. We also define the sources set S with elements s 1 , . . . , s |S| and the demands set D with elements d 1 , . . . , d |D| . For example, the system in Figure 1, with two sources, two demand nodes, and three mixing junctions, has an incidence matrix A ∈ R 3×7 as given in Table 1  the MMWSS problem. The solution approach leverages the ability to efficiently solve linear programs. It iterates between solving a linear program that approximates the original problem and solving a single variable nonlinear optimization problem that reduces the infeasibility of the linear approximation. This local heuristic depends on the initial point that is used to build the first linear approximation. It is therefore typical to initialize the algorithm with several randomly chosen initial points and take the final point as the one with the lowest objective value over the different runs.
The idea of using sequential convex sub-problems is not new. The famous Sequential Linear Programming (SLP) algorithm [33,34] and Sequential Quadratic Programming (SQP) algorithm [35], utilize this concept. However, these approaches require trust regions to ensure convergence and to limit the iterations to a region where the approximation is valid [36]. Unlike these procedures, our suggested heuristic does not require a trust region. In the suggested approach, the MMWSS problem characteristics are utilized to iterate between the linear approximations and a new (easy-to-solve) nonlinear problem that directs the solution process toward the feasible region of the original nonlinear optimization problem.

The Nonlinear MMWSS Formulation
The water network is comprised of a set of nodes that represent sources, mixing junctions and demand zones, and a set of links that represent the flow of water from one node to the other. Water flow and quality (concentration) in each link are modeled using mass balance equations in each node. For concentration in the links, we consider a nonreactive solute assuming complete mixing at the nodes. The objective is to select the optimal flow distribution in the network that minimizes the overall cost of the management horizon subject to bounding constraints on flow and concentration.
A water network with j J ∈ mixing junctions and l L ∈ links can be represented using nodes-links incidence matrix, The matrix has a row for each node and a column for each link such that the only nonzero elements in each row are 1 + and 1 − for incoming and outgoing links respectively. We also define the sources set S with ele-  Figure 1, with two sources, two demand nodes, and three mixing junctions, has an incidence matrix 3 7 A R × ∈ as given in Table 1 and the sets

Objective Function
The objective function is to minimize the total operation cost. The cost is the sum of two different costs: conveyance cost and cost of water production in sources.
where cost is the total sum of operational costs; CC t is the conveyance cost of year t; PC t is the production cost of year t. T is the number of years in the management horizon.
In this model, the per-unit cost of production is predetermined. Hence, the production cost is calculated as in Equation (2).
where PC t s is the production cost of source s at time t; Q t s is water production in source s at time t; PUPC t s is the per-unit production cost for source s at time t. Generally, the conveyance cost in the pipes is derived from the head-loss, which is calculated by the nonlinear Hazen-Williams formula. Nevertheless, in regional water systems such as those considered herein, it is possible to use the Hazen-Williams formula to obtain an average constant per-unit conveyance cost. For example, this can be done by assuming a typical hydraulic gradient for the short-term operation in the distribution system of 1‰ as given in Equation (3).
where CC t p is conveyance cost in pipe p; PUCC t p is the per-unit conveyance cost in pipe p; Q t p is flow in pipe p; X t p is head-loss in pipe p; w t is the number of pumping hours; η is the efficiency; KW HC t is the energy cost; L p is the length of pipe p; P is a set of pipes with elements p 1 , . . . , p |P| .

Water Conservation Law
Given the representation of the network as a directed graph with nodes-links incidence matrix, A, the water balance for each time step, t, is imposed by the following linear equation system that ensures water conservation at the network nodes: where Q t ∈ R |L|×1 is a vector of flow decision variables in the network links. Since the demanded flow is known, the flow ingoing the demand nodes is also fixed. That is, given the values of the flow to demand zones Q t d ∀d ∈ D, the following linear equality constraints should be imposed: where I D ∈ R |D|×|L| is a given incidence matrix that extracts the demands' links from the vector Q t , namely, each row in I D is corresponding to a demand node where each element in the row takes the value one if the link is associated with the demand node and zero otherwise. Q t D ∈ R |D|×1 is a given vector that collects the known demands Q t d ∀d ∈ D. For the illustrative example in Figure 1, the definition of the matrix I D is given in Table 2. Table 2. Demands-links incidence matrix, I D , for the illustrative example.

The Solute Mass Conservation Law
For each time period, t, the following bilinear (i.e., nonlinear, non-convex) system of equations ensures solute mass conservation at network nodes: where C t ∈ R |L|×1 is a vector of concentration decision variables in the network links; M t ∈ R |L|×1 is a vector of solute mass in the network links; • is the Hadamard (i.e., element-wise) product between two vectors. This bilinear system of equations is the only source of nonlinearity and non-convexity in the MMWSS problem. Since the concentration of the sources is known, the concentration that outgoes source nodes is fixed. That is, given the values of the concentrations in the sources C t s ∀s ∈ S, the following linear equality constraints should be imposed: where I S ∈ R |S|×|L| is a given incidence matrix which extracts the sources' links from vector C t , namely, each row in I S is corresponding to a source node where each element in the row takes the value one if the link is associated with the source node and zero otherwise. C t S ∈ R |S|×1 is a given vector that collects the known concentrations C t s ∀s ∈ S. For the illustrative example in Figure 1, the definition of the matrix I S is given in Table 3. Table 3. Sources-links incidence matrix, I S , for the illustrative example.

Dilution Condition
The model assumes total mixing at all junction nodes, so the salinity in all links leaving a node is the same. This dilution condition is given by the linear system of equations as defined in Equation (8).
where the matrix B ∈ R (|L|−|J|−|S|)×|L| is defined such that each row indicates equal concentration for two outgoing links which share the same node; i.e., each row has only two non-zero elements +1 and −1; when three links leave the same node there are two rows, each with two non-zero elements +1 and −1. In the illustrative example (Figure 1), the matrix B ∈ R 2×7 is given in Table 4. The junction nodes j 1 , j 2 have two outgoing links, therefore each of them contributes one row in the matrix B. Table 4. Dilution constraints matrix, B, for the illustrative example.

Conveyance Capacity Constraints
The model deals with the balance of water and solute mass and does not consider explicitly the hydraulic energy balance of the system. However, to prevent infeasibilities of hydraulic conditions, the flows in the pipes are limited by capacity constraints in Equation (9). The lower bound is set to Q t p,min , which is larger or equal to zero since the flow direction in the pipes is fixed.
where Q t p,min , Q t p,max are the minimum and the maximum flow in pipe p, respectively.

Production Capacity Constraints
In each source, there is a maximum physical capacity in each period (e.g., pumping station capacity) as well as multi-period (i.e., cumulative) water production capacity that represents the overall production in the management horizon (e.g., to accounting for sustainability aspects, which ensure that the source is not overexploited). These constraints are given in Equations (10) and (11), respectively. Note that the lower bound, Q t s,min , should be set to a non-negative value.
where Q t s,min , Q t s,max are the minimum and the maximum flow from source s, respectively; Q total s,max is the total amount of water withdrawal from source s.

Concentration Constraints
Concentration requirements are imposed in demand nodes. As such the concentration of water that reaches the demand node should be within a prescribed minimum and maximum concentration as given in Equation (12). In case the demand node does not have concentration bounds, practical bounds could be added using the known concentration of the sources by setting C t d,min = min where C t d,min , C t d,max are the minimum and the maximum concentration in demand node d, respectively. In the absence of concentration limits in pipes, one can define the following bounds: min

Optimization Problem
The overall optimization model is given in Equations (14)- (22), hereafter we will donate this model as the Q-C model. This optimization problem is a nonlinear and nonconvex optimization program due to the use of bilinear terms in the third constraint. This formulation is given in terms of decision variables in the network links Subject to where f t ∈ R 1×|L| is a given row vector that collects the per-unit costs from Equations (2) and (3); Q t min , Q t max ∈ R 1×|L| are column vectors that collect the flow bounds from Equations (9) and (10); C t min , C t max ∈ R 1×|L| are column vectors that collect the concentration bounds from Equations (12) and (13); Q total S,max ∈ R |S|×1 is a given vector that collects the values Q total s,max ∀s ∈ S.

Simple Optimization Heuristic
As discussed previously, the inclusion of both quantity and quality variables increases the computational complexity of mathematical models because of the nonlinear and nonconvex relationships involved in the solute mass balance. This heightened complexity is dominant in large-scale and long-term models. In view of this, it is a necessity to develop efficient solution methods to solve Q-C models. In this section, we propose and examine the performance of a Simple Optimization Heuristic (SOH) to solve quantityquality management models. The basic idea consists of replacing the bilinear terms with first-order Taylor approximation to obtain a search direction. Then, this direction is used to formulate a simple nonlinear problem in which the infeasibility of the solute mass balance is minimized. The process is reiterated by successively solving linear and nonlinear optimization problems until convergence is obtained.
We denote the convex (polyhedral) feasible region of the constrains in Equations (15), (17)-(22) as F , hence, after substitution of M t , the optimization problem could be written in compact forms as: Subject to To perform the linearization, the formulation of the Q-C model in Equations (23)-(25) remains the same except for the second constraint, which is the source of the nonlinearity and non-convexity. Thus, the linear approximation of the Hadamard (i.e., element-wise) product between two vectors is obtained based on the first order, bi-dimensional Taylor approximation as shown in Equations (26).
where Q t,k and C t,k are the approximation points at iteration k. Using this convex approximation, the sequential linear approximation subproblem for iteration k can be formulated as: Subject to This linearization, however, introduces a convex approximation of the feasible region, which might be smaller than the original feasible region of the optimization problem in Equations (23)- (25). This can lead to infeasibility in the linear subproblems. This issue of infeasibility in the subproblems is also encountered in SQP and other iterative techniques [37,38]. To deal with the feasibility of subproblems, we propose the following relaxation that introduces one penalized artificial variable in Equations (30) and (32). Clearly, the relaxed problem is always feasible (when the original problem has a nonempty feasible region), and the artificial variable v corresponds to the maximum deviation in Equation (29).
Subject to where β is a predetermined penalty parameter, and v is an artificial variable that ensures the feasibility of the constraint in Equation (32).
For large values of β, when the feasible region is nonempty, the solutions to the relaxed problem are solutions to the linear approximation problem in Equations (27)- (29), and the optimal value of v will be zero [39]. Nonetheless, the feasibility of the linear approximation does not guarantee the feasibility of the original nonlinear problem in Equations (23)- (25). To move to the feasible region of the original nonlinear problem, another subproblem that reduces the nonlinear infeasibility between iterations is solved. Given the current approximation points, ( Q t,k , C t,k ), and given the current solution of the relaxed problem, denoted as ( Q t,k lin , C t,k lin ), the following single variable nonlinear optimization problem should be solved: where ε is predefined small acceptable tolerance; Q t (α) is a point on the search direction as defined in Equation (34); Rel In f (α) is the relative infeasibility and it is defined in Equation (36).
where · is the Euclidean norm and the notation [ · ] ∀t indicates a vector the collects all vectors for each t = 1...T.
The relative infeasibility is interpreted as the violation in the solute mass balance relative to the solute mass that flows in the network. The relative measure of violation is considered to account for the magnitude of the solute mass in the network nodes. Consider, for example, a node with one ingoing pipe and one outgoing pipe where the flow in the two pipes is 100 million cubic meters per year, the salinity in each pipe is 200 mg-CL per Liter, then a violation of 1 kg in salt mass is negligible compared to the total salt mass, which is 2 × 10 7 kg.
It is noteworthy that any solution of the nonlinear problem in Equation (33) will be feasible for the set of constraints in F . This is because for 0 ≤ α ≤ 1, the values of Q t (α) and C t (α) are convex combinations of feasible points as defined in Equations (34) and (35). In light of the above, the SOH algorithm is stated in Algorithm 1.

Given:
Randomly generated initial point Q t,0 , C t,0 ) large positive penalty parameter β; small positive tolerance ε; k = 0 Repeat 1. Solve Linear Approximation. Set the value of (Q t,k lin , C t,k lin ) to the solution of the optimization problem in Equations (30)-(32). 2. Solve Nonlinear Problem. Set the value of (Q t,k+1 , C t,k+1 ) to the solution of the optimization problem in Equation (33). 3. Update Iteration. k = k + 1 Until Stopping criterion is satisfied.

Initialize
The SOH above is a local optimization heuristic; as such, the final point found may depend on the initial starting point. Therefore, the algorithm is initialized with several random initial points, where the best run among all runs is taken as the solution to the optimization problem.

Solve Linear Approximation
The obtained approximation is a linear problem, which could be solved with any offthe-shelf optimization solver. In this study, we used CPLEX [40] as the linear programming solver.

Solve Nonlinear Problem
The obtained problem in this step is nonlinear non-convex, but it is easy to solve because it only involves a single variable, which is bounded between zero and one. It is possible to solve this problem by grid search in the variable bounds or using dedicated nonlinear single variable methods. For example, in this study, we used the MATLAB solver fminbnd, which implements the golden section method [41].

Stopping Criterion
One reasonable stopping criterion is that the distance (Euclidean norm) between two successive points during the iteration process is less than the tolerance. However, we found that the SOH might oscillate between two points, which are not necessarily successive. To prevent oscillation, the algorithm stops if the current solution is within relative distance ε from any previously obtained solution in the iteration process. This condition is defined in Equation (37).

Results and Discussion
In this section, we analyze the SOH performance to show that it can compete with state-of-the-art nonlinear solvers when it runs multiple times using a multi-start procedure. The layout of the Modena network (Figure 2), which is available in [42], is used to compare the performance of the SOH against the nonlinear solver. For solving the nonlinear Q-C model, the IPOPT solver was used [31]. IPOPT is an open-source nonlinear optimization solver, which uses the interior-point algorithm. The experiment is programmed in MATLAB using the YALMIP modeling language [43], the code is available in GitHub Repository (https://github.com/SWSLAB/SOH/ (Accessed on 12 August 2021). The SOH algorithm is used with β = 10 6 , ε = 0.001 for all runs. IPOPT is used with default tuning parameters except for the maximum iteration limit, which is increased to one million iterations. In the absence of other studies that compare the performance of optimization algorithms on the Q-C model, we only compared the SOH to IPOPT. Future research can build on the code and data in the GitHub Repository to test the performance of other optimization heuristics compared to the proposed SOH. distance ε from any previously obtained solution in the iteration process. This condition is defined in Equation (37).

Results and Discussion
In this section, we analyze the SOH performance to show that it can compete with state-of-the-art nonlinear solvers when it runs multiple times using a multi-start procedure. The layout of the Modena network (Figure 2), which is available in [42], is used to compare the performance of the SOH against the nonlinear solver. For solving the nonlinear Q-C model, the IPOPT solver was used [31]. IPOPT is an open-source nonlinear optimization solver, which uses the interior-point algorithm. The experiment is programmed in MATLAB using the YALMIP modeling language [43], the code is available in GitHub Repository (https://github.com/SWSLAB/SOH/ (Accessed on 12 August 2021). The SOH algorithm is used with for all runs. IPOPT is used with default tuning parameters except for the maximum iteration limit, which is increased to one million iterations. In the absence of other studies that compare the performance of optimization algorithms on the Q-C model, we only compared the SOH to IPOPT. Future research can build on the code and data in the GitHub Repository to test the performance of other optimization heuristics compared to the proposed SOH.  We used the same network layout of the Modena network as in [42], but some of the configurations were randomly generated to create synthetic problems for testing the optimization solvers. For example, we have defined 30 sources and 50 demands nodes in the network that were randomly chosen. The demands, bounds, and costs were also randomly generated along the management horizon. The salinity levels in the source nodes are chosen such that the salinity is high for sources with low production cost, this characteristic will create a more challenging problem because of the tradeoff between meeting the quality constraints and minimizing the production cost. It is noteworthy that because the layout and pipe directions are predetermined in the Modena network, the feasibility of randomly generated problems is not guaranteed. For example, if a node that has only outgoing pipes is selected as a demand node, the problem will be infeasible, or if the total capacity of the links ingoing a demand node is below the required demand, the problem will be infeasible. In this study, we only consider randomly generated problems, which have feasible solutions. The configurations of the solved problems are available in https://github.com/SWSLAB/SOH/ (accessed on 12 August 2021).
To study the impact of model size on computation time and accuracy, a series of runs are made in which the number of years is increased from one to eight. For a single year, the nonlinear Q-C model (Equations (14)- (22)) of the Modena network includes 1191 decision variables. The size of the optimization problem will linearly increase with the management horizon. For T = 8, the nonlinear optimization model will have 1191 × 8 = 9528 decision variables. Since both the SOH and IPOPT are local optimization solvers, we adapted a multi-start procedure in which the optimization problem is solved from 25 starting points, which are randomly drawn from uniform probability distribution defined by the decision variables bounds. At the end of the process, each solver will end up with a "bank" of 25 solutions for the problem. The selected solution from each solver is the one that is feasible and minimizes the objective function.
Before we start the comparison, Figure 3 demonstrates the iteration process in the SOH. Figure 3 shows the solution progress for the optimal run with T = 1. In this run, the relative infeasibility is more than 1% up to iteration 10, and the objective function drops drastically in these first 10 iterations from 525 M$ to 300 M$. The relative infeasibility is within the tolerance ε = 0.1% starting from iteration 15, while the stopping criterion in Equation (37)  To study the impact of model size on computation time and accuracy, a series of runs are made in which the number of years is increased from one to eight. For a single year, the nonlinear Q-C model (Equations (14)- (22)) of the Modena network includes 1191 decision variables. The size of the optimization problem will linearly increase with the management horizon. For 8 T = , the nonlinear optimization model will have 1191 × 8 = 9528 decision variables. Since both the SOH and IPOPT are local optimization solvers, we adapted a multi-start procedure in which the optimization problem is solved from 25 starting points, which are randomly drawn from uniform probability distribution defined by the decision variables bounds. At the end of the process, each solver will end up with a "bank" of 25 solutions for the problem. The selected solution from each solver is the one that is feasible and minimizes the objective function.
Before we start the comparison, Figure 3 demonstrates the iteration process in the SOH. Figure 3 shows the solution progress for the optimal run with 1 T = . In this run, the relative infeasibility is more than 1% up to iteration 10, and the objective function drops drastically in these first 10 iterations from 525 M$ to 300 M$. The relative infeasibility is within the tolerance 0.1% ε = starting from iteration 15, while the stopping criterion in Equation (37)   To compare the efficiency of the SOH and IPOPT, Figure 4 shows the average solver time for different problems with increasing size (number of decision variables ranges from To compare the efficiency of the SOH and IPOPT, Figure 4 shows the average solver time for different problems with increasing size (number of decision variables ranges from 1191 to 5955), which are obtained by increasing the problem horizon from one year to five years. The results show that the solution time of the nonlinear solver is increasing exponentially with the problem size while the SOH solves the problems with a moderate increase in the solution time. For example, for a Q-C model with a horizon of five years, the solution time of the multi-start procedure using IPOPT is 25 × 693 = 17,325 s (~4.81 h) while using the SOH is 25 × 34 = 850 s (~0.24 h).  Further increasing the problem horizon will create larger problems in which the nonlinear solver will require a longer solution time. It is typical when using nonlinear solvers to limit the number of iterations or to limit the solver time. In Figure 5, the problem is further increased to 9528 decision variables by increasing the problem horizon to eight Further increasing the problem horizon will create larger problems in which the nonlinear solver will require a longer solution time. It is typical when using nonlinear solvers to limit the number of iterations or to limit the solver time. In Figure 5, the problem is further increased to 9528 decision variables by increasing the problem horizon to eight years. Nonetheless, to examine the practicality of the nonlinear solver, the IPOPT solver time limit is set to 1800 s (0.5 h). While the exponential increase in IPOPT solver is not maintained after T = 5 (due to limiting the solver time), the difference between the solution time of the SOH and IPOPT is still evident, since the SOH shows an almost linear increase in solution time with respect to the problem horizon. For a large problem with 9528 decision variables, the multi-start procedure will require about 11 h using IPOPT, but only 0.7 h using the SOH. Moreover, for the largest problem, only two out of the 25 runs in IPOPT were declared successful (reached first-order optimality conditions), the rest of the runs either failed to converge to local optima or reached the maximum time limit. Figure 5 also shows the successful runs of IPOPT out of 25 for increasing problem horizon (i.e., increasing problem size).  Figure 6 shows the box-and-whiskers plots for IPOPT and SOH solution time, respectively. The boxes show the interquartile ranges with the red crossbar depicting the median and the red circle depicting the mean. The whiskers show the minimum and maximum values. The results show that while IPOPT solver can reach the solution fast when the problem is small in most of the runs, the solution time becomes impractical for large problems as most of the runs reach the practical limit of 0.5 h per run. For example, when 8 T = , 20 out of 25 of the runs (i.e., 80%) reach the maximum time limit of the solver. On the other hand, the worst solution time of the SOH is 155 s.   Figure 6 shows the box-and-whiskers plots for IPOPT and SOH solution time, respectively. The boxes show the interquartile ranges with the red crossbar depicting the median and the red circle depicting the mean. The whiskers show the minimum and maximum values. The results show that while IPOPT solver can reach the solution fast when the problem is small in most of the runs, the solution time becomes impractical for large problems as most of the runs reach the practical limit of 0.5 h per run. For example, when T = 8, 20 out of 25 of the runs (i.e., 80%) reach the maximum time limit of the solver. On the other hand, the worst solution time of the SOH is 155 s.   Figures 7 and 8 show the detailed solver time histograms of the smallest and largest problem for IPOPT and SOH, respectively. In the smallest problem with T = 1, 20 out of 25 the runs in IPOPT required a short run time of 6.6 s, however for the large problem with T = 8, 20 out of the 25 runs reach the time limit. Figure 8 shows the solver time in the SOH for the small problem is less than 3 s in all runs, while for the large problem the solver time is almost uniformly distributed between 46-154.7 s. In light of the above, it is clear that the SOH solves large problems much faster than the nonlinear solver, but the question to tackle next is: is it as accurate?
To answer this question we define the error between the best objective values obtained by the multi-start procedure using IPOPT and SOH as given in Equation (38). Figure 9 shows the objective error, as defined in Equation (38), for increasing problem horizon (i.e., increasing problem size). The SOH shows good accuracy for all the solved problems. The worst-case error is as low as 0.45%. In fact, for 5 T  the error is negative, indicating that the optimal objective value reached by the SOH is better than the optimal value obtained in IPOPT. For example, when 7 T = the SOH optimal value is 0.32% better than the IPOPT solution. However, in general, the error is relatively low (regardless of its sign) and the two solutions could be considered identical for practical purposes.  In light of the above, it is clear that the SOH solves large problems much faster than the nonlinear solver, but the question to tackle next is: is it as accurate?
To answer this question we define the error between the best objective values obtained by the multi-start procedure using IPOPT and SOH as given in Equation (38). Figure 9 shows the objective error, as defined in Equation (38), for increasing problem horizon (i.e., increasing problem size). The SOH shows good accuracy for all the solved problems. The worst-case error is as low as 0.45%. In fact, for 5 T  the error is negative, indicating that the optimal objective value reached by the SOH is better than the optimal value obtained in IPOPT. For example, when 7 T = the SOH optimal value is 0.32% better than the IPOPT solution. However, in general, the error is relatively low (regardless of its sign) and the two solutions could be considered identical for practical purposes. In light of the above, it is clear that the SOH solves large problems much faster than the nonlinear solver, but the question to tackle next is: is it as accurate?
To answer this question we define the error between the best objective values obtained by the multi-start procedure using IPOPT and SOH as given in Equation (38). Figure 9 shows the objective error, as defined in Equation (38), for increasing problem horizon (i.e., increasing problem size). The SOH shows good accuracy for all the solved problems. The worst-case error is as low as 0.45%. In fact, for T > 5 the error is negative, indicating that the optimal objective value reached by the SOH is better than the optimal value obtained in IPOPT. For example, when T = 7 the SOH optimal value is 0.32% better than the IPOPT solution. However, in general, the error is relatively low (regardless of its sign) and the two solutions could be considered identical for practical purposes. To judge the quality of the solution it is not enough to examine the objective function, one should also examine the constraints violation of the problem. Figure 10 shows the relative infeasibility in the solute balance as defined in Equation (39).
where * * Q , t t C are the optimal decision values as obtained from the multi-start procedure using the SOH. The results in Figure 10 show that the relative infeasibility is below the acceptable tolerance of the SOH ( ) for all the solved problems. Figure 10. Relative infeasibility using the SOH. Figure 11 shows the maximum solute balance infeasibility as defined in Equation (40). The results in Figure 11 show that while maximum infeasibility is generally increasing with problem size, the maximum infeasibility is as low as 8.13 kg per year, which is a very negligible imbalance for practical regional water supply systems that deliver millions of cubic meters per year. To judge the quality of the solution it is not enough to examine the objective function, one should also examine the constraints violation of the problem. Figure 10 shows the relative infeasibility in the solute balance as defined in Equation (39).
where Q t * , C t * are the optimal decision values as obtained from the multi-start procedure using the SOH. The results in Figure 10 show that the relative infeasibility is below the acceptable tolerance of the SOH ( ε = 0.1%) for all the solved problems. To judge the quality of the solution it is not enough to examine the objective function, one should also examine the constraints violation of the problem. Figure 10 shows the relative infeasibility in the solute balance as defined in Equation (39).
where * * Q , t t C are the optimal decision values as obtained from the multi-start procedure using the SOH. The results in Figure 10 show that the relative infeasibility is below the acceptable tolerance of the SOH ( ) for all the solved problems. Figure 10. Relative infeasibility using the SOH. Figure 11 shows the maximum solute balance infeasibility as defined in Equation (40). The results in Figure 11 show that while maximum infeasibility is generally increasing with problem size, the maximum infeasibility is as low as 8.13 kg per year, which is a very negligible imbalance for practical regional water supply systems that deliver millions of cubic meters per year.  Figure 11 shows the maximum solute balance infeasibility as defined in Equation (40). The results in Figure 11 show that while maximum infeasibility is generally increasing with problem size, the maximum infeasibility is as low as 8.13 kg per year, which is a very negligible imbalance for practical regional water supply systems that deliver millions of cubic meters per year.
Solution efficiency is a key element for the practical implementation of Q-C models. Often, Q-C optimization models results will be used within Decision Support Systems to facilitate active discussion between stakeholders. A slow solution process will hinder realtime stakeholders' interaction with the model in which scenarios (i.e., changing constraints or different parameters) are refined rapidly. In practice, it is hard to define the scenarios in advance; usually, the scenarios are refined during an active discussion between the stakeholders (partly by seeing how the model reacts). Near-real-time interaction with the model is very important, and it could be done only if the model can be solved in a reasonable time. Thus, a fast solution process can contribute to expediting stakeholders' engagement with Decision Support Systems. Our results indicate that using simple optimization heuristics combined with a multi-start optimization procedure can be an alternative to off-the-shelf nonlinear solvers when solving multisource quantity-quality models. The results from the suggested approach compete with nonlinear solvers in terms of both accuracy and efficiency. The results show that for large-scale problems (five-year horizon) the nonlinear solver will require ~4.8 h, making it an impractical tool for nearreal-time interactions with the model, while the suggested heuristic can solve the same problem in ~15 minutes. If we further increase the horizon to eight years (i.e., 9528 decision variables), the nonlinear solver will reach the maximum time limit of 1800 s in 80% of the runs, and the runtime will be as high as ~11 h which is compared to ~0.7 h when using the SOH. This efficiency of the SOH is obtained by harnessing ultra-fast optimization procedures within an iterative process. Moreover, the results show that SOH can reach good accuracy with maximum relative error in the objective function of 0.45% and relative infeasibility within the prescribed tolerance. Still, the SOH is a local solver (as well as the nonlinear solver, IPOPT), and thus, it cannot guarantee a globally optimal solution. In this study, we compared the performance of the nonlinear solver and the SOH using the same settings of the multi-start procedure (i.e., the same number of initial points). However, owing to the fast solution process of the SOH, we can consider a larger number of initial points, which will increase the chance to reach a globally optimal solution. Further research is needed to examine the performance of the proposed approach in reaching the global optimum compared to state-of-the-art global optimization solvers.  Solution efficiency is a key element for the practical implementation of Q-C models. Often, Q-C optimization models results will be used within Decision Support Systems to facilitate active discussion between stakeholders. A slow solution process will hinder realtime stakeholders' interaction with the model in which scenarios (i.e., changing constraints or different parameters) are refined rapidly. In practice, it is hard to define the scenarios in advance; usually, the scenarios are refined during an active discussion between the stakeholders (partly by seeing how the model reacts). Near-real-time interaction with the model is very important, and it could be done only if the model can be solved in a reasonable time. Thus, a fast solution process can contribute to expediting stakeholders' engagement with Decision Support Systems. Our results indicate that using simple optimization heuristics combined with a multi-start optimization procedure can be an alternative to off-the-shelf nonlinear solvers when solving multisource quantity-quality models. The results from the suggested approach compete with nonlinear solvers in terms of both accuracy and efficiency. The results show that for large-scale problems (five-year horizon) the nonlinear solver will require~4.8 h, making it an impractical tool for near-real-time interactions with the model, while the suggested heuristic can solve the same problem iñ 15 minutes. If we further increase the horizon to eight years (i.e., 9528 decision variables), the nonlinear solver will reach the maximum time limit of 1800 s in 80% of the runs, and the runtime will be as high as~11 h which is compared to~0.7 h when using the SOH. This efficiency of the SOH is obtained by harnessing ultra-fast optimization procedures within an iterative process. Moreover, the results show that SOH can reach good accuracy with maximum relative error in the objective function of 0.45% and relative infeasibility within the prescribed tolerance. Still, the SOH is a local solver (as well as the nonlinear solver, IPOPT), and thus, it cannot guarantee a globally optimal solution. In this study, we compared the performance of the nonlinear solver and the SOH using the same settings of the multi-start procedure (i.e., the same number of initial points). However, owing to the fast solution process of the SOH, we can consider a larger number of initial points, which will increase the chance to reach a globally optimal solution. Further research is needed to examine the performance of the proposed approach in reaching the global optimum compared to state-of-the-art global optimization solvers.
Funding: This research was funded by the Israeli Water Authority.