An Iterated Local Search Algorithm for Multi-Period Water Distribution Network Design Optimization

Water distribution networks consist of different components, such as reservoirs and pipes, and exist to provide users (households, agriculture, industry) with high-quality water at adequate pressure and flow. Water distribution network design optimization aims to find optimal diameters for every pipe, chosen from a limited set of commercially available diameters. This combinatorial optimization problem has received a lot of attention over the past forty years. In this paper, the well-studied single-period problem is extended to a multi-period setting in which time varying demand patterns occur. Moreover, an additional constraint—which sets a maximum water velocity—is imposed. A metaheuristic technique called iterated local search is applied to tackle this challenging optimization problem. A full-factorial experiment is conducted to validate the added value of the algorithm components and to configure optimal parameter settings. The algorithm is tested on a broad range of 150 different (freely available) test networks.


Access to Drinking Water
A safe, adequate, and accessible supply of drinking water is one of the basic necessities of any human being.According to a study from the World Health Organization (WHO), improving access to safe potable water not only reduces the overall risk of disease, but can also be an effective part of poverty reduction strategies [1].
The most efficient and effective way to transport drinking water is through a network of pipelines.The first water distribution networks date back to the ancient Greeks, the first civilization to use underground pipes for water supply [2].In 2010, water distribution networks seem omnipresent, but nonetheless, 11% of the world's population does not have access to an improved drinking water source (i.e., a drinking water source that is protected from outside contamination), and only 54% of the world's population has piped drinking water on premises [3].
A large majority of all water distribution networks are maintained by water distribution companies.These operate in a dynamic, evolving setting as a result of three main evolutions: changes in water usage: population growth in certain parts of the world, combined with increasing living standards (houses equipped with sanitary installations, washing and dish washing machines, etc.) leads to a rising water consumption.A trend in the opposite direction is the growing consciousness of the scarcity of natural resources, which leads to a more rational use of water.Intermittent water supply also poses a challenge, since this is not only inconvenient for end-users, but also affects water quality and leads to a higher probability of pipe breaks and leakages.Development of dual systems: around 80% of the water supplied to residences is used for activities other than human consumption, such as sanitary services and landscape irrigation.This urges source separation and dual systems that convey water of different levels of quality according to their end use: high quality water is delivered in smaller pipes for consumption, whereas the so-called "grey-water systems" transport recycled water for lower quality needs.The use of high-quality drinking water as firewater will decrease, which reduces the need for over-sized water distribution pipes [4].Urban development and renewal: urban growth, increased attention for livability, and sustainable cities are some of the key drivers for urban (redevelopment), which also includes the (re)design of water distribution networks.

Decision Support Tools
The trends mentioned in the previous paragraph force water distribution companies to rethink their network configuration by modifying or expanding their current distribution systems.Construction or reorganization of these networks requires major investments.Hence, an efficient layout, design, and planning of water distribution networks is of crucial importance.Decision support systems are therefore potentially very useful in supporting water distribution companies in these decision making processes.This paper aims to develop a decision support system that supports network design decisions.These are decisions related to the optimal type of pipe connecting the supply, demand, and junction nodes in the distribution network.Such decisions are made when pipes need to be (re)placed or rehabilitated due to aging, changing demographic settings, network redesign, or network expansion.
Traditionally, such design decisions are made based on expert experience.When networks increase in size, however, it is doubtful that the applied rules of thumb will lead to optimal decisions.Hence, the use of decision support systems (DSS) could result in significant savings.Decision making in this area could benefit from the use of such a DSS in two ways.First, it may improve the efficiency with which a user makes a decision, since alternative feasible solutions are generated faster than using a manual process.Second, the effectiveness of the decision process may also be improved, as for problems with a huge solution space, algorithms generally find better solutions than human brainpower.

Water Distribution Network Design Optimization
As stated in the previous paragraph, the focus of this paper is finding optimal network designs.Figure 1 illustrates the water distribution network design optimization problem.A toy network that contains 60 demand nodes, 1 water supply reservoir and 83 water distribution pipes has to be designed.In the network design phase, network topology is assumed given, as can be seen in Figure 1a.The aim is to find the best suited diameter for every pipe in this network.Assume that there are 16 possible pipe diameters, ranging from 20 mm up to 1000 mm.Since all possible diameters can be selected for every pipe, 16 83 possible configurations (or combinations of diameters) could be designed.Finding the lowest-cost design (or the optimal diameter size for every pipe) while respecting hydraulic laws and customer requirements is not straightforward-even for such a small network.The right side of the figure illustrates the optimized network, where diameters are assigned to every pipe.Thicker lines represent larger diameters in the figure.

State of the Art
Water distribution network design optimization has been an active research field for over four decades.A large majority of papers in this area focus on the single-period, single-objective, gravity-fed design optimization problem.Early work applies more traditional mathematical programming methods, such as linear programming (LP), originally applied by Alperovits and Shamir [5] and extended by Quindry et al. [6] and Kessler and Shamir [7]; or non-linear programming (NLP) in Shamir [8], El-Bahrawy and Smith [9], Fujiwara and Khang [10], and Duan et al. [11].
In the last three decades, considerable attention has been given to the development of (meta)heuristic algorithms to solve the water distribution network design optimization problem.These techniques use a hydraulic solver (usually EPANET 2.0 [12]) to solve the hydraulic equations externally, while the heuristic manipulates the selected pipe types.Local search metaheuristics operate on a single solution and try to improve this solution in small steps.Loganathan [13] and Cunha and Sousa [14] applied simulated annealing, and Cunha et al. [15] applied a tabu search.Constructive metaheuristics iteratively construct solutions, rather than improving complete solutions.Ant colony optimization, for example, is applied by Maeier et al. [16], and ant systems by Zecchin et al. [17].However, population-based metaheuristics are the most popular in this field.These algorithms operate on a population of solutions and combine them into new solutions.Among this class of metaheuristics are genetic algorithms in Dandy et al. [18], Gupta et al. [19], and Bi et al. [20], scatter search in Lin et al. [21], and differential evolution in Vasan and Simonovic [22].A more recent paper by Marchi et al. [23] does not develop a new method, but introduces a methodology to compare various evolutionary algorithms for water distribution network design.The authors show that algorithm performance depends on the problem characteristics and the number of function evaluations.Moreover, the importance of correct calibration is stressed.A more detailed overview of the state-of-the-art in this research area is given in De Corte and Sörensen [24].Some research has already been dedicated to the extensions formulated in this paper.Farmani et al. [25], for example, also perform a multi-period optimization, but formulate the design problem as a multi-objective optimization problem.They define maximizing network reliability as a second objective next to cost minimization, and apply a multi-objective evolutionary algorithm to optimize the Anytown network.Gupta et al. [19] and Bragalli et al. [26] also apply a velocity constraint on the water flowing through the distribution pipes.The first apply a genetic algorithm on six (small) instances while the latter use mathematical programming on bigger, closer-to-reality instances.

Contributions
The contribution of this paper is twofold.A first contribution is that the well-studied single-period, pressure-constrained model is extended, and therefore, closer to reality.This work adjusts this model in two ways.The single-period setting is extended to a multi-period set-up, in which every demand node has a 24 h water demand pattern.Moreover, an additional constraint-which imposes a limit on the maximal velocity of water through the pipes-is introduced.These additions are formulated in Section 2.
A second contribution is related to algorithm composition, configuration, and testing.The developed algorithm is developed so that only components that show a significant added value are included in its final configuration.A clear and systematic test procedure is used to determine these components.Moreover, the algorithm contains some parameters that need to be configured.A full-factorial experiment is performed to determine the optimal level for each parameter.This allows solid conclusions to be drawn regarding the effects of algorithm configuration and parameter settings on solution quality and computing time.More details are given in Section 3.3.In terms of algorithm testing, the algorithm is tested on a broad set of challenging HydroGen test networks.HydroGen (De Corte and Sörensen [27]) is a tool to automatically generate water distribution test networks of varying size and characteristics.By using an extensive set of test networks, we are able to draw robust conclusions on the performance of the algorithm.Moreover, these HydroGen networks are free to use and available online, which should foster easy comparison of results.The experimental results are given in Section 4.

Model Formulation
A water supply system consists of a transmission and a distribution system.The focus in this paper is on the distribution network.Moreover, the considered networks are all gravity-fed, which means that the water is supplied by gravitational forces and no pumps are needed.Water quality considerations are not taken into account.This is in line with previous research in this area, where a majority of the work focuses on the well-defined single-period, single-objective, gravity-fed water distribution network optimization problem.As stated above, This research extends this problem in two ways: (1) by imposing an additional constraint (i.e., a maximal velocity of the water in the distribution pipes); and (2) by applying time-varying diurnal demand patterns.This problem will therefore be formulated as the single-objective, multi-period, gravity-fed water distribution network design (WDND) optimization problem with pressure and velocity constraints.

Mathematical Formulation
To formulate the water distribution network design problem as a mathematical model, the water distribution network is represented as a connected graph with a set of water demand or supply nodes N = {n 1 , n 2 , . ..} and a set of water distribution pipes P = {p 1 , p 2 , . ..}.The set of closed loops in this graph is denoted L = {l 1 , l 2 , . ..}.The objective of this WDND optimization problem is to minimize the total investment cost (TIC) of the network design by selecting the optimal pipes from a set of available pipe types.The cost of an individual pipe depends on the type t that is chosen for this pipe from the list of commercially available types T = {t 1 , t 2 , . ..}.A pipe's type determines both its diameter and the material from which it is made, which in turn determine its hydraulic properties.If the cost per meter of a pipe p of type t is represented by IC t , and the length of pipe p is represented as L p , the objective function of the multi-period, gravity-fed water distribution network design optimization problem can be written as: where x p,t is the binary decision variable that determines whether pipe p is of type t (x p,t = 1) or not (x p,t = 0).Since the network layout is assumed to be given, L p is known.The investment cost IC t is also given for every commercially available pipe type.This objective function expresses the minimization of the total network cost.
The objective function is limited by physical mass and energy conservation laws, by minimum head requirements in the demand nodes, and by maximal water velocities in the pipes, for every time period τ ∈ T .
The mass conservation law must be satisfied for each node n ∈ N = {n 1 , n 2 , . ..} in every time period τ ∈ T .This law states that the volume of water flowing into a node in the network per unit of time must be equal to the volume of water flowing out of this node.Let Q (n 1 ,n),τ represent the amount of water flowing from node n 1 to node n in time τ, and let WS n,τ be the external water supply and WD n,τ the external water demand of node n in period τ (all expressed in m 3 /s); then, the following should hold: Furthermore, for each closed loop l ∈ L = {l 1 , l 2 , . ..}, the energy conservation law must be satisfied for every time period τ.This law states that the sum of pressure drops in a closed loop is zero.Pressure drops (or head losses) in piping systems are mainly caused by wall shear in pipes.The energy conservation law can be stated as: with ∆H p,τ representing the head loss in pipe p in time τ.Head losses in the pipes of the network are approximated using Hazen-Williams equations, with the parameters set to the values used by EPANET 2.0, the hydraulic solver used in this paper: Consequently, Equation (3) can be rewritten as: In this equation, y p,τ is the sign of Q p,τ , which is the amount of water flowing through pipe p (in m 3 /s) in time τ.This sign incorporates changes in the water flow direction relative to the defined flow directions.L p is the pipe length (in m), C t is the Hazen-Williams roughness coefficient of pipe type t (unit-less), and D t is the diameter of pipe type t (in m).Parameters D t and C t are determined by the type of pipe and are assumed given for each available type.Note that y p,τ and Q p,τ represent alternative formulations of Q (n 1 ,n 2 ),τ if pipe p connects n 1 and n 2 .
Minimum pressure head requirements exist for every (demand) node n ∈ N at every time period τ ∈ T .Let H n,τ be the pressure head in node n (in m) at time τ, and H min n,τ the minimum pressure head in node n (in m) for time τ.This constraint can be represented as: A maximal water velocity limit is present for every pipe p ∈ P. The velocity of the water flowing through pipe p in time period τ, v p,τ cannot exceed the imposed maximal velocity v max p,τ : with: Q p,τ is the amount of water flowing through pipe p in time τ, x p,t the binary decision variable, and D t the diameter of pipe type t, which is given for each available pipe type.
Moreover, every pipe p should get exactly one type t assigned: From this mathematical formulation, it is clear that the WDND optimization problem is a mixed-integer, non-linear optimization problem, which puts this problem out of reach of exact linear or mixed-integer programming solvers like CPLEX or Gurobi.Moreover, this problem is a combinatorial optimization problem, since its decision variables (i.e., the type of each pipe) are discrete.The problem has also been proven to be NP-hard (nondeterministic polynomial time hard) by Yates [28].

Extension to Multi-Period Setting
The multi-period extension adds an extra layer of complexity to the WDND optimization problem, which is evident by the fact that all the flow-related variables receive an extra index for time period τ.An easy and intuitive solution to cope with this added complexity is to reduce the dynamic, multi-period setting to a static, single-period setting.Two different approaches to do so-maxPeriod and maxDemand-are compared below, and it is shown that such reductions are not possible.maxPeriod A first simplification strategy is to run a single-period model for the time period in which the total network demand is at its highest.This approach to reducing the multi-period problem to a single-period setting fails.The reason is that some areas will generally have a low water demand during the calculated maximal period, which will lead to small diameters for this area when optimizing the network using the single-period model.When this optimized network design is simulated in the multi-period setting, pressure constraints will be violated in these areas.When water demand increases in this area in other time periods, the small diameters will lead to high pressure drops, and consequently, a violation of the minimal pressure head constraint.This is illustrated in Figure 2. In (a), the network is optimized with the multi-period algorithm described in Section 3.During the multi-period simulation, no pressure deficits occur.In (b), the network is optimized according to the maxPeriod strategy.This will lead to a lower cost network, but during the multi-period simulation, some nodes will experience a pressure deficit.These nodes are visualized by the larger (red colored) demand nodes in (b).maxDemand An alternative approach would be to reduce the dynamic model to a single-period model in which water demand is maximal.For every node, the highest demand that is encountered during the multi-period timeframe is set as the actual demand in the static model.This approach will only be successful when there is only one demand pattern, which is far from realistic.Unavoidably, as soon as there is variation in the demand patterns, the designs based on this single-period setting will lead to over-sized (and therefore, more expensive) designs.As can be seen in Figure 2c, the design-based on the static maxDemand model-does not lead to any pressure violation in the dynamic setting.The design, however, is considerably more expensive than the design found by the multi-period model.

Constraint on Velocity in Pipes
The water velocity in a pipe is a function of (among other factors) the diameter of the pipe.Consequently, altering the diameter of a pipe leads to a change of the water velocity in the pipe.Hence, when a pipe diameter is altered, a velocity check has to be performed.(Un)fortunately, decreasing a single pipe's diameter affects the entire network flow, and therefore, the velocity needs to be checked in every pipe of the network, not just the one whose diameter was changed.This is visible in Figure 2d.If the indicated (blue) pipe is decreased in diameter, velocity violations will occur in the pipes colored in red.

An Iterated Local Search Algorithm
In this section, the WDND optimization problem is tackled with a metaheuristic technique, called iterated local search.Iterated local search (ILS) is a straightforward, easy to implement, and easy to adjust heuristic optimization algorithm.These characteristics make it especially suitable as the underlying optimization engine in a decision support system, since decision makers are understandably less reluctant to use more transparent decision support tools.Despite its simplicity, ILS is broadly applied in different research fields and as the basis of many state-of-the-art algorithms for problems such as travelling salesperson problems, scheduling problems, graph partitioning problems, etc. [29].
ILS is a local search metaheuristic: it operates on a single solution and tries to improve this solution in small steps during the local search.Figure 3, for example, shows different candidate solutions or network configurations that could be created during the local search.These solutions are networks for which diameters are assigned to every pipe.These solutions can also be represented by a vector, where every element of the vector corresponds to the diameter that is selected for that pipe, as can be seen in Table 1.This table gives vector representations of the candidate solutions in Figure 3. Every solution corresponds to a total investment cost (TIC)-calculated via the objective function, Equation (1), and using the input data from Figure 4-and is either hydraulically feasible (satisfying all constraints mentioned in Section 2) or infeasible (violating one or more of the constraints).Candidate solution 2, for example, violates the minimum pressure constraint in node 7, and candidate solution 4 violates the maximum velocity constraint in pipes 2 and 4. Therefore, both solutions are labelled infeasible.ILS is an iterated local search, implying that it alternates the local search phase with a perturbation phase.During the local search phase, small improvements (or changes in diameters that lead to lower cost, feasible solutions) are made to the current solution (which is a potential network design) until a local optimum is reached, meaning that there is no more improvement possible.Next, a perturbation is performed.This perturbation temporarily makes the solution worse by randomly changing some pipe diameters, with the aim of leading to better solutions in the consecutive local search phase.A more detailed algorithm description is given in Section 3.2.
ILS can therefore be understood as a combination of intensification or exploitation by the use of the local search, and diversification or exploration by the perturbation.

Implementation
To enable a straightforward comparison with existing methods, all hydraulic equations are solved by EPANET 2.0, which is the hydraulic solver applied in most existing works.The ILS algorithm is implemented in C++.The interaction between the ILS algorithm and the EPANET simulation engine is established by the use of an extended EPANET toolkit, developed by M. López-Ibáñez [30,31].As visualized in Figure 5, two different input files are needed.The first gives specifications on the available pipe types (diameter, cost, roughness coefficient) and specifies minimal head requirements for the demand nodes and maximal water velocity constraints for the pipes.The second file describes the network in EPANET input format, and contains the network topology and the characteristics of pipes and nodes.These two files serve as input for the ILS algorithm and the hydraulic solver.Every time a candidate solution is generated by the algorithm, this information (which basically is a candidate network design) is sent to EPANET 2.0 and hydraulically simulated for every time period.This simulation yields hydraulic results, which are used by the algorithm to evaluate the problem constraints.

EPANET input file
The final output will be an optimized network (x * p,t ) with a corresponding cost TIC.This network is also outputted in GraphMl format, to enable easy visualization, since this benefits the process of solution delivery.

Detailed Algorithm Description
The developed ILS algorithm consists of six steps, which are explained in more detail below.Figure 6 depicts the corresponding flow chart.

Sort
In a preliminary step, the set of pipes is sorted.This sorting of the pipes determines the order in which the local search adjusts the pipe diameters and how the algorithm moves through the solution space.Two different sorts are tested.The first, lengthSort, is similar to the sorting that was applied in the single-period algorithm and sorts the set of pipes according to decreasing length.The second, costSort, sorts the pipes according to decreasing cost savings that could be made by decreasing the diameter of that pipe by one size.When the pipes are sorted according to this decreasing cost difference, re-sorting of the pipe list is necessary during the local search.

Initial Solution
In a next step, an initial feasible solution, s-init, is constructed.A solution is considered feasible if all hydraulic constraints are strictly satisfied.Two different mechanisms are tested.The first, highCost, starts with a solution where all diameters are set at the largest available diameter.If this combination of diameters generates a solution that is feasible in all time periods, all pipe diameters are decreased by one size.This procedure is repeated until an infeasible solution is encountered: diameters are increased again and this results in an initial solution in which all pipes have the lowest diameter that is possible without violating the imposed constraints.The second mechanism, lowCost, is expected to be slower, but generates initial solutions with a lower cost than those generated by method highCost.First, all pipes are set to the smallest possible diameter.This solution is almost always infeasible (if it is feasible, it is optimal).Therefore, a second step consists of increasing the pipe diameters by going through the set of pipes and trying to increase each pipe's diameter by one size at a time and checking the hydraulic feasibility of this potential solution for every time period.Increasing the pipe diameter lowers (ceteris paribus) the head loss in that pipe, which could result in higher pressure heads at the nearby demand nodes.If this increase in diameter actually lowers the hydraulic deficit and does not violate the velocity constraint, it is applied.When there is no more deficit or velocity violation for the first time period, the procedure is repeated for the next time period.This procedure ends when a solution that is feasible in every time period is attained.This solution is set as the initial global best solution s-best.In the most extreme scenario, this solution will be one where all diameters are set to their maximal values.

Local Search
Next, a loop is performed until a stopping criterion is reached.The local search procedure is able to quickly improve the quality of the initially generated solutions by iteratively performing small changes-called moves-on the current solution.The move type that is applied in this local search (decrease) reduces the diameter of one pipe with one size at a time.Therefore, neighboring solutions have configurations in which all pipes but one have the same diameter as the current solution.Table 1 and Figure 3 visualize different (neighboring) solutions.Candidate solution 1 is visualized in Figure 1 and numerically visualized in Table 1.This solution can also be represented as a vector of pipe diameters: {150, 150, 80, 80, 100, 60, 60, 80}.Hydraulic simulation shows that candidate solution 1 is hydraulically feasible.A decrease move is applied on pipe 3 of candidate solution 1, implying that the diameter of that pipe will be decreased from 80 to 60 mm.This new candidate solution 2 is a neighbour of candidate solution 1; its vector representation is: {150, 150, 80, 80, 100, 60, 60, 80}.Simulation in EPANET 2.0 shows that candidate solution 2 does not violate any of the constraints.Therefore, this move leads to a feasible solution of lower total investment cost (the TIC decreased from 9621 e to 9140 e, as can be derived from Table 1).If we start from candidate solution 1 again and apply the decrease move on pipe 4, this diameter will decrease from 80 mm to 60 mm, leading to candidate solution 3. Hydraulic simulation of candidate solution 3 shows that this solution is infeasible, since it would violate the minimal pressure constraint in demand node 7, visualized in Figure 3. Another move could be the decrease of pipe 2 to a diameter of 100 mm.Hydraulic simulation shows that this leads to a violation of the maximal velocity constraint in pipes 2 and 4; candidate solution 4 is therefore also labeled as infeasible.
Deciding which pipe will be decreased in diameter depends on the applied pipe selection method.Two alternative pipe selection methods are tested.The first method, noGrasp, goes through the list of pipes successively.The other method, grasp, uses a greedy randomized adaptive search procedure to select which pipe should be changed in that current move.The larger the grasp size, the more random the pipe selection.When, for example, for a network of 200 pipes, the grasp size is set at 20%, a pipe will be uniformly randomly selected out of the first 40 pipes in the list.Note that a grasp size of 0% is equal to the noGrasp selection procedure.A first improving strategy is applied, meaning that as soon as a better (lower value for the objective function) feasible solution is encountered, s-current is replaced by this solution.A simple memory structure (memory) is introduced to speed up the algorithm.This list keeps track of all pipes that were changed unsuccessfully (i.e., violating the hydraulic constraints).It is prohibited to change these pipes' diameters again during the current search.The memory list is erased after every local search.
In the multi-period setting, the hydraulic simulation is even more time consuming than in the single-period setting, since a simulation has to be performed for every time period.Therefore, results are evaluated after every time period for the velocity and the pressure constraints.As soon as infeasible situations are encountered, the simulation is stopped in order to avoid (superfluous) simulations for the next time periods.

Evaluation and Acceptance
After every local search, the current solution, s-current, which is the local optimum of that local search (or the lowest cost feasible solution that can be reached by applying the decrease move), is evaluated.If the global best solution, s-best, is of lower cost, the current solution is replaced by s-best.If the current solution is of lower cost than the global best solution, s-best, this global best solution is replaced by the current solution.

Perturbation
When a local optimum is reached, a perturbation move is used to jump out of this local optimum.The perturbation move is performed on the current solution, s-current, which is at that moment equal to the global best solution, s-best.The perturbation takes a certain percentage-the so-called perturbation rate-of randomly selected pipes, and increases the diameters of these pipes by one size.For every change in diameter, a hydraulic check is performed.When such a change violates the pressure and/or velocity constraints, it is reversed.This perturbed solution is therefore always hydraulically feasible and is the starting solution for the next local search.The optimal perturbation rate balances between two extremes: if too many pipes are changed, the procedure reduces to a random (feasible) restart-which is not necessarily bad.If too few pipes are changed, the algorithm is unable to escape from local optima.

Termination Criterion
The algorithm is stopped after a certain maximal number of iterations.The lowest cost solution encountered by the algorithm is s-best.
As stated above, it is important to validate whether all of the aforementioned mechanisms have an added value (meaning that they either lead to lower cost solutions, or else a decrease in algorithm running time).Some parameters can also take up different values or levels.Therefore, a full-factorial experiment is conducted to determine the best parameter settings.This experiment will lead to the optimal algorithm configuration, in terms of reaching low cost solutions and in terms of minimizing algorithm running time.

Optimal Algorithm Configuration
Two linear mixed-effects models are fitted to detect which factors or algorithm parameters significantly influence the algorithm's performance in terms of achieving low cost solutions and in terms of running time.Table 2 displays the analyzed factors and their corresponding levels.This analysis is performed on a set of 75 different test networks of different size (expressed by the number of pipes) and characteristics (reflected by the meshedness coefficient).Since the perturbation contains a random component, the algorithm will find different solutions when it is executed multiple times on the same problem instance.Therefore, the algorithm is executed five times on each test instance and for each factor combination.The average cost or time over these five runs is taken as the response variable.As a consequence, both data sets will contain 36,000 data points (75 instances × 480 parameter combinations).The first model quantifies the relationship between the achieved solution cost and the different parameters, the second model explains how much of the variance in the running time is explained by the analyzed parameters.The relative cost model with main effects has a coefficient of determination (R-squared adjusted) equal to 0.79, meaning that about 79% of the variation in the response variable (in this case, the relative cost) is explained by the independent variables.Adding interaction effects does not have a large effect on the adjusted R-squaredit is only increased to 0.81.

Model on the Running Time
For the estimation of the model on the running time, it seems reasonable to include the network size as a factor, since it can be expected that the size of the network will influence the running time.Therefore, the number of pipes was added as a factor in the model on the running time.The meshedness coefficient of the network is also included as a factor, to check whether the meshedness influences the algorithm running times-for example, if a higher amount of loops results in longer hydraulic simulations.The running time model with main effects only has a coefficient of determination (adjusted R-squared) of 0.77.It is clear that adding interaction effects significantly increases the explanatory power: with main and interaction effects included, the factors are able to explain 97% of the variance in the algorithm running time.
Figures 7 and 8 plot the values of the coefficients estimated for each factor in the main effects model on the cost and the running time, respectively.Figures 9 and 10 plot the values of the coefficients estimated for the factor interactions.The lower parts of these figures show the mean plots and confidence intervals for the means for all parameters.These mean plots visualize the exact effect of each of the significant parameters.

Influence of the Different Factors Initial Solution
The spread of the coefficients depicted in Figure 7 for the model on the relative cost is rather small for this factor.The mean plot in Figure 7b visualizes that the highCost strategy generates slightly better results in terms of solution quality.The numerical results also showed that, for most of the test instances, the lowCost procedure is not able to construct a feasible "low-cost" solution, and therefore all diameters are set at their maximal value.This implies that most initial solutions generated via the lowCost strategy will have a higher cost than those generated under highCost.
The manner in which the initial solution is constructed does not influence the algorithm running time, as can be derived from the very small spread of the coefficients in Figure 8. Constructing an initial solution according to the lowCost procedure takes some extra seconds compared to highCost, but these are negligible compared to the total algorithm running time.

Pipe Sort
Pre-testing showed that sorting the pipes according to decreasing length or according to decreasing cost difference generates much better results in terms of solution quality than randomly sorting the set of pipes.Therefore, this random sorting was not included in the full-factorial experiment, and only sorting according to decreasing length (lengthSort) or according to decreasing cost difference (costSort) were compared.The very low spread of the coefficients in Figures 7 and 8 shows that there is not much difference between these two ways of sorting.This is logical, since both sorts are related: longer pipes-which will appear at the top of the lengthSort list-are more likely to induce higher cost differences, since this cost is a linear function of the pipe length.These longer pipes therefore also appear higher on the list sorted based on decreasing cost differences (costSort).
Despite the fact that sorting according to decreasing cost (costSort) requires a re-sorting of the list after every local search, there is no significant difference in running time of the two types of sorting, as is visible in Figure 8.

Local Search
The coefficients show that adding a memory structure influences both the solution quality and the algorithm running time.From the mean plots in Figures 7e and 8c, it is clear that adding a memory structure to the local search is very beneficial: running times are decreased by 40% without significantly losing solution quality.This is also visualized in Figure 11: the number of calls to EPANET (which is the time-consuming part of the algorithm) is twice as high for the algorithm without memory, whereas the evolution of solution cost is not very different for the algorithm with or without memory structure.Figure 9b shows that the memory structure interacts with the perturbation rate: the algorithm with memory structure is more sensitive to variations in the perturbation rate.
Figure 7d displays that a grasp size of 10% leads to the lowest cost solutions.Figure 8 shows that the grasp size does not influence the algorithm running time, nor is there a difference in running time between the algorithm with grasp selection procedure and the algorithm without grasp.

Perturbation
From the spread of the coefficients in Figures 7 and 8, it is clear that the perturbation rate influences both the solution quality and the algorithm running time.From the mean plot in Figure 7f, it is clear that a perturbation rate of 10% achieves the best results in terms of solution quality.This is explained by the fact that perturbations that are too small are not able to escape the local optima, and on the other hand, perturbation rates that are too high lead to a significant loss of information about the (potentially good) regions of the solution space.As can be seen in Figure 9c, when pipes are sorted according to decreasing length (lengthSort), the algorithm is more sensitive to (higher) perturbation rates.When pipes are (re)sorted according to decreasing cost difference, pipes that were perturbed-and thus increased in diameter-will appear relatively higher on the list: this increased diameter implies a higher cost difference that can be realized by decreasing this diameter again.Therefore, especially for higher perturbation rates, a large part of the perturbed pipes will be decreased first, which undoes part of the larger perturbation and moves the current solution back to good areas of the solution space.When pipes are sorted according to decreasing length, the local search moves differently through the solution space and it is less capable of fully exploiting the good areas when larger perturbation rates are used.
Figure 8d visualizes how the running time increases linearly as the perturbation rate increases.There are two reasons for this.Firstly, every change in diameter creates a new solution that has to be simulated hydraulically, since only feasible solutions are allowed as a result of the perturbation.As a consequence, changing more pipes-due to a higher perturbation rate-will take more time.Therefore, the perturbation phase itself will take longer for higher perturbation rates.This is clearly demonstrated in Figure 12a-c.These figures plot 26 iterations of the ILS algorithm executed on a toy network.At the end of every local search phase, a perturbation is performed.It is clear that the horizontal distance between the consecutive local searches is bigger for the perturbation rate of 40%, which implies that the related perturbations are more time consuming.A second reason for the longer running time is that a higher perturbation rate results in solutions that have more "potential" for improvement, which is depicted by the larger vertical distance between two consecutive local search phases.In Figure 12c, it is clear that the perturbation rate of 40% implies more moves that are executed per iteration, or more "new" current solutions that are generated per iteration.

Termination Criterion
The solution quality is strongly influenced by the maximal number of iterations.It is logical that the solution cost decreases as the number of iterations is increased.The mean plot of Figure 7a shows that the related slope flattens out when the number of iterations increases, which means that most of the improvement is made in the beginning.This is also clearly visible in Figure 13, where the logarithm of the solution cost is plotted as a function of the number of iterations.Even for the bigger networks, the algorithm stabilizes after a relatively small number of iterations, and most of the improvement in solution quality is made at the very beginning.The improvement that is made after a higher number of iterations is negligible compared to the improvement made in the beginning.
The running time increases linearly as a function of the maximal number of iterations, which is to be expected: every iteration follows the same procedure, and will therefore take an approximately equal amount of time.From Figure 10a, it is clear that the number of iterations interacts with the perturbation rate.Lower perturbation rates imply shorter running times.Therefore, the slope of the observations with low perturbation rates will be less steep than the slope of the observations with high perturbation rates.A similar conclusion can be drawn for the interaction between the maximal number of iterations and the network size (Figure 10b).

Network Size and Structure
It is clear that the number of pipes strongly influences the algorithm running time: larger networks will have longer running times.On the contrary, the meshedness of the network does not significantly influence the algorithm running time.This implies that more looped networks do not necessarily involve longer hydraulic simulation times.

Experimental Results
As mentioned above, other authors also tackled similar extensions.Farmani et al. [25] also use time varying demand patterns, but formulate the problem as a multi-objective optimization problem (in contrast to this single-objective formulation).Gupta et al. [19] and Bragalli et al. [26] also impose velocity constraints, but in a single-period setting.Therefore, comparison of results has no use.Since, to the authors' knowledge, no multi-period, single-objective, pressure-and velocity-constrained test instances are available, it is impossible to compare our results to other findings.
Multiple authors (e.g., Maier et al. [32]) have stated that algorithms for WDND optimization should be tested on more, larger, and more challenging instances.Therefore, the ILS algorithm is run on a diverse set of HydroGen [27] test instances, available via [33].These networks have varying size and characteristics, which enables more robust conclusions regarding the performance of the ILS algorithms to be drawn.
Since none of the previously developed algorithms for the WDND optimization problem are freely available, the results of our algorithm could not be compared to those of other algorithms.Moreover, a re-implementation of these algorithms comes with its own issues, especially the fact that the performance of the re-implemented algorithms might suffer from poor algorithm configuration and calibration.As stressed by Marchi et al. [23], a good calibration of all algorithms is essential in comparative analysis.All test networks, however, are freely available online, and we invite other researchers to test their algorithms on the networks, using their own calibration.
The input files are available via [33].Information on the instances is given in Table 3.A set of 16 different pipe types is used.The set of available pipe types and their corresponding costs can be found in Table 4.A minimal pressure head of 20 m is required in every demand node at every time period, and a maximal velocity of 2 m • s −1 is set for the water flow through every pipe in the network, at every time period.Demand nodes are divided in five categories (domestic, industrial, energy, public services, and commercial demand nodes), each with a corresponding base load and demand pattern.The base loads can be found in the EPANET input files of the instances.Demand patterns are visualized in Figure 14.The algorithm was configured with the initial solution constructed according to highCost, since this leads to lower cost solutions without negatively influencing the running time.Pipes were sorted according to decreasing length (lengthSort), since this yields slightly better results for a perturbation rate of 10%.The grasp size was set at 10%, again, because this leads to lower cost solutions without worsening the running time.The memory structure in the local search is activated to reduce running times, and the perturbation rate is set at 10% since this leads to good solutions in relatively short running times.The algorithm was executed 10 times per instance for a fixed number of calls to EPANET.The lowest cost solution found after a fixed number of function evaluations is recorded, and minimum, mean, and maximum costs over the 10 runs are calculated and reported.These results, together with the mean number of iterations and the standard deviations, can be found in the tables in Appendix A.
From these tables, it is clear that that more iterations are performed on the smaller instances for an equal amount of calls to EPANET.This is to be expected: larger instances imply larger neighborhoods, and therefore, more calls to EPANET (or function evaluations) are required per iteration.For instance, in HG-MP-30-1, only five iterations were finished after 100,000 function evaluations; whereas for HG-MP-1-1, 160 iterations were finished on average.
The minimum, mean, and maximum cost are monotonically decreasing, which is to be expected because the reported results are related to the lowest cost solutions found so far.For some instances, it might seem that these costs do not decrease any further as the number of function calls increases.When looking at the non-rounded results, however, it is clear that the solutions are still improving, but this improvement is very small and therefore non-noticeable from the rounded results.
Another thing to notice is that the percentage decrease of the minimal, mean, and maximal cost is smaller for smaller instances.This is due to the fact that for these smaller instances, more iterations were performed, the algorithm has converged more, and only small additional improvements will be made.
From these tables, it is also clear that it would be interesting to develop a multi-start version of the ILS algorithm.When looking at a small instance (e.g., HG-MP-2-5) it is clear that the minimum and the mean cost found after 100,000 function evaluations are considerably lower than the maximum cost found after 1,500,000 function evaluations.This implies that running the algorithm 15 times for 100,000 function evaluations will almost certainly lead to a lower cost solution than running the algorithm once for 1,500,000 iterations.For bigger instances (e.g., HG-MP-26-5), this conclusion cannot be drawn, at least not for the considered number of evaluations.If the bigger instances were run for a higher number of function evaluations, similar behavior could occur and the (potential) benefit of a multi-start algorithm could appear.
The last column reports the standard deviation of the mean, and gives information on how widely the values are dispersed from this mean.Standard deviations are higher for bigger instances, due to the differences in order of magnitude of the total network cost.Apart from some exceptions (i.e., all smaller instances that already converged), the standard deviation decreases as the number of function evaluations increases.This implies that the ten different runs converge as the algorithm runs longer.

Conclusions
The main contribution of this paper is an ILS algorithm that solves the multi-period, velocity-constrained WDND optimization problem quickly and effectively.
The problem is formulated in line with previous research in this area, and aims at finding the optimal pipe configuration out of a set of discrete pipe types, while satisfying hydraulic constraints and customer requirements.This formulation is extended in two ways: (1) an additional constraint-which imposes a maximal water velocity in the distribution pipes-is added, and (2) the problem is extended to a multi-period setting.A simple and easy-to-understand iterated local search algorithm is developed to tackle this mixed-integer, non-linear optimization problem.The developed approach combines a local search step with a perturbation step in order to create a balance between solution space exploitation and exploration, respectively.Only components that show an added value are included in the algorithm.A full-factorial experiment is conducted to test the added value of each of the algorithms' components and to decide on optimal parameter settings.Furthermore, the algorithm is run on a broad set of realistic instances.These are available online and can be used as new benchmark instances and will foster further research in this area.
Other challenges and opportunities for further research remain.An interesting extension of this work would be to extend the gravity-fed formulation to networks that are fed by pumps.Another possible extension is to convert the single-objective formulation to a multi-objective one, where other objectives, such as water quality and network reliability, are also taken into consideration when optimizing network design.

Figure 1 .
Figure 1.Example of the water distribution network design optimization problem for a small network.(a) Unoptimized network; (b) Optimized network.Thicker lines represent larger diameters.

Figure 6 .
Figure 6.Flowchart of the iterated local search algorithm.s-best: initial global best solution; s-current: current solution; s-init: initial feasible solution.

#Figure 7 .Figure 8 .
Figure 7. Values of the coefficients and mean plots for the regression model on the relative cost: main effects.

Figure 9 .Figure 10 .
Figure 9. Values of the coefficients and mean plots for the regression model on the relative cost: main and interaction effects.

Figure 11 .
Figure 11.Effect of memory structure.

Figure 12 .
Figure 12.Effect of different perturbation rates.

Figure 13 .
Figure 13.Long run of 1000 iterations for three networks.

Figure 14 .
Figure 14.One-day hourly demand patterns of the predefined user categories.

Table 1 .
Different solutions or network designs in vector representation.

Table 2 .
Parameters and their tested values.

Table 3 .
Information on the HydroGen instances.

Table 4 .
Available pipe types and their corresponding costs.

in 10 6 EU R) (in 10 6 EU R) (in 10 6 EU R) (in 10 6 EU R)
This research is funded by the Research Foundation Flanders (FWO).Annelies De Corte and Kenneth Sörensen designed the algorithm.Annelies De Corte programmed the algorithm.Annelies De Corte performed the experiments.Annelies De Corte and Kenneth Sörensen analyzed the data.Annelies De Corte wrote the paper and Kenneth Sörensen provided general guidance and reviewed the paper.The authors declare no conflict of interest. Acknowledgments: