# Parameterization of NSGA-II for the Optimal Design of Water Distribution Systems


## Abstract


## 1. Introduction

## 2. Current Understanding of NSGA-II

… P_{c}), and the probability of PM (P_{m}). PS and NFEs together determine the total computational budget applied to a given problem: the ratio of NFEs to PS equals the number of generations over which NSGA-II evolves. A larger number of generations normally ensures better convergence of NSGA-II. However, the convergence rate declines significantly as the optimization proceeds, and additional computational budget may yield only minor improvements. It is also essential to pay particular attention to PS, as an inadequately small value may result in a crowded population, i.e., one with many similar solutions rather than a diversified set. This normally leads to premature convergence due to the insufficient exchange of new information in the gene pool of the population. P_{c} and P_{m} control the chance of each chromosome undergoing the crossover and mutation processes, respectively. A widely adopted strategy is to keep a high value of P_{c} (e.g., 0.9) and a low value of P_{m} (e.g., the inverse of the number of decision variables (NDVs), 1/NDVs). Crossover, as the predominant search driver, plays a critical role during optimization, whereas mutation mainly helps to prevent the population from being trapped in local optima.
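The two rules of thumb in this paragraph (generations = NFEs / PS, and P_{m} = 1/NDVs) reduce to simple arithmetic. The sketch below is only illustrative; the function names are hypothetical, and the numbers are taken from settings quoted elsewhere in this article (a 20 million NFEs budget with PS of 400 or 1000, and 21 decision variables for NYT).

```python
def generations(nfes: int, ps: int) -> int:
    """Number of generations implied by a budget of NFEs and a population size PS."""
    return nfes // ps


def default_mutation_probability(ndvs: int) -> float:
    """Commonly recommended PM probability: the inverse of the number of decision variables."""
    return 1.0 / ndvs


# Same budget, different population sizes: a larger PS leaves fewer generations.
for ps in (400, 1000):
    print(f"PS = {ps}: {generations(20_000_000, ps)} generations")

# NYT has 21 decision variables, so the recommended P_m is 1/21.
print(f"P_m = {default_mutation_probability(21):.4f}")
```

The trade-off is visible directly: for a fixed budget, raising PS buys diversity at the cost of evolution time.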

… DI_{c} and DI_{m}, respectively). As previously mentioned, SBX and PM are the main search operators within NSGA-II, producing improved children from parents with designated probabilities (i.e., P_{c} and P_{m}). SBX mimics the search behavior of the single-point crossover used in binary-coded genetic algorithms and is suitable for optimization problems with real or discrete decision variables. The positions of the children are distributed around their parents following the exponential laws of DI_{c} (for more details, see [20]). Similar to SBX, the search behavior of PM also depends on the exponential laws of DI_{m}. As such, each distribution index directly influences the Euclidean distance of the offspring from their parents in the decision variable space (eventually reflected in the objective space). Explicitly, a larger value of DI_{c} or DI_{m} keeps the offspring similar (i.e., close) to their parents. In contrast, a smaller value increases the probability of generating offspring substantially different (i.e., far) from their parents (Figure 1). In short, DI_{c} and DI_{m} control the search step sizes, while P_{c} and P_{m} determine the likelihood of implementing such search steps in the decision variable space. Consequently, a proper combination of these five parameters (i.e., PS, P_{c}, P_{m}, DI_{c}, and DI_{m}), in addition to a sufficient computational budget, can lead to a better and more robust search behavior of NSGA-II, thus eventually improving the quality of the Pareto fronts obtained.
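The effect of the distribution indices on step size can be checked numerically. The following is a minimal sketch that uses the common textbook forms of SBX and PM for a single real-valued gene; it is not necessarily the exact variant used in this study (bounded SBX and the rounding to discrete pipe sizes are omitted), and the helper `mean_step` and the parent values 0.3 and 0.7 are illustrative assumptions.

```python
import random


def sbx_pair(p1, p2, di_c, rng):
    """Simulated binary crossover on one gene (unbounded textbook form)."""
    u = rng.random()
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (di_c + 1.0))          # contracting spread
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (di_c + 1.0))  # expanding spread
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2


def polynomial_mutation(x, lower, upper, di_m, rng):
    """Polynomial mutation on one gene (basic form), clipped to the bounds."""
    u = rng.random()
    if u < 0.5:
        delta = (2.0 * u) ** (1.0 / (di_m + 1.0)) - 1.0
    else:
        delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (di_m + 1.0))
    return min(max(x + delta * (upper - lower), lower), upper)


def mean_step(di, operator, n=10_000, seed=1):
    """Average distance between a parent at 0.3 and the offspring generated from it."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        if operator == "sbx":
            child, _ = sbx_pair(0.3, 0.7, di, rng)
        else:
            child = polynomial_mutation(0.3, 0.0, 1.0, di, rng)
        total += abs(child - 0.3)
    return total / n


for di in (1, 20):
    print(f"DI = {di}: SBX step = {mean_step(di, 'sbx'):.3f}, "
          f"PM step = {mean_step(di, 'pm'):.3f}")
```

Under these assumptions, the average step at a distribution index of 20 is markedly smaller than at 1 for both operators, matching the qualitative picture in Figure 1.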

… P_{c}, and P_{m}, were found to be within the following bounds: PS ∈ [40, 1000], P_{c} ∈ [0.80, 0.98], and P_{m} ∈ o(1/NDVs). NFEs was often determined according to the size of the design problems, with larger cases using higher NFEs values. Surprisingly, only a few previous studies focused on the fine-tuning of DI_{c} and DI_{m}, which implies that the potential of NSGA-II might not have been fully utilized in those applications.

## 3. Methodology

#### 3.1. Problem Formulation

… C_{total} is the total capital cost; U(D_{i}) is the unit cost of pipe i, which depends on its diameter (the relationship between sizes and unit costs is available from the manufacturer and is usually non-linear); L_{i} is the length of pipe i; np is the number of pipes considered in the design stage; H_{d} is the total head deficit; H_{min} is the minimum required head at node j; H_{j} is the actual head at node j; nn is the number of nodes within the network; D_{i} is the diameter option for pipe i; and ns is the number of commercially available pipe sizes.
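The two objectives described by these definitions can be sketched in a few lines. The exact equations are not reproduced in this text, so the forms below are assumptions inferred from the symbol definitions: cost as the sum of U(D_{i}) × L_{i} over pipes, and head deficit as the sum of the shortfalls max(H_{min} − H_{j}, 0) over nodes. The nodal heads would normally come from a hydraulic solver, which is not modeled here; the function names and toy numbers are hypothetical.

```python
from typing import Sequence


def total_cost(diameter_idx: Sequence[int],
               lengths: Sequence[float],
               unit_cost: Sequence[float]) -> float:
    """C_total: sum over pipes of the unit cost of the chosen size times the pipe length."""
    return sum(unit_cost[d] * length for d, length in zip(diameter_idx, lengths))


def total_head_deficit(heads: Sequence[float],
                       min_heads: Sequence[float]) -> float:
    """H_d: sum over nodes of the head shortfall (assumed form: max(H_min - H_j, 0))."""
    return sum(max(h_min - h, 0.0) for h, h_min in zip(heads, min_heads))


# Toy network: 3 pipes, 3 available sizes (unit costs per metre), 3 nodes.
unit_cost = [10.0, 25.0, 60.0]
design = [0, 2, 1]                  # index of the size chosen for each pipe
lengths = [100.0, 50.0, 200.0]      # pipe lengths in metres
heads = [30.0, 28.5, 31.0]          # heads a hydraulic solver might return
min_heads = [30.0, 30.0, 30.0]

print(total_cost(design, lengths, unit_cost))
print(total_head_deficit(heads, min_heads))
```

A design is feasible under this reading when the head deficit is zero, which is why the second objective doubles as a constraint-violation measure.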

#### 3.2. Proposed Methodology of Investigating the Parameterization of NSGA-II

… DI_{c}, DI_{m}, P_{c}, and P_{m}) are investigated, leading to 32 parameter combinations in total (i.e., 2^{5} = 32). For each parameter combination, 100 independent runs are conducted for the smaller benchmark design problems, while 50 runs are carried out for the larger design problem. Compared to the previous studies summarized in Table 1, we believe these choices can provide more reliable results. At least 2000 generations are allowed to ensure sufficient convergence of NSGA-II. Total NFEs vary from case to case and are determined according to the search space size of each WDS design problem.
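The full factorial design (two levels for each of five parameters) can be enumerated mechanically. The sketch below uses the levels reported for the HAN case study in Section 5.2 (PS of 60 and 300, DI_{c} and DI_{m} of 1 and 20, P_{c} of 0.45 and 0.9, P_{m} of 1/34 and 2/34); the dictionary layout and variable names are illustrative assumptions.

```python
from itertools import product

# Two levels per parameter, as reported for the HAN design problem.
levels = {
    "PS": (60, 300),
    "DI_c": (1, 20),
    "DI_m": (1, 20),
    "P_c": (0.45, 0.9),
    "P_m": (1 / 34, 2 / 34),
}

# Cartesian product of the levels: one dict per parameter combination.
combos = [dict(zip(levels, values)) for values in product(*levels.values())]

print(len(combos))
print(combos[0])
```

Each entry in `combos` would then be run repeatedly with different random seeds to estimate its effectiveness.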

…^{j} is a Boolean value indicating whether the currently best-known solution for a specific case study is found at run j; ${C}_{total}^{j}$ is the cost of the best solution found at run j; ${H}_{d}^{j}$ is the total head deficit of the best solution found at run j; and nr is the number of independent runs.
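A success-frequency measure of this kind can be sketched as follows. The exact formula is not reproduced in this text, so the code assumes that Freq is the fraction of the nr runs whose best solution is feasible (zero head deficit) and at least as cheap as the best-known cost; the function name, tolerance, and toy values are hypothetical.

```python
def freq(run_costs, run_deficits, best_known_cost, tol=1e-9):
    """Fraction of independent runs that reach the best-known feasible solution."""
    hits = [
        deficit <= tol and cost <= best_known_cost + tol   # Boolean indicator per run
        for cost, deficit in zip(run_costs, run_deficits)
    ]
    return sum(hits) / len(hits)


# Four toy runs: two hit the best-known cost, one is more expensive, one is infeasible.
costs = [10.0, 10.0, 10.8, 10.0]
deficits = [0.0, 0.0, 0.0, 0.5]
print(freq(costs, deficits, best_known_cost=10.0))
```

For a case without a single agreed best-known solution, the same function can be used with a cost threshold in place of `best_known_cost`.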

… DI_{c}, DI_{m}, P_{c}, and P_{m}) are shown in five colored rings (ordered outwards), and their effectiveness in terms of Freq is shown in the outermost gray ring. Every five-colored slot in the radial direction of the compass plot (from the red to the purple scheme) denotes a particular combination of parameters, and the gray patch next to it in the outermost ring indicates its effectiveness. The whole plot is sorted by the performance of NSGA-II in descending order in the counter-clockwise direction. In this way, one can quickly identify how the highly effective parameter combinations are composed. For example, in Figure 3 the most effective parameter combination for solving the first case study (i.e., PS = 200, DI_{c} = 20, DI_{m} = 1, P_{c} = 0.9, P_{m} = 0.0476) was able to identify the best-known solution with a Freq equal to 0.97 over 100 independent runs. Furthermore, the compass plot facilitates the analysis of the “sweet spots” within the parameter space of NSGA-II, which in turn contributes to the identification of practical guidelines for setting those parameters.

## 4. Case Studies

#### 4.1. New York Tunnel Network (NYT)

…^{21} ≈ 1.93 × 10^{25} discrete combinations.

#### 4.2. Hanoi Network (HAN)

…^{34} ≈ 2.87 × 10^{26} discrete combinations. Due to a very limited range of pipe sizes, HAN has a vast region of infeasible solutions in the landscape of decision variables, which makes near-optimal solutions harder to identify.
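The search space sizes quoted for NYT and HAN follow from raising the number of options per pipe to the number of pipes. The sketch below assumes 16 options for each of the 21 NYT pipes and 6 options for each of the 34 HAN pipes; these option counts are not stated in the surviving text and are used here only because they reproduce the quoted figures.

```python
def search_space_size(n_options: int, n_pipes: int) -> int:
    """Number of discrete designs when every pipe picks one of n_options sizes."""
    return n_options ** n_pipes


# Assumed option counts (16 for NYT, 6 for HAN); exponents are the numbers of pipes.
print(f"NYT: {search_space_size(16, 21):.2e}")
print(f"HAN: {search_space_size(6, 34):.2e}")
```

Because the size grows exponentially with the number of pipes, the two problems end up within about one order of magnitude of each other despite very different option counts.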

#### 4.3. Balerma Irrigation Network (BIN)

…^{454} discrete combinations.

#### 4.4. Experimental Setup

## 5. Results and Discussion

#### 5.1. The NYT Design Problem

… DI_{c} and DI_{m} were both bounded by 1 and 20 because preliminary observations showed that values above 20 significantly reduced the diversity of the offspring derived from their parents [20]. The most recommended P_{c} and P_{m} were adopted, i.e., 0.9 and 0.0476 (1/21), respectively. In contrast, a P_{c} of 0.45 and a P_{m} of 0.0952 were also used to represent a much lower SBX rate and a much higher PM rate.

… DI_{c} and DI_{m} were not equal to 20 concurrently. From the right half of the compass plot, it is observed that DI_{m} was probably the second most important parameter, since a larger DI_{m} always resulted in worse performance. P_{c} and P_{m} turned out to have a minor impact on the behavior of NSGA-II. This somewhat contradicts what previous studies had concluded, in which P_{c} and P_{m} were fine-tuned but DI_{c} and DI_{m} were usually neglected.

#### 5.2. The HAN Design Problem

… P_{c}, DI_{c}, and DI_{m} were the same as those used in NYT. PS was set to 60 and 300. The recommended P_{m} (i.e., 1/34) and double that setting were adopted again. Figure 4 reveals that NSGA-II had difficulties in dealing with the HAN design problem, as the maximum Freq was less than 0.7 even though the search space size is quite close to that of NYT and larger PS and NFEs were used. This is perhaps because HAN has a vast region of infeasible solutions; hence, NSGA-II had to spend more effort in finding a feasible solution that fully satisfies the nodal head requirements, leading to less robust and accurate performance.

… DI_{c} and DI_{m} were less critical compared to PS but had a more significant impact than P_{c} and P_{m}. In particular, when DI_{c} and DI_{m} were both equal to 1, the Freq was consistently higher than or equal to 0.5 no matter how PS, P_{c}, and P_{m} were set. However, if both DI_{c} and DI_{m} were set to 20, the combinations of the other three parameters had only a 25% chance of obtaining Freq values larger than 0.5. Again, P_{c} and P_{m} had only a marginal effect on the effectiveness of NSGA-II.

#### 5.3. The BIN Design Problem

… DI_{c} and DI_{m} seemed to have at least the same level of impact, if not higher, compared with PS (see Figure 5). In particular, six out of the top seven parameter combinations, which achieved Freq values larger than 0.8, had both parameters equal to 20. All the parameter combinations that followed this setting were able to obtain Freq values no less than 0.5, implying at least a 50% chance of identifying a least-cost solution below €1.93 million. In contrast, if both DI_{c} and DI_{m} were set to 1, the majority of parameter combinations performed less competently or even failed completely (i.e., the Freq was equal to zero) no matter what PS was chosen. The values of Freq seem to be insensitive to P_{c} and P_{m}: a P_{c} of 0.45 or a P_{m} of 0.011 (five times the most recommended mutation probability) performed as well as the generally used settings, which looks quite counterintuitive.

#### 5.4. A Further Experiment on the BIN Design Problem

…^{213}) compared with that (i.e., 10^{200}) reported in [18], while the latter dramatically decreased the search space size from 10^{454} to 2.26 × 10^{101}. Informed by the findings on the three benchmark design problems, the population sizes were raised to 400 and 1000, and the two representative boundary values of DI_{c} and DI_{m} were both set to 10 and 30. In addition, P_{c} and P_{m} were kept at 0.9 and 0.0022 (i.e., the most recommended settings), respectively.

## 6. Conclusions

… DI_{c} and DI_{m}. In contrast, the values of P_{c} and P_{m} had only a minor impact on the effectiveness of NSGA-II in this paper. These findings highlight some aspects that were often neglected in previous studies, with many papers cited in Table 1 having used a fixed PS of 100 and recommended P_{c} and P_{m} in combination with default or unknown DI_{c} and DI_{m}, no matter how large the search space size was.

… DI_{c} and DI_{m}, rather than P_{c} and P_{m}, since the former two parameters have a greater effect on the positions of the offspring generated from their parents. Considering the discrete nature of WDS design problems and the monotonicity of DI_{c} and DI_{m}, their ranges should be kept between 1 and 20. By contrast, the default values of P_{c} and P_{m} (i.e., 0.9 and 1/NDVs) are expected to be sufficient (no fine-tuning needed).

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Mala-Jetmarova, H.; Barton, A.; Bagirov, A. A history of water distribution systems and their optimisation. Water Sci. Technol.-Water Supply **2015**, 15, 224–235.
2. Savic, D.A.; Walters, G.A. Genetic Algorithms for Least-Cost Design of Water Distribution Networks. J. Water Resour. Plan. Manag. ASCE **1997**, 123, 67–77.
3. Dandy, G.C.; Simpson, A.R.; Murphy, L.J. An Improved Genetic Algorithm for Pipe Network Optimization. Water Resour. Res. **1996**, 32, 449–458.
4. Walters, G.A.; Lohbeck, T. Optimal Layout of Tree Networks Using Genetic Algorithms. Eng. Optim. **1993**, 22, 27–48.
5. Walski, T.M. The Wrong Paradigm, Why Water Distribution Optimization Doesn’t Work. J. Water Resour. Plan. Manag. ASCE **2001**, 127, 203–205.
6. Prasad, T.D.; Park, N.-S. Multiobjective Genetic Algorithms for Design of Water Distribution Networks. J. Water Resour. Plan. Manag. ASCE **2004**, 130, 73–82.
7. Kanakoudis, V.K. Vulnerability based management of water resources systems. J. Hydroinform. **2004**, 6, 133–156.
8. Halhal, D.; Walters, G.; Ouazar, D.; Savic, D. Water Network Rehabilitation with Structured Messy Genetic Algorithm. J. Water Resour. Plan. Manag. ASCE **1997**, 123, 137–146.
9. Khu, S.-T.; Keedwell, E. Introducing more choices (flexibility) in the upgrading of water distribution networks: the New York city tunnel network example. Eng. Optim. **2005**, 37, 291–305.
10. Fu, G.; Kapelan, Z.; Kasprzyk, J.R.; Reed, P. Optimal Design of Water Distribution Systems Using Many-Objective Visual Analytics. J. Water Resour. Plan. Manag. ASCE **2013**, 139, 624–633.
11. Woodruff, M.J.; Reed, P.M.; Simpson, T.W. Many objective visual analytics: rethinking the design of complex engineered systems. Struct. Multidiscip. Optim. **2013**, 48, 201–219.
12. Reed, P.M.; Hadka, D.; Herman, J.D.; Kasprzyk, J.R.; Kollat, J.B. Evolutionary multiobjective optimization in water resources: The past, present, and future. Adv. Water Resour. **2013**, 51, 438–456.
13. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Trans. Evol. Comput. **2002**, 6, 182–197.
14. Vrugt, J.A.; Robinson, B.A. Improved Evolutionary Optimization from Genetically Adaptive Multimethod Search. Proc. Natl. Acad. Sci. USA **2007**, 104, 708–711.
15. Hadka, D.; Reed, P. Borg: An Auto-Adaptive Many-Objective Evolutionary Computing Framework. Evol. Comput. **2013**, 21, 231–259.
16. Wang, Q.; Savić, D.A.; Kapelan, Z. GALAXY: A new hybrid MOEA for the optimal design of Water Distribution Systems. Water Resour. Res. **2017**, 53, 1997–2015.
17. Maier, H.R.; Kapelan, Z.; Kasprzyk, J.; Kollat, J.; Matott, L.S.; Cunha, M.C.; Dandy, G.C.; Gibbs, M.S.; Keedwell, E.; Marchi, A. Evolutionary algorithms and other metaheuristics in water resources: Current status, research challenges and future directions. Environ. Model. Softw. **2014**, 62, 271–299.
18. Cisty, M.; Bajtek, Z.; Celar, L. A two-stage evolutionary optimization approach for an irrigation system design. J. Hydroinform. **2017**, 19, 115–122.
19. Mala-Jetmarova, H.; Sultanova, N.; Savic, D. Lost in Optimisation of Water Distribution Systems? A Literature Review of System Design. Water **2018**, 10, 307.
20. Deb, K.; Agrawal, R.B. Simulated binary crossover for continuous search space. Complex Syst. **1995**, 9, 115–148.
21. Zheng, F.; Qi, Z.; Bi, W.; Zhang, T.; Yu, T.; Shao, Y. Improved Understanding on the Searching Behavior of NSGA-II Operators Using Run-Time Measure Metrics with Application to Water Distribution System Design Problems. Water Resour. Manag. **2017**, 31, 1121–1138.
22. Zheng, F.; Zecchin, A.C.; Maier, H.R.; Simpson, A.R. Comparison of the searching behavior of NSGA-II, SAMODE, and Borg MOEAs applied to water distribution system design problems. J. Water Resour. Plan. Manag. ASCE **2016**, 142, 04016017.
23. Bi, W.; Dandy, G.C.; Maier, H.R. Use of Domain Knowledge to Increase the Convergence Rate of Evolutionary Algorithms for Optimizing the Cost and Resilience of Water Distribution Systems. J. Water Resour. Plan. Manag. ASCE **2016**, 142, 04016027.
24. Wang, Q.; Guidolin, M.; Savic, D.; Kapelan, Z. Two-Objective Design of Benchmark Problems of a Water Distribution System via MOEAs: Towards the Best-Known Approximation of the True Pareto Front. J. Water Resour. Plan. Manag. ASCE **2015**, 141, 04014060.
25. Asadzadeh, M.; Tolson, B. Hybrid Pareto archived dynamically dimensioned search for multi-objective combinatorial optimization: application to water distribution network design. J. Hydroinform. **2012**, 14, 192–205.
26. Raad, D.; Sinske, A.; van Vuuren, J. Robust multi-objective optimization for water distribution system design using a meta-metaheuristic. Int. Trans. Oper. Res. **2009**, 16, 595–626.
27. Olsson, R.J.; Kapelan, Z.; Savic, D.A. Probabilistic building block identification for the optimal design and rehabilitation of water distribution systems. J. Hydroinform. **2009**, 11, 89–105.
28. Jayaram, N.; Srinivasan, K. Performance Based Optimal Design and Rehabilitation of Water Distribution Networks Using Life-Cycle Costing. Water Resour. Res. **2008**, 44, W01417.
29. Perelman, L.; Ostfeld, A.; Salomons, E. Cross Entropy multiobjective optimization for water distribution systems design. Water Resour. Res. **2008**, 44, W09413.
30. Farmani, R.; Walters, G.; Savic, D. Evolutionary multi-objective optimization of the design and operation of water distribution network: total cost vs. reliability vs. water quality. J. Hydroinform. **2006**, 8, 165–179.
31. Farmani, R.; Walters, G.A.; Savic, D.A. Trade-Off between Total Cost and Reliability for Anytown Water Distribution Network. J. Water Resour. Plan. Manag. ASCE **2005**, 131, 161–171.
32. Farmani, R.; Savic, D.A.; Walters, G.A. Evolutionary multi-objective optimization in water distribution network design. Eng. Optim. **2005**, 37, 167–183.
33. Wolpert, D.H.; Macready, W.G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. **1997**, 1, 67–82.
34. Wang, Q.; Zhou, Q.; Lei, X.; Savić, D.A. Comparison of multi-objective optimization methods applied to urban drainage adaptation problems. J. Water Resour. Plan. Manag. ASCE **2018**, 144, 04018070.
35. Schaake, J.C.; Lai, D. Linear Programming and Dynamic Programming Application to Water Distribution Network Design; Report No. 116; Hydrodynamics Laboratory, Department of Civil Engineering, Massachusetts Institute of Technology: Cambridge, MA, USA, 1969.
36. Fujiwara, O.; Khang, D.A. Two-phase decomposition method for optimal design of looped water distribution networks. Water Resour. Res. **1990**, 26, 539–549.
37. Reca, J.; Martínez, J. Genetic algorithms for the design of looped irrigation water distribution networks. Water Resour. Res. **2006**, 42, W05416.
38. Zheng, F.; Zecchin, A.C.; Simpson, A.R. Self-Adaptive Differential Evolution Algorithm Applied to Water Distribution System Optimization. J. Comput. Civ. Eng. **2013**, 27, 148–158.
39. Hadka, D.; Reed, P. Diagnostic Assessment of Search Controls and Failure Modes in Many-Objective Evolutionary Optimization. Evol. Comput. **2012**, 20, 423–452.

**Figure 1.** Impact of DI_{c} and DI_{m} on the distribution of offspring in the decision variable space (the red dots indicate the four parents used in SBX and PM; the blue dots denote the ten thousand offspring generated randomly from these parents with the specified DI_{c} and DI_{m}): (**a**) DI_{c} = 1, DI_{m} = 20; (**b**) DI_{c} = 20, DI_{m} = 20; (**c**) DI_{c} = 1, DI_{m} = 1; (**d**) DI_{c} = 20, DI_{m} = 1.

**Figure 5.** Parameterization of NSGA-II applied to the Balerma Irrigation Network (BIN) design problem.

**Table 1.**Literature survey on the applications of the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to Water Distribution Systems (WDS) design problems.

Reference | Cases | NDVs | PS | Generations (Gen) | P_{c} | P_{m} | DI_{c} | DI_{m}
---|---|---|---|---|---|---|---|---
[18] | 1 | 454 | N/A | N/A | 0.93–0.98 | 0.001–0.05 | N/A | N/A
[21] | 6 | 21–454 | 240–800 | 2500 | 0.9 | 1/NDVs | 15 | 7
[22] | 6 | 21–454 | 240–800 | 2500 | 0.9 | 1/NDVs | 15 | 7
[23] | 5 | 341–274 | 240–1000 | 2500 | 0.9 | 1/NDVs | 15 | 7
[24] | 12 | 8–567 | 40–800 | 250–10,000 | 0.9 | 1/NDVs | 15 | 7
[25] | 5 | 21–454 | 50–100 | 40–10,000 | 0.9 | 1/NDVs | 20 | 20
[26] | 3 | 8–34 | 100 | 100 | N/A | 0.01 | N/A | N/A
[27] | 3 | 21–632 | 200 | 1000–1250 | N/A | N/A | N/A | N/A
[28] | 1 | 14 | 500 | 10,000 | 0.8 | 0.03 | 20 | 100
[29] | 1 | 21 | 200–1000 | 840–3360 | N/A | 0.075 | N/A | N/A
[30] | 1 | 112 | 100 | 5000 | N/A | N/A | N/A | N/A
[31] | 1 | 112 | 100 | 5000 | N/A | N/A | N/A | N/A
[32] | 3 | 21–567 | 100 | 1000–3000 | 0.9 | 0.03 | N/A | N/A

… DI_{c} and DI_{m}. N/A indicates that such information cannot be found in the related references.

**Table 2.** Statistical results of solving the BIN design problem using a budget of 20 million function evaluations (NFEs).

PS | DI_{c} | DI_{m} | Strategy I: Min | Strategy I: Avg | Strategy I: Freq * | Strategy II: Min | Strategy II: Avg | Strategy II: Freq *
---|---|---|---|---|---|---|---|---
400 | 10 | 10 | 1.9245 | 1.9273 | 1 | 1.9220 | 1.9242 | 1
400 | 10 | 30 | 1.9268 | 1.9389 | 0.3 | 1.9232 | 1.9267 | 0.8
400 | 30 | 10 | 1.9231 | 1.9277 | 0.8 | 1.9216 | 1.9248 | 1
400 | 30 | 30 | 1.9357 | 1.9572 | 0 | 1.9255 | 1.9275 | 0.9
1000 | 10 | 10 | 1.9236 | 1.9274 | 0.9 | 1.9222 | 1.9242 | 1
1000 | 10 | 30 | 1.9239 | 1.9294 | 0.7 | 1.9235 | 1.9253 | 1
1000 | 30 | 10 | 1.9215 | 1.9263 | 0.9 | 1.9219 | 1.9244 | 1
1000 | 30 | 30 | 1.9275 | 1.9324 | 0.2 | 1.9243 | 1.9266 | 1

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, Q.; Wang, L.; Huang, W.; Wang, Z.; Liu, S.; Savić, D.A. Parameterization of NSGA-II for the Optimal Design of Water Distribution Systems. *Water* **2019**, *11*, 971.
https://doi.org/10.3390/w11050971
