# A Graph-Based Optimization Framework for Large Water Distribution Networks

## Abstract


## 1. Introduction

[…]^{5} faster). That approach is particularly suitable for solving problems with an extremely large decision space in acceptable time [14]. However, for design solutions with a high level of resilience, the graph-based optimization is outperformed by the evolutionary algorithm. In that region of the Pareto-front (high resilience and high costs), water quality issues arise due to low flow velocities, and water quality analysis is even more computationally intensive for large-scale WDNs. Recently, based on CNA, a highly efficient surrogate method for assessing water quality in large WDNs was developed (2.4 × 10^{5} times faster than extended period simulation with Epanet2) [15]. Therein, the edges in the graph of the WDS are weighted by the residence time in the pipes. Based on shortest path analysis, pattern correction, and topological correction functions (network dispersion), good estimates of nodal water age values were obtained.

## 2. Materials and Methods

#### 2.1. Graph-Based Multi-Objective Design with Demand Edge Betweenness Centrality

[…] a_{ij} = 1, otherwise a_{ij} = 0. Each link/edge k in the graph can have a weight w_{k}. Often, unweighted graphs (w_{k} = 1) or the Euclidean distance (i.e., pipe length L_{k}) are used. However, other weights can also be used, e.g., weights mimicking hydraulic or water quality behavior, such as friction losses or the residence time in an edge.

The shortest path, σ_{i,j}, is the path between two vertices, i and j, for which the path length (i.e., the sum of edge weights) is minimal.

The edge betweenness centrality (EBC) of an edge counts the shortest paths σ_{i,j} that pass through it when connecting all possible node pairs. In a WDN, each node has to be connected to at least one source node; therefore, counting the shortest paths σ_{S,j} from all demand nodes to the source node S can be an indicator of the required transport capacity [18]. However, instead of simply counting the number of shortest paths going through an edge, the nodal demand d_{j} can be used to weight the EBC counts along σ_{S,j}. This gives an estimate of ideal/design flows. This customized EBC measure is denoted 'demand edge betweenness centrality' (EBC^{Q}) [14].

When EBC^{Q} is used for the design of a WDN, only the pipe length L_{k} is available as edge weight, because no other hydraulic characteristics are known before the design process. The set of all edges that are part of a shortest path from any demand node to a source node (all edges with EBC^{Q} > 0) is called the shortest path tree; the remaining edges form the co-tree.
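The EBC^{Q} computation described above can be sketched in a few lines. This is a minimal illustration, assuming a single source, pipe lengths as edge weights, and an undirected graph; the function name `demand_ebc`, the node labels, and the small network are hypothetical and not taken from the case study.

```python
import heapq
from collections import defaultdict

def demand_ebc(edges, demands, source):
    """edges: {(u, v): length}; demands: {node: demand}. Returns {edge: EBC^Q}."""
    adj = defaultdict(list)
    for (u, v), length in edges.items():
        adj[u].append((v, length, (u, v)))
        adj[v].append((u, length, (u, v)))

    # Dijkstra from the source; pred[n] = (previous node, edge used to reach n)
    dist = {source: 0.0}
    pred = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length, edge in adj[u]:
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = (u, edge)
                heapq.heappush(heap, (nd, v))

    # Walk every demand node back to the source and add its demand
    # to each edge on its shortest path.
    ebc_q = {edge: 0.0 for edge in edges}
    for node, demand in demands.items():
        while node != source:
            node, edge = pred[node]
            ebc_q[edge] += demand
    return ebc_q

# Hypothetical network: lengths in m, demands in L/s
edges = {("S", "1"): 100.0, ("1", "2"): 100.0, ("S", "2"): 250.0, ("2", "3"): 100.0}
demands = {"1": 1.0, "2": 2.0, "3": 1.2}
ebc = demand_ebc(edges, demands, "S")
co_tree = [edge for edge, q in ebc.items() if q == 0.0]
```

In this small example the edges with EBC^{Q} > 0 form the shortest path tree, and the direct but longer pipe from S to node 2 ends up in the co-tree.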

[…] EBC^{Q} values are determined. Based on the EBC^{Q} values, the diameter (DN_{k}) of each pipe (k) can be determined with a continuity equation, a design flow velocity (v_{design}), and the commercially available diameter classes (DN_{available}):
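Equation (1) is not reproduced in this text. As a hedged reconstruction, assuming circular pipes flowing full and EBC^{Q}_{k} interpreted as the design flow of pipe k, continuity (Q = v · π D² / 4) gives a required diameter that is then rounded up to the next available class. The symbol D_{k}^{req} is introduced here for illustration only and may differ from the notation of the original equation.

```latex
D_{k}^{req} = \sqrt{\frac{4\,EBC^{Q}_{k}}{\pi\, v_{design}}},
\qquad
DN_{k} = \min\left\{ DN \in DN_{available} \;:\; DN \ge D_{k}^{req} \right\}
```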

By varying v_{design} in the described design process, Pareto-optimal design solutions can be obtained (minimal pipe costs versus maximum resilience according to [19]), which partly outperform designs from optimization with an evolutionary algorithm. The proposed CNA design procedure itself requires no hydraulic simulations; a single hydraulic simulation run is needed only afterwards to check the pressure threshold (e.g., a minimum pressure of 30 m under design load) [20].

#### 2.2. Hybrid Demand Edge Betweenness Centrality and Least Cost Designs

[…] EBC^{Q}. The initial EBC^{Q} distribution is similar to Figure 1. Applying the continuity equation (Equation (1)) with a design velocity (v_{design}) of, e.g., 0.8 m/s yields the required diameters. With a set of available discrete pipe diameters (e.g., 76.2, 101.6, and 152.4 mm) and the corresponding costs per meter of pipe (EUR 8, 11, and 16/m), the diameter of every pipe is obtained by rounding up to the nearest available discrete diameter, and the total construction costs follow. The diameters are determined for different values of v_{design}: starting with the lowest value, v_{design} is increased stepwise (0.8, 1.0, and 1.2 m/s). When the pressure distribution is checked with Epanet2 (EN2), pressure deficits can be identified (e.g., ≤30 m). If there are no pressure deficits, the design solution is technically feasible, and the next (higher) design velocity is evaluated. If pressure deficits are present, the design is technically infeasible and is subsequently enhanced to meet the pressure criterion.
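The sizing and costing step for the stepwise velocity increase can be sketched as follows. The diameter classes and unit costs are the example values from the text; the pipe list, the function name `size_and_cost`, and the fallback to the largest class are assumptions, and the Epanet2 pressure check is deliberately left out.

```python
import math

# Diameter classes (mm) and unit costs (EUR/m) from the example in the text
COST_PER_M = {76.2: 8.0, 101.6: 11.0, 152.4: 16.0}

def size_and_cost(pipes, v_design):
    """pipes: list of (length_m, flow_l_per_s). Returns (diameters_mm, total_cost_eur)."""
    diameters, cost = [], 0.0
    for length, flow in pipes:
        # continuity for a circular pipe: D = sqrt(4 Q / (pi v)), converted to mm
        d_req_mm = 1000.0 * math.sqrt(4.0 * (flow / 1000.0) / (math.pi * v_design))
        # round up to the nearest available class (assumption: largest class if none fits)
        dn = next((d for d in sorted(COST_PER_M) if d >= d_req_mm), max(COST_PER_M))
        diameters.append(dn)
        cost += COST_PER_M[dn] * length
    return diameters, cost

# Hypothetical pipes: (length in m, EBC^Q in L/s); a co-tree pipe has zero flow
pipes = [(100.0, 4.2), (100.0, 3.2), (250.0, 0.0), (100.0, 1.2)]
results = {v: size_and_cost(pipes, v) for v in (0.8, 1.0, 1.2)}
```

With these hypothetical pipes, raising v_{design} from 0.8 to 1.0 m/s drops the largest pipe by one diameter class, which is the cost-reducing effect the stepwise increase exploits before the pressure check decides feasibility.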

[…] σ_{3,S1}). Along that path, a fraction (f) of the demand of the node with the pressure deficit is added to the EBC^{Q} (i.e., 1.2 L/s in the example with f = 1) to (slightly) increase the flow capacity from the source to that node. The design procedure is repeated, and the pressures are determined again with the new (hybrid) EBC^{Q} values. This procedure is repeated until there are no pressure deficits in the system. In the next step, the design velocity is further increased and, if necessary, pressure deficits are again addressed iteratively with the proposed hybrid approach.
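One update step of the hybrid approach can be illustrated as below. This is a sketch only: the function name `add_deficit_demand`, the edge labels, and the initial EBC^{Q} values are hypothetical, while the added 1.2 L/s with f = 1 follows the example in the text.

```python
def add_deficit_demand(ebc_q, path_edges, node_demand, f=1.0):
    """Return a copy of ebc_q with f * node_demand added to every edge on the path."""
    updated = dict(ebc_q)
    for edge in path_edges:
        updated[edge] += f * node_demand
    return updated

# Hypothetical initial EBC^Q values (L/s) on the shortest path from S1 to node 3
ebc_q = {("S1", "1"): 4.2, ("1", "2"): 3.2, ("2", "3"): 1.2}
path_to_node_3 = [("S1", "1"), ("1", "2"), ("2", "3")]
hybrid = add_deficit_demand(ebc_q, path_to_node_3, node_demand=1.2, f=1.0)
```

In the full procedure this update would sit inside a loop that re-sizes the pipes and reruns the hydraulic check until no pressure deficit remains; that loop is omitted here because it depends on the simulator.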

#### 2.3. Water Quality Assessment with Complex Network Analysis

[…]_{design}) are used. Because the average demand load (Q_{avg}) is decisive for water age simulation, the flow velocities from the resilience assessment are reduced by a factor (f_{V}) describing the ratio between average and design demand (Q_{avg} = f_{V} Q_{design}). In principle, the flow regime could change between load cases, but Sitzenfrei [15] showed that, for typical WDNs, this assumption leads to negligible errors in the flow velocities.
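The velocity scaling translates directly into residence times that can serve as edge weights. The following is a minimal sketch, assuming the velocity under design load is known for each pipe and is nonzero; the function name and the numbers in the example are hypothetical.

```python
def residence_time_h(length_m, v_design_load, f_v):
    """Residence time (h) in a pipe at average load, with v_avg = f_v * v_design_load."""
    v_avg = f_v * v_design_load      # velocity reduced to the average demand load
    return length_m / v_avg / 3600.0 # seconds converted to hours

# Hypothetical pipe: 1000 m long, 0.8 m/s under design load, f_V = 0.5
t_h = residence_time_h(1000.0, 0.8, 0.5)
```

Pipes with (near) zero velocity would need separate handling, since their residence time is unbounded in this simple relation.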

[…] σ_{i,j} for water age are basically similar to a hydraulic snapshot simulation in which only edges in the shortest path tree are considered. To account for extended period simulations (i.e., 24 different hourly values of a diurnal demand pattern over multiple days), time can be considered in a simple way [15]: a vector for pattern correction (C_{p}) is determined from the hourly pattern multipliers Cp_{i} (see Equation (3)).

[…] C_{0} = 0. However, with increasing network size, the network dispersion becomes more important [15] and should be calibrated with water quality simulation results for the investigated network. As global network correction, the fraction of flow in the co-tree (alternative flow, 0 ≤ Q_{alt} ≤ 1) and the mean node degree (mD) of the WDN were identified to describe the global network dispersion [15]. Q_{alt} is the sum of flows in edges outside the shortest path tree divided by the sum of flows in the shortest path tree. Q_{alt} can also be determined from the hydraulic simulation results of the resilience assessment. For the toy example in Figure 4, Q_{alt} is 0.026 (flow outside the shortest path tree, 0.78, divided by the sum of all edge flows, 30.34).
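The toy-example arithmetic for Q_{alt} can be reproduced with a short sketch. It follows the calculation given for the toy example (flow outside the shortest path tree divided by the sum of all edge flows); only the two totals, 0.78 and 30.34, come from the text, whereas the individual edge flows and edge names are hypothetical.

```python
def alternative_flow_fraction(edge_flows, tree_edges):
    """Share of the total edge flow carried by edges outside the shortest path tree."""
    total = sum(abs(q) for q in edge_flows.values())
    outside = sum(abs(q) for e, q in edge_flows.items() if e not in tree_edges)
    return outside / total

# Hypothetical split of the toy-example totals (sum 30.34, of which 0.78 in the co-tree)
edge_flows = {"p1": 15.0, "p2": 10.0, "p3": 4.56, "p4": 0.78}
tree_edges = {"p1", "p2", "p3"}
q_alt = alternative_flow_fraction(edge_flows, tree_edges)
```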

[…] σ_{i,j}, the water age based on CNA (A_{CNA}) can be determined. The entire procedure, systematic testing with hundreds of different designs and network topologies, and the validation of the graph-based water quality model can be found in [15]. For the toy example with f_{V} = 0.5, C_{p} = constant = 1, and C_{0} = 0, the CNA results for water quality are shown in Figure 5 and compared with an extended period simulation (1 h with a 1 min water quality time step) with Epanet2 (A_{EPS}).
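The core of the graph-based estimate, a shortest path search with residence times as edge weights, can be sketched as follows. This covers only the plain shortest-path part, without the pattern and topology corrections, whose exact form is given in [15]. Edges are assumed to be directed along the flow, and the function name and the residence times are hypothetical.

```python
import heapq

def water_age_cna(residence_h, source):
    """residence_h: {(u, v): hours}, directed along the flow. Returns {node: age in h}."""
    adj = {}
    for (u, v), t in residence_h.items():
        adj.setdefault(u, []).append((v, t))
    age = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        a, u = heapq.heappop(heap)
        if a > age[u]:
            continue  # stale heap entry
        for v, t in adj.get(u, []):
            if a + t < age.get(v, float("inf")):
                age[v] = a + t
                heapq.heappush(heap, (a + t, v))
    return age

# Hypothetical residence times in hours
residence_h = {("S", "1"): 0.5, ("1", "2"): 1.0, ("S", "2"): 2.0, ("2", "3"): 0.25}
age = water_age_cna(residence_h, "S")
```

Here node 2 is reached faster via node 1 (1.5 h) than via the direct pipe (2.0 h), so the direct pipe does not contribute to the estimate.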

[…]^{−6} less than with an extended period simulation with Epanet2 [15]. However, this computational efficiency can be enhanced even further because, in the graph model, the water quality does not have to be determined for all nodes; often only a few nodes in the WDN are decisive for water quality problems. These decisive nodes could be identified before the optimization task (with the graph-based model or, as a hybrid approach, with an extended period simulation with Epanet2). Subsequently, in the course of the optimization, the water quality of only these decisive nodes needs to be determined with the graph model.

#### 2.4. Case Study and Optimal Design Solutions

[…] Intel^{®} Core™ i5-6500 CPU @ 3.2 GHz, Mountain View, CA, USA) was 24 weeks.

[…] EBC^{Q} and hybrid EBC^{Q} for comparison. To determine the impact of numerical dispersion, Δt is also varied between 60 min, 15 min, 5 min, and 1 min.

## 3. Results

[…] EBC^{Q} and hybrid EBC^{Q} design are compared with the design solutions obtained with GALAXY. For better visibility, the figure only shows costs up to EUR 30 M, because the maximum cost produced by EBC^{Q} is around EUR 25 M and, as shown later on, design solutions with higher costs have insufficient water quality performance. In Figure 7a, the colored dots are design solutions determined with EBC^{Q}. Design solutions from EBC^{Q} above EUR 10 M (v_{design} < 0.5 m/s) are dominated by design solutions with higher velocities. This is because the resilience indicator used in this work considers not only hydraulic excess pressure but also the uniformity of the diameters connected to a node [19], which favors solutions in which the same diameters are connected to a node. With another resilience indicator focusing on hydraulic performance (e.g., [2]), this effect would not be observed. However, solutions starting from EUR 10 M exceed the water quality requirements; therefore, this area of the Pareto-front is not investigated further.
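The uniformity effect can be made concrete with a small sketch. It assumes the diameter uniformity coefficient as it is commonly stated for the network resilience of [19], i.e., the sum of the diameters connected to a node divided by the number of connected pipes times the largest of these diameters. The function name and the example diameters are illustrative only.

```python
def diameter_uniformity(diameters):
    """Uniformity of the pipe diameters connected to one node (1.0 = all equal)."""
    return sum(diameters) / (len(diameters) * max(diameters))

# Hypothetical nodes: diameters in mm of the connected pipes
uniform_node = diameter_uniformity([152.4, 152.4, 152.4])
mixed_node = diameter_uniformity([152.4, 76.2])
```

A node connecting equal diameters scores 1.0, whereas mixing 152.4 mm and 76.2 mm pipes gives 0.75, which illustrates why an indicator weighted this way favors solutions with the same diameters at a node.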

[…] EBC^{Q} produces solutions with a minimal resilience of 0.65. Solutions below that value are increasingly driven by topography and less by topology. Each EBC^{Q} design solution required one single hydraulic simulation run to determine its resilience value. Therefore, in total, 55 NFE (number of function evaluations) were required for that part of the Pareto-front. Nevertheless, compared with the results obtained with the genetic algorithm (GA) with 30 million NFE, excellent results were obtained (especially at the knee bend).

[…] EBC^{Q} solutions produce design solutions with lower costs. However, more NFE were required due to the iterative nature of the procedure. For the first design solution (I_{n} = 0.6257, EUR 4.54 M), 3 additional NFE were required, while for the least-cost design (I_{n} = 0.3808, EUR 3.5363 M), 719 NFE were required in total. Although hybrid EBC^{Q} in general improved the results of the CNA approach, the results obtained with the GA are still slightly better (but at a tremendously higher computational cost).

[…] EBC^{Q} and hybrid EBC^{Q} from Figure 7a) are assessed with extended period simulation with Epanet2. The obtained water ages (A_{EPS}) are then compared with the results of the graph-based water quality analysis (A_{CNA}) by calculating the coefficient of determination (R^{2}). With the plain shortest path analysis (σ_{i,j}) with residence time as edge weights, useful solutions can be obtained (R^{2} up to 0.71), but the median is less convincing (median R^{2} = 0.55). Applying the pattern correction to σ_{i,j} improves the results significantly (median R^{2} = 0.95, maximum R^{2} = 0.97). The topology correction improves them further (median R^{2} = 0.97, maximum R^{2} = 0.98). Note that the investigated design solutions represent a wide range of hydraulic designs with highly varying flow velocities, and the CNA-based approach is nevertheless capable of producing valuable results.
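The comparison metric can be sketched as follows. The text does not state which definition of R^{2} was used; this sketch assumes 1 − SS_res/SS_tot with A_{EPS} as the reference (the squared Pearson correlation would be another common choice). The water age values are hypothetical.

```python
def r_squared(reference, estimate):
    """Coefficient of determination of `estimate` against `reference` (1 - SS_res / SS_tot)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

# Hypothetical nodal water ages in hours
a_eps = [2.0, 4.0, 6.0, 8.0]
a_cna = [1.0, 3.0, 5.0, 7.0]   # constant underestimation of 1 h
r2 = r_squared(a_eps, a_cna)
```

Under this definition a constant offset already lowers R^{2} (here to 0.8), so a systematic underestimation of water age would show up in the metric.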

[…] EBC^{Q} designs, the impact of the Δt size was investigated. The potential loss of accuracy with a larger Δt can be interpreted as numerical dispersion.

[…] σ_{i,j} with the shortest residence time passes through 51 nodes, where it is potentially mixed with flows from the co-tree. Because of the rounding up to the next Δt, the water age is usually overestimated with increasing Δt. That effect is clearly shown in Figure 8b, where the x-axis represents the number of mixing nodes passed to reach each demand node and the y-axis shows ΔAge (h), i.e., the difference between the current A_{EPS} (Δt > 1 min) and A_{EPS} determined with Δt = 1 min. For each Δt, the slope k of the linear regression is shown. For example, with Δt = 1 h, an average of 1.72 h is added for each mixing node. For the case study, the overestimation of water age is up to approximately 100 h for a water quality time step of 1 h, up to 20 h for 15 min, and up to 7 h for 5 min. This analysis highlights the importance of using a sufficiently small Δt for water quality analysis with Epanet2. Furthermore, it demonstrates that larger Δt values yield only partly usable results.
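The slope k reported per Δt is an ordinary least-squares slope of ΔAge over the number of mixing nodes. A minimal sketch is given below; the data points are synthetic and placed exactly on a line with the reported slope of 1.72 h per node, so they illustrate the calculation rather than reproduce Figure 8b.

```python
def regression_slope(x, y):
    """Least-squares slope of y over x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Synthetic points: number of mixing nodes vs. ΔAge (h) for Δt = 1 h
mixing_nodes = [0, 10, 25, 51]
delta_age_h = [1.72 * n for n in mixing_nodes]
k = regression_slope(mixing_nodes, delta_age_h)
extra_age_51_nodes = k * 51  # about 88 h for the 51-node path
```

Multiplying the slope by the 51 mixing nodes of the longest path gives roughly 88 h, which is of the same order as the overestimation of up to approximately 100 h stated for Δt = 1 h.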

[…] EBC^{Q} design (see also Figure 8a), the statistical differences in nodal water age and the computation times are shown. The graph-based method has by far the shortest computation time and, compared with the results for Δt = 1 min, the best accuracy (median difference −0.19 h).
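The computation times in Table 1 can be turned into speed-up factors with a few lines. The times in seconds are copied from Table 1; the variable names are illustrative.

```python
# Computation time per design solution in seconds, from Table 1
times_s = {"60 min": 5.5, "15 min": 7.9, "5 min": 53.0, "1 min": 1836.7, "A_CNA": 0.023}

# Speed-up of the graph-based method relative to each Epanet2 time step
speedup = {
    method: t / times_s["A_CNA"]
    for method, t in times_s.items()
    if method != "A_CNA"
}
```

For this single design solution, the graph-based assessment is thus about 8 × 10^{4} times faster than the extended period simulation with Δt = 1 min, and still more than 200 times faster than the coarsest time step of 60 min.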

[…] A_{CNA}. On the left of Figure 9, the flow paths from the source to CN1 and CN2 are highlighted in red. On the right of Figure 9, the cumulative distribution function of all travel path lengths is presented, ranging from 0 m to 7480 m. The two controlling nodes CN1 and CN2 have significantly shorter travel paths than the maximum path length, namely 5150 m and 3010 m, respectively (red markers on the right of Figure 9). This underpins the importance of considering the actual flow regime.

[…] A_{CNA}) and the extended period simulation with Epanet2 (A_{EPS}) for all design solutions are shown. The colors indicate the obtained resilience values. One can see that with decreasing resilience values (from blue to red marker colors), the maximum water age in the controlling nodes also decreases. However, A_{CNA} slightly underestimates the water age compared with A_{EPS}. This underestimation could be compensated by, e.g., applying a safety factor to the threshold for A_{CNA} (to be on the safe side). Following the suggestion by [15], C_{0} = 0.1 was used as the parameter value for network correction. The network correction/dispersion could also be increased in the graph-based water quality model to potentially improve the results, but this would require further investigation. As the threshold for sufficient water age in the graph-based designs in this work, 36 h based on A_{CNA} is used.

[…] EBC^{Q} design with the remaining part of the Pareto-front obtained with GALAXY (black dots), very good solutions are obtained with the proposed graph-based framework.

## 4. Summary and Conclusions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

1. Möderl, M.; Sitzenfrei, R.; Rauch, W. How Many Network Sources Are Enough? In Proceedings of the World Environmental & Water Resources Congress, Challenges of Change, Providence, RI, USA, 16–20 May 2010.
2. Todini, E. Looped water distribution networks design using a resilience index based heuristic approach. Urban Water **2000**, 2, 115–122.
3. Farmani, R.; Walters, G.; Savic, D. Evolutionary multi-objective optimization of the design and operation of water distribution network: Total cost vs. Reliability vs. Water quality. J. Hydroinformatics **2006**, 8, 165–179.
4. Zischg, J.; Klinkhamer, C.; Zhan, X.; Rao, P.S.C.; Sitzenfrei, R. A century of topological coevolution of complex infrastructure networks in an alpine city. Complexity **2019**, 2019, 16.
5. Maier, H.R.; Kapelan, Z.; Kasprzyk, J.; Kollat, J.; Matott, L.S.; Cunha, M.C.; Dandy, G.C.; Gibbs, M.S.; Keedwell, E.; Marchi, A.; et al. Evolutionary algorithms and other metaheuristics in water resources: Current status, research challenges and future directions. Environ. Model. Softw. **2014**, 62, 271–299.
6. De Corte, A.; Sörensen, K. Optimisation of gravity-fed water distribution network design: A critical review. Eur. J. Oper. Res. **2013**, 228, 1–10.
7. Mala-Jetmarova, H.; Sultanova, N.; Savic, D. Lost in optimisation of water distribution systems? A literature review of system operation. Environ. Model. Softw. **2017**, 93, 209–254.
8. Möderl, M.; Sitzenfrei, R.; Fetz, T.; Fleischhacker, E.; Rauch, W. Systematic generation of virtual networks for water supply. Water Resour. Res. **2011**, 47, W02502.
9. Simone, A.; Ciliberti, F.G.; Laucelli, D.B.; Berardi, L.; Giustolisi, O. Edge betweenness for water distribution networks domain analysis. J. Hydroinformatics **2020**, 22, 121–131.
10. Giudicianni, C.; Herrera, M.; di Nardo, A.; Adeyeye, K. Automatic multiscale approach for water networks partitioning into dynamic district metered areas. Water Resour. Manag. **2020**, 34, 835–848.
11. Ulusoy, A.-J.; Stoianov, I.; Chazerain, A. Hydraulically informed graph theoretic measure of link criticality for the resilience analysis of water distribution networks. Appl. Netw. Sci. **2018**, 3, 31.
12. Meng, F.; Fu, G.; Farmani, R.; Sweetapple, C.; Butler, D. Topological attributes of network resilience: A study in water distribution systems. Water Res. **2018**, 143, 376–386.
13. Giudicianni, C.; Di Nardo, A.; Di Natale, M.; Greco, R.; Santonastaso, G.; Scala, A. Topological taxonomy of water distribution networks. Water **2018**, 10, 444.
14. Sitzenfrei, R.; Wang, Q.; Kapelan, Z.; Savić, D. Using complex network analysis for optimization of water distribution networks. Water Resour. Res. **2020**, 56, e2020WR027929.
15. Sitzenfrei, R. Using complex network analysis for water quality assessment in large water distribution systems. Water Res. **2021**, 201, 117359.
16. Sitzenfrei, R.; Qiu, M.; Ostfeld, A.; Savic, D.; Kapelan, Z. A Hybrid Approach for Considering Topography in Graph-Based Optimization of Water Distribution Networks. In Proceedings of the World Environmental and Water Resources Congress 2023, Henderson, NV, USA, 21–24 May 2023; pp. 831–841.
17. Wang, Q.; Savić, D.A.; Kapelan, Z. Galaxy: A new hybrid moea for the optimal design of water distribution systems. Water Resour. Res. **2017**, 53, 1997–2015.
18. Sitzenfrei, R.; Oberascher, M.; Zischg, J. Identification of network patterns in optimal water distribution systems based on complex network analysis. In Proceedings of the World Environmental and Water Resources Congress 2019, Pittsburgh, PA, USA, 19–23 May 2019; pp. 473–483.
19. Prasad, T.D.; Park, N.-S. Multiobjective genetic algorithms for design of water distribution networks. J. Water Resour. Plan. Manag. **2004**, 130, 73–82.
20. Rossman, L.A. Epanet 2 User Manual; National Risk Management Research Laboratory, U.S. Environmental Protection Agency: Cincinnati, Ohio, 2000.
21. Hwang, H.; Lansey, K. Water distribution system classification using system characteristics and graph-theory metrics. J. Water Resour. Plan. Manag. **2017**, 143, 04017071.
22. Giustolisi, O.; Simone, A.; Ridolfi, L. Network structure classification and features of water distribution systems. Water Resour. Res. **2017**, 53, 3407–3423.
23. Sitzenfrei, R.; Wang, Q.; Kapelan, Z.; Savic, D. A complex network approach for pareto-optimal design of water distribution networks. In Proceedings of the World Environmental and Water Resources Congress 2021, Online, 7–11 June 2021.
24. Hajibabaei, M.; Hesarkazzazi, S.; Minaei, A.; Savić, D.; Sitzenfrei, R. Pareto-optimal design of water distribution networks: An improved graph theory-based approach. J. Hydroinformatics **2023**.

**Figure 1.** Multi-objective design procedure based on CNA, adapted from [14] (licensed under CC-BY 4.0).

**Figure 3.** Concept of water quality assessment with CNA, adapted from [15] (licensed under CC-BY 4.0).

**Figure 5.** Toy example of water quality assessment with CNA (A_{CNA}) and comparison with extended period simulation with Epanet2 (A_{EPS}).

**Figure 6.** (**a**) Anonymized layout of large real WDS; (**b**) Pareto-front of optimal design solutions from meta-heuristic optimization (GALAXY).

**Figure 7.** (**a**) Comparison of designs based on genetic algorithm (GA) with graph-based design (EBC^{Q} and hybrid EBC^{Q}); (**b**) comparison (R^{2}) of water quality assessment (A_{EPS} and A_{CNA}) of the graph-based design (EBC^{Q} and hybrid EBC^{Q}).

**Figure 8.** (**a**) Impact of different Δt on the water age; (**b**) impact of number of mixing nodes on the water age difference ΔAge to the results with Δt = 1 min.

**Figure 10.** (**a**) Water age comparison (A_{CNA} and A_{EPS}) for different resilience values for the controlling nodes CN1 and CN2; (**b**) results from graph-based design (colored dots) in comparison with results from GALAXY genetic algorithm (GA) with number of function evaluations NFE = 30 million.

**Table 1.** Mean differences in water age with different methods for one design solution and computation time.

Δt/Method | Median Difference (h) and Standard Deviation to Δt = 1 min | Computational Time for Water Quality Assessment of One Design Solution |
---|---|---|
60 min | −46.18 h ± 17.22 h | 0.09 min = 5.5 s |
15 min | −10.57 h ± 3.97 h | 0.13 min = 7.9 s |
5 min | −2.74 h ± 1.06 h | 0.88 min = 53.0 s |
1 min | - | 30.61 min = 1836.7 s |
A_{CNA} | −0.19 h ± 0.27 h | 0.00038 min = 0.023 s |


© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Sitzenfrei, R.
A Graph-Based Optimization Framework for Large Water Distribution Networks. *Water* **2023**, *15*, 2896.
https://doi.org/10.3390/w15162896
