Network science provides a means to extract and characterize the topological structure present in water distribution systems [41,42]. In this regard, the topological structure of a water distribution system can be described as a graph $G = (V, E)$, where the set of vertices $V$ represents the junction nodes, reservoirs, and tanks, and the set of edges $E$ represents the pipes, valves, and pumps of the hydraulic system; $n$ is the number of nodes and $m$ is the number of pipes.
2.1. First Stage: Clustering of the System
A community or module is a set of vertices that are densely interconnected among themselves but relatively weakly connected to the rest of the network. Community detection is a topic studied by network science, and different methods exist for extracting communities [43,44]. In this study, the most popular method is applied, which consists of partitioning the network by maximizing a metric known as modularity, defined as [30,34]:

$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),$$
where the sum runs over all pairs of vertices, $A_{ij}$ are the elements of the adjacency matrix $A$, $k_i$ is the degree of vertex $i$, $m$ is the total number of edges, and $\gamma$ is the structural resolution parameter. The Kronecker delta $\delta(c_i, c_j)$ is equal to one if vertices $i$ and $j$ belong to the same community ($c_i = c_j$), and zero otherwise. The structural resolution parameter $\gamma$ allows the number (and size) of the communities to be controlled: the smaller $\gamma$ is, the smaller the number (and the larger the size) of the communities. In this work, a different value of $\gamma$ is used for each of the two case studies. These values were fixed in order to regulate the number of initial communities that serve as the starting point for the second stage. Indeed, the first stage provides a set of conceptual cuts that defines the communities. As illustrated in Figure 1, these communities are the “building blocks” of the districts: the method selects which conceptual cuts become “real cuts”, i.e., closed valves, and which must remain connected, i.e., no intervention. Here, $\gamma$ is tuned to obtain 10 communities for the first case study and 20 communities for the second case study. Of course, a finer initial partition offers more connectivity possibilities at the cost of increased computational effort. It is important to note that modularity maximization has been proved to be an NP-hard problem [45] (the acronym NP denotes Nondeterministic Polynomial time). Several heuristic strategies exist for the maximization of the modularity $Q$. In this work, a Louvain-type greedy algorithm is applied [32,33].
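As an illustration, a Louvain-type greedy maximization with a structural resolution parameter is available in common network libraries; the sketch below uses networkx on a toy graph (the graph, the resolution value, and the seed are illustrative, not the paper's case studies):

```python
import networkx as nx

# Toy stand-in for a WDN graph: two densely connected groups joined by one edge.
G = nx.Graph()
G.add_edges_from([
    (0, 1), (1, 2), (2, 0),   # first dense group
    (3, 4), (4, 5), (5, 3),   # second dense group
    (2, 3),                   # single weak link between the groups
])

# Louvain-type greedy modularity maximization; `resolution` plays the role
# of the structural resolution parameter gamma.
communities = nx.community.louvain_communities(G, resolution=1.0, seed=42)
print(len(communities))
```

Tuning `resolution` upward splits the network into more, smaller communities, which is how a target count such as 10 or 20 communities could be reached in practice.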
Maximization of $Q$ is carried out over all possible partitions of the network into communities. Each partition provides a set of edges, named conceptual cuts, whose endpoints belong to different communities. These are the design variables of this first stage, given by the binary vector $\mathbf{x} = (x_1, \ldots, x_m)$, where $x_i = 1$ if there exists a conceptual cut at edge $i$, and $x_i = 0$ otherwise. From the set of $s$ cuts given by $\mathbf{x}$, the vector of conceptual cuts $\mathbf{c} = (c_1, \ldots, c_s)$ is defined. Note that the extraction of communities is accomplished without solving the hydraulic system. As a consequence, only a subset of the $s$ conceptual cuts needs to be materialized as real interventions by a utility looking to sectorize its WDN.
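Given a partition, the conceptual cuts and the associated binary design vector can be extracted directly; in this sketch the toy graph, the fixed partition, and the variable names are illustrative assumptions:

```python
import networkx as nx

# Toy graph and an assumed two-community partition from the first stage.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)])
community_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

# Conceptual cuts: edges whose endpoints lie in different communities.
conceptual_cuts = [(u, v) for u, v in G.edges
                   if community_of[u] != community_of[v]]

# Binary design vector over all m edges: 1 marks a conceptual cut.
x = [int(community_of[u] != community_of[v]) for u, v in G.edges]

print(conceptual_cuts)  # [(2, 3)]
print(sum(x))           # s = 1
```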
2.2. Second Stage: Physical Division of the System
Aiming to partition the water system, the second stage consists of determining the optimal locations of isolation devices or, equivalently, determining the boundary pipes that must remain connected. It is important to note that all boundary pipes between two neighboring communities are assigned the same state: either all real cuts (closed valves) or all connected pipes. This condition considerably reduces the number of design variables of the optimization problem and brings a significant saving in computational effort. This strategy is accomplished here by defining a reduced, or collapsed, adjacency matrix. Starting from the structure detected during the first stage, a reduced adjacency matrix is built which encodes the connectivity among communities, similar to the approach presented in [36,37] and subsequently applied to a large WDN [38].
As before, the topological structure can be described as a new graph $G' = (V', E')$, where the set of vertices $V'$ now represents the previously determined communities, and the set of edges $E'$ represents the edges (indeed, multi-edges) between communities; $N$ is the number of communities and $v$ is the number of multi-edges among communities. To represent these multi-edges, a new binary design vector $\mathbf{w} = (w_1, \ldots, w_v)$ is defined, where $w_i = 1$ represents a set of valves to be installed at position $i$, and $w_i = 0$ represents no intervention at $i$. Importantly, the set of boundary edges (i.e., conceptual cuts) between two nearest-neighbor communities is embedded in a single multi-edge, so that the numbers of conceptual cuts embedded in the $v$ multi-edges sum to $s$.
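The reduced (collapsed) adjacency structure can be sketched as a small community-by-community matrix whose entries count the boundary edges bundled into each multi-edge; the toy graph, the partition, and the matrix name `R` are illustrative assumptions:

```python
import networkx as nx
import numpy as np

# Toy graph with three dense groups and an assumed three-community partition.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 0),
                  (3, 4), (4, 5), (5, 3),
                  (6, 7), (7, 8), (8, 6),
                  (2, 3), (2, 4),   # two boundary edges between communities 0 and 1
                  (5, 6)])          # one boundary edge between communities 1 and 2
community_of = {n: n // 3 for n in G.nodes}

# Reduced adjacency matrix: R[a, b] counts the conceptual cuts embedded
# in the multi-edge between communities a and b.
N = len(set(community_of.values()))
R = np.zeros((N, N), dtype=int)
for u, v in G.edges:
    a, b = community_of[u], community_of[v]
    if a != b:
        R[a, b] += 1
        R[b, a] += 1

print(R)
```

With this construction, the total number of conceptual cuts equals half the sum of all entries of the matrix, consistent with bundling boundary edges into multi-edges.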
This design stage is formulated as a two-objective optimization problem in which the number of conceptual cuts established as no intervention (i.e., pipes with no valves installed) is minimized together with one of the following indicators characterizing the hydraulic system. One of the requirements of any sectorization is to provide districts with similar levels of demand [28,29]. Focusing on this objective, two specific indicators of demand similarity are compared with a non-specific one, the loss of resilience. Specifically, these indicators are: (a) the Gini coefficient ($G$); (b) the standard deviation ($S$); (c) the loss of resilience ($L$).
(a) Gini coefficient. Although the Gini coefficient was originally defined as a measure of income inequality in a society, it is also used as a measure of any unequal distribution. It is an index that varies between 0 (absolute equality) and 1 (absolute inequality). In this study, the Gini coefficient is used to minimize the inequality of the demands required by each community. Formally, the Gini coefficient is defined as:

$$G = \frac{1}{2 N^2 \bar{d}} \sum_{i=1}^{N} \sum_{j=1}^{N} \left| d_i - d_j \right|,$$

where $d_i$ denotes the total demand of community $i$, $\bar{d}$ is the average demand of the complete water system, and $N$ is the number of communities.
(b) Standard deviation. This PI quantifies the dispersion of the required demands among the communities, that is:

$$S = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( d_i - \bar{d} \right)^2 },$$

where $d_i$ is the total demand of community $i$, $\bar{d}$ is the average community demand, and $N$ is the number of communities.
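Both demand-similarity indicators reduce to a few lines of code; the demand values below are illustrative, not taken from the case studies:

```python
import numpy as np

def gini(d):
    # Gini coefficient: 0 for perfectly equal community demands,
    # approaching 1 as the distribution becomes maximally unequal.
    d = np.asarray(d, dtype=float)
    n = d.size
    return np.abs(d[:, None] - d[None, :]).sum() / (2.0 * n * n * d.mean())

def demand_std(d):
    # Population standard deviation of the total community demands d_i.
    d = np.asarray(d, dtype=float)
    return float(np.sqrt(np.mean((d - d.mean()) ** 2)))

demands = [12.0, 8.0, 10.0, 10.0]   # illustrative total demands per community
print(gini(demands))        # 0.075
print(demand_std(demands))  # ~1.414
```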
(c) Loss of resilience. The resilience index was introduced by Todini [46] and, for a network without pumps, is given by:

$$I_r = \frac{ \sum_{i=1}^{n} q_i^{*} \left( h_i - h_i^{*} \right) }{ \sum_{k=1}^{r} Q_k H_k - \sum_{i=1}^{n} q_i^{*} h_i^{*} },$$

where $q_i^{*}$ and $h_i^{*}$ denote, respectively, the minimum required demand and head at node $i$; $h_i$ denotes the head at node $i$; $n$ is the number of nodes; $Q_k$ and $H_k$ are the discharge and head, respectively, of reservoir $k$; and $r$ is the number of reservoirs.
The resilience index is an important indicator of the surplus power available, which allows water to find alternative paths when the network is partitioned. Network reconfiguration via partitioning leads to a decrease in resilience. In this work, it is proposed to minimize the loss of resilience, measured relative to the resilience index $I_r^{0}$ of the unpartitioned network:

$$L = \frac{I_r^{0} - I_r}{I_r^{0}}.$$
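Todini's index and the associated loss admit a compact sketch; all numerical values below are illustrative, and the loss is assumed to be measured relative to the unpartitioned network:

```python
import numpy as np

def todini_resilience(q_req, h, h_req, Q_res, H_res):
    # Todini resilience index for a network without pumps:
    # surplus power delivered at the nodes over the maximum available power.
    surplus = np.sum(q_req * (h - h_req))
    available = np.sum(Q_res * H_res) - np.sum(q_req * h_req)
    return surplus / available

q_req = np.array([10.0, 10.0])    # minimum required demands q_i*
h_req = np.array([30.0, 30.0])    # minimum required heads h_i*
Q_res = np.array([20.0])          # reservoir discharge
H_res = np.array([50.0])          # reservoir head

I_full = todini_resilience(q_req, np.array([35.0, 34.0]), h_req, Q_res, H_res)
I_part = todini_resilience(q_req, np.array([33.0, 32.0]), h_req, Q_res, H_res)
L = (I_full - I_part) / I_full    # loss of resilience after partitioning
print(I_full, I_part, L)
```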
Formally, the two-objective optimization problem is formulated as:

$$\min_{\mathbf{w}} \; \left( f_1(\mathbf{w}), \; f_2(\mathbf{w}) \right), \qquad f_2 \in \{ G, S, L \},$$

subject to the hydraulic model described below, where $f_1(\mathbf{w})$ is the number of conceptual cuts left as no intervention and $h$ is the hour of peak demand at which the hydraulic constraints are evaluated.
Hydraulic analysis is performed under the pressure-driven approach (PDA), which is represented by the matrix equations:

$$\mathbf{A}_{pp} \mathbf{Q} + \mathbf{A}_{pn} \mathbf{H} = -\mathbf{A}_{p0} \mathbf{H}_{0}, \qquad \mathbf{A}_{np} \mathbf{Q} = \mathbf{d}(t, \mathbf{p}), \qquad t = 1, \ldots, T,$$

where $T$ is the number of hours of the hydraulic analysis, $\mathbf{A}_{pp}$ is the diagonal matrix of pipe resistances, $\mathbf{A}_{pn}$ (with $\mathbf{A}_{np} = \mathbf{A}_{pn}^{T}$) is derived from the pipe-node topological matrix, $\mathbf{H}_{0}$ is the vector of known (static) nodal heads, $\mathbf{Q}$ is the vector of unknown flow rates, $\mathbf{H}$ is the vector of unknown nodal heads, and $\mathbf{d}(t, \mathbf{p})$ is the vector of water demands, which depends on time and on the current pressure (for details on the hydraulic model, see ref. [47]). Equation (7) is completed with a model for the supplied demands [48]:

$$d_i^{s} = \begin{cases} d_i, & p_i \geq p^{*}, \\ d_i \left( \dfrac{p_i - p_{\min}}{p^{*} - p_{\min}} \right)^{1/2}, & p_{\min} < p_i < p^{*}, \\ 0, & p_i \leq p_{\min}, \end{cases}$$
where $p_i$ is the pressure at node $i$, $p_{\min}$ is the pressure below which no demand can be supplied, and $p^{*}$ is the minimum required pressure for fully supplying the demand $d_i$. The same values of $p_{\min}$ and $p^{*}$ (in m) were used for all nodes in all case studies, together with a fixed duration (in h) of the hydraulic analysis for the second case study.
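A supplied-demand relationship of this kind (full supply above the service pressure, a square-root transition, and zero below the minimum) can be sketched as follows; the threshold values and function name are illustrative assumptions, not the paper's settings:

```python
def supplied_demand(d_req, p, p_min, p_star):
    # Pressure-driven supplied demand: full demand at p >= p_star,
    # nothing at p <= p_min, square-root transition in between.
    if p >= p_star:
        return d_req
    if p <= p_min:
        return 0.0
    return d_req * ((p - p_min) / (p_star - p_min)) ** 0.5

print(supplied_demand(5.0, 30.0, p_min=0.0, p_star=20.0))  # 5.0 (full supply)
print(supplied_demand(5.0, 5.0, p_min=0.0, p_star=20.0))   # 2.5 (partial supply)
```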
2.3. Simulated Annealing
The minimization problem stated in this second stage is also NP-hard. In this work, the simulated annealing algorithm [49,50] was applied to solve the two-objective optimization problem. This algorithm consists of three essential features: (i) the generation of new solutions within the search space; (ii) an acceptance probability for a proposed solution given the current solution; and (iii) a cooling strategy. The algorithm starts from an initial solution at a certain fictitious temperature and continues for a prefixed number of steps. At each step, the method considers a new solution within the neighborhood of the current solution. In the single-objective version, the new solution is accepted as the current solution according to an acceptance probability given by:

$$P = \min \left\{ 1, \exp \left( -\frac{\Delta E}{T} \right) \right\},$$

where $\Delta E$ is the difference between the costs of the new and the current solutions. That is, if the cost of the new solution is smaller than the cost of the current solution, the new solution is accepted as the current solution. Otherwise, the worse solution is accepted with a probability that depends on the change in cost and on the temperature parameter. This strategy offers an opportunity to escape from local minima.
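The single-objective acceptance rule is the classical Metropolis criterion; the sketch below (the function name and toy costs are illustrative) makes the two branches explicit:

```python
import math
import random

def accept(cost_new, cost_cur, T):
    # Metropolis rule: always accept an improvement; accept a worsening
    # move with probability exp(-delta/T), which shrinks as T decreases.
    delta = cost_new - cost_cur
    if delta <= 0:
        return True
    return random.random() < math.exp(-delta / T)

print(accept(1.0, 2.0, T=1.0))   # True: improving moves are always accepted
random.seed(0)
print(accept(2.0, 1.0, T=0.1))   # worsening move, rarely accepted at low T
```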
Different implementations of simulated annealing can be found for dealing with multiobjective problems [51]. Here, the method proposed by Suppapitnarm and Parks (SMOSA) [52] is implemented, in which an acceptance probability with multiple temperatures (one per objective) is proposed:

$$P = \min \left\{ 1, \prod_{j} \exp \left( -\frac{\Delta E_j}{T_j} \right) \right\};$$

that is, the overall acceptance probability is given by the product of the individual acceptance probabilities for the objectives, each associated with its own temperature $T_j$.
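Under this rule, the overall probability is the product of one Metropolis factor per objective; the sketch below uses illustrative cost changes and temperatures:

```python
import math

def smosa_acceptance(deltas, temps):
    # Product of per-objective Metropolis factors, each capped at 1 and
    # driven by its own temperature T_j.
    p = 1.0
    for delta, T in zip(deltas, temps):
        p *= min(1.0, math.exp(-delta / T))
    return p

print(smosa_acceptance([-0.5, -0.1], [1.0, 1.0]))  # 1.0: both objectives improve
print(smosa_acceptance([0.5, -0.1], [1.0, 1.0]))   # exp(-0.5): one objective worsens
```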
Temperature plays a critical role in the convergence properties of the algorithm. Initially, the temperature must be high enough to accept a large fraction of the solutions generated. It is then decreased at each step following a proper cooling strategy, typically $T_{k+1} = \alpha T_k$ with $0 < \alpha < 1$. In this work, the same cooling strategy is used for both temperatures.
A summary of the methodology proposed in the study is shown in the flowchart of
Figure 2.