Generating Scenarios of Cross-Correlated Demands for Modelling Water Distribution Networks

: A numerical approach for generating a limited number of water demand scenarios and estimating their occurrence probabilities in a water distribution network (WDN) is proposed. This approach makes use of the demand scaling laws in order to consider the natural variability and spatial correlation of nodal consumption. The scaling laws are employed to determine the statistics of nodal consumption as a function of the number of users and the main statistical features of the unitary user’s demand. Besides, consumption at each node is considered to follow a Gamma probability distribution. A high number of groups of cross-correlated demands, i.e., scenarios, for the entire network were generated using Latin hypercube sampling (LHS) and the numerical procedure proposed by Iman and Conover. The Kantorovich distance is used to reduce the number of scenarios and estimate their corresponding probabilities, while keeping the statistical information on nodal consumptions. By hydraulic simulation, the whole number of generated demand scenarios was used to obtain a corresponding number of pressure scenarios on which the same reduction procedure was applied. The probabilities of the reduced scenarios of pressure were compared with the corresponding probabilities of demand.


Introduction
The conventional modelling of water distribution networks (WDNs) is normally based on a deterministic approach, not merely with regard to the geometrical and hydraulic features, but also with respect to the demand loadings [1]. However, water demand, being influenced by many factors, i.e., type of users, socio-economic conditions, geographic location with its climate, seasonal fluctuation of weather, water fixtures technology, policies in water management, and tariffs, is subject to a natural variability. Surely, the variability of the demand represents the major source of uncertainty, which affects the overall reliability of the model for the assessment of the spatial and temporal distribution of pressure heads as well as for the evaluation of the water quality in the different pipes. Uncertainty produced by the random nature of demand assumes a different importance in relation to the spatial and temporal scales that are considered in modelling the network. Obviously, they become more and more relevant as the finer scales are reached, that is small groups of users and instantaneous demands are considered. As stated by Bargiela & Sterling [2], it is possible to obtain accurate predictions for the network as a whole, however estimating nodal consumptions for nodes where the population is low is more difficult.
Thus, considering and quantifying the uncertainty of water demand could allow for the association of an acceptable probability/level of risk to the results from hydraulic and optimization models of WDNs at different temporal and spatial scales: a more robust design and control of these systems can be realized, and obvious opportunities can arise in dimensioning pipes, formulating water balances, controlling a system's components, and identifying and quantifying leakages.
An approach to dealing with the uncertainty of demand consists in explicitly considering different possible realizations of its value at the nodes of a WDN, i.e., different loading scenarios and associating to each of them a measure of their probability. So, it is possible to find a feasible solution which is also as close as possible to the optimum for all the scenarios: this scenario-based approach is known as robust optimization [3]. In the same way it is possible also to derive WDN reliability [4] or to localize leakages [5] or to map pressure-heads for a real-time control [6], all under uncertain demand conditions. Different possible scenarios, which include various aspects, such as peak flows, fire conditions at certain nodes, or pipe breakage, and the corresponding probabilities of occurrence could be obtained by consulting a panel of experts. However, this solution can have strong limitations in such a mathematically sensitive problem and can lead to arbitrary solutions.
To this aim, this paper focuses on defining an objective methodology for the generation of demand scenarios. In the literature, the issue of generating demand scenarios has been faced in [7] where uncertain future water consumptions are modeled using probability density functions (PDF) assigned in the problem formulation phase. Scenarios with correlated nodal demands are generated using LHS and the procedure suggested by Iman and Conover [8]. All nodal demands follow a Gaussian PDF with a coefficient of variation Cv = 0.10. The correlation coefficient between any two nodal demands is assumed equal to 0.50, as completed by Tolson et al. [9]. In [10] different demand scenarios for a WDN are derived combining demand values with a specific return period at each node. The probability of each scenario is obtained considering a multivariate normal distribution (MVN). The correlation between demand was found to significantly affect the occurrence probability of the demand scenarios. Also, Eck et al. [11] generated demand scenarios considering a MVN distribution. They assumed a prior estimate of the mean values and covariance matrix of water demand from a preliminary analysis. An original, but computationally expensive, approach for estimating the probability of a given demand scenario is based on the use of the contingency tables [12]. This is a non-parametric method in which the various random variables are divided into classes and their marginal probabilities are calculated through the frequency of occurrence of each class. The joint probabilities are evaluated by counting the occurrences of the simultaneous classes.
In this paper a numerical approach for generating a limited number of water demand scenarios and estimating their occurrence probabilities in a WDN is presented. Scaling laws [1,13,14] are used to evaluate the statistics of aggregated demand at each node of the network, starting from the statistics of demand of the unitary user. The Apulian WDN [15] is considered as a case-study and two different hypotheses are made for the number of users in the nodes. For each of these hypotheses a large number of cross-correlated demands, i.e., scenarios, was generated using LHS from Gamma PDF and a procedure based on the approach proposed by Iman and Conover [8]. The Kantorovich distance [16] is used to reduce the number of scenarios and estimate their corresponding probabilities. In this way, the limited number of network demand scenarios maintain the statistical information on the demand and the effects on the performance of a WDN. In the design and control of WDNs, demand scenarios are mostly functional to the evaluation of service conditions and specifically pressure-heads in the nodes. In order to evaluate how the uncertainty of demand is conveyed to the pressure-head field, a demand-driven hydraulic model was also applied to solve the Apulian network. Many pressure scenarios were obtained, and the same reduction procedure was employed. The probabilities of the reduced scenarios of pressure were compared with the corresponding probabilities of the demand.
Here, a brief summary of the structure of this work is presented. In Section 2 the description of the proposed methodology is provided. In particular, a summary of the scaling law approach for water demand is reported in Section 2.1, the scenario generation procedure in Section 2.2, and the scenario reduction procedure in Section 2.3. In Section 3 an example is carried out by applying the whole procedure on a literature WDN and the results are discussed. Finally, Section 4 provides some conclusive comments.

Scaling Laws
In the first phase of the proposed methodology, the statistical features of WDN nodal demand are derived using the scaling laws [1,13]: i.e., the first and second statistical moments and cross-correlation are evaluated, and the probability distributions defined. Input data are the nodal number of consumers and the statistical features of water demand of the typical single user, the unitary user. The statistical parameters of the unitary user's demand can be derived either by monitoring, or simulating by descriptive models, such as End-Uses models [17] or Poisson Rectangular Pulse models preserving correlation [18].
We consider a network with i = 1, 2, 3, . . . , N nodes and n i = n 1 , n 2 , n 3 , . . . , n N users at each node. The demand of a unitary user, identified by subscript 1, is described by an ergodic stationary stochastic process, with mean µ 1 , variance σ 2 1 and cross-correlation coefficient between each couple of single-user ρ 1 . The whole nodal demand, that is the sum of all users' consumption at each node, is a realization of the stochastic process whose statistics are dependent on the number of nodal users. The expected value E[µ n i ] for the mean of the aggregated process at the ith node is given by: and the expected value for the variance E[σ 2 n i ] at the same node, neglecting the bias that can be caused when using small the demand series (short observation periods) [13], is given by: Equation (2) shows that the expected value of the variance of the aggregated process depends on E[ρ 1 ] : if demands are perfectly correlated in space, i.e., E[ρ 1 ] is equal to one, Equation (2) is simplified into: if demands are uncorrelated in space, i.e., E[ρ 1 ] is equal to zero, equation [2] is simplified into: For partially correlated demands, a power law E[σ 2 n i ] = n α i E[σ 2 1 ], with α scaling variance exponent, has been also derived [1,14]. Therefore, a complete statistical characterization of demand requires not only the definition of its mean and variance, but also the definition of the correlation between demands of each couplet of users and groups of users. The cross-correlation refers to the similarity between demand patterns from different consumers or from different nodes. This parameter was proved to be not negligible [19] and to affect the hydraulic performance of a WDN as well as the cost to achieve a desired level of reliability. It was verified that higher cross correlations lead to higher pressure fluctuations, which have negative impacts on the reliability of the WDN [20]. Following the same assumption and notation, the expected value for the cross-correlation between all nodes of the network is represented by a N-by-N square matrix, whose elements are given by: with i, j = 1, 2, 3, . . . , N, and where, for example, ρ n 1 n 2 is the cross-correlation coefficient between the demand of n 1 aggregated users at node 1, and the demand of n 2 aggregated users at node 2. Through Equations (1)-(3) the nodal demand statistics of the network and the correlation structure between them are entirely defined.

Generation of Scenarios
In simulation and optimization problems, several methods exist to cope with uncertainty [21,22]. If we do not know exactly the input data, in our case water demand at the nodes of a WDN, because they can assume different values and then many combinations of them are possible, we are dealing with scenarios. But, if the statistical features of uncertain data are known, numerical solutions can be obtained by approximating the probability distribution function with discrete distributions having a finite number of outputs, again referred to as scenarios.
Then, the second phase of the present approach consists in the generation of a large number of water demand scenarios for a WDN, based on the statistics estimated by the scaling laws and making the hypothesis of Gamma-distributed water demand at each node. The knowledge of the scaling laws of the statistical moments and the type of the probability distributions of water demand in relation to the number of users, prove to be a useful tool to face the inherent uncertainty of demand and in particular to address the optimization problems. Using Gamma distribution for demand is supported when the number of aggregated users is high enough or the time resolution is greater than five minutes, by measurements in the Latina case-study [1]. Furthermore, in a recent work Kossieris and Makropoulos [23] investigated the performance of ten probabilistic models showing that both Gamma and Weibull distributions can be used to adequately describe the nonzero water demand recorded at different time scales.
Scenarios can be generated following different methods: by matching a small set of statistical properties, e.g., moments [24,25] or simulating some defined mathematical process (e.g., Brownian motion) or sampling from known distributions [26]. For scenarios with a large number of variables correlated in between, sampling from the joint distribution is not usual for the difficulty in defining the distribution itself. An alternative to specifying the joint distribution is to make use of just the marginal distributions and the linear or rank correlation matrix alone. If nothing is known about the form of the joint distribution, a coupling procedure can be used: in this case the generated scenarios will respect an arbitrary dependency structure based on the procedure followed.
Each demand scenario D u is defined here as a set or combination of nodal demand values occurring simultaneously in the WDS. This can be represented by the N dimensional vector: where u = 1, 2, . . . , S is the index identifying the different scenarios, and d i,u is the demand at node i for the uth scenario and D u depicts a one-dimensional stochastic data process. A sampling method from the marginal distribution based on the LHS and the Iman-Conover approach [8] is followed. The restricted pairing technique by Iman and Conover induces rank correlation between the given marginals by shuffling finite-size samples from each of them. The appropriate shuffle is determined by ranking the input samples the same as in a reference sample with the desired rank correlation. The complete procedure, that will be described in the following, is quite straightforward because it requires only the Cholesky decomposition, some matrix algebra, and the final rearrangement of the original uncorrelated sample.
Description of the Procedure: Step 1. Create a random (S, N) dimensional matrix Z*, containing S Latin Hypercube Samples of size N from a standardized normal distribution, where S is the number of scenarios and N the number of the demand nodes in the WDN. For this purpose, the Matlab function lhsnorm was used. Owing to the finite size of the samples their correlation matrix I* (Here, the asterisk is used to distinguish data and corresponding correlation matrices to be corrected) does not coincide with the identity matrix I, that is they are not independent. Then, the lower triangular Cholesky decomposition is applied to induce the desired correlation [27]. Specifically: (5) and the (S, N) dimensional matrix Z of perfectly independent S samples of size N from a standardized normal distribution is obtained. In order to obtain the Cholesky root, the Matlab function chol was used.
Step 2. Create a random (S, N) dimensional matrix G containing S standardized normal samples with the correlation matrix Corr from the scaling laws for nodal demand. To this aim the desired correlation is induced in Z also applying the lower triangular Cholesky decomposition [27], that is: Step 3. Transform the matrix G in the (S, N) dimensional matrix D* which complies with the desired marginal distributions at each demand ith node. Transformation is based on the inverse cumulative distribution function, CDF, of the desired marginals, F i . Specifically, for each element D* i of matrix D*, i.e., a non-normal random sample with the desired CDF, the following equation holds: where Φ(G i ) is the CDF of the ith samples of G and it is uniformly distributed. This procedure is known as the inverse transformation method [28]. Function Φ(G i ) can also be interpreted as a realization from the Gaussian copula. Applying the inverse CDF F −1 i to the uniform random variable Φ(G i ) ensures that D * i is distributed according to Φ i . Unfortunately, the transformation in Equation (1) is non-linear, and therefore the correlation matrix Corr* of D* is not equal to the desired correlation matrix Corr.
Step 4. Apply the Iman-Conover algorithm proposed by Ekström [29] in order to get a better approximation of the desired correlation matrix Corr for the (S, N) matrix of nodal demand scenarios D*. The algorithm is described in the following steps: 4.1 Calculate lower triangular Cholesky decomposition V of Corr, i.e., Corr = V·V T .

4.3
Obtain T such that Corr = T·Corr·T T , can be calculated as T = V·Q −1 .

4.4
Obtain the matrix ScoreD* by rank-transforming D and convert to van der Waerden scores, defined as F −1 i (Φ(i/(N + 1)) where φ is the CDF of the standard normal distribution, i is the assigned rank and N is the total number of samples.

4.5
Calculate the target scores matrix ScoreD = ScoreD*·T T . 4.6 Match up the rank pairing in D* according to ScoreD, obtaining the new (S,N) dimensional matrix D containing the S scenarios of the N nodal demand in the WDS. The N samples are distributed according to the desired marginals and their correlation matrix is close to the correlation matrix derived from the scaling laws.

Scenario Reduction
With the scenario generation procedure, we obtain a great number of pictures, each of which represent a single snapshot of the whole water demand in the network. The higher the number of scenarios generated, the better the description of the variability of water demand in the WDS. However, it is not possible to manage such a large number of scenarios to deal with stochastic or robust optimization problems. Moreover, the probability associated with each of them is not very significant.
We have to reduce the scenarios and at the same time determine, for the reduced scenarios, a significant weight representative of their possibility to be realized. Then, the goal of scenario reduction is to approximate the discrete distribution of the generated scenarios with another discrete distribution having fewer elements. At this point, the choice of the number of scenarios becomes a critical step in obtaining meaningful solutions taking into account the system performance and the robustness of the solution to variations in the uncertain data.
It is assumed that the probability distribution P of the N-dimensional stochastic data process is approximately given by many scenarios: to which the probabilities p u are associated and ∑ S u=1 p u = 1. In order to approximate the probability distribution P with another Q distribution, with a smaller number of elements, so that Q will be as close as possible to P in terms of probabilistic distance, we use the Kantorovich distance, K, which is the most common probability distance used in stochastic optimization [30]. It represents the optimal value to a linear transportation problem. In this problem a cost function c N , defined by some norm | .| on n is introduced as a measure of the distance between couples of scenarios [30].
The previous problem is not easy to solve and in order to overcome to these difficulties, heuristic algorithms have been developed, in particular fast-backward and fast forward strategies have been implemented [31]. In this paper, we make use of the forward selection algorithm. It defines an iterative process which starts with an empty set. At each iteration, from the set of the non-selected scenarios, the scenario minimizing the Kantorovich distance between the reduced and original sets is selected and inserted in a reduced set. The optimal selection of a single scenario may be repeated recursively until the prescribed number S of elements is reached.
Actually, the forward selection algorithm does not guarantee that the reduced set of scenarios is the closest in the Kantorovich distance with respect to the original set and represents the optimal solution of the original transportation problem. However, the empirical results described in the literature [32] indicate that the forward selection algorithm works well in practice.
Description of the Procedure: In this paper a scenario reduction algorithm based on Kantorovich distance was used ( Figure 1). In particular, the fast-forward selection algorithm as described in [30] was applied. Starting from an empty set, an iterative process was followed until the required number of selected scenarios was reached. In the following is reported a brief description of this methodology.
First, the high number of generated demand scenarios at each node have been assumed to be equiprobable. Thus, at each iteration, using the Euclidean norm 2 , the distances between all the possible pairs of scenarios were calculated for each node. Then, by summing the corresponding distances for all the different nodes, the cost function matrix c N was derived considering the Euclidean norm. This function allows the evaluation of the Kantorovich distances matrix between pairs of scenarios taking into account their probability of occurrence.
The scenario corresponding to the minimum value of the Kantorovich distance is then selected and the cost matrix is updated by replacing each element with the minimum value between the original element and the one corresponding to the selected scenario. At this point, the procedure is repeated, and a new scenario is added to the reduced scenario set until the number of requested scenarios is reached. In the end, an optimal redistribution of probabilities was carried out by adding the probabilities of non-selected scenarios to the probabilities of those in the reduced set, that is the probability of each non-selected scenario was summed to the probability of the closest selected scenario according to the cost function.
Therefore, according to equation, the new probability of a preserved scenario is equal to the sum of its former probability and all the probabilities of the deleted scenarios that are closest to it according to c N . Obviously, the deleted scenarios have probability zero.
Water 2018, 10, x FOR PEER REVIEW 7 of 20 Therefore, according to equation, the new probability of a preserved scenario is equal to the sum of its former probability and all the probabilities of the deleted scenarios that are closest to it according to . Obviously, the deleted scenarios have probability zero.

Theoretical WDN: The Apulian Network
For a better understanding of the developed approach, an application using the Apulian network layout [15] is elaborated as an example. The Apulian network layout comprises 1 reservoir, 23 demand nodes and 34 pipes (Figure 2).

Theoretical WDN: The Apulian Network
For a better understanding of the developed approach, an application using the Apulian network layout [15] is elaborated as an example. The Apulian network layout comprises 1 reservoir, 23 demand nodes and 34 pipes (Figure 2). Therefore, according to equation, the new probability of a preserved scenario is equal to the sum of its former probability and all the probabilities of the deleted scenarios that are closest to it according to . Obviously, the deleted scenarios have probability zero.

Theoretical WDN: The Apulian Network
For a better understanding of the developed approach, an application using the Apulian network layout [15] is elaborated as an example. The Apulian network layout comprises 1 reservoir, 23 demand nodes and 34 pipes (Figure 2).  This network is used only with the objective of defining a scene and providing visual help to the application example. The scaling laws have an important role, mostly when the number of users at each node is low and the sampling time short, because in this case the cross-correlation between pairs of nodal demands preserves values that are not too large. In great WDN the approach can be used but it is significant only if the network is well described, i.e., it is not too skeletonized. Otherwise, with a great number of users in the nodes, the most probable scenario can be derived by simply considering the mean demand in the node. The Apulian network is representative of a real situation and its small size determines very short computational times and makes more readable the results. The geometrical features of the WDN are summarized in Table 1. Data from two-years long water demand measurements of 82 single household users in the town of Latina, Italy [1] were considered for the definition of the statistical parameters of a typical residential consumption and the calibration of the scaling laws. Table 2 summarizes the average water demand of each typical user and its relevant statistics at peak hour (7:00 a.m.-8:00 a.m.), considering a five-minute time step in data monitoring. The users were considered all the same type and for this reason the same statistical parameters were employed. The number of users at each node of the WDN is also listed in Table 1. In the following examples, residential water consumers are considered, but the approach can be applied to any type of consumption, provided that the statistical characteristics of the typical user are available through monitoring or numerical simulation. Regarding the correlation between pairs of single households a very low value of the Pearson coefficient, i.e., E[ρ 1 ] = 0.0043, was considered, in agreement with most of the experimental data from the case study of Latina. Table 1 also shows the number of consumption units for each water demand node. In order to highlight how the number of users per node and their relationships affect the generated demand scenarios two different frameworks were examined to which correspond respectively DemandA and DemandB column. The DemandA values are the same used by Giustolisi et al. [33] for Apulian network and the users' number is consequent. Differently, DemandB values were defined assuming a smaller total number of users and a greater variability of the number of users per node. Table 2. Relevant statistics and parameters derived from the consumption data of Latina.

Generation of Demand Scenarios
The methodology presented in this work was applied to generate scenarios of contemporary water demands in the supply nodes of the Apulian distribution network. Two different hypotheses have been made on the number of users at the demand nodes. In the first one, DemandA, the number of users was determined in order to obtain the demand values assumed by Giustolisi et al. [33], which are appropriate for the subsequent hydraulic simulation of the network. Instead, the second hypothesis, DemandB, considers a lower number of total users, but, above all, considerably differentiates the number of users at each node. This was done with the aim of highlighting how the correlation matrix obtained from the scaling laws is influenced by the number of users in the nodes and by their mutual relations, and the proposed methodology can manage complex scenarios. The statistical parameters describing the unitary user's demand are obtained from the experimental data of a group of users in the case study of Latina [1]. Data are referred to peak hour and their sampling interval is equal to five minutes.

DemandA
Twelve-thousand demand scenarios were generated using the statistics estimated by the scaling laws and making the hypothesis of Gamma-distribution at each node, as shown in Figure 3. The large number of scenarios makes it difficult to distinguish them. But above all, nothing can be said about their probability. Water 2018, 10, x FOR PEER REVIEW 10 of 20 Gamma distribution proves to be the best in fitting the generated data in all nodes of the WDN, as shown in Figure 4. The scale and shape parameters of the input distributions Γ(a,b), estimated by the scaling laws (INPUT), agree well with the corresponding parameters of the generated scenarios (OUTPUT), Table 3. Regarding the correlation matrix of the generated scenarios, it almost perfectly matches with the input correlation matrix obtained from the scaling laws. Table 4 compares the minimum, average and maximum value of the input and output correlation matrices.  Gamma distribution proves to be the best in fitting the generated data in all nodes of the WDN, as shown in Figure 4. The scale and shape parameters of the input distributions Γ(a,b), estimated by the scaling laws (INPUT), agree well with the corresponding parameters of the generated scenarios (OUTPUT), Table 3. Regarding the correlation matrix of the generated scenarios, it almost perfectly matches with the input correlation matrix obtained from the scaling laws. Table 4 compares the minimum, average and maximum value of the input and output correlation matrices.  Gamma distribution proves to be the best in fitting the generated data in all nodes of the WDN, as shown in Figure 4. The scale and shape parameters of the input distributions Γ(a,b), estimated by the scaling laws (INPUT), agree well with the corresponding parameters of the generated scenarios (OUTPUT), Table 3. Regarding the correlation matrix of the generated scenarios, it almost perfectly matches with the input correlation matrix obtained from the scaling laws. Table 4 compares the minimum, average and maximum value of the input and output correlation matrices.

DemandB
Also, in this case twelve-thousand scenarios were generated using the statistics estimated by the scaling laws, considering Gamma-distributed demands at each node, Figure 5. Differently from DemandA case, the value of generated data shows a great excursion between node and node due to the large variability in the number of users.  Similar to the previous case, Gamma distribution proves to be the best in fitting the generated data, as in Figure 6. Also, the INPUT distributions parameters, estimated by the scaling laws, well agree with the corresponding parameters of the generated scenarios distributions (OUTPUT), as in Table 5. In this case the correlation matrix of the generated scenarios almost matches perfectly with the input correlation matrix obtained from the scaling laws. Table 6 compares the minimum, average and maximum value of the input and output correlation matrices. The input and output correlation matrices are almost identical, but it should be noted that the correlation coefficient has very low values when considering node pairs with low number of users, intermediate values when one of the two has many users, higher values when the number of users is high for both. This responds to the fact that the correlation coefficient depends on the product of the number of users of the two considered nodes, see Equation (3).

Reduction of Demand Scenarios
The final phase of the procedure involves reducing the number of scenarios by aggregating them in relation to the distance of Kantorovich. For the application of the method the Euclidean norm was considered here. A reduced number of scenarios equal to 20 was chosen. In general, the choice of the number of scenarios should be based on the requirements of robust optimization problems and on the need for the reduced set to continue to describe the whole probability distribution of the demand at each node of WDN.

DemandA
In Figure 7 the reduced demand scenarios are represented. The 'mean scenario', i.e., the one defined by the average nodal values of nodal demand is plotted by the black dotted line. The red dotted line indicates the most probable scenario, as reported in Figure 7. Mean and most probable scenarios, actually, do not coincide but are very close.  The most probable scenario in the set of generated ones, number 3720, shows a relevant weight, approximately equal to 0.3. Also, scenarios 5945 and 4492 exhibit significant weights, respectively 0.25 and 0.11. All the others have weights lower than 0.05, Figure 8. Nodal demands of the three most remarkable scenarios are reported in Table 7 and compared with the average value of demand in each node, 'mean scenario'. We can notice that all the selected scenarios are close to the mean one. The most probable scenario in the set of generated ones, number 3720, shows a relevant weight, approximately equal to 0.3. Also, scenarios 5945 and 4492 exhibit significant weights, respectively 0.25 and 0.11. All the others have weights lower than 0.05, Figure 8. Nodal demands of the three most remarkable scenarios are reported in Table 7 and compared with the average value of demand in each node, 'mean scenario'. We can notice that all the selected scenarios are close to the mean one.
The most probable scenario in the set of generated ones, number 3720, shows a relevant weight, approximately equal to 0.3. Also, scenarios 5945 and 4492 exhibit significant weights, respectively 0.25 and 0.11. All the others have weights lower than 0.05, Figure 8. Nodal demands of the three most remarkable scenarios are reported in Table 7 and compared with the average value of demand in each node, 'mean scenario'. We can notice that all the selected scenarios are close to the mean one.

DemandB
In Figure 9 the reduced demand scenarios are represented. In this case it is more difficult to distinguish one scenario from the other because of the big difference of average demand in the nodes Also, here there is no coincidence between the mean and the most probable scenario, number 6232 in the set of generated scenarios, but they are very close, as in Figure 10. In this case only another scenario, number 9392 has a weight greater than 0.1, but four scenarios exceed the probability threshold of 0.05. In Table 8 nodal demands of the three most remarkable scenarios are reported and compared with the average value of demand in each node, the 'mean scenario'.
Also, here there is no coincidence between the mean and the most probable scenario, number 6232 in the set of generated scenarios, but they are very close, as in Figure 10. In this case only another scenario, number 9392 has a weight greater than 0.1, but four scenarios exceed the probability threshold of 0.05. In Table 8 nodal demands of the three most remarkable scenarios are reported and compared with the average value of demand in each node, the 'mean scenario'.

Hydraulic Simulation with Scenarios from DemandA
Water demand is the forcing parameter of a WDN and its natural variability is reflected on the variability of the quantities describing the hydraulic behavior of the whole system. The following question arises: how does the uncertainty of water demand determine the uncertainty of nodal pressure-heads and pipe flow-rate in a WDN? More specifically, how does the most probable water demand scenario coincide with the most probable pressure-head scenario? At this aim a set of

Hydraulic Simulation with Scenarios from DemandA
Water demand is the forcing parameter of a WDN and its natural variability is reflected on the variability of the quantities describing the hydraulic behavior of the whole system. The following question arises: how does the uncertainty of water demand determine the uncertainty of nodal pressure-heads and pipe flow-rate in a WDN? More specifically, how does the most probable water demand scenario coincide with the most probable pressure-head scenario? At this aim a set of pressure scenarios was derived from the set of generated demand scenarios in the DemandA case using a demand-driven hydraulic model based on the Global-Gradient algorithm proposed by Todini and Pilati [34]. Twelve-thousand pressure scenarios have been obtained for the Apulian network, as shown in Figure 11. This example does not take into account service constraints on each node. Actually, not all pressure scenarios deriving from demand scenarios can be significant in dealing with design and optimization problems. Using pressure-driven hydraulic models can allow the automatic filtering of unrealistic scenarios. The reduction algorithm was also carried on and twenty reduced scenarios with the corresponding probabilities were derived, as in Figures 12 and 13.        The results show how the twelve-thousand scenarios generated, unlike the corresponding ones of the water demand, have the same 'shape', which depends on the geometrical and hydraulic characteristics of the WDN. From the results it is also possible to identify the nodes in which the pressure value shows greater variability: this can provide important support in positioning pressure gauges. Only two scenarios, number 7577, with weight approximately equal to 0.48, and number 6780, with weight about 0.39, remark a significant probability. All the others have probability lower than 0.025. In Table 9 pressure height in each node is reported for the two most probable pressure scenarios, for the scenario of mean pressure, for that of most probable demand (3720) and for the 'mean scenario'. There are no significant differences between the highlighted scenarios.

Conclusions
This paper describes a complete procedure to generate and reduce the number of water consumption scenarios in order to represent the uncertainty due to the variability of demand in a WDN. Moreover, with this procedure, an objective measure of the probability of occurrence is related to each of the reduced scenarios.
For this purpose, the scaling laws, proposed by Magini et al. [1], prove all their potentials. In fact, they allow for a definition of the main statistical features of consumption for a different number of aggregated consumers, starting from the statistical characteristics of a typical user. Here, the features of a specific residential water demand are considered. However, if water demand is adequately typified, it is possible to examine different kinds of water use (commercial, public, . . . ), different socio-economic categories and geographical locations.
One of the main goals of the proposed procedure is to correctly reproduce the statistical characteristics of water consumption in generating scenarios. The nodal water demand is supposed to follow a Gamma probability distribution with parameters dependent on the number of users. To respect cross-correlation between couples of water demand in the nodes, a generation algorithm based on the approach of Iman-Conover is implemented [8]. The generated water demand scenarios respect the hypothesized marginal probability distributions and reproduce the cross-correlations almost perfectly.
Regarding the scenario reduction, a probability/weight is defined for each of the reduced scenarios. Obviously, the probability/weight depends on the number of the reduced scenarios. The results show that the most likely scenario is close to the scenario represented by the average demand in each node. This fact may seem trivial, less trivial is the weight that is associated with them. We can observe that the reduced scenarios do not cover the whole set of generated scenarios, so they cannot completely describe the uncertainty of the demand, leaving out the tails of probability distributions, i.e., not considering the least likely scenarios. Probably, the relevant cross-correlation between couples of nodal water demand makes the Euclidean norm not perfectly suitable to describe the whole probability distributions, and the use of a different norm should be tested in future developments.
The reduction procedure has been applied also to the pressure-head scenarios which are derived from the generated demand. Also, in this case the most probable scenario is close to that defined by the average pressures in each node of the WDN, but it does not coincide with that produced by the most probable demand scenario.
The proposed procedure is certainly of considerable importance not only in the problems of robust optimization, but more in general in the modeling of the WDNs, both for their design and for their control.

Conflicts of Interest:
The authors declare no conflict of interest.