 Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

# Generating Scenarios of Cross-Correlated Demands for Modelling Water Distribution Networks

Department of Civil, Building, and Environmental Engineering, Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
*
Author to whom correspondence should be addressed.
Water 2019, 11(3), 493; https://doi.org/10.3390/w11030493
Received: 30 December 2018 / Revised: 1 March 2019 / Accepted: 4 March 2019 / Published: 8 March 2019

## Abstract

:
A numerical approach for generating a limited number of water demand scenarios and estimating their occurrence probabilities in a water distribution network (WDN) is proposed. This approach makes use of the demand scaling laws in order to consider the natural variability and spatial correlation of nodal consumption. The scaling laws are employed to determine the statistics of nodal consumption as a function of the number of users and the main statistical features of the unitary user’s demand. Besides, consumption at each node is considered to follow a Gamma probability distribution. A high number of groups of cross-correlated demands, i.e., scenarios, for the entire network were generated using Latin hypercube sampling (LHS) and the numerical procedure proposed by Iman and Conover. The Kantorovich distance is used to reduce the number of scenarios and estimate their corresponding probabilities, while keeping the statistical information on nodal consumptions. By hydraulic simulation, the whole number of generated demand scenarios was used to obtain a corresponding number of pressure scenarios on which the same reduction procedure was applied. The probabilities of the reduced scenarios of pressure were compared with the corresponding probabilities of demand.

## 1. Introduction

The conventional modelling of water distribution networks (WDNs) is normally based on a deterministic approach, not merely with regard to the geometrical and hydraulic features, but also with respect to the demand loadings . However, water demand, being influenced by many factors, i.e., type of users, socio-economic conditions, geographic location with its climate, seasonal fluctuation of weather, water fixtures technology, policies in water management, and tariffs, is subject to a natural variability. Surely, the variability of the demand represents the major source of uncertainty, which affects the overall reliability of the model for the assessment of the spatial and temporal distribution of pressure heads as well as for the evaluation of the water quality in the different pipes. Uncertainty produced by the random nature of demand assumes a different importance in relation to the spatial and temporal scales that are considered in modelling the network. Obviously, they become more and more relevant as the finer scales are reached, that is small groups of users and instantaneous demands are considered. As stated by Bargiela & Sterling , it is possible to obtain accurate predictions for the network as a whole, however estimating nodal consumptions for nodes where the population is low is more difficult.
Thus, considering and quantifying the uncertainty of water demand could allow for the association of an acceptable probability/level of risk to the results from hydraulic and optimization models of WDNs at different temporal and spatial scales: a more robust design and control of these systems can be realized, and obvious opportunities can arise in dimensioning pipes, formulating water balances, controlling a system’s components, and identifying and quantifying leakages.
An approach to dealing with the uncertainty of demand consists in explicitly considering different possible realizations of its value at the nodes of a WDN, i.e., different loading scenarios and associating to each of them a measure of their probability. So, it is possible to find a feasible solution which is also as close as possible to the optimum for all the scenarios: this scenario-based approach is known as robust optimization . In the same way it is possible also to derive WDN reliability  or to localize leakages  or to map pressure-heads for a real-time control , all under uncertain demand conditions. Different possible scenarios, which include various aspects, such as peak flows, fire conditions at certain nodes, or pipe breakage, and the corresponding probabilities of occurrence could be obtained by consulting a panel of experts. However, this solution can have strong limitations in such a mathematically sensitive problem and can lead to arbitrary solutions.
To this aim, this paper focuses on defining an objective methodology for the generation of demand scenarios. In the literature, the issue of generating demand scenarios has been faced in  where uncertain future water consumptions are modeled using probability density functions (PDF) assigned in the problem formulation phase. Scenarios with correlated nodal demands are generated using LHS and the procedure suggested by Iman and Conover . All nodal demands follow a Gaussian PDF with a coefficient of variation Cv = 0.10. The correlation coefficient between any two nodal demands is assumed equal to 0.50, as completed by Tolson et al. . In  different demand scenarios for a WDN are derived combining demand values with a specific return period at each node. The probability of each scenario is obtained considering a multivariate normal distribution (MVN). The correlation between demand was found to significantly affect the occurrence probability of the demand scenarios. Also, Eck et al.  generated demand scenarios considering a MVN distribution. They assumed a prior estimate of the mean values and covariance matrix of water demand from a preliminary analysis. An original, but computationally expensive, approach for estimating the probability of a given demand scenario is based on the use of the contingency tables . This is a non-parametric method in which the various random variables are divided into classes and their marginal probabilities are calculated through the frequency of occurrence of each class. The joint probabilities are evaluated by counting the occurrences of the simultaneous classes.
In this paper a numerical approach for generating a limited number of water demand scenarios and estimating their occurrence probabilities in a WDN is presented. Scaling laws [1,13,14] are used to evaluate the statistics of aggregated demand at each node of the network, starting from the statistics of demand of the unitary user. The Apulian WDN  is considered as a case-study and two different hypotheses are made for the number of users in the nodes. For each of these hypotheses a large number of cross-correlated demands, i.e., scenarios, was generated using LHS from Gamma PDF and a procedure based on the approach proposed by Iman and Conover . The Kantorovich distance  is used to reduce the number of scenarios and estimate their corresponding probabilities. In this way, the limited number of network demand scenarios maintain the statistical information on the demand and the effects on the performance of a WDN. In the design and control of WDNs, demand scenarios are mostly functional to the evaluation of service conditions and specifically pressure-heads in the nodes. In order to evaluate how the uncertainty of demand is conveyed to the pressure-head field, a demand-driven hydraulic model was also applied to solve the Apulian network. Many pressure scenarios were obtained, and the same reduction procedure was employed. The probabilities of the reduced scenarios of pressure were compared with the corresponding probabilities of the demand.
Here, a brief summary of the structure of this work is presented. In Section 2 the description of the proposed methodology is provided. In particular, a summary of the scaling law approach for water demand is reported in Section 2.1, the scenario generation procedure in Section 2.2, and the scenario reduction procedure in Section 2.3. In Section 3 an example is carried out by applying the whole procedure on a literature WDN and the results are discussed. Finally, Section 4 provides some conclusive comments.

## 2. Description of Methodology

#### 2.1. Scaling Laws

In the first phase of the proposed methodology, the statistical features of WDN nodal demand are derived using the scaling laws [1,13]: i.e., the first and second statistical moments and cross-correlation are evaluated, and the probability distributions defined. Input data are the nodal number of consumers and the statistical features of water demand of the typical single user, the unitary user. The statistical parameters of the unitary user’s demand can be derived either by monitoring, or simulating by descriptive models, such as End-Uses models  or Poisson Rectangular Pulse models preserving correlation .
We consider a network with nodes and $n i = n 1 , n 2 , n 3 , … , n N$ users at each node. The demand of a unitary user, identified by subscript 1, is described by an ergodic stationary stochastic process, with mean $μ 1$, variance $σ 1 2$ and cross-correlation coefficient between each couple of single-user $ρ 1$. The whole nodal demand, that is the sum of all users’ consumption at each node, is a realization of the stochastic process whose statistics are dependent on the number of nodal users. The expected value $E [ μ n i ]$ for the mean of the aggregated process at the ith node is given by:
$E [ μ n i ] = n i · E [ μ i ] ,$
and the expected value for the variance $E [ σ n i 2 ]$ at the same node, neglecting the bias that can be caused when using small the demand series (short observation periods) , is given by:
$E [ σ n i 2 ] = n i 2 · E [ ρ 1 ] · E [ σ 1 2 ] + n i · [ 1 − E [ ρ 1 ] ] · E [ σ 1 2 ]$
Equation (2) shows that the expected value of the variance of the aggregated process depends on $E [ ρ 1 ] :$ if demands are perfectly correlated in space, i.e., $E [ ρ 1 ]$ is equal to one, Equation (2) is simplified into:
$E [ σ n i 2 ] = n i 2 · E [ σ 1 2 ]$
if demands are uncorrelated in space, i.e., $E [ ρ 1 ]$ is equal to zero, equation  is simplified into:
$E [ σ n i 2 ] = n i · E [ σ 1 2 ]$
For partially correlated demands, a power law $E [ σ n i 2 ] = n i α E [ σ 1 2 ]$, with α scaling variance exponent, has been also derived [1,14]. Therefore, a complete statistical characterization of demand requires not only the definition of its mean and variance, but also the definition of the correlation between demands of each couplet of users and groups of users. The cross-correlation refers to the similarity between demand patterns from different consumers or from different nodes. This parameter was proved to be not negligible  and to affect the hydraulic performance of a WDN as well as the cost to achieve a desired level of reliability. It was verified that higher cross correlations lead to higher pressure fluctuations, which have negative impacts on the reliability of the WDN . Following the same assumption and notation, the expected value for the cross-correlation between all nodes of the network is represented by a N-by-N square matrix, whose elements are given by:
$E [ ρ n i n j ] = E [ c o v n i n j ] E [ σ n i ] · E [ σ n j ] = n i n j E [ ρ 1 ] [ n i ( 1 + E [ ρ 1 ] ( n i − 1 ) ) ] 1 2 · [ n j ( 1 + E [ ρ 1 ] ( n j − 1 ) ) ] 1 2$
with , and where, for example, $ρ n 1 n 2$ is the cross-correlation coefficient between the demand of $n 1$ aggregated users at node 1, and the demand of $n 2$ aggregated users at node 2.
Through Equations (1)–(3) the nodal demand statistics of the network and the correlation structure between them are entirely defined.

#### 2.2. Generation of Scenarios

In simulation and optimization problems, several methods exist to cope with uncertainty [21,22]. If we do not know exactly the input data, in our case water demand at the nodes of a WDN, because they can assume different values and then many combinations of them are possible, we are dealing with scenarios. But, if the statistical features of uncertain data are known, numerical solutions can be obtained by approximating the probability distribution function with discrete distributions having a finite number of outputs, again referred to as scenarios.
Then, the second phase of the present approach consists in the generation of a large number of water demand scenarios for a WDN, based on the statistics estimated by the scaling laws and making the hypothesis of Gamma-distributed water demand at each node. The knowledge of the scaling laws of the statistical moments and the type of the probability distributions of water demand in relation to the number of users, prove to be a useful tool to face the inherent uncertainty of demand and in particular to address the optimization problems. Using Gamma distribution for demand is supported when the number of aggregated users is high enough or the time resolution is greater than five minutes, by measurements in the Latina case-study . Furthermore, in a recent work Kossieris and Makropoulos  investigated the performance of ten probabilistic models showing that both Gamma and Weibull distributions can be used to adequately describe the nonzero water demand recorded at different time scales.
Scenarios can be generated following different methods: by matching a small set of statistical properties, e.g., moments [24,25] or simulating some defined mathematical process (e.g., Brownian motion) or sampling from known distributions . For scenarios with a large number of variables correlated in between, sampling from the joint distribution is not usual for the difficulty in defining the distribution itself. An alternative to specifying the joint distribution is to make use of just the marginal distributions and the linear or rank correlation matrix alone. If nothing is known about the form of the joint distribution, a coupling procedure can be used: in this case the generated scenarios will respect an arbitrary dependency structure based on the procedure followed.
Each demand scenario $D u$ is defined here as a set or combination of nodal demand values occurring simultaneously in the WDS. This can be represented by the N dimensional vector:
$D u = [ d 1 , u , d 2 , u , … d N , u ] ,$
where $u = 1 , 2 , … , S$ is the index identifying the different scenarios, and $d i , u$ is the demand at node i for the $u t h$ scenario and $D u$ depicts a one-dimensional stochastic data process.
A sampling method from the marginal distribution based on the LHS and the Iman-Conover approach  is followed. The restricted pairing technique by Iman and Conover induces rank correlation between the given marginals by shuffling finite-size samples from each of them. The appropriate shuffle is determined by ranking the input samples the same as in a reference sample with the desired rank correlation. The complete procedure, that will be described in the following, is quite straightforward because it requires only the Cholesky decomposition, some matrix algebra, and the final rearrangement of the original uncorrelated sample.
Description of the Procedure:
Step 1.
Create a random (S, N) dimensional matrix Z*, containing S Latin Hypercube Samples of size N from a standardized normal distribution, where S is the number of scenarios and N the number of the demand nodes in the WDN. For this purpose, the Matlab function lhsnorm was used. Owing to the finite size of the samples their correlation matrix I* (Here, the asterisk is used to distinguish data and corresponding correlation matrices to be corrected) does not coincide with the identity matrix I, that is they are not independent. Then, the lower triangular Cholesky decomposition is applied to induce the desired correlation . Specifically:
$I = C · C T I * = E · E T Z = Z * · C · E − 1$
and the (S, N) dimensional matrix Z of perfectly independent S samples of size N from a standardized normal distribution is obtained. In order to obtain the Cholesky root, the Matlab function chol was used.
Step 2.
Create a random (S, N) dimensional matrix G containing S standardized normal samples with the correlation matrix Corr from the scaling laws for nodal demand. To this aim the desired correlation is induced in Z also applying the lower triangular Cholesky decomposition , that is:
$C o r r = P · P T G = Z · P · C − 1$
Step 3.
Transform the matrix G in the (S, N) dimensional matrix D* which complies with the desired marginal distributions at each demand ith node. Transformation is based on the inverse cumulative distribution function, CDF, of the desired marginals, Fi. Specifically, for each element D*i of matrix D*, i.e., a non-normal random sample with the desired CDF, the following equation holds:
$D i * = F i − 1 ( Φ ( G i ) )$
where $Φ ( G i )$ is the CDF of the ith samples of G and it is uniformly distributed. This procedure is known as the inverse transformation method . Function $Φ ( G i )$ can also be interpreted as a realization from the Gaussian copula. Applying the inverse CDF $F i − 1$ to the uniform random variable $Φ ( G i )$ ensures that $D i *$ is distributed according to $Φ i$. Unfortunately, the transformation in Equation (1) is non-linear, and therefore the correlation matrix Corr* of D* is not equal to the desired correlation matrix Corr.
Step 4.
Apply the Iman-Conover algorithm proposed by Ekström  in order to get a better approximation of the desired correlation matrix Corr for the (S, N) matrix of nodal demand scenarios D*. The algorithm is described in the following steps:
4.1
Calculate lower triangular Cholesky decomposition V of Corr, i.e., Corr = V∙VT.
4.2
Calculate lower triangular Cholesky decomposition Q of Corr*, i.e., Corr* = Q∙QT.
4.3
Obtain T such that Corr = T∙Corr∙TT, can be calculated as T = V∙Q−1.
4.4
Obtain the matrix ScoreD* by rank-transforming D and convert to van der Waerden scores, defined as $F i − 1 ( Φ ( i / ( N + 1 ) )$ where ϕ is the CDF of the standard normal distribution, i is the assigned rank and N is the total number of samples.
4.5
Calculate the target scores matrix ScoreD = ScoreD*·TT.
4.6
Match up the rank pairing in D* according to ScoreD, obtaining the new (S,N) dimensional matrix D containing the S scenarios of the N nodal demand in the WDS. The N samples are distributed according to the desired marginals and their correlation matrix is close to the correlation matrix derived from the scaling laws.

#### 2.3. Scenario Reduction

With the scenario generation procedure, we obtain a great number of pictures, each of which represent a single snapshot of the whole water demand in the network. The higher the number of scenarios generated, the better the description of the variability of water demand in the WDS. However, it is not possible to manage such a large number of scenarios to deal with stochastic or robust optimization problems. Moreover, the probability associated with each of them is not very significant. We have to reduce the scenarios and at the same time determine, for the reduced scenarios, a significant weight representative of their possibility to be realized. Then, the goal of scenario reduction is to approximate the discrete distribution of the generated scenarios with another discrete distribution having fewer elements. At this point, the choice of the number of scenarios becomes a critical step in obtaining meaningful solutions taking into account the system performance and the robustness of the solution to variations in the uncertain data.
It is assumed that the probability distribution $ℙ$ of the N-dimensional stochastic data process is approximately given by many scenarios:
to which the probabilities $p u$ are associated and $∑ u = 1 S p u = 1$.
In order to approximate the probability distribution $ℙ$ with another $ℚ$ distribution, with a smaller number of elements, so that $ℚ$ will be as close as possible to $ℙ$ in terms of probabilistic distance, we use the Kantorovich distance, $K$, which is the most common probability distance used in stochastic optimization . It represents the optimal value to a linear transportation problem. In this problem a cost function $c N$, defined by some norm on $ℜ n$ is introduced as a measure of the distance between couples of scenarios .
The previous problem is not easy to solve and in order to overcome to these difficulties, heuristic algorithms have been developed, in particular fast-backward and fast forward strategies have been implemented . In this paper, we make use of the forward selection algorithm. It defines an iterative process which starts with an empty set. At each iteration, from the set of the non-selected scenarios, the scenario minimizing the Kantorovich distance between the reduced and original sets is selected and inserted in a reduced set. The optimal selection of a single scenario may be repeated recursively until the prescribed number S of elements is reached.
Actually, the forward selection algorithm does not guarantee that the reduced set of scenarios is the closest in the Kantorovich distance with respect to the original set and represents the optimal solution of the original transportation problem. However, the empirical results described in the literature  indicate that the forward selection algorithm works well in practice.
Description of the Procedure:
In this paper a scenario reduction algorithm based on Kantorovich distance was used (Figure 1). In particular, the fast-forward selection algorithm as described in  was applied. Starting from an empty set, an iterative process was followed until the required number of selected scenarios was reached. In the following is reported a brief description of this methodology.
First, the high number of generated demand scenarios at each node have been assumed to be equiprobable. Thus, at each iteration, using the Euclidean norm $ℓ 2$, the distances between all the possible pairs of scenarios were calculated for each node. Then, by summing the corresponding distances for all the different nodes, the cost function matrix $c N$ was derived considering the Euclidean norm. This function allows the evaluation of the Kantorovich distances matrix between pairs of scenarios taking into account their probability of occurrence.
The scenario corresponding to the minimum value of the Kantorovich distance is then selected and the cost matrix is updated by replacing each element with the minimum value between the original element and the one corresponding to the selected scenario. At this point, the procedure is repeated, and a new scenario is added to the reduced scenario set until the number of requested scenarios is reached. In the end, an optimal redistribution of probabilities was carried out by adding the probabilities of non-selected scenarios to the probabilities of those in the reduced set, that is the probability of each non-selected scenario was summed to the probability of the closest selected scenario according to the cost function.
Therefore, according to equation, the new probability of a preserved scenario is equal to the sum of its former probability and all the probabilities of the deleted scenarios that are closest to it according to $c N$. Obviously, the deleted scenarios have probability zero.

## 3. Application Example

#### 3.1. Theoretical WDN: The Apulian Network

For a better understanding of the developed approach, an application using the Apulian network layout  is elaborated as an example. The Apulian network layout comprises 1 reservoir, 23 demand nodes and 34 pipes (Figure 2).
This network is used only with the objective of defining a scene and providing visual help to the application example. The scaling laws have an important role, mostly when the number of users at each node is low and the sampling time short, because in this case the cross-correlation between pairs of nodal demands preserves values that are not too large. In great WDN the approach can be used but it is significant only if the network is well described, i.e., it is not too skeletonized. Otherwise, with a great number of users in the nodes, the most probable scenario can be derived by simply considering the mean demand in the node. The Apulian network is representative of a real situation and its small size determines very short computational times and makes more readable the results. The geometrical features of the WDN are summarized in Table 1.
Data from two-years long water demand measurements of 82 single household users in the town of Latina, Italy  were considered for the definition of the statistical parameters of a typical residential consumption and the calibration of the scaling laws. Table 2 summarizes the average water demand of each typical user and its relevant statistics at peak hour (7:00 a.m.–8:00 a.m.), considering a five-minute time step in data monitoring. The users were considered all the same type and for this reason the same statistical parameters were employed. The number of users at each node of the WDN is also listed in Table 1. In the following examples, residential water consumers are considered, but the approach can be applied to any type of consumption, provided that the statistical characteristics of the typical user are available through monitoring or numerical simulation. Regarding the correlation between pairs of single households a very low value of the Pearson coefficient, i.e., E[ρ1] = 0.0043, was considered, in agreement with most of the experimental data from the case study of Latina. Table 1 also shows the number of consumption units for each water demand node. In order to highlight how the number of users per node and their relationships affect the generated demand scenarios two different frameworks were examined to which correspond respectively DemandA and DemandB column. The DemandA values are the same used by Giustolisi et al.  for Apulian network and the users’ number is consequent. Differently, DemandB values were defined assuming a smaller total number of users and a greater variability of the number of users per node.

#### 3.2. Generation of Demand Scenarios

The methodology presented in this work was applied to generate scenarios of contemporary water demands in the supply nodes of the Apulian distribution network. Two different hypotheses have been made on the number of users at the demand nodes. In the first one, DemandA, the number of users was determined in order to obtain the demand values assumed by Giustolisi et al. , which are appropriate for the subsequent hydraulic simulation of the network. Instead, the second hypothesis, DemandB, considers a lower number of total users, but, above all, considerably differentiates the number of users at each node. This was done with the aim of highlighting how the correlation matrix obtained from the scaling laws is influenced by the number of users in the nodes and by their mutual relations, and the proposed methodology can manage complex scenarios. The statistical parameters describing the unitary user’s demand are obtained from the experimental data of a group of users in the case study of Latina . Data are referred to peak hour and their sampling interval is equal to five minutes.

#### 3.2.1. DemandA

Twelve-thousand demand scenarios were generated using the statistics estimated by the scaling laws and making the hypothesis of Gamma-distribution at each node, as shown in Figure 3. The large number of scenarios makes it difficult to distinguish them. But above all, nothing can be said about their probability.
Gamma distribution proves to be the best in fitting the generated data in all nodes of the WDN, as shown in Figure 4. The scale and shape parameters of the input distributions Γ(a,b), estimated by the scaling laws (INPUT), agree well with the corresponding parameters of the generated scenarios (OUTPUT), Table 3. Regarding the correlation matrix of the generated scenarios, it almost perfectly matches with the input correlation matrix obtained from the scaling laws. Table 4 compares the minimum, average and maximum value of the input and output correlation matrices.

#### 3.2.2. DemandB

Also, in this case twelve-thousand scenarios were generated using the statistics estimated by the scaling laws, considering Gamma-distributed demands at each node, Figure 5. Differently from DemandA case, the value of generated data shows a great excursion between node and node due to the large variability in the number of users.
Similar to the previous case, Gamma distribution proves to be the best in fitting the generated data, as in Figure 6. Also, the INPUT distributions parameters, estimated by the scaling laws, well agree with the corresponding parameters of the generated scenarios distributions (OUTPUT), as in Table 5. In this case the correlation matrix of the generated scenarios almost matches perfectly with the input correlation matrix obtained from the scaling laws. Table 6 compares the minimum, average and maximum value of the input and output correlation matrices. The input and output correlation matrices are almost identical, but it should be noted that the correlation coefficient has very low values when considering node pairs with low number of users, intermediate values when one of the two has many users, higher values when the number of users is high for both. This responds to the fact that the correlation coefficient depends on the product of the number of users of the two considered nodes, see Equation (3).

#### 3.3. Reduction of Demand Scenarios

The final phase of the procedure involves reducing the number of scenarios by aggregating them in relation to the distance of Kantorovich. For the application of the method the Euclidean norm was considered here. A reduced number of scenarios equal to 20 was chosen. In general, the choice of the number of scenarios should be based on the requirements of robust optimization problems and on the need for the reduced set to continue to describe the whole probability distribution of the demand at each node of WDN.

#### 3.3.1. DemandA

In Figure 7 the reduced demand scenarios are represented. The ‘mean scenario’, i.e., the one defined by the average nodal values of nodal demand is plotted by the black dotted line. The red dotted line indicates the most probable scenario, as reported in Figure 7. Mean and most probable scenarios, actually, do not coincide but are very close.
The most probable scenario in the set of generated ones, number 3720, shows a relevant weight, approximately equal to 0.3. Also, scenarios 5945 and 4492 exhibit significant weights, respectively 0.25 and 0.11. All the others have weights lower than 0.05, Figure 8. Nodal demands of the three most remarkable scenarios are reported in Table 7 and compared with the average value of demand in each node, ‘mean scenario’. We can notice that all the selected scenarios are close to the mean one.

#### 3.3.2. DemandB

In Figure 9 the reduced demand scenarios are represented. In this case it is more difficult to distinguish one scenario from the other because of the big difference of average demand in the nodes Also, here there is no coincidence between the mean and the most probable scenario, number 6232 in the set of generated scenarios, but they are very close, as in Figure 10. In this case only another scenario, number 9392 has a weight greater than 0.1, but four scenarios exceed the probability threshold of 0.05. In Table 8 nodal demands of the three most remarkable scenarios are reported and compared with the average value of demand in each node, the ‘mean scenario’.

#### 3.4. Hydraulic Simulation with Scenarios from DemandA

Water demand is the forcing parameter of a WDN and its natural variability is reflected on the variability of the quantities describing the hydraulic behavior of the whole system. The following question arises: how does the uncertainty of water demand determine the uncertainty of nodal pressure-heads and pipe flow-rate in a WDN? More specifically, how does the most probable water demand scenario coincide with the most probable pressure-head scenario? At this aim a set of pressure scenarios was derived from the set of generated demand scenarios in the DemandA case using a demand-driven hydraulic model based on the Global-Gradient algorithm proposed by Todini and Pilati . Twelve-thousand pressure scenarios have been obtained for the Apulian network, as shown in Figure 11. This example does not take into account service constraints on each node. Actually, not all pressure scenarios deriving from demand scenarios can be significant in dealing with design and optimization problems. Using pressure-driven hydraulic models can allow the automatic filtering of unrealistic scenarios. The reduction algorithm was also carried on and twenty reduced scenarios with the corresponding probabilities were derived, as in Figure 12 and Figure 13.
The results show how the twelve-thousand scenarios generated, unlike the corresponding ones of the water demand, have the same ‘shape’, which depends on the geometrical and hydraulic characteristics of the WDN. From the results it is also possible to identify the nodes in which the pressure value shows greater variability: this can provide important support in positioning pressure gauges. Only two scenarios, number 7577, with weight approximately equal to 0.48, and number 6780, with weight about 0.39, remark a significant probability. All the others have probability lower than 0.025. In Table 9 pressure height in each node is reported for the two most probable pressure scenarios, for the scenario of mean pressure, for that of most probable demand (3720) and for the ‘mean scenario’. There are no significant differences between the highlighted scenarios.

## 4. Conclusions

This paper describes a complete procedure to generate and reduce the number of water consumption scenarios in order to represent the uncertainty due to the variability of demand in a WDN. Moreover, with this procedure, an objective measure of the probability of occurrence is related to each of the reduced scenarios.
For this purpose, the scaling laws, proposed by Magini et al. , prove all their potentials. In fact, they allow for a definition of the main statistical features of consumption for a different number of aggregated consumers, starting from the statistical characteristics of a typical user. Here, the features of a specific residential water demand are considered. However, if water demand is adequately typified, it is possible to examine different kinds of water use (commercial, public, …), different socio-economic categories and geographical locations.
One of the main goals of the proposed procedure is to correctly reproduce the statistical characteristics of water consumption in generating scenarios. The nodal water demand is supposed to follow a Gamma probability distribution with parameters dependent on the number of users. To respect cross-correlation between couples of water demand in the nodes, a generation algorithm based on the approach of Iman-Conover is implemented . The generated water demand scenarios respect the hypothesized marginal probability distributions and reproduce the cross-correlations almost perfectly.
Regarding the scenario reduction, a probability/weight is defined for each of the reduced scenarios. Obviously, the probability/weight depends on the number of the reduced scenarios. The results show that the most likely scenario is close to the scenario represented by the average demand in each node. This fact may seem trivial, less trivial is the weight that is associated with them. We can observe that the reduced scenarios do not cover the whole set of generated scenarios, so they cannot completely describe the uncertainty of the demand, leaving out the tails of probability distributions, i.e., not considering the least likely scenarios. Probably, the relevant cross-correlation between couples of nodal water demand makes the Euclidean norm not perfectly suitable to describe the whole probability distributions, and the use of a different norm should be tested in future developments.
The reduction procedure has been applied also to the pressure-head scenarios which are derived from the generated demand. Also, in this case the most probable scenario is close to that defined by the average pressures in each node of the WDN, but it does not coincide with that produced by the most probable demand scenario.
The proposed procedure is certainly of considerable importance not only in the problems of robust optimization, but more in general in the modeling of the WDNs, both for their design and for their control.

## Author Contributions

Conceptualization, R.M.; Methodology, R.M., M.A.B., R.G.; Data curation, R.M.; Formal analysis, R.M., M.A.B.; Writing software, R.M., M.A.B.; Supervision, R.M., R.G.; Writing and editing the paper, R.M., M.A.B.; Funding acquisition, R.M.

## Funding

This research was funded by Sapienza University: ‘Progetti Ateneo Sapienza, small projects’, protocol RP116154FD8D5BF3.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Magini, R.; Pallavicini, I.; Guercio, R. Spatial and Temporal Scaling Properties of Water Demand. J. Water Resour. Plan. Manag. 2008, 134, 276–284. [Google Scholar] [CrossRef]
2. Bargiela, A.; Sterling, M. Adaptive forecasting of daily water demand. In Comparative Model for Electrical Load Forecasting; Bunn, D.W., Farmer, E.D., Eds.; John Wiley and Sons: New York, NY, USA, 1985. [Google Scholar]
3. Perelman, L.; Housh, M.; Ostfeld, A. Robust optimization for water distribution systems least cost design. Water Resour. Res. 2013, 49, 6795–6809. [Google Scholar] [CrossRef][Green Version]
4. Xu, C.C.; Goulter, I.C. Reliability-based optimal design of water distribution networks. J. Water Resour. Plan. Manag. 1999, 125, 352–362. [Google Scholar] [CrossRef]
5. Steffelbauer, D.; Neumayer, M.; Günther, M.; Fuchs-Hanusch, D. Sensor Placement and Leakage Localization considering Demand Uncertainties. Procedia Eng. 2014, 89, 1160–1167. [Google Scholar] [CrossRef]
6. Pallavicini, I.; Magini, R.; Guercio, R. Assessing the spatial distribution of pressure head in municipal water networks. In Proceedings of the Eighth International Conference on Computing and Control for the Water Industry, Exeter, UK, 5–7 September 2005. [Google Scholar]
7. Kapelan, Z.S.; Savic, D.A.; Walters, G.A. Multiobjective design of water distribution systems under uncertainty. Water Resour. Res. 2005, 41, W11407. [Google Scholar] [CrossRef]
8. Iman, R.L.; Conover, W.J. A distribution-free approach to inducing rank correlation among input variables. Communun. Stat. 1982, 11, 311–334. [Google Scholar] [CrossRef]
9. Tolson, B.A.; Maier, H.R.; Simpson, A.R.; Lence, B.J. Genetic algorithms for reliability-based optimisation of water distribution systems. J. Water Resour. Plan. Manag. 2004, 130, 63–72. [Google Scholar] [CrossRef]
10. Vertommen, I.; Magini, R.; Cunha, M.D.C. Generating Water Demand Scenarios Using Scaling Laws. Procedia Eng. 2014, 70, 1697–1706. [Google Scholar] [CrossRef][Green Version]
11. Eck, B.; Fusco, F.; Taheri, N. Scenario Generation for Network Optimization with Uncertain Demands. In Proceedings of the 17th Water Distribution Systems Analysis Symposium, World Environmental and Water Resources Congress, Austin, TX, USA, 17–21 May 2015; pp. 844–852. [Google Scholar]
12. Ridolfi, E.; Vertommen, I.; Magini, R. Joint probabilities of demands on a water distribution network: A non-parametric approach. AIP Conf. Proc. 2013, 1558, 1681–1684. [Google Scholar]
13. Vertommen, I.; Magini, R.; Cunha, M.C.; Guercio, R. Water demand uncertainty: The scaling law approach. In Water Supply Systems Analysis: Selected Topics; Ostfeld, D.A., Ed.; InTech: London, UK, 2012; pp. 1–25. ISBN 978-953-51-0889-4. [Google Scholar]
14. Vertommen, I.; Magini, R.; Cunha, M. Scaling Water Consumption Statistics. J. Water Resour. Plan. Manag. 2015, 141, 04014072. [Google Scholar] [CrossRef][Green Version]
15. Giustolisi, O.; Kapelan, Z.; Savic, D. Algorithm for Automatic Detection of Topological Changes in Water Distribution Networks. J. Hydraul. Eng. 2008, 134, 435–446. [Google Scholar] [CrossRef]
16. Vershik, A. Kantorovich metric: Initial history and little-known applications. J. Math. Sci. 2006, 133, 1410–1417. [Google Scholar] [CrossRef]
17. Blokker, E.J.M.; Vreeburg, J.H.G.; Dijk, J.C.V. Simulating Residential Water Demand with a Stochastic End-Use Model. J. Water Resour. Plan. Manag. 2010, 136, 19–26. [Google Scholar] [CrossRef]
18. Creaco, E.; Alvisi, S.; Farmani, R.; Vamvakeridou-Lyroudia, L.; Franchini, M.; Kapelan, Z.; Savic, D. Preserving Duration-intensity Correlation on Synthetically Generated Water Demand Pulses. Procedia Eng. 2015, 119, 1463–1472. [Google Scholar] [CrossRef][Green Version]
19. Filion, Y.R.; Karney, B.W.; Moughton, L.; Buchberger, S.G.; Adams, B.J. Cross Correlation Analysis of Residential Demand in the City of Milford, Ohio. In Proceedings of the Water Distribution Systems Analysis Symposium, Cincinnati, OH, USA, 27–30 August 2006; ASCE: Reston, VA, USA, 2008. [Google Scholar]
20. Filion, Y.; Adams, B.; Karney, B. Cross Correlation of Demands in Water Distribution Network Design. J. Water Resour. Plan. Manag. 2007, 133, 137–144. [Google Scholar] [CrossRef]
21. Savic, D. Coping with risk and uncertainty in urban water infrastructure rehabilitation planning. In Proceedings of the Acqua e Città-1° Convegno Nazionale di Idraulica Urbana, Sorrento, Italy, 28–30 September 2005. [Google Scholar]
22. Hutton, C.J.; Kapelan, Z.; Vamvakeridou-Lyroudia, L.; Savic, D. Dealing with Uncertainty in Water Distribution System Models: A Framework for Real-Time Modeling and Data Assimilation. J. Water Resour. Plan. Manag. 2014, 140, 169–183. [Google Scholar] [CrossRef]
23. Kossieris, P.; Makropoulos, C. Exploring the Statistical and Distributional Properties of Residential Water Demand at Fine Time Scales. Water 2018, 10, 1481. [Google Scholar] [CrossRef]
24. Ponomareva, K.; Roman, D.; Date, P. An algorithm for moment-matching scenario generation with application to financial portfolio optimization. Eur. J. Oper. Res. 2015, 240, 678–687. [Google Scholar] [CrossRef]
25. Magini, R.; Capannolo, F.; Ridolfi, E.; Guercio, R. Demand uncertainty in modelling WDS: Scaling laws and scenario generation. WIT Trans. Ecol. Environ. 2016, 210, 735–746. [Google Scholar] [CrossRef]
26. Mitra, S. Scenario Generation for Stochastic Programming; White Paper Series, Optirisk Systems, Domain: Finance Reference Number: OPT 004; SINTEF Technology and Society: Uxbridge, UK, 2006. [Google Scholar]
27. Golub, G.H.; Van Loan, C.F. Matrix Computations, 3rd ed.; Johns Hopkins: Baltimore, MD, USA, 1996; ISBN 978-0-8018-5414-9. [Google Scholar]
28. Ross, S.M. Introduction to Probability and Statistics for Engineers and Scientists, 5th ed.; Academic Press: Cambridge, MA, USA, 2014; ISBN 978-0-1239-48113. [Google Scholar]
29. Ekstrom, P.A. A Simulation Toolbox for Sensitivity Analysis. Master‘s Thesis, Faculty of Science and Technology, Uppsala Universitet, Uppsala, Sweden, 2005. [Google Scholar]
30. Dupacová, J.; Gröwe-kuska, N.; Römisch, W. Scenario reduction in stochastic programming. An approach using probability metrics. Math. Program 2003, 95, 493–511. [Google Scholar] [CrossRef]
31. Morales, J.M.; Pineda, S.; Conejo, A.J.; Carrión, M. Scenario Reduction for Future Market Trading in Electricity Markets. IEEE Tras. Power Syst. 2009, 24, 878–888. [Google Scholar] [CrossRef]
32. Heitsch, H.; Romisch, W. Scenario reduction in stochastic programming. Comput. Optim. Appl. 2003, 24, 187–206. [Google Scholar] [CrossRef]
33. Giustolisi, O.; Laucelli, D.; Colombo, A.F. Deterministic versus stochastic design of water distribution networks. J. Water Resour. Plan. Manag. 2009, 135, 117–127. [Google Scholar] [CrossRef]
34. Todini, E.; Pilati, S. A gradient algorithm for the analysis of pipe networks. In Computer Applications in Water Supply; Research Studies Press: Letchworth, UK, 1988. [Google Scholar]
Figure 1. Flow-chart of the procedure for reducing scenarios.
Figure 1. Flow-chart of the procedure for reducing scenarios.
Figure 2. Apulian Network layout.
Figure 2. Apulian Network layout.
Figure 3. Generated demand scenarios, DemandA.
Figure 3. Generated demand scenarios, DemandA.
Figure 4. Gamma PDF distributions (cyan line) fitting demand data (grey bar), DemandA.
Figure 4. Gamma PDF distributions (cyan line) fitting demand data (grey bar), DemandA.
Figure 5. Generated demand scenarios, DemandB.
Figure 5. Generated demand scenarios, DemandB.
Figure 6. Gamma PDF distributions (cyan line) fitting demand data (grey bar) at each node, DemandB.
Figure 6. Gamma PDF distributions (cyan line) fitting demand data (grey bar) at each node, DemandB.
Figure 7. Reduced demand scenarios, DemandA. Reduced nodal scenarios -> green line; average nodal values -> black dotted line; scenario with max probability -> red dotted line.
Figure 7. Reduced demand scenarios, DemandA. Reduced nodal scenarios -> green line; average nodal values -> black dotted line; scenario with max probability -> red dotted line.
Figure 8. Reduced demand scenarios’ probabilities, DemandA. Inside the bar is the number of the rank order of the generated scenario.
Figure 8. Reduced demand scenarios’ probabilities, DemandA. Inside the bar is the number of the rank order of the generated scenario.
Figure 9. Reduced demand scenarios, DemandB. Reduced nodal scenarios -> green line; average nodal values -> black dotted line; scenario with max probability -> red dotted line.
Figure 9. Reduced demand scenarios, DemandB. Reduced nodal scenarios -> green line; average nodal values -> black dotted line; scenario with max probability -> red dotted line.
Figure 10. Reduced demand scenarios’ probabilities, DemandB. Inside the bar is the number of the rank order of the generated scenario.
Figure 10. Reduced demand scenarios’ probabilities, DemandB. Inside the bar is the number of the rank order of the generated scenario.
Figure 11. Pressure headlines from the generated demand scenarios (DemandA).
Figure 11. Pressure headlines from the generated demand scenarios (DemandA).
Figure 12. Reduced pressure headlines scenarios, DemandA. Total reduced scenarios -> green line; average nodal values -> black dotted line; scenario with max probability -> red dotted line.
Figure 12. Reduced pressure headlines scenarios, DemandA. Total reduced scenarios -> green line; average nodal values -> black dotted line; scenario with max probability -> red dotted line.
Figure 13. Reduced pressure scenarios’ probabilities DemandA. Inside the bar is the number of the rank order of the generated demand scenario.
Figure 13. Reduced pressure scenarios’ probabilities DemandA. Inside the bar is the number of the rank order of the generated demand scenario.
Table 1. WDN geometrical features, number of users and water demand at nodes.
Table 1. WDN geometrical features, number of users and water demand at nodes.
PIPES
Pipe NumberStart NodeEnd NodeLength (m)C Hazen WilliamsD (m)Pipe NumberStart NodeEnd NodeLength (m)C Hazen WilliamsD (m)
112348.51000.32718119583.91000.164
223955.71000.29195184521000.229
3344831000.120616794.71000.1
439400.71000.2921715717.71000.1
524791.91000.122814655.61000.258
615404.41000.368231514165.51000.1
756390.61000.327241615252.11000.1
864482.31000.1251716331.51000.1
9910934.41000.12618175001000.204
101110431.31000.184271721579.91000.164
111112513.11000.1281923842.81000.1
121013428.41000.184292120792.61000.1
1312134191000.1302014846.31000.184
1422131023.11000.1319111641000.258
15822455.11000.164322321427.91000.1
1678182.61000.29331918379.21000.1
1767221.31000.2934241158.21000.368
NODES
node IDelevation (m)users ADemandA (l/s)users BDemandB (l/s)
16.493210.861551.8
27146117.034274.98
36128214.95480.56
48.4122414.281291.5
57.486910.13127014.81
69131615.354865.67
79.17829.11630.73
89.590110.517668.93
98.4104512.18630.73
1010.5124914.57450.52
119.68489.8896411.24
1211.76507.583544.12
1312.3130315.21221.42
1410.6116213.551852.15
1510.17919.23810.94
169.596011.2550.64
1710.298411.4796811.29
189.692810.82550.64
199.1125814.68450.52
2013.9114213.324164.85
2111.1125514.635676.61
2211.4103012.011371.59
231088610.3392010.73
24 (Reservoir)36.424258282.86832196.97
Table 2. Relevant statistics and parameters derived from the consumption data of Latina.
Table 2. Relevant statistics and parameters derived from the consumption data of Latina.
Statistical ParameterValue
0.365
0.700
0.870
scaling law exponent α1.230
Table 3. Parameters a and b of Gamma distributions of input and output data, DemandA.
Table 3. Parameters a and b of Gamma distributions of input and output data, DemandA.
INPUTOUTPUT INPUTOUTPUT
Node IDababNode IDabab
1125.200.0868125.210.086813162.060.0938162.070.0938
2176.990.0963177.010.096314148.380.0914148.390.0914
3160.040.0935160.060.093415110.350.0836110.360.0836
4154.440.0925154.450.092516128.090.0874128.100.0874
5118.630.0855118.650.085517130.550.0879130.560.0879
6163.300.0940163.320.094018124.790.0868124.800.0868
7109.380.0834109.390.083419157.730.0930157.750.0930
8121.980.0862122.000.086220146.410.0910146.420.0910
9136.730.0892136.750.089221157.440.0930157.460.0930
10156.860.0929156.880.092922135.220.0889135.240.0889
11116.420.0850116.430.085023120.420.0858120.430.0858
1294.860.079994.870.0799-----
Table 4. Min, average, max values of input and output cross-correlation matrix, DemandA.
Table 4. Min, average, max values of input and output cross-correlation matrix, DemandA.
E[ρ1] = 0.0043
ρ
Input
Correlation Matrix
(Scaling Laws)
ρ
Output
Correlation Matrix
(Scenarios)
min0.75420.7542
average0.81470.8147
max0.85680.8567
Table 5. Gamma PDF parameters a and b input and output data at each node, DemandB.
Table 5. Gamma PDF parameters a and b input and output data at each node, DemandB.
INPUTOUTPUT INPUTOUTPUT
Node IDababNode IDabab
131.460.057531.460.05751326.160.054426.160.0544
268.640.072668.650.07261436.050.059936.050.0599
312.760.043912.760.04391519.080.049519.090.0495
427.310.055127.310.05511614.170.045314.170.0453
5158.890.0933158.900.093217128.910.0876128.920.0876
675.830.074875.840.07481814.170.045314.170.0453
715.730.046715.730.04671912.140.043312.140.0432
8107.650.0830107.660.08302067.280.072167.280.0721
915.730.046715.730.04672185.390.077585.400.0775
1012.140.043312.140.04322228.600.055928.610.0559
11128.500.0875128.510.087523123.960.0866123.970.0866
1259.410.069559.420.0695-----
Table 6. Min, average, max values of input and output cross-correlation matrix, DemandB.
Table 6. Min, average, max values of input and output cross-correlation matrix, DemandB.
E[ρ1] = 0.0043
ρ
Input
Correlation Matrix
(Scaling Laws)
ρ
Output
Correlation Matrix
(Scenarios)
min0.16270.1627
average0.43290.4330
max0.82610.8261
Table 7. Nodal demand (l/s) of most probable scenarios and mean scenario, DemandA.
Table 7. Nodal demand (l/s) of most probable scenarios and mean scenario, DemandA.
Node IDScenario 3720Scenario 5945Scenario 4492Mean ScenarioNode IDScenario 3720Scenario 5945Scenario 4492Mean Scenario
110.8711.3110.2910.731315.2015.6614.9814.94
217.0416.8815.7117.271413.5613.4313.0913.53
314.9614.6714.1514.66159.238.968.448.90
414.2813.8213.3214.351611.2010.8510.4011.01
510.149.9710.7310.401711.4812.0410.7711.59
615.3515.1914.7116.081810.8310.6110.3110.64
79.128.948.318.711914.6814.6514.1114.71
810.5110.649.9210.242013.3214.1013.4513.40
912.1912.3811.2612.852114.6414.3514.3913.89
1014.5714.3314.1914.412212.0212.3411.1212.19
119.899.539.539.962310.3410.5310.1210.37
127.587.587.337.25-----
Table 8. Nodal demand (l/s) of most probable scenarios and mean scenario, DemandB.
Table 8. Nodal demand (l/s) of most probable scenarios and mean scenario, DemandB.
Node IDScenario 6232Scenario 9392Scenario 9550Mean ScenarioNode IDScenario 6232Scenario 9392Scenario 9550 Mean Scenario
11.811.811.751.80131.421.521.271.39
24.985.014.804.71142.162.411.582.13
30.560.460.640.48150.941.190.991.06
41.501.521.391.55160.640.660.650.58
514.8215.0314.5514.921711.2911.3411.2411.16
65.675.385.385.62180.640.600.830.64
70.730.560.590.68190.520.490.450.55
88.948.828.458.82204.855.044.864.91
90.730.650.760.69216.617.056.656.59
100.520.500.340.46221.601.721.531.10
1111.2511.4511.0411.242310.7311.3011.1210.64
124.134.144.044.05-----
Table 9. Nodal pressure (m) for different scenarios, DemandA.
Table 9. Nodal pressure (m) for different scenarios, DemandA.
Node IDScenario 7577Scenario 6780Scenario Mean PressureScenario 3720Scenario Mean DemandNode IDScenario 7577Scenario 6780Scenario Mean PressureScenario 3720Scenario Mean Demand
125.96525.96225.96325.96625.9651315.75415.90415.68415.83915.723
228.07728.05228.06128.08528.0701417.72717.85017.66117.69717.692
324.54824.54324.51224.59224.5341518.13518.27818.00118.10718.041
421.82921.85221.71421.96321.7551621.02321.22221.04221.13821.068
525.28025.31425.25425.28925.2671722.94623.04022.90822.93022.927
624.87524.92924.82924.88824.8491822.93322.96622.89522.90622.911
722.45322.52222.38322.44922.4071925.14925.18525.14725.13925.155
821.54521.61821.45421.52221.4832021.01921.19320.99721.02121.026
920.64720.69120.59220.69420.6202118.79418.85918.76818.73118.785
1019.42119.55319.36619.50119.4042215.92616.36315.90816.11115.959
1118.80818.86418.73118.85518.7642321.91921.97621.89421.86221.909
1217.84617.90217.70017.90117.747------

## Share and Cite

MDPI and ACS Style

Magini, R.; Boniforti, M.A.; Guercio, R. Generating Scenarios of Cross-Correlated Demands for Modelling Water Distribution Networks. Water 2019, 11, 493. https://doi.org/10.3390/w11030493

AMA Style

Magini R, Boniforti MA, Guercio R. Generating Scenarios of Cross-Correlated Demands for Modelling Water Distribution Networks. Water. 2019; 11(3):493. https://doi.org/10.3390/w11030493

Chicago/Turabian Style

Magini, Roberto, Maria Antonietta Boniforti, and Roberto Guercio. 2019. "Generating Scenarios of Cross-Correlated Demands for Modelling Water Distribution Networks" Water 11, no. 3: 493. https://doi.org/10.3390/w11030493

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.