1.1. Overview and Motivation
The motivation of this paper is two-fold. The first aim is theoretical: to introduce a “rich” representation of the graph underlying a water distribution network as an element in a space of probability distributions. This space can be endowed with different distance measures, which allow the computation of a new index of dissimilarity between networks. The second aim is to show that the vulnerability index derived from this representation can offer insights additional to those derived from the loss of efficiency and from the eigenvalue analysis of the adjacency and Laplacian matrices.
Indeed, the availability of new vulnerability measures is important in the analysis of networked infrastructures, such as water, energy, and transport, which have developed similar functional and structural features in their evolution over time: spatial, but also financial, constraints have significantly restricted their connectivity and their capability to deliver their service with failed or damaged components, in short, their robustness. These constraints have also generated systemic risk and cascading effects, exacerbated by the complexity of infrastructures with a large number of components: pipes, valves, pumping stations, tanks, and consumption points in the case of water distribution networks; generation structures, switching substations, and high-voltage connections in power grids; and pipes, pumping and switching stations, storage facilities, and refineries for other distribution networks, like gas and oil.
Both robustness and resilience describe the capability of the network to withstand failures and perturbations in its components and to keep delivering services regardless of disruptive events, either random or malicious. In a Water Distribution Network (WDN), such events include failures in pumping stations or valves and severe bursts in the main pipes.
Resilience, robustness, reliability, and vulnerability are closely linked terms that are often used confusingly. Ref. [1] gives a comprehensive analysis of the different contexts in which these terms are used.
The term resilience is more common in the literature on engineered network infrastructures, and it often takes a more general meaning than vulnerability, also implying the capacity of the network to bounce back, to regain a new stable position close to the original state after perturbations, and to adapt to the new situation [2]. Reliability is linked to the concept of risk, which implies the use of a measure of the probability that the network will keep working under certain circumstances.
The structure and functions of the network rely on the existence of paths between pairs of nodes: the failure of components is simulated by removing the corresponding nodes/edges from the network. When nodes and/or edges are removed, the length of such paths will increase and eventually some pairs of nodes will become disconnected.
One relevant question is this: which are the critical components (i.e., nodes/edges) whose failure impairs the functioning of the network, and by how much does each such failure increase vulnerability?
In this paper, the drop in the network robustness is measured by the increase in vulnerability of the perturbed network with respect to the original one. This first analysis of vulnerability is carried out by using different measures of the connectivity of the graph as they are expressed by centrality indices.
According to a widely used metric [3], an increase in vulnerability is measured as the loss of efficiency that follows the failure of a set of nodes/edges and their removal from the network.
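As an illustration of this kind of efficiency-based measure, the following is a minimal sketch that assumes the common definition of global efficiency as the mean inverse hop distance over all ordered node pairs, and takes vulnerability as the relative efficiency drop after an edge removal. The exact formulation in [3] may differ; the function names and the toy graph (a 4-cycle with one pendant node) are hypothetical.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first search: hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def global_efficiency(adj):
    """Mean of 1/d(i, j) over ordered pairs; unreachable pairs contribute 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        for t, d in hop_distances(adj, s).items():
            if t != s:
                total += 1.0 / d
    return total / (n * (n - 1))

def remove_edge(adj, u, v):
    """Return a copy of the undirected graph without the edge (u, v)."""
    new_adj = {k: set(vs) for k, vs in adj.items()}
    new_adj[u].discard(v)
    new_adj[v].discard(u)
    return new_adj

def efficiency_loss(adj, u, v):
    """Relative drop in global efficiency caused by removing edge (u, v)."""
    before = global_efficiency(adj)
    after = global_efficiency(remove_edge(adj, u, v))
    return (before - after) / before

# Hypothetical toy network: cycle 0-1-2-3-0 plus the pendant edge 3-4.
toy = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2, 4}, 4: {3}}
loss_bridge = efficiency_loss(toy, 3, 4)  # removal isolates node 4
loss_cycle = efficiency_loss(toy, 0, 1)   # removal leaves the graph connected
```

In this toy case the bridge edge 3-4 produces a larger efficiency loss than the cycle edge 0-1, which matches the intuition that edges without an alternative path are more critical.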
Another analysis can be carried out using spectral graph theory. The use of spectral methods in networks and graph theory has a long tradition [4,5].
Algebraic connectivity was introduced in [6]. The larger the algebraic connectivity is, the more difficult it is to cut the network into disconnected components.
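To make this concrete, the sketch below approximates the algebraic connectivity, taken here as the second-smallest eigenvalue of the graph Laplacian, with a shifted power iteration restricted to vectors orthogonal to the all-ones vector. This is only an illustrative stdlib-only approach under stated assumptions (undirected, unweighted graph; a start vector that is not orthogonal to the target eigenspace), not a production eigenvalue solver, and the function name is hypothetical.

```python
import math

def algebraic_connectivity(adj, iters=500):
    """Approximate the second-smallest Laplacian eigenvalue by power iteration."""
    nodes = sorted(adj)
    n = len(nodes)
    max_deg = max((len(adj[u]) for u in nodes), default=0)
    if n < 2 or max_deg == 0:
        return 0.0
    # 2 * max degree is an upper bound on the largest Laplacian eigenvalue,
    # so (shift * I - L) is positive semidefinite.
    shift = 2.0 * max_deg

    def laplacian_times(x):
        # (L x)_u = deg(u) * x_u - sum of x_v over the neighbours v of u
        return {u: len(adj[u]) * x[u] - sum(x[v] for v in adj[u]) for u in nodes}

    # Zero-mean start vector, i.e. orthogonal to the all-ones eigenvector.
    x = {u: i - (n - 1) / 2.0 for i, u in enumerate(nodes)}
    for _ in range(iters):
        lx = laplacian_times(x)
        y = {u: shift * x[u] - lx[u] for u in nodes}
        mean = sum(y.values()) / n
        y = {u: y[u] - mean for u in nodes}  # re-project against drift
        norm = math.sqrt(sum(val * val for val in y.values()))
        if norm == 0.0:
            break  # x is already an exact eigenvector of L with eigenvalue `shift`
        x = {u: y[u] / norm for u in nodes}
    lx = laplacian_times(x)
    # Rayleigh quotient of the converged vector.
    return sum(x[u] * lx[u] for u in nodes) / sum(x[u] * x[u] for u in nodes)

# Hypothetical toy graphs: a 4-node path and a 4-node cycle.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
ac_path = algebraic_connectivity(path4)
ac_cycle = algebraic_connectivity(cycle4)
```

The cycle, which needs two edge cuts to be disconnected, has a larger algebraic connectivity than the path, which is split by any single edge cut.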
The key argument in this paper is that, besides the vulnerability measures based on centrality indices, average node-to-node distances, and spectral analysis, new insights can be obtained through an additional analysis based on the node-to-node distance distributions aggregated at network level and on the computation of the distance between them. The advantage of these measures is that they enable the comparison between probability distributions taking into account not only the average values, but all the information carried by the distributions.
There are many distance measures between distributions. Two such measures are considered in this paper: the Jensen-Shannon (JS) divergence, based on Kullback-Leibler (KL) divergence, and the Wasserstein (WST) distance.
Distances between distributions are an important tool in machine learning. Entropy-based distances like KL and JS are the most widely used [7]. Recently, the WST distance, which is based on optimal transport theory, has gained increasing importance due to its properties, mainly in natural language processing [8] and imaging [9].
The Wasserstein distance has a strong mathematical basis [10], can be adapted to different situations, and offers a smooth and naturally interpretable distance, in particular between discrete distributions.
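The difference between the two families of distances can be illustrated on small histograms. The sketch below assumes two discrete distributions defined on the same equally spaced bins, computes the JS divergence with base-2 logarithms, and computes the 1-Wasserstein distance through the one-dimensional cumulative-sum formula. The function names and the example histograms are hypothetical and serve only as an illustration.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base-2 logs) between histograms on the same bins."""
    def kl(a, b):
        # Terms with a_i == 0 contribute 0 by convention.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def wasserstein_1d(p, q, bin_width=1.0):
    """1-Wasserstein distance in one dimension: summed gap between the two CDFs."""
    cum_p = cum_q = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cum_p += pi
        cum_q += qi
        total += abs(cum_p - cum_q) * bin_width
    return total

# All mass in the first bin, versus mass moved one bin or three bins away.
p = [1.0, 0.0, 0.0, 0.0]
q_near = [0.0, 1.0, 0.0, 0.0]
q_far = [0.0, 0.0, 0.0, 1.0]

js_near, js_far = js_divergence(p, q_near), js_divergence(p, q_far)
w_near, w_far = wasserstein_1d(p, q_near), wasserstein_1d(p, q_far)
```

In this example the JS divergence is the same for both comparisons, because the supports do not overlap in either case, while the Wasserstein distance grows with how far the mass has to be moved.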
This paper is of interest to the water research community because it offers a vulnerability measure which can be used alongside other measures to give additional insight into the structural features of the network. It can also be of interest to the machine learning community, as it offers an important specific instance of a distributional graph representation suitable for learning.
1.2. Related Works
The related concepts of vulnerability, robustness, and resilience in WDNs have spawned a line of research originated by [11] using graph-theoretic and complex network principles. That paper argues that reliability in a water distribution network is largely defined by the network layout and reports the results of extensive computational studies on four benchmark networks, examining two important topological features: robustness and path redundancy. The same paper also studies the cumulative distributions of the edge lengths and of the geodesic distances. It also hints at the issue of how much the WDN structure deviates from a “small world” network, indicating the near-planarity of the network as a possible cause of the deviation.
Papers [12,13] use the network and spectral approach in [11] jointly with clustering techniques and hydraulic simulation.
The approach in [14], also graph-theoretic, is based on analyzing the K-shortest paths between each demand node and the water sources, where paths are weighted by the hydraulic attributes of the supply routes, and proposes a resilience index based on a surrogate measure of the energy loss associated with each path.
More recently, [15] proposed a hydraulically informed measure of criticality called water flow edge betweenness centrality.
An alternative approach is termed flow entropy [16], which measures the strength of supply to a node in terms of the number of connections and their similarity.
The demand-adjusted entropic degree in [17] is another approach that uses demand on nodes and volume capacity to compute a weighted entropic degree.
Spectral analysis has also been used for WDNs in [18,19,20], which propose a graph-theoretic framework for assessing the resilience of sectorized WDNs.
Ref. [21] also focuses on graph spectral techniques and proposes a novel tool set adapted to improve the main water management tasks. The key point is to show how spectral metrics and algorithms support critical tasks of WDN management using only topological and geometric information. Spectral analysis also helps in the efficient and automatic definition of district metered areas and facilitates the localization of water losses through the definition of an optimal network partitioning.
More recently, [22] proposed a metric based on robustness and redundancy to evaluate resilience, along with an optimization framework. A basic recent reference is [23].
A related line of research is carried out by [24], which proposes a graph-based analysis, including hydraulic simulation, to estimate the energy balance components, and which has been tested on 20 real networks. Refs. [25] and [26] discriminate between different water consumption in order to detect abnormal events (e.g., leaks, illegal use, and metering inaccuracy).
Ref. [27] is a wide survey of quantitative resilience methods for WDNs, including network-based approaches. Ref. [28] extends this analysis to multiscale resilience in water distribution and drainage systems.
1.3. The Contributions of This Paper
The main contribution of this paper is to propose a novel vulnerability measure which can be used along other measures in order to give additional insight into the structural features of the network.
This result is based on the introduction of a mapping from an “input” space, whose elements are graphs or graph elements like nodes or edges, to a probabilistic space whose elements are probability distributions associated with elements in the input space. The use of a probabilistic distance between elements in the probabilistic space, the Wasserstein distance in particular, can be specialized to discrete distributions and particularly to histograms.
Histograms are suitable to represent node-to-node distance distributions in the graph model of the WDN. This allows the introduction of a new set of vulnerability metrics, given by the distance between the probability distribution of node-to-node distances of the original network and that of the network resulting from the removal of nodes/edges.
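The idea can be sketched on a toy graph as follows. The sketch assumes hop-count distances, builds a normalized histogram of node-to-node distances, and compares the original and perturbed networks with the one-dimensional 1-Wasserstein distance. Unreachable pairs are simply dropped from the histogram here, which is an assumption of this illustration and not necessarily the treatment adopted in the paper; all names and the toy network are hypothetical.

```python
from collections import Counter, deque

def hop_distances(adj, source):
    """Breadth-first search: hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_histogram(adj):
    """Normalized histogram of hop distances; entry k-1 is the share of pairs at distance k."""
    counts = Counter()
    for s in adj:
        for t, d in hop_distances(adj, s).items():
            if t != s:
                counts[d] += 1
    total = sum(counts.values())
    max_dist = len(adj) - 1  # longest possible shortest path in hops
    if total == 0:
        return [0.0] * max_dist
    return [counts[d] / total for d in range(1, max_dist + 1)]

def wasserstein_1d(p, q):
    """1-Wasserstein distance between histograms on unit-width bins."""
    cum_p = cum_q = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cum_p += pi
        cum_q += qi
        total += abs(cum_p - cum_q)
    return total

# Hypothetical toy network: cycle 0-1-2-3-0 plus the pendant edge 3-4.
toy = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2, 4}, 4: {3}}
# Perturbed network: the same graph without the edge 0-1.
perturbed = {0: {3}, 1: {2}, 2: {1, 3}, 3: {0, 2, 4}, 4: {3}}

hist_before = distance_histogram(toy)
hist_after = distance_histogram(perturbed)
wst = wasserstein_1d(hist_before, hist_after)
```

Here the removal of edge 0-1 shifts probability mass from distance 1 to distance 3, and the resulting Wasserstein distance quantifies that shift in units of hops.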
Two such probabilistic measures have been analyzed: Jensen-Shannon (JS), based on information theory, and the Wasserstein (WST) distance, an instance of optimal transport. The computational results confirm that the value of the distances JS and WST is strongly related to the criticality of the removed edges.
There are two major advantages of the Wasserstein distance: the first is that JS might become undefined in many situations while WST distances are generally well defined and provide an interpretable distance metric between distributions.
The second is that, under quite general conditions, the WST distance is a differentiable function of the parameters of the distributions which makes possible its use to assess the sensitivity of the network robustness to distributional perturbations.
A general methodological scheme is proposed connecting different modelling and computational elements, concepts, and analysis tools; it enables an analysis framework that is also suitable for assessing the robustness of other networked infrastructures, like energy, gas, and transport.
This framework has been designed, implemented, and tested on two real-life urban networks; it can support decision-making both at the design stage, to simulate alternative network layouts of different robustness, and at the operational stage where it is necessary to make a decision about which nodes/edges are to be temporarily removed for maintenance and rehabilitation.