Water Network-Failure Data Assessment

The water-supply system is one of the basic and most important critical infrastructures. Water supply service disruption (water quality or quantity) may have serious consequences in modern societies. Water supply service is subject to various failure modes. Failure modes are specified by their degradation mechanisms, criticality, occurrence frequency and intensity. These failure modes have a random nature that impacts on the network disruption indicators, such as disruption frequency, network downtime, network repair time and network back-to-service time, i.e., the network resilience. This paper focuses on the water leakage failure mode. The water leakage failure mode assessment considers the unavoidable annual real water losses and the infrastructure leakage index recommended by the International Water Association’s Water Loss Task Force specialist group. Probabilistic statistical modelling was implemented to assess the seasonal index, the failure rates and the expectation value of the “mean time between failures.” The assessment is based on real operational data of the network. Specific attention is paid to the sensitivity of failures to seasonal variations. The presented methodology of the analysis of the water leakage failure mode is extendable to other failure modes and can help in developing new strategies in the management of the water-supply system in normal operation and crisis situations.


Introduction
The network must supply consumers with a required amount of water of a specific quality at an acceptable price. The requirements of the Drinking Water Directive (Council Directive 98/83/EC of 3 November 1998 on the quality of water intended for human consumption) should be fulfilled with its latest amendments including the EU Commission Directive (EU) 2015/1787 of 6 October 2015. In addition, the World Health Organization published guidelines (2004) for the application of water safety plans in which it recommends the analysis of the supply chain underpinning the water intended for human consumption. The water supply service disruption can cause a crisis with the consequence of a "loss of safety for consumers" [1,2].
This explains the critical importance of the development of emergency plans to manage the continuity of the drinking water supply in various critical situations. In addition, detailed analyses of potential risks leading to the water supply service disruption must be carried out in order to construct a comprehensive programme of system safety management [3,4]. The water-distribution network consists mainly of the pipe network, with mains, distribution pipes, service connections, and such specific fittings as gate valves, check valves, hydrants, drainage, aeration and flow meters.
The complexity of the distribution system, as well as the interactions between different physical, chemical and biological processes, and the lack of failure data, make the analyses of failure events very

Studied Case Characterisation
The selected water-supply system is located in Poland's Podkarpackie province and serves a city of 9000 inhabitants with a population density of 642 inhabitants·km⁻². The analysis was performed using a stream of daily operational failure data covering a four-year operating period. The network comprises 1.7 km of mains, 21.1 km of distribution pipes and 41 km of service connections. Originally, water pipes were built of grey cast iron or galvanised steel, but with the continuous expansion and modernisation of the network, more modern materials such as polyvinyl chloride and polyethylene are increasingly used. The examined network is branched: the pipes run parallel to the main streets and are most concentrated in the centre, where the buildings are densest. The average daily water demand is 1337 m³·d⁻¹ and the maximum hourly demand is 134 m³·h⁻¹. Galvanised steel pipes constitute the largest part of the water supply network, accounting for about 56.43%. Grey cast-iron pipes account for 4.55%, while about 39.02% of all pipelines are polyvinyl chloride and polyethylene.

Failure Data & Statistical Analyses
The leakage failure assessment is based on an available failure database that includes the following classes of data [26,27]. If there is any ageing effect, it can be determined by comparing the monthly numbers of failures n(i) for month i in the following way:

We are interested in working out leakage failure rates per unit time and per km, λ(t), based on the available data. The standard procedures for processing failure data are proposed in [27,28]. We apply the following classification of critical failure-rate values and their corresponding reliability levels [27]:
• Low reliability level when the failure rate λ is higher than 0.5 failures/km/year,
• High reliability level when the failure rate λ is less than 0.1 failures/km/year,
• Medium reliability level when the failure rate λ is between 0.1 and 0.5 failures/km/year.
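The classification above can be expressed as a small lookup. The following is only a sketch: the function name `reliability_level` is hypothetical, and the treatment of the exact boundary values 0.1 and 0.5 as "medium" is an assumption, since the text only says "between".

```python
def reliability_level(failure_rate: float) -> str:
    """Map a failure rate in failures/km/year to a reliability level.

    Thresholds follow the classification quoted from [27] in the text.
    """
    if failure_rate < 0:
        raise ValueError("failure rate cannot be negative")
    if failure_rate > 0.5:
        return "low"
    if failure_rate < 0.1:
        return "high"
    # Assumption: the boundary values 0.1 and 0.5 themselves count as medium.
    return "medium"


if __name__ == "__main__":
    for rate in (0.05, 0.3, 1.08):
        print(f"{rate:.2f} failures/km/year -> {reliability_level(rate)} reliability")
```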
To carry out this statistical analysis and modelling, we explore three directions: direct statistical averaging, deseasonalised averaging with linear regression, and mean time-to-fail (MTTF) data fitting to a Poisson stochastic process.

Direct Statistical Analyses
The values of failure rates are calculated according to Equation (1) [26]:

$$\lambda_j(t) = \frac{n_j(t,\, t + \Delta t)}{L_j \, \Delta t} \qquad (1)$$

where n_j(t, t + Δt) is the number of all failures in the time interval Δt for part j of the network, L_j is the length of part j of the network [km], and j denotes a given part of the water supply network. In the analysis of the degree of failure of the water supply network, an important parameter is the standard deviation of the failure rates, which is calculated according to the formula:
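As an illustration of the direct approach, the sketch below computes a failure rate as failures per kilometre per unit time and a standard deviation of monthly counts. The totals (276 failures, 48 months, 1.7 + 21.1 + 41 = 63.8 km) are the figures reported elsewhere in the paper. The list `monthly_counts` is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the paper's own formula is not reproduced in this text.

```python
import statistics


def failure_rate(n_failures: float, length_km: float, interval: float) -> float:
    """Failures per km per unit of time: n / (L * dt)."""
    if length_km <= 0 or interval <= 0:
        raise ValueError("length and interval must be positive")
    return n_failures / (length_km * interval)


# Figures reported in the paper: 276 failures, 48 months, 63.8 km of pipes.
rate_per_month = failure_rate(276, 63.8, 48)   # failures / km / month
rate_per_year = rate_per_month * 12            # failures / km / year

# Hypothetical monthly failure counts, used only to show the dispersion step.
monthly_counts = [8, 5, 11, 7]
mean_count = statistics.mean(monthly_counts)
std_count = statistics.stdev(monthly_counts)   # assumption: n - 1 denominator

if __name__ == "__main__":
    print(f"rate = {rate_per_month:.4f} failures/km/month")
    print(f"rate = {rate_per_year:.3f} failures/km/year")
    print(f"mean = {mean_count}, std = {std_count}")
```

With the reported totals the annual rate is about 1.08 failures/km/year, which falls in the low-reliability class of the classification quoted from [27].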

Deseasonalisation and Linear Regression
Regarding seasonal impact, we applied a method based on the average identical period comparative analysis [29] in order to isolate seasonal fluctuations.
A seasonal index S_i is calculated according to: where d is equal to 4 (the four seasons).
The mean value of failures y_i over homonymous subperiods (quarters) is calculated by: where k_i is the average number of failures in quarter i. The deseasonalised average number of failures g_i, which accounts for the seasonal fluctuations, is calculated using the following formula: where g_i is expressed in the same units as the examined phenomenon.
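The seasonal-index step can be sketched as follows. This assumes the common definition S_i = d · y_i / Σ y_j, where y_i is the mean count of season i over all years, and it assumes that deseasonalising means dividing each observation by its seasonal index. Neither formula is reproduced in this text, so both are assumptions, and the function names and the counts in `counts` are hypothetical.

```python
def seasonal_indices(counts_by_year):
    """Seasonal index per season: mean of that season divided by the overall mean.

    counts_by_year is a list of years, each a list of d seasonal counts.
    The indices sum to d by construction.
    """
    n_years = len(counts_by_year)
    d = len(counts_by_year[0])
    if any(len(year) != d for year in counts_by_year):
        raise ValueError("every year must have the same number of seasons")
    season_means = [sum(year[i] for year in counts_by_year) / n_years for i in range(d)]
    overall_mean = sum(season_means) / d
    return [m / overall_mean for m in season_means]


def deseasonalise(counts_by_year, indices):
    """Divide each observation by the seasonal index of its season."""
    return [[c / s for c, s in zip(year, indices)] for year in counts_by_year]


# Hypothetical quarterly failure counts for two years (Q1..Q4).
counts = [[20, 24, 22, 14], [16, 20, 18, 10]]
indices = seasonal_indices(counts)
smoothed = deseasonalise(counts, indices)

if __name__ == "__main__":
    print([round(s, 3) for s in indices])
    print([[round(v, 2) for v in year] for year in smoothed])
```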
Once the deseasonalised number of failures is determined, we can determine the trend of the number of failures over time using linear regression.
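A minimal sketch of the trend step, assuming an ordinary least-squares line fitted to equally spaced periods numbered 1, 2, …, n. The function names and the input series are hypothetical, and the paper may use a different regression setup.

```python
def linear_trend(values):
    """Least-squares slope and intercept for y against x = 1, 2, ..., n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = range(1, n + 1)
    x_mean = (n + 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast(values, period):
    """Value of the fitted trend line at the given period number."""
    slope, intercept = linear_trend(values)
    return intercept + slope * period


# Hypothetical deseasonalised failures per quarter.
series = [10.0, 8.0, 6.0, 4.0]
slope, intercept = linear_trend(series)
next_quarter = forecast(series, len(series) + 1)

if __name__ == "__main__":
    print(f"slope = {slope}, intercept = {intercept}, forecast = {next_quarter}")
```

To obtain a forecast in actual (seasonal) terms, the trend value would then be multiplied by the seasonal index of the target quarter, under the same assumptions as above.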

MTTF Fitting to Poisson Stochastic Process
Keeping the assumption of a time-constant failure rate in mind, we then fit the failure data with a Poisson stochastic process. That means accepting the absence (or insignificance) of ageing effects over the short interval of 16 months. The adjustment criterion is then the conservation of the expectation value of the time between successive failures over the corresponding intervals. Here, we do not count "failures/month" but the "time to fail" within a given interval of time. The expectation value of the time between successive failures E(T) over the interval T_o is called the "Mean Time-To-Fail" (MTTF) and is determined for the Poisson stochastic process by: giving: where T is the MTTF, T_o is the interval of interest and λ is the failure rate to be determined. E(T) is also called the "1st moment".
Considering the available data, one may statistically estimate an MTTF, T*, called the statistical MTTF and determined by:

$$T^{*} = \frac{1}{n} \sum_{i=1}^{n} t_i$$

where t_i are the times to fail observed within the corresponding interval and n is the number of observations within the interval. T* is also widely called the "estimator". One should always distinguish between the "expectation" value and the "estimated" one.
Knowing that the statistical MTTF tends to the expectation value if the sample is large enough (the law of large numbers): then, combining with Equation (10), one can write: and then express the failure rate by: where T* is the statistically estimated MTTF from the database. To solve the equation, one defines two functions, the failure moment function (FMF) and the failure rate function (FRF), respectively defined as: The adjustment criterion is then the conservation of the expectation value of the time between successive failures.
Solutions exist at the intersection of both functions as demonstrated in Figures 9 and 10.
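The idea of matching the expected and the estimated MTTF can be sketched numerically. The paper's own expressions for E(T), the FMF and the FRF are not reproduced in this text, so the sketch below substitutes an assumed form: the mean of an exponential time-to-fail conditional on the failure occurring within the window T_o. It then solves for λ by bisection instead of a graphical intersection. All function names and inputs are hypothetical.

```python
import math


def statistical_mttf(times_to_fail):
    """T* = (1/n) * sum(t_i): sample mean of the observed times to fail."""
    if not times_to_fail:
        raise ValueError("need at least one observation")
    return sum(times_to_fail) / len(times_to_fail)


def truncated_mean(lam, window):
    """Assumed E(T): mean of an exponential time-to-fail given T <= window."""
    x = lam * window
    if x < 1e-6:            # limit for very small rates
        return window / 2.0
    if x > 700.0:           # truncation term is negligible; avoids overflow
        return 1.0 / lam
    return 1.0 / lam - window / math.expm1(x)


def fit_rate(t_star, window, iterations=200):
    """Find lam such that truncated_mean(lam, window) equals t_star."""
    if not 0.0 < t_star < window / 2.0:
        raise ValueError("t_star must lie strictly between 0 and window / 2")
    lo, hi = 0.0, 1.0 / t_star   # the truncated mean at hi is below t_star
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, window) > t_star:
            lo = mid             # rate too small, expected time too long
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t_star = statistical_mttf([3.0, 6.0, 4.5, 6.5])   # hypothetical times, in days
    print(f"T* = {t_star} d, fitted rate = {fit_rate(t_star, 480.0):.4f} per day")
```

When the window is long compared with the MTTF, the truncation term vanishes and the fitted rate reduces to 1/T*.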

Water Loss Assessment
The Unavoidable Annual Real Loss (UARL) is the annual volume of loss that is inevitable and that it is not economically viable to eliminate. This means that the economic losses due to the leakages are smaller than the expenses of repairing or reducing the leakages through technical design solutions. The indicator is determined from the relation: where the Infrastructure Leakage Index (ILI) is the ratio of the annual volume of losses to the UARL, which is described by the relationship ILI = V_loss / UARL, where V_loss is the annual volume of unsold water [m³·year⁻¹] and UARL are the unavoidable losses. The Real Loss Benchmark (RLB) is the indicator that enables the assessment of the technical condition of the water supply system. If the number of connections per kilometre of water pipe network is at least 20, the RLB is obtained through the relationship: where L_pw is the number of water connections.
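For orientation, the sketch below uses the commonly cited IWA form of the UARL, (18·L_m + 0.8·N_c + 25·L_p)·P in litres per day, with L_m the mains length in km, N_c the number of service connections, L_p the private pipe length in km and P the average pressure in metres. Whether the paper uses exactly this form is an assumption, as its equation is not reproduced in this text. The RLB is taken here as losses per connection per day, which matches the unit dm³·d⁻¹ per connection quoted in the results but is likewise an assumption. All input values and function names are hypothetical.

```python
def uarl_m3_per_year(mains_km, n_connections, private_pipe_km, pressure_m):
    """Unavoidable annual real losses, commonly cited IWA form, in m3/year."""
    litres_per_day = (
        18.0 * mains_km + 0.8 * n_connections + 25.0 * private_pipe_km
    ) * pressure_m
    return litres_per_day * 365.0 / 1000.0


def infrastructure_leakage_index(losses_m3_per_year, uarl):
    """ILI: ratio of the annual volume of losses to the UARL."""
    if uarl <= 0:
        raise ValueError("UARL must be positive")
    return losses_m3_per_year / uarl


def rlb_litres_per_connection_day(losses_m3_per_year, n_connections):
    """Assumed RLB: losses per service connection per day, in dm3/d."""
    return losses_m3_per_year * 1000.0 / (365.0 * n_connections)


# Hypothetical network: 20 km of mains, 1000 connections, 10 km of private
# pipe, 40 m average pressure, 30,879 m3/year of losses.
uarl = uarl_m3_per_year(20.0, 1000, 10.0, 40.0)
ili = infrastructure_leakage_index(30879.0, uarl)
rlb = rlb_litres_per_connection_day(30879.0, 1000)

if __name__ == "__main__":
    print(f"UARL = {uarl:.0f} m3/year, ILI = {ili:.2f}, RLB = {rlb:.1f} dm3/d/connection")
```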

General View Water Losses Analysis
In Figure 2, the average values of the failure rate in the examined interval of time for the whole water network are presented.
In each of the examined years, the water supply network is characterised by a high failure rate. However, the number of annual failures decreases significantly in the last two years. Because the values obtained for the first two years exceeded the criterion proposed in [23], λ > 0.5 failures/y/km, it was decided to renovate the network pipelines. A summary of failures by cause is presented in Figure 3.
The most common failure mode is connection failure (97 cases, 35.14% of all failures). This type of failure is mainly the result of design errors or improper operating conditions. Other frequent failure modes are water meter failures and corrosion-stress cracking (44 failures each, 15.94%). The "other" category includes leaks or pipe cracking due to excavation work. Corrosion-stress cracking affected steel pipes, which constitute more than 50% of all pipes in the entire network, while water meter failures were mainly observed at customers' residences and were caused by improper use or ageing.
The repair/replace time is one of the most important indicators we examined. It is determined by the number of hours needed to repair a failure or replace damaged equipment. It depends on many factors, such as the size of the maintenance crew, the extent of the failure, the accessibility of the failure location and the pipe diameter. The classification of failure modes by repair time is shown in Figure 4.
The most frequent repair time ranges from six to eight hours (about 25.7% of all failures). The least frequent repairs took more than 10 h (<3%).
The unavoidable annual real losses (UARL) show an upward trend and amount to 41,525 m³·a⁻¹. The infrastructure leakage index (ILI) remains at a stable level, not exceeding a value of two. These indicators classify the state of the water supply network as good according to the IWA criteria. A downward trend of the RLB was observed, from 258.48 to 122.16 dm³·d⁻¹ per water connection. It should be noted that the resulting value of the ILI indicates an appropriate state of the water supply system and compensates for the unsold water volume. This is due to the substantial length of the network in relation to the volume of sold water and the high rate of unavoidable losses.

Overview on Failure Data Deseasonalisation
Figure 5 presents the failure rate of the water network in the successive quarters of the analysed period and shows the seasonal fluctuations. Seasonal fluctuations, whose characteristic parameter is the annual fluctuation cycle, are an important factor that allows a more accurate analysis and trend prediction of water supply network failures. In terms of the seasonal index, the number of failures was above the quarterly average (1.0) by approximately 0.23 in the second quarter of the year and by 0.06 in the third quarter, and was well below the quarterly average, by 0.28, in the fourth quarter (Figure 6). The highest failure rate, and hence the lowest reliability, occurred in the second quarter (1.332 failures·km⁻¹·a⁻¹). The lowest failure rate was observed in the fourth quarter (0.784 failures·km⁻¹·a⁻¹).
The absolute seasonal fluctuation g_i shows that, in the fourth quarter, the number of failures was lower than the quarterly average (ȳ = 17.25 failures·quarter⁻¹) by 4.75 failures and, in the second quarter, was higher than the average by 4.0 failures. The standard deviation for the analysed data, used to assess the degree of variation, was equal to 3.63.
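The reported quarterly figures can be cross-checked by hand. The worked arithmetic below assumes that the seasonal index is the quarterly mean divided by the overall quarterly mean, that quarterly counts are annualised with a factor of 4, and that the network length is 1.7 + 21.1 + 41 = 63.8 km. The symbols y_2, y_4, S_2, S_4 and the per-quarter rates λ_Q2, λ_Q4 are notation introduced here for illustration.

```latex
\begin{align*}
\bar{y} &= 17.25 \ \text{failures per quarter} \\
y_2 &= 17.25 + 4.0 = 21.25, & y_4 &= 17.25 - 4.75 = 12.50 \\
S_2 &= \frac{21.25}{17.25} \approx 1.23, & S_4 &= \frac{12.50}{17.25} \approx 0.72 \\
\lambda_{Q2} &= \frac{4 \times 21.25}{63.8} \approx 1.332, &
\lambda_{Q4} &= \frac{4 \times 12.50}{63.8} \approx 0.784
\quad \text{failures} \cdot \text{km}^{-1} \cdot \text{a}^{-1}
\end{align*}
```

Under these assumptions the indices reproduce the stated deviations of +0.23 and −0.28 from the quarterly average, and the rates match the second- and fourth-quarter values quoted in the text.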
To smooth fluctuations and remove seasonality, the deseasonalised index was calculated by dividing the actual failure data by the seasonal index S_i; the result is presented in Figure 7. The deseasonalisation helps to remove some peaks. The downward trend shows that the number of failures decreases slightly. Using the presented deseasonalised failure prediction, the number of failures in the next quarter of the following year can be estimated from the least-squares trend line, which gives nine failures per quarter. Such an approach can be helpful in the prediction of water pipe failures.

Network Global Failure Rate Assessment
The straightforward statistical approach was applied to process failure data from the database. The database comprises 276 recorded failure events accumulated over four years. Recorded failures cover six failure categories: corrosion, cracks, service connection damage, valve damage, water meter damage and others. The monthly failures histogram is presented in Figure 8.
Assuming that the whole network is characterised by a constant failure rate over the four years, the statistical analysis of the whole interval of 48 months with its 276 recorded failures resulted in a failure rate of 9 × 10⁻² per month per kilometre (3 × 10⁻³ d⁻¹·km⁻¹) with a standard deviation (σ) of 57.8% (Table 1). This large dispersion, despite the relatively good size of the database, called for some additional finer analyses.
The visual shape of the histogram in Figure 8 suggested dividing the total 48-month interval into three equal sub-intervals. The failure rate and the corresponding standard deviation of each of these three sub-intervals are given in Table 1. Considering the values of the failure rates and the corresponding standard deviations, one may conclude that:
• The global set (over 48 months) and the first and the second subsets are not well discriminated on the basis of their failure rates and standard deviations. The failure rate of each lies within the uncertainty region of the others because of their relatively high standard deviations.
• The third subset is clearly discriminated by the values of its failure rate and standard deviation.
We will then consider only the first and the third subsets for a finer statistical analysis (see Figure 8). For reference, over the whole 48-month interval the mean value of the monthly failures is around 5.75 failures/month with a standard deviation of 3.33, which represents about 57.8% of the mean value. This standard deviation represents a relatively large uncertainty. That corresponds to a global constant failure rate of 9.01 × 10⁻² per month per kilometre.
For interval 1, one can also determine a mean value of the monthly failures (over the first 16 months) and its standard deviation. The mean value of the monthly failures is found to be around 7.82 failures/month with a standard deviation of 3.11, which represents about 39.8% of the mean value. That corresponds to a constant failure rate of 1.23 × 10⁻¹ per month per kilometre.
For interval 3, one can also determine a mean value of the monthly failures (over the last 16 months) and its standard deviation. The mean value of the monthly failures is found to be around 3.47 failures/month with a standard deviation of 1.91, which represents about 55.1% of the mean value. That corresponds to a constant failure rate of 5.44 × 10⁻² per month per kilometre.
The network reliability was therefore much better within the last 16 months than within the first 16 months.
Regarding the intermediate interval between 17 and 32 months, the data indicate a very rapid (almost linear) decrease of the failure rate. That could be due to a radical improvement in maintenance or a large replacement of the old network by a new one.

Failure Data Fitting to Poisson Stochastic Process
Keeping the assumption of a time-constant failure rate in mind, we adjusted the failure data of the first sub-interval (over 16 months) and of the third sub-interval (over 16 months) to fit a Poisson stochastic process (Figure 8). That means accepting the absence (or insignificance) of ageing effects over the short interval of 16 months. The adjustment criterion is the conservation of the expectation value of the time between successive failures (MTTF) over 16 months.
Regarding the first interval, the estimated failure rate is 2.14 × 10⁻¹ d⁻¹, or 3.35 × 10⁻³ d⁻¹·km⁻¹ if one considers a network size of 63.8 km (Figure 9).
Regarding the third interval, the estimated failure rate is 8.04 × 10⁻² d⁻¹, or 1.32 × 10⁻³ d⁻¹·km⁻¹ if one considers a network size of 63.8 km (Figure 10).
A comparison between the failure rates estimated by the direct statistical approach and those estimated using the conservation of the first moment of a Poisson stochastic process is given in Table 2.
One may distinguish two different phases, the first 16 months and the last 16 months, within the total 48 months of the available data. The difference may be due to changes in the network itself (structural, technological or organisational) or to changes in the data collection and recording process. In both 16-month phases, the network can be considered non-ageing, or at least the ageing process can be considered insignificant over such a short period. The straightforward approach is more conservative and results in slightly higher failure rates.
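The comparison between the two approaches can be illustrated with the figures quoted in the text. The sketch below converts the direct monthly means (7.82 and 3.47 failures/month) to daily per-kilometre rates and sets them against the reported Poisson-fit values. The 30-day month is an assumption made for the unit conversion, Table 2 itself is not reproduced here, and the variable names are hypothetical.

```python
NETWORK_KM = 63.8        # 1.7 + 21.1 + 41 km, as reported in the paper
DAYS_PER_MONTH = 30.0    # assumption used only for the unit conversion


def direct_rate_per_day_km(mean_failures_per_month):
    """Convert a mean monthly failure count to failures per day per km."""
    return mean_failures_per_month / DAYS_PER_MONTH / NETWORK_KM


# Mean monthly failures reported for the first and the last 16 months.
direct = {
    "interval 1": direct_rate_per_day_km(7.82),
    "interval 3": direct_rate_per_day_km(3.47),
}

# Poisson-fit rates per day per km, as reported in the text.
poisson = {"interval 1": 3.35e-3, "interval 3": 1.32e-3}

ratios = {name: direct[name] / poisson[name] for name in direct}

if __name__ == "__main__":
    for name in direct:
        print(
            f"{name}: direct {direct[name]:.2e}, "
            f"Poisson {poisson[name]:.2e}, ratio {ratios[name]:.2f}"
        )
```

Under the 30-day assumption, the direct estimates come out higher than the Poisson-fit values in both intervals, which is consistent with the statement that the straightforward approach is the more conservative one.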

Conclusions and Perspectives
Due to major water losses and frequent pipe failures in the examined network, the recommendations are to improve the network control and monitoring system and to use more accurate measuring devices. The paper treated only the water leakage failure mode. However, the approach is generic and can be applied to all other failure modes in a straightforward manner. Recommended actions would entail the development of online detection and monitoring algorithms (classifiers) and tools. That includes the design of an active leak-control system and the development of a comprehensive information technology system with intelligent communication modules, telecommunications infrastructure and an automatic database management system.
Governments, citizens and other stakeholders require a higher quality of drinking water and a higher level of water supply service continuity. The national institutions concerned and global legal regulations require the adaptation and development of scientific and technical research to enhance the safety and resilience of water supply systems. Such requirements can even become critical societal concerns, especially in the case of extraordinary, unprecedented terrorist or epidemiological threats. Conclusions from the history of individual failures involving mass contamination of tap water in urban agglomerations offer a signpost where active risk-based water supply management is concerned. In this situation, it is important to develop risk-management procedures and smart decision-support tools. The risk-based management of water supply services allows the integration of sustainable development principles.
Water supply service stakeholders should be able to estimate risk, inform users about the service quality, take appropriate actions to minimise the risk and initiate the actions necessary to mitigate threats in crisis situations. In normal operation, risk-based management is required to optimise maintenance, periodic testing and inspection, and diagnostic and prognostic activities.

Author Contributions: All authors equally contributed to the development of this manuscript. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by subsidies for statutory activity by Faculty of Civil and Environmental Engineering and Architecture, Rzeszow University of Technology, 35-959 Rzeszow, Poland.