# Statistical Dependence of Pipe Breaks on Explanatory Variables


## Abstract


## 1. Introduction

## 2. Data and Methods

#### 2.1. Data

^{2} and supplies water to more than 177 municipalities. It comprises a set of trunk mains and a set of distribution lines. The distribution lines consist of more than 370,000 pipe segments with a total length of more than 14,000 km. The trunk mains consist of close to 40,000 elements with a total length of more than 3000 km. The present analysis focuses on the water supply network (trunk mains and distribution pipes). Other assets managed by Canal de Isabel II, such as water treatment plants, pumping stations, water channels and service connections, are not considered in this study.

#### 2.2. Methodology

#### 2.2.1. Explanatory Variables Selection

#### 2.2.2. Bayesian Models for Failure Prediction

- Event A: A failure event occurs.
- Event B: Explanatory variable takes a value in the interval [x, x + Δx].

N_{f} is the number of failure events registered in the period and N_{T} is the total number of components. In the case of a pipe segment, N_{T} would be equal to the total length of the network divided by the length of the pipe segment.
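As an illustration of the ratio N_{f}/N_{T}, the following sketch computes a zero-order annual failure probability per component from the event counts reported in Table 1. It assumes that N_{T} can be taken as the number of components in each category; the names `COMPONENTS`, `EVENTS` and `zero_order_probability` are hypothetical and are not part of the original study.

```python
# Zero-order estimate P(A) = N_f / N_T, using the counts reported in Table 1.
# Assumption: N_T is taken as the number of components in each category.
COMPONENTS = {"Distribution lines": 373_113, "Trunk mains": 39_915}
EVENTS = {
    "Distribution lines": {2011: 1758, 2012: 1773, 2013: 1970, 2014: 2237},
    "Trunk mains": {2011: 94, 2012: 78, 2013: 74, 2014: 79},
}


def zero_order_probability(n_failures: int, n_components: int) -> float:
    """Annual failure probability of a single component, P(A) = N_f / N_T."""
    if n_components <= 0:
        raise ValueError("n_components must be positive")
    return n_failures / n_components


# Probability per category and year (hypothetical helper structure).
ZERO_ORDER = {
    category: {
        year: zero_order_probability(n_f, COMPONENTS[category])
        for year, n_f in by_year.items()
    }
    for category, by_year in EVENTS.items()
}

if __name__ == "__main__":
    for category, by_year in ZERO_ORDER.items():
        for year, p in by_year.items():
            print(f"{category} {year}: P(A) = {p:.5f}")
```

Because this estimate ignores every explanatory variable, it can serve as the baseline against which higher order models are compared.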

f_{X} is the probability distribution function of the explanatory variable X.

f_{FX} is the probability distribution function of the explanatory variable X among the failed components.
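The quantities defined above can be combined through Bayes' theorem, P(A|B) = P(A) · P(B|A) / P(B). The following is a minimal sketch under stated assumptions, not the authors' implementation: P(B) is estimated as the share of all components whose value of X falls in an interval, P(B|A) as the same share among failed components, and each failed component is counted once. The function name, the interval edges and the diameters are hypothetical.

```python
import numpy as np


def first_order_probability(x_all, x_failed, edges):
    """Estimate P(A | X in interval) = P(A) * P(B|A) / P(B) from counts."""
    x_all = np.asarray(x_all, dtype=float)
    x_failed = np.asarray(x_failed, dtype=float)
    n_total, n_failed = x_all.size, x_failed.size
    if n_total == 0 or n_failed == 0:
        raise ValueError("need at least one component and one failure")

    count_all, _ = np.histogram(x_all, bins=edges)
    count_failed, _ = np.histogram(x_failed, bins=edges)

    p_a = n_failed / n_total               # zero-order term N_f / N_T
    p_b = count_all / n_total              # P(B): share of all components per interval
    p_b_given_a = count_failed / n_failed  # P(B|A): share of failed components per interval

    # Intervals that contain no components carry no information: report NaN there.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(count_all > 0, p_a * p_b_given_a / p_b, np.nan)


if __name__ == "__main__":
    # Hypothetical diameters (mm) of ten components, three of which failed.
    diam_all = [80, 80, 100, 100, 100, 150, 150, 200, 300, 300]
    diam_failed = [80, 80, 100]
    print(first_order_probability(diam_all, diam_failed, edges=[50, 125, 250, 350]))
```

With this counting estimate the result reduces to the number of failures in an interval divided by the number of components in that interval, which is a convenient check on the arithmetic.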

#### 2.2.3. Model Building

#### 2.2.4. Model Validation

Validation is based on comparing the number of failures predicted by the model (N_{p}) and the real number of failures observed (N_{o}) at each validation period. Models are evaluated on a set of network samples of varying size. Samples are defined according to a geographical criterion: each sample is a square network area surrounding a random central point. The size of each square area is sampled from a uniform distribution between a minimum limit (L_{min}) and a maximum limit (L_{max}), subject to the condition that the area contains at least a minimum number of elements (N_{e}). Figure 1a shows the elements selected (in blue) for a sample of size L centered on a red point, while Figure 1b shows all the central points (100 in total) analyzed in the validation process.
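The geographical sampling described above could be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: each element is represented by a single point, the central point is drawn uniformly within the bounding box of the network, and squares with fewer than N_{e} elements are simply redrawn. The function name `draw_square_sample`, the `max_tries` parameter and the synthetic coordinates are hypothetical.

```python
import numpy as np


def draw_square_sample(xy, l_min, l_max, n_e, rng, max_tries=1000):
    """Return (center, side, indices) of a random square holding >= n_e elements."""
    xy = np.asarray(xy, dtype=float)
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    for _ in range(max_tries):
        center = rng.uniform(lo, hi)      # random central point (assumed uniform)
        side = rng.uniform(l_min, l_max)  # size L drawn from U(L_min, L_max)
        # An element belongs to the sample if it lies inside the square.
        inside = np.all(np.abs(xy - center) <= side / 2.0, axis=1)
        idx = np.flatnonzero(inside)
        if idx.size >= n_e:               # minimum number of elements N_e
            return center, side, idx
    raise RuntimeError("no square with at least n_e elements was found")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic element coordinates in metres (not real network data).
    elements = rng.uniform(0.0, 10_000.0, size=(5000, 2))
    center, side, idx = draw_square_sample(elements, 500.0, 2000.0, 50, rng)
    print(f"side = {side:.0f} m, elements in sample = {idx.size}")
```

Repeating the draw 100 times would give a set of central points comparable in number to those shown in Figure 1b.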

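The quality of each model is measured by the slope (m) and the R^{2} of the relation between N_{p} and N_{o}. The optimal value for both of them is one.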
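A minimal sketch of how the two indicators with an optimum of one, the slope and R^{2} reported in the results tables, might be computed for a set of samples. Several points are assumptions rather than statements from the text: the line is fitted by ordinary least squares with an intercept, observed failures are regressed on predicted failures, and the "radius" is taken as the Euclidean distance from the optimum point (1, 1). The definition of the radius is not given in this section, so `radius` is a hypothetical helper.

```python
import numpy as np


def fit_quality(n_pred, n_obs):
    """Slope and R^2 of a least-squares line of observed vs. predicted failures."""
    n_pred = np.asarray(n_pred, dtype=float)
    n_obs = np.asarray(n_obs, dtype=float)
    slope, intercept = np.polyfit(n_pred, n_obs, deg=1)
    residuals = n_obs - (slope * n_pred + intercept)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((n_obs - n_obs.mean()) ** 2))
    return float(slope), 1.0 - ss_res / ss_tot


def radius(slope, r2):
    """ASSUMED definition: distance of (slope, R^2) from the optimum (1, 1)."""
    return float(np.hypot(slope - 1.0, r2 - 1.0))


if __name__ == "__main__":
    # Hypothetical predicted and observed failures in five network samples.
    predicted = [4.0, 9.0, 15.0, 22.0, 30.0]
    observed = [5.0, 8.0, 17.0, 21.0, 33.0]
    m, r2 = fit_quality(predicted, observed)
    print(f"slope = {m:.3f}, R^2 = {r2:.3f}, radius = {radius(m, r2):.3f}")
```

Under this reading, a perfect model gives a radius of zero, and the mean and standard deviation of the radius over all validation samples summarize the accuracy and stability of a model.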

The sample size limits (L_{min} and L_{max}) were varied while keeping a constant minimum number of elements N_{e}. To evaluate the effect of the validation period, the available data were split into four one-year periods. Data from one year were used to fit the model, which was then validated on each of the other three one-year periods. Every possible combination was used, for a total of 12 combinations.
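The count of 12 follows from choosing one of four years for fitting and one of the remaining three for validation (4 × 3 = 12). A short sketch of the enumeration, assuming the four years are those listed in Table 1 (2011 to 2014); the names `YEARS` and `COMBINATIONS` are hypothetical.

```python
from itertools import permutations

YEARS = (2011, 2012, 2013, 2014)  # assumed: the four years listed in Table 1

# Ordered pairs (fit_year, validation_year) with fit_year != validation_year.
COMBINATIONS = list(permutations(YEARS, 2))

if __name__ == "__main__":
    for fit_year, validation_year in COMBINATIONS:
        print(f"fit on {fit_year}, validate on {validation_year}")
    print(f"total combinations: {len(COMBINATIONS)}")
```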

## 3. Results

#### 3.1. Zero Order Models

#### 3.2. First Order Models

#### 3.3. Distribution Lines

#### 3.4. Trunk Mains

## 4. Discussion

## 5. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
Diam | Diameter |
GIS | Geographic Information System |
Mat | Material |
Pave | Average pressure |
Pmax | Maximum pressure |
Pmin | Minimum pressure |
Terr | Type of terrain |
Tran | Transient index |
Use | Land use |
Vave | Average velocity |
Vmax | Maximum velocity |
Vmin | Minimum velocity |


**Figure 2.** Global quality figure explanation. Mean radius is represented versus standard deviation for each validation point of each model. Average mean radius is represented by a larger dot of the same color. The legend presents the average values of mean radius obtained for each variable. Order zero mean radius is represented with a brown dashed line; the best model is represented with a blue dashed line. Improvement from order zero to best model is indicated as the distance between both lines.

**Figure 3.** First order model results based on mean radius and standard deviation: (**a**) distribution lines; and (**b**) trunk mains. Individual validation models are represented as small dots and the average value for the model is represented as a large dot (in the same color). The legend presents the numerical values of mean radius obtained for each model. Order zero mean radius is represented by the brown dashed line and the best predictive model is represented as a blue dashed line.

**Figure 4.** Distribution lines. Mean radius and standard deviation for models with all independent variables. Order zero mean radius is represented with brown dotted line; and best predictive model with blue line for: (**a**) order two; (**b**) order three; (**c**) order four; and (**d**) order five models.

**Figure 5.** Distribution lines. Mean radius and standard deviation for models with two joint variables plus independent variables. Order zero mean radius is represented with brown dotted line; and best predictive model with blue line for: (**a**) order two; (**b**) order three; (**c**) order four; and (**d**) order five models.

**Figure 6.** Trunk mains. Mean radius and standard deviation for models with all independent variables. Order zero mean radius is represented with brown dotted line; and best predictive model with blue line for: (**a**) order two; (**b**) order three; (**c**) order four; and (**d**) order five models.

**Figure 7.** Trunk mains. Mean radius and standard deviation for models with two joint variables plus independent variables. Order zero mean radius is represented with brown dotted line; and best predictive model with blue line for: (**a**) order two; (**b**) order three; (**c**) order four; and (**d**) order five models.

**Figure 8.** Distribution lines: mean radius vs. model order. Blue dots represent values obtained for individual models and the dashed red line represents the best models: (**a**) independent variables; and (**b**) two joint variables plus additional independent variables.

**Figure 9.** Trunk mains: mean radius vs. model order. Blue dots represent values obtained for individual models and the dashed red line represents the best models: (**a**) independent variables; and (**b**) two joint variables plus additional independent variables.

Category | Number of Components | Events 2011 | Events 2012 | Events 2013 | Events 2014 |
---|---|---|---|---|---|
Distribution lines | 373,113 (14,176 km) | 1758 | 1773 | 1970 | 2237 |
Trunk mains | 39,915 (3297 km) | 94 | 78 | 74 | 79 |

Type of Variable | Distribution Lines | Trunk Mains |
---|---|---|
Physical | Diameter | Diameter |
Physical | Installation year | Installation year |
Physical | Material | Material |
Environmental | Terrain | - |
Environmental | Land use | Land use |
Environmental | Depth | Depth |
Internal | Maximum pressure | Maximum pressure |
Internal | Average pressure | Average pressure |
Internal | Minimum pressure | Minimum pressure |
Internal | Maximum velocity | Maximum velocity |
Internal | Average velocity | Average velocity |
Internal | Minimum velocity | Minimum velocity |
Internal | Transient index | Transient index |

Category | Mean Slope (m) | Mean R^{2} (r) | Mean Radius (M) | Standard Dev. Radius (SD) |
---|---|---|---|---|
Distribution lines | 1.107 | 0.975 | 0.157 | 0.125 |
Trunk mains | 0.991 | 0.981 | 0.110 | 0.078 |

**Table 4.** One variable models: predictive capacity for distribution lines (DL, left) and trunk mains (TM, right) networks.

Variable | Mean Slope (DL) | Mean R^{2} (DL) | Mean Radius (DL) | Std. Dev. (DL) | Mean Slope (TM) | Mean R^{2} (TM) | Mean Radius (TM) | Std. Dev. (TM) |
---|---|---|---|---|---|---|---|---|
Diameter | 1.021 | 0.971 | 0.075 | 0.053 | 0.948 | 0.977 | 0.073 | 0.063 |
Material | 1.030 | 0.980 | 0.085 | 0.066 | 1.000 | 0.980 | 0.073 | 0.056 |
Year | 1.049 | 0.980 | 0.097 | 0.084 | 0.971 | 0.979 | 0.086 | 0.059 |
Land Use | 1.099 | 0.975 | 0.149 | 0.119 | 1.015 | 0.980 | 0.093 | 0.078 |
Depth | 1.100 | 0.976 | 0.151 | 0.122 | 0.988 | 0.981 | 0.095 | 0.068 |
P_{max} | 1.112 | 0.975 | 0.153 | 0.127 | 1.010 | 0.981 | 0.096 | 0.078 |
P_{ave} | 1.112 | 0.975 | 0.153 | 0.127 | 1.006 | 0.981 | 0.098 | 0.078 |
P_{min} | 1.111 | 0.975 | 0.154 | 0.127 | 1.003 | 0.981 | 0.100 | 0.078 |
Terrain | 1.108 | 0.975 | 0.156 | 0.125 | - | - | - | - |
V_{max} | 1.108 | 0.975 | 0.156 | 0.125 | 0.987 | 0.980 | 0.102 | 0.073 |
V_{ave} | 1.108 | 0.975 | 0.156 | 0.125 | 0.983 | 0.981 | 0.100 | 0.078 |
V_{min} | 1.107 | 0.975 | 0.157 | 0.125 | 0.995 | 0.981 | 0.107 | 0.078 |
Tran | 1.108 | 0.975 | 0.157 | 0.126 | 0.993 | 0.980 | 0.103 | 0.074 |

**Table 5.** Best predictive models: Distribution lines. Values in parentheses represent the number of intervals considered in variable discretization (F stands for full resolution).

Category | Order | Joint Variables | Independent Variables | Mean Radius |
---|---|---|---|---|
Independent Variables | 2 | - | Diam (8)–Terr (4) | 0.076 |
Independent Variables | 3 | - | Diam (8)–Terr (4)–Pmin (F) | 0.065 |
Independent Variables | 4 | - | Diam (8)–Terr (4)–Pmin (F)–Vmax (6) | 0.064 |
Independent Variables | 5 | - | Diam (8)–Terr (4)–Pmin (F)–Vmax (6)–Vmin (6) | 0.064 |
Joint Variables | 2 | Diam (6)–Tran (8) | - | 0.075 |
Joint Variables | 3 | Diam (2)–Tran (7) | Pmin (F) | 0.060 |
Joint Variables | 4 | Diam (8)–Year (4) | Pave (F)–Depth (3) | 0.052 |
Joint Variables | 5 | Diam (2)–Mat (7) | Depth (5)–Vave (9)–Vmax (9) | 0.051 |

**Table 6.** Best predictive models: Trunk mains. Values in parentheses represent the number of intervals considered in variable discretization (F stands for full resolution).

Category | Order | Joint Variables | Independent Variables | Mean Radius |
---|---|---|---|---|
Independent Variables | 2 | - | Diam (10)–Mat (3) | 0.035 |
Independent Variables | 3 | - | Diam (10)–Mat (3)–Use (6) | 0.028 |
Independent Variables | 4 | - | Diam (10)–Mat (3)–Use (6)–Depth (F) | 0.033 |
Independent Variables | 5 | - | Diam (10)–Mat (3)–Vave (F)–Vmax (2)–Vmin (2) | 0.047 |
Joint Variables | 2 | Mat (11)–Depth (10) | - | 0.061 |
Joint Variables | 3 | Mat (11)–Pmin (F) | Diam (3) | 0.031 |
Joint Variables | 4 | Diam (F)–Vmin (F) | Mat (2)–Vmax (9) | 0.033 |
Joint Variables | 5 | Mat (11)–Pmax (6) | Diam (3)–Vmax (3)–Vmin (3) | 0.031 |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Gómez-Martínez, P.; Cubillo, F.; Martín-Carrasco, F.J.; Garrote, L. Statistical Dependence of Pipe Breaks on Explanatory Variables. *Water* **2017**, *9*, 158.
https://doi.org/10.3390/w9030158
