Statistical and Machine Learning models play a critical role in planning and decision support for the efficient and reliable management of Water Distribution Networks (WDNs). Failure models can provide valuable information for prioritizing system rehabilitation even where data are scarce, as is often the case in developing countries. Few studies have compared the performance of more than two models, and case studies from developing countries remain scarce. This study compares several statistical and Machine Learning models to help practitioners select a suitable pipe failure model according to information availability and network characteristics. Three statistical models (i.e., Linear, Poisson, and Evolutionary Polynomial Regressions) were used to predict failures in groups of pipes. Machine Learning approaches, namely Gradient-Boosted Tree (GBT), Bayes, Support Vector Machines, and Artificial Neural Networks (ANNs), were compared in predicting failures of individual pipes. The proposed approach was applied to a WDN in Bogotá (Colombia). The statistical models showed acceptable performance (R² between 0.695 and 0.927), but Poisson Regression was the most suitable for predicting failures in pipes with lower failure rates. Among the Machine Learning models, Bayes and ANNs performed poorly in predicting pipe failure condition, while the GBT approach yielded the best-performing classifier.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.