Leak detection techniques based on machine learning (ML) models can assist or even replace manual leak detection in water distribution systems (WDSs). However, studies of leak detection based on on-site leak signals are limited compared with studies of lab-scale leak detection. On-site leak signals are subject to stronger interference and randomness, whereas laboratory leak signals are relatively simple. To better support on-site leak detection, this paper develops and compares three ML-based models: a decision tree, a random forest, and AdaBoost. For this purpose, many on-site tests were carried out, and tens of thousands of sets of on-site leak detection signals were collected. More than 6000 sets of these signals were labeled, and their features were extracted and analyzed statistically. Features such as the main frequency, the spectral roll-off rate, the spectral flatness, and the one-dimensional (1-D) Mel-frequency cepstral coefficient (MFCC) were found to distinguish leakage signals from non-leakage signals well. The decision tree, random forest, and AdaBoost models were then trained, and their performances were compared in detail. The false positive rates of the three models were 9.80%, 8.27%, and 7.35%, all below 10%, with the AdaBoost model achieving the lowest rate of 7.35%. The recall rates of the random forest and AdaBoost models were 100% and 99.52%, respectively.
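The two metrics reported above can be made concrete with a short sketch. It assumes the standard definitions, which the abstract does not spell out: false positive rate = FP / (FP + TN) and recall = TP / (TP + FN), with leak as the positive class. The function names and the toy labels are hypothetical and are not taken from the study's data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating 1 as 'leak' and 0 as 'no leak'."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1          # leak correctly flagged
        elif not truth and pred:
            fp += 1          # false alarm on a non-leak signal
        elif not truth and not pred:
            tn += 1          # non-leak correctly passed
        else:
            fn += 1          # missed leak
    return tp, fp, tn, fn


def false_positive_rate(y_true, y_pred):
    """Fraction of non-leak signals wrongly classified as leaks."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return fp / (fp + tn) if (fp + tn) else 0.0


def recall(y_true, y_pred):
    """Fraction of true leak signals that the classifier detects."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels (hypothetical): 4 leak signals and 6 non-leak signals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(f"FPR = {false_positive_rate(y_true, y_pred):.4f}")
print(f"Recall = {recall(y_true, y_pred):.4f}")
```

Under these definitions, a recall of 100% means no true leak was missed, while a false positive rate below 10% means fewer than one in ten non-leak signals raised a false alarm.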
This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.