1. Introduction
Water Distribution Networks (WDNs) constitute a critical infrastructure in an increasingly urbanized world. They comprise massive systems of pressurized pipelines that transport potable water to millions of households across the globe. However, a considerable amount of the transmitted water is lost, mainly due to leakages. Indeed, the deteriorating state of aging water infrastructure presents a significant challenge that exacerbates the leakage problem in WDNs. Leakages in pressurized water pipelines lead to considerable economic losses, environmental degradation, and potential health hazards [
1]. The situation is particularly alarming as leaks can remain undetected for extended periods, resulting in the wastage of valuable potable water, which is becoming increasingly scarce. A concrete example of this problem can be seen in Hong Kong, one of the most urbanized places globally. Between 2004 and 2015, the cost of non-revenue water losses was estimated at HKD 17 billion (around USD 2.2 billion) [
2]. In 2015, approximately 321 million m³ of transported potable water (33% of the supplied freshwater) was lost, and leaks in pressurized water pipelines were identified as the primary cause of water loss [
2]. This marks a substantial rise from the water loss rates of 26.5% in 2010 and 31.6% in 2013. Recent assessments highlight that Hong Kong’s water loss situation remains a critical concern. More specifically, coverage from the Centre for Water Research and Resource Management (CWRRR) indicates that in 2023–2024, metered water loss rates in Hong Kong reached a record high of 38.3%, surpassing the earlier cited 33% benchmark from 2015 [
3]. Notably, almost half of the annual water loss was attributed to leaks and bursts in government mains alone, with a leakage rate estimated at about 15% in 2018 [
2,
4]. These trends underscore a persistent or even worsening challenge, contradicting earlier improvements and indicating that, despite infrastructure upgrades, water loss remains a major issue. To address this issue, in 2008 the Water Supplies Department (WSD) of Hong Kong established a goal under the total water management strategy to save 85 million m³ of water per year by 2030. As of 2024, the WSD reports a 13.4% leakage rate in government mains, with the ongoing rollout expected to bring the rate below 10% by 2030 [
5]. Estimated fresh-water leakage volumes for government mains were approximately 121 million m³ in 2024, increasing gradually from 97 million m³ in 2020 [
6]. With that said, early and effective leak detection is vital to prevent water waste, minimize economic impact, and ensure a continuous supply of clean water to communities. Hence, robust detection methods are vital to identify leaks of varying magnitudes in WDNs [
1,
7]. The conventional methods for leak detection in water pipelines primarily rely on periodic visual inspections or pressure measurements, which are labor-intensive, time-consuming, and often prone to errors [
8]. Moreover, these methods may not effectively detect small or intermittent leaks, which can gradually grow into larger ones [
9]. To overcome these limitations, researchers have increasingly turned to a variety of sensing technologies, such as acoustic [
9], vibration [
10], ultrasonic [
11], infrared thermography [
12], fiber-optic sensors [
13], and ground-penetrating radar [
14]. Acoustic-based techniques have emerged as an outstanding means for leak detection in WDNs, as leakage involves the generation of flow-induced acoustic signals that manifest as trans-tube stress waves [
15,
16].
Recently, acoustic-based techniques for leak management have been gaining increasing attention. However, most studies were based on small databases from laboratories and testbed facilities [
17]. Multiple studies have investigated the effects of various factors on leak-induced acoustic signals, including leak flow rate [
18], pipe flow pressure [
19], pipe diameter [
20,
21], pipe material [
22], pipe surrounding media [
17], and leak shape and size [
18]. These studies are essential to advance the understanding of acoustic signal phenomena in leaking pipes. However, laboratory and testbed conditions do not closely resemble real WDNs, which are complex and varied. Thus, laboratory and testbed experiment results are insufficient to develop robust leak detection and pinpointing models. Notably, an increasing number of studies have investigated the use of acoustic sensing technologies for leak detection in WDNs in recent years. The investigated technologies include Noise Loggers [
15], MEMS (Micro-Electro-Mechanical Systems) accelerometers [
10], and hydrophones [
23]. Among these, Noise Logger-based monitoring systems have emerged as a promising solution due to their simplicity of deployment, cost-effectiveness, and ability to capture subtle changes in pipeline behavior [
15]. The use of such technologies is often coupled with the employment of machine learning algorithms.
Machine learning, a subfield of artificial intelligence, has revolutionized various industries by enabling computers to learn from data and make predictions or decisions without explicit programming [
24,
25]. In the context of leak detection, machine-learning models can analyze large datasets of acoustic signals and identify patterns associated with leak conditions [
15]. ML models can automatically learn from vast acoustic datasets, extracting subtle features that traditional rule-based analyses might miss. For example, by learning complex patterns in noisy data, ML can detect small leaks or adapt to varying background noise levels without manual tuning. Thus, ML offers a powerful approach that overcomes the limitations of static threshold or manual signal analysis. A variety of machine learning techniques have been used to advance leak detection, such as Support Vector Machines (SVMs), K-Nearest Neighbors (KNN), and Decision Tree (DT) [
26]. The use of machine learning models offers several advantages over traditional methods, including the ability to process complex data, adapt to changing conditions, and detect anomalies more effectively [
27]. Another group of research studies adopted deep learning models in their work. Ullah et al. [
28] deployed an integrated model of a 1D convolutional neural network and long short-term memory (LSTM) network for pipe leak detection. The Savitzky–Golay filter was adopted to minimize noise and retain features of the original signals. In another attempt, Siddique et al. [
29] adopted convolutional autoencoders and a convolutional neural network to retrieve features from leak-augmented scalograms. The obtained feature map was then passed to an artificial neural network (ANN) with one hidden layer to diagnose the health status of pipelines. Thirdly, Ullah et al. [
27] leveraged the use of continuous wavelet transform alongside enhanced short-time Fourier transform to identify the temporal and spatial characteristics of acoustic emission signals. ANN was then applied to classify the retrieved feature vector into leak and normal states. Fourthly, Ullah et al. [
30] presented a time series-based deep learning framework for leak detection. They experimented with the use of gated recurrent units (GRUs), LSTM, and bi-directional LSTM to differentiate between the acoustic emission signals. It was found that bi-LSTM was able to provide superior prediction performance in terms of precision and recall. In response to the discussed challenges, this study proposes a real-time monitoring system that harnesses the potential of wireless Noise Loggers and machine learning models for efficient leak detection in water pipelines. This study investigated multiple machine-learning models, including KNN, SVM, DT, Random Forest (RF), Naïve Bayes (NB), Logistic Regression (LogR), and Multi-Layer Perceptron (MLP), to evaluate their suitability in detecting leaks in water pipelines. Knowing that each model has unique strengths and weaknesses, including bias, variance, and overfitting problems, this study employed an ensemble approach. Ensemble methods have shown considerable success across various domains by capitalizing on the diversity of individual models and aggregating their predictions to make more reliable decisions, notably in the context of leak detection [
31,
32]. Therefore, this article proposes an ensemble approach that selects the most accurate and robust combination of models to improve leak detection accuracy [
33].
The significance of our study lies in its potential impact on water infrastructure management and conservation efforts. The successful implementation of a real-time leak detection system utilizing wireless Noise Loggers and machine learning models can empower municipalities to promptly detect leaks, enabling faster intervention and reduced water loss. This not only preserves valuable resources but also fosters sustainable water management practices and environmental preservation. This article aims to develop robust machine learning models that can effectively detect leaks in real WDNs, to contribute a step towards improving and automating leak management in WDNs.
2. Materials and Methods
This section details the methodology employed for the analysis of acoustic signals and the development of machine learning-based leak detection models. As illustrated in
Figure 1, the methodology encompasses multiple phases of data collection, data preprocessing, feature extraction, data visualization, statistical analysis, and development of machine learning models. The subsequent presentation covers the results obtained, followed by a discussion of limitations and potential avenues for future research.
The adopted experimental setup involved the deployment of wireless Noise Loggers on the exterior of valves connecting water pipelines throughout the city of Hong Kong. The acoustic signals captured by these Noise Loggers were processed to extract relevant features. Subsequently, several machine learning models were employed, including SVM, RF, NB, KNN, DT, LogR, and MLP, to classify the acoustic signals and identify leak states by means of a supervised learning approach, trained on manually labeled leak and non-leak signal data, with ground truth established via inspection or hydrophone confirmation. The classifiers (SVM, RF, etc.) were trained to discriminate between the two predefined classes. The ensemble model was constructed to leverage the diverse strengths of individual classifiers, leading to superior leak detection accuracy. The illustrated methodology in
Figure 1 is further expanded upon in the following subsections.
2.1. Data Collection and Preprocessing
The data collection process involved the acquisition of acoustic signals using high-quality recording devices. Efforts were made to ensure consistent sampling rates and formats across all collected signals. First, the raw audio signals were captured using condenser microphones with a 44.1 kHz sampling rate and saved in WAV (WAVeform Audio File) format. Data were collected at multiple valve locations in Hong Kong. Leak signals were identified by acoustic noise loggers and verified manually using inspections or hydrophones, and non-leak signals were recorded from similar pipes under normal operation. The dataset includes 992 leak samples and 1118 no-leak samples (a slightly unbalanced but representative set). Each signal is labeled by confirming the presence/absence of a leak via inspection. To avoid geographic bias, leak and non-leak recordings were obtained from similar sites and times. Also, both classes included signals from all regions of the study area.
The adopted data collection system is depicted in
Figure 2. As shown in
Figure 2, the system includes a Noise Logger deployed on-site, accompanied by a cloud service responsible for receiving and storing data. These data can be accessed remotely from an office desk. The experiment involved the use of PermaNET+ Noise Loggers. The sensors were deployed in the chamber where the gate valve was located to collect vibration signals. The PermaNET+ device (manufactured by HWM-Water Ltd., Ty Coch House, United Kingdom) features a built-in Hydrophone 2 acoustic sensor paired with a leak-noise logger, offering effective monitoring from approximately 20 Hz up to 5 kHz. Its sensitivity is around −50 dB re 1 Pa, typical for high-performance leak detectors designed for trunk mains and plastic pipelines. The Hydrophone 2 excels at capturing low-frequency leak noise over large or non-metallic pipes. The Noise Loggers have an integral modem supporting GPRS/2G and 4G technologies, enabling data transmission to the cloud. Each collected acoustic signal has a length of 10 s. Four readings were collected at each deployment at 15 min intervals. The sampling frequency was set at 4096 Hz.
Subsequently, the collected data were preprocessed: noise reduction was applied through spectral subtraction, and resampling was conducted to ensure uniform sampling rates. The Librosa library [
34] was utilized to load audio signals. The acquired signals in this study were collected from metallic pipes (such as steel, stainless steel, cast iron, and ductile iron) and non-metallic pipes (such as polyethylene and unplasticized polyvinyl chloride). In addition, the pipe diameters varied between 40 and 300 mm.
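The noise-reduction step can be illustrated with a minimal spectral-subtraction sketch. It assumes that the noise spectrum can be estimated from the first few frames of a recording; the frame length, the number of noise frames, the spectral floor, and the function name `spectral_subtraction` are illustrative choices rather than the settings used in this study. NumPy is used instead of Librosa only to keep the example self-contained, and the input is a synthetic signal.

```python
import numpy as np

def spectral_subtraction(signal, frame_len=512, noise_frames=8, floor=0.02):
    """Subtract an estimated noise magnitude spectrum from each frame.

    Assumption: the first `noise_frames` frames contain background noise only.
    """
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by 50% overlap sum to one.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    magnitude, phase = np.abs(spectra), np.angle(spectra)
    # Average magnitude of the leading frames serves as the noise estimate.
    noise_mag = magnitude[:noise_frames].mean(axis=0)
    # Subtract the estimate, keeping a small spectral floor to avoid negatives.
    cleaned_mag = np.maximum(magnitude - noise_mag, floor * magnitude)
    cleaned = np.fft.irfft(cleaned_mag * np.exp(1j * phase), n=frame_len, axis=1)
    # Overlap-add the processed frames back into a time-domain signal.
    out = np.zeros(hop * (n_frames - 1) + frame_len)
    for i, frame in enumerate(cleaned):
        out[i * hop:i * hop + frame_len] += frame
    return out

# Synthetic 10 s recording at 4096 Hz: noise only during the first second,
# then noise plus a 300 Hz tone standing in for a leak-like component.
sr = 4096
rng = np.random.default_rng(0)
t = np.arange(10 * sr) / sr
noisy = 0.1 * rng.standard_normal(t.size)
noisy[sr:] += np.sin(2 * np.pi * 300 * t[sr:])
denoised = spectral_subtraction(noisy)
```

In this sketch the phase of each frame is left untouched and only the magnitude is reduced, which is the simplest form of the technique; more elaborate variants update the noise estimate over time.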
Figure 3 illustrates a common deployment location for noise loggers within water distribution networks of Hong Kong. The red arrows in
Figure 3a indicate the positions of the noise loggers mounted on the pipe valves to capture leakage noise signals. Moreover,
Figure 3b shows a real-world leak site where water is observed within the access chamber.
2.2. Feature Extraction
Mixed time domain and frequency domain features are highly efficacious in deriving actual information about leaks and diminishing false alarms [
35]. In addition, another group of studies implemented statistical features of acoustic signals in leakage diagnosis [
36,
37]. With that said, multiple techniques were employed to extract meaningful features from the acoustic signals. Short-Time Fourier Transform (STFT) was computed to convert signals into spectrograms, providing insights into time-frequency representations. Mel-frequency cepstral coefficients (MFCCs) were calculated, along with spectral contrast and chroma features. Statistical features encompassed mean, variance, skewness, and kurtosis. Additionally, time-domain features like zero-crossing rate and root mean square energy were extracted.
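The statistical and time-domain descriptors listed above can be sketched as follows. This is a minimal illustration on a synthetic tone, not the feature pipeline of this study; the function name `time_domain_features` and the use of excess kurtosis are assumptions.

```python
import numpy as np

def time_domain_features(x):
    """Return statistical and time-domain descriptors of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centered = x - mean
    variance = np.mean(centered ** 2)
    std = np.sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "skewness": np.mean(centered ** 3) / std ** 3,
        # Excess kurtosis (zero for a Gaussian); an assumed convention.
        "kurtosis": np.mean(centered ** 4) / std ** 4 - 3.0,
        # Fraction of adjacent sample pairs whose signs differ.
        "zero_crossing_rate": np.mean(np.signbit(x[:-1]) != np.signbit(x[1:])),
        "rms_energy": np.sqrt(np.mean(x ** 2)),
    }

# One second of a 128 Hz tone sampled at 4096 Hz (synthetic input).
sr = 4096
t = np.arange(sr) / sr
features = time_domain_features(np.sin(2 * np.pi * 128 * t + 0.3))
```

The spectral descriptors (MFCCs, spectral contrast, chroma) are available in Librosa as `librosa.feature.mfcc`, `librosa.feature.spectral_contrast`, and `librosa.feature.chroma_stft`; the parameters used for them in this study are not stated in this section.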
2.3. Data Visualization
Multiple visualization techniques were employed to present the characteristics of the acoustic signals. Spectrograms were generated using STFT to visualize signal energy over time and frequency. Waveform plots depicted the raw signal’s amplitude variations. MFCCs were displayed as heatmaps to convey audio feature nuances. Scatter plots were utilized to showcase the distributions of extracted features. The analysis of STFT-based derived features (MFCC, spectral contrast, etc.) showed clear separability between leak and non-leak signals via clustering and visualization. Accordingly, MFCC-based envelopes captured distinct spectral energy patterns in leak signals, justifying their use. Quantitative metrics (e.g., higher average energy in leak MFCC bands) supported this decision and motivated further modeling using time–frequency features.
2.4. Machine Learning Models
The proposed model is an innovative approach that leverages wireless Noise Loggers to detect leaks in pressurized water pipelines. The Noise Loggers capture acoustic signals from the pipelines, which are then processed as input features for machine learning models. This study evaluates several machine-learning algorithms, including SVM, RF, NB, KNN, DT, LogR, and MLP. Each algorithm analyzes the acoustic signals and classifies them into leak or non-leak states. The models were trained using the extracted features on training sets. Hyperparameter tuning was performed via grid search. Performance was evaluated using accuracy, precision, recall, F1-score, and confusion matrices. Ultimately, the models were compared based on their evaluation metrics, allowing for an understanding of their capabilities in varying scenarios. The hyperparameters used for the selected models are found in
Table A1 in
Appendix A.
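The grid-search step can be sketched with scikit-learn as shown below. The library choice, the synthetic feature matrix, the feature scaling, and the parameter grid are all assumptions made for illustration; the tuned values of this study are those reported in Table A1.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the extracted acoustic feature matrix (not the study's data).
X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           class_sep=1.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Scaling is placed inside the pipeline so each CV fold is scaled independently.
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier())

# Hypothetical grid, chosen only to demonstrate the mechanism.
param_grid = {
    "kneighborsclassifier__n_neighbors": [3, 5, 7, 9],
    "kneighborsclassifier__weights": ["uniform", "distance"],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
```

The held-out test split is used only after the search has finished, so the reported accuracy is not influenced by the hyperparameter selection.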
A significant differentiator of this study is the development of ensemble machine learning using the best-performing individual machine learning models. The ensemble approach combines the predictions of multiple machine learning models to achieve enhanced accuracy and robustness. By leveraging the diversity of individual classifiers, the ensemble model can capture various aspects of the acoustic signals, leading to improved detection performance.
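One possible realization of such an ensemble is a soft-voting classifier, sketched here with scikit-learn. The combination rule is not specified in this section, so soft voting, the choice of RF, KNN, and MLP as members, the hyperparameters, and the synthetic data are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for leak / non-leak feature vectors (not the study's data).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           class_sep=1.5, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)

# Soft voting averages the class probabilities predicted by each member.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("knn", make_pipeline(StandardScaler(),
                              KNeighborsClassifier(n_neighbors=5))),
        ("mlp", make_pipeline(StandardScaler(),
                              MLPClassifier(hidden_layer_sizes=(64,),
                                            max_iter=1000, random_state=0))),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
ensemble_accuracy = ensemble.score(X_test, y_test)
```

Distance-based and gradient-based members are wrapped with a scaler because they are sensitive to feature magnitudes, whereas the tree-based member is not.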
Also, a newly developed deep learning model known as YamNet was used to develop a leak detection model based on the acoustic data. YamNet analyzes sound patterns to identify leaks in water pipelines. It operates by capturing acoustic signals from the pipelines and processing them using pre-trained deep learning models [
38]. The system uses a large-scale convolutional neural network (CNN) to classify the acoustic signals into leak and non-leak categories. While YamNet offers promise in detecting leaks based on acoustic signatures, it faces certain challenges. Acoustic-based methods may be susceptible to environmental noise interference, which can affect the accuracy of leak detection. Additionally, accurately distinguishing leak-related sounds from other sources of noise can be challenging, especially in noisy urban environments. It is critical to note that the publicly available YamNet model (pre-trained on AudioSet) was used without further fine-tuning on the collected data. Furthermore, the YamNet model was assessed using the same dataset; no separate field data were used in this experiment. In future work, it is recommended to fine-tune YamNet on the noise-logger dataset for improved performance. The current results thus reflect the baseline performance of YamNet’s general sound classification on leak data.
2.5. Results and Interpretation
Finally, the results were presented and compared. Classification accuracy and evaluation metrics were provided for each model. Confusion matrices were used to derive accuracy, recall, precision, and F1 score for the initial models. Receiver Operating Characteristic (ROC) curves were plotted, and Area Under the Curve (AUC) and Jaccard Cross-Validation values were calculated for the binary classification tasks. The obtained results were interpreted with respect to the research objectives, and insights into classification performance and potential real-world applications were discussed.
The accuracy was computed as (TP + TN)/(TP + TN + FP + FN), as illustrated in Equation (1), where TP, TN, FP, and FN are the true positive, true negative, false positive, and false negative counts. This measures overall classification correctness. Additionally, Equations (2)–(5) provide the equations for the remaining performance metrics.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
Measures the overall proportion of correct predictions.
$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$
Indicates the model’s ability to correctly detect actual leak events.
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{3}$$
Reflects the ability to correctly identify non-leak conditions.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$
Indicates the proportion of predicted leak cases that were actually leaks.
$$\text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5}$$
Balances precision and recall, giving an overall sense of model reliability.
These metrics were calculated for each classifier to ensure a comprehensive evaluation of leak detection capability. In addition to quantitative metrics, SHAP (SHapley Additive Explanations) analysis (shown in the following Figures) was used to interpret model behavior by identifying the most influential acoustic features driving predictions. Together, these metrics and explainability tools provide both performance and transparency for the proposed models.
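As a minimal sketch, the five metrics described above can be computed from confusion-matrix counts as follows. The counts in the example are hypothetical and chosen only for illustration; the function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, specificity, precision, and F1 from counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)          # share of actual leaks that were detected
    specificity = tn / (tn + fp)     # share of non-leaks correctly identified
    precision = tp / (tp + fp)       # share of predicted leaks that were real
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts, not taken from the study's results.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

With these illustrative counts, the accuracy is 185/200 = 0.925 and the recall is 90/100 = 0.90.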
3. Model Implementation
This section presents the performance evaluation of the different machine learning models in detecting leak and non-leak states. Comparing the results provides insights into the suitability of each model for leak detection in water pipelines.
Table 1 compares the performance of seven machine learning models across four performance metrics. Also,
Table 1 presents the ranking of the machine learning models based on their performance. The accuracy of the model was used as the main metric for ranking. From
Table 1, it is evident that the RF model achieved the highest accuracy of 93.68%, making it the top-performing model among all the tested machine learning algorithms. RF’s ensemble nature and ability to effectively handle large and highly dimensional datasets and feature interactions contributed to its superior performance. Furthermore, RF is distinguished by its robustness against noise and outliers. The KNN model closely followed RF with an accuracy of 93.40%. KNN’s adaptability to local patterns and its capability to perform well on non-linear data distributions made it a strong candidate for leak detection. The MLP achieved an accuracy of 92.15%, demonstrating the efficacy of deep learning approaches in analyzing acoustic signals and detecting leaks in water pipelines. These results highlight the significance of ensemble methods like Random Forest and the potential of deep learning models like Multi-Layer Perceptron.
The three best-performing models were selected to develop the ensemble model.
Table 2 presents the results of the ensemble model, which combined the predictions of multiple machine learning models. The ensemble model outperformed individual models and reinforced the value of combining different perspectives to improve detection accuracy. These findings underscore the importance of leveraging modern techniques such as ensemble models to enhance leak detection capabilities, contributing to sustainable water management practices.
The results of the proposed model and YamNet are significantly different, as seen from the accuracy metrics obtained in their respective experiments. The proposed model outperformed YamNet, which achieved only 52.63% accuracy. This substantial difference in accuracy demonstrates the superiority of the proposed model in identifying leaks in water pipelines. In addition, the low performance of YamNet can be attributed to its struggle in the detection of subtle and small leaks due to their weak acoustic signatures [
39]. Furthermore, YamNet was trained based on controlled audio environments; thus, it lacks the adaptability to deal with real-world variabilities in pipe materials, pressure fluctuations, and background noise, leading to inconsistent performance [
40]. As such, YamNet requires specialized training and fine-tuning to optimize its architecture so that it can be adopted in a domain-specific application like leak detection of water pipes.
The precision values for the Ensemble Model and YamNet were 89.78% and 41.7%, respectively. This discrepancy emphasizes the heightened ability of the Ensemble Model to accurately identify positive instances, showcasing a more refined and precise discrimination in comparison to YamNet.
In terms of recall, the Ensemble Model exhibited a notably higher value of 92.66% compared to YamNet’s 8.77%. This distinction underscores the Ensemble Model’s proficiency in capturing a substantial proportion of actual positive instances, reinforcing its efficacy in the identification of leaks within water pipelines. The exceptionally low recall of YamNet (8.77%) suggests it missed most actual leak cases, whereas the ensemble model achieved a high recall (92.66%). This indicates that ensemble training captured leak-specific frequency patterns, while YamNet likely misinterpreted them. MFCCs (Mel–Frequency Cepstral Coefficients) quantify the slow-changing spectral envelope of an audio signal in the mel-scaled frequency domain. MFCCs reflect harmonic and resonance characteristics of sound, making them ideal for detecting structured acoustic leak signatures. High MFCC coefficients (especially MFCC1–10) correspond to strong harmonic energy typical of pressurized leak flows.
The F1 Score, a harmonic mean of precision and recall, further accentuates the Ensemble Model’s superiority. With an F1 Score of 91.00%, the Ensemble Model achieves a more balanced trade-off between precision and recall, indicative of a more robust and reliable predictive model. In contrast, YamNet’s F1 Score of 14.49% suggests a less harmonized performance in balancing precision and recall.
Figure 4 depicts the Receiver Operating Characteristic (ROC) curve of the developed ensemble model. The ROC curve graphically exemplifies the trade-off between true positive rate and false positive rate. In addition, Area Under the Curve (AUC) signifies an aggregated measurement of the performance of the classification model. The AUC for the ROC curve is notably higher for the Ensemble Model (98.00%) in comparison to YamNet (49.45%). This substantial difference signifies the Ensemble Model’s superior discriminatory ability, crucial for distinguishing between positive and negative instances in the context of leak detection.
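The construction of a ROC curve and its AUC can be sketched in a few lines: the decision threshold is swept over the predicted scores, and the area is obtained with the trapezoidal rule. The labels and scores below are a toy example, not outputs of the models in this study, and the function names are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy example: 1 = leak, 0 = non-leak; scores are predicted leak probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
area = auc(points)
```

An AUC near 0.5 corresponds to a classifier that ranks leak and non-leak samples no better than chance, which is the regime of the YamNet value reported above.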
Examining the Jaccard index, the Ensemble Model once again outshines YamNet with a value of 89.31% compared to 23.53%. This emphasizes the Ensemble Model’s capacity for an enhanced intersection over union, indicative of a more substantial overlap between predicted and actual positive instances.
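For a binary classifier, the Jaccard index of the positive (leak) class reduces to TP/(TP + FP + FN). A minimal sketch on toy labels follows; the function name and the data are illustrative assumptions.

```python
def jaccard_index(y_true, y_pred, positive=1):
    """Intersection over union of predicted and actual positive instances."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fp + fn)

# Toy labels: two leaks found, one leak missed, one false alarm.
y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = jaccard_index(y_true, y_pred)
```

True negatives do not enter the index, so it is stricter than accuracy when the positive class is the one of interest.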
The key difference between the proposed model and YamNet lies in their underlying analysis mechanisms. While the ensembled machine learning models utilize the extracted and selected features to classify acoustic signals captured by Noise Loggers, YamNet relies directly on the acoustic signals for leak detection. The use of Noise Loggers in the proposed model offers advantages in terms of ease of deployment, cost-effectiveness, and sensitivity to subtle changes in pipeline behavior. It also reduces susceptibility to environmental noise, leading to more accurate leak detection in various environments. SHAP analysis was applied to identify the significant factors influencing leak detection (see
Figure 5).
Figure 5 displays grouped SHAP values for the ensemble model, highlighting the relative importance of the four main acoustic feature types: MFCC, Spectral Contrast, Tonnetz, and Chroma. The MFCC group shows the widest range of SHAP values, approximately from −0.04 to +0.04, indicating that these features exert the strongest influence on the model’s predictions. In contrast, Spectral Contrast ranges from −0.035 to +0.035, Tonnetz from −0.03 to +0.035, and Chroma has the smallest impact, ranging between −0.028 and +0.03. The color coding (from blue to red) represents the original feature values, and for MFCCs in particular, higher values (red dots) tend to correspond with positive SHAP values, meaning they push the prediction toward the leak class. This aligns with known acoustic leak signatures, where elevated MFCC components reflect the structured harmonic patterns of leak-induced flow. Overall,
Figure 5 confirms that the ensemble model relies most heavily on MFCC-derived time–frequency features, reinforcing their diagnostic value in distinguishing leak events from normal pipeline conditions. Unlike YamNet, which was pre-trained on audio data dominated by general ambient sounds and used here without fine-tuning, the developed model focuses specifically on leak acoustic signatures. It extracts domain-specific features (e.g., MFCCs capturing low-frequency harmonics of leaks) and is trained on labeled leak and non-leak samples, including samples recorded under environmental noise. As the SHAP analysis confirms, leak-specific components such as MFCC1–10 heavily influence the model’s decisions, while ambient noise-related features carry low SHAP influence. This targeted feature selection and supervised training explain the superior performance and robustness of this article’s model compared to YamNet.
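One way to obtain group-level attributions from per-feature SHAP values is to sum the values of all features belonging to the same family, which follows from the additivity of SHAP. The sketch below assumes this aggregation rule, a prefix-based feature naming scheme, and a small hypothetical SHAP matrix; how the grouping was performed in this study is not stated in this section.

```python
import numpy as np

def group_shap_values(shap_values, feature_names, groups):
    """Sum per-feature SHAP values within each feature family.

    Relies on additivity: a group's attribution is the sum of its members'.
    """
    grouped = {}
    for group in groups:
        columns = [i for i, name in enumerate(feature_names)
                   if name.startswith(group)]
        grouped[group] = shap_values[:, columns].sum(axis=1)
    return grouped

# Hypothetical SHAP matrix: 3 samples x 5 features (not values from the study).
feature_names = ["mfcc_1", "mfcc_2", "contrast_1", "tonnetz_1", "chroma_1"]
shap_values = np.array([
    [0.02, 0.01, -0.01, 0.00, 0.005],
    [-0.03, 0.00, 0.02, 0.01, -0.005],
    [0.01, 0.02, 0.00, -0.01, 0.000],
])
grouped = group_shap_values(shap_values, feature_names,
                            ["mfcc", "contrast", "tonnetz", "chroma"])
# Mean absolute attribution per group, a common importance summary.
mean_abs = {g: float(np.mean(np.abs(v))) for g, v in grouped.items()}
```

Ranking the groups by mean absolute attribution gives a single importance score per feature family, while the per-sample grouped values retain the sign of each contribution.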
The confusion matrix of the developed ensemble model is given in
Figure 6. The developed model detected leak and non-leak events with nearly equal efficiency. Specifically, 166 actual leaks and 207 actual non-leaks were correctly predicted, whereas 17 non-leak cases were misclassified as leaks and 18 leak cases were misclassified as non-leaks.
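The per-class detection rates implied by the counts reported for Figure 6 can be worked out directly, taking each rate as the number of correct predictions divided by the number of actual cases in that class:

```latex
\text{Leak detection rate} = \frac{166}{166 + 18} = \frac{166}{184} \approx 0.902,
\qquad
\text{Non-leak detection rate} = \frac{207}{207 + 17} = \frac{207}{224} \approx 0.924
```

The two rates differ by roughly two percentage points on this confusion matrix.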
4. Analysis and Discussion
The results of the experiments conducted on various machine learning models for leak detection in water pipelines provide valuable insights into the performance and significance of each approach. This segment analyzes the results obtained for each tested machine learning model and highlights their individual strengths and implications for real-world leak detection applications.
The Random Forest model gave the best results among all the tested models and achieved a high accuracy of 93.68%. The superior performance of RF can be attributed to its ensemble nature, which combines multiple decision trees and aggregates their predictions. This approach helps reduce overfitting and enhances the model’s robustness. RF’s ability to handle large datasets and feature interactions makes it particularly well-suited for detecting leaks in water pipelines. Its high accuracy and efficiency make it a strong candidate for real-time monitoring and early detection of leaks.
KNN performed remarkably well, achieving an accuracy of 93.40%. KNN’s effectiveness lies in its ability to classify data based on the majority class of its neighboring data points in the feature space. This approach allows KNN to adapt to local patterns and perform well on non-linear data distributions. Its simplicity and ease of implementation make it an attractive choice for leak detection, particularly when spatial relationships between data points are crucial. However, KNN’s performance may be sensitive to the choice of the number of neighbors (K) and the distance metric used.
MLP achieved a high accuracy of 92.15% in leak detection. As a deep learning model, MLP can capture complex patterns in the data through multiple layers of interconnected neurons. Its ability to learn hierarchical representations makes it well-suited for leak detection tasks involving large datasets and non-linear relationships. Fine-tuning and hyperparameter optimization can further enhance MLP’s performance in leak detection.
Decision Tree obtained an accuracy of 77.95% in leak detection. Decision Trees are interpretable and easy to visualize, providing valuable insights into the decision-making process. However, they may suffer from overfitting and lack the complexity to capture intricate relationships within the data. To improve DT’s accuracy, pruning techniques and ensemble methods, such as Random Forest, can be explored to enhance generalization capability and robustness.
SVM demonstrated a respectable accuracy of 88.97% in distinguishing between leak states and non-leak states. This model’s high accuracy suggests that SVM can effectively capture patterns and features in the acoustic signals associated with leaks. SVM’s ability to identify intricate boundaries between classes makes it suitable for handling complex datasets, making it a promising choice for leak detection in water pipelines. While SVM’s accuracy is impressive, further optimization and tuning may be required to enhance its performance and reduce false positives or negatives.
Logistic Regression achieved an accuracy of 72.05% in leak detection. Logistic Regression is a simple and interpretable model that provides probability estimates for binary classification tasks. While it may not be as powerful as other complex models, LogR’s interpretability can aid in understanding the importance of individual features in leak detection. To improve LogR’s performance, feature engineering and regularization techniques can be employed to handle complex datasets more effectively.
NB, a probabilistic classifier, achieved an accuracy of 68.38% in leak detection. While the accuracy is relatively lower compared to other models, NB’s simplicity and computational efficiency make it a suitable option for certain applications. NB assumes independence between features, which may not always hold in real-world scenarios, but it remains effective in certain situations with limited computational resources or when the dataset conforms to its assumptions. Further improvements in leak detection accuracy could be achieved by exploring more sophisticated feature engineering techniques that align with NB’s assumptions.
The results of the experiments demonstrate the significance of each tested machine learning model for leak detection in water pipelines. By understanding the implications of each model’s performance, researchers and practitioners can make informed decisions in selecting the most suitable models for their specific leak detection needs. The results of this study show that RF stands out as the top-performing model, showcasing its potential for real-time monitoring and accurate leak detection. However, other models, such as KNN and MLP, also demonstrate promising performance, each with unique strengths and applicability. The combination of the three best-performing models in the ensemble approach further enhances the accuracy of leak detection. The ensemble model surpassed the performance of individual models, thus emphasizing the efficacy of integrating diverse perspectives to enhance detection accuracy.
The results of this study confirm the efficacy of Noise Logger-based monitoring and the potential of ensemble machine learning models for accurate leak detection in water pipelines. The ensemble model’s success can be attributed to its ability to combine the strengths of different classifiers, leading to improved performance. In contrast, YamNet’s lower accuracy suggests limitations in handling the complexities of the data and highlights the benefits of the proposed approach.
In conclusion, the proposed model for Noise Logger-based leak detection with ensemble machine learning offers a novel and efficient approach for real-time monitoring of water pipelines. Leveraging wireless Noise Loggers and ensemble machine learning, the proposed model achieves a high accuracy in detecting leaks, outperforming the YamNet approach. The superiority of the proposed model lies in its robustness, sensitivity to subtle pipeline changes, and reduced susceptibility to environmental noise. By utilizing acoustic signals and ensemble machine learning, the proposed model presents a promising solution for accurate and efficient leak detection, contributing to sustainable water management practices.
In terms of computational cost, the requirements of the ensemble model and of the individual machine learning models were assessed to gauge their practicality for deployment in resource-constrained environments. The ensemble model combines predictions from the three top-performing individual models: Random Forest (RF), K-Nearest Neighbors (KNN), and Multi-Layer Perceptron (MLP). Each model has distinct computational demands:
Random Forest (RF) is relatively computationally efficient during inference but requires substantial memory for storing multiple decision trees.
K-Nearest Neighbors (KNN) has minimal training costs but incurs significant computational overhead during inference due to the need to calculate distances from all training data points.
Multi-Layer Perceptron (MLP) involves higher computational costs during both training and inference due to its deep learning architecture.
In comparison, the ensemble model aggregates the outputs of these individual models, which increases inference time and resource usage. For example, combining predictions requires running all three models in parallel or sequentially, depending on system design, leading to higher latency and energy consumption. However, the ensemble model’s increased computational cost is justified in scenarios where high detection accuracy is critical, such as monitoring high-value infrastructure or environments prone to frequent leaks.
In the common test split, the Voting Ensemble achieved an accuracy of 94.40%, while the best single model (Random Forest) achieved 93.68%. Precision was lower for the ensemble (89.78%) than for RF (93.00%, rounded to whole percentages in Table 1). Recall was almost the same for both models, and the F1 score for the ensemble (91.00%) was slightly lower than for RF (93.00%).
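For reference, the four metrics compared above follow directly from the counts of a binary confusion matrix. The sketch below uses hypothetical counts (not the study's confusion matrix) and a hypothetical helper name, with the leak class treated as positive.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "leak" treated as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because accuracy counts both classes while precision only penalizes false alarms, one model can lead on accuracy and still trail on precision or F1.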
To check whether this small accuracy difference was real or due to chance, McNemar’s test was applied to paired predictions from both models. The result was p = 0.0987, which is not below the significance level of 0.05. This means there is no statistically significant evidence that one model is better than the other on this specific test set. However, a one-way ANOVA on cross-validated accuracies for all models gave F ≈ 496.8, p < 0.0001, showing that model choice does have a significant effect when results are averaged over many data splits.
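The logic of McNemar’s test on paired predictions can be sketched as follows. The text does not state which variant of the test was used, so this sketch assumes the exact (binomial) form; the labels and predictions are made-up toy values, and the function name is hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Only discordant pairs matter:
      b = samples model A classifies correctly and model B misclassifies,
      c = samples model A misclassifies and model B classifies correctly.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree on correctness
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data for illustration only (1 = leak, 0 = no leak).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
pred_a = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1]
pred_b = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]

b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(f"b={b}, c={c}, p={p_value:.4f}")
```

Samples that both models classify correctly, or both misclassify, do not enter the statistic, which is why two models with different accuracies can still be statistically indistinguishable on a single test split.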
The ensemble was chosen over RF not because it won in every single metric, but because it is more robust. By combining three different types of models—tree-based, instance-based, and neural—it reduces random variation in results and works more reliably across different data samples and conditions. This is especially useful when each model makes different types of detection mistakes, because the ensemble can correct them.
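The error-correcting effect described above can be sketched with a simple majority vote. Hard voting is assumed here, since the text does not specify whether the Voting Ensemble uses hard or soft voting; the per-model predictions are invented for illustration.

```python
from collections import Counter


def hard_vote(*model_predictions):
    """Combine per-model label lists by majority vote, sample by sample."""
    return [
        Counter(sample_votes).most_common(1)[0][0]
        for sample_votes in zip(*model_predictions)
    ]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels (1 = leak, 0 = no leak); each model errs on a different sample.
y_true = [1, 0, 1, 1]
rf_pred = [1, 0, 1, 1]
knn_pred = [1, 1, 0, 1]
mlp_pred = [0, 0, 1, 1]

ensemble_pred = hard_vote(rf_pred, knn_pred, mlp_pred)
print(ensemble_pred, accuracy(y_true, ensemble_pred))
```

With three voters and two classes no ties can occur, and an individual mistake is outvoted whenever the other two models agree on the correct label.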
In practice, the extra computation for using the ensemble is very small. The three models are all fast to run, and the total prediction time per signal is less than 10 ms on a standard laptop CPU. If resources are very limited, RF alone can still be used, with only a small drop in performance.
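A per-signal latency figure of this kind can be checked on the target hardware with a small timing harness. The sketch below is an assumption about how such a measurement might be set up: `dummy_predict` is a stand-in for the trained ensemble's prediction call, and the feature vectors are placeholders.

```python
import statistics
import time


def median_latency_ms(predict, signals, repeats=5):
    """Median wall-clock time of predict(signal), in milliseconds."""
    timings = []
    for signal in signals:
        for _ in range(repeats):
            start = time.perf_counter()
            predict(signal)
            timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def dummy_predict(features):
    """Placeholder classifier: thresholds the mean feature value."""
    return int(sum(features) / len(features) > 0.5)


# Placeholder feature vectors standing in for extracted acoustic features.
signals = [[0.1 * i, 0.2, 0.9] for i in range(10)]
latency = median_latency_ms(dummy_predict, signals)
print(f"median latency: {latency:.4f} ms")
```

Reporting the median rather than the mean reduces the influence of occasional scheduling delays on the measured value.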
Although the ensemble model achieved high accuracy on the Hong Kong dataset, some caution is warranted. The model may be partially tailored to the specific noise characteristics of Hong Kong urban pipelines; its performance on data from other regions or environments could be lower. Even with noise reduction, ambient noise (traffic, construction, etc.) may still affect detections in some cases. Additionally, the machine learning models (especially with multiple features) risk overfitting the training set; future work should validate the approach on independent datasets. To ensure robust generalization, it is planned to collect more diverse data (as suggested in Section 5) and to explore regularization or data augmentation. These steps would mitigate the above limitations. Moreover, while the studied dataset includes signals from multiple districts within Hong Kong, this study does not conduct a formal region-wise significance analysis. In future work, geographic stratification will be explored, and statistical tests (e.g., Chi-square or stratified ANOVA) will be applied to evaluate whether leak detection performance varies significantly by district or pipeline characteristics.
5. Conclusions
Water pipeline leaks present a critical challenge, with significant economic, environmental, and resource implications. Existing detection methods are either limited in scope or inefficient for real-world applications, necessitating innovative approaches. This study presents a pioneering approach by integrating wireless acoustic noise loggers with ensemble machine learning models to achieve real-time leak detection in water pipelines. Unlike previous studies, which often rely on laboratory datasets, this work utilizes a comprehensive dataset collected from real-world urban environments, ensuring robust and practical applicability. Accordingly, this study analyzes water pipeline leak detection using machine learning with 2110 sound signals from Hong Kong. The dataset, collected with acoustic noise loggers, comprised 992 leak signals and 1118 non-leak signals. The goal was to evaluate different machine learning algorithms for effective leak detection and sustainable water management. Results show RF as the top-performing individual algorithm with 93.68% accuracy. KNN and MLP also demonstrated high accuracy (93.40% and 92.15%, respectively), providing diverse leak detection approaches. The ensemble model, combining Random Forest, K-Nearest Neighbors, and Multi-Layer Perceptron, significantly surpassed existing approaches, including YamNet. One of the primary challenges faced by YamNet was its sensitivity to environmental noise. Unlike the proposed ensemble model, which uses carefully extracted features and noise reduction techniques, YamNet processes raw acoustic signals directly. This approach made it more susceptible to interference from urban background noise, such as traffic, construction, and other environmental sounds. These noise sources likely obscured the subtle acoustic patterns associated with leaks, leading to reduced classification accuracy. Another limitation of YamNet is its reliance on general-purpose sound classification.
YamNet was pre-trained on the AudioSet dataset, which encompasses a wide range of audio events but does not specifically target leak-induced acoustic signals in pressurized water pipelines. As a result, the model may have struggled to identify domain-specific features critical for leak detection. Furthermore, the fixed architecture of YamNet, while efficient for general-purpose tasks, lacked the flexibility to adapt to the nuances of this specialized dataset. Additionally, YamNet’s reliance on a single deep-learning architecture contrasts with the ensemble approach’s ability to combine the strengths of multiple models. This highlights the importance of tailoring machine learning models to specific applications, particularly when dealing with specialized datasets and challenging environmental conditions. Future work could explore domain-specific training for deep learning models or the integration of noise reduction preprocessing pipelines to mitigate these limitations.
Beyond academia, efficient leak detection has broad implications, minimizing water losses, reducing economic burdens, and preserving water resources. Real-time monitoring allows for early detection and repairs, contributing to sustainable water management. While promising, future research should explore additional features and data preprocessing. Expanding the dataset will enhance model generalizability. In conclusion, leveraging wireless noise loggers and machine learning is crucial for effective water pipeline leak detection, with the potential to revolutionize practices for sustainable water management. This study contributes to sustainable water management by providing municipalities with a cost-effective, accurate, and scalable solution for detecting water pipeline leaks. The proposed approach sets a foundation for future research and development in leveraging machine learning for real-world infrastructure challenges. By integrating real-world datasets with innovative ensemble machine learning models, this study offers a significant leap toward practical and reliable water leak detection, paving the way for smarter and more sustainable urban water management systems.
Looking ahead, one avenue for future work is the integration of real-time data streams to enhance model performance in dynamic environments. Adaptive learning methods, such as online learning or reinforcement learning, could be employed to continuously update the models as new data become available, improving their accuracy and robustness over time. Additionally, expanding the dataset to include diverse geographical regions and pipeline materials would enhance the model’s generalizability. These advancements would not only refine leak detection capabilities but also support the development of fully autonomous and scalable monitoring solutions for water distribution networks.
One challenge is the increased computational demand when analyzing signals from extensive networks with a high density of noise loggers. The ensemble model, while accurate, requires more computational resources compared to individual models, which may limit its feasibility in resource-constrained environments. Accordingly, for city-wide implementation, the model can be integrated with an IoT-based monitoring system, where acoustic noise loggers are deployed at key junctions and transmit data to a centralized processing unit. The trained ensemble model can then be used in real time to analyze signals and flag potential leaks. Region-specific retraining or transfer learning could be applied to fine-tune the model to new environments, accounting for different pipe materials, ambient noise profiles, or leak typologies.
Future research should focus on addressing these scalability concerns by developing lightweight, energy-efficient models and refining data collection strategies to balance performance with practical deployment needs.
Additionally, the deployment of noise loggers across large-scale systems introduces logistical challenges, including the initial investment cost, sensor maintenance, and data transmission infrastructure. Addressing these concerns may require optimizing the model for edge computing, exploring cost-effective sensor solutions to enable widespread adoption, or optimizing the model for cloud-based computing.
This research work can be expanded in the future to include a comparison of the developed ensemble model against the combined model of a convolutional neural network and a long short-term memory network. Furthermore, more experimental cases from disparate geographic locations need to be considered to test the robustness of the developed ensemble model against various noise levels and ambient conditions. From a dataset standpoint, while the dataset comprises 992 leak and 1118 non-leak samples from similar valves and pipe types across multiple regions in Hong Kong, it is critical to acknowledge that leak acoustic signatures can vary with pipe material, diameter, and pressure. As part of future work, it is paramount to conduct statistical tests or sampling stratification to confirm representativeness in more heterogeneous network sections.