Application of CNN Models to Detect and Classify Leakages in Water Pipelines Using Magnitude Spectra of Vibration Sound

Conventional schemes to detect leakage in water pipes require leakage exploration experts. However, to save time and cost, demand for sensor-based leakage detection and automated classification systems is increasing. Therefore, in this study, we propose a convolutional neural network (CNN) model to detect and classify water leakage using vibration data collected by leakage detection sensors installed in water pipes. Experimental results show that the proposed CNN model achieves an F1-score of 94.82% and a Matthews correlation coefficient of 94.47%, whereas the corresponding values for a support vector machine model are 80.99% and 79.86%, respectively. This study demonstrates the superior performance of the CNN-based leakage detection scheme with vibration sensors. The scheme can help save the detection time and cost incurred by skilled engineers. In addition, it is possible to develop an intelligent leak detection system based on the proposed one.


Introduction
Recently, with the progress of artificial intelligence technology based on big data, the provision of public services based on information and communications technology is expanding [1]. Accordingly, technology development and pilot projects for intelligent management systems are being actively conducted in water-related industries to mitigate economic losses. Reduction in water pressure due to leakage costs additional energy and results in secondary pollution, which is estimated to cause a loss of trillions of Korean Won per year in Korea [2].
Pipelines that transport various fluids, including water, are considered critical infrastructure assets and contribute significantly to the nation's economy. In general, pipelines that are installed across the country to provide different services crack and fail at a given rate depending on their length. Pipeline failures are typically caused by aging infrastructure and poor environmental conditions [3,4].
In particular, water pipeline facilities are buried underground. They can be damaged by various factors, such as deterioration of materials, improper use of materials and structures for piping connections, internal corrosion, traffic load, and ground movement, which can result in leaks. Therefore, in recent years, as communication technologies have progressed, smart metering technology that collects and transmits information such as water flow, pipe network sensor, and water quality data in real time has been under development [5][6][7]. Real-time leak detection technology is critical for developing remote monitoring systems for water pipes. Unfortunately, detecting leaks in advance is challenging [8]. Therefore, various approaches to leak detection have been developed in the past.
Several approaches are based on signal processing schemes. An example is a study that proposed a method for detecting and locating leaks based on vibration sensors and generalized correlation techniques. That study proposed a modified maximum likelihood pre-filter with regularization coefficients, which takes into account the estimation error of the cross spectrum and power spectral density [9]. Similarly, the mel-frequency cepstral coefficients of acoustic signals were used to detect leaks based on pipeline data [10].
Many studies on leak detection through machine learning have been conducted. For example, a leak detection technology based on convolutional neural networks (CNN) using thermal images was proposed to solve the problem of poor leak detection performance due to the lack of experts and reliance on the skills of individual workers [11]. That study aimed to develop a smart water management system based on deep learning and machine learning by improving the binary classification performance for normal and abnormal (leak) cases. Another study applied a multi-strategy ensemble learning approach to sound signals using a gradient boosting tree classification model [12]. Similarly, a deep neural network model was adopted to perform leak detection [13]. In addition, leak detection in pipelines carrying fluids, gases, and water has been considered in various tasks [14,15]. Studies have also considered detecting leakage with robots equipped with pressure and acceleration sensors [16][17][18].
Recently, approaches based on artificial intelligence models have shown considerable success in similar areas other than water pipelines, such as oil and gas pipelines. A 2D CNN model and a long short-term memory autoencoder (LSTM AE) have been applied to oil and gas leak detection with accelerometers mounted on a pipeline wall. That study reported that the supervised CNN model outperforms the unsupervised LSTM AE model [19]. Furthermore, a combination of both models has been proposed to obtain better performance [20]. Another approach combines a CNN model with a swarm intelligence optimization algorithm, the sparrow search algorithm [21]. In various engineering fields besides pipeline leak detection, CNN models are widely adopted for classification problems in which detection otherwise depends on the experience of engineers [22,23]. CNN models are also employed to automate feature extraction in prediction applications [24]. In addition, a leak in a transmission main operating in a real-world environment has been detected through transient test-based techniques (TTBTs). TTBTs are based on the transient event dynamics of a pressurized flow. A pressure wave generated along a pressurized pipeline interacts with any leak. As a result, in the first phase of the transient, the leak can be detected from the reflected pressure wave of lower amplitude that is generated at its location. For example, leaks generate negative pressure waves, whereas partial blockages generate both positive and negative pressure waves [25,26].
In the past, for detecting and confirming water leaks, leak exploration, extraction, and recovery were carried out by leak detection experts to check for leaks and manage water flow rates. However, finding the exact location of leaks with audio leak detection is problematic because it can be affected by passing vehicles or surrounding noise due to the nature of the underground water supply pipeline. It also has the disadvantage of necessitating significant exploration skills on the part of workers and frequent visits to the site.
In this paper, in order to detect and classify leaks in water pipes, we employ a 2D-CNN model that takes inputs consisting of spectral magnitudes of vibration sound samples. The sound data are sampled through vibration sensors mounted on the water pipes. The sampled data were collected from in situ water pipelines in several areas in Korea and were classified into five categories by skilled experts. The leakage classes are indoor and outdoor leak sounds, while the other classes are normal sound, electrical/mechanical noise, and environmental noise. The ultimate purpose of this research is to develop an intelligent detection model applicable to automatic remote monitoring systems for water pipes. Various performance measures are employed to investigate the model performance, and for the purpose of comparison, the support vector machine (SVM) model [27,28] is also applied to the same dataset.
As mentioned previously, approaches based on the CNN model show significant results in similar fields [19,22,23]. Furthermore, CNN models can also guarantee real-time detection using already learned parameters [21]. For these reasons, the 2D-CNN model is employed. In addition to the CNN model, the SVM model is also adopted for performance comparison because several studies reported that the SVM model showed considerable performance in leak detection [29,30].
It is well known that the computational efficiency of the CNN is maximized when its input is two-dimensional. For this reason, the magnitude spectrum, which is usually one-dimensional, is reshaped into a two-dimensional matrix for the input to the CNN model. However, since the SVM model can handle a one-dimensional array, it takes the one-dimensional magnitude spectrum as input. The SVM model in this study adopts the radial basis function (RBF) as its kernel.
As leaks from water pipes produce noise, this noise can be utilized to detect leaks [31], which is the basis of acoustic (electronic) leak detection. For this reason, the dataset used in this study consists of vibration data of the sound collected through the sensors installed on the water pipes, given as magnitude spectral densities obtained through the fast Fourier transform (FFT).
In the existing leak detection methods, additional labor and costs are incurred, and extra time is required to identify whether a leakage alarm is genuine and to localize the leakage. However, installing the CNN model proposed in this study makes it possible to detect and classify the type of leak or noise as soon as an abnormal sound is detected. The CNN-based water pipe leak detection model can save time and cost with high classification accuracy.
The structure of the paper is as follows. Section 2 describes the data collection module, the collection sites, and the detailed composition of the data. Section 3 describes the CNN model employed in this study in terms of structure and training scheme. Section 4 introduces the structure, input characteristics, and hyperparameters of the SVM model. Section 5 presents the model training processes and the evaluation results. Finally, in Section 6, the experimental results are summarized, and the paper is concluded by presenting future works.

Data Description
In this work, we used a dataset provided by AI Hub [32] (this research utilizes the datasets from The Open AI Dataset Project, South Korea; all data information can be accessed through AI-Hub (www.aihub.or.kr), accessed on 23 December 2022). This dataset was collected from leak detection sensors installed at 11,000 locations in some neighborhoods in Gwangju, Korea, and at the modernization site of the local water supply in Goheung, Korea. The raw data were obtained by monitoring water pipe leakage vibration using the monitoring system illustrated in Figure 1.
Figure 2 shows how the sensor attached to the water pipe is utilized to discriminate between indoor and outdoor leakage [32]. Indoor leakage refers to leakage in the water supply pipe, where the leakage noise is detected by the sensor attached to that pipe, as shown in Figure 2a. In contrast, outdoor leakage refers to leakage in the drain pipe. Drain pipe leakage noise propagates to nearby water supply pipes, and the sensors attached to the supply pipes around the drain pipe detect the leakage simultaneously, as shown in Figure 2b. Therefore, in this case, the leak detection signal is transmitted simultaneously from the sensors near the drain pipe.
The collected data are refined based on the decision confirmed in the field through actual leakage exploration and classified for subsequent analysis. The data are broadly divided into three types: normal, abnormal, and noise. They are specifically labeled into five classes: outdoor leak, indoor leak, electrical/mechanical noise, environmental noise, and normal sound. The outdoor leak sound is the sound when a leak occurs in the drain pipe, and the indoor leak sound is the sound when a leak occurs in the water supply pipe. Electrical/mechanical noise is a sound mixed with electrical sound, such as that of heating wires used to prevent water pipes from freezing and bursting, and environmental noise is a sound mixed with everyday noise, such as the operating sound of an outdoor unit. The numbers of data samples used in the experiment were 21,923 outdoor leaks, 16,591 indoor leaks, 6288 electrical/mechanical noises, 8774 environmental noises, and 24,628 normal sounds, for a total of 78,204 samples. The composition of the data is summarized in Table 1. The sensor signals are transformed into a spectral density using a 1024-point FFT, with a spectral range from 0 Hz to 5120 Hz. Each datum consists of an area code, a sensor number, and spectral magnitudes according to frequency. Figure 3 shows the average magnitude spectra for the five classes. In Figure 3a, the spectra of the normal sound and the indoor and outdoor leak sounds can be compared, while Figure 3b compares the normal sound with the two noise classes.
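As a rough illustration of how a 512-value magnitude spectrum could be obtained from one 1024-sample frame, the following sketch uses a naive discrete Fourier transform in pure Python. The sampling rate of 10,240 Hz is an assumption inferred from the stated spectral range (it is not given in the dataset description), the function names are hypothetical, and the dataset's actual preprocessing (windowing, scaling, density normalization) is not known; a practical pipeline would use an FFT library instead.

```python
import cmath
import math

N_FFT = 1024           # frame length stated in the text
SAMPLE_RATE = 10_240   # Hz; ASSUMED so that N_FFT / 2 bins cover 0 Hz to just below 5120 Hz


def magnitude_spectrum(frame):
    """Return |X[k]| for k = 0 .. len(frame)/2 - 1 using a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


def bin_frequencies(n_fft=N_FFT, sample_rate=SAMPLE_RATE):
    """Frequency in Hz associated with each retained bin."""
    return [k * sample_rate / n_fft for k in range(n_fft // 2)]
```

Under the assumed sampling rate, the bins are spaced 10 Hz apart, so a 512-value spectrum runs from 0 Hz to 5110 Hz.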

CNN Model
The water pipeline leak detection model is based on the 2D CNN model proposed by LeCun [33]. Figure 4 depicts the conceptual structure of the designed CNN model. In order to take advantage of the two-dimensional convolution layer, a magnitude spectrum vector of 1 × 512 is converted to a matrix of 32 × 16 as input to the CNN.
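The conversion from a 1 × 512 vector to a 32 × 16 matrix can be sketched as follows. The row-major fill order and the function name `reshape_spectrum` are assumptions for illustration; the text does not state how the frequency bins are arranged in the matrix.

```python
def reshape_spectrum(vector, rows=32, cols=16):
    """Reshape a flat spectrum into a rows x cols matrix (row-major order, assumed)."""
    if len(vector) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, got {len(vector)}")
    return [list(vector[r * cols:(r + 1) * cols]) for r in range(rows)]


# Example: bin index 16 lands at the start of the second row.
matrix = reshape_spectrum(list(range(512)))
```

With this ordering, each row holds 16 adjacent frequency bins, so neighboring entries within a row are neighbors in frequency, while vertically adjacent entries are 16 bins apart.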
Through trial and error, the layer structure is changed and the accuracy and loss of the model are evaluated to optimize the hyperparameters. This iterative process gradually converges to the range where the optimal values exist. The model with the minimum classification loss is selected as the final CNN model for water pipe leak detection. The CNN is a multilayer feedforward neural network composed of sequentially stacked layers, as shown in Table 2, which summarizes the architecture of the proposed model with each layer type and dimension, the kernel size, and the number of connected perceptrons. There are five types of layers: batch normalization, convolution, max pooling, flatten, and fully connected layers. Since each sample consists of 512 frequency components, the CNN input layer takes an input matrix of 32 × 16 for better training. The first layer performs batch normalization, which normalizes the input data. The number of filters is set to 16 in the second layer, i.e., the first convolution layer. Here, the kernel size is 5 × 5, and the rectified linear unit (ReLU) activation function is used. This structure is repeated twice, increasing the number of two-dimensional convolution filters to 32 and then 64. Between consecutive convolution layers, a batch normalization layer is added to prevent the vanishing gradient problem and congestion. After configuring up to the sixth layer, max pooling is performed at the seventh layer. The max pooling layer has a kernel size of 3 × 3.
The eighth layer is a flatten layer. It converts the two-dimensional feature maps into a one-dimensional vector and transfers the features acquired from the convolution and pooling layers to the fully connected layer. In the ninth layer, i.e., the fully connected layer, 3200 nodes are fully connected to five nodes, and the data are classified into five classes through the softmax function. The total number of parameters in this model was 80,713. We used a cross-entropy loss function and the Adam optimizer.
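The layer dimensions quoted above can be checked with a short calculation. The sketch below assumes details that the text does not state: convolutions with stride 1 and 'same' padding (so the 32 × 16 size is preserved), max pooling with a stride equal to its 3 × 3 kernel and no padding, and batch normalization counted with four values per channel (scale, shift, moving mean, moving variance), as some frameworks report it. Under these assumptions the flatten layer has 3200 outputs and the model holds 80,713 values in total, which matches the figures reported here and suggests that 80,713 is a parameter count.

```python
def conv2d_params(kernel, c_in, c_out):
    """Weights plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + c_out


def batchnorm_params(channels):
    """Scale, shift, moving mean, moving variance per channel (assumed counting)."""
    return 4 * channels


def summarize(height=32, width=16, kernel=5, filters=(16, 32, 64), pool=3, classes=5):
    """Return (flatten size, total parameter count) for the assumed layer stack."""
    total = batchnorm_params(1)          # layer 1: normalization of the 1-channel input
    c_in = 1
    for i, c_out in enumerate(filters):
        if i > 0:
            total += batchnorm_params(c_in)   # normalization between consecutive convolutions
        total += conv2d_params(kernel, c_in, c_out)
        c_in = c_out
    # 'same' padding keeps height x width; pooling floors the division by its stride.
    flat = (height // pool) * (width // pool) * c_in
    total += flat * classes + classes    # fully connected layer with biases
    return flat, total


print(summarize())  # (3200, 80713)
```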

SVM Model
For the purpose of comparison, as mentioned previously, the SVM model is employed. The radial basis function (RBF) is adopted as the kernel of the SVM model [34], which is defined as

$$K(x_1, x_2) = \exp\left(-\frac{\lVert x_1 - x_2 \rVert^2}{2\sigma^2}\right), \tag{1}$$

where $x_1$ and $x_2$ are the data points, $\lVert \cdot \rVert$ denotes the Euclidean distance, and $2\sigma^2$ is a parameter that controls the width of the RBF kernel. This is set to the dimension of the feature vector $X$ as follows:

$$2\sigma^2 = \dim(X). \tag{2}$$

Since SVM models can use vectors as input features, the input feature of the SVM model is a vector of 1 × 512, which contains the same values as the CNN input but without shape conversion. The feature vector is scaled with a standard scaler to prevent the cost value from diverging, which would keep the model from training normally.
In the experiment, the cost parameter is set to 10, and the kernel parameter $2\sigma^2$ is set to 512, which is the dimension of the feature vector $X$.
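The scaling step and the RBF kernel described in this section can be sketched in pure Python as follows. The function names are hypothetical, and the sketch covers only the preprocessing and the kernel evaluation, not the SVM training itself. If a library such as scikit-learn were used (an assumption; the text does not name the implementation), a kernel width of $2\sigma^2 = 512$ would correspond to a kernel coefficient of 1/512.

```python
import math


def fit_scaler(samples):
    """Per-feature mean and standard deviation of the training samples."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(dim)]
    stds = []
    for j in range(dim):
        var = sum((s[j] - means[j]) ** 2 for s in samples) / n
        stds.append(math.sqrt(var) or 1.0)   # constant features are left unscaled
    return means, stds


def scale(sample, means, stds):
    """Standardize one feature vector to zero mean and unit variance."""
    return [(x - m) / s for x, m, s in zip(sample, means, stds)]


def rbf_kernel(x1, x2, two_sigma_sq=None):
    """exp(-||x1 - x2||^2 / (2 sigma^2)); 2 sigma^2 defaults to the feature dimension."""
    if two_sigma_sq is None:
        two_sigma_sq = len(x1)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-sq_dist / two_sigma_sq)
```

Tying the kernel width to the feature dimension keeps the exponent on the order of one for standardized inputs, since the squared distance between two standardized 512-dimensional vectors grows roughly in proportion to the dimension.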

Experiment
Each dataset utilized in this experiment consists of a total of 78,204 samples: 50,050 for training, 12,513 for validation, and 15,641 for testing. The experiment was carried out 50 times over randomly selected datasets, with the split ratio maintained. The results presented in this section are based on the average of the 50 runs. During training, the CNN model weights are saved whenever the validation loss is reduced. The early termination condition was set so that training proceeds for 100 additional epochs from the point of the last update. Although all models satisfied the early termination condition in fewer than 200 epochs, we continued training past that point for visual comparison.
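The split sizes and the stopping rule described above can be sketched as follows. The 20% test fraction and 20% validation fraction (taken from the remaining data, rounded up) are assumptions inferred from the reported counts, and both function names are hypothetical; the stopping function only simulates the rule on a list of validation losses rather than training a model.

```python
import math


def split_sizes(total, test_fraction=0.2, val_fraction=0.2):
    """Return (train, validation, test) counts; fractions are ASSUMED, not from the text."""
    n_test = math.ceil(total * test_fraction)
    n_val = math.ceil((total - n_test) * val_fraction)
    return total - n_test - n_val, n_val, n_test


def early_stopping(val_losses, patience=100):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = None
    for epoch, loss in enumerate(val_losses):
        if best_epoch is None or loss < best_loss:
            best_loss, best_epoch = loss, epoch   # weights would be saved here
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch              # no improvement for `patience` epochs
    return best_epoch, len(val_losses) - 1


print(split_sizes(78_204))  # (50050, 12513, 15641)
```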
Figure 5 shows a graph of the training and validation accuracy over different training epochs.These values were obtained by averaging the 50 models as mentioned previously.
The training and validation loss is shown in Figure 6. The performance of each CNN model was investigated by classifying 15,641 test data into five classes. The average confusion matrix over the 50 CNN models is presented in Figure 7, and that of the SVM models can be found in Figure 8. Comparing both matrices reveals that the diagonal components of the CNN model were greater than those of the SVM model, which implies that the CNN models performed better. Five performance metrics, namely precision, recall, accuracy, F1 score, and the Matthews correlation coefficient (MCC), were employed to obtain clear insight into the performance comparison. The precision, recall, and accuracy are, respectively, defined as follows [35,36]:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{3}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{4}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{5}$$
where TP, TN, FP, and FN stand for true positive, true negative, false positive, and false negative, respectively. As may be observed from Equations (3) and (4), the precision and recall indicators evaluate the true positive predictions. In addition, because predicting false data as false is also correct, the accuracy metric evaluates both TP and TN. The data used in this study had an imbalanced ratio by class. In model performance evaluation, simply using accuracy as the evaluation metric can lead to a significant bias when the data are imbalanced. For this reason, the F1 score and MCC, which can compensate for this, are employed as additional metrics. The F1 score is the harmonic mean of the precision and recall, which is defined as

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{6}$$

This can evaluate the performance of a model without bias even when the data ratio for each class is imbalanced [37].
As another measure that compensates for the imbalanced dataset issue, MCC measures the correlation between the actual class and the predicted class. MCC for $G$ classes is defined as in [36,38]:

$$\mathrm{MCC} = \frac{c \cdot s - \sum_{g}^{G} p_g t_g}{\sqrt{\left(s^2 - \sum_{g}^{G} p_g^2\right)\left(s^2 - \sum_{g}^{G} t_g^2\right)}}, \tag{7}$$

where the total number of correctly predicted elements is denoted by $c = \sum_{g}^{G} C_{gg}$, the total number of elements by $s = \sum_{i}^{G} \sum_{j}^{G} C_{ij}$, the number of times that class $g$ was predicted (column total) by $p_g = \sum_{i}^{G} C_{gi}$, and the number of times that class $g$ truly occurred (row total) by $t_g = \sum_{i}^{G} C_{ig}$. MCC takes values between −1 and +1, where −1 represents perfect misclassification, +1 represents perfect classification, and MCC = 0 corresponds to a coin-tossing classifier. Like the F1 score, MCC can be used to evaluate the performance of a model when the ratio of data between classes is imbalanced [39,40]. The F1 score depends on which class is defined as positive in Equation (6). MCC, however, has the advantage over the F1 score that it does not depend on which class is defined as positive.
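A minimal sketch of the multiclass MCC computed from a confusion matrix is given below. It assumes that rows hold the true classes and columns the predicted classes; because the expression is symmetric in the row and column totals, the transposed convention gives the same value. The function name and the choice to return 0 when the denominator vanishes are assumptions for illustration.

```python
import math


def multiclass_mcc(confusion):
    """Multiclass Matthews correlation coefficient from a square confusion matrix."""
    g = len(confusion)
    s = sum(sum(row) for row in confusion)                  # total number of samples
    c = sum(confusion[k][k] for k in range(g))              # correctly predicted samples
    t = [sum(confusion[k]) for k in range(g)]               # row totals (true counts, assumed)
    p = [sum(confusion[i][k] for i in range(g)) for k in range(g)]  # column totals
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = math.sqrt(
        (s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t))
    )
    return numerator / denominator if denominator else 0.0
```

For two classes this reduces to the familiar binary form; for example, the matrix [[2, 1], [1, 2]] gives (2·2 − 1·1)/9 = 1/3.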
Table 3 summarizes the performance of the CNN and SVM models in terms of precision, recall, accuracy, F1 score, and MCC. In each evaluation, precision, recall, and F1 score were calculated for each class, and then the macro-averaged values were reported as each model's final values. The values shown in Table 3 are obtained by averaging over 50 runs.
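The macro-averaging procedure can be sketched as follows: each class is treated in turn as the positive class, the per-class precision, recall, and F1 score are computed, and the unweighted means are reported. The sketch assumes a confusion matrix with true classes in rows and predicted classes in columns, takes the macro F1 score as the mean of the per-class F1 scores, and computes accuracy as the overall fraction of correct predictions; the text does not specify these details, and the function names are hypothetical.

```python
def per_class_counts(confusion, k):
    """True positives, false positives, and false negatives for class k."""
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                              # true k, predicted otherwise
    fp = sum(row[k] for row in confusion) - tp               # predicted k, true otherwise
    return tp, fp, fn


def macro_scores(confusion):
    """Macro-averaged precision, recall, F1, plus overall accuracy."""
    g = len(confusion)
    precisions, recalls, f1s = [], [], []
    for k in range(g):
        tp, fp, fn = per_class_counts(confusion, k)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    total = sum(sum(row) for row in confusion)
    return {
        "precision": sum(precisions) / g,
        "recall": sum(recalls) / g,
        "f1": sum(f1s) / g,
        "accuracy": sum(confusion[k][k] for k in range(g)) / total,
    }
```

Because every class contributes equally to a macro average, a small class such as electrical/mechanical noise weighs as much as the large normal-sound class, which is why these scores can differ noticeably from accuracy on imbalanced data.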
According to Table 3, the trained CNN model exhibited better recall than precision, whereas the SVM model showed better precision than recall. This means that the CNN model tends to miss few of the samples that actually belong to a class, whereas the SVM model is more often correct when it assigns a sample to a class but misses more of the samples that actually belong to it. The CNN model performed better than the SVM model in terms of accuracy. As mentioned, accuracy may be biased due to the imbalanced dataset issue. Precision and recall have relatively little meaning when viewed alone because they reflect different perspectives; hence, the performance was also evaluated using the F1 score and MCC, which consider precision and recall together.
The difference between the accuracy and the F1 score of the CNN model is about 1%, whereas that of the SVM model is 3.8%. This means that the classification of the CNN model is less biased than that of the SVM model. Furthermore, the MCC of the CNN model is 0.9447 and that of the SVM model is 0.7986, indicating that the predictions of the CNN model agree more closely with the true labels than those of the SVM model. Across the various indicators used to evaluate the models, the CNN model shows excellent classification performance without being biased toward a specific class in leak detection and classification.

Conclusions
Because experts are required to visit the site to detect leaks with the existing acoustic leak exploration technique, there has been considerable research and development on remote leak detection, such as IoT-based water pipeline monitoring systems. In this study, we proposed the application of a CNN model to detect leaks in a water pipeline, automating leak exploration techniques that rely on the personal capabilities of experts. Using the leak sensor data, the leak detector learns the data, which are divided into five classes, and finally determines whether a leak has occurred. We selected a CNN model; 62,563 data samples were used for training and validation, and the performance was evaluated with 15,641 samples. For a reliable investigation, 50 models were developed and tested. The results presented in the paper are based on the average performance of the 50 CNN models. Five performance metrics were used for evaluation: precision, recall, accuracy, F1 score, and MCC. For comparison, an SVM model was also applied under the same environment. The performance of the CNN model was superior to that of the SVM.
The CNN-based water pipe leak detection model proposed in this work can help develop an intelligent leak detection system. This study demonstrates the validity of artificial intelligence-based automatic leak detection. Note that in this study the data are assumed to be collected in a data hub and that the proposed model is implemented in a higher-level language. Thus, the sound samples must be periodically transmitted from the sensor nodes through wireless networks, specifically long-term evolution networks in this study. This results in high transmission traffic, which is not desirable.
To overcome this disadvantage, the amount of data traffic should be lowered. One way is to employ a detection model that can be installed in the sensor module. The module only sends an alarm to the management center when the detection model detects an abnormal condition. For this development, the complexity of the model should be so low that it can be implemented in low-level hardware without loss of performance. This development is challenging, and the application of other machine learning models will be investigated for this purpose. More essential features that can reflect leakage more effectively should also be studied in the time and frequency domains.

Figure 1. A flow and water pressure monitoring module was used to collect data [32].

Figure 3. Average magnitude spectra of five classes. (a) Normal Sound vs. Leak Sounds and (b) Normal Sound vs. Noise Sounds.

Figure 4. Structure of the CNN model used in the water pipeline leak detection.

Figure 5. Average training accuracy and validation accuracy of the 50 CNN models.

Figure 6. Average training loss and validation loss of the 50 CNN models.

Figure 7. Average confusion matrix of the 50 CNN models. (a) In number and (b) in percentage.

Table 2. Summary of the trained CNN model.

Table 3. Performance summary of the CNN and SVM models, obtained by averaging over 50 runs.