Leak Detection and Location of Water Pipes Using Vibration Sensors and Modified ML Prefilter

This paper proposes a new leak detection and location method based on vibration sensors and generalised cross-correlation techniques. Considering the estimation errors of the power spectral densities (PSDs) and the cross-spectral density (CSD), the proposed method employs a modified maximum-likelihood (ML) prefilter with a regularisation factor. We derive a theoretical variance of the time difference estimation error through summation in the discrete-frequency domain, and find the optimal regularisation factor that minimises the theoretical variance in practical water pipe channels. The proposed method is compared with conventional correlation-based techniques via numerical simulations using a water pipe channel model, and it is shown through field measurement that the proposed modified ML prefilter outperforms conventional prefilters for the generalised cross-correlation. In addition, we provide a formula to calculate the leak location using the time difference estimate when different types of pipes are connected.


Introduction
This paper deals with the leakage detection and location of water pipes which is one of the main concerns in the water management field. It has been reported that the amount of non-revenue water reaches 15.2-35.1% of drinking water supply in United States, 6.5-24.6% in Europe, and 4.3-27.0% in Korea [1][2][3]. These losses can be classified into unbilled public usage, apparent losses including unauthorized consumption and metering inaccuracies, and real losses through overflows at storage tanks and burst leaks in distribution pipelines caused by bad connections, pipe corrosion, and physical damages. It is especially very important to detect the burst leaks in buried water pipes in order to reduce water production costs as well as to protect public safety [1,4,5].
Most of the water pipes are buried underground, making it difficult to find the location of leaks. For this reason, water leakage has usually been detected when water flows out of the ground due to massive leaks in pipes. As a preemptive detection method, an acoustic-based approach is used that an experienced expert scans a suspected area by listening to leak sounds, yet this scheme is extremely time-consuming and the accuracy highly depends on the detection skills of the personnel [6,7]. As a more systematic approach, a supervisory control and data acquisition system (SCADA) has been considered to periodically measure the flow rate or water pressure using permanently installed sensors [8,9]. This technique has evolved into model-based leak detection that compares the measured data taken on the water network with the predicted values from flow models [10,11], and a model falsification method is introduced for efficiently detecting leaks in large-scale water distribution networks [12,13]. These methods identify the areas with significant water leakage, so additional techniques are required to find the exact position of leaks [8,9,12]. In addition, ground penetration radar can be used to locate leaks in water pipes by inspecting the ground surface and detecting voids in the soil caused by leaking water; however, the use of this approach is limited to specific areas because of high costs and high sensitivity to underground metal objects [4,14]. In [15,16], a non-invasive detection method based on the time domain reflectometry (TDR) has been investigated. This approach finds the position of leaks using the change of the reflection coefficient near leakage points. Because the measured reflection coefficient is sensitive to the temperature and material type around the pipe, robust leak detection is not easy.
The most popular approach to water leak detection is to use acoustic/vibration sensors or pressure transducers attached to the surface of a pipe [5,[17][18][19][20][21][22][23][24][25][26][27][28][29]. When two sensors are used, the leakage location can be determined by estimating the time difference through correlation of the receive signals. The performance of various fast Fourier transform (FFT)-based algorithms is compared in terms of the time difference estimation accuracy in [18][19][20][21]; the time difference estimation schemes based on spectral transform are further enhanced by the short-time Fourier transform (STFT) [22] and the wavelet transform [23]; and the correlation-based leak detector has been verified via hardware implementation [24]. In addition, a leak detector is proposed to estimate the time difference using an entropy algorithm [25]. When a single sensor is used, the leakage can be detected using the fact that the change of the frequency spectrum is highly dependent on the change of leakage volume [26]; however, it is not easy to employ this method in real environments due to the sensitivity to the change of frequency response. In [27][28][29], leak detection schemes with single sensor have been investigated based on the wavelet transform techniques; however, it is difficult to adopt in practical environments because they require reference vibration signals without leaks or the detection performance is vulnerable to the system noise. Additionally, other authors are investigating the use of the filter diagonalization method [30], the radio frequency identification (RFID)-based leakage monitoring system [31], and magnetic flux leakage detection [32]. However, because of the excessive complexity, the difficulty of installation and management, and the technical immaturity, these techniques are rarely used in specific areas.
In this paper, we focus on the time difference estimation method using two vibration sensors attached to water pipes. When the power spectral densities (PSDs) and the cross-spectral density (CSD) of the receive signals are known, the optimal time difference estimator is given by the generalised cross-correlation (GCC) with the maximum likelihood (ML) prefilter derived in [18]. Practically, the PSDs and the CSD include certain estimation errors, and thus the ML prefilter may perform worse than the smoothed coherence transform (SCOT) and Wiener processors [21]. In order to improve the performance of the ML prefilter with erroneous PSDs and CSD, we propose a modified ML method that exploits a regularisation factor to mitigate the estimation errors of the PSDs and CSD. We derive a theoretical variance of the time difference estimation error in the discrete-frequency domain through modification of the variance model in the continuous frequency domain [18], and then find the optimal regularisation factor that minimises the theoretical variance in practical water pipe channels. The time difference estimation accuracy of the proposed method is assessed through computer simulations under the channel similar to an actual water pipe. By implementing the vibration sensor in hardware and the proposed detection method in software, respectively, we carry out field measurement in buried water pipes deployed for household use to verify the performance of the proposed leak detector. Numerical simulations and field tests show that the proposed modified ML method performs better than existing correlation-based techniques, in terms of minimising the root mean square deviation (RMSD) of the time difference estimation error and maximising the peak-to-average ratio (PAR) of the cross-correlation function.
This paper is organized as follows. In Section 2, we derive a formula to find the leak location using a time difference between two vibration sensors when two different kinds of pipes are connected. In Section 3, conventional time difference estimation methods are introduced with the signal model, and the modified ML prefilter is proposed for GCC based on the theoretical variance of time difference estimation error. The performance of the proposed prefilter is compared with those of existing techniques through computer simulations and field measurement in Sections 4 and 5, respectively. Finally, the conclusions are provided in Section 6.

Locating Leaks Using Time Difference
When water is leaked in pipes, internal fluid flows out to generate vibration, and it propagates in both directions from the leak point. As shown in Figure 1, vibration sensors are attached to both ends of the pipes including of a leak point, and thus a difference occurs in the vibration wave reception time due to the difference in distance between the leak point and the sensor. Considering the practical pipeline structure, the material and size of a pipe may be different on both sides of a joint. Define the time difference as where t 1 and t 2 mean the time for the vibration wave to reach sensors 1 and 2 from the leak point, respectively. The time difference ∆t can be estimated by correlating the receive signals of sensors 1 and 2. For the time being, it is assumed that ∆t is known, and the estimation of ∆t is considered in the following section. Suppose that the pipe 1 has a leak as shown in Figure 1a, i.e., 0 ≤ x ≤ L 1 . Then, t 1 and t 2 are given by where x is the distance between the sensor 1 and the leak point; and v 1 and v 2 denote the sound speed in pipes 1 and 2, respectively. From Equations (2) and (3), the time difference is expressed as By rearranging Equation (4) with respect to x, we have where Similarly, when pipe 2 has a leak as shown in Figure 1b, the time difference is expressed as and the distance between leak point and the sensor 1 is given by where In summary, the leak location from the sensor 1 can be computed as where ∆t max = L 1 v 1 + L 2 v 2 and ∆t min = −∆t max . The speed of sound in a water-filled pipe depends on the properties of a pipe wall that the vibration wave is traveling through. In [33], the velocity of a fluid-dominated wave is expressed as where B f is the bulk modulus of the contained medium; a, h, and E are the inner radius, the thickness, and the elastic modulus of the pipe, respectively; ρ is the density of the surrounding medium; ω is the angular velocity; and v f is the free-field internal fluid velocity [34], given by where K s is a stiffness coefficient of the material. We use the database of the sound speed created for various types of pipes by theoretical calculations and experimental corrections provided by [35], and Table 1 presents a part of the database. Practically, the sound speed varies considerably according to the geometry and material type of the pipe. If a pipe is composed of mixed media, it is difficult to calculate the sound speed. Thus, we only consider a single material pipe.

Conventional Time Difference Estimation Methods
In Figure 1, define x 1 (n) and x 2 (n) as the sampled receive signals of sensors 1 and 2 at time t = nT s , respectively, where T s is the sampling period. When there is a leak between sensors 1 and 2, x 1 (n) and x 2 (n) are given by where s(n) is the vibration wave generated from the leak point; d is the time difference in samples between received signals at sensors 1 and 2 that satisfies ∆t = dT s ; and ⊗ means the discrete-time convolution operator. For i ∈ {1, 2}, h i (n) is the discrete-time channel impulse response between the leak point and the sensor i, and w i (n) is a stationary random process representing the background noise at the sensor i. It is assumed that x 1 (n) and x 2 (n) are two stationary random signals uncorrelated with w i (n), and also assumed that w 1 (n) and w 2 (n) are uncorrelated. The time difference can be estimated using the cross-correlation of x 1 (n) and x 2 (n) defined as where τ is the time delay of x 2 (n), E[·] denotes the expectation operator, and (·) * means the complex conjugate. The cross-correlation function r x 1 x 2 (τ) can be computed in the time domain by directly correlating x 1 (n) and x 2 (n) because R x 1 x 2 (τ) has the peak value at τ = d. This approach, however, is very sensitive to the non-leak vibration noises, and requires high computational load when the sample size is large. For these reasons, the time difference estimation method in the frequency domain is widely used based on the relationship between the cross-correlation function and the CSD [18][19][20][21][22][23][24]. Let us define DFT N [x(n)] as the N-point discrete Fourier transform (DFT) of a sequence x(n). Then, the CSD for (13) is written as where By computing S x 1 x 2 (k) and taking its inverse DFT (IDFT), the cross-correlation function can be simply computed when the sample size is large.
To further enhance the frequency domain estimation method by sharpening the peak in the cross-correlation function, prefiltering is performed before the IDFT. In this case, the cross-correlation function is computed using the GCC method [18], defined as where IDFT N [x(k)] means the N-point IDFT of a sequence x(k) and Ψ(k) is a frequency domain prefilter. Figure 2 describes the procedure to compute the cross-correlation function in the GCC method, and Table 2 presents the prefilters for the GCC proposed in [18][19][20], where S w i (k) and S x i (k) are the PSDs of w i (n) and x i (n) for i ∈ {1, 2}, respectively; and γ(k) is the coherence function between x 1 (n) and x 2 (n) defined as In the Wiener and ML prefilters, the background noises w 1 (n) and w 2 (n) are assumed to be zero-mean Gaussian random variables with variances σ 2 1 and σ 2 2 , respectively, to make the signal models (11) and (12) mathematically tractable. As pointed out in [12,13], E[w i (n)] is not zero due to the sensor bias, and the modeling error included in w i (n) may not be expressed as a Gaussian distribution. The effect of the bias in w i (n) can be removed by sensor calibration or subtracting the mean value from x i (n). In addition, for a large N, W i (k) = DFT N [w i (k)] tends to be Gaussian by the central limit theorem, even if w i (n) is not a Gaussian random variable. Hence, the assumption that w 1 (n) and w 2 (n) are Gaussian is not a strong requirement ( [36], Ch.3A).   [18][19][20].

Prefilter Type
Weight Function The Roth prefilter proposed by Peter R. Roth normalises the CSD by S x 1 (k), while the smoothed coherence transform (SCOT) prefilter normalises the CSD by S x 1 (k)S x 2 (k). The Roth and SCOT prefilters can be used only in the high signal-to-noise ratio (SNR) region because the peak of the cross-correlation function is spread out after prefiltering. To avoid the peak spreading, the phase transform (PHAT) normalises the CSD by |S x 1 x 2 (k)| so that the prefiltered CSD has the unit amplitude for all k. However, the peak estimation performance is degraded due to the noise enhancement in the frequency band with low SNR. The Eckart prefilter maximises the ratio of the mean correlator output power to the noise variance of the correlator output, and thus it suppresses the frequency band with low SNR. This method is difficult to use in a practical system because it is not easy to estimate the noise PSDs S w 1 (k) and S w 2 (k) from the receive signals with leaks [18,20,21]. The ML prefilter weights the CSD according to the SNR in each frequency band to minimise the variance of the time difference estimate, yet it may critically overweight the CSD in the frequency band where |γ x 1 x 2 (k)| 2 is close to one [18]. The Wiener prefilter multiplies the CSD by the weight |γ x 1 x 2 (k)| 2 , so that it mitigates the overemphasising problem at certain frequencies [19].

Proposed Time Difference Estimation Scheme Using the Modified ML Method
For practical implementation of the prefilter, it is required to estimate the PSDs S x 1 (k) and S x 2 (k) as well as the CSD S x 1 x 2 (k). Additionally, the Eckart prefilter necessitates the estimation of the noise PSDs S w 1 (k) and S w 2 (k). Suppose that we have long sequences x 1 (n) and x 2 (n) measured at sensors 1 and 2. Then, the PSDs and CSD of x 1 (n) and x 2 (n) can be estimated by the non-overlapped Welch's method [37] as follows:Ŝ where X M is the number of blocks; and i ∈ {1, 2}. Using the power spectrum estimates in Equation (18), the ML prefilter is expressed as whereγ(k) is the estimated coherence function, given bŷ Note that 0 ≤ |γ(k)| ≤ 1. When a GCC technique is used to estimate the time difference, the error variance has been derived using the integration of a weighted CSD function in the frequency domain [18]. In the discrete-frequency domain, the error variance for an arbitrary prefilter Ψ(k) is obtained through the summation of a weighted discrete CSD function as follows: To derive the error variance of the ML prefilter using the estimated PSDs and CSD, we substitute Equation (19) into Equation (21).
In an attempt to mitigate the influence of coherence estimation error, we propose a new modified ML prefilter that employs a regularisation factor when computingβ 0 (k), and thus it is more robust to the coherence estimation error than the conventional ML prefilter. The proposed modified ML prefilter is expressed as whereβ(k) is defined asβ Here, 0 < α ≤ 1 is the regularisation factor. Since the denominator of Equation (25) is equal to or greater than α, the estimation error included inγ(k) gives less impact on the change ofβ(k) compared to the conventional ML prefilter. The proposed modified ML prefilter (24) is equal to the ML prefilter when α = 0, while it is equal to the Wiener prefilter when α = |γ(k)| 2 . By replacingβ 0 (k) in Equation (22) withβ(k) in Equation (25), the error variance for the modified ML method is denoted as As the regularization factor α increases,β(k) becomes more robust to the estimation error of γ(k); however, the bias ofβ(k) increases. In addition, the optimal α minimising var(d) is varied according to SNR values of receive signals and statistical characteristics of γ(k). Therefore, we carefully design α considering the variance of time difference error in numerical simulations and field measurements. The specific design method for α is described in Sections 4 and 5.

Channel Model and Optimisation of Regularisation Factor
To perform computer simulations, we construct a channel model for vibration signals in buried water pipes based on the field measurement data. Details of field measurements are explained in Section 5. We define a channel model for propagation of vibration signals in a water pipe as below: where d 0 = 1 m is the reference distance; d i is the distance between the leak point and the sensor i; m is the path loss exponent; −1 ≤ c 1 ≤ 1 and K are the coefficient and order of a comb filter; 0 < c 2 < 1 is the coefficient of a first-order infinite impulse response (IIR) filter; and E i (z) is an error term describing the channel variation. It is assumed that, for various ω, E i (e jω ) is an independent and identically distributed (i.i.d.) Gaussian random variable with zero mean and variance σ 2 h and that E 1 (e jω ) and E 2 (e jω ) are independent. In Equation (27), 1/(1 + c 1 z −K ) is the inverse comb filter to represent a sharp increase in the frequency response at harmonics of the natural frequency, and the first-order IIR filter describes the lowpass characteristics of vibration signals. Figure 3 compares the coherence function of the signals generated by the channel model (27)  Using the channel model (27), we find the optimal regularisation factor α for the proposed modified ML prefilter. In the simulation, the coherence γ(k) is generated using the same parameters as in Figure 3, and the coherence estimateγ(k) is denoted aŝ where γ e (k) is the estimation error of γ(k) that is assumed to be a zero-mean Gaussian random variable with variance σ 2 e . Figure 4 presents the normalised variance of time difference estimation error, which is defined as where var(d; α) is computed by Equations (25) and (26). The normalised error variance monotonically decreases as the regularisation factor α increases to 0.5, whereas it grows with increment of α when α > 0.6. Thus, the normalised error variance is minimised when 0.5 ≤ α ≤ 0.6, and the optimal α is slightly different depending on σ 2 e . As σ 2 e decreases, the minimum of normalised error variance decreases, i.e., the performance gain of the modified ML prefilter increases compared to the conventional ML prefilter with α = 0.
As mentioned in [12,13], γ e (k) may have a non-Gaussian distribution with a bias in a practical system. However, to the authors' best knowledge, the distribution of γ e (k) in practical water pipes are not available in the literature, and thus γ e (k) is assumed to be zero-mean Gaussian in the simulation. Since var(d) is a function ofγ(k), the normalised error variance curve in Figure 4 is varied according to the distribution of γ e (k) and the optimal α also can be changed. For more practical design, we compute the normalised error variance curve and the corresponding optimal α usingγ(k) measured at real test sites in Section 5.2.

Performance Comparison
In this subsection, we compare the performance of the proposed modified ML method with those of existing prefilters. As mentioned in [18,21], the PHAT prefilter performs better than the Roth and SCOT prefilters with the peak spreading problem. In addition, as explained in Section 3, the Eckart prefilter requires the noise PSDs S w 1 (k) and S w 2 (k), which are difficult to estimate in a practical system. For these reasons, we consider the PHAT, Wiener, and ML prefilters for comparison with the proposed method. To compare the performance of various prefilters, we utilize the RMSD of the time difference estimate given by Note that RMSD(∆t) = T s var(d) whend is unbiased. In addition, we employ the PAR to describe the relative magnitude of the maximum peak compared to the average of the cross-correlation function. Specifically, when the maximum peak appears at τ =d, the PAR is defined as where the cross-correlation function R g x 1 x 2 (τ) is given by Equation (15). Figures 5 and 6 show the RMSD and the PAR for various prefilters, respectively, when the channel model (27) is used, N = 4096, and M = 30. α was set to 0.5 for the proposed modified ML prefilter; σ 2 h was determined by the SNR at sensor 1; d was randomly generated in the range between −200 and 200; and other parameters were the same as in Figure 4. All curves were obtained by averaging the simulation results over 10 4 channel realizations. The RMSD decreases with increment of SNR, and the proposed method outperforms the conventional PHAT, ML, and Wiener prefilters in the whole SNR region. The RMSD value of the PHAT prefilter approaches those of the ML and Wiener prefilters as the SNR increases, because the PHAT prefilter has good performance in high SNR region [18,21]. In Figure 6, the proposed modified ML method presents the highest PAR among all prefilters for entire SNR region, i.e., the proposed prefilter can detect the leakage point more reliably than other techniques. The PAR of the ML prefilter is saturated when SNR ≥ 4 dB because the estimation error of γ(k) influences both the numerator and denominator of Equation (23).   Figure 7 shows the leak detection system for field measurements. We developed a web-based leakage detection system including our own sensor composed of a vibration transducer, a microprocessor, and a communication module. Once the control node sends the start logging command to the sensors via the network node, the sensors record the vibration signals for 30 s and transmit the sampled data to the server through the wireless communication link and the commercial internet protocol (IP) network. Prior to data logging, the network node broadcasts a sync message, and the local clock of the sensor is synchronized using the message to avoid the time difference estimation error caused by the time offset between sensors. The measured data stored in the server can be extracted for leakage detection at the control node. At the sensor, the sampling frequency was 4 kHz (T s = 0.25 ms) and each sample is represented as 12 bits. The wireless link between the sensor and the network node supports 40 kbps of data rate using binary phase shift-keying (BPSK) modulation.

System Setup for Field Testing
Field measurements were carried out in Seogwipo City, Korea, in order to verify the leak detection performance of the proposed method. To determine specific test areas, we utilized the leak rate information in [3] and the pipeline database provided by the local government. We have selected five locations to perform field measurements as shown in Figure 8, and specific pipe layouts in these locations are given by Figure 9. The leaks in water pipes occur mostly at the joint to connect pipes and the branch to distribute water to small pipes, and these leak points can be identified by checking manholes and the entrance of household pipes. Thus, we have found the actual leak point described as red spots in Figure 9 through the inspection of possible leak points as well as the leak detection based on the time difference estimation. The sound velocity in Figure 9 was determined by querying the database provided by [35] using the information about the material and diameter of the water supply pipe. Since two adjacent sensors are selected to log vibration signals for each field trial, there are nine kinds of test cases as shown in Table 3. For notational convenience, we use the test case names in Table 3 in Sections 4 and 5.       Location  1  1  2  3  3  4  4  5 5 Sensor Pair S1, S3 S2, S3 S1, S2 S1, S3 S2, S3 S1, S2 S2, S3 S1, S2 S2, S3

Performance Evaluation
In a practical system, the real coherence γ(k) is not available. Instead, γ(k) is estimated using the received signals x 1 (n) and x 2 (n) through multiple trials of measurement. Using γ(k) measured at five test locations, we compare the normalised variance of time difference error as a function of α as shown in Figure 10. The coherence function γ(k) was obtained by averaging three separate measurement data for each test case. We used N = 4096 and M = 25, and σ 2 e was set to 0.02. For varying α, the normalised error variance presents huge variations depending on a test place because of the change of γ(k) and some interference by the external environment; however, the normalised error variance changes slightly when 0.30 ≤ α ≤ 0.55 having the minimum value between 0.35 and 0.60 as shown in Table 4. Based on the analysis in Figure 10, α was set to 0.45 for the proposed prefilter.  As mentioned in Section 5.1, the field measurement was carried out for nine test cases in five locations. The field measurement was performed three times for each test case. We used N = 4096 and M = 25 for estimating the PSDs S x 1 (k) and S x 2 (k), and the CSD S x 1 x 2 (k); and α was set to 0.45 according to the analysis in Section 4.1. Based on the measurement data, Figures 11 and 12 present the RMSD and the PAR of various prefilters, respectively. The proposed modified ML prefilter achieves the smallest RMSD value in all test cases except L1S13 in which the PHAT prefilter has slightly less RMSD than the proposed method. Note that, in the test case L1S13, the RMSD values of all prefilters are less than 0.6 ms, i.e., the proposed method achieves proper estimation accuracy for finding the leak location. The RMSD of the proposed method is less than 1.1 ms for entire test cases. In Figure 12, the proposed prefilter obtains the highest PAR in all test cases except L2S12 in which the PAR value of the PHAT prefilter is 0.7 dB greater than that of the proposed scheme. The PHAT and Wiener prefilters have similar PAR values; however, the RMSD of the PHAT prefilter is higher than that of the Wiener method as shown in Figure 11. In addition, while the ML and Wiener prefilters have similar RMSD values except L5S23, the ML method presents much lower PAR values than the Wiener prefilter. Thus, the Wiener prefilter is better than the PHAT and ML methods. Overall, the proposed method outperforms the conventional prefilters because it achieves the minimum RMSD value among all prefilters while maximising the PAR. In other words, when the distance between sensors are identical, the proposed prefilter has higher leak detection probability than the conventional methods. In addition, under a fixed RMSD requirement, the distance between sensors can be longer in the proposed prefilter than those in the conventional schemes, and thus the sensor deployment cost can be reduced.

Conclusions
This paper proposed a new time difference estimation method for leakage detection based on the modified ML prefilter with the regularisation factor. Through analysis and numerical simulations, we have found the optimal regularisation factor and compared the performance of the proposed prefilter with that of conventional prefilters. The proposed leak detection method was verified by field measurements using a practical leak detection system, and it was shown that the proposed method outperforms the conventional techniques in terms of RMSD and PAR. The proposed modified ML prefilter can be applied to correlation-based leak detection systems utilizing the STFT and the wavelet transform. Furthermore, the proposed method can contribute to developing an automatic leakage management solution that collects leakage data, alarms about the risk of leaks, and informs about the specific leak locations.