1. Introduction
Fluid transport is a significant issue in the world today. Cities continually demand utilities such as drinking water supply, the distribution of oil products, and wastewater treatment, and pipelines are the predominant means of delivering them. Pipeline networks have contributed to the growth and comfort of society. Nevertheless, there is a constant risk (in particular, for fuel pipelines) of accidents, environmental pollution, or economic losses if fluid escapes through leaks. In this context, several critical incidents have occurred in Mexico, such as the one in San Martín Texmelucan, Puebla, in 2010, and more recently the one in the Tuxpan-Tula poly-duct in the municipality of Tlahuelilpan, Hidalgo, in 2019, where many people died as a result of an explosion caused by illegal fuel extraction. On the other hand, according to the National Water Commission (CONAGUA) [1], about 40% of drinking water is lost due to leakage. Although the causes of these two problems are entirely different, both can be addressed with similar techniques.
The scientific community has paid attention to this problem and has proposed several methodologies for monitoring and supervision in order to avoid losses and accidents (see, e.g., [2,3,4,5,6,7,8,9,10,11]). In particular, in Begovich et al. [2], a Leak Detection and Isolation (LDI) algorithm based on Billman and Isermann [3] was implemented and tested with accurate results; however, it relies on steady-state conditions, which increases the convergence time of the leak parameter estimation process. The proposal in Verde et al. [5] deals with the location of multiple leaks in a pipeline. The key to the leak detector, which should operate in quasi-real-time, is a family of parameterized transient models for all scenarios in the pipeline. In this case, the steady-state equivalence between a single leak at a given position and two leaks allows the family of dynamic models to be obtained. Then, to estimate the specific parameters of the leak, an off-line identification process is performed.
Likewise, a multi-leak diagnostic scheme based on Kalman observers was suggested in Delgado-Aguiñaga et al. [4]. It follows a model-based approach for detecting and isolating several non-concurrent leaks: the nonlinear model is modified for each new leakage event, so the method is an extension of the single-leak isolation problem. Although this scheme shows acceptable results, the computational complexity increases with each additional leak. In Rubio Scola et al. [12], the authors developed a nonlinear state observer to locate a blockage in a pipeline. The technique uses a mathematical model derived from the water hammer equations, solved by the finite difference method, and provides a suitable location for the blockage. Regarding implementation, a recent algorithm based on the extended Kalman filter, Delgado-Aguiñaga and Begovich [10], successfully identified a leak in an aqueduct in Guadalajara, Mexico. A subsequent study estimated that approximately 130 million liters of drinking water had been lost in this incident.
There are also other methods with successful applications. For example, the approach presented in Ostapkowicz [6] uses a pressure wave method, and Liu et al. [11] presented a system based on acoustic waves. A hybrid approach that combines a real-time transient simulation system with a negative pressure wave method is proposed in Zhang et al. [7]. The authors of that work argued that the most likely future development in pipeline leak detection and location is the combined use of two or more different methods. Finally, Tian et al. [8] proposed an algorithm to locate leaks based on the pressure difference profile along the pipeline. It considers the effect of the static pressure increase at the leakage point.
On the other hand, analytical redundancy methods (i.e., the combined use of several model-based methods) have proven useful for improving the precision, reliability, and performance of a system. Notably, in the field of fault detection and isolation, attention to this class of methods has increased lately in several areas, such as robotics, Lyu et al. [13]; control theory, Chouchane et al. [14]; diagnosis systems, Lunze [15]; and the application of evolutionary algorithms and neural networks to fault diagnosis, Witczak [16]. In particular, several works dealing with the leak diagnosis problem in Water Distribution Networks (WDN) have also been proposed on the basis of genetic algorithms. In Vitkovsky et al. [17], a genetic algorithm is used in conjunction with the inverse transient method to detect leaks and friction factors. Additionally, in [18], a model calibration process is formulated as a nonlinear optimization problem that is solved by using a genetic algorithm. Case studies are presented to demonstrate how the integrated approach is applied to water leak detection.
The framework described above encourages researchers to propose new model-based approaches that can be combined with other methods and thus contribute to the development of a robust leak diagnostic tool for single pipelines based on analytical redundancy.
By relying on an observability property, fulfilled for the single-leak case, our approach builds an observer ensemble together with a genetic algorithm to minimize the observation error and, in this way, estimate the leak parameters, i.e., position and magnitude. The extended Luenberger observer has been chosen to estimate the internal state variable (the pressure at the leak point). Such an observer is exponentially stable only if the parameters of the model are known. Otherwise, the observation error is, at least, uniformly ultimately bounded. Then, if the leak position and magnitude are the only unknown parameters (the rest of the pipeline model parameters being well known), it is possible to design a bank of observers, each with different values of leak position and magnitude (the search space). Thus, the best estimate of the leak parameters provided by the observer ensemble is the one that gives the minimum residual, and the genetic algorithm is exploited to find it.
The paper is organized as follows. Section 2 provides the mathematical model. Section 3 describes the Leak Detection and Isolation scheme. Section 4 presents some successful experimental results. Finally, in Section 5, some conclusions and future work are discussed.
2. Pipeline Mathematical Model
The pipeline model is classically derived under the following assumptions: the pipeline is straight, without fittings and without slope; the fluid is slightly compressible; the duct wall is slightly deformable; and the convective velocity changes are negligible. Likewise, the pipeline cross-section area and the fluid density are constant. Then, the Partial Differential Equations (PDE) governing the fluid transient response can be written as, Roberson et al. [19]:
Momentum Equation
$$\frac{\partial Q(z,t)}{\partial t} + g A_r \frac{\partial H(z,t)}{\partial z} + \mu\, Q(z,t)\,|Q(z,t)| = 0 \qquad (1)$$
Continuity Equation
$$\frac{\partial H(z,t)}{\partial t} + \frac{b^2}{g A_r}\frac{\partial Q(z,t)}{\partial z} = 0 \qquad (2)$$
where $Q$ is the flow rate [m$^3$/s]; $H$ is the pressure head [m]; $z$ is the length coordinate [m]; $t$ is the time coordinate [s]; $g$ is the gravity acceleration [m/s$^2$]; $A_r$ is the cross-section area [m$^2$]; $b$ is the pressure wave speed in the fluid [m/s]; $\mu = f/(2\phi A_r)$, where $\phi$ is the inner diameter [m] and $f$ is the friction factor; and the rest of the physical parameters are computed as in Delgado-Aguiñaga et al. [20], considering a constant water temperature of 20 °C. The dynamics in Equations (1) and (2) are fully defined by related pairs of initial and boundary conditions.
Leak model: Furthermore, one leak arbitrarily located at point $z_\ell \in (0, L)$ (where $L$ is the total length of the pipeline) can be modeled as follows, Roberson et al. [19] (see Figure 1):
$$Q_\ell = \lambda \sqrt{H_\ell} \qquad (3)$$
where the constant $\lambda$ is a function of the orifice area and the discharge coefficient (for simplicity, the $\lambda$ coefficient is referred to as "leak magnitude" from now on); $Q_\ell$ is the flow through the leak; and $H_\ell$ is the pressure head at the leak point, Navarro et al. [21].
This leak produces a discontinuity in the system. Furthermore, due to the law of mass conservation, $Q_\ell$ must satisfy the following relation:
$$Q_b = Q_a + Q_\ell \qquad (4)$$
where $Q_b$ and $Q_a$ are the flows before and after the leak point, respectively.
Friction model: In modern pipes (pipes with a relative roughness usually less than
), it is difficult to reach a complete turbulence zone (i.e., the zone where friction factor is almost constant, see
Figure 2).
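Therefore, treating the friction factor as a constant could yield a poor mathematical model. For this reason, in the present work, the friction factor is calculated by using the well-known Swamee–Jain equation, Brkić [22], Swamee and Jain [23]:
$$f = \frac{0.25}{\left[\log_{10}\left(\dfrac{\epsilon}{3.7\,\phi} + \dfrac{5.74}{Re^{0.9}}\right)\right]^{2}} \qquad (5)$$
which is suitable for flow regimes in the transition zone (as occurs in plastic pipelines), and where $\epsilon$ [m] is the roughness height, $\phi$ [m] is the inner diameter, $Re$ is the Reynolds number given by $Re = Q\phi/(\nu A_r)$, with $A_r$ the cross-section area, and $\nu$ is the kinematic viscosity [m$^2$/s].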
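As an illustration of the friction model, the following Python sketch evaluates the Swamee–Jain formula for a circular pipe. The function names and the numerical values (diameter, roughness, flow, and an approximate kinematic viscosity for water at 20 °C) are assumptions made for this example and are not taken from the experimental setup.

```python
import math


def reynolds_number(flow, diameter, kinematic_viscosity):
    """Re = V * D / nu, with mean velocity V = Q / A in a circular pipe."""
    area = math.pi * diameter ** 2 / 4.0
    velocity = abs(flow) / area
    return velocity * diameter / kinematic_viscosity


def swamee_jain(flow, diameter, roughness, kinematic_viscosity):
    """Darcy friction factor from the explicit Swamee-Jain formula."""
    re = reynolds_number(flow, diameter, kinematic_viscosity)
    arg = roughness / (3.7 * diameter) + 5.74 / re ** 0.9
    return 0.25 / math.log10(arg) ** 2


# Illustrative values only (assumed): 8 L/s in a 63.5 mm plastic pipe.
f = swamee_jain(flow=8e-3, diameter=0.0635, roughness=7e-6,
                kinematic_viscosity=1.0e-6)
print(round(f, 4))
```

Because the Reynolds number depends on the flow, the friction factor obtained this way changes with the operating point, which is the reason given above for not treating it as a constant.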
Spatial Discretization of the Modeling Equations
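In order to obtain a finite-dimensional model from (1) and (2), these PDEs are discretized with respect to the spatial variable $z$, as in Verde [24], Besançon et al. [25], by using the following relationships:
$$\frac{\partial H(z_i,t)}{\partial z} \approx \frac{H_{i+1} - H_i}{\Delta z_i}, \quad i = 1,\dots,n \qquad (6)$$
$$\frac{\partial Q(z_i,t)}{\partial z} \approx \frac{Q_i - Q_{i-1}}{\Delta z_{i-1}}, \quad i = 2,\dots,n \qquad (7)$$
where $H_i$, $Q_i$ stand for $H(z_i,t)$, $Q(z_i,t)$, and $n$ is the number of pipeline sections. Assuming only one partition in the pipeline, as shown in Figure 1, $\Delta z_i$ ($i = 1, 2$) becomes the distance from upstream to the point of the leak and from the point of the leak to downstream, respectively. Notice that $\Delta z_1 = z_\ell$ and $\Delta z_2 = L - \Delta z_1$. The leak position is assumed to be $\Delta z_1$ in this description. Applying the approximations in (6) and (7) to Equations (1) and (2) together with (3), we get:
$$\begin{aligned}
\dot{Q}_1 &= -\frac{g A_r}{\Delta z_1}\,(H_2 - H_1) - \mu\, Q_1 |Q_1| \\
\dot{H}_2 &= -\frac{b^2}{g A_r\, \Delta z_1}\left(Q_2 - Q_1 + \lambda \sqrt{H_2}\right) \\
\dot{Q}_2 &= -\frac{g A_r}{L - \Delta z_1}\,(H_3 - H_2) - \mu\, Q_2 |Q_2|
\end{aligned} \qquad (8)$$
where $A_r$ is the cross-section area, $\mu$ is the friction coefficient of (1), and $\lambda$ is the leak magnitude of (3). Here, the state vector is defined as $x = [Q_1 \ H_2 \ Q_2]^T$, the input vector is $u = [H_1 \ H_3]^T$ (the pressure heads at the ends of the pipeline), and the output vector is $y = [Q_1 \ Q_2]^T$ (the flows at the ends of the pipeline). It is worth noting that $H_2$ represents the pressure head at the leak point. This value cannot be measured since the leak position is not known a priori, but it can be estimated because the observability property is fulfilled for the system (8), as seen in the next section.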
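The lumped model can be sketched numerically as follows. This Python fragment is a minimal sketch that assumes the common three-state form of a single-leak model (upstream flow, head at the leak point, downstream flow) with a constant friction coefficient; the dictionary keys and all numerical values are hypothetical and chosen only to make the example run.

```python
import math

G = 9.81  # gravity acceleration [m/s^2]


def leak_model_rhs(x, u, p):
    """Right-hand side of a three-state single-leak pipeline model.

    x = (Q1, H2, Q2): upstream flow, head at the leak, downstream flow.
    u = (H1, H3): pressure heads at the two ends of the pipeline.
    p: dict with area, wave speed b, length L, friction coefficient mu,
       leak position dz1 and leak magnitude lam (hypothetical names).
    """
    q1, h2, q2 = x
    h1, h3 = u
    dz1 = p["dz1"]
    dz2 = p["L"] - dz1
    q_leak = p["lam"] * math.sqrt(max(h2, 0.0))
    dq1 = -G * p["area"] * (h2 - h1) / dz1 - p["mu"] * q1 * abs(q1)
    dh2 = -p["b"] ** 2 / (G * p["area"] * dz1) * (q2 - q1 + q_leak)
    dq2 = -G * p["area"] * (h3 - h2) / dz2 - p["mu"] * q2 * abs(q2)
    return (dq1, dh2, dq2)


# Assumed parameters, not those of the experimental pipeline.
params = {"area": 3.17e-3, "b": 375.0, "L": 85.0, "mu": 42.0,
          "dz1": 30.0, "lam": 0.0}
q0, h1 = 8e-3, 20.0
# Leak-free steady state: the head drops linearly along the pipe.
loss_per_m = params["mu"] * q0 ** 2 / (G * params["area"])
h2 = h1 - loss_per_m * params["dz1"]
h3 = h2 - loss_per_m * (params["L"] - params["dz1"])

steady = leak_model_rhs((q0, h2, q0), (h1, h3), params)
leaky = leak_model_rhs((q0, h2, q0), (h1, h3), dict(params, lam=1e-4))
```

In the leak-free steady state all three derivatives vanish, while switching on a leak makes the derivative of the head at the leak point negative, which is the transient that the diagnosis scheme relies on.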
Notice that the mathematical model in (8) assumes a straight pipe without loss of generality since, even if the pipe is not straight, it is possible to obtain an Equivalent Straight Length (ESL) of the pipe. This is done by considering the losses due to each "non-straight element" (i.e., fitting). The equivalent straight length $L$ can be calculated as, Mott [26]:
$$L = L_r + \frac{\phi}{f}\sum_{j=1}^{n} K_j \qquad (9)$$
where $L_r$ stands for the pipeline physical length [m] measured between the sensors placed at the ends of the pipeline, $\phi$ is the inner diameter, $f$ is the friction factor, $K_j$ is the fitting loss coefficient for the $j$th fitting, and $n$ is the total number of pipeline fittings.
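The equivalent-length idea can be illustrated with a short Python sketch. It assumes the standard Darcy–Weisbach equivalence between a fitting loss coefficient and a length of straight pipe, $K = f L_e / \phi$; the function name and the numbers are hypothetical.

```python
def equivalent_straight_length(physical_length, diameter, friction_factor,
                               loss_coefficients):
    """Physical length plus the straight length equivalent to each fitting.

    Each fitting with loss coefficient K is replaced by a straight segment
    of length K * D / f (Darcy-Weisbach equivalence, assumed here).
    """
    extra = sum(k * diameter / friction_factor for k in loss_coefficients)
    return physical_length + extra


# Illustrative numbers only: an 80 m pipe with two elbows (K = 0.9 each).
length = equivalent_straight_length(80.0, 0.0635, 0.018, [0.9, 0.9])
```

With these assumed values the two fittings add about 6.35 m to the physical length.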
3. LDI Scheme Approach
The leak diagnosis process (the task of determining the magnitude and location of the leak) proposed in this work is carried out by a bank of observers together with a genetic algorithm whose selection rule is to minimize the integral of the observation error of each observer.
Since the observability property of the system (8) is fulfilled, it is possible to design an extended Luenberger observer to estimate the pressure head at the leak point. Such an observer is exponentially stable only if the parameters of the model in (8) ($g$, $b$, $L$, and the remaining physical parameters) are known; otherwise, the observation error will be uniformly ultimately bounded. Then, if the leak position and the leak magnitude are the only unknown parameters (the rest of the pipeline model parameters being known), it is possible to design a bank of observers, each with different values of leak position and magnitude, which together define the search space. Thus, the best leak position and magnitude estimate of the ensemble is the one that gives the minimum residual, and the genetic algorithm is exploited to find it. The minimum integral observation error is reached when the leak position and magnitude match the real ones.
3.1. Extended Luenberger Observer for MIMO Systems
First, let us consider that the state-space representation given by (
8) can be rewritten in compact form:
with the state
, the input
, and the output
(with two components,
and
). Then, the observability is guaranteed by invertibility of the following map (where
denotes the Lie derivative):
which is in fact uniform in
u. If one considers
(for unidirectional flow), such a map induces the following rank observability condition:
such that the system in (
10) is locally observable and satisfies the condition for the extended Luenberger observer design for MIMO systems Birk and Zeitz [
27]. Then, the system (
10) can be rewritten to obtain its additive output nonlinearity form:
where matrices
A,
,
,
, and
C are given by:
where
and
. Here, the additive output nonlinearity can be built from direct measurements and thus compensated in the observer design (as originally proposed in Krener and Isidori [28] and Krener and Respondek [29], for instance). The representation (13) admits an observer of the form:
By defining the estimation error as
, the dynamic error model is:
where
.
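In this equation, $e = 0$ is clearly an equilibrium. In addition, $K$ can be chosen so that $(A - KC)$ is Hurwitz (since the pair $(A, C)$ is observable); that is, for any symmetric positive definite matrix $M$, there exists a symmetric positive definite matrix $P$ satisfying the Lyapunov equation $P(A - KC) + (A - KC)^T P = -M$.
Notice then that the nonlinear term of the error dynamics satisfies a linear growth bound with some constant $\gamma > 0$ (i.e., its norm is bounded by $\gamma \|e\|$) on the region of operation, and thus if $\gamma < \lambda_{\min}(M) / (2\lambda_{\max}(P))$, where $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ denote the minimum and maximum eigenvalue of a matrix, one can conclude that the origin of the error system in (15) is exponentially stable (see Khalil [30] for more details).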
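The observer structure discussed in this subsection can be sketched in Python as a copy of the model plus a linear output injection. This is only an illustrative sketch under several assumptions that are not confirmed by the text: a three-state model (upstream flow, head at the leak, downstream flow), measured flows at both ends as outputs, friction terms evaluated at the measured flows (the output nonlinearity), and a placeholder gain matrix. All names and numbers are hypothetical.

```python
import math

G = 9.81  # gravity acceleration [m/s^2]


def observer_rhs(x_hat, u, y, p, gain):
    """Time derivative of the estimate (Q1_hat, H2_hat, Q2_hat).

    u = (H1, H3): measured heads; y = (Q1, Q2): measured flows.
    gain: 3x2 nested list K multiplying the output error y - C x_hat.
    """
    q1_hat, h2_hat, q2_hat = x_hat
    h1, h3 = u
    q1, q2 = y
    dz1 = p["dz1"]
    dz2 = p["L"] - dz1
    a1 = G * p["area"] / dz1
    a2 = G * p["area"] / dz2
    c = p["b"] ** 2 / (G * p["area"] * dz1)
    # Friction built from direct measurements (output nonlinearity).
    fric1 = p["mu"] * q1 * abs(q1)
    fric2 = p["mu"] * q2 * abs(q2)
    # Leak term evaluated at the estimated head.
    leak = p["lam"] * math.sqrt(max(h2_hat, 0.0))
    e1, e2 = q1 - q1_hat, q2 - q2_hat
    dq1 = -a1 * (h2_hat - h1) - fric1 + gain[0][0] * e1 + gain[0][1] * e2
    dh2 = -c * (q2_hat - q1_hat + leak) + gain[1][0] * e1 + gain[1][1] * e2
    dq2 = -a2 * (h3 - h2_hat) - fric2 + gain[2][0] * e1 + gain[2][1] * e2
    return (dq1, dh2, dq2)


# Assumed parameters and a placeholder gain; in practice K must be chosen
# so that A - KC is Hurwitz.
p = {"area": 3.17e-3, "b": 375.0, "L": 85.0, "mu": 42.0,
     "dz1": 30.0, "lam": 0.0}
gain = [[5.0, 0.0], [0.0, 0.0], [0.0, 5.0]]
q, h1 = 8e-3, 20.0
slope = p["mu"] * q ** 2 / (G * p["area"])
h2 = h1 - slope * p["dz1"]
h3 = h2 - slope * (p["L"] - p["dz1"])
# Estimate with a 1 L/s error on the upstream flow only.
d_hat = observer_rhs((q - 1e-3, h2, q), (h1, h3), (q, q), p, gain)
```

In this example the plant is at a leak-free steady state, so the only contribution to the first component of the derivative is the injection term, which pushes the upstream flow estimate toward the measurement.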
3.2. Genetic Algorithm
In computer science, the genetic algorithm (GA) is an algorithm inspired by biological evolution that offers suitable solutions to optimization and search problems. The GA is an iterative algorithm in which the fittest individuals of a population are discovered, emphasized, and recombined (reproduction) in order to produce the descendants of the next generation. Six phases are considered in a genetic algorithm:
Initial population. The first step of the process is to obtain a set of individuals randomly generated (initial population) in which each such individual is a candidate solution to a problem.
As in the natural selection process, an individual is characterized by a set of parameters called genes. The solutions, known as chromosomes, are genes joined into a string.
In a genetic algorithm, the chromosome is represented using a string in terms of an alphabet. Binary encoding (a string of ones and zeros) is the most common procedure to encode the genes in a chromosome.
Fitness function. The fitness function measures how well an individual solves the problem and, in this way, determines which individuals will reproduce and survive into the next generation. The fitness function provides a "fitness score" to each individual. This score determines the probability that an individual will be selected for reproduction.
Selection. In this phase, the chromosomes in the population with the best fitness scores are selected. The solution (chromosome) that fits better during an iteration is more likely to be selected to reproduce.
Crossover. After the selection process, a recombination of the chromosomes is carried out in order to generate a new population for the next iteration. Crossover is applied to randomly paired strings: a crossover point is chosen, and the sub-sequences before and after it are exchanged to create two offspring.
Mutation. To preserve diversity within the population and prevent premature convergence, a mutation process is applied after the crossover. In a subset of the new offspring, each gene can be mutated with a low probability. This is done by flipping some bits in the chromosome bit string.
Termination. If the algorithm does not produce new populations that are sufficiently different from the previous generation, the algorithm has converged. Then, the genetic algorithm has found a set of solutions to the problem, and it is terminated. Such a criterion is predefined by the designer according to specific constraints. In particular, for the proposed scheme, the algorithm is kept in operation during the entire experiment, since permanent pipeline monitoring is assumed whether or not a leak is occurring.
Some final remarks. The GA discussed thus far uses a binary string to encode the genes in a chromosome. Nevertheless, for many engineering problems (as in the case of leak diagnosis), the solution is a real number that cannot be represented directly with a binary encoding. Thus, it is necessary to define a mapping between real and binary numbers before the process (crossover and mutation) is started. Such a mapping is built in two stages. First, a function that assigns to each real number $r$ of a given search interval $[0, r_{\max}]$ an integer $m$ of the closed set $\{0, 1, \dots, 2^n - 1\}$ is defined:
$$m = \mathrm{round}\left(\frac{r}{r_{\max}}\,(2^n - 1)\right) \qquad (16)$$
where the $\mathrm{round}$ function rounds its argument to the nearest integer, $r_{\max}$ stands for the maximum real number of the interval, and $n$ is a natural number. Naturally, the larger $n$ is, the more accurate the mapping will be. Then, once $m$ is obtained, it is converted into a binary number.
To return from binary to a real number and, in this way, apply the fitness function and selection processes, the inverse mapping is applied.
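The real-to-binary mapping and its inverse can be illustrated with the following Python sketch. It assumes a search interval of the form $[0, r_{\max}]$ and a chromosome of $n$ bits; the function names are hypothetical. Note that Python's built-in `round` resolves exact ties to the nearest even integer, which is immaterial for this illustration.

```python
def real_to_int(r, r_max, n_bits):
    """Map a real number in [0, r_max] to an integer in [0, 2**n_bits - 1]."""
    return round(r / r_max * (2 ** n_bits - 1))


def int_to_bits(m, n_bits):
    """Fixed-width binary string of the integer m."""
    return format(m, "0{}b".format(n_bits))


def bits_to_real(bits, r_max):
    """Inverse mapping: binary string back to a real number in [0, r_max]."""
    return int(bits, 2) * r_max / (2 ** len(bits) - 1)


# Example: encode a candidate value of 37.5 on a 0..100 interval, 10 bits.
chromosome = int_to_bits(real_to_int(37.5, 100.0, 10), 10)
decoded = bits_to_real(chromosome, 100.0)
```

The decoded value differs from the original by at most half a quantization step, $r_{\max} / (2(2^n - 1))$, which is why a larger number of bits gives a more accurate mapping.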
Figure 3 depicts the flowchart of the GA (for more information, see Schmitt [31], Mitchell [32], Whitley [33]).
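The phases described in this subsection can be put together in a compact Python sketch. It is not the implementation used in this work: tournament selection, single-point crossover, elitism (keeping the best individual), a fixed number of generations as termination rule, and the toy cost function are all assumptions made for illustration, and the algorithm minimizes a cost instead of maximizing a fitness score.

```python
import random


def run_ga(cost, n_bits, pop_size=20, generations=30, p_mut=0.02, seed=0):
    """Minimize cost(bits) over binary strings of length n_bits."""
    rng = random.Random(seed)
    # Initial population: random binary chromosomes.
    pop = ["".join(rng.choice("01") for _ in range(n_bits))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        nxt = [pop[0]]  # elitism: the best individual survives unchanged
        while len(nxt) < pop_size:
            # Selection: the better of two random individuals (tournament).
            pa = min(rng.sample(pop, 2), key=cost)
            pb = min(rng.sample(pop, 2), key=cost)
            # Crossover: exchange the sub-sequences around a random point.
            cut = rng.randrange(1, n_bits)
            for child in (pa[:cut] + pb[cut:], pb[:cut] + pa[cut:]):
                # Mutation: flip each bit with a low probability.
                child = "".join(
                    ("1" if bit == "0" else "0")
                    if rng.random() < p_mut else bit
                    for bit in child)
                if len(nxt) < pop_size:
                    nxt.append(child)
        pop = nxt
    best = min(pop, key=cost)
    return best, cost(best), history


N_BITS = 10


def toy_cost(bits):
    """Squared distance between the decoded value and a target of 37.5."""
    value = int(bits, 2) * 100.0 / (2 ** N_BITS - 1)
    return (value - 37.5) ** 2


best, best_cost, history = run_ga(toy_cost, N_BITS)
```

Because the best individual is always carried over, the best cost recorded in `history` never increases from one generation to the next.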
3.3. Evolutionary Ensemble of Observers
At this point, let us analyze an observer with the structure of (14) in the presence of parameter errors. First, one can consider that the leak parameters (position and magnitude) match the real values. Then, if a parametric error appears (i.e., there is a deviation between the current leak position or magnitude and the real ones), the observer structure given by (14) changes to the following form:
where the symbol
denotes a parametric deviation from the real value. This means that
is the difference of
as a sum error (i.e., the wrong value
could be separated as follows:
, where
is the real value). Then, (
17) yields an error model in the following form:
From (18), the parametric deviation moves the equilibrium point of (15) away from 0. Thus, a residual is induced in the output error when there is an error in the leak position or the leak magnitude, and this residual is zero only when both match the real values. The present work exploits this system property (as long as the rest of the parameters are properly tuned). It is interesting to see that the residuals do not depend on the input signal $u$.
Hence, it is possible to design an ensemble of observers, each with different values of leak position and magnitude, such that the residuals of the individual observers (namely, $r_1$, $r_2$, ..., $r_n$) move away from zero as long as these values do not match the real ones. Figure 4 depicts this idea.
If the residual is minimized, then it is possible to estimate the correct values of the leak parameters. The present work proposes a GA that searches for the correct values of the leak position and magnitude by minimizing the integral squared residual of the ensemble of observers. In this GA, the population is built from the combinations (Cartesian product) of the possible values of leak position and magnitude. The following optimization problem is considered. Find the leak position and magnitude such that:
where the residual vector is defined as:
Here, $r_i$ is the residual of the $i$th observer and $n$ is the number of observers, given by the product of the numbers of candidate positions and magnitudes, i.e., $n = l \cdot m$. Here, $l$ and $m$ are the numbers of candidate positions and magnitudes, respectively, arbitrarily proposed by the designer. This work suggests choosing the candidates such that each value of the leak position and magnitude belongs to a set of equally spaced values. Initial time can be deleted, as well as window length $T$.
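The selection rule of the ensemble can be sketched in Python as follows. For clarity, the sketch scores every pair of the Cartesian product exhaustively instead of running a GA, and the function `toy_residual` is a synthetic stand-in for the residual that a real observer would produce; both choices, as well as all names and values, are assumptions made for illustration.

```python
from itertools import product


def integral_squared(residual, dt):
    """Trapezoidal approximation of the integral of r(t)^2 over the window."""
    sq = [r * r for r in residual]
    return dt * (sum(sq) - 0.5 * (sq[0] + sq[-1]))


def best_candidate(positions, magnitudes, residual_of, dt):
    """Return (cost, position, magnitude) with the smallest cost."""
    scored = []
    for dz1, lam in product(positions, magnitudes):
        cost = integral_squared(residual_of(dz1, lam), dt)
        scored.append((cost, dz1, lam))
    return min(scored)


TRUE_DZ1, TRUE_LAM = 40.0, 2e-4  # hypothetical "real" leak parameters


def toy_residual(dz1, lam, n_samples=50):
    """Synthetic residual that grows with the parameter mismatch."""
    level = abs(dz1 - TRUE_DZ1) / 85.0 + abs(lam - TRUE_LAM) / 3e-4
    return [level] * n_samples


positions = [20.0, 40.0, 60.0]     # equally spaced candidate positions
magnitudes = [1e-4, 2e-4, 3e-4]    # equally spaced candidate magnitudes
best = best_candidate(positions, magnitudes, toy_residual, dt=0.01)
```

In this toy setting the true pair lies on the grid, so its integral squared residual is zero and it is selected; in the scheme described above, the GA explores the same search space without evaluating every candidate at each step.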
5. Conclusions and Future Work
The present work deals with the leak isolation problem (to estimate the position and magnitude of a leak in a water pipeline) using a heuristic method. The proposed scheme assumes only flow and pressure sensors at the upstream and downstream ends of the pipeline. Exploiting the fact that the pipeline mathematical model is observable, it is possible to design an observer whose observation error dynamics are exponentially stable only if the leak size and location parameters are known and, at least, uniformly ultimately bounded otherwise. For this reason, the authors propose a bank of observers together with a Genetic Algorithm. This scheme minimizes the integral observation error, and the minimum is reached when the leak position and magnitude match the real ones.
The approach presented in the paper estimated the leak position and magnitude in a very acceptable way. This is corroborated by the fact that both the downstream and upstream flow rates were well estimated in the presence of noise, which means that the genetic algorithm selects the real values of the leak size and the leak location. The use of the integral of the observation error as a fitness function helped obtain a good estimate despite the presence of noise.
As future work, this algorithm will be refined to achieve better performance. Moreover, the authors will explore the possibility of extending the present approach to two or more leaks. Finally, the algorithm will be tested to locate leaks in a hydraulic network.