Pre-Localization Approach of Leaks on a Water Distribution Network by Optimization of the Hydraulic Model Using an Evolutionary Algorithm †

The sustainable management of water distribution networks is a crucial challenge, especially in emerging countries, where distribution networks have very low efficiency and very high levels of leakage. Locating and prioritizing water leakage areas has become a main concern for public services seeking to optimize the use of resources and improve the continuity of supply. A decision support approach is proposed to locate areas with higher leakage rates. It is based on solving the FAVAD (Fixed And Variable Area Discharge) equation. Determining the FAVAD parameters makes it possible to simulate the quantity and the location of the leaks, through the use of genetic algorithms coupled with hydraulic modeling and a geographic information system (GIS).


Introduction
One of the major challenges facing water utilities around the world is the high level of water losses in distribution networks. A World Bank study [1] estimated that more than 32 billion cubic meters of treated water are lost each year through leakage from urban water supply systems around the world, and that half of these losses occur in developing countries. According to the Water Operators Partnerships African Utility Performance Assessment (2009), Non-Revenue Water (NRW) has reached alarming levels in many cities in developing countries, particularly in sub-Saharan Africa [2]. In the Algiers network, the Algerian Water and Sanitation Company (SEAAL) estimated in 2017 that overall losses amounted to 44%.
Leaks in water systems generally appear because of aging infrastructure, and are caused or amplified by corrosion or by mechanical stresses that are aggravated by increases in operating pressure.
Many experimental studies have reported that reducing the pressure reduces leakage [3]; in particular, [4] showed that reducing the pressure from 77 m to 50 m resulted in a 25% reduction in the minimum night flow (MNF).
This work proposes a methodology for pre-locating leaks in a drinking water supply system using a hydraulic model built and calibrated with the ExpaGIS tool developed for this purpose. The FAVAD concept is well suited to this kind of problem.
Once the hydraulic model has been constructed and its flow rate calibrated through a simplified methodology, Genetic Algorithms (GA) are used to distribute the leaks spatially over the nodes of the network model. To run this simulation we used a decision support tool, 'Optim_Detect', programmed in Matlab and interfaced with the hydraulic modeling tool through the EPANET toolkit.
Optim_Detect simulates the parameters of the FAVAD concept (Ci and N1) in order to identify the quantity and the position of the leak at each node i of the hydraulic model. For a better representation of the results and for system management, the outputs are exported to a Geographic Information System (GIS).
To test and validate the proposed approach, several pilot networks with total pipe lengths between 12 and 25 km were studied, and the results of these optimizations were quite satisfactory. In this document an example of a large network (87 km) is presented. Using the proposed tool and the measurements made in the field, 36 leaks were pre-localized, representing a leakage rate of 57 m³/h; the discharge coefficients obtained at the nodes vary between 0.010 and 0.032, and the emitter exponent N1 is simulated at 1.1.

Modeling the Behavior of the Leak in EPANET
One of the main features of EPANET is that its hydraulic calculation engine is demand-driven. The water output at each node is defined as the base demand. There are two ways to model a water leak in EPANET (Figure 1).
In the first, the leak is modeled as an additional demand at a consumption node, independent of the pressure (Figure 1a). In this case the total flow at node i is given by Equation (1):

$$Q_{T,i}(t) = D_i \, p_c(t) + L_i \, p_l(t) \qquad (1)$$

where $D_i$ is the base demand for consumption, an average annual consumption of users based on individual meter readings. The customers are geo-located (the x and y coordinates of each customer are known), and their consumption is assigned to the nearest node of the hydraulic model using the GIS/EPANET export tool (ExpaGIS).
$L_i$ is the additional demand due to leakage, determined from the minimum night flow $Q_{MNF}$ and distributed homogeneously over all nodes, assuming that $Q_{MNF} = \sum_i L_i$ and that the leakage time profile is constant, $p_l(t) = 1$, over the duration of the simulation (24 h) with a time step of 1 h. The consumption profile $p_c(t)$ is calculated by subtracting the minimum night flow $Q_{MNF}$ from the measured hourly total flow rate.
According to [5,6], the best way to represent leaks in a hydraulic network model is not through an additional demand but by adding a leakage valve at each node i, as shown in Figure 1b. Here $D_i$ is again the base node demand (consumption) obtained from the customers' meter readings, Pi is the node pressure, K is the flow coefficient of the valve, which depends on the change in cross-section, and Qi is the flow rate through the valve, representing the leakage. In this representation the pressure at the node is a determining factor of the leak, so the flow of a leak through the valve can be estimated.
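The first representation (leak as an additional, pressure-independent demand) can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the function names are hypothetical, the numbers in the usage note are made up, and normalising the consumption pattern to a mean of 1 is a convention assumed here, not something stated in this document.

```python
def uniform_leak_demand(q_mnf, n_nodes):
    """Share the minimum night flow equally among all nodes (shares sum to q_mnf)."""
    return [q_mnf / n_nodes] * n_nodes


def consumption_pattern(measured_flow, q_mnf):
    """Hourly consumption multipliers: measured flow minus MNF.

    Assumption of this sketch: multipliers are normalised to a mean of 1.
    """
    consumption = [q - q_mnf for q in measured_flow]
    mean = sum(consumption) / len(consumption)
    return [c / mean for c in consumption]


def nodal_flow(base_demand, pattern_value, leak_demand, leak_pattern=1.0):
    """Total nodal flow of Equation (1): consumption term plus constant leak term."""
    return base_demand * pattern_value + leak_demand * leak_pattern
```

For example, `nodal_flow(2.0, 1.5, 0.5)` adds a leak of 0.5 to a consumption of 2.0 × 1.5, giving 3.5 in whatever flow unit the model uses. Because the leak term does not depend on pressure, this representation cannot reproduce the effect of pressure management on leakage.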

Using the FAVAD Concept for Leak Pre-Locating
When simulating a leak in EPANET, the parameter that represents the flow coefficient through a valve is the emitter. The emitter represents a valve open to the atmosphere and is a simple property of the node. The methodology mentioned above is expressed by the FAVAD concept, Equation (2):

$$Q_i(t) = C_i \, P_i(t)^{N1} \qquad (2)$$

where $Q_i$ is the leakage rate at node i, $C_i$ is the discharge (emitter) coefficient, $P_i$ is the simulated pressure at node i and N1 is the emitter exponent, which in EPANET is representative of the entire network. The total flow at each node i is therefore given by Equation (3):

$$Q_{T,i}(t) = D_i \, p_c(t) + C_i \, P_i(t)^{N1} \qquad (3)$$

where $D_i$ is the base demand of node i and $p_c(t)$ is the consumption profile. Equation (3) makes it possible to represent correctly the behavior of leaks in a hydraulic model. The goal of this research is therefore to determine the parameters Ci of each node of the hydraulic model, and the exponent N1 representative of the whole network, in order to determine a leak rate specific to each node at time t and to pre-locate the leaks on a distribution network. According to the literature, N1 = 0.5 for steel pipes, N1 > 1.2 for plastic pipes and N1 = 1 for other types of materials [6]. Table 1 summarizes the methodologies proposed in this document to determine the parameters of Equation (3).
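To make the role of the exponent N1 concrete, the following minimal sketch evaluates the emitter relation of Equation (2) and the leakage ratio obtained when the pressure is lowered. The function names are hypothetical, and returning zero leakage for non-positive pressure is an assumption of this sketch.

```python
def favad_leak(c_i, pressure, n1):
    """Leak flow at a node from the emitter relation q = C * P**N1."""
    if pressure <= 0:
        # Assumption of this sketch: no leakage without positive pressure.
        return 0.0
    return c_i * pressure ** n1


def leak_ratio(p_before, p_after, n1):
    """Ratio of leak flows after/before a pressure change (C cancels out)."""
    return (p_after / p_before) ** n1


if __name__ == "__main__":
    # Pressure reduction from 77 m to 50 m, as in the study cited in the introduction.
    for n1 in (0.5, 1.0, 1.5):
        print(n1, round(leak_ratio(77.0, 50.0, n1), 3))
```

For N1 = 1 the ratio is simply 50/77, about 0.65, so the same pressure reduction removes more leakage as N1 grows. This is why the exponent has to be identified together with the coefficients Ci.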

Hydraulic Model Calibration Using FAVAD Concept
Determining the initial consumption profile and introducing the leaks into the hydraulic model are essential before launching a first optimization. This daily profile can be determined with the simplified methodology proposed in this document, which distributes the leakage equally over all network nodes. The methodology relies on the following conditions:

• A single emitter coefficient $C_{unique}$ is used for all n nodes of the network: $C_1 = C_2 = \dots = C_n = C_{unique}$.
• The emitter exponent N1 is taken equal to 0.5, in accordance with the equation for flow through an orifice of fixed area.
• $\sum_i Q_{T,i} = Q_{obs}$, in accordance with mass conservation and continuity, where $Q_{T,i}$ is the flow at node i and $Q_{obs}$ is the observed average flow in the hydraulic model.

The simplified method is as follows:

1. Determine the unique emitter coefficient for all nodes of the network using the FAVAD concept, so that the n identical emitters together discharge the minimum night flow: $C_{unique} = Q_{MNF} / (n \, \bar{P}_{MNF}^{\,N1})$, where $Q_{MNF}$ is the minimum night flow and $\bar{P}_{MNF}$ is the average pressure (during the MNF) of the k pressure sensors installed in the network.
2. Apply this unique emitter coefficient to all nodes at each time step to calculate the hourly leakage rate with Equation (2), and then deduce the hourly leak profile $p_l(t)$.
3. Subtract the leak rate from the measured global flow to deduce the hourly global consumption.
4. Verify the consistency between the hourly total consumption and the customers' meter data. This allows the construction of the consumption profile $p_c(t)$.
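The steps above can be sketched as follows. This is an illustration under explicit assumptions, not the authors' implementation: the function names are hypothetical, the leakage is assumed to be shared equally by `n_nodes` identical emitters, and the network-average pressure of each hour is assumed to stand in for the individual node pressures.

```python
def unique_emitter_coefficient(q_mnf, mean_pressure_mnf, n_nodes, n1=0.5):
    """Step 1: coefficient such that n_nodes identical emitters discharge q_mnf
    at the mean pressure observed during the minimum night flow."""
    return q_mnf / (n_nodes * mean_pressure_mnf ** n1)


def hourly_leak_and_consumption(measured_flow, mean_pressure, c_unique, n_nodes, n1=0.5):
    """Steps 2 and 3: hourly leakage from the emitter relation, then
    consumption as measured flow minus leakage.

    Assumption: one average pressure per hour represents all nodes.
    """
    leaks = [n_nodes * c_unique * p ** n1 for p in mean_pressure]
    consumption = [q - leak for q, leak in zip(measured_flow, leaks)]
    return leaks, consumption


if __name__ == "__main__":
    # Made-up hourly data for two time steps; the second one is the MNF hour.
    c = unique_emitter_coefficient(q_mnf=57.0, mean_pressure_mnf=36.0, n_nodes=100)
    leaks, consumption = hourly_leak_and_consumption([100.0, 57.0], [25.0, 36.0], c, 100)
    print(c, leaks, consumption)
```

Step 4, the comparison with customer meter data, is a manual consistency check and is not reproduced here. Note that this sketch attributes the whole minimum night flow to leakage, so the estimated consumption is zero at the MNF hour.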

Spatial Pre-Locating of Leaks Using Genetics Algorithms
Genetic algorithms are optimization algorithms based on Darwin's theory of natural evolution and Mendel's work on genetics. They belong to the family of evolutionary algorithms, which evolve the individuals of a population, initially chosen at random, by applying genetic operators. According to [7], a genetic algorithm is defined by:

• Individual/chromosome: a potential solution of the problem that corresponds to a coded value of the variable (or variables) under consideration;
• Population: a set of chromosomes or points in the search space (therefore coded values of variables);
• Environment: the search space (characterized in terms of the performance corresponding to each possible individual);
• Fitness function: the function to optimize; it represents the adaptation of the individual to its environment.

Basic Concept of a Genetic Algorithm
A genetic algorithm aims to optimize a function defined on a search space. It requires the following five elements:

• A coding principle for the population elements. This step associates a data structure with each point of the state space. The quality of the data coding conditions the success of genetic algorithms.
• A mechanism for generating the initial population. This mechanism must be capable of producing a non-homogeneous population of individuals that will serve as a basis for future generations.
• A function to optimize. It returns a value called the fitness, or evaluation function, of the individual.
• Operators to diversify the population over generations and explore the state space.
• Sizing parameters: size of the population, total number of generations or stopping criteria, and probabilities of application of the crossover and mutation operators.
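The five elements listed above map directly onto a small real-coded genetic algorithm. The sketch below is a generic illustration, not the Optim_Detect implementation: the operator choices (tournament selection, uniform crossover, Gaussian mutation, elitism) and all parameter defaults are assumptions made for this example.

```python
import random


def run_ga(fitness, bounds, pop_size=30, generations=60, n_elite=2,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimise `fitness` over the box `bounds` with a small real-coded GA."""
    rng = random.Random(seed)

    def random_individual():
        # Coding principle: one real-valued gene per unknown, within its bounds.
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(scored):
        # Selection: the better of two randomly drawn individuals.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] <= b[0] else b[1]

    def crossover(p1, p2):
        # Uniform crossover, applied with probability crossover_rate.
        if rng.random() > crossover_rate:
            return p1[:]
        return [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]

    def mutate(ind):
        # Gaussian mutation per gene, clipped back into the bounds.
        out = []
        for x, (lo, hi) in zip(ind, bounds):
            if rng.random() < mutation_rate:
                x += rng.gauss(0.0, 0.1 * (hi - lo))
            out.append(min(hi, max(lo, x)))
        return out

    population = [random_individual() for _ in range(pop_size)]
    history = []  # (best fitness, mean fitness) per generation
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda s: s[0])
        history.append((scored[0][0], sum(s[0] for s in scored) / pop_size))
        # Elitism: the best individuals pass unchanged to the next generation.
        next_pop = [ind[:] for _, ind in scored[:n_elite]]
        while len(next_pop) < pop_size:
            child = crossover(tournament(scored), tournament(scored))
            next_pop.append(mutate(child))
        population = next_pop
    best = min(population, key=fitness)
    return best, fitness(best), history
```

A call such as `run_ga(lambda x: sum((v - 0.3) ** 2 for v in x), [(0.0, 1.0)] * 3)` returns the best individual, its fitness, and the per-generation best and mean fitness. Because of elitism, the best fitness never deteriorates from one generation to the next.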

Problem Formulation by Genetic Algorithms
The use of GA in this document allows an optimized determination of the parameters of the FAVAD concept.
The fitness function minimizes the difference between the simulation results and the measured data transmitted by the k sensors operated in the network. The objective function f(x) is minimized as the following sum of absolute differences:

$$\min f(x) = \sum_{j=1}^{k} \left| P_j^{obs} - P_j^{sim} \right|, \qquad f(x) \le \varepsilon$$

where the vector $x = (C_1, \dots, C_n, N1)$, of dimension n + 1, is the chromosome consisting of the parameters of the FAVAD concept, and $P_j^{obs}$ and $P_j^{sim}$ are respectively the observed and simulated pressures at the nodes of the k pressure sensors placed in the network. The formulation f(x) ≤ ε refers to the minimization principle of the objective function. The error threshold ε is the smallest residual considered acceptable by the user. The optimization process ensures that the following conditions are met:

• Mass conservation (the incoming flow equals the outgoing flow): $\sum_i Q_{T,i} = Q_{obs}$, where $Q_{T,i}$ is the flow at node i and $Q_{obs}$ is the observed flow. Every generation must respect this constraint.
• Lower and upper bounds of the unknown parameters: $C_{min} \le C_i \le C_{max}$, with $C_i \in [0, 1]$ (principle of flow through an orifice), and $N1_{min} \le N1 \le N1_{max}$, with $N1 \in [0.5, 2.5]$. The emitter exponent may be equal to 0.5 for a leak through an orifice of fixed area, but in real leakage situations in water networks it may vary in a range of 0.5 to 2.5 [8].
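The objective and its constraints can be written compactly as below. This is a hedged sketch: `simulate` is a hypothetical callback standing in for a hydraulic run (it is assumed to return the simulated sensor pressures and the nodal flows), and handling mass conservation through a penalty term is an assumption of this example, since the document only states that the constraint must be respected.

```python
def pressure_misfit(observed, simulated):
    """Sum of absolute differences between observed and simulated sensor pressures."""
    return sum(abs(o - s) for o, s in zip(observed, simulated))


def within_bounds(chromosome, c_bounds=(0.0, 1.0), n1_bounds=(0.5, 2.5)):
    """Check the bounds on the chromosome (C_1, ..., C_n, N1)."""
    *coeffs, n1 = chromosome
    return (all(c_bounds[0] <= c <= c_bounds[1] for c in coeffs)
            and n1_bounds[0] <= n1 <= n1_bounds[1])


def fitness(chromosome, observed, simulate, inflow, penalty=1e3):
    """Pressure misfit plus a penalty on the mass-balance error.

    `simulate(chromosome)` is hypothetical: it must return a pair
    (pressures at the sensors, total flow at every node).
    """
    if not within_bounds(chromosome):
        return float("inf")  # infeasible individuals are never selected as best
    pressures, nodal_flows = simulate(chromosome)
    imbalance = abs(sum(nodal_flows) - inflow)
    return pressure_misfit(observed, pressures) + penalty * imbalance
```

With observed pressures `[31.0, 40.0]`, simulated pressures `[30.0, 41.0]` and a perfectly balanced flow, the fitness is 2.0; the search would stop once this value drops below the user-defined threshold ε.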
A decision support tool (Optim_Detect) has been developed in Matlab for pre-locating leaks; it is interfaced with EPANET through the EPANET Toolkit's dynamic link library (DLL) [9].
The input data of the tool for the calibration of the model are:

1. An input file (.inp) of the hydraulic model concerned, in its initial state (before calibration).
2. The time that corresponds to the Minimum Night Flow (MNF), used to calculate the leak quantities at each node at this time before the data are imported into the GIS for visualization of the results.
3. The GA parameters (population size, number of generations, number of elites, crossover rate ...) according to the user's choice.

The output data are:

• A distribution of the emitter coefficients over the nodes of the network and a single emitter exponent for the whole network, in accordance with the definition of the fitness function and the constraints.
• The evaluation of each population, which reports two fundamental quantities: the "best fitness", which is the best fitness value obtained, and the "mean fitness", which is the average fitness of the population. The optimum is reached when the mean fitness converges to the best fitness.

The procedure summarizing the main phases of this process is presented in Figure 2.
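One simple way to hand a candidate solution to EPANET is to write the emitter coefficients into the `[EMITTERS]` section of the input file and the exponent into the `Emitter Exponent` entry of `[OPTIONS]`. The sketch below only formats these text fragments; it is an illustration, not a description of how Optim_Detect works (the tool uses the Toolkit DLL, where the corresponding properties are, to our understanding, `EN_EMITTER` and `EN_EMITEXPON`). The node IDs and function names are hypothetical.

```python
def emitters_section(coefficients):
    """Format an EPANET [EMITTERS] section from {junction_id: coefficient}.

    Nodes with a zero coefficient are skipped, since they carry no emitter.
    """
    lines = ["[EMITTERS]", ";Junction\tCoefficient"]
    for node_id, c in coefficients.items():
        if c > 0:
            lines.append(f"{node_id}\t{c:.4f}")
    return "\n".join(lines)


def emitter_exponent_option(n1):
    """Format the network-wide emitter exponent line for the [OPTIONS] section."""
    return f"Emitter Exponent\t{n1}"


if __name__ == "__main__":
    # Hypothetical junction IDs with coefficients in the range reported in this study.
    print(emitters_section({"J1": 0.010, "J2": 0.0, "J3": 0.032}))
    print(emitter_exponent_option(1.1))
```

Writing text fragments keeps the example free of external dependencies; an actual coupling would set the values in memory through the Toolkit and rerun the hydraulic solver for each individual of the population.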

Case Study
The study area K117 is located in the center of Algiers, the capital, and has a population of 90,000 inhabitants. The percentage of Non-Revenue Water (NRW) is 47%, with a linear loss index (LLI) of 140 m³/day/km. The network consists of 1380 pipes with a total length of 87 km and has 1242 nodes. Pipe diameters range between 80 and 800 mm. The network pressure is controlled by a modulating valve of 800 mm diameter, which delivers the pressure according to the consumption profile. The main materials of this network are galvanized steel, ductile iron and high-density polyethylene. Although the network is composed of several materials, as a first approach the roughness is taken to be identical for all pipes and is set arbitrarily at 0.1 mm on average. This value is representative of the age and the structural state of the network. The roughness is not used as a calibration parameter of the model in this study.
The K117 distribution network is composed of 8 sectors, each equipped with a pressure sensor (mpi) located in the center of the sector to improve representativeness while ensuring a secure location. At the entrance of the network there are a control valve and a flowmeter (fm), whose data, at 5-min intervals, are uploaded daily and archived in a Long-Term Database (LTBD) via the remote management and supervision system TOPKAPI.

Results and Discussions
The calibration of the model (flow and pressure) was successfully carried out using the Optim_Detect tool. It was tested and validated by minimizing the fitness function, comparing the measured pressure and flow rate values with the results of the simulation.
During the calibration process, the coefficients Ci and the emitter exponent N1 were adjusted as calibration parameters for the flow rate and for the 8 pressure measurement points. The calibration results are presented in Figures 3 and 4.
The Optim_Detect tool distributes the leakage coefficients (Ci) over the network nodes and determines the emitter exponent N1 through optimization by Genetic Algorithms. The magnitude of the leak at each node of the network at the time corresponding to the MNF is calculated with Equation (2). The simulation yields emitter coefficients Ci between 0 and 0.032 and an emitter exponent N1 = 1.1. The use of 8 pressure measurements allowed the location of 36 real invisible leaks; these leaks are located in three areas and vary between 0.78 and 1.98 m³/h. Figure 5 and Table 2 respectively represent the results of the spatial distribution of leaks at the network nodes on a geographic information system (GIS) as well as the quantification of these leaks on each node of the network.
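As a quick consistency check on the figures reported in this document (36 leaks totalling 57 m³/h, each between 0.78 and 1.98 m³/h), the mean leak size can be worked out. The symbol $\bar{q}$ is introduced here only for illustration.

```latex
\bar{q} = \frac{57\ \mathrm{m^3/h}}{36} \approx 1.58\ \mathrm{m^3/h},
\qquad 0.78 \le 1.58 \le 1.98
```

The mean therefore falls inside the reported range, toward its upper end.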

Conclusions
This work has developed a method for the continuous pre-location of leaks in a water supply network. The procedure is based on an optimization by Genetic Algorithm (GA) of the parameters of the FAVAD concept. The objective is to minimize the gap between the simulated and measured data. This optimization required interfacing a hydraulic model with a Genetic Algorithm (GA) and a Geographic Information System (GIS). The approach can be applied in networks where NRW is very high. The losses are calculated during the night period (2-4 h), when a high minimum night flow Q(t)MNF combined with excessive pressure raises the suspicion of leakage, which is then assessed by applying the FAVAD concept.
The Optim_Detect tool is proposed to target critical points of leakage, thereby helping utilities in their efforts to reduce physical losses. The identification does not always pinpoint the exact location of the leak, but it significantly reduces the uncertainty and thus allows leak detection teams in the field to obtain better detection rates more quickly in areas where leaks have been pre-located.
Author Contributions: K.S. contributed to the implementation of the research, developed the theoretical methodology, performed the simulations, analyzed the results and wrote the manuscript; A.S. provided extensive comments to improve the manuscript; M.Z. assisted in the development of the Optim_Detect tool.
Funding: This research received no external funding.

Figure 1. Modeling a water leak in EPANET: (a) case of additional demand; (b) case of flow through a valve.

2. Method 2: The Leak Depends on the Pressure (FAVAD Concept)

Figure 5 and Table 2 respectively represent the results of the spatial distribution of leaks at the network nodes on a geographic information system (GIS) as well as the quantification of these leaks on each node of the network.

Table 2. Quantification of the leaks.