Comparison of Searching Behaviour of Three Evolutionary Algorithms Applied to Water Distribution System Design Optimization

Over the past few decades, various evolutionary algorithms (EAs) have been applied to the optimization design of water distribution systems (WDSs). An important research area is to compare the performance of these EAs, thereby offering guidance for the selection of the appropriate EAs for practical implementations. Such comparisons are mainly based on the final solution statistics and, hence, are unable to provide knowledge on how different EAs reach the final optimal solutions and why different EAs performed differently in identifying optimal solutions. To this end, this paper aims to compare the real-time searching behaviour of three widely used EAs, which are genetic algorithms (GAs), the differential evolution (DE) algorithm and the ant colony optimization (ACO). These three EAs are applied to five WDS benchmarking case studies with different scales and complexities, and a set of five metrics are used to measure their run-time searching quality and convergence properties. Results show that the run-time metrics can effectively reveal the underlying searching mechanisms associated with each EA, which significantly goes beyond the knowledge from the traditional end-of-run solution statistics. It is observed that the DE is able to identify better solutions if moderate and large computational budgets are allowed due to its great ability in maintaining the balance between the exploration and exploitation. However, if the computational resources are rather limited or the decision has to be made in a very short time (e.g., real-time WDS operation), the GA can be a good choice as it can always identify better solutions than the DE and ACO at the early searching stages. Based on the results, the ACO performs the worst for the five case study considered. The outcome of this study is the offer of guidance for the algorithm selection based on the available computation resources, as well as knowledge into the EA’s underlying searching behaviours.


Introduction
Evolutionary algorithms (EAs) have been widely used to deal with many water resources optimization problems over the past few decades, with purposes ranging from system design, operation to management [1,2]. Within the domain of the water distribution system (WDS) design optimization, EAs have been gained even more popularity relative to other water resources problems [1,3,4]. This is mainly because the WDS design is a typically difficult civil engineering problem as they possess very complex, highly nonlinear and discrete search spaces [2,5]. Compared with the traditional optimization methods, such as liner programming [6,7] and non-liner programming [8,9], EAs have better ability to explore the complex search space associated with the WDS design problem, and hence are more likely to identify more reasonable and effective high-quality solutions [10,11].
While many different EAs are available, the genetic algorithms (GAs), the differential evolution algorithm (DE), and the ant colony optimization (ACO) can be considered as three typical representatives that have been widely used to identify optimal solutions for the WDS design problems [1,5,12,13]. This could be supported by that the fact publications in literature regarding these three EAs dominate the other types of EAs [12].
The GA is a computational model that simulates the natural selection and genetic mechanism of Darwin's theory of biological evolution. According to available literature and out best knowledge, a GA can be considered as the first evolutionary algorithm used to deal with water resources problems [14], and it is now the most widely used type of EAs [15,16]. For example, Reca et al. used GAs to optimally design complex looped irrigation water distribution networks [17], and Mambretti et al. employed GAs to optimize the design of Milan water supply network [18]. Li et al. proposed a GA to solve optimization design problem for water distribution system that considered multiple objectives [19].
In terms of the DE, Suribabu probability first proposed and described how to apply the DE algorithm to deal with WDS design problems [20]. Subsequently, Zheng et al. applied decomposition and multi-objective optimization methods combined with a DE to handle complex WDS design problems, such as multi-water supply systems, to improve the optimization efficiency [16,21]. More recently, Bi and Dandy combined artificial neural network (ANN) and DE with a local search strategy and applied it to three typical WDS design problems. Results showed that their method was able to improve the computing efficiency while ensuring the quality of the obtained solutions.
ACO simulates the process by which the ant colony finds the short distance from the ant's burrow to the target food step by step through the pheromone concentration [22]. ACOs have been applied to solve various types of water resources problems such as environmental flow management [23,24], optimal crop and water allocation [25,26], power plant maintenance Scheduling [27]. How to use ACOs to solve the WDS design problem has been described in detail in article [22]. Subsequently, Zecchin et al. conducted parameter study on ACO and gave deterministic and semi-deterministic expressions of related parameters when ACO was applied to WDS design problems [28]. Recently, Zecchin et al. analysed the performance of four ACO algorithms using metrics [29], and Zheng et al. proposed an innovative adaptive convergence-trajectory controlled ant colony optimization algorithm for WDS design optimization [30].
Given that many different EAs are available for WDS design problems, it is natural to compare their searching ability, aimed to offer guidance for the selection of appropriate EAs for practical implementations. For instance, El-Ghandour et al. compared five types of EAs through efficiency rate metrics and convergence speed [3]; and Monsef et al. used precision and computation time to characterize the performance of three EAs [4]; More recently, Yazdi et al. compared the performance of four EAs in the context of multi-objective WDS design optimization [31]. These comparisons mainly focused on the final solution statistics (solutions in the end of EA runs) of the algorithms applied to the benchmark WDS design problems. While such a comparison is straightforward and meaningful, it is unable to provide knowledge on how the search mechanisms associated with different EAs affect the searching quality and how the search quality varies as a function of the searching time (i.e., run-time searching behaviour) [12]. Insights into the EA's run-time searching behaviour is critical not only for providing guidance for selecting suitable EAs based on available computational budgets, but also can building fundamental knowledge regarding future EA design and developments [29]. As Maier et al. pointed out in a review paper, building and refining a basic understanding on how EA works is an important research focus [1].
In recognizing the potential benefit of comparing the run-time searching behaviour of EAs, Zecchin et al. proposed several real-time metrics to assess the performance of four ACO variants [29]. Through these real-time measures, the searching characteristics of four ACO variants applied to different WDSs have been revealed [29]. These metrics provided a new way to identify and quantify the different stages of the algorithm in exploring the solution space and their related performance at these stages. Subsequently, Zheng developed several run-time metrics to measure the behaviour of four DE variants applied to three complex WDS design problems [12]. The behavioural results validated the feasibility of the metrics, and revealed the search characteristics of each DE variant in the search process. More recently, Zheng et al. made the first attempt to characterize the run-time searching behaviour of three multi-objective evolutionary algorithms (MOEAs) applied to the design of six WDSs [32]. The results showed that, through the identification of metrics, the underlying search characteristics of each multi-objective evolutionary algorithm have been successfully identified, which improved the understanding on how different operators affected the search behaviour of MOEAs.
The studies mentioned above have made significant contributions in exploring the underlying searching mechanisms associated with EAs with the aid of run-time metrics. However, these studies focused either on the run-time searching behaviour of variants of a typical EA (e.g., ACO or DE), or on the comparison of the run-time searching behaviour of MOEAs. There are surprisingly almost no efforts have been carried out to compare the run-time searching behaviours of different EA types in the single objective space. To this end, this study aims to use a series of run-time metrics to reveal the searching characteristics of the three widely used EAs in the WDS design domain, which are the GA, DE and ACO. These three EAs are applied to five selected benchmark WDS design problems with various scales and complexities. It is anticipated that the results of the run-time searching behaviours of the three EAs can provide practical guidance for the EA selection, as well as offer insights into the searching properties of each EA.
The rest of this article is paper as follows. The next section briefly introduces the three algorithms investigated in this study, followed by the definition of the real-time search behaviour metrics used in this article. The next step is to analyse and discuss the results. In the end, the observations are summarized and the conclusion is drawn. Figure 1 is the overall method of this study. As shown in this figure, three different types of EAs are considered. To better reflect the performance of the algorithms applied to different types of WDS design problems, the three EAs are applied to five different types of WDS design problems, with the number of decision variables ranging from 34 to 476. In order to better understand the search characteristics of three EAs, the search behaviour of the runtime algorithm is analysed. The information of each component of figure 1 is described in details in the following subsections. Results  Solution quality  Convergence properties

Evolutionary Algorithms and Their Parameterization
The GAs used in this study involve an integer coding, two-point crossover, bitwise mutation and constraint tournament selection following the previous papers [17,33,34], where the crossover, mutation and the selection operators are used to drive the searching towards optimal solutions. The standard differential evolution (DE) algorithm includes three operators, which are mutation, crossover and selection operators. The DE differs significantly from a GA in the mutation process whereby the mutant solution is generated by adding the weighted difference between several random population members to another random member [12]. The ACO is a population-based meta-heuristic analogically derived from the foraging behaviour of a colony of ants [22]. The algorithm is used in this paper as it has shown a better performance than other ACO variant [29].
The parameterization strategy of each EA is taken from previous studies [28,30,34,35] and the resultant parameters are outlined in Table 1. It is noted that the searching behaviour of each EA may be affected by the parameterization of the controlling operators. The searching behaviour results presented in this study are conditioned on the default parameters that have been well calibrated for the three EAs in previous studies and, hence, these results are meaningful.

Case Studies
Within the optimization design of WDS, the decision variables are usually the diameters of the pipes in the system, and hence the number of decision variables equals to the number of pipes involved in the WDS [12]. The constraint conditions are often the minimum water supply pressure as well as the corresponding hydraulic equations that the system must meet [29].
For a given WDS optimization design problem with n pipes = [ 1 , 2 , ⋯ , ] , the objective is to determine the least-cost solution * while ensuring the solution meet design constraints ,where: subject to: where = the cost function of the optimization design problem, which is related to decision variable . It is noted that the cost considered in this study consists of the pipe material cost and the construction cost. =pipe diameter of pipe i=1,2,…,n; ( * ) = the nodal pressure vector for solution * ; = the vector of minimum pressure head that each node needs to meet; { } = a set of available pipe diameters for optimization design. In this study, EPANET 2.0 (Water Supply and Water Resources Division of the U.S. Environmental Protection Agency's National Risk Management Research Laboratory, Cincinnati, Ohio, U.S.) is used as a hydraulic solver to calculate network pressure in this study.
As shown in Table 1 and Figure 1, five case studies of varying sizes and complexities are considered in this paper where the subscript indicates the number of decision variables for each case study in Figure 1. Table 2 shows all details of the five case studies, including the number of decision variables (number of pipes), the pipe options, the size of the corresponding search space as well as the pressure head constraint. More details about the pipe network (such as the layout of the pipe network, the length of the corresponding pipe section, etc.) can be found in the literature given in Table 2.

Searching Measure Metrics
Five real-time search behaviour metrics are used to describe the real-time search behaviour of the three EAs applied, including the search quality metrics and the convergence metrics, with details shown below.

Search Quality Measure Metrics
For the optimization problem with a single objective, the quality of the solution refers to the closeness of the identified solutions to the known optimal solution during the searching process.
( ) is the most intuitive quality measure metric [29], where it represents the best solution found at each iteration [38,39], that is, the optimal solution at each generation (G) of the solution set f(x): This metric can be used to describe the change in algorithm performance as the number of iterations increases. Another quality measure metric is ( ), which represents the overall quality of the searching population. It can be expressed as: where ∈ � 1, , … , , � = the set of N solution vectors at generation G=1,2,…, .
The percentage of feasible solutions ( %) in the population at generation G can also be used as a quality metric. The feasible solution means that a solution meets the hydraulic constraints, i.e., the solution satisfies Equation (2). It can be expressed as: where: where � , � is the pressure head at each node of the WDS for the , solution, which is calculated using a hydraulic solver.

Convergence Measure Metrics
Zecchin et al. and Zheng identified the degree of convergence and population diversity in the decision space by using the average of the pairwise total hamming distances in the decision space [12,29]. It can be described as: where ( − 1) = the total number of pairs of the candidate solutions; � ⃗ , , ⃗ , � is a topological distance function used to calculate the hamming distance between the two solutions of ⃗ , and ⃗ , at generation G, and it can be expressed as: can be expressed as: (18) (10) In addition to ( ), Zheng proposed another metric, which is the percentage of the best solution to the population in each generation [12]. It can be mathematically expressed (20) (11) where ⃗ * is the best solution in the decision space identified by the EA at generation G. If ���⃗ , = ⃗ * , = 1;otherwise, = 0. This metric is used to measure the selection pressure during the algorithm's search process, and the selection pressure can directly affect the quality of the best solution and the time required for the search. When the value of % is slowly increasing or fluctuating slightly, the algorithm selection pressure is small, and the search space can be widely searched, this meaning that better solutions can be identified with a high likelihood. In contrast, if the value of % increases with a larger rate as the number of iterations increases, the selection pressure of the algorithm is large and it is easy to fall into a local optimal solution. Figures 2-4 show the results of search quality metrics when the GA, DE and ACO are applied to five case studies. It is noted that for the relatively small case studies (HP and EHP), 50 runs are performed with different random number seeds, but 10 runs are performed for the ZJ, BN and RN case studies. The averaged performances across different runs as well as the best run are presented to enable the analysis. From the change curve of the metrics with the increase of the number of iterations (G), a pattern applicable to all the three EAs can be summarized: the solution is rapidly improved during the early searching stage, followed by gradual enhancements with a slow rate. Interestingly, during the early stage in the search, compared to the other two algorithms, GA can find better solutions for the five case studies as shown in Figures 2 and 3. However, as the number of iterations increases (during the middle-later searching stages), the solutions found by DE gradually outperform GA and ACO. Generally, the ACO performs the worst for all the case studies.

Search Quality Results
For 164 , 454 , 476 case studies, the performance differences of the three EAs in identifying optimal solutions are obvious. For 476 and 454 case studies, the solutions found by the DE algorithm are significantly better than the other two EAs during the later searching stages, which is especially the case for the 476 case study. More specifically, the solution quality of the GA and ACO was rapidly improved before 2000 generations, followed by a very slow further improvement later on. However, the DE algorithm can consistently improve its searching quality throughout the searching process. This implies that the DE is more robust in exploring the searching space relative to the GA and ACO.  The ability of the three EAs in identifying feasible solutions is outlined in Figure 4. It is seen from this figure, the DE overall possesses the best ability in locating the feasible solutions. This is supported by the observations that the DE can drive its population towards the feasible regions of the searching space with a fast rate for the four case studies, but not for the EHP problem. This demonstrates that the searching behaviours of the EAs can be also affected by the fitness landscape properties of the case studies. For the GAs, it is found that they can identify the feasible regions quickly, but they always maintain a small proportion of the searching efforts in the infeasible regions. This can be related to the mutation strategy used by the GA. The ACO identify feasible solutions with a slower rate compared with the DE, but it mainly focuses on the feasible region during the moderate-later searching stages.  6 show the results of convergence metrics when GA, DE and ACO were applied to the five case studies. As shown in these figures, at the initial stage of the search, the ( ) values of the three EAs decreased at a faster rate, corresponding to the rapid improvement of the quality of the solution in Figure 2. This shows that the search is effective during the fast convergence phase of the algorithm. As shown in Figure 5, the convergence rate of the three EAs is slowing down as the scale and complexity of the case study increase. In terms of convergence speed, GA consistently has the fastest convergence speed at the initial searching stage for all case studies, but ACO converges quickly during the middle-later stages. DE exhibits the mild convergence rate in the decision space as shown in Figure 5. In addition, it can be observed that as the scale and complexity of the case study increase, the differences between the convergence speeds of the three algorithms becomes more apparent. It is worth noting that in the late search period, the three EAs exhibit different search behaviours. In the later stage of the search, the ( ) of the GA keeps a high value and fluctuates slightly. This means that the GA's searching still maintains a high population diversity in the later stage. Combining with results in Figure 4, it can be deduced that GA still maintains a certain amount of search in the infeasible solution regions in the later stage of the search. As shown in Figure 5, during the later stage of the search, the DE focuses on the searching in the surrounding regions of the identified best solution. This is followed by that the ( ) converges to 0 as well as that the %( ) = 100 in Figure 6, implying that all individuals in the population locate the identical best solution. However, this is not the case for the RN case study due to its greatly increased complexity and, hence, the DE is unable to entirely converge for the given computational budgets.

Figures 5 and
The convergence speed of ACO is least affected by the increasing complexity of case studies. In the later stage of the search, the ( ) value of ACO stabilized at a small value, and the population maintained a certain diversity. It is worth mentioning that in the earliest stages of search, ACO maintains a large ( ), which means that during this period, ACO conducts broad search on the search space. Combining the higher %( ) values in Figure 4 and the fluctuating ( ) values in Figure 2, it can be deduced that at this stage, the search of ACO is concentrated at the boundary between the feasible and infeasible solution regions [29].
As shown in Figure 6, the GA's %( ) value remains at a very small value and fluctuates in the entire search process until the maximum number of iterations. After a sufficient search, DE forces the entire population to find the best solution. This means that in the subsequent search phase, DE is under the greatest selection pressure and GA is the smallest. This is related to their selection strategy and mutation strategy. It is noted that the ACO consistently shows the worst performance in terms of the averaged metric values across multiple runs. This indicates that the ACO's poor performance is more likely to be caused by its underlying searching mechanisms, instead of the randomness of the initial solutions.  Table 3 outlines the solution statistics of the three algorithms, where the current best known solution for each case study is also presented. It is seen from this table that the DE is able to find the current best known solutions for the HP and EHP, and even better solutions for the RN case study. For the ZJ and BN case studies, the solutions found by the DE are very close to their current best known optimal solutions. The ACO algorithm performs the worst in terms of the end-of-run solutions, and the ACO shows slightly worse performance than the DE algorithm. Table 3. Solution statistics of the three algorithms, where the percentage indicates the difference between the best solutions found by the three algorithms in this study and the current best known solutions in literature.

Conclusions
This paper makes the first attempt to compare the real-time searching behaviour of three widely used EAs within the WDS design domain with the aid of five metrics regarding searching quality and convergence properties. These are GAs, DE and ACO, and they are applied to five WDS design problems with increasing complexity. The results of the metrics have verified the applicability of real-time search behaviour metrics to different types of EAs. In addition, this study has proved that compared to traditional methods, run-time search behaviour metrics can deeply reveal the algorithm's searching characteristics. The main findings and the implications based on this study are given below. 1. The three EAs have exhibited significantly different searching behaviours in both the solution space and decision space. Such differences can be related to their underlying model structures and the searching mechanisms. This demonstrates that the proposed run-time metrics are effective in revealing the searching properties of the EAs. 2. From the obtained real-time metric results, it can be concluded that the DE has an overall best ability to locate feasible solutions as well as high quality solutions for the WDS design problems. The ACO performed overall the worst in identifying the optimal solutions as observed in this study. This provides a guideline for the selection of the algorithms for WDS design optimization. 3. If the computational budget is rather limited, the GA can identify better solutions than the DE as shown in this study. This suggests that the GA is promising for the case when the optimization solution needs to be identified in a short time such as real-time WDS operation and management.