Development of a Binary Model for Evaluating Water Distribution Systems by a Pressure Driven Analysis (PDA) Approach

Investigating Water Distribution Networks (WDNs) is a challenging task because of the unpredictable and uncertain conditions encountered in water engineering. When a pipe failure occurs in a WDN, shut-off valves are activated to isolate the broken pipe so that it can be repaired. Under these new conditions, the hydraulic parameters of the network change because the topology of the entire system changes. If the head becomes inadequate, Pressure Driven Analysis (PDA) is the correct approach for evaluating the performance of the network. In the present study, the water distribution system was therefore evaluated under pressure-driven conditions for 100 different scenarios and then modeled using a type of neural network called the Group Method of Data Handling (GMDH) as a stochastic technique. For this purpose, three notable parameters, namely the base demand, the pressure, and alpha (the percentage of effectively supplied flow), were calculated from PDA-based simulations of the water distribution network of Praia a Mare in Southern Italy. In the second stage, these output parameters were used in a binary classification model. The results showed that the GMDH algorithm can be applied as a powerful tool for modeling water distribution networks.


Introduction
Water Distribution Networks (WDNs) are among the most significant urban infrastructures. Their proper performance raises the level of water supply service and customers' satisfaction. Hence, correct management of WDNs is necessary to obtain a reliable, high-performance water network. When a failure leaves the pressure at some nodes inadequate, service to the users cannot be guaranteed [1].
Different scenarios can be modeled with the EPANET tool, and more details on this capability can be found in previous studies (e.g., [2][3][4]). EPANET simulates hydraulic networks in both steady-state and extended period analysis and validates system behavior by assuming different input parameters. The EPANET-Matlab toolkit interfaces EPANET with Matlab (R2018, MathWorks) so that different hydraulic scenarios can be evaluated, including with pressure driven analysis (PDA) based models [5]. Zheng et al. investigated the least-cost design of looped Water Distribution Systems (WDSs) by proposing a novel optimization approach. Their approach combined a differential evolution algorithm with a traditional deterministic optimization technique and was tested on four looped WDSs. Finally, they proposed an approach with high

Methodology
Recently, artificial intelligence models have been successfully used to model water network systems [19,20]. In this work, EPANET simulations were therefore combined with the group method of data handling (GMDH) type of neural network. Figure 1 shows the flowchart of the research steps. As shown in Figure 1, the water distribution system of the case study was evaluated first, and then 100 different scenarios were simulated with EPANET under two conditions, namely steady-state and extended period simulation. Finally, the results were modeled using the GMDH type of neural network as a stochastic technique.
Appl. Sci. 2020, 10, 3029

Pressure Driven Analysis (PDA)
When a pipe failure occurs in a WDS, shut-off valves are used to isolate the broken pipes. The new topology modifies the circulating flow, and the pressure at each node changes from its value under normal operating conditions. The new system must therefore be analyzed: if the nodal head becomes inadequate, the whole system becomes inefficient and the requested nodal demand cannot be delivered to all users.
The demand effectively delivered at a node depends on the real head, which is the sum of the elevation (z) and the piezometric height (p/γ), i.e., the ratio between pressure (p) and specific weight (γ). A node delivers all of the requested demand (Base Demand, $Q_{BD}$) if the head is higher than the service head $H_s$, which depends on the ground level and on the building height ($H_b$) according to Equation (1):

$$H_s = z + \left(\frac{p}{\gamma}\right)_{\min} = z + H_b + P_{ms} + P_p + P_D \qquad (1)$$

where:
• $z$ is the elevation of the ground level;
• $H_b$ is the height of each supplied building;
• $(p/\gamma)_{\min}$ is the minimum piezometric head necessary to serve the users; it is related to the height of the building $H_b$;
• $P_{ms}$ is the minimum pressure necessary at each point of the building, usually 5 m;
• $P_p$ are the head losses along the riser column;
• $P_D$ are the head losses from the network node to the base of each building.
When the calculated head $H$ is lower than $H_s$, the system works in reduced mode and the effectively delivered demand $Q_{real}$ is lower than the Base Demand $Q_{BD}$ assumed necessary for customers' satisfaction. $Q_{real} = 0$ if the head is below $H_{\min}$, with:

$$H_{\min} = z + P_{ms} \qquad (2)$$

where $H_{\min}$ is the head necessary to serve users at ground level. The value of $Q_{real}$ can be calculated as:

$$Q_{real} = \alpha \, Q_{BD} \qquad (3)$$

The value of α is the ratio between $Q_{real}$ and $Q_{BD}$, and it represents the percentage of supplied flow. It can be calculated using the relations below [21], according to Figure 2:

$$\alpha = 0 \quad \text{if } H \le H_{\min} \qquad (4)$$

$$\alpha = \left(\frac{H - H_{\min}}{H_s - H_{\min}}\right)^{1/\beta} \quad \text{if } H_{\min} < H < H_s \qquad (5)$$

$$\alpha = 1 \quad \text{if } H \ge H_s \qquad (6)$$

where β can be calculated using a calibration model and is related to the head loss along pipes. Its value varies between 1.5 and 2, and a value of 2 is generally assumed.
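The demand reduction rule described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the commonly used power-law transition between the minimum head and the service head with exponent 1/β, and the function names, argument names, and the sample node (z = 24 m, minimum pressure 5 m, minimum piezometric head 35 m) are chosen here only for illustration.

```python
def supply_fraction(head, h_min, h_s, beta=2.0):
    """Return alpha, the fraction of the base demand actually delivered.

    Assumed rule: 0 below h_min, 1 above h_s, power law in between.
    """
    if h_s <= h_min:
        raise ValueError("service head must exceed the minimum head")
    if head <= h_min:
        return 0.0
    if head >= h_s:
        return 1.0
    return ((head - h_min) / (h_s - h_min)) ** (1.0 / beta)


def real_demand(q_bd, head, h_min, h_s, beta=2.0):
    """Delivered demand Q_real = alpha * Q_BD."""
    return supply_fraction(head, h_min, h_s, beta) * q_bd


# Illustrative node (hypothetical values, not taken from Table 1).
z = 24.0                 # ground elevation, m
h_min = z + 5.0          # minimum head: elevation plus minimum pressure
h_s = z + 35.0           # service head: elevation plus minimum piezometric head

for head in (25.0, 36.5, 60.0):
    alpha = supply_fraction(head, h_min, h_s)
    print(f"H = {head:5.1f} m -> alpha = {alpha:.2f}, "
          f"Q_real = {real_demand(1.2, head, h_min, h_s):.2f} l/s")
```

With β = 2 the transition is a square-root law, so a node whose head sits one quarter of the way between the two thresholds still receives half of its base demand.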

Group Method of Data Handling (GMDH)
Recently, artificial intelligence has been considered as an alternative to traditional methods for solving complex problems, and its use in a variety of engineering problems is increasing. Examples include fuzzy set theory and its applications [22,23], meta-heuristic algorithms [24][25][26][27][28][29][30][31], hybrid algorithms [32][33][34][35], and pattern recognition and machine learning [36][37][38][39][40]. The GMDH type of neural network, a subset of artificial intelligence, was proposed by Ivakhnenko [41], and it is well suited to training on network statistics. The method builds a set of neurons by connecting different pairs of inputs through a quadratic polynomial, also called the Ivakhnenko polynomial [42][43][44]. The resulting models are derived from the dataset and are self-regulating. Equation (7) shows the general form of the GMDH input-output map [45]:

$$y = f(x_1, x_2, x_3, \ldots, x_m) \qquad (7)$$

where $y$ is the output and $m$ is the number of input values $x_1, x_2, x_3, \ldots, x_m$. By combining the quadratic polynomials of all the neurons for a set of inputs $X = (x_{i1}, x_{i2}, x_{i3}, \ldots, x_{im})$, the approximate function $\hat{f}$ with output $\hat{y}$, which has the least possible error compared to the output $y$, is obtained in the following form:

$$\hat{y}_i = \hat{f}(x_{i1}, x_{i2}, x_{i3}, \ldots, x_{im}) \qquad (8)$$

The GMDH has several layers. The input data are given to the first layer and, after being processed and combined, they become the new inputs for the next layer. This procedure continues as long as layer (n+1) gives a lower error than layer (n); once it no longer does, the calculations are stopped. Figure 3 shows the basic structure of the GMDH algorithm. The input data sets are randomly divided into training and testing data sets, and m and n are the numbers of variables and observations, respectively.
It is worth mentioning that, although several classification methods exist, the GMDH algorithm was selected in this work for two main reasons. First, it had not been used in previous studies. Second, pattern identification and binary classification are among the most notable capabilities of GMDH algorithms in different disciplines [46].
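To make the building block of the network concrete, the sketch below fits a single GMDH neuron, i.e., the six coefficients of the quadratic (Ivakhnenko) polynomial for one pair of inputs, by ordinary least squares. It is a minimal pure-Python illustration under stated assumptions: the normal equations are solved by Gaussian elimination, the synthetic data and the 0.5 threshold used to turn the output into a binary label are hypothetical, and a full GMDH would repeat this fit for many input pairs and layers, keeping only the best neurons.

```python
import random


def quad_features(u, v):
    """Terms of the quadratic polynomial: 1, u, v, u*v, u^2, v^2."""
    return [1.0, u, v, u * v, u * u, v * v]


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_neuron(us, vs, ys):
    """Least-squares coefficients of one neuron via the normal equations."""
    rows = [quad_features(u, v) for u, v in zip(us, vs)]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(ata, aty)


def predict(coeffs, u, v):
    """Evaluate the neuron's polynomial for one input pair."""
    return sum(c * t for c, t in zip(coeffs, quad_features(u, v)))


def classify(coeffs, u, v, threshold=0.5):
    """Binary label from the neuron output (threshold is an assumption)."""
    return 1 if predict(coeffs, u, v) >= threshold else 0


# Synthetic check: recover a known polynomial from noise-free samples.
random.seed(0)
us = [random.uniform(-1.0, 1.0) for _ in range(40)]
vs = [random.uniform(-1.0, 1.0) for _ in range(40)]
true_coeffs = [0.5, 1.0, -2.0, 0.3, 0.0, 0.7]
ys = [predict(true_coeffs, u, v) for u, v in zip(us, vs)]
coeffs = fit_neuron(us, vs, ys)
print(max(abs(c - t) for c, t in zip(coeffs, true_coeffs)))
```

In a layered GMDH, the outputs of the retained neurons would become the inputs of the next layer, and growth would stop once the error on held-out data no longer decreases.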

Case Study
The methodology was applied to the real water network of Praia a Mare (CS, Italy), a coastal city in the northern part of Calabria. The network, consisting of 73 pipes, 53 nodes, and 2 tanks, is shown in Figure 4. The total base demand in the network, in summer conditions, is 47.25 L/s for about 14,000 users. The minimum head $H_{\min}$ varies from 29 to 68 m and $H_s$ varies from 64 to 98 m, assuming $(p/\gamma)_{\min}$ = 35 m for each node. The water distribution system has a total length of 18,500 m, and the pipe diameters range from 125 to 250 mm. The tank volumes are about 600 m³ for tank 54 and 1500 m³ for tank 55.

Four scenarios were evaluated, depending on the possible failure of pipes 1, 2, and 73, which leave tanks 54 and 55. The first scenario assumes that all three pipes are functioning (scenario OOO). The other scenarios assume the failure of pipe 2 (COO), pipe 1 (OCO), and pipe 73 (OOC).
For each scenario, a steady-state analysis and an extended period simulation were performed, yielding 25 data sets: one for the steady-state simulation and 24 for the extended period simulation with a one-hour time step.
The PDA furnishes the real demand at each node when the calculated head H is lower than $H_s$ and α < 1. For each node, the analysis shows whether or not it works in reduced mode, and this condition is used as a parameter in this analysis. Table 1 shows, as an example, the steady-state results for each scenario.
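The bookkeeping behind the 100 data sets can be sketched as follows. This is only an illustration of the counting and of the binary label: the tuple layout, the 0-based hour index, and the function name are assumptions made here, while the label convention (0 for a node with H < Hs, i.e., working in reduced mode) follows the one used for the confusion matrices later in the paper.

```python
# Scenario codes as named in the case study.
SCENARIOS = {
    "OOO": "pipes 1, 2 and 73 all functioning",
    "COO": "failure of pipe 2",
    "OCO": "failure of pipe 1",
    "OOC": "failure of pipe 73",
}
HOURS = 24  # one-hour time step in the extended period simulation

data_sets = []
for code in SCENARIOS:
    # One steady-state data set per scenario ...
    data_sets.append((code, "steady-state", None))
    # ... plus one data set per hour of the extended period simulation.
    for hour in range(HOURS):
        data_sets.append((code, "extended-period", hour))


def reduced_mode_label(head, service_head):
    """Return 0 if the node works in reduced mode (H < Hs), else 1."""
    return 0 if head < service_head else 1


n_extended = sum(1 for _, kind, _ in data_sets if kind == "extended-period")
print(len(data_sets), n_extended)
```

Four scenarios times 25 data sets give the 100 data sets used for modeling, of which 96 come from the extended period simulations.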

GMDH Modeling and Discussion
For modeling in this research, three effective parameters, namely the base demand, pressure, and alpha, were calculated by simulating 100 different data sets of the water distribution network of Praia a Mare with its 53 nodes. The nodal values were used as the input data set, and the network performance obtained from the PDA was used as the output data set.
Before modeling, two steps play a key role in obtaining an efficient model. First, a suitable performance index should be selected, according to the type of model, to evaluate the performance of an artificial intelligence model [47]. As previously mentioned, the GMDH algorithm is used in this study for binary classification in order to assess water distribution systems under insufficient pressure conditions. Hence, a confusion matrix, which is a suitable tool for measuring accuracy, is used to evaluate the models. The general form of a confusion matrix for a binary classification model with two classes is shown in Figure 5. Accuracy and error are then computed from the confusion matrix using Equations (9) and (10):

$$\mathrm{Acc} = \frac{TP + TN}{TP + FP + TN + FN} \qquad (9)$$

$$\mathrm{Error} = \frac{FP + FN}{TP + FP + TN + FN} \qquad (10)$$

Finally, it should be noted that determining the control parameters is an important phase before modeling, because they greatly improve the convergence and speed of the algorithm. There are often no specific relationships for these parameters, so they are chosen on the basis of recent studies, expert opinions, and trial and error [48][49][50][51].
In this work, the control parameters considered were the Selection Pressure (SP), the Maximum Number of Layers (MNL), and the Maximum Number of Neurons in a Layer (MNNL); some were selected on the basis of previous studies and then checked by trial and error. The SP is a dimensionless parameter that plays a key role in the sensitivity of the modeling error, and previous studies suggest a value of 0.6. The values of MNL and MNNL were determined by trial and error: the values 5, 10, and 15 were considered for MNL and the values 5, 10, 15, and 20 for MNNL, so that 12 models were constructed and evaluated. It is worth mentioning that the initial model was constructed for the first scenario (OOO) in steady-state conditions, with 75% of the data used for training and the rest for testing, following Looney's research [52]. The results of the binary classification for the first scenario (OOO) are shown in Table 2.
For determining the best structure of the developed binary classification model, a simple ranking method was used [53]. The training and testing accuracies were each ranked from 12 (highest accuracy) to 1 (lowest accuracy). For instance, the 2nd and 8th models had the same, highest accuracy on the training data set, so both were assigned a rank of 12. Moreover, the 8th model classified the testing data set with the highest accuracy among the models, 100%, which also earns a rank of 12. Finally, in the last column of Table 3, the training and testing ranks of each model were added.
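The model grid and the rank-sum selection can be sketched as below. The grid follows the values stated in the text (SP = 0.6, three MNL values, four MNNL values); everything else is an assumption made for illustration: the accuracies are hypothetical rather than those of Table 2, the enumeration order does not claim to match the paper's model numbering, and tied models are given the same (higher) rank, with the ranks below a tie skipped.

```python
from itertools import product

SP = 0.6
MNL_VALUES = (5, 10, 15)
MNNL_VALUES = (5, 10, 15, 20)

# Every combination of the two structural parameters: 3 x 4 = 12 models.
grid = [
    {"SP": SP, "MNL": mnl, "MNNL": mnnl}
    for mnl, mnnl in product(MNL_VALUES, MNNL_VALUES)
]


def rank_scores(values):
    """Rank so that the best value gets len(values); ties share a rank."""
    n = len(values)
    ordered = sorted(values, reverse=True)
    return [n - ordered.index(v) for v in values]


def total_ranks(train_acc, test_acc):
    """Sum of the training rank and the testing rank for each model."""
    return [a + b for a, b in zip(rank_scores(train_acc), rank_scores(test_acc))]


# Hypothetical accuracies for four models (not the values of Table 2).
train_acc = [0.95, 1.00, 0.90, 1.00]
test_acc = [0.85, 0.92, 0.92, 1.00]

totals = total_ranks(train_acc, test_acc)
best = max(range(len(totals)), key=totals.__getitem__)
print(len(grid), totals, best)
```

The model with the largest rank sum is retained, which rewards structures that do well on both the training and the testing data rather than on only one of them.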
According to Table 3, the 8th model, with MNL and MNNL equal to 15 and 10, respectively, has the highest total rank, so it can be concluded that it performs best in recognizing and predicting the system. The confusion matrices of the 8th model for the training, testing, and total data are shown in Figures 6, 7, and 8, respectively. According to Figure 6, the accuracy of the binary classification for the training data was 100%. In this case, H < $H_s$ for 21 nodes (label "0"), and all 21 were recognized and classified correctly. The proposed model also identified and classified all 19 nodes that did not work in reduced mode.
It is worth noting that the testing data set was classified with 92.3% accuracy by the proposed model (Figure 7). Consequently, the developed model classified all 53 nodes with 98.1% accuracy, which shows its high capability in identifying and classifying the performance of the distribution network.
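The reported accuracies can be cross-checked with a short calculation based on Equations (9) and (10). The sketch assumes that the 75% training share is rounded to whole nodes and that the 92.3% testing accuracy corresponds to a single misclassified node out of 13; that count is inferred here from the percentages, since the cells of Figures 7 and 8 are not reproduced in the text. Treating the nodes that are not in reduced mode as the positive class is likewise an assumption and does not affect the accuracy.

```python
def accuracy(tp, fp, tn, fn):
    """Equation (9): share of correctly classified samples."""
    return (tp + tn) / (tp + fp + tn + fn)


def error(tp, fp, tn, fn):
    """Equation (10): share of misclassified samples."""
    return (fp + fn) / (tp + fp + tn + fn)


N_NODES = 53
n_train = round(0.75 * N_NODES)   # 75% of the nodes for training
n_test = N_NODES - n_train        # the rest for testing

# Training set as described for Figure 6: 21 reduced-mode nodes and
# 19 other nodes, all classified correctly.
train = {"tp": 19, "fp": 0, "tn": 21, "fn": 0}
acc_train = accuracy(**train)

# Testing and total accuracy, assuming one misclassified node (inferred).
acc_test = (n_test - 1) / n_test
acc_total = (N_NODES - 1) / N_NODES

print(n_train, n_test)
print(f"{acc_train:.1%} {acc_test:.1%} {acc_total:.1%}")
```

Under these assumptions the split is 40 training and 13 testing nodes, and the three accuracies agree with the 100%, 92.3%, and 98.1% quoted for Figures 6 to 8.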
For a better understanding of the accuracy of the 8th model, the confusion matrices for the training, testing, and total data were also evaluated with three complementary performance indicators, namely precision, recall, and F1 score. Precision examines a particular class: it is the ratio between "TP" and the sum of "TP" and "FP". Recall, also called the true positive rate, is the ratio between "TP" and the sum of "TP" and "FN". The F1 score considers the effect of precision and recall together. For more information on these performance indicators, please refer to Faradonbeh et al. [54]. The results of this analysis are shown in Figure 9.
According to Figure 9, all performance metrics have values over 80%. Therefore, the 8th model performs well in classifying the data sets. Using the developed binary classification model, the network performance data calculated with the PDA approach for all 100 data sets were modeled, and the total accuracy for each data set was calculated. The accuracies of the four scenarios under steady-state conditions are shown in Figure 10, and Figures 11-14 show the total accuracies for the 96 data sets of the extended period analysis.
Figures 10 to 14 demonstrate the capability of the developed model in simulating and evaluating water distribution systems by the PDA approach. Based on these figures, the accuracy of the developed model was over 90% for the four scenarios under steady-state conditions and for all time periods of the extended period simulation, which comprise 96 data sets. Consequently, the results show that the GMDH algorithm can find and recognize, with a highly acceptable degree of accuracy, a meaningful relationship between the base demand, pressure, and α as inputs and the results obtained by the PDA as output. It should be noted that the results of the developed model are specific to the water distribution network of Praia a Mare and cannot be used directly for other water distribution networks. In addition, although the GMDH algorithm is highly efficient in solving complex problems, incomplete data may occur in some water distribution system problems. It is therefore suggested that other classification methods, such as the naive Bayes classifier, a supervised classification method, be used for these types of problems.
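The complementary indicators used in this section (precision, recall, and F1 score) can be written directly from their definitions. The sketch below is illustrative only: the F1 score is taken as the harmonic mean of precision and recall, which is its usual definition, and the confusion-matrix counts are hypothetical, since the cell values behind Figure 9 are not given in the text.

```python
def precision(tp, fp):
    """TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN), the true positive rate."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, chosen only to exercise the functions.
tp, fp, fn = 8, 2, 2
print(f"precision = {precision(tp, fp):.2f}")
print(f"recall    = {recall(tp, fn):.2f}")
print(f"F1 score  = {f1_score(tp, fp, fn):.2f}")
```

Because the F1 score is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is reported alongside accuracy for data sets whose two classes are not balanced.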

Conclusions
Ensuring the performance of urban water networks plays a key role in urban water management and leads to improved service and citizen satisfaction. In this work, the water distribution network of Praia a Mare in Southern Italy was selected as a case study and investigated using a pressure-driven analysis approach for 100 different data sets. PDA is a reliable method for evaluating the real performance of water networks. After simulating the water distribution system, three notable parameters, namely the base demand, pressure, and alpha, were calculated and used as input data for a binary classification model. The model was constructed with the GMDH algorithm and then applied to each scenario. The results show that all models had an accuracy of over 90%. It can therefore be concluded that the developed binary classification model has a high capacity for classifying and evaluating the water distribution system of the urban network considered, and hence it can be applied as a powerful tool for recognizing the performance of the water network. Given the importance of the topic, other artificial intelligence techniques can be used in future research.

Conflicts of Interest:
The authors declare no conflict of interest.