1. Introduction
The application of AI and ChatGPT in the water sector is gaining traction, offering promising solutions to longstanding issues in water distribution and management. Several studies have highlighted the potential benefits of integrating AI tools with human expertise to address complex challenges in urban planning and water resource management [
1,
2]. ChatGPT, in particular, holds potential for advancing knowledge democratization by simplifying complex information, enhancing training on sustainable water practices, and making critical data more accessible to a wider audience.
A notable example of ChatGPT’s use in the water industry is Klir, a platform that integrates the AI tool to enhance its water management capabilities. Klir leverages ChatGPT to monitor compliance by analyzing data against regulatory standards, streamline field data entry, simplify report generation, and assist with real-time data interpretation. This integration improves overall water management processes, making them more efficient and accurate. In Halifax, Klir has been implemented in wastewater management, where ChatGPT helps consolidate compliance and sampling data, offering utilities without expert knowledge a means of handling complex tasks [
3].
Furthermore, N. Rane [
4] identified additional applications of ChatGPT in water resource management, such as providing intelligent decision support for utilities, analyzing historical data to project future trends, and leveraging data from sensors and satellite imagery for real-time insights. ChatGPT can also aid in educating users about water conservation practices and can optimize water infrastructure operations by generating recommendations for efficient resource allocation. These applications contribute to the sustainability and resilience of water supply systems. The last version of ChatGPT has recently been applied in the context of water demand series forecasts with a one-week lead time in ten real-world districts for the Battle of Water-Demand Forecasting (BWDF) [
5,
6].
With reference to WDN, the concept of “divide and conquer” is widely employed through the implementation of District Metered Areas (DMAs), where water networks are divided into smaller areas. This allows utilities to monitor and manage the system more effectively by localizing and repairing pipe breaks and reducing water losses [
7]. Considering the dual-use paradigm, DMAs also enhance security by isolating potential contaminants within a single district [
8,
9]. However, defining DMAs has traditionally been an empirical process, often involving trial and error, which can risk compromising system performance due to reduced network redundancy [
10].
The partitioning of WDNs into DMAs offers multiple advantages, including but not limited to (1) enhanced leakage management, as leaks and losses can be detected and isolated more efficiently; (2) scalability, where smaller units allow for the optimization of each DMA individually without disrupting the entire network; (3) system simplification, as managing smaller clusters is easier than overseeing the entire distribution system; and (4) improved water quality control [
11].
From a conceptual point of view, the design of DMAs consists of a network partitioning, which can be effectively divided into two phases: clustering and dividing. In the clustering phase the WDN elements are divided into groups, the members of which share a characteristic, which can be either an operational feature or physical proximity. The algorithms used for clustering can be based on partition (e.g., k-means), hierarchy, density (e.g., DBSCAN), or others [
12]. With the clustering phase, the network elements will be grouped into several clusters, each representing a coherent subpart of the entire system that can be analyzed and managed independently. Once the clustering has been completed, the network is divided into sections, which would ultimately constitute the DMAs, through the dividing. In this phase, logical or physical boundaries among the clusters are created to form independently managed zones by inserting flow meters or gate valves, respectively.
The clustering phase plays a major role in the context of partitioning, as it is crucial to obtain optimal results in the second phase of dividing. The proposal of a suboptimal clustering unavoidably results in a poor dividing and in a poor final configuration of DMAs. Indeed, the minimization of the number of boundaries among clusters, which is one of the objectives of clustering, allows reducing the solution space for the optimization of the dividing phase, the purchase and maintenance cost of devices, and the alteration of hydraulic performance [
13,
14]. In the dividing phase, water quality, pressure or level constraints, head loss, etc., have to be checked with specific energy and hydraulic performance indices to preserve the minimum level of pressure for users. Furthermore, a balanced clustering in terms of number of nodes or water demands, or lengths of pipes, etc., is essential to have a good design of DMAs to compare water budget and water delivery.
Recently, software named SWANP (Smart Water Network Partitioning and Protection) was proposed on the market to automate the process of optimal design of DMAs. Developed in 2013, SWANP enables network partitioning based on advanced clustering and optimization algorithms using also complex network theory [
15].
As the AI and ChatGPT have been poorly tested so far in WDN applications, this paper presents their application in the context of network partitioning, with special focus on the clustering phase. Specifically, ChatGPT is tasked with clustering real WDNs using different algorithms. The performance of these algorithms will then be assessed using established performance indices. These indices provide intuitive and straightforward means of interpreting the obtained results. While the clustering approaches implemented here focus solely on the network’s topology, analyzing these indices offers valuable insights into how balanced the generated solutions are.
This research aims to explore whether ChatGPT can fulfill this task in a manner comparable to specialized hydraulic software, but with the added benefit of being accessible to non-experts. Furthermore, this paper investigates the ability of ChatGPT to analyze the results obtained, write a report, illustrate results with tables and graphs, and support researchers in writing a scientific paper.
The final objective is to assess the ability of ChatGPT to provide water utilities with an intuitive tool for network management, democratizing access to advanced water management techniques.
The remainder of the paper is organized as follows: Following a section presenting the general concept of clustering and its relationship with ChatGPT, the methodology and the results of this work are described. Then, a section presents the methodology validation, followed by the conclusions.
Clustering and ChatGPT
Clustering is a fundamental task in data analysis, used across diverse fields such as machine learning, bioinformatics, image processing, and market segmentation. While clustering algorithms have become widely used, the underlying complexity of finding optimal clusters often remains underestimated. At the core of this challenge is the fact that many clustering problems are classified as NP-hard. In computational theory, an NP-hard problem is one for which no known algorithm can solve all instances efficiently (in polynomial time). This difficulty arises from the vast number of possible ways to group objects into clusters, making it computationally unfeasible to explore every potential partition to find the optimal solution, especially for wide datasets. For instance, the popular k-means clustering problem, where the goal is to partition data into k clusters by minimizing the sum of squared distances between points and their respective cluster centroids, has been proven to be NP-hard, even for only two clusters [
16]. This means that, in practice, exact solutions are often computationally unattainable, particularly for large datasets. As a result, heuristic and approximated algorithms are typically employed to find near-optimal solutions within a reasonable timeframe.
Furthermore, in WDN problems, the complexity further increases. WDNs are typically represented as complex graphs with multiple constraints, such as pressure limits, flow capacity, and hydraulic connectivity, all of which must be respected. Clustering WDNs often requires multi-objective optimization, such as minimizing water losses, improving energy efficiency, or balancing pressure loads. Solving these objectives simultaneously adds significant complexity, making them intractable within polynomial time. In large WDNs, the high number of vertices (junctions) and edges (pipes) makes searching for an optimal partition computationally expensive, particularly when trying to maximize intra-cluster cohesion (e.g., similar demand flows or pressures) and minimize inter-cluster dissimilarity.
In the realm of cluster analysis, ChatGPT has recently emerged as a valuable tool for data analysts, offering interpretative assistance and insights across diverse fields. Indeed, ChatGPT offers promising benefits for automating cluster analysis, particularly in fields that require adaptive insights from large datasets. By utilizing natural language processing and machine learning capabilities, ChatGPT can assist in identifying underlying patterns and organizing data into clusters without manual intervention. This allows users to obtain insights on customer segmentation, trend analysis, or behavioral grouping efficiently. Additionally, ChatGPT can automate the interpretation of cluster results, translating complex data patterns into readable insights for business or research applications. Such capabilities enhance data-driven decision-making by making cluster analysis accessible to non-specialists and accelerating workflows, which reduces time and resources typically required for data exploration and interpretation. For example, A. Panjkota, A. Krajnović, and J. Perkov [
17] explore ChatGPT’s application in customer segmentation via k-means clustering, showing its capacity to aid marketing professionals in interpreting customer clusters effectively. This approach allows marketers to leverage clustered data to optimize targeting strategies. Y-Z. Cheng, T-H. Lai, T-W. Chien, and W. Chou [
18] compare ChatGPT with R for cluster analysis, demonstrating its efficiency in visualizing complex relationships, such as author collaborations and keyword co-occurrences in academic studies. Similarly, Z. Shapcott [
19] explores ChatGPT’s ability in guiding researchers through clustering applications and distance measures, enhancing usability for those integrating clustering algorithms in their workflows. These studies emphasize ChatGPT’s adaptability and potential to simplify cluster analysis, making it a versatile tool for data-driven insights. Finally, the methodology is not a substitute for hydraulic validation but a supportive automation tool for the clustering phase of water network partitioning.
2. Methodology
To address the study objectives, the methodology was organized in two parts: (1) assessing ChatGPT’s clustering performance and (2) evaluating its ability to analyze and report results in a way that supports potential applications for utilities.
2.1. Assessing ChatGPT’s Clustering Performance
For this objective, ChatGPT (version 4.0) was tasked with reading the input file (.inp file of EPANET), extracting the relevant sections, and applying clustering algorithms to the network. The focus was on testing whether ChatGPT could replicate a conventional clustering analysis based on topology and spatial attributes.
Data Preparation. The analysis began with the provided EPANET .inp file, which contained essential information about the water distribution network, including data on junctions, pipes, reservoirs, and coordinates. The key sections extracted from the .inp file were as follows:
Junctions: Contained node IDs, elevations, base demands, and patterns.
Pipes: Defined the connections between nodes, with details such as pipe lengths and diameters.
Reservoirs: Included the IDs and elevations of reservoirs serving as sources for the network.
Coordinates: Provided the geographical locations (x, y) of the junctions and reservoirs.
The network was then represented as a graph where nodes corresponded to junctions and reservoirs, and edges represented the pipes connecting these nodes. This graph model allowed for the application of clustering algorithms.
Clustering Algorithms. Three clustering algorithms were applied to partition the network into four DMAs: k-means clustering, spectral clustering, and hierarchical clustering. Each algorithm uses a different approach to divide the network and produces different results in terms of node distribution and pipe structure. K-means clustering is a distance-based algorithm that subdivides the nodes into k clusters by minimizing the variance within each cluster. In this analysis, the algorithm was applied using the geographic coordinates of the nodes to group them into four clusters based on proximity. This approach provided an initial partition of the network, where each cluster contained geographically close nodes, and the distances between nodes in each cluster were minimized. Spectral clustering utilizes the eigenvalues of the network’s adjacency matrix to partition nodes into clusters. This method is sensitive to the connectivity of the network, rather than relying solely on physical distances between nodes. For the Parete water network, spectral clustering grouped nodes that were closely connected in terms of the pipe network, producing clusters that reflected the structural connectivity of the system. Hierarchical clustering builds a tree-like structure by iteratively merging or splitting clusters.
Metrics. For each clustering approach, the following key metrics were calculated:
Number of boundary pipes: A boundary pipe is defined as any pipe connecting two nodes assigned to different clusters. Parallel pipes are counted individually, since each represents a distinct potential separation between DMAs. Valves are not explicitly counted, but each boundary pipe can be interpreted as implying the installation of a valve or flow meter in practice, in the next dividing phase, as reported in the literature. Shared nodes at cluster borders are considered through their incident pipes: if one of their pipes connects to a node in a different cluster, that pipe is classified as a boundary pipe.
Number of nodes per cluster: The total number of nodes assigned to each cluster.
Total pipe length per cluster: The cumulative length of pipes within each cluster.
Total demand per cluster: The total water demand based on the base demand of nodes within each cluster.
Additionally, three performance indices were computed to evaluate the balance of clustering (the formula to calculate the indices was provided by the user) [
20]:
where
is the number of clusters,
is the size of the largest cluster, and
is the total number of nodes in the network.
where
is the number of clusters,
is the size of the largest cluster, and
is the total demand supplied by the network.
where
is the number of clusters,
is the size of the largest cluster, and
is the total length of pipes in the network.
For clustering, the raw spatial coordinates (x, y) of the nodes extracted from the EPANET file were used without additional scaling or weighting. Performance indices (nodes, demand, pipe length) were computed at the cluster level, independently of the number of sources. The presence of multiple sources in WDNs was not explicitly represented in these indices, which were used here as topological proxies for cluster balance rather than hydraulic performance measures (computed in the next dividing phase, as suggested in the literature, and not considered in this paper that is focused on the clustering phase of WDN).
2.2. Evaluating ChatGPT’s Analysis, Reporting, and Applicability
For the second objective, ChatGPT was asked not only to apply clustering algorithms but also to generate an analysis report, produce tables and figures, and design a flowchart to ensure reproducibility (revised by the authors). These outputs were further considered in terms of ChatGPT their potential applicability for utilities, since they can be exported and used as a basis for DMA design and hydraulic validation.
2.3. Flowchart to Perform Clustering of a WDN Generated by ChatGPT
As part of the methodological framework, ChatGPT was asked to generate a reproducible workflow (in the form of a flowchart) for clustering water distribution networks. The resulting procedure, described in
Section 2.3.1,
Section 2.3.2,
Section 2.3.3,
Section 2.3.4,
Section 2.3.5,
Section 2.3.6,
Section 2.3.7, was subsequently adopted, verified, and used as a guiding protocol for the analysis.
2.3.1. Step 1: Provide the EPANET .inp File to ChatGPT
The first step is to upload or provide access to the EPANET .inp file. This file contains essential information about the network, including:
Junctions: The demand nodes, which contain information about water demand and elevation.
Pipes: The connections between the nodes, including their lengths and diameters.
Reservoirs: The water sources in the network, including their elevations and flow.
Coordinates: Geographical coordinates for the junctions and reservoirs, allowing for a spatial representation of the network.
The user may need to give ChatGPT specific instructions on how to interpret the sections of the .inp file if it struggles to understand them correctly. For example, explain that the COORDINATES section gives the positions of the nodes and that the JUNCTIONS section contains base demand data.
Example prompt: “Here is the EPANET .inp file for the water network. Can you extract the junctions, pipes, reservoirs, and coordinates? Use this data to represent the network as a graph, where nodes are junctions or reservoirs, and edges represent pipes connecting them.”
2.3.2. Step 2: Request Application of Clustering Algorithms
Once the data has been extracted, ask ChatGPT to apply clustering algorithms to divide the network into clusters. Specify how many clusters you want and mention the types of clustering algorithms you would like to use, such as the following:
K-means Clustering: Groups nodes based on geographic proximity.
Spectral Clustering: Groups nodes based on the connectivity between them.
Hierarchical Clustering: Creates a tree-like cluster structure.
Example prompt: “Now that you have the network data, please divide the network into four clusters. Use K-means clustering, Spectral clustering, and Hierarchical clustering. For each clustering algorithm, generate a visual representation where nodes are colored according to their assigned cluster.”
2.3.3. Step 3: Provide Notes on the Data and Metrics
If ChatGPT has trouble understanding the .inp file or calculating metrics, provide additional notes to clarify. For example, the user might need to explain the meaning of junction IDs, pipe lengths, and how demand is distributed across the network. Similarly, the user can provide instructions on how to calculate specific indices.
Example prompt: “In the .inp file, the COORDINATES section defines the x and y positions of the nodes. The JUNCTIONS section includes the base demand for each node. Use these to calculate the total demand in each cluster.”
2.3.4. Step 4: Ask ChatGPT to Calculate Performance Indices
Once the clustering is complete, ask ChatGPT to calculate the following performance indices:
Balance Index (Ib): To measure the balance in the number of nodes between clusters.
Demand Balance Index (Id): To measure how well water demand is balanced across clusters.
Length Balance Index (Il): To assess the balance in pipe length between clusters.
Provide the formulas to ChatGPT if necessary.
Example prompt: “Please calculate the Balance Index (Ib), Demand Balance Index (Id), and Length Balance Index (Il) for each clustering approach.”
2.3.5. Step 5: Review the Results and Identify Discrepancies
After ChatGPT provides the clustering results and calculates the indices, review the results carefully. Ensure that the total number of nodes, pipe lengths, and demand values are consistent. If there are any discrepancies (e.g., nodes that have not been assigned to any clusters), ask ChatGPT to revisit the clustering or correct the mistakes.
Example prompt: “It seems that the total number of nodes in the DMAs does not match the total number of nodes in the network. Can you check if any nodes were left unassigned and reassign them to the nearest cluster?”
2.3.6. Step 6: Correct and Recalculate Metrics if Necessary
If the user identifies errors in the clustering or calculations (as happened with hierarchical clustering in the original analysis), instruct ChatGPT to correct them. For example, if some nodes were left unassigned, ask it to reassign them to the nearest cluster and recalculate the performance indices.
Example prompt: “Some nodes were not assigned during the Hierarchical clustering process. Please reassign the un-clustered nodes based on their geographic proximity to the nearest clustered node and recalculate the indices.”
2.3.7. Step 7: Finalize the Results and Use for Network Management
Once the clustering and metrics have been corrected, the user can finalize the results. Use the clusters produced by ChatGPT for network management, such as setting up flow meters, detecting leaks, or balancing pressure. The performance indices will give insight into the balance of the clusters in terms of node distribution, demand, and pipe length, allowing the user to make informed decisions about adjustments to the network.
3. Results
The WDN used for the proof of concept is representative of the city of Parete, a city in the south of Italy, characterized by a population of about 12,000 people. The network consists of 182 demand nodes, 2 entry points and 282 pipes. Clustering solutions for Parete WDN were generated within seconds, and the results obtained with the different algorithms are reported in
Table 1 and represented in
Figure 1.
Spectral clustering resulted in a moderately balanced partition of the network, with indices reflecting a reasonable compromise between node count (Ib =1.05), water demand (Id = 1.58), and pipe length (Il = 1.07). This suggests that the method effectively prioritizes connectivity between nodes while maintaining a balance in geographic proximity and operational factors. K-means clustering exhibited higher variability in pipe length (Il = 1.53) and water demand distribution (Id = 1.51), while achieving a balanced node distribution (Ib = 1.17). The results indicate that K-means clustering focuses more on optimizing node count, leading to variations in operational metrics across clusters. Hierarchical clustering produced significant imbalances, particularly in node count (Ib = 1.52), while maintaining a relatively balanced pipe length distribution (Il = 1.42). This reflects its emphasis on geographic proximity over operational parameters, resulting in clusters that are spatially coherent but less uniform in terms of demand and node count.
Discussion of Errors and Corrections. During the initial application of hierarchical clustering, an issue was encountered where several nodes were left unassigned. This error was identified as the total number of nodes in the clusters did not match the expected number. This issue occurred because hierarchical clustering often handles “outliers” (nodes that do not fit well into any cluster), and in some cases, it may exclude nodes that do not meet the proximity or similarity thresholds. These outliers were not automatically flagged by the algorithm, and since the error only involved a subset of nodes, the issue was corrected when reviewing the initial results. After recognizing the discrepancy, the unassigned nodes were reassigned to the nearest clusters, leading to a corrected clustering outcome. Additionally, when calculating the demand balance index for hierarchical clustering, a mistake was made in reporting a value of 22, which was later corrected after recalculating the indices.
Preliminary conclusion. The clustering analysis demonstrated that different approaches produce varying levels of balance across the analyzed Parete network. Spectral clustering provided the most balanced pipe length and node but the highest imbalance for node demand. K-means clustering exhibited higher variability in pipe length and water demand but achieved a relatively balanced node distribution. Hierarchical clustering resulted in the most significant imbalances in node count, despite achieving the lowest Id among the different solutions. These results highlight the trade-offs between different clustering approaches, with spectral clustering favoring operational balance, k-means focusing on node distribution, and hierarchical clustering prioritizing geographic proximity at the expense of operational consistency. However, the results obtained may vary for a different WDN with different topology.
3.1. Validation
To further validate the process described in the flowchart and verify that ChatGPT has learned from previous errors and that it may produce a solution that respects the contiguity, a validation test was conducted on a larger network. This network represents the WDN of Giugliano in Campania, Italy, characterized by 994 demand nodes and 5 intake points.
ChatGPT generated the proposed solutions for the WDN of Giugliano in a few seconds, and the results obtained are represented in
Figure 2.
However, for the solutions proposed with hierarchical clustering and K-means, as shown in
Figure 2, contiguity is not fully respected. In some cases, nodes are disconnected from their assigned cluster, and these instances are indicated with red arrows in the figure. For spectral clustering, it is particularly challenging to determine from the visualization whether contiguity is respected, especially given the complexity of the Giugliano network. The network’s intricate topology includes multiple intersecting pipes and levels in the soil, which complicates cluster interpretation.
To address this issue, ChatGPT was specifically requested to generate a solution that ensures contiguity and to provide the code used to validate that contiguity is maintained. The response from ChatGPT is detailed below:
3.2. Response from ChatGPT
The solution provided by ChatGPT involved applying the SpectralClustering class from the sklearn.cluster module to partition the network into four clusters, ensuring contiguity. The specific code used is as follows:
from sklearn.cluster import SpectralClustering import networkx as nx import matplotlib.pyplot as plt import numpy as np # Assuming ‘G’ is your NetworkX graph and ‘pos’ contains node positions # Create adjacency matrix for the graph adj_matrix = nx.to_numpy_array(G) # Number of clusters n_dma = 4 # Apply Spectral Clustering clustering = SpectralClustering(n_clusters=n_dma, affinity=‘precomputed’, random_state=42) labels = clustering.fit_predict(adj_matrix) # Assign cluster labels to nodes for i, node in enumerate(G.nodes()): G.nodes[node][‘cluster’] = labels[i] # Plot the clustered graph without node labels plt.figure(figsize=(12, 8)) colors = [plt.cm.tab10(label) for label in labels] nx.draw(G, pos, node_color=colors, with_labels=False, node_size=50) plt.title(‘Network Clusters (4 Clusters with Contiguity, No Node IDs)’) plt.show() |
This code constructs the adjacency matrix for the graph, applies spectral clustering to divide the network into four clusters, assigns the resulting cluster labels to each node, and visualizes the clustered network. Since this approach is based on the adjacency matrix, contiguity is inherently ensured.
To further analyze the results and verify the solution’s consistency, ChatGPT was asked to regenerate the clusters using spectral clustering and to represent subgraphs for each cluster. These subgraphs provide a clearer visualization of the clusters and ensure that contiguity is maintained. The updated visualizations (
Figure 3) and code outputs confirm that the solution respects contiguity and aligns with the network’s topological structure.
As shown in
Figure 3, the solution obtained using spectral clustering ensures contiguity, addressing a critical limitation observed in other clustering approaches, which failed to maintain connectivity within clusters. This improvement is particularly important for practical applications in WDN management. The performance of spectral clustering for the Giugliano network is detailed in
Table 2.
It is important to note that the validation presented in this work is intentionally limited to two real Italian networks. The choice of Parete (182 demand nodes) and Giugliano in Campania (994 demand nodes) was made to represent both a smaller, more manageable system and a larger, more complex topology. This combination provides a first proof-of-concept of ChatGPT’s adaptability across networks of different scales and complexity. While the geographic scope remains limited, the methodological framework developed here is general and can be readily applied to other water distribution systems. Future research will aim to extend this validation to a wider range of networks, including those from different contexts and operational conditions.
4. Conclusions
Clustering WDNs is a critical and intricate task, especially when dealing with large-scale systems characterized by numerous nodes and the need for multiple clusters. In this study, the ability of ChatGPT to facilitate this process was evaluated as a proof-of-concept.
In line with the first objective, the results showed that ChatGPT can execute the clustering phase of WDNs within seconds, producing solutions comparable to conventional algorithms proposed in the literature, although user corrections were sometimes required. In line with the second objective, ChatGPT was also able to analyze the results, generate reports, illustrate outcomes with tables and graphs, and design a reproducible flowchart. These outputs can also be directly applied by practitioners, showing ChatGPT’s potential to support utilities and democratize access to advanced methodologies.
However, it is important to emphasize that user guidance played a crucial role in achieving the final outcomes. Throughout the process, the authors needed to intervene in cases where certain nodes were left unassigned or when errors in index calculations were detected. This suggests that, though being a useful tool for assisting experts, particularly in handling repetitive or time-consuming tasks, ChatGPT currently lacks the reliability and precision to address unsupervised water management applications. Thus, ChatGPT’s current role in water distribution management should be seen as complementary rather than substitutive to specialized software.
In conclusion, ChatGPT provides a valuable, accessible tool for water utilities, particularly in democratizing advanced techniques for users with limited technical backgrounds. Future work should extend validation to a wider range of networks, integrate hydraulic performance measures, and explore ways to reduce user intervention. By integrating ChatGPT with more robust validation mechanisms and conventional hydraulic solvers, its utility in water distribution and management could be enhanced, helping bridge the gap between AI-driven insights and real-world application.
Finally, this proof-of-concept opens the way for future integration of LLM-based workflows with conventional hydraulic solvers, potentially offering utilities semi-automated decision-support tools. Although broader issues of LLMs, such as reproducibility and bias, are well recognized, they are beyond the scope of this study. Here we specifically assess ChatGPT’s performance in WDN clustering as a domain-focused proof-of-concept.