Graph Convolutional Networks: Application to Database Completion of Wastewater Networks

Wastewater networks are mandatory for urbanisation. Their management, including the prediction and planning of repairs and expansion operations, requires precise information on their underground components (manhole covers, equipment, nodes, and pipes). However, due to their years of service and to the increasing number of maintenance operations they may have undergone over time, the attributes and characteristics associated with the various objects constituting a network are not all available at a given time. This is partly because (i) the multiple actors that carry out repairs and extensions are not necessarily the operators who ensure the continuous functioning of the network, and (ii) the undertaken changes are not properly tracked and reported. Therefore, databases related to wastewater networks may suffer from missing data. To overcome this problem, we aim to exploit the structure of wastewater networks in the learning process of machine learning approaches, using topology and the relationship between components, to complete the missing values of pipes. Our results show that Graph Convolutional Network (GCN) models yield better results than classical methods and represent a useful tool for missing data completion.


Introduction
Urbanisation has been an increasing trend over the past century [1]. The OECD [2] predicted that global water demand will increase by 55% between 2012 and 2050. Given the predicted growth in population and water demand, Instrumentation, Control, and Automation (ICA) will become even more important and the need for system-wide ICAs more urgent [3]. The development of smart cities [4] has encouraged the use of innovative solutions like big data and Internet of Things (IoT) sensors and applications. One of the sectors that takes advantage of these cutting-edge technologies is that of water and wastewater [5][6][7]. Services are being developed for the real-time management of these systems [8], relying on a purely technical layer (sensors, actuators, etc.) and a software layer that makes use of data-mining techniques to infer the needed information and knowledge [9].
A problem often encountered when managing environmental systems, such as underground databases, is missing data [7][10][11][12]. In wastewater network databases, missing data may directly impact their management at both decision-making and business/scientific domain-related levels. Planning is an important task for decision making. It helps develop a vision of needs in space and time so as to quantify and prioritise them to direct funding towards the most necessary investments and at a reasonable cost, since urgent and unexpected operation costs are far higher than anticipated ones [13]. Decision makers use the available databases, which generally suffer from incompleteness, thus, often leading processing [33], image classification [34], and speech recognition [35]. However, the models behind this achievement, like Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs), are only adapted to Euclidean data and cannot be applied directly to graphs, as their structures may vary extremely from one graph to another. For example, CNNs, widely used for image applications, exploit the fixed structure of the pixel's neighbourhood to define convolution filters with shared weights and pooling operators [36]. This process cannot be directly generalised to graph structures since the number of neighbours may differ from one node to another.
Considerable efforts have been deployed to make graphs benefit from the advancement of machine learning techniques. The main goal is to exploit the structure of graphs in the learning process, taking into consideration the topology and the relationships between their components (nodes and edges). Historically, machine learning models relied on handcrafted features, using approaches such as statistics to encode graph structures [37,38]. For example, in the case of graphs used to model viewers' relationships, where an edge between two nodes represents a film watched by both users, one may use the number of shared edges between two users to suggest new films. However, these approaches are time-consuming and inefficient since they depend strongly on the type of application and the specific use cases. To overcome these challenges, various automatic methods have been studied. Graph Embedding and Graph Neural Networks are the most common ones.

Graph Embedding
The goal of Graph Embedding is to use low-dimensional continuous vector representations for graph-structured data, instead of the whole graph, as input to the machine learning algorithms. Graph Embedding lies at the overlap of two problems: graph analysis, which aims to extract useful information from graph data, and representation learning, whose goal is to obtain a representation, not necessarily low dimensional, that facilitates the extraction of useful information [39]. Embedding techniques depend on the type of graphs used as input (such as homogeneous/heterogeneous, directed/undirected, etc.) and the type of desired output (nodes' embedding, edges' embedding, graph embedding). In [39], a clear taxonomy of the different techniques and applications of graph embedding is presented. Although graph embedding techniques have been successfully used in many applications, such as node classification using the Node2Vec algorithm [40], they nevertheless present several drawbacks. Indeed, Refs. [38,39] identified two severe ones: computational inefficiency and the inability to generalise their application, since they cannot deal with dynamic graphs. In addition, the authors of [41] indicate that mapping a graph structure into a simple representation may cause information loss. For example, in the case of node embedding, edges are considered as additional node features, although these links generally encode relationships between concepts or objects.

Graph Neural Networks
To operate directly on graphs, the authors of [41] proposed the first Graph Neural Network (GNN) model. Described as the extension of existing neural network methods to the graph domain, this model considers nodes as concepts or objects and edges as relationships between them. To accomplish supervised learning, the GNN model associates each node with a state containing information about the node itself and its neighbourhood. Using a feedforward network, a shared transition function is defined to update all the states iteratively until a fixed point is reached. The states are updated based on the current states of the nodes and those of their neighbours. Then, using a feedforward network, an output function is applied to the states to compute the outputs of each node, or a unique output for the whole graph, depending on the application. These steps are repeated following the gradient descent algorithm until the desired criterion is reached. This GNN model has proven to be efficient in some application domains, such as chemistry. However, it is not suitable for a variety of graph problems such as knowledge graphs and semi-supervised applications, where the goal is to predict missing data based on the graph structure. In addition, this model suffers essentially from the expensive cost of the computations needed to reach fixed points. To address these problems, several variants of GNN models and new approaches have been proposed [42][43][44]. The most widely used is the Graph Convolutional Network (GCN), which aims at generalising CNNs to graphs. In the next section, we present graph convolutional network models for semi-supervised learning which might be used to complete missing data.

GCN for Semi-Supervised Learning
Graph Convolutional Network (GCN) models have achieved state-of-the-art results in many applications. In semi-supervised learning for node applications, the objective is to use labelled nodes to learn representations or embeddings of both labelled and unlabelled nodes, and then use the resulting representations to predict missing labels. GCNs are classified into two categories: spectral approaches and spatial approaches. Spectral approaches were first introduced in [45]. Since convolution filters, defined in the Euclidean space and used in CNNs, cannot be applied directly on graphs, the authors of [45] have shown that they can be defined in the Fourier domain for non-Euclidean data. This operation is defined in [38,43] as the multiplication of a signal $x \in \mathbb{R}^N$ (one scalar for each node) with a filter $g_\theta = \mathrm{diag}(\theta)$ parametrised by $\theta \in \mathbb{R}^N$:
$$g_\theta \star x = U g_\theta U^T x,$$
where $U$ is the matrix of eigenvectors of the normalised graph Laplacian $L = I_N - D^{-1/2} A D^{-1/2} = U \Lambda U^T$, with $\Lambda$ the diagonal matrix of its eigenvalues. $D$, $A$, and $U^T x$ are respectively the degree matrix, the adjacency matrix of the graph, and the graph Fourier transform of $x$. However, this proposition suffers from two major drawbacks. First, computing the eigendecomposition of the Laplacian is computationally expensive, especially for large graphs. Second, the filters defined in the spectral domain are not spatially localised, contrary to those in CNNs, i.e., filters are not necessarily applied to spatially close nodes. To overcome these challenges, improvements have been published, which generally consist in proposing new filters [43,46]. ChebNet [46] is the most popular one, and uses a polynomial parametrisation to compute K-localised filters:
$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^k,$$
where the parameter $\theta \in \mathbb{R}^K$ is a vector of polynomial coefficients.
To address the computation issue, ChebNet uses a Chebyshev expansion [47] of order $K - 1$, and $g_\theta(\Lambda)$ becomes:
$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{\Lambda}),$$
where $T_k(\tilde{\Lambda}) \in \mathbb{R}^{N \times N}$ is the Chebyshev polynomial of order $k$ evaluated at $\tilde{\Lambda} = 2\Lambda/\lambda_{max} - I_N$, the eigenvalues rescaled to lie in $[-1, 1]$, with $\lambda_{max}$ the maximal eigenvalue. To alleviate the problem of overfitting on local neighbourhood structures on graphs, the authors of [43] limit and simplify the filtering to only the first-order neighbours with K = 1. Since they depend on the eigenbasis of the graph, spectral approaches cannot be used with graphs that have different structures. However, they are suitable for semi-supervised learning, which involves the prediction of features of the same graph used for the learning procedure. Thus, they are suitable for our goal, which involves the prediction of incomplete data related to wastewater networks.
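As a rough illustration of the Chebyshev filtering described above, the sketch below evaluates $\sum_k \theta_k T_k(\tilde{L}) x$ directly in the vertex domain, with $\tilde{L} = 2L/\lambda_{max} - I_N$ and the recurrence $T_k = 2\tilde{L}T_{k-1} - T_{k-2}$, so that no eigendecomposition is needed. It is plain Python for a small dense graph. The function names, the toy path graph, and the choice $\lambda_{max} = 2$ (an upper bound on the spectrum of the normalised Laplacian) are assumptions made for the example, not details of the models used in this work.

```python
import math


def normalised_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    inv_sqrt = [1.0 / math.sqrt(sum(row)) if sum(row) > 0 else 0.0 for row in adj]
    return [[(1.0 if i == j else 0.0) - inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
             for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def chebyshev_filter(adj, x, theta, lambda_max=2.0):
    """Apply sum_k theta[k] * T_k(L_tilde) to the signal x (len(theta) = K >= 1)."""
    n = len(adj)
    lap = normalised_laplacian(adj)
    # Rescale the Laplacian so that its spectrum lies in [-1, 1].
    l_tilde = [[2.0 * lap[i][j] / lambda_max - (1.0 if i == j else 0.0)
                for j in range(n)] for i in range(n)]
    t_prev = list(x)                                   # T_0(L_tilde) x = x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_curr = matvec(l_tilde, x)                        # T_1(L_tilde) x = L_tilde x
    out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        # Chebyshev recurrence: T_k = 2 L_tilde T_{k-1} - T_{k-2}
        t_next = [2.0 * a - b for a, b in zip(matvec(l_tilde, t_curr), t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Toy example: a path of five nodes with a unit signal on the first node.
path = [[0, 1, 0, 0, 0],
        [1, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0]]
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
filtered = chebyshev_filter(path, impulse, theta=[0.5, 0.3, 0.2])
```

With K = 3 coefficients, the filtered impulse is non-zero only on nodes at most two hops from the first node, which is the K-localisation property that motivates the polynomial parametrisation.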
Contrary to spectral approaches, spatial ones define convolution directly on graphs. Various propositions have been published. The authors of [48] proposed a spatial convolution network that operates directly on graphs for molecular applications. GraphSAGE [49], one of the most popular frameworks in this category, is defined as an inductive framework. Unlike transductive approaches, which generate embeddings for a specific, fixed graph seen during training, inductive ones generate low-dimensional representations for unseen components of graphs. GraphSAGE is based on the aggregation of fixed-size node neighbourhood features:
$$h_{\mathcal{N}(v)}^{k} = \mathrm{AGGREGATE}_k\left(\{h_u^{k-1}, \forall u \in \mathcal{N}(v)\}\right), \qquad h_v^{k} = \sigma\left(W^{k} \cdot \mathrm{CONCAT}\left(h_v^{k-1}, h_{\mathcal{N}(v)}^{k}\right)\right),$$
where $h^k$ denotes a node's representation at step $k$, $\mathcal{N}(v)$ is the immediate neighbourhood of $v$, AGGREGATE is the aggregation function, $W^k$ is a learnable weight matrix, and $\sigma$ is a nonlinear activation function. The authors of [49] defined three aggregation functions: mean, LSTM, and pooling. To avoid computing the spectrum of the graph Laplacian as in [45,46] and to apply CNNs on graphs, the authors of [50] proposed TAGCN, a method based on fixed-size K-localised filters that adapt to the topology of graphs, replacing the fixed square filters of traditional CNNs.
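The aggregation step of GraphSAGE can be sketched in a few lines. The example below is a minimal plain-Python version of one step with the mean aggregator, assuming ReLU as the nonlinearity and an optional fixed-size neighbour sample; the function name `sage_mean_layer`, the toy graph, and the weight values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import random


def sage_mean_layer(h, neighbours, weight, sample_size=None, rng=None):
    """One GraphSAGE-style step with a mean aggregator.

    h          : dict node -> list of floats (representation at step k-1)
    neighbours : dict node -> list of neighbouring nodes
    weight     : matrix with one row per output unit and 2 * in_dim columns,
                 applied to CONCAT(h_v, mean of neighbour representations)
    """
    rng = rng or random.Random(0)
    new_h = {}
    for v, h_v in h.items():
        neigh = neighbours.get(v, [])
        # Optionally keep a fixed-size random sample of the neighbourhood.
        if sample_size is not None and len(neigh) > sample_size:
            neigh = rng.sample(neigh, sample_size)
        if neigh:
            agg = [sum(h[u][d] for u in neigh) / len(neigh) for d in range(len(h_v))]
        else:
            agg = [0.0] * len(h_v)
        concat = list(h_v) + agg
        # Linear map followed by ReLU (the nonlinearity assumed here).
        new_h[v] = [max(0.0, sum(w * c for w, c in zip(row, concat))) for row in weight]
    return new_h


# Toy path a - b - c with one-dimensional features.
features = {"a": [1.0], "b": [2.0], "c": [3.0]}
links = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
updated = sage_mean_layer(features, links, weight=[[1.0, 1.0]])
# With these weights each node receives its own value plus the mean of its
# neighbours: a -> 1 + 2, b -> 2 + (1 + 3) / 2, c -> 3 + 2.
```

Because the update only needs a node's own features and those of its sampled neighbours, the same weights can be applied to nodes that were not present during training, which is what makes the approach inductive.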

Materials and Methods
In this work, we seek to complete missing attribute values based on the structure of wastewater networks and the database records related to them.

Models and Test Configurations
To highlight the added value of GCNs in this prediction task, we also apply algorithms that do not take topology into account. The GCNs' results are thus benchmarked against three non-topological algorithms: Support Vector Machine (SVM) [51], Decision Trees (DT) [52], and a feedforward Artificial Neural Network (ANN), precisely a MultiLayer Perceptron (MLP) [53]. We consider four GCN models that have proven to be efficient in many applications: two spectral models, GCN [43] and ChebNet [46], and two spatial models, GraphSAGE [49] and TAGCN [50].
Given that pipe diameters and materials directly impact hydraulic modelling results, which is the aim of our work, we chose to automatically predict the missing values for each one of these two attributes. Nevertheless, other attributes could be targeted the same way.
The available attributes and their missing values are not necessarily similar and vary between providers. Hence, to investigate whether GCNs are useful in real cases, we defined two configurations based on the available data.
The first configuration corresponds to the case where no attributes are available. Domain knowledge can then be used to create and add new attributes to the structure to improve the learning process. In wastewater networks, pipe diameters increase when moving from the upstream wastewater catchments to the vicinity of the treatment plant. This domain knowledge can be accounted for using Strahler's number, a measure of the network's branching complexity [54]. This attribute is easily computed for each pipe since the position of treatment plants is usually known. Thus, the first configuration is conducted using the network graph and Strahler's number as a domain knowledge attribute.
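To make the Strahler attribute concrete, the following sketch computes it for each pipe of a small tree-shaped network. It assumes that every pipe drains into at most one downstream pipe and that the network contains no loops. The representation (a dictionary `downstream` mapping each pipe to the pipe it flows into) and the pipe identifiers are hypothetical. The rule applied is the usual one: a pipe with no upstream pipe has order 1, and the order increases by one only where at least two upstream pipes of the same highest order meet.

```python
from collections import defaultdict


def strahler_orders(downstream):
    """Return the Strahler order of every pipe.

    downstream maps each pipe id to the pipe it drains into, or to None for
    the pipe that reaches the treatment plant.
    """
    upstream = defaultdict(list)
    for pipe, nxt in downstream.items():
        if nxt is not None:
            upstream[nxt].append(pipe)

    # Number of upstream pipes still to be processed for each pipe.
    pending = {pipe: len(upstream[pipe]) for pipe in downstream}
    ready = [pipe for pipe, count in pending.items() if count == 0]

    order = {}
    while ready:
        pipe = ready.pop()
        ups = [order[u] for u in upstream[pipe]]
        if not ups:
            order[pipe] = 1                      # head of the network
        else:
            top = max(ups)
            # The order increases only where two pipes of the top order meet.
            order[pipe] = top + 1 if ups.count(top) >= 2 else top
        nxt = downstream[pipe]
        if nxt is not None:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)
    return order


# Hypothetical network: p1 and p2 join into p3; p4 flows into p5;
# p3 and p5 join into p6, which reaches the treatment plant.
network = {"p1": "p3", "p2": "p3", "p3": "p6", "p4": "p5", "p5": "p6", "p6": None}
orders = strahler_orders(network)
# orders == {"p1": 1, "p2": 1, "p3": 2, "p4": 1, "p5": 1, "p6": 2}
```

Pipes are processed from the heads of the network towards the outlet, so each pipe is visited once all of its upstream pipes have an order.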
In the second configuration, managers possess more information about the networks, and relevant additional fields of the attribute table are used to infer relationships. Thus, this configuration is the richest in terms of learning material as it uses the network structure, domain knowledge, and additional characteristics to impute missing values. In this situation, the managers seek precise information about a specific attribute for various purposes, such as the diameter values for a hydraulic modelling simulation.
For each of the two configurations, the datasets were split into two subsets: training and test. The training subset includes the available attributes of the pipes and their associated labels to be learned. However, contrary to non-topological models, GCN models require the structure of the graphs in order to operate. Therefore, the entire structure of the graph, modelled by the adjacency matrix of the wastewater network pipes, was provided to these graph-based models. A total of 10% of the training subset is used as a validation subset to tune the models' parameters, such as the number of convolution layers and the number of epochs.
For the MLP, we set the number of hidden layers to 3, with 100, 50, and 25 units for the first, second, and third hidden layers, respectively. The number of outputs is defined by the number of classes of each attribute. The Rectified Linear Unit (ReLU) is used as the activation function between the layers. All layers are linear layers of PyTorch [55], and the output is computed using the LogSoftmax function. For the GCN models (Figure 1), we set the number of convolution layers to 2; the number of hidden units was set to 20 for the first layer and to the number of classes to predict for the second layer. We used ReLU as the activation function between the two convolutional layers, and LogSoftmax as the activation function to output the labels. For the ChebNet layers, the filter size K was varied from 10 to 40 depending on the configuration and the size of the training subset. For the SVM model, the regularisation parameter C is set to 1 and the kernel is the Radial Basis Function (RBF); the kernel degree is left at 3, a setting that only affects polynomial kernels. For the DT models, the splitter is set to "best" and the quality of a split is evaluated by the "Gini" criterion, without any maximum depth constraint.
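The two-layer GCN architecture described above (20 hidden units, ReLU between the layers, LogSoftmax on the output) can be sketched as a forward pass. The version below is a plain-Python illustration, not the implementation used in this work: it assumes the first-order propagation rule of the GCN model [43], $\hat{A} = \tilde{D}^{-1/2}(A + I_N)\tilde{D}^{-1/2}$, omits bias terms, and uses random, untrained weights. The toy graph, the single input feature (standing in for Strahler's order), and the number of classes are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalise_adjacency(adj):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, the adjacency with self-loops, normalised."""
    n = len(adj)
    a_loop = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_loop]
    return [[inv_sqrt[i] * a_loop[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    top = max(row)
    log_z = top + math.log(sum(math.exp(v - top) for v in row))
    return [v - log_z for v in row]


def init_weights(n_in, n_out, rng):
    """Uniform Glorot-style initialisation (an assumption of this sketch)."""
    bound = math.sqrt(6.0 / (n_in + n_out))
    return [[rng.uniform(-bound, bound) for _ in range(n_out)] for _ in range(n_in)]


def gcn_forward(adj, features, w1, w2):
    """Two graph convolutions: ReLU after the first, LogSoftmax after the second."""
    a_norm = normalise_adjacency(adj)
    hidden = matmul(a_norm, matmul(features, w1))
    hidden = [[max(0.0, v) for v in row] for row in hidden]
    logits = matmul(a_norm, matmul(hidden, w2))
    return [log_softmax(row) for row in logits]


# Toy graph of four pipes in a chain, one input feature per pipe, three classes.
rng = random.Random(0)
adjacency = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
inputs = [[1.0], [1.0], [2.0], [3.0]]
w_hidden = init_weights(1, 20, rng)    # first layer: 20 hidden units
w_output = init_weights(20, 3, rng)    # second layer: one unit per class
log_probs = gcn_forward(adjacency, inputs, w_hidden, w_output)
predicted = [row.index(max(row)) for row in log_probs]
```

Each row of `log_probs` holds the log-probabilities of the classes for one pipe; training would adjust the two weight matrices by minimising the negative log-likelihood on the labelled pipes only, while the propagation uses the adjacency of the whole graph.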
We implemented the GCN models and the MLP using PyTorch [55], where the layers corresponding to the GCN, ChebNet, GraphSAGE, and TAGCN models are named GCNConv, ChebConv, SAGEConv, and TAGConv, respectively. The non-topological models, SVM and DT, were implemented using Scikit-learn [56].

Datasets
In this study, we used two real wastewater network databases. The first one is that of Angers Metropolis and is available through the French Government's open access portal (https://www.data.gouv.fr/ (accessed on 1 August 2020)). The second source is the database of Montpellier Méditerranée Métropole (3M) (https://data.montpellier3m.fr/ (accessed on 1 August 2020)). These databases were chosen because they have two specific fields for the pipe diameter and material (see Figure 2 for an example of attribute tables). However, the attribute values are not all indicated: 5.9% of the pipes in the Angers dataset and 28.63% of those in the Montpellier dataset have a missing diameter or material value. At the scale of a metropolis, wastewater networks are usually formed of several subnetworks of cities and villages, either managed separately or linked to the main treatment plant by a unique pipe. Thus, the acquired databases are composed of several sub-graphs that represent independent wastewater networks, and Strahler's orders may be computed separately for each sub-graph. However, due to data imperfections, these disconnections may also be the result of missing spatial information such as missing pipes. Hence, to validate our results, this study was carried out on the sub-networks having the fewest missing attribute values. Taking into consideration possible spatial imperfections, we carefully extracted one sub-graph from each dataset.
The different materials encountered in the Angers metropolis are Polyvinyl Chloride (PVC), Asbestos-Cement (AC), Cast Iron, and Metal. In the Montpellier metropolis, we found PVC, AC, Cast Iron, Concrete, Glass Reinforced Plastic (GRP), and Polypropylene. Ten classes of possible diameters are present in the Angers and Montpellier subgraphs, ranging from 80 to 500. However, for both materials and diameters, several classes have fewer than 10 elements and will not be considered in the following.
Figure 4 shows the distribution of material and diameter attributes for the considered classes, for the two data sets.

Testing Procedure
After tuning operations, the models are trained on 90% of the data and the remaining 10% are predicted. This is the first test. To put forward the models' ability to distinguish between classes and assess their effectiveness regarding minority classes, we evaluate the results of the predictions by computing the Recall, Precision, and F1-score metrics for each class of attributes as follows:
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{F1} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}},$$
where $TP$, $FP$, and $FN$ denote the numbers of true positives, false positives, and false negatives for the class. This prediction operation is repeated 10 times with randomly selected subsets to estimate the models' performance more accurately. The average of these predictions is examined. To evaluate the performance of the models over each attribute, we compute the Macro-Recall, the Macro-Precision, and the Macro-F1-score as follows, where $N$ is the number of classes of an attribute:
$$\text{Macro-Recall} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{Recall}_i, \qquad \text{Macro-Precision} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{Precision}_i, \qquad \text{Macro-F1} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{F1}_i.$$
The training set is then sequentially reduced to increase the size of the test set, i.e., 80% for training and 20% for testing, and so forth. As shown in Figure 4, attribute values are unbalanced, and a randomly selected test subset may include only the dominant classes. Therefore, the test subset is extracted as a portion of the number of occurrences in each class. Consequently, only classes with more than 10 occurrences are considered in the test subsets. For example, since the diameter class φ(200) has 568 occurrences in the sub-graph of the Angers metropolis, the number of selected pipes for a 10% testing subset (when the task is to predict pipe diameter values) will be 56.
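The testing procedure above can be sketched with two small helpers: one that draws the test subset class by class, and one that computes the macro-averaged scores. The sketch rests on a few assumptions: the per-class test size is rounded down (which matches the example of 56 pipes out of 568), classes with 10 occurrences or fewer are skipped, and Macro-F1 is taken as the unweighted mean of the per-class F1 scores. The function names and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_test_indices(labels, test_fraction, min_count=10, seed=0):
    """Draw the same fraction of each class, skipping classes that are too small."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    test = []
    for indices in by_class.values():
        if len(indices) <= min_count:          # keep only classes with more than 10 items
            continue
        n_test = int(len(indices) * test_fraction)   # rounded down (assumption)
        test.extend(rng.sample(indices, n_test))
    return sorted(test)


def macro_scores(y_true, y_pred):
    """Return (Macro-Recall, Macro-Precision, Macro-F1) over the classes in y_true."""
    recalls, precisions, f1_scores = [], [], []
    for cls in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f1_scores.append(f1)
    n = len(recalls)
    return sum(recalls) / n, sum(precisions) / n, sum(f1_scores) / n


# Diameter labels: 568, 123, and 12 occurrences as reported for three Angers
# classes, plus a hypothetical class of 5 pipes that is too small to be tested.
diameters = [200] * 568 + [150] * 123 + [250] * 12 + [500] * 5
test_ids = stratified_test_indices(diameters, test_fraction=0.1)
n_test_200 = sum(1 for i in test_ids if diameters[i] == 200)   # 56 pipes

# Macro scores on a tiny hand-made prediction.
mr, mp, mf1 = macro_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Because every class contributes equally to the macro averages, a model that predicts only the dominant class obtains low macro scores even when its overall accuracy is high, which is why these metrics suit unbalanced attributes.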

Experimental Results
In this section, we show the results of attribute prediction for "Diameter" and "Material" for the two configurations described in Section 3.1. We compare the results of several experiments using the different machine learning techniques presented in the previous section. The purpose of comparing GCN-based algorithms with machine learning techniques that do not use the graph's structure to predict missing data is to investigate whether the network graph can facilitate missing data completion in the context of a machine learning approach. It is important to note that, thanks to its structure, a GCN can predict classes without being given any attributes as input. This is clearly not possible for non-topological models. Thus, before conducting the experiments on the two defined configurations, we tested this possibility in order to observe the quality of the results a GCN produces using only the structure of the graphs. The results show that the GCN models GCNConv, SAGEConv, and TAGConv predict only the dominant classes, but the ChebConv model can identify other non-dominant classes, albeit with very low recall scores, such as 10% for the diameter class φ(150) on limited randomly selected test datasets. The prediction of minority classes with ChebConv, even with low scores, shows that using the structure of wastewater networks is promising.

Configuration 1
In addition to the portion of the available values and the structure of the network, in this configuration, we added Strahler's order as an attribute to help the models distinguish between the classes. Figures 5 and 6 show the results for the Angers and Montpellier datasets, respectively. Despite difficulties with classes that have few occurrences, Strahler's order helps the models identify more classes than the dominant ones. The non-topological models (SVM, Decision Tree, and MLP) are unable to distinguish minor classes for the Angers dataset. Nevertheless, they predict some minor classes, such as the class φ(400), with a high recall score for the Montpellier dataset (Figure 6a,c), despite this class having only 37 occurrences. Unlike non-topological models, GCN models, namely ChebConv and TAGConv, predict more classes for both datasets. Thus, GCN models outperform non-topological ones in terms of the number of detected classes. ChebConv outperforms all models for both diameter and material prediction. For the Angers dataset, when 30% of the diameter values are missing, it predicts the classes φ(150) and φ(250) with a recall of 79% and 77%, respectively (Figure 5c), despite these classes having only 123 and 12 occurrences. In the case of the Montpellier dataset, ChebConv, while using only 30% of the available data, completes the missing φ(150) and φ(300) diameter classes with 63% and 58% recall, respectively (Figure 6a). The metric improves when the training set is increased to 70%, reaching 77% and 85%, respectively, for these classes (Figure 6c). In comparison, the other models fail to detect these two classes for both datasets, except for TAGConv, which has a very low score for the class φ(150) (Figures 5a,c and 6a,c).
Similar results are obtained for material prediction. Indeed, besides having higher scores for both datasets, only GCN models predicted the AC class for Angers (Figure 5b,d). This shows that the structure of the graph and the choice of the GCN model have a great impact on the learning process.

Configuration 2
In addition to the information used in the previous configuration, the attribute "material type" is added to help predict the attribute "diameter" and vice versa. The correlation between these attributes is 0.74 for the subgraph of Angers and 0.43 for the subgraph of Montpellier. Adding this information to the models substantially increases their performance regarding the number of detected classes and the recall scores. First, except for ChebConv, which already identified all the classes in the previous configuration, the number of predicted classes increases for all models. For instance, the non-topological models predict the AC class for the Angers dataset (Figure 7b). Second, Figures 7 and 8 show that recall scores have increased for the majority of the classes using the various models. Still, ChebConv outperforms all models by predicting missing values with high scores for almost all classes, including the minor ones: using only 30% of the available data, it achieved 80% for the class φ(300), which has 34 occurrences (Figure 8a), and 70% for the class φ(250), which has only 12 occurrences (Figure 7a).
Tables 1 and 2 display the Macro-Recall (MR), Macro-Precision (MP), and Macro-F1 Score (MF1) for each attribute of the two datasets of Angers and Montpellier, for configurations 1 and 2 respectively, and for the nine different percentages of the dataset used for training. First, for configuration 1, for both cities, Table 1a,b show, as indicated before, a poor performance of the non-topological models. This was expected since they use only Strahler's order to distinguish the different classes, while graph models use the adjacency matrix. As for configuration 2, the scores increase for all models. Thus, the performance of non-topological models relies only on the correlations (Table 3) between Strahler's order and the targeted attributes.
Second, except for ChebConv, whose performance increases when the portion of missing values decreases, all the models' performances are generally constant in configuration 1 for the Angers dataset (Table 1a) since they predict only the dominant classes. This is also to be expected for non-topological models, since there is no correlation between Strahler's order and either attribute, diameter or material, for this dataset. However, for the Montpellier dataset (Table 1b), where the correlation with Strahler's order is 0.08 for the material and 0.31 for the diameter, the non-topological models' performances increase when the percentage of missing data decreases for the attribute diameter. In addition, in configuration 2, the models' performances evolve differently for the two datasets. For Angers, all models are nearly constant, although a small increase can be noted in ChebConv's performance as the missing data decreases. These scores (Table 2a) can be explained by the high correlation between the attributes material and diameter (0.74). For the Montpellier dataset, where the correlation is lower than for the Angers dataset, almost all the models' performances increase. Figure 9 illustrates this evolution using the Macro-F1 score metric. The differences in performance related to the GCN models are detailed in the next paragraph.
Our experiments show that for real-world configurations, ChebConv yields the best results for both datasets and both predicted attributes. Spatial approaches fail to distinguish minority classes compared to the spectral approaches (i.e., ChebConv) and only slightly outperform non-topological approaches. The fact that SAGEConv, which is a spatial approach, evolves almost like the non-topological models and is outperformed by ChebConv may be explained by the fixed-size sampling of the neighbourhood, where not all the neighbours are explored.
Furthermore, for the spectral approaches, ChebConv surpassing GCNConv may be explained by the differences in the number of K-localised filters since GCNConv uses only K = 1 to avoid overfitting. To confirm this assumption we varied the values of parameter K to 1, 10, 15, and 20 for the ChebConv model and compared the new experiments to the GCNConv. Figure 10 shows that GCNConv and ChebConv with K = 1 have similar performances regarding the number of predicted classes and the recall scores, when predicting the diameter values for the Montpellier dataset. Moreover, comparing the performance of ChebConv with different K values shows that increasing the number of neighbour nodes used in the learning process improves the prediction results. This was also noted for TAGConv.

Discussion and Conclusions
This study was conducted to investigate whether machine learning algorithms can be used for Missing Value Imputation on wastewater networks. We carried out tests using seven different models: four Graph Convolutional Network models (GCN, ChebNet, TAGCN, and GraphSAGE) and three popular non-topological models (SVM, Decision Trees, and a MultiLayer Perceptron). The results show that machine learning models are an efficient tool for completing missing attributes for wastewater networks when various types of information about a network are available. This is highlighted in the second test configuration we explored. Moreover, for extreme situations, when only the network layout and partial attribute information are available (i.e., the first test configuration), the ChebConv spectral GCN approach, which is based on the approximation of the spectrum of the graph Laplacian, yields the best results for the completion of attribute values in general, and minority classes in particular. ChebConv also yields acceptable results when a small percentage of the available data is used for training. Similar behaviour has been reported in several studies using GCN-based models. The work of [32] demonstrated that, in comparison with other approaches such as KNN, the performance of their GCN-based model increases substantially when the percentage of the missing data increases. In a different application, similar conclusions were reached by [57] when inferring users' geo-localisation in social media. The authors used a semi-supervised configuration combining graph structure and text and showed that a GCN-based model performs well in scenarios with minimal supervision by effectively using unlabelled data.
The machine learning models that we used in this application require specific conditions. First, the classes to be learnt must be part of the training dataset. We complied with this requirement by ignoring classes with fewer than 10 occurrences. However, this led to fewer minority classes in the test subset and therefore impacted the prediction results substantially. Second, machine learning models are known to require large quantities of data to achieve satisfying results. Having achieved these scores while using such restricted datasets shows that this approach can be even more promising with larger datasets. We would also like to emphasise that our objective was not to determine the best GCN architecture for wastewater network data completion, but rather to investigate the impact of the structure of the graph as a learning factor on the prediction results. In this study, we used the default implementation of the GCN models as described in the original papers. Although these models showed excellent performance in various domains such as information science, bibliometrics, water distribution systems, or biology [43,49,50,58], they can be further adapted to the specific context of each domain to produce better results. For instance, in [59], a novel type of GCN for road networks called Relational Fusion Network (RFN) is put forward for driving speed estimation and speed limit classification. The results indicate that RFN outperforms state-of-the-art GCN algorithms such as GraphSAGE in this application.
To assess whether the structure of the graph, modelled in our case by the adjacency matrix, has an impact on the learning process, non-topological models were trained using only the available attributes, that is, Strahler's order for the first configuration and Strahler's order, diameter, and material for the second configuration. Strahler's order is used as a proxy for network topology in these models. For the GCN models, in addition to these attributes, the adjacency matrix is required and is also provided. The matrix is not used for the non-topological models because they are not built to deal with graph structures and require a pre-processing step to operate. This step consists in representing or encoding the graph in a suitable form for the targeted model. As stated in Section 2, this operation is complex and does not guarantee the full use of the graph structure, while GCN models can easily handle information such as adjacency or angle between pipes to perform Missing Value Imputation (MVI) operations. Therefore, no pre-processing was carried out in this work.
The attributes diameter, material, and Strahler's order were used only as illustration examples in this study. We aim to show that machine learning models can be an efficient method to help all entities facing the problem of missing wastewater network data, to overcome this challenge. The use of both numerical (diameter) and categorical (material) attributes shows that this approach overcomes the limits of the statistical methods used in [18]. In some instances, Strahler's order, which is dependent on the dataset, may not be the best descriptor. For instance, since the Angers dataset is very small, the pipe diameters do not increase when moving from the upstream wastewater catchments to the vicinity of the treatment plant. This leads to a lack of correlation between Strahler's order and diameter (Table 3). Thus, Strahler's order does not affect the diameter predictions for the Angers dataset, contrary to the Montpellier network. One may also use the type of buildings near the pipes as an attribute to predict their diameter. The main idea is that, since network construction rules vary from one country to another, and between regions of the same country, machine learning models can easily integrate new information to make predictions and improve them. It all depends on the available data and knowledge about the targeted network.
Urban managers and environmental monitoring services are often faced with incomplete data sets and have to resort to Missing Value Imputation (MVI) or Missing Data Imputation (MDI) algorithms. GCN models would provide managers with an additional accessible resource to overcome data imperfection challenges and support decision makers, be it to conduct repairs, predict future damage such as in [60], or run a hydraulic simulation model. Indeed, several urban utility networks, such as gas, water, and electrical supplies, are structured as graphs with nodes and edges. Our proposition would help asset management tasks by providing a better estimate of given characteristics of the undocumented portions of the network. Another important feature of Smart City management plans is air and water pollution monitoring. Given the spatial and temporal variability of environmental indicators, these monitoring plans rely on a network of sensors spread out over large geographical areas. As with any piece of equipment, these devices are prone to failure and damage, resulting in missing data. By resorting to GNNs, managers would be able to make the most of their network's structure and obtain more accurate estimates of the missing data. They would thus be able to better inform citizens and improve their quality of life.

Data Availability Statement:
The data that support the findings of this study were derived from the following resources available in the public domain: https://www.data.gouv.fr/fr/; https://www.data.montpellier3m.fr/ (accessed on 20 May 2021).