Estimating Spatio-Temporal Building Power Consumption Based on Graph Convolution Network Method

Abstract: Buildings are responsible for around 30% and 42% of the consumed energy at the global and European levels, respectively. Accurate building power consumption estimation is crucial for resource saving. This research investigates the combination of graph convolutional networks (GCNs) and long short-term memory networks (LSTMs) to analyze building power consumption, with a focus on predictive modeling. Specifically, by structuring graphs based on Pearson’s correlation and Euclidean distance methods, GCNs are employed to discern intricate spatial dependencies, and LSTM is used for temporal dependencies. The proposed models are applied to data from a multistory, multizone educational building, and they are then compared with baseline machine learning, deep learning, and statistical models. The performance of all models is evaluated using metrics such as the mean absolute error (MAE), mean squared error (MSE), R-squared ($R^2$), and the coefficient of variation of the root mean squared error (CV(RMSE)). Among the proposed computation models, one of the Euclidean-based models consistently achieved the lowest MAE and MSE values, thus indicating superior prediction accuracy. The suggested methods seem promising and highlight the effectiveness of GCNs in improving accuracy and reliability in predicting power consumption. The results could be useful in the planning of building energy policies by engineers, as well as in the evaluation of the energy management of structures.


Introduction
At the global level, buildings consume around 30% of produced energy, and they are also responsible for 26% of energy-related emissions [1]. Furthermore, in the European Union, the energy utilized in buildings represents 42% of energy use and more than 30% of greenhouse gas emissions [2]. Consequently, buildings can be considered one of the largest energy consumers. Therefore, accurate building energy prediction is vital for engineers to design policies that lead to high levels of energy efficiency, for stakeholders to make investment decisions, and for consumers and businesses to save energy and money.
Building energy prediction is based on three main methodologies: physical, hybrid, and data-driven models. Physical or "white box" models dynamically describe the thermal behavior of a building using heat and mass transfer equations. EnergyPlus, TRNSYS, and DOE-2 are some of the available software packages for building energy modeling [3]. Hybrid or "gray box" methods incorporate physical and data-driven approaches to predict building energy consumption, such as the proposal by Dong et al. [4], which combines building geometry (physical) and historical power consumption data to predict air conditioning and total power consumption for a group of residences. Furthermore, data-driven or "black box" models learn from the historical data of the target values, as well as environmental and exogenous factors, to forecast power utilization. Data-driven algorithms are divided into statistical and machine learning (ML) models. The former include methods like linear regression (LR), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA), while the latter comprise methods such as regression trees, support vector regression (SVR), artificial neural networks (ANNs), deep learning (DL), and ensemble methods [5,6].
In the last decade, machine learning methods have gained the interest of researchers and have been extensively used in building power prediction. For instance, ML and DL algorithms were used for forecasting electrical demand in airports [7], offices [8-10], educational institutions [11-13], and residential [14,15] buildings. In more detail, deep learning algorithms, namely the recurrent long short-term memory (LSTM) [16] and gated recurrent unit (GRU) [18] networks, the convolutional neural network (CNN), and the hybrid CNN-LSTM [17], perform well in power prediction due to their ability to extract the temporal dependencies between energy usage and other features such as occupancy, lighting conditions, and equipment usage. A brief summary of building energy prediction applications utilizing LSTM models is presented in [19].
Graph neural networks (GNNs) are deep learning algorithms for analyzing and modeling complex relationships, and they have been successfully applied to different scientific fields including social networks, citation networks, molecular structures, physics, chemistry, road networks, finance, etc. [20,21]. The fundamental idea behind GNNs is to extract the spatial correlation of the nodes based on their topological connections. GNNs demonstrate versatility and effectiveness in real-world challenges, and they have shown notable success in various applications, including chemical reaction prediction, question answering, image classification, disease classification, and time-series prediction [22].
Building-related prediction applications, such as energy usage, indoor environmental conditions, and occupancy, are commonly treated as time series prediction problems; however, significant progress has been made with the incorporation of graph neural networks (GNNs), which represent a novel approach to modeling complex relationships within building structures. Inspired by the success of GNNs in capturing dependencies within graph-structured data, Hu et al. [23] introduced a novel graph-based hybrid model (a spatio-temporal graph convolutional network (ST-GCN)) to embed solar-based building interdependencies in urban building energy modeling in order to predict the energy usage of several buildings on a university campus. They performed a solar analysis to construct a directed, weighted graph of the project, where nodes represent buildings and edges represent the solar impacts. Guo et al. [24] combined GCN and GRU to predict the future energy consumption of 140 locations in a large-scale aluminum profile plant located in Guangdong, China. Lu et al. [25] proposed a GCN-based model for the estimation of the design loads of complex-shaped buildings.
Furthermore, Jia et al. [26] proposed a graph-based model to predict the thermal load of a building that incorporates graph attention neural networks (GATs) and gated recurrent units (GRUs) to extract spatial and temporal dependencies, respectively. This model was applied to a dataset from a simulated, single-story, four-zone building with a general prediction accuracy (MSE) of nearly 0.01. The researchers depicted the thermal zones of the building as an unweighted and undirected graph to define the feature matrix. Moreover, Zhang et al. [27] presented a spatio-temporal, graph-based data-driven model (GNN-RNN), which showed enhanced accuracy compared to conventional deep learning models, for the indoor environment prediction and optimal control of air conditioning systems. The authors used a graph-based data representation of the central air conditioning system of the building to integrate spatial and temporal data into the forecasting model. Regarding spatial data, they considered the physical locations of the air handling units and variable air volume boxes and the connections between them. In terms of occupancy prediction, Xie and Stravoravdis [28] suggested a hybrid GCN-LSTM model, which was applied to a generated occupancy profile. The authors mapped each office layout as an undirected, unweighted graph, where the nodes indicate each room in the floor plan and the edges represent their connection.
In summarizing the recent literature on GNNs for building energy usage prediction, to the best of the authors' knowledge, not much emphasis has been given to the impact of graph construction methods on the field of building power prediction. In this study, several graph computation methodologies are examined and implemented over a GNN-RNN model to forecast the zone-level and overall energy usage of a real-world multizone, multistory building dataset.
The main contributions of this work are as follows:
1. An adaptation of graph computation techniques, which have previously been utilized in other domains, for the purpose of building load prediction.
2. An investigation, adaptation, and implementation of generic methods to extract the spatial information of building zones to build a graph that represents the relationship between zones in pairs for a prediction application.
3. The generated graphs are implemented over a GNN-RNN forecasting model, and their results are compared with a variety of statistical, machine, and deep learning algorithms in terms of accuracy and the utilization of several error metrics.
This paper is divided into the following parts: Section 2 introduces and analyzes the necessary actions related to the graph computation techniques and presents the forecasting model. Dataset description, preprocessing, conducted experiments, and their results are presented in Section 3. In Section 4, the results and limitations are discussed while some directions for future work are suggested. This paper is then concluded in Section 5.

Materials and Methods
In this section, topics related to the prediction of building power consumption and several major concepts are presented. The preliminary Section 2.1 serves as a foundational introduction to graph notation, the prediction problem definition, and the power forecasting formulation, thereby providing essential context for the subsequent analysis. Furthermore, in Section 2.2, the methodologies for adjacency matrix computation are presented, which involves the exploration of the various computational techniques employed to derive the graph structures that are utilized in the forecasting model described in Section 2.3. Lastly, in Section 2.4, the comparison models used to evaluate the efficacy of the proposed forecasting model are presented, thereby providing insight into the relative strengths and limitations of the proposed model.

Preliminaries
A multizone building power consumption prediction can be considered a spatio-temporal or, even better, a micro spatio-temporal prediction application. In order to describe the space layout or the quantitative relations between zones, a graph network approach is introduced. This graph is symbolized as $G = (V, E)$, where $V$ is a set of $N$ nodes that denote the building's thermal zones, while $E$ is a set of edges that represent the association between a pair of nodes. A Boolean (binary) adjacency matrix $A$, whose elements are 0 or 1, is expressed as $A \in \mathbb{R}^{N \times N}$, and it describes the links for each pair of nodes. The adjacency matrix element $a_{ij}$ is equal to 1 if nodes $i$ and $j$ are directly connected; otherwise, $a_{ij} = 0$. In this study, several methods are used for the computation of the adjacency matrix, and they will be analyzed in Section 2.2.
The estimation problem is defined as follows: given a sequence of power consumption values at the time steps $t+1, t+2, \ldots, t+T$, predict the future power consumption values at the time steps $t+T+1, t+T+2, \ldots, t+T+h$, where $T$ is the length of the past historical input data and $h$ is the length of the prediction horizon.
The present and historical values of the power consumption of all zones are defined as $X = (X_1, X_2, \ldots, X_T)$, while the values $x_t^i$ for every zone $i$ at every sampling time $t$ are summarized in a feature matrix $X_t = (x_t^1, x_t^2, \ldots, x_t^N) \in \mathbb{R}^{N \times D}$, where $D$ is the number of input features. Furthermore, the sequence of the building power consumption estimation for a prediction time $t$ is defined as $\hat{X}_t = (\hat{x}_t^1, \hat{x}_t^2, \ldots, \hat{x}_t^N) \in \mathbb{R}^{N \times h}$. Consequently, incorporating the abovementioned definitions, the power prediction is formulated as $\hat{X}_{t+T+h} = F(X, G)$, where $F$ is the GNN-RNN model.
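The formulation above can be illustrated with a minimal sliding-window sketch. It assumes the data are held in a NumPy array with one row per time step and one column per zone; the function name `make_windows` and the toy array are hypothetical and are not taken from the study's code.

```python
import numpy as np

def make_windows(series, T, h):
    """Slice a (num_steps, N) array into input/target pairs.

    Returns X of shape (samples, T, N) holding T past steps and
    Y of shape (samples, h, N) holding the h steps that follow.
    """
    series = np.asarray(series, dtype=float)
    num_samples = series.shape[0] - T - h + 1
    if num_samples <= 0:
        raise ValueError("series is too short for the requested T and h")
    X = np.stack([series[s:s + T] for s in range(num_samples)])
    Y = np.stack([series[s + T:s + T + h] for s in range(num_samples)])
    return X, Y

# Toy example (assumed data): 20 time steps, 3 zones.
toy = np.arange(60, dtype=float).reshape(20, 3)
X, Y = make_windows(toy, T=10, h=5)
```

Each sample pairs `T` historical steps with the `h` steps that immediately follow, which is one straightforward reading of the input and output sequences defined above.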

Adjacency Matrix Computation
Constructing a graph from the geographical location data of elements in a two-dimensional space is a straightforward task. For instance, in traffic speed prediction research, the values $a_{ij}$ of the adjacency matrix, which constructs the graph, could be inversely proportional to the geographical distance between points $i$ and $j$. In contrast, in a multistory building energy prediction study, the thermal zones are in a three-dimensional space, in which case the mapping to a graph is more complicated. In this section, the methods that are used for adjacency matrix computation are presented and analyzed.
The methods that are used in this study are divided into two categories: correlation-based and distance-based.

Correlation-Based Methods

Pearson Correlation Coefficient (PCC)
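As stated by Li et al. [29] and utilized by Zhang et al. [27], the similarities among node vectors (also known as node attributes) can serve as a quantitative measure of the correlation between nodes. Therefore, in this first method, the elements $a_{ij}$ of the adjacency matrix $A$ are calculated using Pearson's correlation coefficient [30] (Equation (1)) between the power consumption of the two zones $v_i$ and $v_j$, as described in Equation (2):
$$PCC(P_{v_i}, P_{v_j}) = \frac{\sum_{k=1}^{n} \left(P_{v_{ik}} - \bar{P}_{v_i}\right)\left(P_{v_{jk}} - \bar{P}_{v_j}\right)}{\sqrt{\sum_{k=1}^{n} \left(P_{v_{ik}} - \bar{P}_{v_i}\right)^2}\,\sqrt{\sum_{k=1}^{n} \left(P_{v_{jk}} - \bar{P}_{v_j}\right)^2}} \tag{1}$$
where $P_{v_i}$ and $P_{v_j}$ are the power consumption of zones $v_i$ and $v_j$, $P_{v_{ik}}$ and $P_{v_{jk}}$ are the individual power consumption data points, $\bar{P}_{v_i}$ and $\bar{P}_{v_j}$ are the mean values, and $n$ is the number of data points.
$$a_{ij} = \begin{cases} 1, & PCC(P_{v_i}, P_{v_j}) > \sigma \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
where PCC is the Pearson correlation coefficient and $\sigma \in [-1, 1]$ is a threshold to control the distribution and sparsity of the matrix $A$. Only the nodes that have a correlation value greater than the threshold are connected.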

Absolute Pearson Correlation Coefficient (PCCA)
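This method, as presented in Equation (3), is nearly identical to the previous one (PCC); the only difference is that the absolute value of PCC is used, which produces a graph different from the previous method when the correlation matrix has a sufficient number of negatively correlated values.
$$a_{ij} = \begin{cases} 1, & \left|PCC(P_{v_i}, P_{v_j})\right| > \sigma \\ 0, & \text{otherwise} \end{cases} \tag{3}$$
The min-max scaling used by the scaled methods is given in Equation (4):
$$X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}}\,(max - min) + min \tag{4}$$
where $X_{min}$ and $X_{max}$ are the minimum and maximum values of the input data $X$, and $min$ and $max$ are the bounds of the desired range to which the input data are transformed.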
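The correlation-based rules described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the study's implementation: the function names, the `use_abs` switch, and the toy data are hypothetical, and the sketch keeps self-connections on the diagonal because every zone correlates perfectly with itself (the text does not say whether self-loops are retained).

```python
import numpy as np

def pearson_adjacency(power, sigma=0.7, use_abs=False):
    """Binary adjacency from the zone-to-zone Pearson correlation.

    power: array of shape (num_steps, N), one column per zone.
    use_abs=False follows the PCC rule, use_abs=True the PCCA rule.
    """
    corr = np.corrcoef(np.asarray(power, dtype=float), rowvar=False)
    if use_abs:
        corr = np.abs(corr)
    # Connect only the pairs whose correlation exceeds the threshold.
    return (corr > sigma).astype(int)

def min_max_scale(x, new_min=0.0, new_max=1.0):
    """Rescale x linearly so that its values span [new_min, new_max]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        raise ValueError("cannot scale a constant array")
    return (x - x.min()) / span * (new_max - new_min) + new_min

# Toy data (assumed): zone 2 follows zone 1, zone 3 moves in the opposite direction.
t = np.arange(10.0)
power = np.column_stack([t, 2 * t + 1, -t])
A_pcc = pearson_adjacency(power, sigma=0.7)
A_pcca = pearson_adjacency(power, sigma=0.7, use_abs=True)
```

In the toy data the third zone is strongly negatively correlated with the other two, so it is connected to them only when the absolute value is taken, which is the situation in which the two methods produce different graphs.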

Distance-Based Methods
Inspired by other works on traffic prediction and molecular science, we represent the Euclidean data in the non-Euclidean space of a graph using the following method. The floors of the building are placed in a three-dimensional Cartesian coordinate system, where the lower left corner of the ground floor of the building is taken as the origin of the three axes and the coordinates of each zone's center are used.

Euclidean Distance Scaled (EDS)
A matrix with dimensions $N \times N$ was created, where each element is the spatial distance between zones $i$ and $j$, calculated by Equation (6) as the Euclidean distance in three-dimensional space, as depicted in Figure 1. The matrix is then scaled (Equation (4)) to the range [0, 1]. A filter, i.e., the threshold value $\sigma$, is applied to the matrix elements, where only the values lower than the threshold are kept. The adjacency matrix is computed according to Equation (7).
$$ED_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} \tag{6}$$
where $x_i, y_i, z_i$ and $x_j, y_j, z_j$ are the coordinates in the three-dimensional space of nodes $i$ and $j$, respectively.
$$a_{ij} = \begin{cases} 1, & ED_{scaled} < \sigma \\ 0, & \text{otherwise} \end{cases} \tag{7}$$
where $ED_{scaled}$ is the scaled Euclidean distance between a pair of thermal zones and $\sigma$ is a threshold value.

Euclidean Distance with Threshold (EDT)
In this method, the matrix elements $a_{ij}$ are equal to 1 when the Euclidean distance is smaller than the threshold value, as illustrated in Equation (8). In other words, only the nodes whose distance is less than the value of the threshold are connected.
$$a_{ij} = \begin{cases} 1, & ED_{ij} < \sigma \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

Euclidean Distance with Gaussian Kernel (EDGK)
In this last distance-based method, the elements $a_{ij}$ of the adjacency matrix are determined by the Gaussian kernel weighting function [31], which has been adapted in this study and is presented in Equation (9).
where ED is the Euclidean distance, and σ and θ are the distance and distribution thresholds among the zones, respectively.
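As an illustration of the distance-based idea, the following sketch computes pairwise three-dimensional distances between zone centers and applies the EDT-style rule (connect zones closer than the threshold). The function names and coordinates are hypothetical. Using the mean of the off-diagonal distances as the default threshold is an assumption, since the text states only that the mean distance between zones was used; the Gaussian kernel variant is not sketched because its exact form is not recoverable from the text.

```python
import numpy as np

def euclidean_distances(coords):
    """Pairwise 3D Euclidean distances between zone centers, shape (N, N)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def edt_adjacency(coords, sigma=None):
    """Connect zones whose distance is below sigma (EDT-style rule).

    If sigma is None, the mean of the off-diagonal distances is used
    (an assumption made for this sketch).
    """
    dist = euclidean_distances(coords)
    if sigma is None:
        off_diag = ~np.eye(len(dist), dtype=bool)
        sigma = dist[off_diag].mean()
    return (dist < sigma).astype(int)

# Hypothetical zone centers (x, y, z) in meters; the third zone is on a higher floor.
centers = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
D = euclidean_distances(centers)
A_edt = edt_adjacency(centers)
```

With these coordinates the pairwise distances are 5, 12, and 13, so the default threshold is 10 and only the two zones on the same floor are connected.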

Forecasting Model
In the field of predicting building energy consumption, combining graph convolutional networks (GCNs) with long short-term memory (LSTM) models provides a powerful approach for capturing the complex spatial and temporal relationships found in building energy systems. GCNs are highly effective in analyzing complex relationships within graph-structured data, such as building networks, by aggregating the information from neighboring nodes. This capability is especially valuable for modeling the spatial relationships among various components of a building, such as rooms, floors, and zones. On the other hand, LSTM models stand out by capturing temporal patterns and dependencies over time, which are crucial for forecasting energy consumption dynamics. Thus, the utilization of a GCN-LSTM model in building energy prediction tasks extracts spatial and temporal variations, thereby offering improved predictive performance and a deeper understanding of energy consumption dynamics in complex building environments.

Graph Convolutional Networks
Graph convolutional networks (GCNs) are one of the most straightforward and extensively utilized variants of graph neural networks (GNNs). Their operation is based on the aggregation of the attributes of neighboring nodes via a weighted average, where the weights are determined by the edge connections [32].
A simplified version of a GCN [33] takes a graph with a set of node features $X$ as the input and uses this information to produce node embeddings through the graph convolution operation, which is expressed as a nonlinear function, as illustrated in the following layer-wise propagation formula:
$$H^{(l+1)} = \sigma\left(A H^{(l)} W^{(l)}\right) \tag{10}$$
where $A$ is the adjacency matrix of graph $G$, $H^{(l)} \in \mathbb{R}^{N \times C}$ is the matrix of node representations at the $l$th GCN layer with $H^{(0)} = X \in \mathbb{R}^{N \times D}$ as the input, $\sigma(\cdot)$ denotes the activation function, and $W^{(l)}$ is a layer-specific trainable weight matrix.
In this study, in order to extract spatial information, a single-layer GCN was used in the prediction model, as shown in Equation (11).
$$H^{(1)} = \mathrm{ReLU}\left(A X W^{(0)}\right) \tag{11}$$
where $\mathrm{ReLU}(\cdot)$ is applied as an activation function and $W^{(0)}$ is randomly initialized using a Glorot initializer [34].
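A single graph convolution of the kind described above can be sketched in NumPy as follows. The dimensions, the toy adjacency matrix, and the helper names are assumptions chosen for illustration; the Glorot uniform formula is the commonly used one and is not quoted from the study.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Glorot (Xavier) uniform initialization for a weight matrix."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def gcn_layer(A, X, W):
    """Single graph convolution: ReLU(A X W)."""
    return np.maximum(A @ X @ W, 0.0)

rng = np.random.default_rng(0)
# Toy graph (assumed): 3 zones in a chain, with self-connections.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
X = rng.normal(size=(3, 10))      # N = 3 nodes, D = 10 features per node
W0 = glorot_uniform(10, 16, rng)  # D = 10 inputs, C = 16 output channels
H1 = gcn_layer(A, X, W0)          # node embeddings of shape (N, C)
```

Multiplying by `A` sums the features of each node's neighbors, so a zone's embedding depends only on the zones it is connected to in the chosen graph, which is why the adjacency construction method matters for the downstream forecast.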

Long Short-Term Memory
Long short-term memory (LSTM) [35] is a variation of recurrent neural networks. It has proven highly effective in numerous tasks, such as time series prediction, speech recognition, and natural language processing, due to its ability to capture temporal dependencies in sequential data.
LSTM bears a similarity to the standard RNN formation, but it uses a specialized memory unit capable of retaining or discarding information over sequential data. The memory unit or cell has four layers that interact in a special way. The computation method is explained in Equation (12) [36], and the architecture is depicted in Figure 2.
More specifically, $t$ is the time step and $x_t$ is the input to the current time step. $W$, $b$, $\sigma$, and $\tanh$ are the weight matrices, bias vectors, sigmoid activation function, and hyperbolic tangent activation function, respectively. The forget gate $f_t$ decides what information of the previous unit state to forget, while the input gate $i_t$ controls the new information to store in the memory unit. Furthermore, $c_t$ is the memory unit that stores the long-term information from previous time steps and combines the information from the forget and input gates to update the memory unit. Finally, the output gate $o_t$ determines the output of the LSTM memory unit.
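For reference, a widely used formulation of the LSTM cell is given below. It is offered as a sketch in one common notation and may differ in detail from the form used in [36]; the hidden state $h_t$, the candidate state $\tilde{c}_t$, the concatenation $[h_{t-1}, x_t]$, and the element-wise product $\odot$ are notational assumptions not defined in the text above. Here $\sigma$ denotes the sigmoid function, not the adjacency threshold.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The fourth line shows how the forget and input gates jointly update the memory unit: the forget gate scales the previous state, and the input gate scales the new candidate information.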

The Proposed Model: GCN-LSTM
The structure of the GCN-LSTM model utilized in this study, as shown in Figure 3, is a subset of the generic GNN-RNN prediction models. It consists of an input layer, a GCN layer, an LSTM layer, and a fully connected layer as the output. Each layer is described in the following list.
Input layer: A graph $G$ with $N$ nodes, which represent the building zones. The graph is generated using either the similarities in the power consumption or the distance information between the zones. The input of the GCN-LSTM model is the historical values of the $N$ zones at the $T$ time points before the prediction window of length $h$.
GCN layer: A GCN layer with a ReLU activation function is used to extract the spatial correlations between the neighbor nodes.The GCN layer uses weights to aggregate the information from the neighbor nodes based on the acquired correlation between zones.
LSTM layer: This layer is adopted as a temporal feature extraction module to capture the long-term sequential dependencies of the power consumption between zones.
Output layer: A fully connected (Dense) layer returns the power consumption prediction sequence.
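The layer sequence listed above can be traced with a minimal NumPy forward pass. This is a shape-level sketch with random, untrained weights, not the study's TensorFlow model. Several choices are assumptions made for illustration: the graph convolution is applied independently at each time step, its output is flattened before entering a single LSTM cell, and the dense layer maps the last hidden state to an $N \times h$ forecast. All names and sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gcn_lstm_forward(X, A, params, h):
    """Forward pass: GCN per time step -> LSTM over time -> dense output.

    X: (T, N, D) history, A: (N, N) adjacency. Returns an (N, h) forecast.
    """
    T, N, _ = X.shape
    units = params["b"].shape[0] // 4
    h_state = np.zeros(units)
    c_state = np.zeros(units)
    for t in range(T):
        # Spatial step: aggregate neighbor features, then apply ReLU.
        spatial = np.maximum(A @ X[t] @ params["W_gcn"], 0.0)   # (N, C)
        x_t = spatial.reshape(-1)                                # (N * C,)
        # Temporal step: one LSTM cell update with stacked gate weights.
        z = params["W"] @ np.concatenate([h_state, x_t]) + params["b"]
        f, i, o, g = np.split(z, 4)
        c_state = sigmoid(f) * c_state + sigmoid(i) * np.tanh(g)
        h_state = sigmoid(o) * np.tanh(c_state)
    # Output step: dense layer on the last hidden state.
    out = params["W_out"] @ h_state + params["b_out"]
    return out.reshape(N, h)

def init_params(N, D, C, units, h, seed=0):
    """Random small weights, used here only to exercise the shapes."""
    rng = np.random.default_rng(seed)
    return {
        "W_gcn": rng.normal(scale=0.1, size=(D, C)),
        "W": rng.normal(scale=0.1, size=(4 * units, units + N * C)),
        "b": np.zeros(4 * units),
        "W_out": rng.normal(scale=0.1, size=(N * h, units)),
        "b_out": np.zeros(N * h),
    }

# Assumed toy sizes: 4 zones, 1 feature, 8 GCN channels, 16 LSTM units.
N, D, C, units, T, h = 4, 1, 8, 16, 10, 5
A = np.eye(N) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
X = np.random.default_rng(1).normal(size=(T, N, D))
forecast = gcn_lstm_forward(X, A, init_params(N, D, C, units, h), h)
```

The sketch makes the division of labor explicit: the adjacency matrix enters only in the spatial step, while the recurrence carries information across the `T` historical steps.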

Comparison Models
The presented adjacency matrix computation methods in combination with the GCN-LSTM model are compared with several statistical, machine learning, and deep learning models for multi-output, multistep building power prediction. These baseline models are introduced below:
Historical average (HA): This is a statistical model that does not utilize online data for making predictions. For each prediction value, the average of historical values is used. Therefore, using the mean value, the prediction horizon can range from one step up to multiple steps ahead. This model is quite simple but lacks the ability to capture abrupt value changes.
Multilayer perceptron (MLP) [37]: This neural network model works by taking the historical data as the input and using multiple layers of interconnected neurons to learn patterns and relationships within the data.These patterns help the model make predictions about future values.During the training process, the model adjusts its internal parameters through backpropagation, and it compares its predictions to the actual values and updates its synaptic weights to minimize the forecast errors.
Convolutional neural network (CNN) [38]: This is a kind of deep learning model that uses sequential data as the input and convolutional layers to extract the features and patterns from the data over time.These features capture local dependencies within the time series, thereby aiding in learning the relevant information for forecasting.The main architecture of this model consists of convolutional layers; pooling layers, which the model applies to reduce dimensionality and further extract essential features; and one or more fully connected layers, which are used to make the final predictions.
Long short-term memory (LSTM): LSTM is a variant of RNN, as described in Section 2.3.2.
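Of the baselines listed above, the historical average is simple enough to sketch directly. The function name is hypothetical, and which span of history is averaged (for example, the full training set or a recent window) is an assumption left to the caller, since the text does not specify it.

```python
import numpy as np

def historical_average_forecast(history, h):
    """Repeat the per-zone mean of the history for h future steps.

    history: (num_steps, N) array. Returns an (h, N) forecast.
    """
    history = np.asarray(history, dtype=float)
    return np.tile(history.mean(axis=0), (h, 1))

# Toy history (assumed): two zones, two time steps.
ha = historical_average_forecast([[1.0, 10.0], [3.0, 20.0]], h=3)
```

Because the forecast is constant across the horizon, this baseline cannot follow abrupt changes, which matches the limitation noted in its description.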

Dataset Description
In order to verify the effectiveness of our methods, several experiments were conducted on the following real-world dataset.
The CU-BEMS [39] dataset was collected from a seven-story building with an overall area of 11,700 m² located in Bangkok, Thailand. The building is divided into 33 thermal zones. The plan of each floor is shown in Figure 4. The dataset consists of measurements of the electricity consumption, in kW, of the individual air conditioning units, lighting, and plug loads of each zone. Furthermore, the indoor environmental conditions that were recorded include temperature (°C), relative humidity (%), and ambient light (lux) values for each zone. The data were gathered from 1 July 2018 to 31 December 2019, with a resolution of one minute. This dataset has been used by other scholars to predict indoor temperature [40], real-time thermal comfort [41], and energy use [42].

Data Preprocessing
For this study, the load measurements of the same zone were aggregated to one value per time step, and the environmental measurements were discarded. The dataset contains no missing values. The null (empty) values, which are due to null power consumption, were converted to zeros. The power consumption of Floor 1, Thermal Zone 3 exhibited a spike (160 times larger than the previous and next values) at a certain time step; thus, it was replaced with a linearly interpolated value. The final dataset consists of 33 columns that represent the overall power consumption of each zone. Furthermore, the variables of the dataset were standardized using the StandardScaler function from the scikit-learn Python package before being used by the prediction algorithms. This function rescales the distribution of values such that the mean of the values is 0 and the standard deviation is 1.
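The preprocessing steps described above can be sketched with NumPy alone. This is a simplified stand-in, not the study's code: the function names are hypothetical, the standardization mirrors what a standard scaler computes (population standard deviation), and fitting the statistics on the training portion only is an assumption, since the text does not state which portion of the data was used for fitting.

```python
import numpy as np

def fill_nulls_with_zero(x):
    """Treat empty (NaN) readings as zero power consumption."""
    x = np.asarray(x, dtype=float)
    return np.where(np.isnan(x), 0.0, x)

def replace_spike(series, index):
    """Replace one interior sample with the average of its two neighbors,
    which equals linear interpolation across a single point."""
    fixed = np.asarray(series, dtype=float).copy()
    if not 0 < index < len(fixed) - 1:
        raise ValueError("index must point to an interior sample")
    fixed[index] = 0.5 * (fixed[index - 1] + fixed[index + 1])
    return fixed

def standardize(train, other):
    """Zero-mean, unit-variance scaling with statistics from `train` only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (train - mean) / std, (other - mean) / std

# Toy values (assumed): a single spike between two ordinary readings.
cleaned = replace_spike([1.0, 160.0, 3.0], index=1)
```

Reusing the training statistics for the remaining data avoids leaking information from the evaluation period into the scaling step.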

Dataset Analysis
In this section, a brief analysis of the dataset is conducted. The dataset consists of 33 time series, with 790,560 observations each, which correspond to the power consumption of the relevant zones. Therefore, this dataset is described as "large data", and a letter-value plot [43] is used to investigate the distribution and variability of the data. The letter-value plot is an extension of the box plot that shows only actual values and labels fewer observations as outliers than the box plot.
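The reported number of observations can be checked against the collection period and sampling resolution. The short calculation below assumes continuous one-minute sampling from the start of 1 July 2018 to the end of 31 December 2019, with no gaps.

```python
from datetime import datetime, timedelta

start = datetime(2018, 7, 1, 0, 0)
end = datetime(2020, 1, 1, 0, 0)  # exclusive end: midnight after 31 December 2019
minutes = int((end - start) / timedelta(minutes=1))
print(minutes)  # 790560
```

The period spans 549 days, and 549 × 1440 minutes per day gives 790,560 samples per zone, which agrees with the figure stated above.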
Figure 5a presents a letter-value plot for each thermal zone, and Figure 5b shows the horizontal bar plot of the median and mean values for each zone of the building.From these two plots, it is noticeable that most of the zones presented a mean power consumption that was lower than 10 kW and a median that was nearly 2 kW.Only a couple of the zones presented mean and median values greater than 20 kW.
Furthermore, Figure 6 shows the letter-value plot for the total power consumption of the building.It was discerned that the median was around 300 kW and that half of the total observations were distributed at 300 kW and below; meanwhile, the remainder of the total observed values were allocated between 300 kW and around 1750 kW.
Additionally, Figure 7 depicts a correlation plot that visualizes the relationships between the power consumption of the zones.It seemed that Zone 1 to Zone 4, which are located on the first floor, had a low correlation compared to the other zones.

Predictions Evaluation
The performance of each model was evaluated by the following metrics: mean absolute error in kW (MAE), mean squared error in kW² (MSE), R-squared in percentage ($R^2$), and the coefficient of variation of the root mean squared error in percentage (CV(RMSE)). These are also the most commonly used metrics in building power prediction tasks.
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
$$R^2 = \left(1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}\right) \times 100$$
$$CV(RMSE) = \frac{RMSE}{\bar{y}} \times 100$$
where $y_i$ denotes the actual values, $\bar{y}$ is the mean of the $y_i$ values, $\hat{y}_i$ represents the predicted values, and $n$ is the number of observed samples. The RMSE is defined as follows:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
Lower values of MAE and MSE [44] indicate a better prediction performance. The closer the $R^2$ metric is to 100%, the better the model fit. Moreover, as stated by Chicco et al. [45], R-squared is more informative in regression analysis evaluation than other metrics. Furthermore, CV(RMSE) values lower than 25% indicate a model with satisfactory prediction according to ASHRAE Guideline 14 [46].
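The four evaluation metrics can be written compactly in NumPy. The sketch below uses the standard textbook definitions, with $R^2$ and CV(RMSE) returned as percentages; the function names and the toy values are illustrative assumptions.

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(y_hat))))

def mse(y, y_hat):
    """Mean squared error."""
    return float(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2))

def r2_percent(y, y_hat):
    """Coefficient of determination, expressed as a percentage."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float((1.0 - ss_res / ss_tot) * 100.0)

def cv_rmse_percent(y, y_hat):
    """RMSE divided by the mean of the actual values, as a percentage."""
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(mse(y, y_hat)) / y.mean() * 100.0)

# Toy values (assumed): only the last prediction is off, by 2 units.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
```

For the toy values, the MAE is 0.5, the MSE is 1.0, $R^2$ is 20%, and CV(RMSE) is 40%, which would fall outside the 25% limit mentioned above.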

Software Environment and Experimental Setup
The research experiments were conducted on the Google Colab platform with a Python 3 Google Compute Engine backend, using an NVIDIA T4 GPU with 15.0 GB of GPU RAM, 12.7 GB of system RAM, and 78.2 GB of available disk space. Python 3.10 was used for code development, together with the open-source software library TensorFlow 2.15.0 and the Keras 3.0.2 high-level API for training and testing the algorithms. Furthermore, the following Python libraries were used: Pandas 2.1.0 and NumPy 1.26.0 for data analysis, as well as Seaborn and Matplotlib for visualizing the exploratory analysis and predicted results, respectively. Additionally, the NetworkX Python library was used for graph analysis and visualization.
The architecture of our forecasting model consists of one graph convolutional layer, a ReLU activation function, one LSTM layer with dropout, and one dense layer as the output. The dataset was divided into 70%, 10%, and 20% for training, validation, and testing, respectively. The root mean squared propagation (RMSprop) optimizer was utilized to update the prediction model parameters with a learning rate of 0.001, while the mean absolute error was chosen as the loss function. The batch size was set to 256, and the length of the historical input sequence was set to 10 time steps. Finally, an early-stopping regularization technique with a patience parameter was used during the training process to prevent model overfitting. The patience parameter value signifies the number of iterations without further improvement in the prediction performance after which the training process is terminated. Multiple patience values, ranging from 5 to 10, were tested in order to determine the best fit for the patience parameter, which was determined to be 8.
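The data split and the patience-based stopping rule described above can be illustrated in plain Python. This sketch does not reproduce the study's training loop: the function names are hypothetical, the split is assumed to be chronological (the text gives only the proportions), and the stopping rule is a generic patience counter on the validation loss.

```python
def chronological_split(n, train_pct=70, val_pct=10):
    """Return index slices for train/validation/test in time order."""
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n)

def early_stopping_epoch(val_losses, patience=8):
    """Return the 0-based epoch at which training would stop.

    Training stops at the first epoch where the validation loss has not
    improved for `patience` consecutive epochs; if that never happens,
    the last epoch is returned.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1

train_idx, val_idx, test_idx = chronological_split(100)
# Assumed loss curve: improves for two epochs, then stagnates.
stop = early_stopping_epoch([1.0, 0.9] + [0.95] * 10, patience=8)
```

With a patience of 8, the example stops at epoch index 9, eight epochs after the best validation loss was observed at epoch index 1.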

Experimental Results
In this section, the different adjacency matrices and the results of the multistep prediction experiments are presented. For the correlation-based adjacency matrix computation (AMC) methods, the value of the threshold σ was selected as 0.7 so as to compare the three methods on the same basis. Figure 8 illustrates the adjacency matrices. As presented in Figure 8a,b, the PCC and PCCA methods produced the same graph with 220 edges. This is because, in the examined dataset, taking the absolute value of the negative correlations did not change the number of values above the threshold, as illustrated in the distribution plots of Figure 9. Additionally, the PCCS method produced a graph of 289 edges, as depicted in Figure 8c.
On the other hand, for the distance-based AMC methods, the value of the threshold σ was chosen as the mean value of the Euclidean distances between the zones. Additionally, the θ threshold value was selected as 10. Hence, the aim was to connect only the nodes where the distances of the corresponding zones were smaller than the average distance. The adjacency matrices that were produced by the EDS, EDT, and EDGK methods are shown in Figures 10a, 10b, and 10c, respectively.
The above methods were evaluated at the zone and building levels for prediction horizons of 5, 10, and 15 time steps ahead. Table 1 presents the mean power prediction errors in the zones for the three prediction horizons across all of the forecasting methods. The metrics include the mean absolute error (MAE) and the mean squared error (MSE), which are measured in kW and kW², respectively, as well as the coefficient of determination ($R^2$), which is expressed as a percentage. Furthermore, Figure 11 shows the variation of the metrics across the prediction horizons for three randomly selected zones, and Figure 12 presents a plot of the predicted and the actual values for a random zone.
In contrast, Table 2 presents the performance metrics of the building-level power consumption forecasts for the forecasting models across different prediction horizons (5, 10, and 15), and the metrics are plotted in Figure 13. The metrics include the mean absolute error (MAE), the mean squared error (MSE), and the coefficient of variation of the root mean squared error (CV(RMSE)). Additionally, Figure 14 depicts the predicted and actual values of the building's total consumed power.

Discussion
In this study, the impact of several graph computation techniques that were applied over the GCN-LSTM model was investigated to predict the energy usage of an educational building. Among the computation models utilized, at the zone and building level, EDT consistently achieved the lowest MAE and MSE values on all horizons, thus indicating a superior prediction accuracy. PCCA and EDS also demonstrated competitive performance, with relatively low MAE and MSE values. In contrast, the LSTM, CNN, and MLP models exhibited higher MAE and MSE values, thereby suggesting less accurate predictions compared to the other models. For LSTM and CNN in particular, this prediction behavior can be explained by the lack of exogenous features for the training of the models.
In terms of the building-level prediction metric CV(RMSE), all methods, except for HA, produced prediction values below 25%, which shows a good model fit with a more than satisfactory total power forecasting according to ASHRAE Guideline 14. For the mean zone-level estimation metric $R^2$, the EDT method presented the confounding model fitting effect for the three prediction horizons in comparison with the other computation methods. The HA model performed the worst across all metrics and prediction horizons, with significantly higher MAE and MSE values, thus indicating poor predictive performance.
Additionally, Figures 12 and 14 illustrate the predictive accuracy of the various models across the time horizons for power forecasting at both the zone and building levels. The predictions of the proposed models closely follow the actual curves.
However, several limitations warrant consideration. First, the present study examined the computation of binary adjacency matrices over a simplified GCN-LSTM model for spatio-temporal building power prediction. This simplified model does not take into account the weights of the edges. Second, exogenous parameters, such as weather conditions and building usage in terms of occupancy, were not taken into consideration during model training. Third, the fine sampling resolution (one minute) of the utilized dataset and the educational use of the building, which presents a repetitive load shape, may have contributed to the favorable prediction results.
These findings suggest several directions for future research, such as applying the proposed methodologies to other GNN forecasting models proposed in the literature for time series estimation, like graph attention networks and GraphSAGE. It may also be of interest to apply these methods to datasets with varying levels of granularity. Moreover, future studies should analyze how the choice of threshold value affects the prediction outcomes. Additionally, the proposed methodologies can be implemented on a dataset representing a composite building type, such as residential apartments, or on a group of buildings forming a microgrid.
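As a first step toward such a threshold analysis, one could examine how the graph density changes with σ before retraining any model. The sketch below counts the undirected edges that a signed correlation rule would keep for several candidate thresholds. The correlation matrix is made up for illustration, and the function name and candidate values are assumptions rather than part of the study.

```python
def edges_above(corr, sigma):
    """Count undirected edges whose correlation is strictly above sigma."""
    n = len(corr)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if corr[i][j] > sigma)


# Hypothetical symmetric correlation matrix for four zones.
corr = [
    [1.00, 0.90, 0.75, -0.80],
    [0.90, 1.00, 0.60, 0.20],
    [0.75, 0.60, 1.00, 0.72],
    [-0.80, 0.20, 0.72, 1.00],
]

# Edge count for each candidate threshold.
sweep = {sigma: edges_above(corr, sigma) for sigma in (0.5, 0.7, 0.8, 0.9)}
```

The edge count can only stay the same or decrease as σ grows, so a sweep of this kind shows where the graph becomes too sparse to carry spatial information. The effect on forecasting accuracy would still have to be measured by training the model on each resulting graph.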
In summary, this study contributes valuable insights into the efficacy of graph neural networks and of the several graph representation methodologies used for building energy prediction.

Conclusions
In this paper, a comprehensive study was carried out to predict the energy consumption of a multistory, multizone building for multiple prediction horizons. For this purpose, several adjacency matrix computation methods were proposed over a GNN-RNN deep learning model, and they were compared with baseline machine learning and deep learning models, such as LSTM, MLP, and CNN, as well as a statistical model. The proposed adjacency matrix computation methods were divided into correlation-based and distance-based methods. More specifically, the methods based on the Pearson correlation matrix are the Pearson Correlation Coefficient (PCC), the Absolute Pearson Correlation Coefficient (PCCA), and the Pearson Correlation Coefficient Scaled (PCCS). Furthermore, the methods that have Euclidean distance among zones as their basis are the Euclidean

Figure 1. Euclidean distance computation in a Cartesian space between zones. This figure represents two floor plans, A and B, with the relevant zones of A and B.

Figure 5. (a) Letter-value plot of the thermal zones. (b) Plot of the mean and median values of each zone.

Figure 6. Letter-value plot of the building's total power.

Figure 8. (a-c) The generated adjacency matrix for each computation method. The edges between the nodes (zones) are shown in black; e.g., for the PCC method, the nodes with values lower than the threshold were not connected to the other nodes due to low or negative correlations.

Figure 10. The generated adjacency matrices for the distance-based computation methods (a-c).

Figure 12. Plot of Zone 3's actual and predicted power values grouped per prediction horizon.

Table 1. The zone-level mean prediction performance of the different models.

Table 2. Building-level prediction performance of the different models.
Figure 14. Plot of the building's total actual and predicted power values grouped per time step horizon.