1. Introduction
Growing interconnections among global financial markets make it increasingly difficult for economists to monitor economic disturbances that propagate across worldwide index systems. Standard financial forecasting systems built on machine learning (ML) and deep learning (DL) techniques rely on static feature representations that fail to capture relationships between markets. Because financial indices operate as part of a network in which spillovers transmit from one market to another, models must account for network dynamics rather than analyzing each index independently.
Inter-market relationships often exhibit symmetric patterns, because markets influence each other through reciprocal responses and bidirectional spillover effects. Correctly modeling global financial networks requires capturing these symmetric dependencies, since they define the actual structure of the system.
Graph neural networks (GNNs) offer an innovative solution for financial forecasting because they operate directly on relational data structures. Their ability to learn features from connectivity patterns makes them well suited to modeling complex, nonlinear relationships among financial indices. GNNs have been applied to stock prediction, cryptocurrency markets, and single-market indices, but their use for modeling global financial spillovers remains underdeveloped. Because GNNs aggregate information from connected nodes, they naturally encode symmetric relationships in graph structures whose edges represent reciprocal market behavior. These models therefore have the potential to improve prediction accuracy by identifying otherwise hidden market connections.
Despite this potential, the adoption of GNNs in financial forecasting faces two fundamental barriers. First, existing GNN finance research primarily studies localized financial data and has not demonstrated effective use in international markets: although financial markets are deeply interconnected, researchers rarely examine how GNN-based models perform when processing multiple worldwide indices with differing economic conditions, risk profiles, and regulatory systems. Second, GNN forecasting methods require more thorough validation, since several studies report predictive accuracy improvements based on insufficient cross-validation and ambiguous statistical significance.
This study aims to address these gaps by answering the following core research questions: (1) Can GNN-based models such as GCN and GAT predict global financial index returns more effectively than traditional ML methods? (2) Under what market conditions do graph-based models exhibit performance advantages? (3) Can a correlation-thresholded graph structure constructed from return data capture global spillover dynamics without relying on external variables?
To this end, the objectives of this paper are threefold: (i) to design and validate a GNN-based framework for forecasting global financial indices using only return-based data; (ii) to benchmark GCN and GAT models against standard ML baselines such as RF, XGBoost, SVM, MLP, and KNN through repeated time-series cross-validation; and (iii) to identify market regimes (e.g., volatility, crisis recovery) where GNNs offer statistically significant advantages.
This study opens new possibilities for GNN-based forecasting by targeting global market applications and building a complete validation framework. It evaluates the performance of Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) in predicting daily market returns for the fifteen most highly capitalized global financial indices. The chosen indices represent diverse economic conditions and market interaction effects, so the results generalize beyond individual economies. The method operates solely on intrinsic market data, which improves data efficiency and removes the need for external indicators that could introduce delays and bias.
The findings are established through 30 repeated time-series cross-validation experiments, which provide a robust methodological framework for statistical assessment. Models are evaluated using RMSE and MAE, with differences tested at the 10%, 5%, and 1% significance levels. The study also examines the specific market environments in which GNNs outperform conventional models, distinguishing between periods of high market volatility, post-crisis recovery phases, and times of economic stability. This analysis identifies when GNNs excel and which conditions maximize their predictive benefits.
The results benefit financial econometrics and quantitative finance in several ways. GNNs outperform traditional models by detecting intricate nonlinear relationships among worldwide indices, and graph-based models excel at capturing symmetric financial interdependencies, producing stronger and more interpretable forecasts. Repeated testing demonstrates the durability of graph-based methods across varied market environments. Applied to global financial forecasting, GNNs can thus serve as efficient tools for investment strategy development, portfolio risk management, and macroeconomic prediction.
This work makes three notable contributions to the field of financial forecasting. First, it proposes a graph construction methodology based on thresholded correlation networks, specifically tailored to capturing dynamic interdependencies among global financial indices. Second, it conducts rigorous evaluation using statistical testing at multiple significance levels, ensuring reliable comparisons across models. Third, the results demonstrate that GCN and GAT models significantly outperform traditional ML methods under specific market regimes, highlighting their practical value in macroeconomic forecasting and risk-aware investment strategy.
The remainder of the paper is organized as follows. Section 2 reviews GNN applications in financial forecasting and their existing methodological limitations. Section 3 explains the research methodology, covering data acquisition, graph construction, and model implementation. Section 4 reports the experimental findings comparing GNNs against traditional machine learning models and examines the market circumstances under which GNN-based forecasting performs better. Section 5 discusses implications and limitations, and Section 6 concludes with future research directions.
2. Literature Review
The identification of financial patterns remains a fundamental challenge pursued in econometrics, machine learning, and deep learning research. Traditional time-series forecasting methods include autoregressive integrated moving average (ARIMA) models (Box et al., 2015) [1], vector autoregression (VAR) models (Sims, 1980) [2], and GARCH models (Bollerslev, 1986) [3]. These methods have proven useful for modeling historical financial patterns, yet they struggle to recognize intricate connections between worldwide financial indices during times of economic turbulence. ML and DL research introduced alternatives to traditional approaches, including support vector regression (Drucker et al., 1997) [4], random forests (Breiman, 2001) [5], and long short-term memory (LSTM) networks (Hochreiter & Schmidhuber, 1997) [6]. Building on this trend, Choi and Kim (2025) [7] extend the scope of analysis beyond predictive accuracy, showing that diverse graph-based network analysis methods can reveal distinct perspectives and patterns in sector-based financial instruments’ price discrepancies, thereby underscoring the value of method-specific insights. Although such predictive models show enhanced accuracy, they often fail to capture the underlying financial market relationships, as noted by Ozbayoglu et al. (2019) [8]. Foroutan and Lahmiri (2024) [9] found that the Temporal Convolutional Network (TCN) proved most effective in forecasting WTI, Brent, and silver prices, while the BiGRU model excelled at gold price prediction, providing essential information for investors, policymakers, and other market stakeholders. Zhang et al. (2023) [10] explained that their EEMD-PSO-LSSVM-ICSS-GARCH hybrid model achieved superior prediction accuracy for the NASDAQ CTA Artificial Intelligence and Robotics (AIRO) Index returns because of its ability to handle complex structural characteristics.
Graph-based learning represents an innovative solution to financial forecasting problems because it provides an organized method for analyzing market dependency structures and spillover patterns. GNNs have been widely applied to structured data problems across various domains, including physics (Battaglia et al., 2018) [11], biology (Gilmer et al., 2017) [12], and social networks (Hamilton et al., 2017) [13]. Their adoption for financial market analysis is recent but gaining momentum quickly. Xiang et al. (2023) [14] showed that GNNs can identify temporal relationships in stock market predictions better than traditional machine learning and deep learning methods. Choi and Kim (2023) [15] applied a graph-based approach to forecast downside risks in global financial markets by constructing inflation rate-adjusted dependence networks among 21 major indices. Chen et al. (2023) [16] further developed this concept by uniting natural language processing (NLP) with GNNs to add sentiment analysis capabilities to financial prediction models. Zhou et al. (2025) [17] demonstrated how their evolving multiscale graph neural network (EMGNN) framework leads to superior cryptocurrency volatility prediction by modeling interactions between cryptocurrency and conventional financial markets, thus helping risk management and policy development. Similarly, Yin et al. (2024) [18] proposed a GNN-based strategy that combines the financial stress index with cryptocurrency forecasting, confirming the model’s ability to capture macro-financial stress propagation mechanisms that affect digital asset pricing.
Recent innovations in graph structure learning further enhance graph and GNN capabilities. Fan et al. (2025) [19] proposed the CCGIB framework to balance shared and channel-specific representations, enabling richer modeling of multiview financial structures. Choi et al. (2024) [20] also contributed by introducing an augmented representation framework that encodes temporal statistical-space priors into graph models, showing improved accuracy in handling volatile time series under complex dependencies. Meanwhile, Fan et al. (2024) [21] introduced Neural Gaussian Similarity Modeling to enable differentiable and scalable graph construction, which is well suited for financial data with high node similarity.
Zhang et al. (2024) [22] explored the application of graph neural networks to power grid operational risk assessment under dynamic topological conditions, demonstrating that GNNs can reliably predict system-wide and localized risk indicators despite uncertainty in future grid configurations. Dong et al. (2024) [23] developed a dynamic fraud detection framework by integrating reinforcement learning into graph neural networks, addressing key challenges such as label imbalance, feature distortion from highly connected nodes, and evolving fraud patterns.
The foundational Graph Convolutional Networks (GCNs) (Kipf & Welling, 2017) [24] and Graph Attention Networks (GATs) (Veličković et al., 2018) [25] extract useful node-level features from network connectivity, making them suitable for financial forecasting applications. Researchers have developed further graph-based learning models, such as GraphSAGE (Hamilton et al., 2017) [13] and temporal graph networks (Rossi et al., 2020) [26], which demonstrate the potential to model financial relationships as they change over time. Stock-level prediction remains the primary focus of current applications, while macroeconomic forecasting remains underdeveloped.
The use of GNNs for global financial forecasting needs further exploration despite recent progress. Existing research focuses mainly on individual stocks, specific sector indices, and single-market datasets, which hinders the generalization of findings to interconnected financial markets. Most current GNN-based research also fails to study how economic disturbances transmit between markets, a crucial element for systemic risk assessment and contagion modeling. Applications of graph-based models to financial risk prediction by Cheng et al. (2022) [27], Choi and Kim (2024) [28], and Das et al. (2024) [29] did not perform rigorous tests against traditional econometric models and ignored how GNNs behave under varying economic scenarios. This highlights the need for a comprehensive and robust framework that evaluates GNN performance across diverse regimes, international indices, and volatility conditions.
This study addresses these knowledge gaps by implementing GNNs on international financial indices within an advanced validation framework that surpasses previous studies. It employs extensive hyperparameter tuning combined with 30 repeated time-series cross-validation experiments, rather than traditional single-period backtesting or static train-test splits. This evaluation method delivers robust and statistically sound performance assessments across multiple market conditions and time periods, and the comprehensive validation process addresses model overfitting and temporal data leakage, establishing reliable performance comparisons between prediction models.
A further original contribution to financial network research is the use of Graph Neural Networks (GNNs) to show how these networks identify spillover effects, systemic risk transmission, and structural dependencies between international markets. Current time-series methods treat financial indicators independently, disregarding the fundamental connections between them. The proposed approach uses graph models to reveal hidden network relationships, delivering an integrated predictive system for complex market connections.
Through correlation-based network formation, the study develops financial graphs that support an efficient forecasting system without requiring macroeconomic indicators. This benefits emerging markets in particular, where abundant economic data can be difficult to obtain. The study also sets itself apart from previous work by systematically examining the conditions that optimize GNN performance, evaluating predictive outcomes across different financial environments such as high-volatility crises, post-crisis recoveries, and stable economic conditions.
This work contributes to financial econometrics, quantitative finance, and systemic risk modeling. It connects financial network representation to predictive analytics by applying graph-based forecasting methods to international market indices, and it demonstrates that forecasting accuracy depends heavily on structural properties of the financial network, including symmetry and interconnectedness. The empirical results show that GNN-based methods deliver both robustness and data efficiency, providing new insight into how network structures affect macroeconomic predictions. These findings establish a solid base for hybrid modeling systems that combine graph neural networks with conventional econometric models to improve forecasting in today’s interconnected global financial system.
3. Methodology and Data
3.1. Methodology
This study aims to predict financial returns by measuring the spillover effects among global indices. Specifically, it examines whether utilizing graph-based embeddings improves predictive performance compared to benchmark models that rely solely on raw data. The benchmark models include Random Forest, XGBoost, Multi-Layer Perceptron (MLP), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), which were used for baseline predictions. The results of these models were then compared with predictions made using embeddings generated through Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT). Our research flow is presented in Figure 1.
3.1.1. Benchmark Regression Models
Random Forest Regressor
Random Forest (Breiman, 2001) [5] is an ensemble learning method that builds multiple decision trees and aggregates their outputs to enhance predictive performance. In regression tasks, it reduces variance by averaging the predictions of individual trees:

$$\hat{y} = \frac{1}{T} \sum_{i=1}^{T} f_i(x)$$

where $f_i(x)$ is the prediction of the i-th tree.
XGBoost Regressor
XGBoost (Chen & Guestrin, 2016) [30] is a gradient boosting algorithm that optimizes decision trees iteratively to minimize regression loss by reducing bias and variance. It uses a differentiable loss function, such as mean squared error, to improve predictive accuracy:

$$\hat{y}^{(t)} = \hat{y}^{(t-1)} + \eta\, f_t(x)$$

where $\hat{y}^{(t)}$ is the updated model, $\eta$ is the learning rate, and $f_t(x)$ is the tree fitted to the residuals of the regression.
Multi-Layer Perceptron Regressor
Multi-Layer Perceptron (Rosenblatt, 1958) [31] is a neural network model consisting of multiple layers of neurons that learn nonlinear mappings through backpropagation:

$$h^{(l)} = \sigma\left(W^{(l)} h^{(l-1)} + b^{(l)}\right)$$

where $\sigma$ is the activation function, and $W^{(l)}$ and $b^{(l)}$ are the weights and biases for the $l$-th layer, with $l = 1, \dots, L$.
K-Nearest Neighbors Regressor
K-Nearest Neighbors (Cover & Hart, 1967) [32] is a non-parametric algorithm that predicts a sample’s value by averaging the target values of its k-nearest neighbors in feature space.
Support Vector Regressor
Support Vector Machine (Cortes & Vapnik, 1995) [33] is a supervised learning model that finds the optimal hyperplane to predict continuous values by minimizing error within a specified margin (ε-insensitive zone):

$$\min_{w,\,b,\,\xi,\,\xi^*} \; \frac{1}{2}\lVert w\rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - (w^\top x_i + b) \le \varepsilon + \xi_i,\;\; (w^\top x_i + b) - y_i \le \varepsilon + \xi_i^*,\;\; \xi_i, \xi_i^* \ge 0$$

where $\xi_i, \xi_i^*$ are the slack variables, $w$ is the weight vector, $b$ is the bias, and $y_i$ is the label of $x_i$. The parameter $\varepsilon$ is the insensitive loss, defining the margin of error, and $C$ is the penalty applied when a sample falls outside the margin of error.
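For concreteness, the five baselines can be assembled with standard libraries; the following is a minimal sketch assuming scikit-learn and the xgboost package with default settings (the actual configurations are listed in Appendix B, Table A7), with `X_train`, `y_train`, `X_test`, and `y_test` as placeholder arrays.

```python
# Minimal sketch of the five benchmark regressors, assuming scikit-learn and xgboost
# with default settings; X_train/y_train/X_test/y_test are placeholder return arrays.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, mean_absolute_error
from xgboost import XGBRegressor

baselines = {
    "RF": RandomForestRegressor(),
    "XGBoost": XGBRegressor(),
    "MLP": MLPRegressor(),
    "KNN": KNeighborsRegressor(),
    "SVM": SVR(),
}

def evaluate_baselines(X_train, y_train, X_test, y_test):
    """Fit each baseline on the training window and report RMSE/MAE on the test window."""
    scores = {}
    for name, model in baselines.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
        mae = float(mean_absolute_error(y_test, pred))
        scores[name] = (rmse, mae)
    return scores
```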
3.1.2. Graph Models
A graph is a data structure composed of a set of nodes (also referred to as vertices) and edges that connect these nodes. It is generally defined as $G = (V, E)$, where $V$ represents the set of nodes and $E$ represents the set of edges connecting pairs of nodes. In this study, the nodes represent individual stock indices, while the edges are weighted by the correlation between two indices. GNNs generally utilize a message-passing mechanism to calculate the embeddings of each node. This involves updating a node’s embedding by leveraging its relationships with neighboring nodes. GNNs employ an iterative process where the feature information of neighboring nodes is aggregated and integrated into the representation of the central node. This process includes aggregation and update operations performed through multiple stacked layers. As a result, GNNs enable more accurate predictions based on the provided graph data. In this study, two GNN variants were used for prediction.
Graph Convolutional Network (GCN)
A Graph Convolutional Network (GCN) is a model that utilizes both the characteristics of nodes and the structure of the graph to learn from graph data. A node’s embeddings are updated by combining its own features with those of its neighbors. The embedding update in one layer is represented by the equation below:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2}\, \tilde{A}\, \tilde{D}^{-1/2}\, H^{(l)} W^{(l)}\right)$$

The input is the initial node feature matrix $H^{(0)} = X$. $\tilde{A} = A + I$ is the adjacency matrix of the graph with self-loops added; the element $\tilde{A}_{ij}$ indicates whether there is an edge between node $i$ and node $j$. The term $\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ normalizes the data to make it easier to learn from, where $\tilde{D}$ is a diagonal degree matrix ($\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$). $H^{(l)}$ and $W^{(l)}$ represent the node embedding matrix and the learnable weight matrix at the $l$-th layer, and $\sigma$ is the activation function. The model aggregates information from neighboring nodes at each layer and updates the current node’s representation. The output values are obtained from the final layer $H^{(L)}$. This step is called the forward pass.

The backpropagation step is the process of updating the weight matrix $W^{(l)}$ for each layer, which minimizes the loss function below:

$$\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2$$

It calculates the gradient $\partial \mathcal{L} / \partial W^{(l)}$ and updates the weights accordingly.
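As an illustration of the forward pass above, the following is a minimal sketch of a two-layer GCN regressor, assuming the PyTorch Geometric library; the hidden dimension and ReLU activation are illustrative assumptions rather than the exact configuration used in the experiments.

```python
# Minimal two-layer GCN regressor sketch, assuming PyTorch Geometric.
# Hidden size and activation are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCNRegressor(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)   # normalized-adjacency propagation, first layer
        self.conv2 = GCNConv(hidden_dim, 1)        # final layer maps node embeddings to a return

    def forward(self, x, edge_index, edge_weight=None):
        h = F.relu(self.conv1(x, edge_index, edge_weight))
        return self.conv2(h, edge_index, edge_weight).squeeze(-1)
```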
Graph Attention Network (GAT)
The Graph Attention Network (GAT) introduces an attention mechanism to graph neural networks. Instead of treating all neighbors equally (as in GCN), GAT assigns different attention weights to neighboring nodes, allowing the model to focus on the most relevant neighbors during feature aggregation. The main purpose of the GAT model is to generate a new set of node features $h_i'$ as an output from the previous input $h_i$. To proceed with this step, the model first calculates the attention coefficient:

$$e_{ij} = \mathrm{LeakyReLU}\left(\mathbf{a}^\top \left[ W h_i \,\Vert\, W h_j \right]\right)$$

where $\mathbf{a}$ is the attention vector (learnable parameter) and $W$ is the weight matrix which can be trained by the model.

Next, the attention coefficient is normalized by using the softmax function:

$$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}$$

Lastly, the features of neighbor nodes are aggregated using the normalized attention coefficients:

$$h_i' = \sigma\left( \sum_{j \in \mathcal{N}_i} \alpha_{ij}\, W h_j \right)$$
The GAT model utilizes a multi-head attention mechanism to improve model stability and expressiveness. In the intermediate layers, the attention heads are concatenated, whereas in the final layer, they are averaged to reduce dimensionality and enhance stability.
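A corresponding sketch for GAT, again assuming PyTorch Geometric, with attention heads concatenated in the intermediate layer and averaged in the final layer as described above; the head count and hidden size are illustrative assumptions.

```python
# Minimal two-layer GAT regressor sketch, assuming PyTorch Geometric.
# heads/hidden size are illustrative; concat=False averages the final-layer heads.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GATRegressor(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=32, heads=4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hidden_dim, heads=heads, concat=True)
        self.conv2 = GATConv(hidden_dim * heads, 1, heads=heads, concat=False)

    def forward(self, x, edge_index):
        h = F.elu(self.conv1(x, edge_index))       # attention-weighted neighborhood aggregation
        return self.conv2(h, edge_index).squeeze(-1)
```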
3.1.3. Optimization and Computational Complexity
The model is trained by minimizing the Mean Squared Error (MSE) loss between predicted and actual returns. While the MSE loss is convex in the output predictions, the overall optimization landscape is non-convex because of the nonlinear activation functions (e.g., ReLU, ELU) and stacked neural layers in both the GCN and GAT architectures. Such non-convexity is typical of deep learning models, and stochastic gradient-based optimization methods (e.g., Adam) find generalizable local minima in practice.
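A minimal training loop consistent with this setup is sketched below, assuming PyTorch’s Adam optimizer and MSE loss; apart from the 100 epochs mentioned in Section 3.4, the settings (e.g., learning rate) are placeholder assumptions.

```python
# Sketch of the optimization step: Adam minimizing MSE over 100 epochs.
# `model` is a GCN/GAT regressor as sketched above; `data` is a PyTorch Geometric
# Data object with node features x, edge_index, and next-day returns y.
import torch

def train(model, data, epochs=100, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        pred = model(data.x, data.edge_index)
        loss = loss_fn(pred, data.y)   # convex in the outputs, non-convex in the weights
        loss.backward()                # gradients flow through the stacked graph layers
        optimizer.step()
    return model
```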
The proposed models—Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs)—exhibit polynomial computational complexity, which supports their applicability to large-scale financial networks. For GCNs, the dominant computational cost arises from matrix multiplications involving node features and weight parameters, yielding a complexity of $O(N \cdot F \cdot F')$, where $N$ is the number of nodes, $F$ is the input feature dimension, and $F'$ is the output feature dimension. In practice, this cost is significantly reduced by leveraging sparse adjacency matrices, which is appropriate for financial graphs that typically exhibit sparse connectivity.
GATs have a complexity of $O(|E| \cdot F \cdot F')$, where $|E|$ denotes the number of edges. This is because attention coefficients are computed only for connected node pairs. Although the self-attention mechanism could lead to a worst-case complexity of $O(N^2)$ in dense graphs, this is mitigated in real-world applications by limiting attention computations to each node’s local neighbors.
Therefore, under the common assumption of sparsity in financial networks, both GCNs and GATs offer computationally feasible and scalable solutions for modeling complex inter-market relationships.
3.2. Data
To analyze global financial spillover effects, we constructed a dataset of major stock indices spanning from 1 January 2004 to 31 December 2024. The initial list of indices was obtained by crawling the main stock index listings available on Investing.com, which provides a comprehensive and up-to-date registry of representative indices from major global economies. For each index, daily closing prices and trading volume data were collected through automated crawling of Yahoo Finance, which offers a reliable source of historical financial market data.
The initial collection yielded 41 global stock indices, covering a diverse range of regions including North America, Europe, and Asia. During the data preprocessing phase, we evaluated the completeness of each index’s time series. Indices with excessive missing values or incomplete coverage over the 21-year sample period were excluded to maintain consistency in longitudinal analysis. This filtering process resulted in 25 indices with sufficient data quality.
Among these, we further narrowed the sample to 15 representative indices, selected based on relative market importance. Specifically, we computed a proxy for market capitalization by taking the cumulative sum of the product of daily closing price and trading volume over the full period. The top 15 indices with the highest cumulative value were retained, under the rationale that these markets play a more influential role in transmitting financial shocks across borders and thus offer richer insights into spillover dynamics.
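As a sketch of this selection step, assuming pandas DataFrames `close` and `volume` (dates × indices) holding the collected series:

```python
# Sketch of the market-importance proxy: cumulative sum of close * volume per index,
# then keep the 15 indices with the largest cumulative value.
# `close` and `volume` are assumed pandas DataFrames (dates x indices).
proxy = (close * volume).sum()                         # cumulative price-volume product per index
top15 = proxy.sort_values(ascending=False).head(15).index
close_top15 = close[top15]                             # retained indices for the study
```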
To prepare the data for modeling and analysis, the closing prices were transformed into daily log return series, which standardizes the data and enhances stationarity—a critical requirement for many econometric and machine learning forecasting models.
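The log-return transformation itself is a one-line operation; the sketch below assumes the `close_top15` DataFrame from the previous step.

```python
# Daily log returns from closing prices: r_t = ln(P_t / P_{t-1}).
import numpy as np

returns = np.log(close_top15 / close_top15.shift(1)).dropna()
```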
3.3. Graph Data Generation
The original tabular data underwent a transformation to create a graph-structured dataset which enabled graph-based learning. The graph model uses stock indices as nodes to represent individual elements while edges represent historical correlation relationships between them. The financial network construction process retained only those edges which demonstrated absolute correlation coefficients exceeding 0.3.
The chosen threshold followed both statistical conventions and financial network analysis principles. The absolute correlation coefficient of 0.3 or higher in correlation analysis indicates moderate to strong relationships because positive values (≥0.3) show meaningful positive associations and negative values (≤−0.3) indicate significant inverse relationships. The graph structure becomes less reliable when correlations fall below this threshold because they introduce excessive noise.
Financial network modeling research shows that low correlation values tend to change frequently, which makes them unreliable for long-term market analysis. The 0.3 threshold selection helps the model detect important financial connections while avoiding temporary market fluctuations and random statistical noise. The model achieves better robustness through this balance because it selects meaningful financial dependencies which represent actual market structures. The graph structures are plotted in Appendix C, Figure A1a–oo.
The graph structure remains static throughout each 6-month evaluation period but changes dynamically between time segments. The graphs derive from thresholded correlation values (|ρ| ≥ 0.3) which produce sparse undirected weighted adjacency matrices. The models do not require uniform connectivity because they use natural variations in edge density and degree distribution which emerge from actual market data. The models maintain flexibility to adjust their financial interdependency strength during different periods. The graph structure gets reconstructed independently for each rolling window to handle time variations.
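A minimal sketch of the per-window graph construction is given below, assuming a pandas DataFrame of daily log returns for one 6-month segment; representing the graph as an edge list (edge_index/edge_weight) follows the PyTorch Geometric convention, and using the absolute correlation as the edge weight is a simplifying assumption.

```python
# Sketch: build a sparse, undirected, weighted graph from one 6-month return window
# by keeping only pairs with |correlation| >= 0.3. `window` is assumed to be a
# pandas DataFrame of daily log returns (days x 15 indices).
import torch

def build_graph(window, threshold=0.3):
    corr = window.corr().values                    # pairwise correlation matrix
    n = corr.shape[0]
    src, dst, w = [], [], []
    for i in range(n):
        for j in range(n):
            if i != j and abs(corr[i, j]) >= threshold:
                src.append(i)                      # both (i, j) and (j, i) kept -> undirected
                dst.append(j)
                w.append(abs(corr[i, j]))          # edge weight: correlation strength (assumption)
    edge_index = torch.tensor([src, dst], dtype=torch.long)
    edge_weight = torch.tensor(w, dtype=torch.float)
    return edge_index, edge_weight
```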
3.4. Experimental Design
The baseline experiment used 15 indices in tabular form with five models from Section 3.1.1 for prediction. A total of 41 segments were tested, each comprising a 6-month training set and a 6-month test set, as detailed in Appendix A, Table A1.
The prediction task was to forecast the next day’s rate of return (1-day-ahead prediction). For the feature variables, data from time steps 1 to N−1 (where N is the total number of rows for each test period) across all index columns were used. This process was repeated by shifting the target index for each experiment, and 30 iterations were performed for each target to ensure robust results. Mean RMSE and MAE were calculated for every target index to evaluate model performance comprehensively.
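The 1-day-ahead setup described above can be sketched as follows, assuming the same return DataFrame; the 30-iteration loop and the model-fitting step are omitted for brevity.

```python
# Sketch of the 1-day-ahead prediction setup: returns of all 15 indices at day t
# are the features, and the target index's return at day t+1 is the label.
# `window` is assumed to be a pandas DataFrame of daily log returns for one segment.
import numpy as np

def make_supervised(window, target_col):
    X = window.iloc[:-1].values              # time steps 1 .. N-1, all index columns
    y = window[target_col].iloc[1:].values   # next-day return of the target index
    return X, y

def rmse_mae(y_true, y_pred):
    err = y_true - y_pred
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(np.abs(err)))
```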
The graph-based experiments were conducted using the GCN and GAT models. Embeddings were generated for each node and edge based on the previously constructed graph dataset. For each combination, experiments were performed for all target indices with 30 loops per target, and each loop consisted of 100 epochs to ensure the robustness of the results. Default settings for the ML models, as outlined in Table A7 of Appendix B, were based on the configurations specified in the respective papers or implementation packages.
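The significance results reported in Section 4 compare errors across the 30 repeated runs; a plausible sketch of such a comparison is shown below, assuming SciPy and a paired t-test over per-run RMSE values (the exact test variant is an assumption, since it is not stated explicitly here).

```python
# Sketch of the statistical comparison between a GNN and a baseline:
# paired t-test over per-run RMSE values from the 30 repeated experiments.
# The one-sided paired t-test is an illustrative assumption.
from scipy import stats

def compare_models(rmse_gnn, rmse_baseline, alpha=0.05):
    """rmse_gnn, rmse_baseline: arrays of length 30 (one RMSE per repeated run)."""
    t_stat, p_two_sided = stats.ttest_rel(rmse_baseline, rmse_gnn)
    p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
    return t_stat, p_one_sided, p_one_sided < alpha
```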
4. Results
- (1)
Pre-Crisis Period (2004–2007): Early Signal Detection
During the relatively stable pre-crisis years, both models showed moderate yet statistically significant improvements over traditional approaches. In particular, Test 6, conducted in the first half of 2007 amid geopolitical tension, revealed that GAT outperformed MLP (t = 3.8764, p = 0.0003), while GCN results were comparable (t = 3.8670, p = 0.0003).
These findings suggest that even under modest volatility, graph-based models began to detect early structural changes in global financial linkages—although their relative advantage over baselines was not yet dramatic.
- (2)
Crisis Periods: 2008 Global Financial Crisis & European Debt Crisis
The models achieved their strongest performance during the 2008 Global Financial Crisis, a time when market interdependencies intensified dramatically.
Test 10, conducted after the collapse of Lehman Brothers, showed that GAT significantly outperformed XGBoost (t = 3.4109, p = 0.0011) and MLP (t = 2.2339, p = 0.0169), while GCN produced slightly lower but still competitive results (XGBoost: t = 3.3862, p = 0.0012; MLP: t = 2.2054, p = 0.0180). These results confirm the superiority of GNNs in capturing systemic market shocks and nonlinear contagion paths.
Similarly, during the European Debt Crisis (2010–2012), the models continued to perform strongly. In Test 14 (early 2011), GAT significantly outperformed MLP (t = 3.6446, p = 0.0006). Under heightened uncertainty due to Greece’s bailout negotiations (Test 16), GAT again surpassed XGBoost (t = 1.9666), MLP (t = 2.0200), and KNN (t = 2.2302). GCN also showed competitive performance (e.g., KNN: t = 2.2227, p = 0.0175).
These results emphasize the models’ ability to monitor sovereign risk events and shifts in market sentiment.
- (3)
Recovery Phases: Post-GFC and Post-COVID-19
As markets entered recovery, GAT and GCN maintained strong predictive capabilities by adapting to evolving inter-market structures.
Test 11 (late 2009) indicated that GAT significantly outperformed XGBoost (t = 2.5188), MLP (t = 2.8342), and KNN (t = 3.5566), with similar results for GCN (e.g., KNN: t = 3.5029, p = 0.0008). During the COVID-19 crisis, both models again delivered outstanding results.
In Test 31 (early 2020), GAT strongly outperformed MLP in RMSE (t = 4.6089, p < 0.0001). In Test 33, GAT showed clear advantages over MLP (p = 0.0029) and SVM (p = 0.0034) in MAE, with GCN producing comparable outcomes.
Test 34 further showed that GAT remained superior even in the post-COVID adjustment phase, outperforming MLP (t = 2.9313) and SVM (t = 2.2648). These results demonstrate that GNNs are highly effective during periods of rapid regime shifts and volatility spikes.
- (4)
Stable and Mixed Regimes: 2013–2019 and Recent Inflationary Periods
The performance gap between GNNs and baseline models narrowed during relatively calm periods. From 2013 to 2019, when markets were supported by quantitative easing, GAT still managed to outperform traditional models in some cases such as Test 20 (2014) where it beat MLP (t = 2.8370, p = 0.0043).
During Test 25 (2016), amid the Brexit referendum, GAT demonstrated robustness again, outperforming MLP (t = 1.8035, p = 0.0410). In more recent years marked by inflation and monetary tightening, both GAT and GCN remained competitive.
In Test 37 (early 2022), GAT outperformed MLP (t = 1.7436, p = 0.0462), with GCN producing comparable results (t = 1.7395, p = 0.0465). In Test 39 (2023), GAT and GCN both significantly surpassed MLP (GAT: t = 2.3703, p = 0.0129; GCN: t = 2.3651, p = 0.0130), validating their responsiveness to inflation-driven market adjustments and broader macroeconomic structural changes.
Taken together, the experimental results demonstrate that graph-based models provide superior forecasting power compared to traditional methods, especially under periods of crisis and structural transition. GAT and GCN performed best during events such as the 2008 Global Financial Crisis, European Debt Crisis, Brexit referendum, COVID-19 pandemic, and recent inflationary shocks, all of which were marked by intense shifts in market connectivity. In contrast, during stable periods such as Tests 8 (2008) and 24 (2016), the performance gap narrowed, showing that GNNs’ effectiveness depends heavily on market conditions. The results confirm that GNNs are particularly effective in modeling dynamic, symmetric, and nonlinear spillover effects, which traditional models often fail to capture. These findings suggest that GAT and GCN are especially valuable for systemic risk analysis, adaptive forecasting, and financial decision support in volatile and interconnected market environments.
5. Discussion
This study presents robust empirical evidence indicating that Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) outperform conventional machine learning and deep learning models in forecasting financial returns across international markets. The advantage of GNN-based models becomes particularly pronounced during periods of heightened market volatility, systemic crises, and subsequent recovery phases—periods characterized by strengthened interdependencies and structural changes within financial networks, which GNNs are uniquely equipped to capture.
Both GCNs and GATs learn representations directly from dynamically evolving network structures. Their architectural design allows for the modeling of symmetric financial relationships, including mutual influences and bidirectional spillovers. GATs, in particular, utilize attention mechanisms that assign learnable weights to neighbor nodes, enabling the model to emphasize the most salient inter-market connections. This adaptive capacity enhances the model’s responsiveness to regime shifts and sudden economic disturbances. Conversely, GCNs are more effective in relatively stable markets, where neighborhood aggregation suffices to capture persistent dependency structures.
These architectural differences explain why GATs consistently outperformed other models during crises such as the 2008 Global Financial Crisis and the COVID-19 shock, while GCNs maintained more stable performance during less volatile periods. These findings suggest that the choice of GNN architecture should be guided by the structural characteristics of the market network and prevailing economic conditions.
The current model operates under a centralized training framework, where financial index data is aggregated and processed in a single computational environment. This design avoids the communication overhead and compression bottlenecks that often occur in distributed systems. However, we acknowledge that centralized architectures may face scalability limitations, especially in real-time, multi-agent financial environments.
Recent advancements such as the work by Doostmohammadian et al. (2025) [34] in IEEE Transactions on Automation Science and Engineering have demonstrated that log-scale quantization in distributed first-order optimization algorithms can substantially reduce communication costs while preserving convergence performance. Their method enables learning over networks of geographically distributed agents by exchanging quantized gradient information, offering practical benefits in bandwidth-constrained settings.
While our current implementation does not incorporate distributed or quantized optimization techniques, we recognize the relevance of such approaches for future extensions of this work. In particular, integrating log-quantized GNN training within decentralized financial forecasting systems could enhance scalability, reduce latency, and support edge-based deployment.
The performance of GNN models is highly sensitive to the structure of the underlying financial network. In this study, graphs were constructed based on pairwise correlations with a threshold of |ρ| ≥ 0.3, producing sparse, undirected networks that evolve over time. The density and connectivity of these graphs vary across market regimes, directly influencing the flow of information during training. During crises, increased graph connectivity creates dense inter-market feedback loops, enhancing the predictive power of GNNs. In contrast, sparse or disconnected graphs—more common in stable periods—limit relational learning and reduce the model’s relative advantage. Thus, model effectiveness is closely tied to graph sparsity, degree distribution, and the temporal stability of edge weights.
While correlation-based graph construction offers computational efficiency and interpretability, it captures only linear and symmetric relationships. This approach cannot account for more complex, nonlinear, or causal dependencies often present in financial systems. Future research should explore advanced graph construction techniques based on mutual information, Granger causality, or transfer entropy to more accurately represent the dynamics of financial markets. Incorporating causal inference and temporal structure would improve both the robustness and interpretability of GNN-based models.
Another key limitation pertains to model interpretability. Although GATs provide some transparency via attention weights, deep graph models generally lack intuitive explanations for their predictions. For practical adoption in financial decision-making, enhanced explainability is essential. The development of visualization tools and post hoc interpretation techniques is therefore critical for promoting trust and accountability in graph-based forecasting systems. Also, integrating Explainable AI (XAI) techniques—such as feature attribution, counterfactual analysis, or graph-specific saliency mapping—into GNN-based forecasting frameworks could further enhance transparency and stakeholder trust. Such integration would allow market analysts to not only observe prediction outputs but also understand the underlying drivers of inter-market dependencies and systemic risk signals. Given that XAI approaches have already been successfully applied in diverse domains such as marketing, healthcare, and policy analytics, their adoption in financial forecasting could similarly improve interpretability, user confidence, and practical decision-making capabilities [35,36,37].
In summary, GNNs offer a flexible and powerful framework for capturing the dynamic, symmetric relationships that emerge during structural changes in global financial systems. Their effectiveness is most evident when traditional models fail to adapt to nonlinear regime shifts. However, their deployment requires careful consideration of graph structure, computational demands, and interpretability. Future progress in distributed graph learning, explainable GNNs, and causal graph construction will be key to realizing their full potential in real-time financial applications across interconnected markets.
6. Conclusions
This study demonstrates that Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) are effective tools for forecasting financial returns, particularly in environments characterized by structural volatility and systemic risk. By leveraging the underlying financial network structures, these models capture complex, symmetric relationships and nonlinear interdependencies that traditional machine learning and deep learning approaches often fail to model adequately.
Through a comprehensive benchmarking framework, we evaluate GCNs and GATs against standard machine learning models across various market regimes. The results consistently show that graph-based models achieve superior predictive performance during periods of market disruption, such as the 2008 Global Financial Crisis and the COVID-19 pandemic. These performance gains stem from the models’ capacity to learn dynamic patterns and directional spillovers embedded within evolving financial networks—an ability that proves critical in capturing contagion effects and regime transitions.
Nevertheless, our findings also highlight several limitations. The computational complexity of GNNs increases with network size, and their relative advantage diminishes in stable market conditions where inter-index relationships are more static and linear. In such contexts, simpler models like XGBoost or MLP often perform comparably at significantly lower computational cost. These trade-offs suggest that GNNs should be applied selectively—ideally during periods of heightened interconnectivity or structural change, where their relational modeling capabilities provide meaningful benefits.
To enable broader adoption in real-world financial systems, further developments are required in three areas: (1) scalable and adaptive graph construction methods that reflect temporal changes in market topology, (2) computationally efficient training and inference procedures suited for high-frequency environments, and (3) enhanced model interpretability through explainable AI techniques.
Ultimately, this research contributes an integrated framework that connects financial network modeling with predictive modeling techniques, offering a graph-based approach to capturing structural evolution in global markets. The flexibility of GNNs in adapting to nonlinear dynamics and uncovering latent inter-market structures positions them as valuable tools for macroeconomic forecasting, portfolio allocation, and systemic risk monitoring. Their demonstrated ability to identify structural breaks and directional dependencies offers significant potential for informing institutional decision-making and developing next-generation decision-support systems in finance.