1. Introduction
Growing interconnections among global financial markets make it increasingly difficult for economists to monitor economic disturbances that propagate across worldwide index systems. Standard financial forecasting systems built on machine learning (ML) and deep learning (DL) techniques rely on static feature representations that fail to capture relationships between markets. Because financial indices operate as part of a network in which spillovers transmit from one market to another, models must account for network dynamics rather than analyzing each index independently.
Inter-market relationships often exhibit symmetric patterns, because markets influence each other through reciprocal responses and bidirectional spillover effects. Correctly modeling global financial networks requires capturing these symmetric dependencies, since they define the actual structure of the system.
Graph neural networks (GNNs) offer an innovative solution for financial forecasting because they operate directly on relational data structures. Their ability to learn features from connectivity patterns makes them well suited to modeling complex, nonlinear relationships among financial indices. GNNs have been applied to stock prediction, cryptocurrency markets, and single-market indices, but their use for modeling global financial spillovers remains underdeveloped. Because GNNs aggregate information from connected nodes, they naturally encode symmetric relationships in graph structures whose edges represent reciprocal market behavior. These models therefore have the potential to improve prediction accuracy by identifying otherwise hidden market connections.
Despite this potential, the adoption of GNNs in financial forecasting faces two fundamental barriers. First, existing GNN finance research primarily studies localized financial data and has not demonstrated effective use in international markets: although financial markets are deeply interconnected, researchers rarely examine how GNN-based models perform when processing multiple worldwide indices with differing economic conditions, risk profiles, and regulatory systems. Second, GNN forecasting methods require more thorough validation, since several studies report predictive accuracy improvements based on insufficient cross-validation and ambiguous statistical significance.
This study aims to address these gaps by answering the following core research questions: (1) Can GNN-based models such as GCN and GAT predict global financial index returns more effectively than traditional ML methods? (2) Under what market conditions do graph-based models exhibit performance advantages? (3) Can a correlation-thresholded graph structure constructed from return data capture global spillover dynamics without relying on external variables?
To this end, the objectives of this paper are threefold: (i) to design and validate a GNN-based framework for forecasting global financial indices using only return-based data; (ii) to benchmark GCN and GAT models against standard ML baselines such as RF, XGBoost, SVM, MLP, and KNN through repeated time-series cross-validation; and (iii) to identify market regimes (e.g., volatility, crisis recovery) where GNNs offer statistically significant advantages.
This study opens new possibilities for GNN-based forecasting by targeting global market applications and building a complete validation framework. It evaluates the performance of Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) in predicting daily market returns for the fifteen most highly capitalized global financial indices. The chosen indices represent diverse economic conditions and market interaction effects, so the results generalize beyond individual economies. The method operates solely on intrinsic market data, which improves data efficiency and removes the need for external indicators that could introduce delays and bias.
The findings are established through 30 repeated time-series cross-validation experiments, which provide a robust methodological framework for statistical assessment. Models are evaluated using RMSE and MAE, with differences tested at the 10%, 5%, and 1% significance levels. The study also examines the specific market environments in which GNNs outperform conventional models, distinguishing between periods of high market volatility, post-crisis recovery phases, and times of economic stability. This analysis identifies when GNNs excel and which conditions maximize their predictive benefits.
The results benefit financial econometrics and quantitative finance in several ways. GNNs outperform traditional models by detecting intricate nonlinear relationships among worldwide indices, and graph-based models excel at capturing symmetric financial interdependencies, producing stronger and more interpretable forecasts. Repeated testing demonstrates the durability of graph-based methods across varied market environments. Applied to global financial forecasting, GNNs can thus serve as efficient tools for investment strategy development, portfolio risk management, and macroeconomic prediction.
This work makes three notable contributions to the field of financial forecasting. First, it proposes a graph construction methodology based on thresholded correlation networks, specifically tailored to capturing dynamic interdependencies among global financial indices. Second, it conducts rigorous evaluation using statistical testing at multiple significance levels, ensuring reliable comparisons across models. Third, the results demonstrate that GCN and GAT models significantly outperform traditional ML methods under specific market regimes, highlighting their practical value in macroeconomic forecasting and risk-aware investment strategy.
The remainder of the paper is organized as follows. Section 2 reviews GNN applications in financial forecasting and their existing methodological limitations. Section 3 explains the research methodology, covering data acquisition, graph construction, and model implementation. Section 4 reports the experimental findings comparing GNNs against traditional machine learning models and examines the market circumstances under which GNN-based forecasting performs better. Section 5 discusses implications and limitations, and Section 6 concludes with future research directions.
2. Literature Review
The identification of financial patterns remains a fundamental challenge pursued in econometrics, machine learning, and deep learning research. Traditional time-series forecasting methods include autoregressive integrated moving average (ARIMA) models (Box et al., 2015) [1], vector autoregression (VAR) models (Sims, 1980) [2], and GARCH models (Bollerslev, 1986) [3]. These methods have proven useful for modeling historical financial patterns, yet they struggle to recognize intricate connections between worldwide financial indices during times of economic turbulence. ML and DL research introduced alternatives to traditional approaches, including support vector regression (Drucker et al., 1997) [4], random forests (Breiman, 2001) [5], and long short-term memory (LSTM) networks (Hochreiter & Schmidhuber, 1997) [6]. Building on this trend, Choi and Kim (2025) [7] extend the scope of analysis beyond predictive accuracy, showing that diverse graph-based network analysis methods can reveal distinct perspectives and patterns in sector-based financial instruments’ price discrepancies, thereby underscoring the value of method-specific insights. Although such predictive models show enhanced accuracy, they often fail to capture the underlying financial market relationships, as noted by Ozbayoglu et al. (2019) [8]. Foroutan and Lahmiri (2024) [9] found that the Temporal Convolutional Network (TCN) proved most effective in forecasting WTI, Brent, and silver prices, while the BiGRU model excelled at gold price prediction, providing essential information for investors, policymakers, and other market stakeholders. Zhang et al. (2023) [10] explained that their EEMD-PSO-LSSVM-ICSS-GARCH hybrid model achieved superior prediction accuracy for the NASDAQ CTA Artificial Intelligence and Robotics (AIRO) Index returns because of its ability to handle complex structural characteristics.
Graph-based learning represents an innovative solution to financial forecasting problems because it provides an organized method for analyzing market dependency structures and spillover patterns. GNNs have been widely applied to structured data problems across various domains, including physics (Battaglia et al., 2018) [11], biology (Gilmer et al., 2017) [12], and social networks (Hamilton et al., 2017) [13]. Their adoption for financial market analysis is recent but gaining momentum quickly. Xiang et al. (2023) [14] showed that GNNs can identify temporal relationships in stock market predictions better than traditional machine learning and deep learning methods. Choi and Kim (2023) [15] applied a graph-based approach to forecast downside risks in global financial markets by constructing inflation rate-adjusted dependence networks among 21 major indices. Chen et al. (2023) [16] further developed this concept by uniting natural language processing (NLP) with GNNs to add sentiment analysis capabilities to financial prediction models. Zhou et al. (2025) [17] demonstrated how their evolving multiscale graph neural network (EMGNN) framework leads to superior cryptocurrency volatility prediction by modeling interactions between cryptocurrency and conventional financial markets, thus helping risk management and policy development. Similarly, Yin et al. (2024) [18] proposed a GNN-based strategy that combines the financial stress index with cryptocurrency forecasting, confirming the model’s ability to capture macro-financial stress propagation mechanisms that affect digital asset pricing.
Recent innovations in graph structure learning further enhance graph and GNN capabilities. Fan et al. (2025) [19] proposed the CCGIB framework to balance shared and channel-specific representations, enabling richer modeling of multiview financial structures. Choi et al. (2024) [20] also contributed by introducing an augmented representation framework that encodes temporal statistical-space priors into graph models, showing improved accuracy in handling volatile time series under complex dependencies. Meanwhile, Fan et al. (2024) [21] introduced Neural Gaussian Similarity Modeling to enable differentiable and scalable graph construction, which is well suited for financial data with high node similarity.
Zhang et al. (2024) [22] explored the application of graph neural networks to power grid operational risk assessment under dynamic topological conditions, demonstrating that GNNs can reliably predict system-wide and localized risk indicators despite uncertainty in future grid configurations. Dong et al. (2024) [23] developed a dynamic fraud detection framework by integrating reinforcement learning into graph neural networks, addressing key challenges such as label imbalance, feature distortion from highly connected nodes, and evolving fraud patterns.
The foundational Graph Convolutional Networks (GCNs) (Kipf & Welling, 2017) [24] and Graph Attention Networks (GATs) (Veličković et al., 2018) [25] extract useful node-level features from network connectivity, making them suitable for financial forecasting applications. Researchers have developed further graph-based learning models, such as GraphSAGE (Hamilton et al., 2017) [13] and temporal graph networks (Rossi et al., 2020) [26], which demonstrate the potential to model financial relationships as they change over time. Stock-level prediction remains the primary focus of current applications, while macroeconomic forecasting remains underdeveloped.
The use of GNNs for global financial forecasting needs further exploration despite recent progress. Existing research focuses mainly on individual stocks, specific sector indices, and single-market datasets, which hinders the generalization of findings to interconnected financial markets. Most current GNN-based research also fails to study how economic disturbances transmit between markets, a crucial element for systemic risk assessment and contagion modeling. Applications of graph-based models to financial risk prediction by Cheng et al. (2022) [27], Choi and Kim (2024) [28], and Das et al. (2024) [29] did not perform rigorous tests against traditional econometric models and ignored how GNNs behave under varying economic scenarios. This highlights the need for a comprehensive and robust framework that evaluates GNN performance across diverse regimes, international indices, and volatility conditions.
This study addresses these knowledge gaps by implementing GNNs on international financial indices within an advanced validation framework that surpasses previous studies. It employs extensive hyperparameter tuning combined with 30 repeated time-series cross-validation experiments, rather than traditional single-period backtesting or static train-test splits. This evaluation method delivers robust and statistically sound performance assessments across multiple market conditions and time periods, and the comprehensive validation process addresses model overfitting and temporal data leakage, establishing reliable performance comparisons between prediction models.
A further original contribution to financial network research is the use of Graph Neural Networks (GNNs) to show how these networks identify spillover effects, systemic risk transmission, and structural dependencies between international markets. Current time-series methods treat financial indicators independently, disregarding the fundamental connections between them. The proposed approach uses graph models to reveal hidden network relationships, delivering an integrated predictive system for complex market connections.
Through correlation-based network formation, the study develops financial graphs that support an efficient forecasting system without requiring macroeconomic indicators. This benefits emerging markets in particular, where abundant economic data can be difficult to obtain. The study also sets itself apart from previous work by systematically examining the conditions that optimize GNN performance, evaluating predictive outcomes across different financial environments such as high-volatility crises, post-crisis recoveries, and stable economic conditions.
This work contributes to financial econometrics, quantitative finance, and systemic risk modeling. It connects financial network representation to predictive analytics by applying graph-based forecasting methods to international market indices, and it demonstrates that forecasting accuracy depends heavily on structural properties of the financial network, including symmetry and interconnectedness. The empirical results show that GNN-based methods deliver both robustness and data efficiency, providing new insight into how network structures affect macroeconomic predictions. These findings establish a solid base for hybrid modeling systems that combine graph neural networks with conventional econometric models to improve forecasting in today’s interconnected global financial system.
3. Methodology and Data
3.1. Methodology
This study aims to predict financial returns by measuring the spillover effects among global indices. Specifically, it examines whether utilizing graph-based embeddings improves predictive performance compared to benchmark models that rely solely on raw data. The benchmark models include Random Forest, XGBoost, Multi-Layer Perceptron (MLP), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), which were used for baseline predictions. The results of these models were then compared with predictions made using embeddings generated through Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT). Our research flow is presented in Figure 1.
3.1.1. Benchmark Regression Models
Random Forest Regressor
Random Forest (Breiman, 2001) [5] is an ensemble learning method that builds multiple decision trees and aggregates their outputs to enhance predictive performance. In regression tasks, it reduces variance by averaging the predictions of individual trees:

$$\hat{y} = \frac{1}{T} \sum_{i=1}^{T} f_i(x)$$

where $f_i(x)$ is the prediction of the i-th tree.
XGBoost Regressor
XGBoost (Chen & Guestrin, 2016) [30] is a gradient boosting algorithm that optimizes decision trees iteratively to minimize regression loss by reducing bias and variance. It uses a differentiable loss function, such as mean squared error, to improve predictive accuracy:

$$\hat{y}^{(t)} = \hat{y}^{(t-1)} + \eta\, f_t(x)$$

where $\hat{y}^{(t)}$ is the updated model, $\eta$ is the learning rate, and $f_t(x)$ is the tree fitted to the residuals of the regression.
Multi-Layer Perceptron Regressor
Multi-Layer Perceptron (Rosenblatt, 1958) [31] is a neural network model consisting of multiple layers of neurons that learn nonlinear mappings through backpropagation:

$$h^{(l)} = \sigma\left(W^{(l)} h^{(l-1)} + b^{(l)}\right)$$

where $\sigma$ is the activation function, and $W^{(l)}$ and $b^{(l)}$ are the weights and biases for the $l$-th layer, with $l = 1, \dots, L$.
K-Nearest Neighbors Regressor
K-Nearest Neighbors (Cover & Hart, 1967) [32] is a non-parametric algorithm that predicts a sample’s value by averaging the target values of its k-nearest neighbors in feature space.
Support Vector Regressor
Support Vector Machine (Cortes & Vapnik, 1995) [33] is a supervised learning model that finds the optimal hyperplane to predict continuous values by minimizing error within a specified margin (ε-insensitive zone):

$$\min_{w,\,b,\,\xi,\,\xi^*} \; \frac{1}{2}\lVert w\rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - (w^\top x_i + b) \le \varepsilon + \xi_i,\;\; (w^\top x_i + b) - y_i \le \varepsilon + \xi_i^*,\;\; \xi_i, \xi_i^* \ge 0$$

where $\xi_i, \xi_i^*$ are the slack variables, $w$ is the weight vector, $b$ is the bias, and $y_i$ is the label of $x_i$. The parameter $\varepsilon$ is the insensitive loss, defining the margin of error, and $C$ is the penalty applied when a sample falls outside the margin of error.
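For concreteness, the five baselines can be assembled with standard libraries; the following is a minimal sketch assuming scikit-learn and the xgboost package with default settings (the actual configurations are listed in Appendix B, Table A7), with `X_train`, `y_train`, `X_test`, and `y_test` as placeholder arrays.

```python
# Minimal sketch of the five benchmark regressors, assuming scikit-learn and xgboost
# with default settings; X_train/y_train/X_test/y_test are placeholder return arrays.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, mean_absolute_error
from xgboost import XGBRegressor

baselines = {
    "RF": RandomForestRegressor(),
    "XGBoost": XGBRegressor(),
    "MLP": MLPRegressor(),
    "KNN": KNeighborsRegressor(),
    "SVM": SVR(),
}

def evaluate_baselines(X_train, y_train, X_test, y_test):
    """Fit each baseline on the training window and report RMSE/MAE on the test window."""
    scores = {}
    for name, model in baselines.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
        mae = float(mean_absolute_error(y_test, pred))
        scores[name] = (rmse, mae)
    return scores
```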
3.1.2. Graph Models
A graph is a data structure composed of a set of nodes (also referred to as vertices) and edges that connect these nodes. It is generally defined as $G = (V, E)$, where $V$ represents the set of nodes and $E$ represents the set of edges connecting pairs of nodes. In this study, the nodes represent individual stock indices, while the edges are weighted by the correlation between two indices. GNNs generally utilize a message-passing mechanism to calculate the embeddings of each node. This involves updating a node’s embedding by leveraging its relationships with neighboring nodes. GNNs employ an iterative process where the feature information of neighboring nodes is aggregated and integrated into the representation of the central node. This process includes aggregation and update operations performed through multiple stacked layers. As a result, GNNs enable more accurate predictions based on the provided graph data. In this study, two GNN variants were used for prediction.
Graph Convolutional Network (GCN)
A Graph Convolutional Network (GCN) is a model that utilizes both the characteristics of nodes and the structure of the graph to learn from graph data. A node’s embeddings are updated by combining its own features with those of its neighbors. The embedding update in one layer is represented by the equation below:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2}\, \tilde{A}\, \tilde{D}^{-1/2}\, H^{(l)} W^{(l)}\right)$$

The input is the initial node feature matrix $H^{(0)} = X$. $\tilde{A} = A + I$ is the adjacency matrix of the graph with self-loops added; the element $\tilde{A}_{ij}$ indicates whether there is an edge between node $i$ and node $j$. The term $\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ normalizes the data to make it easier to learn from, where $\tilde{D}$ is a diagonal degree matrix ($\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$). $H^{(l)}$ and $W^{(l)}$ represent the node embedding matrix and the learnable weight matrix at the $l$-th layer, and $\sigma$ is the activation function. The model aggregates information from neighboring nodes at each layer and updates the current node’s representation. The output values are obtained from the final layer $H^{(L)}$. This step is called the forward pass.

The backpropagation step is the process of updating the weight matrix $W^{(l)}$ for each layer, which minimizes the loss function below:

$$\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2$$

It calculates the gradient $\partial \mathcal{L} / \partial W^{(l)}$ and updates the weights accordingly.
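As an illustration of the forward pass above, the following is a minimal sketch of a two-layer GCN regressor, assuming the PyTorch Geometric library; the hidden dimension and ReLU activation are illustrative assumptions rather than the exact configuration used in the experiments.

```python
# Minimal two-layer GCN regressor sketch, assuming PyTorch Geometric.
# Hidden size and activation are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCNRegressor(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)   # normalized-adjacency propagation, first layer
        self.conv2 = GCNConv(hidden_dim, 1)        # final layer maps node embeddings to a return

    def forward(self, x, edge_index, edge_weight=None):
        h = F.relu(self.conv1(x, edge_index, edge_weight))
        return self.conv2(h, edge_index, edge_weight).squeeze(-1)
```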
Graph Attention Network (GAT)
The Graph Attention Network (GAT) introduces an attention mechanism to graph neural networks. Instead of treating all neighbors equally (as in GCN), GAT assigns different attention weights to neighboring nodes, allowing the model to focus on the most relevant neighbors during feature aggregation. The main purpose of the GAT model is to generate a new set of node features $h_i'$ as an output from the previous input $h_i$. To proceed with this step, the model first calculates the attention coefficient:

$$e_{ij} = \mathrm{LeakyReLU}\left(\mathbf{a}^\top \left[ W h_i \,\Vert\, W h_j \right]\right)$$

where $\mathbf{a}$ is the attention vector (learnable parameter) and $W$ is the weight matrix which can be trained by the model.

Next, the attention coefficient is normalized by using the softmax function:

$$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_i} \exp(e_{ik})}$$

Lastly, the features of neighbor nodes are aggregated using the normalized attention coefficients:

$$h_i' = \sigma\left( \sum_{j \in \mathcal{N}_i} \alpha_{ij}\, W h_j \right)$$
The GAT model utilizes a multi-head attention mechanism to improve model stability and expressiveness. In the intermediate layers, the attention heads are concatenated, whereas in the final layer, they are averaged to reduce dimensionality and enhance stability.
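A corresponding sketch for GAT, again assuming PyTorch Geometric, with attention heads concatenated in the intermediate layer and averaged in the final layer as described above; the head count and hidden size are illustrative assumptions.

```python
# Minimal two-layer GAT regressor sketch, assuming PyTorch Geometric.
# heads/hidden size are illustrative; concat=False averages the final-layer heads.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GATRegressor(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim=32, heads=4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hidden_dim, heads=heads, concat=True)
        self.conv2 = GATConv(hidden_dim * heads, 1, heads=heads, concat=False)

    def forward(self, x, edge_index):
        h = F.elu(self.conv1(x, edge_index))       # attention-weighted neighborhood aggregation
        return self.conv2(h, edge_index).squeeze(-1)
```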
3.1.3. Optimization and Computational Complexity
The model is trained by minimizing the Mean Squared Error (MSE) loss between predicted and actual returns. While the MSE loss is convex in the output predictions, the overall optimization landscape is non-convex because of the nonlinear activation functions (e.g., ReLU, ELU) and stacked neural layers in both the GCN and GAT architectures. Such non-convexity is typical of deep learning models, and stochastic gradient-based optimization methods (e.g., Adam) find generalizable local minima in practice.
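A minimal training loop consistent with this setup is sketched below, assuming PyTorch’s Adam optimizer and MSE loss; apart from the 100 epochs mentioned in Section 3.4, the settings (e.g., learning rate) are placeholder assumptions.

```python
# Sketch of the optimization step: Adam minimizing MSE over 100 epochs.
# `model` is a GCN/GAT regressor as sketched above; `data` is a PyTorch Geometric
# Data object with node features x, edge_index, and next-day returns y.
import torch

def train(model, data, epochs=100, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        pred = model(data.x, data.edge_index)
        loss = loss_fn(pred, data.y)   # convex in the outputs, non-convex in the weights
        loss.backward()                # gradients flow through the stacked graph layers
        optimizer.step()
    return model
```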
The proposed models—Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs)—exhibit polynomial computational complexity, which supports their applicability to large-scale financial networks. For GCNs, the dominant computational cost arises from matrix multiplications involving node features and weight parameters, yielding a complexity of $O(N \cdot F \cdot F')$, where $N$ is the number of nodes, $F$ is the input feature dimension, and $F'$ is the output feature dimension. In practice, this cost is significantly reduced by leveraging sparse adjacency matrices, which is appropriate for financial graphs that typically exhibit sparse connectivity.
GATs have a complexity of $O(|E| \cdot F \cdot F')$, where $|E|$ denotes the number of edges. This is because attention coefficients are computed only for connected node pairs. Although the self-attention mechanism could lead to a worst-case complexity of $O(N^2)$ in dense graphs, this is mitigated in real-world applications by limiting attention computations to each node’s local neighbors.
Therefore, under the common assumption of sparsity in financial networks, both GCNs and GATs offer computationally feasible and scalable solutions for modeling complex inter-market relationships.
3.2. Data
To analyze global financial spillover effects, we constructed a dataset of major stock indices spanning from 1 January 2004 to 31 December 2024. The initial list of indices was obtained by crawling the main stock index listings available on Investing.com, which provides a comprehensive and up-to-date registry of representative indices from major global economies. For each index, daily closing prices and trading volume data were collected through automated crawling of Yahoo Finance, which offers a reliable source of historical financial market data.
The initial collection yielded 41 global stock indices, covering a diverse range of regions including North America, Europe, and Asia. During the data preprocessing phase, we evaluated the completeness of each index’s time series. Indices with excessive missing values or incomplete coverage over the 21-year sample period were excluded to maintain consistency in longitudinal analysis. This filtering process resulted in 25 indices with sufficient data quality.
Among these, we further narrowed the sample to 15 representative indices, selected based on relative market importance. Specifically, we computed a proxy for market capitalization by taking the cumulative sum of the product of daily closing price and trading volume over the full period. The top 15 indices with the highest cumulative value were retained, under the rationale that these markets play a more influential role in transmitting financial shocks across borders and thus offer richer insights into spillover dynamics.
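As a sketch of this selection step, assuming pandas DataFrames `close` and `volume` (dates × indices) holding the collected series:

```python
# Sketch of the market-importance proxy: cumulative sum of close * volume per index,
# then keep the 15 indices with the largest cumulative value.
# `close` and `volume` are assumed pandas DataFrames (dates x indices).
proxy = (close * volume).sum()                         # cumulative price-volume product per index
top15 = proxy.sort_values(ascending=False).head(15).index
close_top15 = close[top15]                             # retained indices for the study
```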
To prepare the data for modeling and analysis, the closing prices were transformed into daily log return series, which standardizes the data and enhances stationarity—a critical requirement for many econometric and machine learning forecasting models.
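The log-return transformation itself is a one-line operation; the sketch below assumes the `close_top15` DataFrame from the previous step.

```python
# Daily log returns from closing prices: r_t = ln(P_t / P_{t-1}).
import numpy as np

returns = np.log(close_top15 / close_top15.shift(1)).dropna()
```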
3.3. Graph Data Generation
The original tabular data underwent a transformation to create a graph-structured dataset which enabled graph-based learning. The graph model uses stock indices as nodes to represent individual elements while edges represent historical correlation relationships between them. The financial network construction process retained only those edges which demonstrated absolute correlation coefficients exceeding 0.3.
The chosen threshold followed both statistical conventions and financial network analysis principles. The absolute correlation coefficient of 0.3 or higher in correlation analysis indicates moderate to strong relationships because positive values (≥0.3) show meaningful positive associations and negative values (≤−0.3) indicate significant inverse relationships. The graph structure becomes less reliable when correlations fall below this threshold because they introduce excessive noise.
Financial network modeling research shows that low correlation values tend to change frequently, which makes them unreliable for long-term market analysis. The 0.3 threshold selection helps the model detect important financial connections while avoiding temporary market fluctuations and random statistical noise. The model achieves better robustness through this balance because it selects meaningful financial dependencies which represent actual market structures. The graph structures are plotted in Appendix C, Figure A1a–oo.
The graph structure remains static throughout each 6-month evaluation period but changes dynamically between time segments. The graphs derive from thresholded correlation values (|ρ| ≥ 0.3) which produce sparse undirected weighted adjacency matrices. The models do not require uniform connectivity because they use natural variations in edge density and degree distribution which emerge from actual market data. The models maintain flexibility to adjust their financial interdependency strength during different periods. The graph structure gets reconstructed independently for each rolling window to handle time variations.
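A minimal sketch of the per-window graph construction is given below, assuming a pandas DataFrame of daily log returns for one 6-month segment; representing the graph as an edge list (edge_index/edge_weight) follows the PyTorch Geometric convention, and using the absolute correlation as the edge weight is a simplifying assumption.

```python
# Sketch: build a sparse, undirected, weighted graph from one 6-month return window
# by keeping only pairs with |correlation| >= 0.3. `window` is assumed to be a
# pandas DataFrame of daily log returns (days x 15 indices).
import torch

def build_graph(window, threshold=0.3):
    corr = window.corr().values                    # pairwise correlation matrix
    n = corr.shape[0]
    src, dst, w = [], [], []
    for i in range(n):
        for j in range(n):
            if i != j and abs(corr[i, j]) >= threshold:
                src.append(i)                      # both (i, j) and (j, i) kept -> undirected
                dst.append(j)
                w.append(abs(corr[i, j]))          # edge weight: correlation strength (assumption)
    edge_index = torch.tensor([src, dst], dtype=torch.long)
    edge_weight = torch.tensor(w, dtype=torch.float)
    return edge_index, edge_weight
```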
3.4. Experimental Design
The baseline experiment used 15 indices in tabular form with five models from Section 3.1.1 for prediction. A total of 41 segments were tested, each comprising a 6-month training set and a 6-month test set, as detailed in Appendix A, Table A1.
The prediction task was to forecast the next day’s rate of return (1-day-ahead prediction). For the feature variables, data from time steps 1 to N−1 (where N is the total number of rows for each test period) across all index columns were used. This process was repeated by shifting the target index for each experiment, and 30 iterations were performed for each target to ensure robust results. Mean RMSE and MAE were calculated for every target index to evaluate model performance comprehensively.
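The 1-day-ahead setup described above can be sketched as follows, assuming the same return DataFrame; the 30-iteration loop and the model-fitting step are omitted for brevity.

```python
# Sketch of the 1-day-ahead prediction setup: returns of all 15 indices at day t
# are the features, and the target index's return at day t+1 is the label.
# `window` is assumed to be a pandas DataFrame of daily log returns for one segment.
import numpy as np

def make_supervised(window, target_col):
    X = window.iloc[:-1].values              # time steps 1 .. N-1, all index columns
    y = window[target_col].iloc[1:].values   # next-day return of the target index
    return X, y

def rmse_mae(y_true, y_pred):
    err = y_true - y_pred
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(np.abs(err)))
```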
The graph-based experiments were conducted using the GCN and GAT models. Embeddings were generated for each node and edge based on the previously constructed graph dataset. For each combination, experiments were performed for all target indices with 30 loops per target, and each loop consisted of 100 epochs to ensure the robustness of the results. Default settings for the ML models, as outlined in Table A7 of Appendix B, were based on the configurations specified in the respective papers or implementation packages.
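The significance results reported in Section 4 compare errors across the 30 repeated runs; a plausible sketch of such a comparison is shown below, assuming SciPy and a paired t-test over per-run RMSE values (the exact test variant is an assumption, since it is not stated explicitly here).

```python
# Sketch of the statistical comparison between a GNN and a baseline:
# paired t-test over per-run RMSE values from the 30 repeated experiments.
# The one-sided paired t-test is an illustrative assumption.
from scipy import stats

def compare_models(rmse_gnn, rmse_baseline, alpha=0.05):
    """rmse_gnn, rmse_baseline: arrays of length 30 (one RMSE per repeated run)."""
    t_stat, p_two_sided = stats.ttest_rel(rmse_baseline, rmse_gnn)
    p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
    return t_stat, p_one_sided, p_one_sided < alpha
```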
4. Results
- (1)
Pre-Crisis Period (2004–2007): Early Signal Detection
During the relatively stable pre-crisis years, both models showed moderate yet statistically significant improvements over traditional approaches. In particular, Test 6, conducted in the first half of 2007 amid geopolitical tension, revealed that GAT outperformed MLP (t = 3.8764, p = 0.0003), while GCN results were comparable (t = 3.8670, p = 0.0003).
These findings suggest that even under modest volatility, graph-based models began to detect early structural changes in global financial linkages—although their relative advantage over baselines was not yet dramatic.
- (2)
Crisis Periods: 2008 Global Financial Crisis & European Debt Crisis
The models achieved their strongest performance during the 2008 Global Financial Crisis, a time when market interdependencies intensified dramatically.
Test 10, conducted after the collapse of Lehman Brothers, showed that GAT significantly outperformed XGBoost (t = 3.4109, p = 0.0011) and MLP (t = 2.2339, p = 0.0169), while GCN produced slightly lower but still competitive results (XGBoost: t = 3.3862, p = 0.0012; MLP: t = 2.2054, p = 0.0180). These results confirm the superiority of GNNs in capturing systemic market shocks and nonlinear contagion paths.
Similarly, during the European Debt Crisis (2010–2012), the models continued to perform strongly. In Test 14 (early 2011), GAT significantly outperformed MLP (t = 3.6446, p = 0.0006). Under heightened uncertainty due to Greece’s bailout negotiations (Test 16), GAT again surpassed XGBoost (t = 1.9666), MLP (t = 2.0200), and KNN (t = 2.2302). GCN also showed competitive performance (e.g., KNN: t = 2.2227, p = 0.0175).
These results emphasize the models’ ability to monitor sovereign risk events and shifts in market sentiment.
- (3)
Recovery Phases: Post-GFC and Post-COVID-19
As markets entered recovery, GAT and GCN maintained strong predictive capabilities by adapting to evolving inter-market structures.
Test 11 (late 2009) indicated that GAT significantly outperformed XGBoost (t = 2.5188), MLP (t = 2.8342), and KNN (t = 3.5566), with similar results for GCN (e.g., KNN: t = 3.5029, p = 0.0008). During the COVID-19 crisis, both models again delivered outstanding results.
In Test 31 (early 2020), GAT strongly outperformed MLP in RMSE (t = 4.6089, p < 0.0001). In Test 33, GAT showed clear advantages over MLP (p = 0.0029) and SVM (p = 0.0034) in MAE, with GCN producing comparable outcomes.
Test 34 further showed that GAT remained superior even in the post-COVID adjustment phase, outperforming MLP (t = 2.9313) and SVM (t = 2.2648). These results demonstrate that GNNs are highly effective during periods of rapid regime shifts and volatility spikes.
- (4)
Stable and Mixed Regimes: 2013–2019 and Recent Inflationary Periods
The performance gap between GNNs and baseline models narrowed during relatively calm periods. From 2013 to 2019, when markets were supported by quantitative easing, GAT still managed to outperform traditional models in some cases such as Test 20 (2014) where it beat MLP (t = 2.8370, p = 0.0043).
During Test 25 (2016), amid the Brexit referendum, GAT demonstrated robustness again, outperforming MLP (t = 1.8035, p = 0.0410). In more recent years marked by inflation and monetary tightening, both GAT and GCN remained competitive.
In Test 37 (early 2022), GAT outperformed MLP (t = 1.7436, p = 0.0462), with GCN producing comparable results (t = 1.7395, p = 0.0465). In Test 39 (2023), GAT and GCN both significantly surpassed MLP (GAT: t = 2.3703, p = 0.0129; GCN: t = 2.3651, p = 0.0130), validating their responsiveness to inflation-driven market adjustments and broader macroeconomic structural changes.
Taken together, the experimental results demonstrate that graph-based models provide superior forecasting power compared to traditional methods, especially under periods of crisis and structural transition. GAT and GCN performed best during events such as the 2008 Global Financial Crisis, European Debt Crisis, Brexit referendum, COVID-19 pandemic, and recent inflationary shocks, all of which were marked by intense shifts in market connectivity. In contrast, during stable periods such as Tests 8 (2008) and 24 (2016), the performance gap narrowed, showing that GNNs’ effectiveness depends heavily on market conditions. The results confirm that GNNs are particularly effective in modeling dynamic, symmetric, and nonlinear spillover effects, which traditional models often fail to capture. These findings suggest that GAT and GCN are especially valuable for systemic risk analysis, adaptive forecasting, and financial decision support in volatile and interconnected market environments.
5. Discussion
This study presents robust empirical evidence indicating that Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) outperform conventional machine learning and deep learning models in forecasting financial returns across international markets. The advantage of GNN-based models becomes particularly pronounced during periods of heightened market volatility, systemic crises, and subsequent recovery phases—periods characterized by strengthened interdependencies and structural changes within financial networks, which GNNs are uniquely equipped to capture.
Both GCNs and GATs learn representations directly from dynamically evolving network structures. Their architectural design allows for the modeling of symmetric financial relationships, including mutual influences and bidirectional spillovers. GATs, in particular, utilize attention mechanisms that assign learnable weights to neighbor nodes, enabling the model to emphasize the most salient inter-market connections. This adaptive capacity enhances the model’s responsiveness to regime shifts and sudden economic disturbances. Conversely, GCNs are more effective in relatively stable markets, where neighborhood aggregation suffices to capture persistent dependency structures.
These architectural differences explain why GATs consistently outperformed other models during crises such as the 2008 Global Financial Crisis and the COVID-19 shock, while GCNs maintained more stable performance during less volatile periods. These findings suggest that the choice of GNN architecture should be guided by the structural characteristics of the market network and prevailing economic conditions.
The current model operates under a centralized training framework, where financial index data is aggregated and processed in a single computational environment. This design avoids the communication overhead and compression bottlenecks that often occur in distributed systems. However, we acknowledge that centralized architectures may face scalability limitations, especially in real-time, multi-agent financial environments.
Recent advancements such as the work by Doostmohammadian et al. (2025) [34] in IEEE Transactions on Automation Science and Engineering have demonstrated that log-scale quantization in distributed first-order optimization algorithms can substantially reduce communication costs while preserving convergence performance. Their method enables learning over networks of geographically distributed agents by exchanging quantized gradient information, offering practical benefits in bandwidth-constrained settings.
While our current implementation does not incorporate distributed or quantized optimization techniques, we recognize the relevance of such approaches for future extensions of this work. In particular, integrating log-quantized GNN training within decentralized financial forecasting systems could enhance scalability, reduce latency, and support edge-based deployment.
The performance of GNN models is highly sensitive to the structure of the underlying financial network. In this study, graphs were constructed based on pairwise correlations with a threshold of |ρ| ≥ 0.3, producing sparse, undirected networks that evolve over time. The density and connectivity of these graphs vary across market regimes, directly influencing the flow of information during training. During crises, increased graph connectivity creates dense inter-market feedback loops, enhancing the predictive power of GNNs. In contrast, sparse or disconnected graphs—more common in stable periods—limit relational learning and reduce the model’s relative advantage. Thus, model effectiveness is closely tied to graph sparsity, degree distribution, and the temporal stability of edge weights.
While correlation-based graph construction offers computational efficiency and interpretability, it captures only linear and symmetric relationships. This approach cannot account for more complex, nonlinear, or causal dependencies often present in financial systems. Future research should explore advanced graph construction techniques based on mutual information, Granger causality, or transfer entropy to more accurately represent the dynamics of financial markets. Incorporating causal inference and temporal structure would improve both the robustness and interpretability of GNN-based models.
Another key limitation pertains to model interpretability. Although GATs provide some transparency via attention weights, deep graph models generally lack intuitive explanations for their predictions. For practical adoption in financial decision-making, enhanced explainability is essential. The development of visualization tools and post hoc interpretation techniques is therefore critical for promoting trust and accountability in graph-based forecasting systems. Also, integrating Explainable AI (XAI) techniques—such as feature attribution, counterfactual analysis, or graph-specific saliency mapping—into GNN-based forecasting frameworks could further enhance transparency and stakeholder trust. Such integration would allow market analysts to not only observe prediction outputs but also understand the underlying drivers of inter-market dependencies and systemic risk signals. Given that XAI approaches have already been successfully applied in diverse domains such as marketing, healthcare, and policy analytics, their adoption in financial forecasting could similarly improve interpretability, user confidence, and practical decision-making capabilities [35,36,37].
In summary, GNNs offer a flexible and powerful framework for capturing the dynamic, symmetric relationships that emerge during structural changes in global financial systems. Their effectiveness is most evident when traditional models fail to adapt to nonlinear regime shifts. However, their deployment requires careful consideration of graph structure, computational demands, and interpretability. Future progress in distributed graph learning, explainable GNNs, and causal graph construction will be key to realizing their full potential in real-time financial applications across interconnected markets.
6. Conclusions
This study demonstrates that Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) are effective tools for forecasting financial returns, particularly in environments characterized by structural volatility and systemic risk. By leveraging the underlying financial network structures, these models capture complex, symmetric relationships and nonlinear interdependencies that traditional machine learning and deep learning approaches often fail to model adequately.
Through a comprehensive benchmarking framework, we evaluate GCNs and GATs against standard machine learning models across various market regimes. The results consistently show that graph-based models achieve superior predictive performance during periods of market disruption, such as the 2008 Global Financial Crisis and the COVID-19 pandemic. These performance gains stem from the models’ capacity to learn dynamic patterns and directional spillovers embedded within evolving financial networks—an ability that proves critical in capturing contagion effects and regime transitions.
Nevertheless, our findings also highlight several limitations. The computational complexity of GNNs increases with network size, and their relative advantage diminishes in stable market conditions where inter-index relationships are more static and linear. In such contexts, simpler models like XGBoost or MLP often perform comparably at significantly lower computational cost. These trade-offs suggest that GNNs should be applied selectively—ideally during periods of heightened interconnectivity or structural change, where their relational modeling capabilities provide meaningful benefits.
To enable broader adoption in real-world financial systems, further developments are required in three areas: (1) scalable and adaptive graph construction methods that reflect temporal changes in market topology, (2) computationally efficient training and inference procedures suited for high-frequency environments, and (3) enhanced model interpretability through explainable AI techniques.
Ultimately, this research contributes an integrated framework that connects financial network modeling with predictive modeling techniques, offering a graph-based approach to capturing structural evolution in global markets. The flexibility of GNNs in adapting to nonlinear dynamics and uncovering latent inter-market structures positions them as valuable tools for macroeconomic forecasting, portfolio allocation, and systemic risk monitoring. Their demonstrated ability to identify structural breaks and directional dependencies offers significant potential for informing institutional decision-making and developing next-generation decision-support systems in finance.