A Novel Hybrid Temporal Fusion Transformer Graph Neural Network Model for Stock Market Prediction
Abstract
1. Introduction
2. Methodology
2.1. Overview
2.2. Data and Preprocessing
2.3. Problem Framing and Experimental Setup
- Training & validation 2012–2017, test 2018.
- Training & validation 2015–2020, test 2021.
- Training & validation 2018–2023, test 2024.
2.4. Models
2.4.1. Statistical Models
- SARIMA: a Seasonal Autoregressive Integrated Moving Average model.
- ES: an Exponential Smoothing model fitted within the ETS (Error, Trend, Seasonal) framework.
2.4.2. Deep Learning Models
- A simplified directional GNN signal: a binary up/down indicator distilled from a GAT classifier rather than high-dimensional embeddings;
- A lightweight integration strategy that injects relational signals as time-varying features, preserving the TFT’s variable-selection and attention-based interpretability.
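The integration strategy above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature names (`gnn_up_flag`, `rsi`) and the logit threshold are assumptions, showing only how a binary up/down signal distilled from a graph classifier would be appended to the TFT's time-varying inputs.

```python
from datetime import date

def distill_gnn_signal(gat_logits: dict) -> dict:
    """Collapse per-date GAT classifier logits into a binary up/down flag."""
    return {d: 1 if logit > 0.0 else 0 for d, logit in gat_logits.items()}

def inject_signal(feature_rows: list, signal: dict) -> list:
    """Append the distilled signal as one extra time-varying feature per row."""
    out = []
    for row in feature_rows:
        enriched_row = dict(row)                       # leave the input untouched
        enriched_row["gnn_up_flag"] = signal.get(row["date"], 0)
        out.append(enriched_row)
    return out

# Hypothetical two-day example: price features plus the distilled flag.
rows = [
    {"date": date(2021, 1, 4), "close": 129.4, "rsi": 55.2},
    {"date": date(2021, 1, 5), "close": 131.0, "rsi": 58.7},
]
logits = {date(2021, 1, 4): 0.8, date(2021, 1, 5): -0.3}
enriched = inject_signal(rows, distill_gnn_signal(logits))
```

Because the relational information arrives as an ordinary time-varying column, the TFT's variable-selection network can weight it like any other input, preserving attention-based interpretability.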
2.5. Evaluation Metrics
2.6. Implementation
3. Results
3.1. Overview
3.2. Statistical Models
3.2.1. SARIMA
3.2.2. ES Model in the ETS Framework
3.3. Deep Learning Models
3.3.1. TFT
3.3.2. Hybrid TFT-GNN
3.4. Comparative Summary
4. Discussion
4.1. Statistical Models
4.2. TFT
4.3. TFT-GNN Hybrid
4.4. Limitations
4.5. Comparison to Other Studies
4.6. Summary
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AAPL | Apple Inc. |
| CNN | Convolutional Neural Network |
| ES | Exponential Smoothing |
| ETF | Exchange Traded Fund |
| ETS | Error, Trend, Seasonal |
| GAT | Graph Attention Network |
| GNN | Graph Neural Network |
| GRU | Gated Recurrent Unit |
| JPM | JPMorgan Chase & Co. |
| LSTM | Long Short-Term Memory |
| MACD | Moving Average Convergence Divergence |
| MAE | Mean Absolute Error |
| MAPE | Mean Absolute Percentage Error |
| NVDA | NVIDIA Corporation |
| OHLCV | Open, High, Low, Close, Volume |
| R² | Coefficient of Determination |
| RMSE | Root Mean Squared Error |
| RSI | Relative Strength Index |
| SARIMA | Seasonal Autoregressive Integrated Moving Average |
| SPY | SPDR S&P 500 ETF Trust |
| TFT | Temporal Fusion Transformer |
| TFT-GNN | Temporal Fusion Transformer with Graph Neural Network integration |
Appendix A. Hyperparameter Tuning
Appendix A.1. TFT Model
- Encoder and Decoder Layers: Number of stacked LSTM layers in the encoder and decoder modules. Deeper layers increase model capacity but risk overfitting.
- Hidden Layer Size: Dimensionality of the model’s internal layers, influencing its representational capacity.
- Attention Heads: Number of parallel attention mechanisms in the multi-head attention block; additional heads capture diverse patterns at higher computational cost.
- Static Embedding Size: Dimensionality of learned embeddings for static covariates.
- Time-Varying Embedding Size: Dimensionality of learned embeddings for time-varying features such as lagged prices or technical indicators.
- Variable Selection: Boolean flag indicating whether to use input variable selection networks for dynamic feature relevance estimation.
- Attention Window: Number of past time steps accessible to the temporal attention mechanism.
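Tuning over the hyperparameters listed above can be sketched as a simple random search. The candidate values and the stand-in objective below are assumptions for illustration, not the search space or losses used in the study; a real run would train and validate a TFT for each sampled configuration.

```python
import random

# Illustrative search space over the hyperparameters listed above.
SEARCH_SPACE = {
    "lstm_layers": [1, 2, 3],
    "hidden_size": [16, 32, 64, 128],
    "attention_heads": [1, 2, 4],
    "static_embedding_size": [8, 16],
    "time_varying_embedding_size": [8, 16, 32],
    "use_variable_selection": [True, False],
    "attention_window": [30, 60, 90],
}

def sample_config(rng: random.Random) -> dict:
    """Draw one random configuration from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(evaluate, n_trials: int = 20, seed: int = 0):
    """Return the configuration with the lowest validation loss."""
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        loss = evaluate(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

# Stand-in objective that merely penalises larger models.
dummy_loss = lambda cfg: cfg["hidden_size"] / 128 + cfg["lstm_layers"] * 0.1
best, loss = random_search(dummy_loss, n_trials=10)
```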
Appendix A.2. TFT-GNN Model
- Apple Supply Chain and Related: AAPL, AMD, TSM, AVGO, ASML, QCOM, TXN, MU, NXPI, KLAC, LRCX, ADI, AMAT, MCHP
- Big Tech Peers: GOOGL, MSFT, AMZN, META, NFLX, ORCL, SONY, CRM, ADBE
- Financials: JPM, MS, BAC, BLK, GS, WFC, SCHW, BK, AXP, COF, MET
- Semiconductors: NVDA, AMD, TSM, ASML, QCOM, MU, TXN, NXPI, KLAC, LRCX
- Healthcare: UNH, JNJ, PFE, MRK, LLY, TMO, BMY, NVO
- Automotive: TSLA, F, GM, HMC, TM
- Consumer: WMT, HD, COST, PG, KO, MCD, TGT, PEP
- Exchange-Traded Funds (ETFs): SPY, QQQ, DIA, IWM, XLK, XLF, XLE, XLI, XLV, XLY, XLP, VNQ, IYR, VGT, VTI, VUG, VTV, IWF, IWD, ITOT
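One plausible way to turn the groupings above into a graph is to fully connect tickers within each group. The paper's exact edge-construction rule is not reproduced here, so the sketch below is an assumed construction using two truncated groups; the full graph would use every group listed in this appendix.

```python
from itertools import combinations

# Two of the groups listed above, truncated for brevity.
GROUPS = {
    "big_tech": ["GOOGL", "MSFT", "AMZN", "META"],
    "financials": ["JPM", "MS", "BAC", "GS"],
}

def build_edges(groups: dict) -> set:
    """Fully connect tickers within each group (undirected edge list)."""
    edges = set()
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b))     # each pair stored once, alphabetically
    return edges

edges = build_edges(GROUPS)
```

Tickers appearing in several groups (e.g., AMD in both the Apple supply chain and semiconductor lists) would naturally act as bridges between the resulting cliques.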
- Node Features: Input features for each stock (e.g., OHLCV, RSI, MACD).
- Hidden Channels: Dimensionality of intermediate node embeddings.
- Number of GAT Layers: Controls network depth; deeper models capture higher-order relations but risk over-smoothing.
- Attention Heads: Number of attention mechanisms applied per layer.
- Dropout Rate, Learning Rate, and Optimizer: Standard training hyperparameters.
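The role of the hidden channels and attention heads can be illustrated with a minimal, dependency-free sketch of one GAT-style attention step for a single node and a single head with one hidden channel; the weight vectors are arbitrary toy values, not learned parameters.

```python
import math

def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x

def gat_aggregate(h_i, neighbors, W_row, a):
    """One single-head GAT-style attention step for one node.

    W_row projects each raw feature vector to one hidden channel;
    `a` scores the projected pair (z_i, z_j); attention weights are
    softmax-normalised over the node itself and its neighbours.
    """
    nodes = [h_i] + neighbors                      # include a self-loop, as in GAT
    zs = [sum(w * x for w, x in zip(W_row, h)) for h in nodes]
    z_i = zs[0]
    scores = [leaky_relu(a[0] * z_i + a[1] * z_j) for z_j in zs]
    m = max(scores)                                # stabilise the softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    return sum(al * z for al, z in zip(alphas, zs))

# Toy example: one stock with two neighbours, two input features each.
out = gat_aggregate([1.0, 0.5], [[0.8, 0.3], [1.2, 0.7]],
                    W_row=[0.6, -0.4], a=[1.0, 1.0])
```

Additional attention heads would repeat this computation with independent `W_row` and `a` vectors and concatenate the results, which is what the "Attention Heads" hyperparameter above controls.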
Appendix B. Code Repository
Appendix C. Table of Results

References
- Fama, E.F. The behavior of stock-market prices. J. Bus. 1965, 38, 34–105. [Google Scholar] [CrossRef]
- Malkiel, B.G. The efficient market hypothesis and its critics. J. Econ. Perspect. 2003, 17, 59–82. [Google Scholar] [CrossRef]
- Saberironaghi, M.; Ren, J.; Saberironaghi, A. Stock market prediction using machine learning and deep learning techniques: A review. AppliedMath 2025, 5, 5030076. [Google Scholar] [CrossRef]
- Lara-Benítez, P.; Carranza-García, M.; Riquelme, J.C. An experimental review on deep learning architectures for time series forecasting. Int. J. Neural Syst. 2021, 31, 2130001. [Google Scholar] [CrossRef]
- Lynch, S. Python for Scientific Computing and Artificial Intelligence; Chapman and Hall/CRC: Boca Raton, FL, USA, 2023. [Google Scholar]
- Lynch, S. Dynamical Systems with Applications using MATLAB, 3rd ed.; Springer Nature: Berlin/Heidelberg, Germany, 2025. [Google Scholar]
- Shi, Z.; Ibrahim, O.; Hashim, H.I.C. A novel hybrid HO-CAL framework for enhanced stock index prediction. Int. J. Adv. Comput. Sci. Appl. 2025, 16, 333–342. [Google Scholar] [CrossRef]
- Olorunnimbe, K.; Viktor, H. Ensemble of temporal Transformers for financial time series. J. Intell. Inf. Syst. 2024, 62, 1087–1111. [Google Scholar] [CrossRef]
- Kautkar, H.; Das, S.; Gupta, H.; Ghosh, S.; Kanjilal, K. Leveraging an integrated first and second moments modeling approach for optimal trading strategies: Evidence from the Indian pharma sector in the pre- and post-COVID-19 era. J. Forecast. 2025, 70046. [Google Scholar] [CrossRef]
- Huang, Y.; Pei, Z.; Yan, J.; Zhou, C.; Lu, X. A combined adaptive Gaussian short-term Fourier transform and Mamba framework for stock price prediction. Eng. Appl. Artif. Intell. 2025, 162, 112588. [Google Scholar] [CrossRef]
- Jiang, W. Applications of deep learning in stock market prediction: Recent progress. Expert Syst. Appl. 2021, 184, 115537. [Google Scholar] [CrossRef]
- Ajiga, D.I.; Adeleye, R.A.; Tubokirifuruar, T.S.; Bello, B.G.; Ndubuisi, N.L.; Asuzu, O.F.; Owolabi, O.R. Machine learning for stock market forecasting: A review of models and accuracy. Financ. Account. Res. J. 2024, 6, 112–124. [Google Scholar]
- Kehinde, T.; Chan, F.T.; Chung, S.H. Scientometric review and analysis of recent approaches to stock market forecasting: Two decades survey. Expert Syst. Appl. 2023, 213, 119299. [Google Scholar] [CrossRef]
- NASDAQ. Nasdaqtraded.txt—NASDAQ Symbol Directory. Available online: https://www.nasdaqtrader.com/dynamic/SymDir/nasdaqtraded.txt (accessed on 23 April 2025).
- Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
- Brown, R.G. Exponential Smoothing for Predicting Demand; Arthur D. Little: Cambridge, MA, USA, 1956. [Google Scholar]
- Brown, R. Statistical Forecasting for Inventory Control; McGraw-Hill: Columbus, OH, USA, 1959. [Google Scholar]
- Gardner Jr, E.S. Exponential smoothing: The state of the art. J. Forecast. 1985, 4, 1–28. [Google Scholar] [CrossRef]
- Hyndman, R.J.; Koehler, A.B.; Snyder, R.D.; Grose, S. A state space framework for automatic forecasting using exponential smoothing methods. Int. J. Forecast. 2002, 18, 439–454. [Google Scholar] [CrossRef]
- Lim, B.; Arık, S.Ö.; Loeff, N.; Pfister, T. Temporal fusion transformers for interpretable multi-horizon time series forecasting. Int. J. Forecast. 2021, 37, 1748–1764. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010. [Google Scholar]
- Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 2008, 20, 61–80. [Google Scholar] [CrossRef]
- Patel, M.; Jariwala, K.; Chattopadhyay, C. A Systematic Review on Graph Neural Network-based Methods for Stock Market Forecasting. ACM Comput. Surv. 2024, 57, 1–38. [Google Scholar] [CrossRef]
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
- Feng, F.; He, X.; Wang, X.; Luo, C.; Liu, Y.; Chua, T.S. Temporal relational ranking for stock prediction. ACM Trans. Inf. Syst. (TOIS) 2019, 37, 1–30. [Google Scholar] [CrossRef]
- Hu, X. Stock price prediction based on temporal fusion transformer. In Proceedings of the 2021 3rd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 3–5 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 60–66. [Google Scholar]
- Wang, J.; Zhang, S.; Xiao, Y.; Song, R. A Review on Graph Neural Network Methods in Financial Applications. J. Data Sci. 2022, 20, 111–134. [Google Scholar] [CrossRef]
- Sun, Z. Comparison of trend forecast using ARIMA and ETS Models for S&P500 close price. In Proceedings of the 2020 4th International Conference on E-Business and Internet, Singapore, 9–11 October 2020; pp. 57–60. [Google Scholar]
- Schmidt, F. Generalization in generation: A closer look at exposure bias. arXiv 2019, arXiv:1910.00292. [Google Scholar] [CrossRef]
- Saiyyad, A.; Wankhade, S.; Sakhare, A.; Kale, P.; Yenchilwar, G.; Sharma, P. Stock Price Prediction for Stock Market Forecasting using Machine Learning. In Proceedings of the 2025 4th OPJU International Technology Conference (OTCON) on Smart Computing for Innovation and Advancement in Industry 5.0, Raigarh, India, 9–11 April 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 1–5. [Google Scholar]
- Uzzal, M.H.; Ślepaczuk, R. The Performance of Time Series Forecasting Based on Classical and Machine Learning Methods for S&P 500 Index; University of Warsaw, Faculty of Economic Sciences: Warsaw, Poland, 2023. [Google Scholar]
- Parker, M.; Ghahremani, M.; Shiaeles, S. Stock Price Prediction Using a Stacked Heterogeneous Ensemble. Int. J. Financ. Stud. 2025, 13, 201. [Google Scholar] [CrossRef]
- Gasparėnienė, L.; Remeikiene, R.; Sosidko, A.; Vėbraitė, V. A modelling of S&P 500 index price based on US economic indicators: Machine learning approach. Eng. Econ. 2021, 32, 362–375. [Google Scholar]
- Sarıkoç, M.; Celik, M. PCA-ICA-LSTM: A hybrid deep learning model based on dimension reduction methods to predict S&P 500 index price. Comput. Econ. 2025, 65, 2249–2315. [Google Scholar]
- Chahuán-Jiménez, K. Neural network-based predictive models for stock market index forecasting. J. Risk Financ. Manag. 2024, 17, 242. [Google Scholar] [CrossRef]
- Simeunović, J.; Schubnel, B.; Alet, P.J.; Carrillo, R.E. Spatio-temporal graph neural networks for multi-site PV power forecasting. IEEE Trans. Sustain. Energy 2021, 13, 1210–1220. [Google Scholar] [CrossRef]
- Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv 2017, arXiv:1709.04875. [Google Scholar]
- Liu, T.; Liang, L.; Che, C.; Liu, Y.; Jin, B. A transformer-based framework for temporal health event prediction with graph-enhanced representations. J. Biomed. Inform. 2025, 166, 104826. [Google Scholar] [CrossRef]

| Model | Strengths | Limitations | Scientific Gap Relevant to This Study |
|---|---|---|---|
| LSTM | Learns nonlinear temporal dependencies. | Weak long-range memory; training instability. | Cannot model cross-asset relationships; limited interpretability. |
| GRU | Computationally efficient recurrent model. | Reduced capacity for complex dynamics. | Cannot model cross-asset relationships; limited interpretability. |
| CNN | Captures local temporal patterns. | Poor long-sequence modelling. | Cannot model cross-asset relationships; limited interpretability. |
| TFT | Strong multivariate forecasting with interpretable attention. | Still treats assets independently; high computational cost. | Lacks mechanism for incorporating relational information. |
| GNN | Captures relational structure and cross-asset dependencies. | Not designed for sequence forecasting; limited temporal modelling. | Lacks integration with temporal architectures for joint relational–temporal prediction. |
| Hybrid DL Models | Combine complementary architectures. | Higher computational cost. | Typically do not integrate graph-based relational signals. |
| TFT-GNN (This Study) | Joint temporal–relational modelling; attention-based interpretability. | Higher computational cost. | Introduces relational information directly into a transformer forecasting pipeline. |
| Model | RMSE (↓) | R² (↑) | Horizon | Interpretability | Compute Time |
|---|---|---|---|---|---|
| SARIMA | | | Weekly | Low | Moderate |
| ETS | | | Daily | Moderate | Low |
| TFT | | | Daily | High | High |
| TFT-GNN | | | Daily | High | Very high |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lynch, S.T.; Derakhshan, P.; Lynch, S. A Novel Hybrid Temporal Fusion Transformer Graph Neural Network Model for Stock Market Prediction. AppliedMath 2025, 5, 176. https://doi.org/10.3390/appliedmath5040176