Deep Learning-Based Hybrid Model with Multi-Head Attention for Multi-Horizon Stock Price Prediction
Abstract
1. Introduction
- (i) A hybrid feature selection approach that combines nonparametric correlation analysis with recursive feature elimination to identify informative, non-redundant features and thereby improve model performance.
- (ii) A novel integration of TCNs with GRUs and MHA that allows the model to capture long-range dependencies, sequential dynamics, and diverse feature representations.
- (iii) A demonstration of the superior predictive accuracy of the proposed TCN-GRU-MHA model compared with traditional short-term stock price forecasting methods across multiple horizons (1-day, 3-day, and 7-day).
- (iv) An evaluation on three stocks from different sectors and two major indices, ensuring sectoral diversity and a comprehensive assessment.
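As a rough illustration of the multi-horizon setup named in contribution (iii), the sketch below builds supervised samples for the 1-day, 3-day, and 7-day horizons from a price series. It assumes that an h-day horizon means predicting the price h trading days after the last observed day and that a fixed look-back window is used. The look-back length, the toy prices, and the helper name `make_windows` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def make_windows(
    prices: Sequence[float], lookback: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each look-back window with the price `horizon` steps after it."""
    inputs: List[List[float]] = []
    targets: List[float] = []
    # The last usable start index leaves room for the window plus the horizon.
    for start in range(len(prices) - lookback - horizon + 1):
        end = start + lookback                     # first index after the window
        inputs.append(list(prices[start:end]))
        targets.append(prices[end + horizon - 1])  # horizon=1 -> next day
    return inputs, targets


# Toy series standing in for closing prices (hypothetical values).
closes = [float(p) for p in range(100, 120)]
for h in (1, 3, 7):
    X, y = make_windows(closes, lookback=5, horizon=h)
    print(f"horizon={h}: {len(X)} samples, first target={y[0]}")
```

Longer horizons yield fewer samples from the same series, which is one reason to report each horizon separately.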
2. Related Work
2.1. Statistical Approaches
2.2. Machine Learning and Deep Learning Approaches
2.3. Hybrid and Attention-Based Approaches
3. Methodology
3.1. Sequence Modeling Techniques
3.1.1. Temporal Convolutional Networks
3.1.2. Gated Recurrent Unit
3.1.3. Multi-Head Attention Mechanism
3.2. Proposed Model
3.3. Data Description, Preprocessing, and Feature Selection
3.3.1. Data Description
3.3.2. Data Preprocessing and Feature Selection
3.4. Evaluation Metrics
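The four metrics reported in the result tables (RMSE, MAE, MAPE, and R2) can be sketched in plain Python as follows. This assumes the conventional textbook definitions; the function names and sample values are illustrative only.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (undefined if any true value is 0)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted prices.
actual = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 101.0, 105.0, 105.0]
print(rmse(actual, predicted), mae(actual, predicted),
      mape(actual, predicted), r2(actual, predicted))
```

RMSE and MAE depend on the scale of the target, so they are comparable across datasets only if the same scaling is applied; MAPE and R2 are scale-free.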
4. Results and Discussion
4.1. Performance Evaluation on HDFC Bank Stock Data
4.2. Performance Evaluation on TCS Stock Data
4.3. Performance Evaluation on TSLA Stock Data
4.4. Performance Evaluation on Nifty 50 Index Dataset
4.5. Performance Evaluation on S&P 500 Index Dataset
4.6. Comparative Evaluation of Baseline and Proposed Models
4.7. Comparison of Predictive Performance with Other Approaches
4.8. Statistical Tests for Model Evaluation and Data Stationarity
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Atesongun, A., & Gulsen, M. (2024). A hybrid forecasting structure based on ARIMA and artificial neural network models. Applied Sciences, 14(16), 7122.
- Caiado, J., & Lúcio, F. (2023). Stock market forecasting accuracy of asymmetric GARCH models during the COVID-19 pandemic. The North American Journal of Economics and Finance, 68, 101971.
- Chen, C., Xue, L., & Xing, W. (2023). Research on improved GRU-based stock price prediction method. Applied Sciences, 13(15), 8813.
- Chi, D.-J., & Chu, C.-C. (2021). Artificial intelligence in corporate sustainability: Using LSTM and GRU for going concern prediction. Sustainability, 13(21), 11631.
- Chinta, S. (2021). Integrating machine learning algorithms in big data analytics: A framework for enhancing predictive insights. IJARESM, 9, 2145–2161.
- Chopra, R., & Sharma, G. D. (2021). Application of artificial intelligence in stock market forecasting: A critique, review, and research agenda. Journal of Risk and Financial Management, 14(11), 526.
- Fathali, Z., Kodia, Z., & Ben Said, L. (2022). Stock market prediction of Nifty 50 index applying machine learning techniques. Applied Artificial Intelligence, 36(1), 2111134.
- Fozap, F. M. P. (2025). Hybrid machine learning models for long-term stock market forecasting: Integrating technical indicators. Journal of Risk and Financial Management, 18(4), 201.
- Friday, I. K., Pati, S. P., Mishra, D., Mallick, P. K., & Kumar, S. (2024). CAGTRADE: Predicting stock market price movement with a CNN-Attention-GRU model. Asia-Pacific Financial Markets, 32, 583–608.
- Gautam, B., Kandel, S., Shrestha, M., & Thakur, S. (2024). Comparative analysis of machine learning models for stock price prediction: Leveraging LSTM for real-time forecasting. Journal of Computer and Communications, 12(8), 52–80.
- Guo, C., Kang, X., Xiong, J., & Wu, J. (2023). A new time series forecasting model based on complete ensemble empirical mode decomposition with adaptive noise and temporal convolutional network. Neural Processing Letters, 55(4), 4397–4417.
- Hoseinzade, E., & Haratizadeh, S. (2019). CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Systems with Applications, 129, 273–285.
- Jaiswal, R., & Singh, B. (2022, April 23–24). A hybrid convolutional recurrent (CNN-GRU) model for stock price prediction. 2022 IEEE 11th International Conference on Communication Systems and Network Technologies (CSNT) (pp. 299–304), Indore, India.
- Kanwal, A., Lau, M. F., Ng, S. P., Sim, K. Y., & Chandrasekaran, S. (2022). BiCuDNNLSTM-1dCNN—A hybrid deep learning-based predictive model for stock price prediction. Expert Systems with Applications, 202, 117123.
- Kervanci, I. S., Akay, M. F., & Özceylan, E. (2024). Bitcoin price prediction using LSTM, GRU and hybrid LSTM-GRU with Bayesian optimization, random search, and grid search for the next days. Journal of Industrial and Management Optimization, 20(2), 570–588.
- Khodaee, P., Esfahanipour, A., & Taheri, H. M. (2022). Forecasting turning points in stock price by applying a novel hybrid CNN-LSTM-ResNet model fed by 2D segmented images. Engineering Applications of Artificial Intelligence, 116, 105464.
- Kurani, A., Doshi, P., Vakharia, A., & Shah, M. (2023). A comprehensive comparative study of artificial neural network (ANN) and support vector machines (SVM) on stock forecasting. Annals of Data Science, 10(1), 183–208.
- Lei, K., Zhang, B., Li, Y., Yang, M., & Shen, Y. (2020). Time-driven feature-aware jointly deep reinforcement learning for financial signal representation and algorithmic trading. Expert Systems with Applications, 140, 112872.
- Li, S., Huang, X., Cheng, Z., Zou, W., & Yi, Y. (2023). AE-ACG: A novel deep learning-based method for stock price movement prediction. Finance Research Letters, 58, 104304.
- Li, S., Tang, G., Chen, X., & Lin, T. (2024). Stock index forecasting using a novel integrated model based on CEEMDAN and TCN-GRU-CBAM. IEEE Access, 12, 122524–122543.
- Luo, A., Zhong, L., Wang, J., Wang, Y., Li, S., & Tai, W. (2024). Short-term stock correlation forecasting based on CNN-BiLSTM enhanced by attention mechanism. IEEE Access, 12, 29617–29632.
- Mostafavi, S. M., & Hooman, A. R. (2025). Key technical indicators for stock market prediction. Machine Learning with Applications, 20, 100631.
- Naeem, M., Jassim, H. S., & Korsah, D. (2024). The application of machine learning techniques to predict stock market crises in Africa. Journal of Risk and Financial Management, 17(12), 554.
- Nourbakhsh, Z., & Habibi, N. (2023). Combining LSTM and CNN methods and fundamental analysis for stock price trend prediction. Multimedia Tools and Applications, 82(12), 17769–17799.
- Parray, I. R., Khurana, S. S., Kumar, M., & Altalbe, A. A. (2020). Time series data analysis of stock price movement using machine learning techniques. Soft Computing-A Fusion of Foundations, Methodologies & Applications, 24(21), 16509–16517.
- Priyatno, A. M., & Widiyaningtyas, T. (2024). A systematic literature review: Recursive feature elimination algorithms. JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer), 9(2), 196–207.
- Salem, F. M. (2021). Gated RNN: The gated recurrent unit (GRU) RNN. In Recurrent neural networks: From simple to gated architectures (pp. 85–100). Springer.
- Sarıkoç, M., & Celik, M. (2024). PCA-ICA-LSTM: A hybrid deep learning model based on dimension reduction methods to predict S&P 500 index price. Computational Economics, 65, 2249–2315.
- Selvamuthu, D., Kumar, V., & Mishra, A. (2019). Indian stock market prediction using artificial neural networks on tick data. Financial Innovation, 5(1), 16.
- Sirisha, U. M., Belavagi, M. C., & Attigeri, G. (2022). Profit prediction using ARIMA, SARIMA and LSTM models in time series forecasting: A comparison. IEEE Access, 10, 124715–124727.
- Teixeira, D. M., & Barbosa, R. S. (2024). Stock price prediction in the financial market using machine learning models. Computation, 13(1), 3.
- Wang, Z., & Peng, Z. (2024). Structural acceleration response reconstruction based on BiLSTM network and multi-head attention mechanism. Structures, 64, 106602.
- Wen, X., Liao, J., Niu, Q., Shen, N., & Bao, Y. (2024). Deep learning-driven hybrid model for short-term load forecasting and smart grid information management. Scientific Reports, 14(1), 13720.
- Xiaoyan, H., Bingjie, L., Jing, S., Hua, L., & Guojing, L. (2021, September 27–29). A novel forecasting method for short-term load based on TCN-GRU model. 2021 IEEE International Conference on Energy Internet (ICEI) (pp. 79–83), Southampton, UK.
- Yang, S., Guo, H., & Li, J. (2022). CNN-GRUA-FC stock price forecast model based on multi-factor analysis. Journal of Advanced Computational Intelligence and Intelligent Informatics, 26(4), 600–608.
- Zhou, S., Song, C., Wang, T., Pan, X., Chang, W., & Yang, L. (2022). A short-term hybrid TCN-GRU prediction model of bike-sharing demand based on travel characteristics mining. Entropy, 24(9), 1193.
Reference | Method Used | Findings | Limitations |
---|---|---|---|
Sirisha et al. (2022) | ARIMA and SARIMA | Demonstrated effectiveness of ARIMA and SARIMA in modeling and forecasting financial profit time series. | Relies on linearity and stationarity assumptions; weak performance on nonlinear and volatile stock data. |
Caiado and Lúcio (2023) | Asymmetric GARCH and clustering | Proposed error-clustering framework to evaluate asymmetric GARCH models under pandemic-driven volatility. | Limited ability to capture nonlinear and long-term dependencies; not scalable to complex datasets. |
Kurani et al. (2023) | SVM | Comprehensive application of SVM in financial forecasting; effective in small and moderately nonlinear datasets. | Scalability issues with large datasets; weaker performance on highly volatile, nonlinear markets. |
Guo et al. (2023) | TCN | Showed TCN’s ability to effectively model long-range sequences with parallel-computation advantages. | Standalone TCN lacks temporal gating; may miss fine-grained sequential patterns. |
Chen et al. (2023) | GRU | Demonstrated GRU’s efficiency in capturing temporal dynamics with fewer parameters than LSTM. | Single-architecture focus; limited generalization across complex datasets. |
Friday et al. (2024) | CNN-Attention-GRU | Conducted multi-horizon evaluation; dynamic weighting via attention improved robustness across markets. | Emphasis on short-term horizons; limited exploration of feature engineering depth. |
Yang et al. (2022) | CNN–GRUA–FC and RF | Used RF for feature selection; CNN/GRU with attention improved sequential modeling. | High pipeline complexity; risk of overfitting on small datasets. |
Li et al. (2023) | AE–ACG, CNN–GRU and Attention | Introduced autoencoder with CNN–GRU–attention to enhance feature extraction and prediction accuracy. | Shallow hybrid architecture; limited scalability to broader markets. |
Luo et al. (2024) | CNN–BiLSTM and Attention | Enhanced correlation forecasting by combining CNN, BiLSTM, and attention to reduce information loss. | Limited number of features; validation on a small set of stocks. |
No. | Stock/Index | Sector/Country | Data Length | Mean | Max | Min | Std |
---|---|---|---|---|---|---|---|
1 | HDFC Bank | Banking | 2499 | ||||
2 | TCS | IT | 2499 | ||||
3 | TSLA | Automotive | 2536 | 9.58 | |||
4 | Nifty 50 | India | 2500 | 13,590.09 | 26,216.05 | ||
5 | S&P 500 | USA | 2536 |
Rank | Feature | Description | Final Weighted Score |
---|---|---|---|
1 | HLC3 | Average of high, low, and close price | 1.000000 |
2 | Mean HL | Mean of High and Low | 0.333048 |
3 | Low | Lower price | 0.311132 |
4 | High | Higher price | 0.281828 |
5 | Open | Opening price | 0.247799 |
6 | Rolling Mean5 | 5-Period moving average | 0.244660 |
7 | Parabolic SAR | Parabolic stop and reverse | 0.237286 |
8 | EMA | Exponential moving average | 0.236381 |
9 | SMA | Simple moving average | 0.233920 |
10 | BB | Volatility bands around price average | 0.229445 |
11 | OBV | Cumulative volume flow indicator | 0.207496 |
12 | ATR | Volatility measurement over time | 0.134234 |
13 | Volume | Trading intensity over time | 0.114961 |
14 | Price Range | High–low-price difference | 0.109802 |
15 | Rolling Std5 | 5-Period rolling standard deviation | 0.097434 |
16 | Upper Shadow | Wick above candle body | 0.065497 |
17 | Lower Shadow | Wick below candle body | 0.057736 |
18 | MACD | Momentum and trend strength indicator | 0.027607 |
19 | Volatility | Measure of market fluctuation | 0.026295 |
20 | CMF | Money flow strength indicator | 0.013079 |
21 | Momentum | Speed of price movement | 0.011721 |
22 | TRIX | Filters short-term price noise | 0.006090 |
23 | Mass Index | Range expansion reversal indicator | 0.003344 |
24 | PROC | Measures price momentum shift | 0.003286 |
25 | Normalized Volume | Relative volume across time | 0.003129 |
26 | Candle Direction | Bullish or bearish candle signal | 0.002402 |
27 | Daily Return | Daily price return percentage | 0.002206 |
28 | ROC | Price momentum strength indicator | 0.002105 |
29 | Price Position Range | Price level in recent range | 0.001361 |
30 | CCI | Measures price deviation strength | 0.000752 |
31 | VI+ | Positive vortex indicator | 0.000300 |
32 | Williams %R | Price position in recent range | 0.000184 |
33 | Stochastic | Measures price relative to range | 0.000183 |
34 | TSI | Identifies trend strength and direction | 0.000163 |
35 | RSI | Measures overbought or oversold | 0.000001 |
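Several of the price-derived features in the table have descriptions precise enough to compute directly from one OHLC bar. The sketch below shows one plausible reading of those descriptions. The `Candle` type, the function name, and the +1/0/-1 encoding of candle direction are assumptions made for illustration; the indicator-based features (EMA, MACD, RSI, and so on) are not covered.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Candle:
    open: float
    high: float
    low: float
    close: float


def price_features(c: Candle) -> Dict[str, float]:
    """Derive simple per-bar features from one OHLC candle."""
    body_top = max(c.open, c.close)
    body_bottom = min(c.open, c.close)
    if c.close > c.open:
        direction = 1      # bullish candle (assumed encoding)
    elif c.close < c.open:
        direction = -1     # bearish candle (assumed encoding)
    else:
        direction = 0
    return {
        "hlc3": (c.high + c.low + c.close) / 3,   # average of high, low, close
        "mean_hl": (c.high + c.low) / 2,          # mean of high and low
        "price_range": c.high - c.low,            # high-low difference
        "upper_shadow": c.high - body_top,        # wick above the candle body
        "lower_shadow": body_bottom - c.low,      # wick below the candle body
        "candle_direction": direction,
    }


# Hypothetical bar.
features = price_features(Candle(open=100.0, high=110.0, low=95.0, close=105.0))
print(features)
```

HLC3, Mean HL, High, Low, and Open are all near-linear functions of the same bar, which is consistent with their clustering at the top of the ranking and is the kind of redundancy a correlation filter is meant to detect.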
Parameter | Value |
---|---|
Number of TCN Filters | 64 |
Kernel Size | 3 |
Dilation Rate | [1, 2, 4, 8] |
Padding | Causal |
GRUs | 128, 64 |
Number of Heads (MHA) | 4 |
Key Dimension | 16 |
Loss Function | MSE |
Activation Function | ReLU |
Optimizer | Adam |
Batch Size | 32 |
Dropout Rate | |
Epochs | 100 |
Learning Rate | |
Early Stopping | patience = 10 |
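To make the TCN settings in the table concrete, the sketch below computes how many past time steps the convolutional stack can see with kernel size 3 and dilation rates [1, 2, 4, 8], along with the left padding that causal convolution requires at each level. It assumes one causal convolution per dilation level; if each level were a residual block with two convolutions, `convs_per_level=2` would apply. The function names are illustrative.

```python
from typing import List, Sequence


def receptive_field(
    kernel_size: int, dilations: Sequence[int], convs_per_level: int = 1
) -> int:
    """Time steps visible to one output of a stack of dilated causal convolutions."""
    # Each convolution with dilation d widens the field by (kernel_size - 1) * d.
    return 1 + convs_per_level * (kernel_size - 1) * sum(dilations)


def causal_padding(kernel_size: int, dilations: Sequence[int]) -> List[int]:
    """Left padding per level so that no output depends on future inputs."""
    return [(kernel_size - 1) * d for d in dilations]


KERNEL_SIZE = 3
DILATIONS = [1, 2, 4, 8]

print(receptive_field(KERNEL_SIZE, DILATIONS))                      # one conv per level
print(receptive_field(KERNEL_SIZE, DILATIONS, convs_per_level=2))   # two convs per level
print(causal_padding(KERNEL_SIZE, DILATIONS))
```

Under the one-convolution assumption the stack covers 31 time steps, so input windows longer than that would rely on the GRU and attention layers for anything older.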
Time Frame | Model | RMSE | MAE | MAPE | R2 |
---|---|---|---|---|---|
1 Day | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
3 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
7 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model |
Time Frame | Model | RMSE | MAE | MAPE | R2 |
---|---|---|---|---|---|
1 Day | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
3 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
7 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model |
Time Frame | Model | RMSE | MAE | MAPE | R2 |
---|---|---|---|---|---|
1 Day | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
3 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
7 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model |
Time Frame | Model | RMSE | MAE | MAPE | R2 |
---|---|---|---|---|---|
1 Day | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
3 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
7 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model |
Time Frame | Model | RMSE | MAE | MAPE | R2 |
---|---|---|---|---|---|
1 Day | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
3 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model | |||||
7 Days | TCN | ||||
LSTM | |||||
GRU | |||||
BiGRU | |||||
TCN-GRU | |||||
Proposed Model |
Dataset | Model | RMSE | MAE | MAPE | R2 |
---|---|---|---|---|---|
HDFC Bank | ARIMA | 0.245 | 0.196 | 11.28% | 0.310 |
HDFC Bank | Random Walk Model | 0.046 | 0.027 | 3.83% | 0.932 |
HDFC Bank | Proposed Model | 0.014 | 0.008 | 1.21% | 0.981 |
TCS | ARIMA | 0.411 | 0.321 | 8.64% | 0.356 |
TCS | Random Walk Model | 0.094 | 0.065 | 2.03% | 0.915 |
TCS | Proposed Model | 0.072 | 0.050 | 1.44% | 0.987 |
TSLA | ARIMA | 0.362 | 0.287 | 11.83% | 0.137 |
TSLA | Random Walk Model | 0.085 | 0.063 | 4.24% | 0.890 |
TSLA | Proposed Model | 0.049 | 0.036 | 2.37% | 0.978 |
Nifty 50 | ARIMA | 0.428 | 0.341 | 7.52% | 0.216 |
Nifty 50 | Random Walk Model | 0.172 | 0.142 | 2.39% | 0.920 |
Nifty 50 | Proposed Model | 0.132 | 0.110 | 1.84% | 0.983 |
S&P 500 | ARIMA | 0.335 | 0.246 | 10.24% | 0.292 |
S&P 500 | Random Walk Model | 0.205 | 0.155 | 3.18% | 0.906 |
S&P 500 | Proposed Model | 0.155 | 0.096 | 1.98% | 0.979 |
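The random walk baseline in the table can be read as a naive forecast in which the prediction for a future day is simply the last observed price. The sketch below implements that reading for an arbitrary horizon. It assumes a driftless random walk; the paper's exact baseline configuration is not stated here, and the function names and prices are illustrative.

```python
from typing import List, Sequence, Tuple


def random_walk_forecast(
    series: Sequence[float], horizon: int = 1
) -> Tuple[List[float], List[float]]:
    """Return (predictions, actuals); each prediction is the value `horizon` steps earlier."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    predictions = list(series[:-horizon])  # value known at time t
    actuals = list(series[horizon:])       # value realised at time t + horizon
    return predictions, actuals


def mean_absolute_error(actuals: Sequence[float], predictions: Sequence[float]) -> float:
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


# Hypothetical prices.
prices = [10.0, 12.0, 11.0, 13.0, 14.0]
preds, actual = random_walk_forecast(prices, horizon=1)
print(mean_absolute_error(actual, preds))
```

Because daily prices are highly persistent, this baseline is hard to beat at short horizons, which makes it a more demanding reference point than ARIMA in the comparison.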
Reference | Technique Used | Dataset | RMSE | |
---|---|---|---|---|
Fathali et al. (2022) | LSTM (H-L-O-C feature) | Nifty 50 | 0.082 | |
Fozap (2025) | LSTM-CNN | S&P 500 | 0.101 | |
Fozap (2025) | Random Forest | S&P 500 | 0.085 | |
Sarıkoç and Celik (2024) | PCA-ICA-LSTM | S&P 500 | – | |
Li et al. (2024) | TCN-GRU | S&P 500 | 68.896 | |
Li et al. (2024) | TCN-GRU-CBAM | S&P 500 | 52.187 | |
Proposed model | TCN-GRU-MHA | Nifty 50 | 0.132 | |
Proposed model | TCN-GRU-MHA | S&P 500 | 0.155 |
Stock/Index Name | Test Statistic | p-Value |
---|---|---|
HDFC Bank | −1.061 | 0.730 |
TCS | −0.472 | 0.897 |
HUL | −0.596 | 0.871 |
Nifty 50 | 0.366 | 0.980 |
S&P 500 | 0.612 | 0.987 |
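The stationarity results can be read with a simple decision rule. The sketch below assumes the table reports an Augmented Dickey-Fuller style unit-root test (the table itself does not name the test) and a 5% significance level, which is a conventional choice rather than one stated in the source. Under that test the null hypothesis is that the series has a unit root, so a p-value above the threshold means non-stationarity cannot be rejected. The row labels are copied from the table as printed.

```python
from typing import Dict, Tuple

# (test statistic, p-value) copied from the table above; labels kept as printed.
results: Dict[str, Tuple[float, float]] = {
    "HDFC Bank": (-1.061, 0.730),
    "TCS": (-0.472, 0.897),
    "HUL": (-0.596, 0.871),
    "Nifty 50": (0.366, 0.980),
    "S&P 500": (0.612, 0.987),
}

ALPHA = 0.05  # assumed significance level; not stated in the source


def rejects_unit_root(p_value: float, alpha: float = ALPHA) -> bool:
    """Under an ADF-style test, a small p-value rejects the unit-root null."""
    return p_value < alpha


for name, (stat, p) in results.items():
    verdict = "stationary" if rejects_unit_root(p) else "unit root not rejected"
    print(f"{name:10s} stat={stat:+.3f} p={p:.3f} -> {verdict}")
```

With these values no series rejects the unit-root null, which is the usual situation for raw price levels and a common motivation for differencing or for models that do not assume stationarity.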
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ghosh, R.K.; Gupta, B.K.; Nayak, A.K.; Ghosh, S.K. Deep Learning-Based Hybrid Model with Multi-Head Attention for Multi-Horizon Stock Price Prediction. J. Risk Financial Manag. 2025, 18, 551. https://doi.org/10.3390/jrfm18100551