Forecasting Stock Market Indices Using Integration of Encoder, Decoder, and Attention Mechanism
Abstract
1. Introduction
1. Given their effectiveness in language modeling, such as predicting the next word in a sequence, can encoder–decoder architectures also excel in stock price prediction contexts?
2. Can encoder–decoder architectures, including those with and without attention mechanisms, outperform traditional recurrent neural networks in predicting stock price indices in the Vietnamese market?
3. What is the impact of attention mechanisms on the predictive performance of encoder–decoder architectures for stock price forecasting in the Vietnamese context?
- Application to the Vietnamese market: This study extends the application of advanced deep learning models, specifically encoder–decoder architectures with and without attention mechanisms, to the Vietnamese stock market, a market with unique characteristics and limited prior research on the application of these sophisticated models.
- Comparative analysis: We conduct a comprehensive comparative analysis of the performance of encoder–decoder models with and without attention mechanisms against traditional recurrent neural networks, providing valuable insights into the relative strengths and weaknesses of these different approaches.
- Rigorous methodology: We employ a rigorous experimental framework, including hyperparameter tuning using Bayesian optimization, to ensure optimal model performance and robust evaluation.
2. Related Work
2.1. Significant Early Contributions to Deep Learning Models for Stock Price Prediction
2.2. Historical Developments in Encoder–Decoder Architecture and Attention Mechanism
3. Recurrent Neural Networks
3.1. RNN
3.2. LSTM
3.3. Gated Recurrent Units
4. Encoder–Decoder Architecture
4.1. Encoder
4.2. Decoder
4.3. Attention
5. Experiments
5.1. Datasets
5.2. Data Preprocessing
5.3. Hyperparameters Setting
5.4. Model Performance Measures
6. Results
7. Conclusions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| RNN | Recurrent Neural Network |
| LSTM | Long Short-Term Memory |
| GRU | Gated Recurrent Unit |
| MAE | Mean Absolute Error |
| MSE | Mean Square Error |
| RMSE | Root Mean Square Error |
| MAPE | Mean Absolute Percentage Error |
| VN-Index | Ho Chi Minh Stock Exchange Index |
| HNX-Index | Hanoi Stock Exchange Index |
| Component | Specification | 
|---|---|
| CPU | AMD Ryzen 7 7435HS (3.10 GHz up to 4.50 GHz, 8 cores) | 
| GPU | NVIDIA GeForce RTX 4060 (8 GB GDDR6) | 
| Miniconda Version | Conda 24.9.2 | 
| Python Version | Python 3.12.7 | 
| PyTorch Version | PyTorch 2.5.1 | 
| Operating System | Windows 11 |
| CUDA Version | CUDA 11.6 | 
| Other Libraries | NumPy 1.26.4, Pandas 2.2.2, scikit-learn 1.5.1, scikit-optimize 0.10.2 | 
| Index | Mean | Std | Min | Max | | | |
|---|---|---|---|---|---|---|---|
| VN-Index | | | | | | | |
| HNX-Index | | | | | | | |
| Model | Hyper-Parameter | VN-Index | HNX-Index |
|---|---|---|---|
| RNN | Hidden layers | 1 | 1 |
| | Hidden size | 128 | 185 |
| | Batch size | 32 | 32 |
| | Loss function | MSE | MSE |
| | Optimizer | Adam | Adam |
| | Learning rate | | |
| | Number of epochs | 1000 | 700 |
| | Period of learning rate decay | 200 | 113 |
| | Multiplicative factor of learning rate decay | | |
| LSTM | Hidden layers | 1 | 1 |
| | Hidden size | 128 | 48 |
| | Batch size | 32 | 32 |
| | Loss function | MSE | MSE |
| | Optimizer | Adam | Adam |
| | Learning rate | | |
| | Number of epochs | 1500 | 2000 |
| | Period of learning rate decay | 200 | 253 |
| | Multiplicative factor of learning rate decay | | |
| GRU | Hidden layers | 1 | 1 |
| | Hidden size | 128 | 184 |
| | Batch size | 32 | 32 |
| | Loss function | MSE | MSE |
| | Optimizer | Adam | Adam |
| | Learning rate | | |
| | Number of epochs | 1000 | 1600 |
| | Period of learning rate decay | 200 | 300 |
| | Multiplicative factor of learning rate decay | | |
| Encoder-Decoder | Hidden layers | 1 | 1 |
| | Hidden size | 128 | 191 |
| | Batch size | 32 | 32 |
| | Loss function | MSE | MSE |
| | Optimizer | Adam | Adam |
| | Learning rate | | |
| | Number of epochs | 402 | 1000 |
| | Period of learning rate decay | 200 | 125 |
| | Multiplicative factor of learning rate decay | | |
| Encoder-Decoder-Attention | Hidden layers | 1 | 1 |
| | Hidden size | 128 | 128 |
| | Batch size | 32 | 32 |
| | Loss function | MSE | MSE |
| | Optimizer | Adam | Adam |
| | Learning rate | | |
| | Number of epochs | 402 | 2400 |
| | Period of learning rate decay | 200 | 200 |
| | Multiplicative factor of learning rate decay | | |
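
The two decay rows in the table above read like the `step_size` and `gamma` arguments of a step-wise learning-rate schedule. Their wording matches the parameter descriptions of PyTorch's `StepLR`, although the scheduler is not named here. Under that assumption, a minimal pure-Python sketch of the resulting schedule could look as follows. `INITIAL_LR` and `GAMMA` are hypothetical placeholders, because the corresponding cells are empty in the table; the period of 200 and the 1000 epochs are the values listed for the RNN on the VN-Index.

```python
def step_decay_lr(initial_lr: float, gamma: float, step_size: int, epoch: int) -> float:
    """Learning rate in effect at `epoch` (0-based) under step decay.

    The rate is multiplied by `gamma` once every `step_size` epochs.
    """
    return initial_lr * gamma ** (epoch // step_size)


# Hypothetical values: the learning-rate and decay-factor cells are empty in the table.
INITIAL_LR = 1e-3
GAMMA = 0.5

# Values listed in the table for the RNN on the VN-Index.
STEP_SIZE = 200
NUM_EPOCHS = 1000

# One learning rate per training epoch.
schedule = [step_decay_lr(INITIAL_LR, GAMMA, STEP_SIZE, e) for e in range(NUM_EPOCHS)]
```

With these settings the rate stays constant within each block of 200 epochs and is scaled by `GAMMA` four times over the 1000 epochs.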
| Dataset | Model | MAE | RMSE | MAPE |
|---|---|---|---|---|
| VN-Index | RNN | 11.1491 (0.0319) | 15.6967 (0.0209) | 0.0094 (0.0000) |
| | LSTM | 11.1006 (0.0268) | 15.7524 (0.0518) | 0.0094 (0.0000) |
| | GRU | 11.0749 (0.0129) | 15.6599 (0.0164) | 0.0093 (0.0000) |
| | Encoder-Decoder | 11.0512 (0.0091) | 15.6529 (0.0132) | 0.0093 (0.0000) |
| | Encoder-Decoder-Attention | 11.0602 (0.0083) | 15.6494 (0.0071) | 0.0093 (0.0000) |
| HNX-Index | RNN | 3.8963 (0.0652) | 5.4988 (0.1112) | 0.0130 (0.0001) |
| | LSTM | 3.8539 (0.2091) | 5.5601 (0.2694) | 0.0133 (0.0005) |
| | GRU | 3.7678 (0.0148) | 5.3455 (0.0095) | 0.0132 (0.0001) |
| | Encoder-Decoder | 3.5612 (0.0478) | 5.3473 (0.0725) | 0.0122 (0.0001) |
| | Encoder-Decoder-Attention | 3.6273 (0.0395) | 5.3463 (0.1265) | 0.0127 (0.0001) |
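
For reference, the three error measures reported in the table above can be computed as in the following sketch. It uses the standard textbook definitions, which are assumed here rather than taken from the paper's own formulas, and it treats MAPE as a fraction rather than a percentage, which is consistent with the magnitude of the reported values. The sample series are made up for illustration and are not index data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%).

    Assumes no actual value is zero.
    """
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is why the two measures can rank models differently.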
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Thach, T.T. Forecasting Stock Market Indices Using Integration of Encoder, Decoder, and Attention Mechanism. Entropy 2025, 27, 82. https://doi.org/10.3390/e27010082