MiT-WGAN: Financial Time-Series Generation GAN Based on Multi-Convolution Dynamic Fusion and iTransformer
Abstract
1. Introduction
- This study proposes MiT-WGAN, a hybrid generative adversarial framework for financial time-series generation. In contrast to existing methods that rely on a single feature-extraction mechanism (e.g., QuantGAN based on TCNs, TimeGAN relying on supervised embedding, and diffusion models using stepwise denoising generation), MiT-WGAN integrates multi-convolution dynamic fusion (MCDF) with an enhanced Transformer (iTransformer) to jointly model local multi-scale patterns and long-term temporal dependencies.
- A dynamic gated fusion (DGF) mechanism is introduced to achieve an adaptive balance between local and global features. Unlike simple concatenation or static weighting, DGF employs a learnable gating unit that adaptively adjusts the contribution ratio of the MCDF and iTransformer branches according to the characteristics of the data, improving the model's robustness and generalization on nonstationary financial sequences.
- To mitigate the instability, vanishing gradients, and mode collapse commonly encountered when training GANs on noisy financial time series, the adversarial framework adopts the Wasserstein GAN with gradient penalty (WGAN-GP), which stabilizes training and improves generator convergence (a minimal sketch of these components follows this list).
- We evaluate MiT-WGAN on two stock datasets sampled from the S&P 500 and compare its performance against several state-of-the-art time-series generative models. Using multiple quantitative metrics, the experiments demonstrate that MiT-WGAN significantly outperforms baseline methods in modeling logarithmic returns of financial time series.
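To make the three components above concrete, the following is a minimal PyTorch sketch of the MCDF branch, the DGF gate, and the WGAN-GP gradient penalty. It is an illustrative reconstruction under stated assumptions, not the authors' implementation: the kernel sizes {3, 5, 7} and channel width 64 follow the hyperparameter table reported later, the iTransformer branch is abstracted as a second feature tensor, and all tensor shapes are assumptions.

```python
import torch
import torch.nn as nn

class MCDF(nn.Module):
    """Multi-convolution branch (sketch): parallel 1-D convolutions with
    kernel sizes {3, 5, 7} extract local patterns at several scales."""
    def __init__(self, channels=64, kernels=(3, 5, 7)):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(1, channels, k, padding=k // 2) for k in kernels]
        )

    def forward(self, x):                         # x: (batch, 1, seq_len)
        # Average the multi-scale feature maps into a single tensor.
        return torch.stack([conv(x) for conv in self.convs]).mean(dim=0)

class DGF(nn.Module):
    """Dynamic gated fusion (sketch): a learnable sigmoid gate mixes the
    local (MCDF) and global (iTransformer) branch features element-wise."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, h_local, h_global):         # both: (batch, seq_len, dim)
        g = self.gate(torch.cat([h_local, h_global], dim=-1))  # g in (0, 1)
        return g * h_local + (1 - g) * h_global

def gradient_penalty(critic, real, fake, lam=10.0):
    """WGAN-GP penalty (Gulrajani et al., 2017): drive the critic's gradient
    norm toward 1 on random interpolates of real and generated samples."""
    eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)[0]
    return lam * ((grad.flatten(1).norm(2, dim=1) - 1.0) ** 2).mean()
```

In the full model, `h_local` would come from the MCDF branch (transposed to (batch, seq_len, channels)) and `h_global` from the iTransformer encoder; the gated mixture then feeds the generator's output head.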
2. Related Work
2.1. Time-Series GAN
2.2. Financial Time-Series GAN
2.3. Methods Based on VAE and Diffusion Model
3. Methodology
3.1. GAN
3.2. MiT-WGAN
3.2.1. Multi-Convolution Dynamic Fusion
3.2.2. iTransformer
3.2.3. Dynamically Gated Feature Fusion
3.3. Loss Update
3.4. Theoretical Motivation
4. Experiment
4.1. Data Preprocessing
- First, the raw closing prices are retrieved for the AAPL and AMZN datasets.
- From the real closing prices of the two datasets, the logarithmic return is calculated as
$$r_t = \ln\left(\frac{p_t}{p_{t-1}}\right),$$
where $p_t$ denotes the closing price at time step $t$.
- The calculated log-return series is then standardized to zero mean and unit variance.
- To reconcile the model's generative capacity with the heavy-tailed nature of financial data, the standardized series is further processed with the Lambert W transformation [27], which maps heavy-tailed data toward Gaussianity.
- A secondary normalization then rescales the data to the interval [−1, 1].
- To help the model learn latent dependencies between data points, a sliding-window approach is employed, with a window length of 128 and a stride of 1 (a minimal sketch of the full pipeline follows this list).
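The following is a minimal NumPy/SciPy sketch of this five-step pipeline. The Lambert W tail parameter `delta` is our own placeholder (the paper follows [27] for this transform, but the specific parameterization is not preserved here), and `scipy.special.lambertw` provides the principal branch of the Lambert W function.

```python
import numpy as np
from scipy.special import lambertw

def preprocess(close, window=128, delta=0.1):
    """Closing prices -> overlapping, gaussianized return windows."""
    # 1. Logarithmic returns from closing prices: r_t = ln(p_t / p_{t-1}).
    r = np.diff(np.log(close))
    # 2. Standardize to zero mean and unit variance.
    z = (r - r.mean()) / r.std()
    # 3. Inverse Lambert W transform to remove heavy tails:
    #    u = sign(z) * sqrt(W(delta * z^2) / delta); `delta` is an assumed
    #    tail parameter, not a value taken from the paper.
    u = np.sign(z) * np.sqrt(np.real(lambertw(delta * z**2)) / delta)
    # 4. Secondary normalization to the interval [-1, 1].
    x = 2.0 * (u - u.min()) / (u.max() - u.min()) - 1.0
    # 5. Sliding windows of length 128 with stride 1.
    return np.lib.stride_tricks.sliding_window_view(x, window)
```

For stride 1, `sliding_window_view` yields one window per starting index, matching the step size stated above; a larger stride would simply subsample its rows.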
4.2. Experimental Results
- The Earth Mover’s Distance (EMD, also known as the Wasserstein-1 distance): This metric quantifies the discrepancy between the distributions of real and generated sequences. The Wasserstein-1 distance is formally defined as
$$W_1(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert\right],$$
where $\Pi(P_r, P_g)$ denotes the set of all joint distributions whose marginals are the real distribution $P_r$ and the generated distribution $P_g$.
- ACF-Score: A key property of financial log-return series is that their linear autocorrelations are typically close to zero, whereas the autocorrelations of their absolute and squared values are significant, reflecting volatility clustering. Accordingly, we compute the autocorrelation functions of the raw log-return $r_t$, the absolute log-return $|r_t|$, and the squared return $r_t^2$ and compare them between the real and generated series. Given a lag order $k$ (with the maximum lag set to 10 in this study), the autocorrelation of a series $x_t$ with sample mean $\bar{x}$ is defined as
$$\rho(k) = \frac{\sum_{t=1}^{T-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}.$$
The discrepancy between the autocorrelation functions (ACFs) of the real and generated log-return series is then quantified using the mean squared error (MSE):
$$\mathrm{ACF\text{-}Score} = \frac{1}{K} \sum_{k=1}^{K} \bigl(\rho_{\mathrm{real}}(k) - \rho_{\mathrm{gen}}(k)\bigr)^2.$$
- Leverage Effect Score: The negative correlation between returns and future volatility is a well-documented nonlinear feature of financial time series, known as the leverage effect. It is computed as the correlation between the log-return and its future squared log-return:
$$L(k) = \mathrm{Corr}\left(r_t,\; r_{t+k}^2\right).$$
The score measures the discrepancy between the leverage curves of the real and generated series; a smaller value indicates that the generated data more accurately reproduce the leverage effect observed in real markets.
- Maximum Mean Discrepancy (MMD): To further evaluate the distributional similarity between the real and generated sequences, the Maximum Mean Discrepancy is calculated. Denoting the two sample sets by $X = \{x_i\}_{i=1}^{m}$ and $Y = \{y_j\}_{j=1}^{n}$, its squared form is defined as
$$\mathrm{MMD}^2(X, Y) = \frac{1}{m^2}\sum_{i, i'} k(x_i, x_{i'}) + \frac{1}{n^2}\sum_{j, j'} k(y_j, y_{j'}) - \frac{2}{mn}\sum_{i, j} k(x_i, y_j),$$
where $k(\cdot, \cdot)$ is a positive-definite kernel (a sketch computing these four metrics follows this list).
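As referenced above, the following NumPy/SciPy sketch implements the four metrics under our own assumptions (a Gaussian RBF kernel with bandwidth `sigma` for MMD, lags 1 through 10 for the ACF score); it illustrates the definitions rather than reproducing the authors' evaluation code.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def emd(real, fake):
    # Wasserstein-1 distance between the empirical return distributions.
    return wasserstein_distance(real.ravel(), fake.ravel())

def acf(x, k):
    # Lag-k sample autocorrelation of a 1-D series.
    x = x - x.mean()
    return np.sum(x[:-k] * x[k:]) / np.sum(x * x)

def acf_score(real, fake, max_lag=10, transform=lambda v: v):
    # MSE between real and generated ACF curves; pass np.abs or np.square
    # as `transform` for the absolute / squared variants.
    r, f = transform(real), transform(fake)
    return np.mean([(acf(r, k) - acf(f, k)) ** 2 for k in range(1, max_lag + 1)])

def leverage(x, k=1):
    # Leverage effect: correlation of returns with future squared returns.
    return np.corrcoef(x[:-k], x[k:] ** 2)[0, 1]

def mmd2(X, Y, sigma=1.0):
    # Biased estimator of squared MMD with a Gaussian RBF kernel.
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma**2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()
```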
4.3. Tail Risk Evaluation
4.4. Ablation Study
- Removal of the MCDF module: To assess the contribution of the convolutional branch in capturing local dependencies and volatility clustering, the MCDF module was removed, leaving only the iTransformer branch to directly generate logarithmic return series. Under this configuration, the model relies primarily on the iTransformer’s ability to capture long-range dependencies, thereby highlighting the essential role of local convolutional features in modeling short-term volatility patterns.
- Removal of the iTransformer module: Conversely, to examine the importance of global time-series modeling, the iTransformer branch was removed, retaining only the multi-convolutional dynamic fusion (MCDF) module. In this setting, the generated data depends solely on the convolutional architecture for feature extraction, which has been shown to be effective in capturing short-term dependency patterns. However, this approach is insufficient for adequately modeling long-term dependencies and the overall temporal structure. This experiment thereby facilitates an evaluation of the iTransformer’s effectiveness in capturing long-term dependencies in financial time series.
- Removal of the DGF module: To evaluate the role of the dynamic gated fusion (DGF) module in information interaction and feature integration, a simple linear fusion strategy was adopted instead. In this setting, the outputs of the MCDF and iTransformer branches are combined by direct weighted summation or concatenation before the output mapping, without any gating mechanism (a sketch of this baseline follows this list). Comparing this simplified variant with the full model isolates the gating mechanism's contribution to feature interaction and overall generation quality.
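As a sketch of the "w/o DGF" baseline described above, one linear-fusion variant (the class name and shapes are ours, not the paper's) concatenates the two branch outputs and projects them back without any gate:

```python
import torch
import torch.nn as nn

class LinearFusion(nn.Module):
    """Gate-free ablation baseline: concatenate branch features and
    project back to the model dimension with a single linear map."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, h_conv, h_trans):  # both: (batch, seq_len, dim)
        return self.proj(torch.cat([h_conv, h_trans], dim=-1))
```

Swapping this module in for DGF keeps the parameter count roughly comparable while removing the input-dependent gating, which is exactly what the ablation is meant to isolate.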
5. Discussion and Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Lim, B.; Zohren, S. Time-series forecasting with deep learning: A survey. Philos. Trans. R. Soc. A 2021, 379, 20200209. [Google Scholar] [CrossRef]
- Kumar, N.; Susan, S. COVID-19 pandemic prediction using time series forecasting models. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, 1–3 July 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–7. [Google Scholar]
- Sousa, M.R.; Gama, J.; Brandão, E. A new dynamic modeling framework for credit risk assessment. Expert Syst. Appl. 2016, 45, 341–351. [Google Scholar] [CrossRef]
- Fabozzi, F.A.; Simonian, J.; Fabozzi, F.J. Risk parity: The democratization of risk in asset allocation. J. Portf. Manag. 2021, 47, 41–50. [Google Scholar] [CrossRef]
- Acharya, V.V. A theory of systemic risk and design of prudential bank regulation. J. Financ. Stab. 2009, 5, 224–255. [Google Scholar] [CrossRef]
- Goodell, J.W.; Goutte, S. Co-movement of COVID-19 and Bitcoin: Evidence from wavelet coherence analysis. Financ. Res. Lett. 2021, 38, 101625. [Google Scholar] [CrossRef] [PubMed]
- McRandal, R.; Rozanov, A. A primer on tail risk hedging. J. Secur. Oper. Custody 2012, 5, 29–36. [Google Scholar] [CrossRef]
- Yoon, J.; Jarrett, D.; Van der Schaar, M. Time-series generative adversarial networks. Adv. Neural Inf. Process. Syst. 2019, 32, 5508–5518. [Google Scholar]
- Smith, K.E.; Smith, A.O. Conditional GAN for timeseries generation. arXiv 2020, arXiv:2006.16477. [Google Scholar] [CrossRef]
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680. [Google Scholar]
- Wang, M.; El-Gayar, O. Generative adversarial networks in fraud detection: A systematic literature review. In Proceedings of the Americas Conference on Information Systems (AMCIS), Salt Lake City, UT, USA, 15–17 August 2024; Available online: https://aisel.aisnet.org/amcis2024/security/security/35/ (accessed on 1 October 2025).
- Yang, X.; Li, C.; Han, Z.; Lu, Z. Distributed Generative Adversarial Networks for Fuzzy Portfolio Optimization. In Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing, Tianjin, China, 20–22 October 2023; Springer Nature: Singapore, 2023; pp. 236–247. [Google Scholar]
- Zhou, X.; Pan, Z.; Hu, G.; Tang, S.; Zhao, C. Stock market prediction on high-frequency data using generative adversarial nets. Math. Probl. Eng. 2018, 2018, 4907423. [Google Scholar] [CrossRef]
- Esteban, C.; Hyland, S.L.; Rätsch, G. Real-valued (medical) time series generation with recurrent conditional GANs. arXiv 2017, arXiv:1706.02633. [Google Scholar] [CrossRef]
- Borovykh, A.; Bohte, S.; Oosterlee, C.W. Conditional time series forecasting with convolutional neural networks. arXiv 2017, arXiv:1703.04691. [Google Scholar]
- Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271. [Google Scholar] [CrossRef]
- Zerveas, G.; Jayaraman, S.; Patel, D.; Bhamidipaty, A.; Eickhoff, C. A transformer-based framework for multivariate time series representation learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual Event, 14–18 August 2021; pp. 2114–2124. [Google Scholar]
- Wen, Q.; Zhou, T.; Zhang, C.; Chen, W.; Ma, Z.; Yan, J.; Sun, L. Transformers in time series: A survey. arXiv 2022, arXiv:2202.07125. [Google Scholar]
- Dai, Z.; Liu, H.; Le, Q.V.; Tan, M. CoAtNet: Marrying convolution and attention for all data sizes. Adv. Neural Inf. Process. Syst. 2021, 34, 3965–3977. [Google Scholar]
- Liu, Y.; Hu, T.; Zhang, H.; Wu, H.; Wang, S.; Ma, L.; Long, M. iTransformer: Inverted transformers are effective for time series forecasting. arXiv 2023, arXiv:2310.06625. [Google Scholar]
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 214–223. [Google Scholar]
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. Adv. Neural Inf. Process. Syst. 2017, 30, 5767–5777. [Google Scholar]
- Mogren, O. C-RNN-GAN: Continuous recurrent neural networks with adversarial training. arXiv 2016, arXiv:1611.09904. [Google Scholar]
- Yu, L.; Zhang, W.; Wang, J.; Yu, Y. SeqGAN: Sequence generative adversarial nets with policy gradient. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31. [Google Scholar]
- Mirza, M.; Osindero, S. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784. [Google Scholar] [CrossRef]
- Xu, T.K.; Wenliang, L.K.; Munn, M.; Acciaio, B. COT-GAN: Generating sequential data via causal optimal transport. Adv. Neural Inf. Process. Syst. 2020, 33, 8798–8809. [Google Scholar]
- Wiese, M.; Knobloch, R.; Korn, R.; Kretschmer, P. Quant GANs: Deep generation of financial time series. Quant. Financ. 2020, 20, 1419–1440. [Google Scholar] [CrossRef]
- Takahashi, S.; Chen, Y.; Tanaka-Ishii, K. Modeling financial time-series with generative adversarial networks. Phys. A Stat. Mech. Its Appl. 2019, 527, 121261. [Google Scholar] [CrossRef]
- Huang, A.; Khushi, M.; Suleiman, B. Regime-Specific Quant Generative Adversarial Network: A Conditional Generative Adversarial Network for Regime-Specific Deepfakes of Financial Time Series. Appl. Sci. 2023, 13, 10639. [Google Scholar] [CrossRef]
- Jeon, S.; Seo, J.T. A synthetic time-series generation using a variational recurrent autoencoder with an attention mechanism in an industrial control system. Sensors 2023, 24, 128. [Google Scholar] [CrossRef]
- Leushuis, R.M. Probabilistic forecasting with VAR-VAE: Advancing time series forecasting under uncertainty. Inf. Sci. 2025, 713, 122184. [Google Scholar] [CrossRef]
- Rasul, K.; Seward, C.; Schuster, I.; Vollgraf, R. Autoregressive denoising diffusion models for multivariate probabilistic time series forecasting. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 8857–8868. [Google Scholar]
- Kollovieh, M.; Ansari, A.F.; Bohlke-Schneider, M.; Zschiegner, J.; Wang, H.; Wang, Y.B. Predict, refine, synthesize: Self-guiding diffusion models for probabilistic time series forecasting. Adv. Neural Inf. Process. Syst. 2023, 36, 28341–28364. [Google Scholar]
- Meijer, C.; Chen, L.Y. The rise of diffusion models in time-series forecasting. arXiv 2024, arXiv:2401.03006. [Google Scholar]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 7132–7141. [Google Scholar]
- Pardo, F.D.M.; López, R.C. Mitigating overfitting on financial datasets with generative adversarial networks. J. Financ. Data Sci. 2020, 2, 76–85. [Google Scholar] [CrossRef]
Module | Hyperparameters | Value/Setting |
---|---|---|
MCDF | Kernel sizes | {3, 5, 7} |
MCDF | Channels per convolution | 64 |
MCDF | Number of convolutions | 3 |
iTransformer | Encoder layers | 6 |
iTransformer | Hidden size | 256 |
iTransformer | Attention heads | 8 |
iTransformer | Dropout rate | 0.1 |
iTransformer | Feedforward hidden size | 1024 |
iTransformer | Normalization | Pre-LayerNorm |
iTransformer | Attention type | Scaled dot-product |
iTransformer | Activation | GELU |
DGF | Fusion type | Learnable gating with sigmoid |
DGF | Dropout rate | 0.1 |
Discriminator | Structure | Symmetric to generator |
Discriminator | Normalization | LayerNorm |
Training | Batch size | 64 |
Training | Optimizer | Adam () |
Training | Learning rate | |
Training | Epochs | 200 |
Dataset | Metrics | MiT-WGAN | QuantGAN | GARCH | TimeGAN |
---|---|---|---|---|---|
AAPL | EMD | | | | |
AAPL | ACF(id) | | | | |
AAPL | ACF(abs) | | | | |
AAPL | ACF(sq) | | | | |
AAPL | Leverage Effect | | | | |
AAPL | MMD² | | | | |
AMZN | EMD | | | | |
AMZN | ACF(id) | | | | |
AMZN | ACF(abs) | | | | |
AMZN | ACF(sq) | | | | |
AMZN | Leverage Effect | | | | |
AMZN | MMD² | | | | |
Dataset | Confidence Level | Metric | Real Data | MiT-WGAN | QuantGAN | GARCH | TimeGAN |
---|---|---|---|---|---|---|---|
AAPL | 95% | VaR | −0.191 | −0.357 | −0.424 | −0.553 | −0.634 |
AAPL | 95% | CVaR | −0.453 | −0.678 | −0.712 | −0.824 | −0.948 |
AAPL | 99% | VaR | −0.575 | −0.782 | −0.873 | −0.996 | −1.094 |
AAPL | 99% | CVaR | −0.722 | −0.938 | −0.994 | −1.122 | −1.195 |
AMZN | 95% | VaR | −0.328 | −0.571 | −0.647 | −0.791 | −0.815 |
AMZN | 95% | CVaR | −0.478 | −0.684 | −0.722 | −0.846 | −0.943 |
AMZN | 99% | VaR | −0.501 | −0.727 | −0.816 | −0.916 | −0.964 |
AMZN | 99% | CVaR | −0.641 | −0.885 | −0.942 | −1.171 | −1.159 |
Metric | AAPL Real | AAPL Gen | AMZN Real | AMZN Gen |
---|---|---|---|---|
95% VaR | −0.452 | −0.616 | −0.591 | −0.702 |
95% CVaR | −0.568 | −0.825 | −0.711 | −0.815 |
99% VaR | −0.693 | −0.863 | −0.783 | −0.764 |
99% CVaR | −0.743 | −0.927 | −0.964 | −0.905 |
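The tail-risk tables above report Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) at the 95% and 99% confidence levels. A minimal sketch of the historical (empirical-quantile) estimator we assume behind such figures follows; the paper's exact estimator is not preserved in this version.

```python
import numpy as np

def var_cvar(returns, level=0.95):
    """Historical VaR/CVaR of a 1-D array of (log-)returns."""
    # VaR at `level`: the (1 - level) empirical quantile of returns.
    var = np.quantile(returns, 1.0 - level)
    # CVaR (expected shortfall): mean return at or below the VaR threshold.
    cvar = returns[returns <= var].mean()
    return var, cvar

# Hypothetical usage on a generated return series `gen_returns`:
# var95, cvar95 = var_cvar(gen_returns, 0.95)
# var99, cvar99 = var_cvar(gen_returns, 0.99)
```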
Dataset | Metrics | Full (MiT-WGAN) | w/o MCDF | w/o iTransformer | w/o DGF |
---|---|---|---|---|---|
AAPL | EMD | | | | |
AAPL | ACF(id) | | | | |
AAPL | ACF(abs) | | | | |
AAPL | ACF(sq) | | | | |
AAPL | Leverage Effect | | | | |
AAPL | MMD² | | | | |
AMZN | EMD | | | | |
AMZN | ACF(id) | | | | |
AMZN | ACF(abs) | | | | |
AMZN | ACF(sq) | | | | |
AMZN | Leverage Effect | | | | |
AMZN | MMD² | | | | |