RaDiT: A Differential Transformer-Based Hybrid Deep Learning Model for Radar Echo Extrapolation
Abstract
1. Introduction
- RaDiT architecture: We propose a hybrid encoder–decoder architecture incorporating a DIFF Transformer-based encoding module. This design strengthens the representation of spatiotemporal interactions, addressing their insufficient modeling in rapidly evolving convective processes, and raises the attention signal-to-noise ratio for more accurate radar echo extrapolation.
- Generative adversarial training: We implement adversarial training to enhance the model’s ability to represent multi-scale spatiotemporal features, directly addressing the issue of suboptimal feature utilization in extended radar echo sequences. This improves extrapolation accuracy for long-sequence radar data.
- Multi-level loss function: A multi-level loss function with adaptive weighted components is developed to guide model training. It improves localization and intensity estimation of high-intensity echo regions, enabling better feature utilization across scales.
- Empirical validation: Our RaDiT framework outperforms state-of-the-art methods, demonstrating superior spatiotemporal consistency and intensity fidelity, both critical for forecasting extreme weather events.
2. Materials and Methods
2.1. Dataset
2.2. Method
2.2.1. Overall Structure of the RaDiT
Algorithm 1 Discriminator Architecture
Require: Input tensor x
Ensure: Validity probability p
1: Architecture Configuration:
2: for each layer ℓ do
3: define the convolutional block Block_ℓ
4: end for
5: Forward Process:
6: for ℓ = 1 to 5 do
7: x ← Block_ℓ(x)
8: end for
9: flatten x
10: p ← σ(Linear(x))
11: return p
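Algorithm 1's exact layer configuration is not fully specified above. Assuming a conventional PatchGAN-style discriminator (five stride-2 convolutional blocks with kernel size 4 and padding 1; these values are illustrative assumptions, not the paper's configuration), standard convolution arithmetic shows how the spatial resolution collapses before the final validity probability:

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Output spatial size of a 2-D convolution on a square input."""
    return (size + 2 * pad - kernel) // stride + 1

size = 128                       # assumed input resolution
trace = [size]
for _ in range(5):               # five downsampling blocks, matching the loop bound
    size = conv_out(size)
    trace.append(size)
print(trace)  # [128, 64, 32, 16, 8, 4]
```

The 4 × 4 feature map is then flattened and mapped to a single probability, which is consistent with the flatten/linear steps at the end of the algorithm.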
2.2.2. Differential Attention of DIFF Transformer
- W^Q, W^K, W^V, and W^O are learnable projection matrices;
- λ is a learnable balancing factor that scales the subtracted attention map.
Algorithm 2 Differential Multi-Head Attention
Require: • Input tensor X
• Weight matrices W^Q, W^K, W^V
• Output projection W^O
• Balancing factor λ
• Number of heads h
Ensure: Output tensor Y
1: // —— Single-Head Differential Attention: DiffAttn ——
2: Project inputs: [Q1; Q2] ← X W^Q, [K1; K2] ← X W^K, V ← X W^V
3: Compute attention scores: A1 ← softmax(Q1 K1^T / √d), A2 ← softmax(Q2 K2^T / √d)
4: return DiffAttn(X) = (A1 − λ A2) V
5: // —— Multi-Head Composition ——
6: Initialize output
7: for head i = 1 to h do
8: head_i ← DiffAttn(X; W_i^Q, W_i^K, W_i^V, λ)
9: Apply normalization: head_i ← RMSNorm(head_i)
10: Residual scaling: head_i ← head_i · (1 − λ_init)
11: Collect heads
12: end for
13: // —— Final Projection ——
14: Concatenate heads: H ← Concat(head_1, …, head_h)
15: Project output: Y ← H W^O
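The differential attention above follows Ye et al.'s DIFF Transformer: two softmax attention maps, computed from split query/key projections, are subtracted so that common-mode attention noise cancels before the values are aggregated. A minimal NumPy sketch (tensor sizes, initialization scale, and the λ_init value are illustrative assumptions):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def diff_attn(X, Wq, Wk, Wv, lam):
    """Single-head differential attention: the difference of two
    softmax attention maps is applied to the value projection."""
    d = Wq.shape[1] // 2                       # per-map head dimension
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    Q1, Q2 = Q[:, :d], Q[:, d:]
    K1, K2 = K[:, :d], K[:, d:]
    A1 = softmax(Q1 @ K1.T / np.sqrt(d))
    A2 = softmax(Q2 @ K2.T / np.sqrt(d))
    return (A1 - lam * A2) @ V                 # differential attention output

def rms_norm(x, eps=1e-6):
    return x / np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)

def multi_head_diff_attn(X, heads, Wo, lam, lam_init=0.8):
    """heads: list of (Wq, Wk, Wv) tuples, one per head."""
    outs = []
    for Wq, Wk, Wv in heads:
        h = diff_attn(X, Wq, Wk, Wv, lam)
        h = rms_norm(h) * (1.0 - lam_init)     # headwise norm + residual scaling
        outs.append(h)
    return np.concatenate(outs, axis=-1) @ Wo  # concatenate heads, project

rng = np.random.default_rng(0)
n, d_model, n_heads, d_head = 6, 16, 2, 4
heads = [tuple(rng.normal(size=(d_model, 2 * d_head)) * 0.1 for _ in range(3))
         for _ in range(n_heads)]
Wo = rng.normal(size=(n_heads * 2 * d_head, d_model)) * 0.1
X = rng.normal(size=(n, d_model))
Y = multi_head_diff_attn(X, heads, Wo, lam=0.5)
print(Y.shape)  # (6, 16)
```

Note that each row of (A1 − λ A2) sums to 1 − λ rather than 1, which is why the headwise normalization and (1 − λ_init) rescaling are applied before the output projection.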
2.3. Loss Function
2.3.1. Generator Loss Function
- x̂ represents the predicted image and x denotes the ground-truth image;
- T is the total number of frames in the images;
- μ_x̂ and μ_x are the mean intensities of x̂ and x, respectively;
- σ²_x̂ and σ²_x are the variances of x̂ and x, respectively;
- σ_x̂x is the covariance between x̂ and x;
- C1 and C2 are stabilization constants;
- α and β are introduced as learnable parameters to dynamically balance the contributions of the structural-similarity and weighted-error terms within the hybrid loss function during model training;
- x̂_(i,j) and x_(i,j) are the continuous pixel values of x̂ and x, respectively;
- H and W denote the height and width of the radar echo images, respectively;
- w_(i,j) and Z_(i,j) are the weight and the radar reflectivity at coordinate (i, j), respectively.
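As a sketch of how such a hybrid objective can be assembled (the exact weighting function and SSIM windowing used by RaDiT are not reproduced here; a global per-frame SSIM term and a reflectivity-weighted MAE are assumed for illustration, with alpha and beta standing in for the learnable balancing parameters):

```python
import numpy as np

def ssim(x, y, c1=1e-4, c2=9e-4):
    """Global SSIM between two frames (whole-frame means/variances)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def weighted_mae(pred, truth, weight):
    """Pixel-wise weighted absolute error over an H x W frame."""
    return (weight * np.abs(pred - truth)).mean()

def generator_loss(preds, truths, weights, alpha, beta):
    """Hybrid loss averaged over T frames (assumed form)."""
    t = len(preds)
    l_ssim = sum(1.0 - ssim(p, g) for p, g in zip(preds, truths)) / t
    l_wmae = sum(weighted_mae(p, g, w)
                 for p, g, w in zip(preds, truths, weights)) / t
    return alpha * l_ssim + beta * l_wmae

rng = np.random.default_rng(1)
truths = [rng.random((8, 8)) for _ in range(3)]
preds = [g + 0.05 * rng.normal(size=(8, 8)) for g in truths]
weights = [1.0 + 4.0 * g for g in truths]   # heavier penalty on strong echoes
print(generator_loss(preds, truths, weights, alpha=0.5, beta=0.5))
```

Making the per-pixel weight grow with reflectivity is what pushes the optimizer toward better localization and intensity estimation of high-intensity echo regions.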
2.3.2. Discriminator Loss Function
- p_i is the predicted probability for the i-th sample generated by the generator;
- y_i is the ground-truth label for the i-th sample;
- N is the total number of samples in the dataset.
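The discriminator objective described above is the standard binary cross-entropy over predicted validity probabilities; a minimal NumPy version:

```python
import numpy as np

def bce_loss(p, y, eps=1e-7):
    """Binary cross-entropy over N samples: p holds the discriminator's
    predicted validity probabilities, y the real/fake labels."""
    p = np.clip(p, eps, 1 - eps)              # numerical stability near 0 and 1
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

p = np.array([0.9, 0.8, 0.2, 0.1])            # confident, mostly correct predictions
y = np.array([1.0, 1.0, 0.0, 0.0])
print(round(bce_loss(p, y), 4))  # 0.1643
```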
2.4. Evaluation Metrics
- FVD: Quantifies similarity between predicted and ground-truth radar echo sequences by comparing their feature distributions in a latent space, where μ_r and μ_g are the mean feature vectors and Σ_r, Σ_g are the covariance matrices of the real and generated sequences. This metric is particularly effective for evaluating temporal consistency.
- CRPS: Measures the accuracy of probabilistic forecasts by integrating the squared difference between predicted and observed cumulative distribution functions (CDFs), where F(z) is the predicted CDF and F_o(z) is the ground-truth CDF.
- Classification metrics CSI = TP/(TP + FP + FN), POD = TP/(TP + FN), and FAR = FP/(TP + FP), where True Positives (TP) are correctly predicted events; False Positives (FP) are incorrectly predicted events (predicted but not observed); False Negatives (FN) are missed events (observed but not predicted); and True Negatives (TN) are cases where the model correctly predicts the absence of an event when no event is observed.
- PSNR: Measures the quality of extrapolated echoes, where MAX denotes the maximum possible pixel value of the image.
- Threshold-accumulated absolute error: Assesses the conservation of quality by calculating the absolute error between the extrapolated radar echo results for each frame and the ground truth, based on the accumulated values within the threshold region. Here, 1(·) is an indicator function, equal to 1 if the condition is satisfied and 0 otherwise, and τ is a threshold used to filter which echo values are included in the calculation.
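The contingency-table metrics can be computed directly from thresholded echo fields. The toy arrays and threshold below are illustrative only, not the VIL levels (16–219) used in the experiments:

```python
import numpy as np

def contingency(pred, obs, thresh):
    """Binary event counts at a given reflectivity threshold."""
    p, o = pred >= thresh, obs >= thresh
    tp = np.sum(p & o)                 # predicted and observed
    fp = np.sum(p & ~o)                # predicted but not observed
    fn = np.sum(~p & o)                # observed but not predicted
    tn = np.sum(~p & ~o)               # correctly predicted non-event
    return tp, fp, fn, tn

def csi(tp, fp, fn): return tp / (tp + fp + fn)
def pod(tp, fn):     return tp / (tp + fn)
def far(tp, fp):     return fp / (tp + fp)

pred = np.array([[30, 42], [10, 55]])
obs  = np.array([[35, 18], [12, 50]])
tp, fp, fn, tn = contingency(pred, obs, thresh=35)
print(tp, fp, fn, tn)  # 1 1 1 1
```

Pooled variants (CSI-pool4, CSI-pool16) apply the same formulas after max-pooling the binary event maps, which rewards forecasts that are correct up to small spatial displacements.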
2.5. Implementation Details
Algorithm 3 Adversarial Training Strategy
Require: • Generator G, Discriminator D
• Training dataset {(x, y)}
• Loss functions: L_G, L_D
• Hyperparameters: learning rates, number of epochs N, accumulation interval k
Ensure: Optimized parameters: θ_G, θ_D
1: Initialize gradients to zero
2: for epoch = 1 to N do
3: for each batch (x, y) do
4: //--Generator Forward--//
5: ŷ ← G(x)
6: //--Discriminator Phase 1--//
7: d_real ← D(y)
8: d_fake ← D(ŷ)
9: compute the adversarial term of L_G from d_fake
10: //--Generator Optimization--//
11: compute L_G(ŷ, y)
12: backpropagate ∇_{θ_G} L_G
13: update θ_G
14: //--Discriminator Phase 2--//
15: d_fake ← D(ŷ), with ŷ detached from the generator
16: compute L_D(d_real, d_fake)
17: accumulate ∇_{θ_D} L_D
18: if batch mod k = 0 then
19: update θ_D
20: reset accumulated gradients
21: end if
22: end for
23: end for
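The control flow of Algorithm 3, with the generator updated every batch while discriminator gradients accumulate and are applied every k batches, can be sketched with the optimization steps stubbed out (G, D, and the losses are elided here; only the update schedule is exercised):

```python
def adversarial_training(num_epochs, batches, k):
    """Structural sketch of Algorithm 3 (update rules elided):
    the generator is updated once per batch, while discriminator
    gradients accumulate and are applied every k batches."""
    log = {"g_updates": 0, "d_updates": 0}
    accum = 0                              # accumulated discriminator gradients
    for epoch in range(1, num_epochs + 1):
        for b in range(1, batches + 1):
            # -- Generator forward + Discriminator phase 1 (elided) --
            # y_hat = G(x); d_real = D(y); d_fake = D(y_hat)
            # -- Generator optimization: every batch --
            log["g_updates"] += 1
            # -- Discriminator phase 2: accumulate gradients --
            accum += 1
            if b % k == 0:                 # apply and reset, as in steps 18-21
                log["d_updates"] += 1
                accum = 0                  # reset accumulated gradients
    return log

print(adversarial_training(num_epochs=2, batches=10, k=5))
# {'g_updates': 20, 'd_updates': 4}
```

Accumulating discriminator gradients over k batches keeps the discriminator from overpowering the generator early in training, a common stabilization choice for GAN training.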
3. Results
3.1. SEVIR VIL Extrapolation
3.1.1. Comparison with the State of the Art
3.1.2. Ablation Study on SEVIR
3.1.3. Ablation Study of the Loss-Weighting Scheme on SEVIR
3.2. CMARC Echo Extrapolation
3.2.1. Experimental Setup
3.2.2. Experimental Results
3.2.3. Ablation Study on CMARC
4. Discussion
- Enhanced Pattern Capture and Noise Suppression with Attention Mechanisms
- Precision in Critical Feature Localization
- Perceptual Quality and Structural Fidelity
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Weisman, M.L.; Klemp, J.B. The dependence of numerically simulated convective storms on vertical wind shear and buoyancy. Mon. Weather Rev. 1982, 110, 504–520.
- Imhoff, R.O.; Brauer, C.C.; Overeem, A.; Weerts, A.H.; Uijlenhoet, R. Spatial and temporal evaluation of radar rainfall nowcasting techniques on 1533 events. Water Resour. Res. 2020, 56, e2019WR026723.
- Dixon, M.; Wiener, G. TITAN: Thunderstorm Identification, Tracking, Analysis, and Nowcasting—A Radar-Based Methodology. J. Atmos. Ocean. Technol. 1993, 10, 785–797.
- Zhang, C.; Zhou, X.; Zhuge, X.; Xu, M. Learnable Optical Flow Network for Radar Echo Extrapolation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 1260–1266.
- Ayzel, G.; Heistermann, M.; Winterrath, T. Optical Flow Models as an Open Benchmark for Radar-Based Precipitation Nowcasting (Rainymotion v0.1). Geosci. Model Dev. 2019, 12, 1387–1402.
- Mai, X.; Zhong, H.; Li, L. Using SVM to Provide Precipitation Nowcasting Based on Radar Data. In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery. ICNC-FSKD 2019; Liu, Y., Wang, L., Zhao, L., Yu, Z., Eds.; Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2020; Volume 1075.
- Yu, P.; Yang, T.; Chen, S.; Kuo, C.; Tseng, H. Comparison of Random Forests and Support Vector Machine for Real-Time Radar-Derived Rainfall Forecasting. J. Hydrol. 2017, 552, 92–104.
- Yang, Z.; Liu, P.; Yang, Y. Convective/Stratiform Precipitation Classification Using Ground-Based Doppler Radar Data Based on the K-Nearest Neighbor Algorithm. Remote Sens. 2019, 11, 2277.
- Aderyani, F.R.; Mousavi, S.J.; Jafari, F. Short-Term Rainfall Forecasting Using Machine Learning-Based Approaches of PSO-SVR, LSTM, and CNN. J. Hydrol. 2022, 614, 128463.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015; Navab, N., Hornegger, J., Wells, W., Frangi, A., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2015; Volume 9351, pp. 234–241.
- Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. arXiv 2021, arXiv:2102.04306.
- Çiçek, Ö.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2016; Ourselin, S., Joskowicz, L., Sabuncu, M., Unal, G., Wells, W., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2016; Volume 9901, pp. 424–432.
- Han, L.; Liang, H.; Chen, H.; Zhang, W.; Ge, Y. Convective Precipitation Nowcasting Using U-Net Model. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4103508.
- Zhou, K.; Zheng, Y.; Dong, W.; Wang, T. A Deep Learning Network for Cloud-to-Ground Lightning Nowcasting with Multisource Data. J. Atmos. Ocean. Technol. 2020, 37, 927–942.
- Leinonen, J.; Hamann, U.; Germann, U. Seamless Lightning Nowcasting with Recurrent-Convolutional Deep Learning. Artif. Intell. Earth Syst. 2022, 1, 1–46.
- Bengio, Y.; Simard, P.; Frasconi, P. Learning Long-Term Dependencies with Gradient Descent Is Difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166.
- Pascanu, R.; Mikolov, T.; Bengio, Y. On the Difficulty of Training Recurrent Neural Networks. In Proceedings of the 30th International Conference on Machine Learning, PMLR, Atlanta, GA, USA, 16–21 June 2013. Available online: https://proceedings.mlr.press/v28/pascanu13.html (accessed on 24 October 2024).
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- Salem, F. Gated RNN: The Gated Recurrent Unit (GRU) RNN. In Recurrent Neural Networks; Springer: Cham, Switzerland, 2022.
- Liu, H.; Liu, C.; Wang, J.T.L.; Wang, H. Predicting Solar Flares Using a Long Short-Term Memory Network. Astrophys. J. 2019, 877, 121.
- Shi, X.; Chen, Z.; Wang, H.; Yeung, D.-Y.; Wong, W.-k.; Woo, W.-c. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. In Proceedings of the 29th International Conference on Neural Information Processing Systems—Volume 1, Montreal, QC, Canada, 7–12 December 2015; MIT Press: Cambridge, MA, USA, 2015; pp. 802–810.
- Wang, Y.; Long, M.; Wang, J.; Gao, Z.; Yu, P.S. PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 879–888.
- Wang, Y.; Wu, H.; Zhang, J.; Gao, Z.; Wang, J.; Yu, P.S.; Long, M. PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 2208–2225.
- Zhong, S.; Zeng, X.; Ling, Q.; Wen, Q.; Meng, W.; Feng, Y. Spatiotemporal Convolutional LSTM for Radar Echo Extrapolation. In Proceedings of the 2020 54th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 1–5 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 58–62.
- Zheng, K.; Liu, Y.; Zhang, J.; Luo, C.; Tang, S.; Ruan, H.; Tan, Q.; Yi, Y.; Ran, X. GAN-argcPredNet v1.0: A Generative Adversarial Model for Radar Echo Extrapolation Based on Convolutional Recurrent Units. Geosci. Model Dev. 2022, 15, 1467–1475.
- Gao, Z.; Shi, X.; Han, B.; Wang, H.; Jin, X.; Maddix, D.; Zhu, Y.; Li, M.; Wang, Y. PreDiff: Precipitation Nowcasting with Latent Diffusion Models. In Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Curran Associates Inc.: Red Hook, NY, USA, 2023; pp. 1–36.
- Liu, Q.; Yang, Z.; Ji, R.; Zhang, Y.; Bilal, M.; Liu, X.; Vimal, S.; Xu, X. Deep Vision in Analysis and Recognition of Radar Data: Achievements, Advancements, and Challenges. IEEE Syst. Man Cybern. Mag. 2023, 9, 4–12.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2017, arXiv:1706.03762.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
- Aleissaee, A.A.; Kumar, A.; Anwer, R.M.; Khan, S.; Cholakkal, H.; Xia, G.-S.; Khan, F.S. Transformers in Remote Sensing: A Survey. Remote Sens. 2023, 15, 1860.
- Bi, K.; Xie, L.; Zhang, H.; Chen, X.; Gu, X.; Tian, Q. Accurate medium-range global weather forecasting with 3D neural networks. Nature 2023, 619, 533–538.
- Chen, L.; Zhong, X.; Zhang, F.; Cheng, Y.; Xu, Y.; Qi, Y.; Li, H. FuXi: A Cascade Machine Learning Forecasting System for 15-Day Global Weather Forecast. npj Clim. Atmos. Sci. 2023, 6, 190.
- Lang, S.; Alexe, M.; Chantry, M.; Dramsch, J.; Pinault, F.; Raoult, B.; Clare, M.; Lessig, C.; Maier-Gerber, M.; Magnusson, L.; et al. AIFS—ECMWF’s Data-Driven Forecasting System. arXiv 2024, arXiv:2406.01465.
- Bojesomo, A.; Al-Marzouqi, H.; Liatsis, P. Spatiotemporal Vision Transformer for Short Time Weather Forecasting. In Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 15–18 December 2021; pp. 5741–5746.
- Chen, S.; Shu, T.; Zhao, H.; Zhong, G.; Chen, X. TempEE: Temporal–Spatial Parallel Transformer for Radar Echo Extrapolation Beyond Autoregression. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5108914.
- Liu, Y.; Yang, L.; Chen, M.; Song, L.; Han, L.; Xu, J. A Deep Learning Approach for Forecasting Thunderstorm Gusts in the Beijing-Tianjin-Hebei Region. Adv. Atmos. Sci. 2024, 41, 1342–1363.
- Tan, C.; Gao, Z.; Li, S.; Xu, Y.; Li, S.Z. Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 18770–18782. Available online: https://api.semanticscholar.org/CorpusID:250048557 (accessed on 14 October 2024).
- Geng, H.; Zhao, H.; Shi, Z.; Wu, F.; Geng, L.; Ma, K. MBFE-UNet: A Multi-Branch Feature Extraction UNet with Temporal Cross Attention for Radar Echo Extrapolation. Remote Sens. 2024, 16, 3956.
- Zhang, Z.; Song, Q.; Duan, M.; Liu, H.; Huo, J.; Han, C. Deep Learning Model for Precipitation Nowcasting Based on Residual and Attention Mechanisms. Remote Sens. 2025, 17, 1123.
- Kamradt, G. Needle in a Haystack—Pressure Testing LLMs. Available online: https://github.com/gkamradt/LLMTest_NeedleInAHaystack/tree/main (accessed on 5 June 2023).
- Liu, N.F.; Lin, K.; Hewitt, J.; Paranjape, A.; Bevilacqua, M.; Petroni, F.; Liang, P. Lost in the Middle: How Language Models Use Long Contexts. Trans. Assoc. Comput. Linguist. 2024, 12, 157–173.
- Ye, T.; Dong, L.; Xia, Y.; Sun, Y.; Zhu, Y.; Huang, G.; Wei, F. Differential Transformer. In Proceedings of the Thirteenth International Conference on Learning Representations, Singapore, 24–28 April 2025. Available online: https://openreview.net/forum?id=OvoCm1gGhN (accessed on 26 January 2025).
- Tran, D.-P.; Nguyen, Q.-A.; Pham, V.-T.; Tran, T.-T. Trans2Unet: Neural Fusion for Nuclei Semantic Segmentation. In Proceedings of the 2022 11th International Conference on Control, Automation and Information Sciences (ICCAIS), Hanoi, Vietnam, 21–24 November 2022; pp. 583–588.
- Veillette, M.S.; Samsi, S.; Mattioli, C.J. SEVIR: A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS '20), Online, 6–12 December 2020; Curran Associates Inc.: Red Hook, NY, USA, 2020; Volume 1846, pp. 1–11. ISBN 9781713829546.
- Xia, R.; Zhang, D.-L.; Wang, B. A 6-Year Cloud-to-Ground Lightning Climatology and Its Relationship to Rainfall over Central and Eastern China. J. Appl. Meteorol. Climatol. 2015, 54.
- Yadav, S.; Shukla, S. Analysis of k-Fold Cross-Validation over Hold-Out Validation on Colossal Datasets for Quality Classification. In Proceedings of the 2016 IEEE 6th International Conference on Advanced Computing (IACC), Bhimavaram, India, 27–28 February 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 78–83.
- Laplante, P.A. Comprehensive Dictionary of Electrical Engineering; 2005. Available online: https://api.semanticscholar.org/CorpusID:60992230 (accessed on 12 October 2024).
- Zhang, B.; Sennrich, R. Root Mean Square Layer Normalization. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019; Volume 1110, pp. 12381–12392.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Hatamizadeh, A.; Tang, Y.; Nath, V.; Yang, D.; Myronenko, A.; Landman, B.; Roth, H.R.; Xu, D. UNETR: Transformers for 3D Medical Image Segmentation. In Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1748–1758.
- Yao, J.; Xu, F.; Qian, Z.; Cai, Z. A Forecast-Refinement Neural Network Based on DyConvGRU and U-Net for Radar Echo Extrapolation. IEEE Access 2023, 11, 53249–53261.
- Horé, A.; Ziou, D. Image Quality Metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 2366–2369.
- Unterthiner, T.; van Steenkiste, S.; Kurach, K.; Marinier, R.; Michalski, M.; Gelly, S. Towards Accurate Generative Models of Video: A New Metric & Challenges. arXiv 2018, arXiv:1812.01717. Available online: https://arxiv.org/abs/1812.01717 (accessed on 22 October 2024).
- Zamo, M.; Naveau, P. Estimation of the Continuous Ranked Probability Score with Limited Information and Applications to Ensemble Weather Forecasts. Math. Geosci. 2018, 50, 209–234.
- Shaker, A.M.; Maaz, M.; Rasheed, H.; Khan, S.; Yang, M.-H.; Khan, F.S. UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation. IEEE Trans. Med. Imaging 2024, 43, 3377–3390.
- Le Guen, V.; Thome, N. Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 11471–11481.
- Wang, Y.; Jiang, L.; Yang, M.-H.; Li, L.-J.; Long, M.; Fei-Fei, L. Eidetic 3D LSTM: A Model for Video Prediction and Beyond. In Proceedings of the 7th International Conference on Learning Representations (ICLR), New Orleans, LA, USA, 6–9 May 2019. Available online: https://openreview.net/forum?id=B1lKS2AqtX (accessed on 25 November 2024).
- Bai, C.; Sun, F.; Zhang, J.; Song, Y.; Chen, S. Rainformer: Features Extraction Balanced Network for Radar-Based Precipitation Nowcasting. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
- Gao, Z.; Shi, X.; Wang, H.; Zhu, Y.; Wang, Y.B.; Li, M.; Yeung, D.-Y. Earthformer: Exploring Space-Time Transformers for Earth System Forecasting. In Advances in Neural Information Processing Systems; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022; Volume 35, pp. 25390–25403. Available online: https://proceedings.neurips.cc/paper_files/paper/2022/file/a2affd71d15e8fedffe18d0219f4837a-Paper-Conference.pdf (accessed on 12 November 2024).
- Yan, W.; Zhang, Y.; Abbeel, P.; Srinivas, A. VideoGPT: Video Generation Using VQ-VAE and Transformers. arXiv 2021, arXiv:2104.10157.
- Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 10674–10685.
- Ren, X.; Li, X.; Ren, K.; Song, J.; Xu, Z.; Deng, K.; Wang, X. Deep Learning-Based Weather Prediction: A Survey. Big Data Res. 2021, 23, 100178.
- Gavahi, K.; Foroumandi, E.; Moradkhani, H. A Deep Learning-Based Framework for Multi-Source Precipitation Fusion. Remote Sens. Environ. 2023, 295, 113723.
| Model | Params (M) | FVD ↓ | CRPS ↓ | CSI ↑ | CSI-pool4 ↑ | CSI-pool16 ↑ |
|---|---|---|---|---|---|---|
U-Net | 16.6 | 753.6 | 0.0353 | 0.3593 | 0.4098 | 0.4805 |
ConvLSTM | 14.0 | 659.7 | 0.0332 | 0.4185 | 0.4452 | 0.5135 |
PredRNN | 23.8 | 663.5 | 0.0306 | 0.4080 | 0.4497 | 0.5005 |
PhyDNet | 3.1 | 723.2 | 0.0319 | 0.3940 | 0.4379 | 0.4854 |
E3D-LSTM | 12.9 | 600.1 | 0.0297 | 0.4038 | 0.4492 | 0.4961 |
Rainformer | 19.2 | 760.5 | 0.0357 | 0.3661 | 0.4232 | 0.4738 |
Earthformer | 7.6 | 690.7 | 0.0304 | 0.4419 | 0.4567 | 0.5005 |
PreDiff | 120.7 | 33.05 | 0.0246 | 0.4100 | 0.4624 | 0.6244 |
VideoGPT | 92.2 | 261.6 | 0.0381 | 0.3653 | 0.4349 | 0.5798 |
LDM | 410.3 | 133.0 | 0.0280 | 0.3580 | 0.4022 | 0.5522 |
RaDiT | 75.1 | 118.8 | 0.0263 | 0.4110 | 0.5061 | 0.6394 |
Metrics for Pool 4:
| Model | CSI-M ↑ | CSI-219 ↑ | CSI-181 ↑ | CSI-160 ↑ | CSI-133 ↑ | CSI-74 ↑ | CSI-16 ↑ |
|---|---|---|---|---|---|---|---|
ConvLSTM | 0.4452 | 0.1850 | 0.2864 | 0.3245 | 0.4502 | 0.6694 | 0.7556 |
Earthformer | 0.4567 | 0.1484 | 0.2772 | 0.3341 | 0.4911 | 0.7006 | 0.7892 |
PreDiff | 0.4624 | 0.2065 | 0.3130 | 0.3613 | 0.4807 | 0.6691 | 0.7438 |
VideoGPT | 0.4349 | 0.1691 | 0.2825 | 0.3268 | 0.4482 | 0.6529 | 0.7300 |
LDM | 0.4022 | 0.1439 | 0.2420 | 0.2964 | 0.4171 | 0.6139 | 0.6998 |
RaDiT | 0.5061 | 0.2534 | 0.3934 | 0.4302 | 0.5115 | 0.6989 | 0.7494 |
Metrics for Pool 16:
| Model | CSI-M ↑ | CSI-219 ↑ | CSI-181 ↑ | CSI-160 ↑ | CSI-133 ↑ | CSI-74 ↑ | CSI-16 ↑ |
|---|---|---|---|---|---|---|---|
ConvLSTM | 0.5135 | 0.2651 | 0.3679 | 0.4153 | 0.5408 | 0.7093 | 0.7883 |
Earthformer | 0.5005 | 0.1798 | 0.3207 | 0.3918 | 0.5448 | 0.7304 | 0.8353 |
PreDiff | 0.6244 | 0.3865 | 0.5127 | 0.5757 | 0.6638 | 0.7789 | 0.8289 |
VideoGPT | 0.5798 | 0.3101 | 0.4543 | 0.5211 | 0.6285 | 0.7583 | 0.8065 |
LDM | 0.5522 | 0.2896 | 0.4247 | 0.4987 | 0.5895 | 0.7229 | 0.7876 |
RaDiT | 0.6394 | 0.4152 | 0.5733 | 0.6073 | 0.6784 | 0.7772 | 0.7848 |
| Ablation Experiment | Transformer | DIFF Transformer | GAN | CSI ↑ | CSI-Pool4 ↑ | CSI-Pool16 ↑ |
|---|---|---|---|---|---|---|
| RaDiT w/o GAN | | ✓ | | 0.4096 | 0.4943 | 0.6293 |
| RaDiT w/o DIFF and GAN | ✓ | | | 0.4055 | 0.4884 | 0.6208 |
| RaDiT w/o DIFF | ✓ | | ✓ | 0.4040 | 0.4871 | 0.6192 |
| RaDiT | | ✓ | ✓ | 0.4110 | 0.5061 | 0.6394 |
| Weighting Scheme | CSI ↑ | CSI-Pool4 ↑ | CSI-Pool16 ↑ |
|---|---|---|---|
| ours | 0.4110 | 0.5061 | 0.6394 |
| constant | 0.4019 | 0.5048 | 0.6273 |
| exponent | 0.4012 | 0.5049 | 0.6327 |
| logarithm | 0.4092 | 0.5059 | 0.6384 |
| square root | 0.4021 | 0.5051 | 0.6378 |
| inverse proportionality | 0.4010 | 0.5036 | 0.6267 |
| Model | Params (M) | Iteration Time (s) | CSI ↑ | CSI-Pool4 ↑ | CSI-Pool16 ↑ | POD ↑ | POD-Pool4 ↑ | POD-Pool16 ↑ |
|---|---|---|---|---|---|---|---|---|
ConvLSTM | 14.0 | 10.2 | 0.2908 | 0.2813 | 0.2652 | 0.3530 | 0.2992 | 0.2685 |
PredRNN | 23.8 | 8.1 | 0.3168 | 0.3024 | 0.2886 | 0.3806 | 0.3186 | 0.2915 |
TAU | 14.4 | 2.0 | 0.2208 | 0.1904 | 0.1692 | 0.2469 | 0.1965 | 0.1701 |
U-Net | 16.6 | 4.0 | 0.3388 | 0.3298 | 0.3238 | 0.4135 | 0.3498 | 0.3296 |
UNETR | 39.5 | 6.6 | 0.3116 | 0.3096 | 0.3004 | 0.3852 | 0.3307 | 0.3052 |
UNETR++ | 42.6 | 7.2 | 0.2887 | 0.2860 | 0.2947 | 0.3567 | 0.3070 | 0.3002 |
PhyDNet | 3.1 | 8.6 | 0.3493 | 0.3476 | 0.3360 | 0.4368 | 0.3714 | 0.3418 |
PreDiff | 120.7 | 26.6 | 0.3485 | 0.3609 | 0.3841 | 0.4586 | 0.3990 | 0.3970 |
RaDiT | 75.1 | 9.5 | 0.3651 | 0.3831 | 0.3867 | 0.4926 | 0.4232 | 0.3979 |
| Ablation Experiment | Transformer | DIFF Transformer | GAN | CSI ↑ | CSI-Pool4 ↑ | CSI-Pool16 ↑ |
|---|---|---|---|---|---|---|
| RaDiT w/o GAN | | ✓ | | 0.3453 | 0.3591 | 0.3873 |
| RaDiT w/o DIFF and GAN | ✓ | | | 0.3342 | 0.3535 | 0.3522 |
| RaDiT w/o DIFF | ✓ | | ✓ | 0.3114 | 0.3195 | 0.3409 |
| RaDiT | | ✓ | ✓ | 0.3651 | 0.3831 | 0.3867 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhu, W.; Lu, Z.; Zhang, Y.; Zhao, Z.; Lu, B.; Li, R. RaDiT: A Differential Transformer-Based Hybrid Deep Learning Model for Radar Echo Extrapolation. Remote Sens. 2025, 17, 1976. https://doi.org/10.3390/rs17121976