MSLKSTNet: Multi-Scale Large Kernel Spatiotemporal Prediction Neural Network for Air Temperature Prediction
Abstract
1. Introduction
- We propose MSLKSTNet, a multi-scale large-kernel spatiotemporal prediction neural network built around multi-scale large-kernel spatiotemporal attention units. Each unit splits spatiotemporal attention into a spatial part, which attends to both local and global features, and a temporal part, which attends to both globally smooth and locally prominent evolution. This decomposition is crucial for capturing complex spatiotemporal dynamics;
- We conducted experiments on the MovingMNIST dataset to validate the model’s ability to capture spatiotemporal features, and ablation experiments verified the effectiveness and feasibility of the network. To validate the model on a real temperature prediction task, we performed spatiotemporal prediction on the Temperature dataset. Compared with mainstream spatiotemporal sequence prediction methods, our method achieved the best predictive performance on both datasets.
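The idea of weighting time steps by both a globally smooth and a locally prominent summary can be illustrated with a small sketch. It is only an illustration under stated assumptions: the names `temporal_gate` and `apply_gate` and the fixed scalar weights `w_avg` and `w_max` are hypothetical and stand in for layers that a real network would learn. Average and max pooling are used as the two summaries because the ablation table lists `avg_pool` and `max_pool` variants for the TEA module.

```python
import math


def temporal_gate(frames, w_avg=1.0, w_max=1.0):
    """Return one attention weight in (0, 1) per time step.

    frames: list of T frames, each a list of H rows of W floats.
    w_avg, w_max: hypothetical fixed weights (a real model would learn these).
    """
    gates = []
    for frame in frames:
        values = [v for row in frame for v in row]
        avg = sum(values) / len(values)   # globally smooth summary (average pooling)
        peak = max(values)                # locally prominent summary (max pooling)
        score = w_avg * avg + w_max * peak
        gates.append(1.0 / (1.0 + math.exp(-score)))  # sigmoid gate
    return gates


def apply_gate(frames, gates):
    """Scale every pixel of each frame by that frame's gate."""
    return [
        [[v * g for v in row] for row in frame]
        for frame, g in zip(frames, gates)
    ]


# Toy sequence: an empty frame followed by a uniformly bright frame.
toy_frames = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
toy_gates = temporal_gate(toy_frames)
toy_output = apply_gate(toy_frames, toy_gates)
```

In this toy sequence the bright frame receives a larger gate than the empty one, so it contributes more to the gated output. The shape of the sequence is unchanged.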
2. Related Works
2.1. Model
2.2. Visual Attention Mechanism
3. Methods
3.1. Overall Architecture
3.2. Multi-Scale Spatiotemporal Attention Unit
4. Experiments
4.1. Evaluation Index
4.2. Experiments on MovingMNIST
4.2.1. Experimental Results and Analysis
4.2.2. Ablation Experiments
4.3. Experiments on Temperature
Experimental Results and Analysis
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Properties | Self-Attention | LKA | MSSTA |
---|---|---|---|
Local Receptive Field | ✗ | ✓ | ✓ |
Long-range Dependence | ✓ | ✓ | ✓ |
Multi-scale features | ✗ | ✗ | ✓ |
Temporal evolution attention | ✗ | ✗ | ✓ |
Computational complexity | | | |
Dataset | Training samples | Test samples | Input (C, T, H, W) | Output (C, T′, H, W) |
---|---|---|---|---|
MovingMNIST | 10,000 | 10,000 | (1, 10, 64, 64) | (1, 10, 64, 64) |
Temperature | 64,343 | 8,711 | (1, 12, 128, 128) | (1, 12, 128, 128) |
Method | Conference | MSE ↓ | MAE ↓ | SSIM ↑ |
---|---|---|---|---|
ConvLSTM | NIPS2015 | 103.3 | 182.9 | 0.707 |
PredRNN | NIPS2017 | 56.8 | 126.1 | 0.867 |
PredRNNv2 | PAMI2022 | 48.4 | - | 0.891 |
PredRNN++ | ICML2018 | 46.5 | 106.8 | 0.898 |
MIM | CVPR2019 | 44.2 | 101.1 | 0.910 |
E3D-LSTM | ICLR2018 | 41.3 | 87.2 | 0.920 |
PhyDNet | CVPR2020 | 24.4 | 70.3 | 0.947 |
CrevNet | ICLR2020 | 22.3 | - | 0.949 |
SimVP | CVPR2022 | 23.8 | 68.9 | 0.948 |
SwinLSTM | ICCV2023 | 17.7 | - | 0.962 |
MSLKSTNet | - | 15.4 | 51.0 | 0.966 |
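To put the gap between the two best rows in relative terms, the following sketch computes the fractional MSE reduction of MSLKSTNet with respect to SwinLSTM. The helper name `relative_reduction` is hypothetical, and the two values are copied from the table above.

```python
def relative_reduction(baseline, proposed):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - proposed) / baseline


# MSE values copied from the MovingMNIST results table.
swinlstm_mse = 17.7
mslkstnet_mse = 15.4

mse_reduction = relative_reduction(swinlstm_mse, mslkstnet_mse)
print(f"MSE reduction vs. SwinLSTM: {mse_reduction:.1%}")
```

With these values the reduction is roughly 13%.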
Module | Setting | Method 1 | Method 2 | Method 3 | Method 4 | Method 5 |
---|---|---|---|---|---|---|
MSSA | (3,5,7,9)-(7,dil3) | ✗ | ✓ | ✗ | ✗ | ✗ |
MSSA | (3,5,7,9)-(11,dil2) | ✗ | ✗ | ✓ | ✓ | ✓ |
TEA | avg_pool | ✗ | ✗ | ✗ | ✓ | ✗ |
TEA | avg_pool + max_pool | ✗ | ✗ | ✗ | ✗ | ✓ |
Index | MSE ↓ | 30.481 | 30.345 | 30.119 | 28.823 | 28.682 |
Index | MAE ↓ | 86.138 | 85.903 | 85.266 | 82.396 | 82.049 |
Index | SSIM ↑ | 0.9303 | 0.9307 | 0.9315 | 0.9345 | 0.9350 |
Method | MSE (↓) | MAE (↓) | RMSE (↓) |
---|---|---|---|
ConvLSTM | 2.793 | 1.125 | 1.671 |
PredRNNv2 | 1.944 | 0.912 | 1.394 |
Unet | 2.570 | 1.087 | 1.603 |
SmaAt-UNet | 1.763 | 0.879 | 1.328 |
SimVP | 1.989 | 0.944 | 1.410 |
En-Van-De | 1.719 | 0.867 | 1.311 |
MSLKSTNet | 1.619 | 0.832 | 1.272 |
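As a quick sanity check on the table above, the sketch below tests whether each reported RMSE equals the square root of the reported MSE to three decimal places. This assumes both metrics were aggregated over the same set of values. The helper name `rmse_matches` and the tolerance `tol` are hypothetical, and the numbers are copied from the table.

```python
import math

# (MSE, RMSE) pairs copied from the Temperature results table.
results = {
    "ConvLSTM": (2.793, 1.671),
    "PredRNNv2": (1.944, 1.394),
    "Unet": (2.570, 1.603),
    "SmaAt-UNet": (1.763, 1.328),
    "SimVP": (1.989, 1.410),
    "En-Van-De": (1.719, 1.311),
    "MSLKSTNet": (1.619, 1.272),
}


def rmse_matches(mse, rmse, tol=1e-3):
    """True if rmse agrees with sqrt(mse) within a rounding tolerance."""
    return abs(math.sqrt(mse) - rmse) <= tol


consistent = {name: rmse_matches(mse, rmse) for name, (mse, rmse) in results.items()}
for name, ok in consistent.items():
    print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

Every row passes this check, so the RMSE column carries the same ranking information as the MSE column.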
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gao, F.; Fei, J.; Ye, Y.; Liu, C. MSLKSTNet: Multi-Scale Large Kernel Spatiotemporal Prediction Neural Network for Air Temperature Prediction. Atmosphere 2024, 15, 1114. https://doi.org/10.3390/atmos15091114