MPSTAN: Metapopulation-Based Spatio–Temporal Attention Network for Epidemic Forecasting
Abstract
1. Introduction
- (1) Most existing methods fail to make full use of appropriate epidemiological domain knowledge to guide model training. They adopt domain knowledge that either ignores inter-patch interactions [29,30] or requires additional population mobility data to construct them [32]. The latter approach relies heavily on population mobility data, yet collecting mobility data between patches is inherently difficult and error-prone, which can also bias the model.
- (2) Most existing domain-knowledge-based models do not analyze in detail how domain knowledge contributes to model training. Most methods apply epidemiological knowledge only to the loss function [30,32], and some also apply it to model construction [31]; however, they do not separately examine the effectiveness of domain knowledge in model construction and in the loss function for epidemic forecasting.
- (1) We propose a metapopulation-based spatio–temporal attention network for epidemic forecasting. Specifically, we propose a metapopulation epidemic model whose parameters are adaptively learned by neural networks and then used to guide network training (a minimal sketch of such an update is given after this list). This spatio–temporal model does not rely on population mobility data, enabling it to accurately predict epidemic transmission.
- (2) We design multiple parameter generators to learn the physical model parameters for intra- and inter-patch dynamics separately. Because different parameters carry different information, each generator is fed an embedding containing the corresponding information to learn its physical model parameters.
- (3) We reveal the significance of epidemiological domain knowledge in spatio–temporal epidemic forecasting by comparing different ways of incorporating it into neural networks. We also emphasize the importance of selecting domain knowledge appropriate to the actual transmission scenario.
- (4) We conduct extensive experiments to validate the performance of MPSTAN on two datasets with different epidemiological evolutionary trends. The results show that MPSTAN achieves accurate short- and long-term forecasting and generalizes across different epidemic evolutions.
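To make the metapopulation idea concrete, the sketch below shows one possible discrete-time metapopulation SIR update in which the transmission rate, recovery rate, and inter-patch coupling weights are produced by neural parameter generators rather than fixed a priori. It is a minimal illustration under assumed names and shapes (`ParamGenerator`, `mp_sir_step`, the coupling matrix `W`), not the paper's exact formulation.

```python
# Hypothetical sketch of a metapopulation SIR (MP-SIR) step with learned parameters.
# Names and shapes are illustrative; the paper's exact formulation may differ.
import torch
import torch.nn as nn

class ParamGenerator(nn.Module):
    """Maps a patch embedding to a positive, bounded epidemiological parameter."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, 1)

    def forward(self, h):                                 # h: (N, dim) patch embeddings
        return torch.sigmoid(self.proj(h)).squeeze(-1)    # (N,) values in (0, 1)

def mp_sir_step(S, I, R, N, beta, gamma, W):
    """One discrete-time metapopulation SIR update.
    S, I, R, N : (num_patches,) compartment sizes and populations
    beta, gamma: (num_patches,) learned transmission / recovery rates
    W          : (num_patches, num_patches) learned inter-patch coupling (rows sum to 1)
    """
    I_eff = W @ I                     # infections "seen" by each patch through coupling
    new_inf = beta * S * I_eff / N    # susceptible -> infected
    new_rec = gamma * I               # infected -> recovered
    return S - new_inf, I + new_inf - new_rec, R + new_rec

# Toy usage with 3 patches and random embeddings.
num_patches, dim = 3, 16
h = torch.randn(num_patches, dim)
beta = ParamGenerator(dim)(h)
gamma = ParamGenerator(dim)(h)
W = torch.softmax(torch.randn(num_patches, num_patches), dim=-1)
N = torch.tensor([1e6, 5e5, 2e6])
S, I, R = N - 1e3, torch.full((num_patches,), 1e3), torch.zeros(num_patches)
S, I, R = mp_sir_step(S, I, R, N, beta, gamma, W)
```

Because the coupling matrix is learned rather than derived from mobility records, this style of update avoids the dependence on population mobility data noted in issue (1).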
2. Related Work
3. Methodology
3.1. Problem Description
3.2. Model Overview
3.3. The Spatio–Temporal Module
3.3.1. Temporal Embedding
3.3.2. Spatial Embedding
3.4. The Epidemiology Module
3.5. The Multiple Parameter Generator Module
3.6. The Information Fusion Module
3.7. Output Layer
3.8. Optimization
4. Experiments
4.1. Datasets
4.2. Experimental Details
4.2.1. Baselines
- (1) SIR [5]: The SIR model uses three differential equations to describe the changes in the numbers of susceptible, infected, and recovered cases within a single patch (see the sketch after this list).
- (2) ARIMA [35]: The auto-regressive integrated moving average model is widely used for time-series forecasting. We use ARIMA to predict daily active cases for each patch.
- (3) GRU [12]: The gated recurrent unit is a variant of the RNN that implements a gating mechanism with fewer parameters than the LSTM. We use a GRU for each patch separately to predict daily active cases.
- (4) GraphWaveNet [23]: GraphWaveNet combines an adaptive adjacency matrix, diffusion convolution, and a gated TCN to capture spatio–temporal dependencies.
- (5) STGODE [46]: STGODE combines neural ODEs with GCNs in a spatio–temporal tensor model to achieve unified modeling of spatio–temporal dependencies.
- (6) CovidGNN [18]: CovidGNN uses the time series of each patch as node features and predicts epidemics using a GCN with skip connections.
- (7) ColaGNN [19]: ColaGNN designs a dynamic adjacency matrix using an attention mechanism and adopts a multi-scale dilated convolutional layer for long- and short-term epidemic forecasting.
- (8) STAN [30]: STAN uses the gravity model to construct the network and applies epidemiological domain knowledge to the loss function, building a dynamics-constraint loss based on the SIR model.
- (9) PatchTST [52]: Built on the transformer, PatchTST treats input features as independent channels and uses a patching mechanism to extract local semantic information from time-series data.
- (10) Crossformer [53]: Built on the transformer, Crossformer introduces a two-stage attention mechanism to capture dependencies across both the time and feature dimensions.
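For reference, the snippet below is a minimal single-patch, forward-Euler discretization of the classic Kermack–McKendrick SIR equations [5] underlying the first baseline. The parameter values and the absence of any fitting procedure are illustrative assumptions; the actual baseline calibrates its rates to the observed case curves.

```python
# Minimal single-patch SIR baseline: forward-Euler discretization of the
# classic Kermack-McKendrick equations. beta/gamma are placeholders that a
# real baseline would fit to the observed case curve of each patch.
def simulate_sir(S0, I0, R0, beta, gamma, days):
    N = S0 + I0 + R0
    S, I, R = float(S0), float(I0), float(R0)
    trajectory = []
    for _ in range(days):
        new_inf = beta * S * I / N      # dS/dt = -beta * S * I / N
        new_rec = gamma * I             # dR/dt =  gamma * I
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        trajectory.append(I)            # active cases (the forecast target)
    return trajectory

# Example: a 10-day forecast for one patch with illustrative parameter values.
print(simulate_sir(S0=9.9e5, I0=1e4, R0=0, beta=0.3, gamma=0.1, days=10))
```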
4.2.2. Settings
4.2.3. Evaluation Metrics
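The result tables report MAE, RMSE, MAPE, PCC, and CCC. As a reference, the snippet below computes these metrics using their standard definitions; the paper's exact formulations (e.g., whether averaging is done over patches or time steps first) may differ.

```python
# Standard definitions of the reported metrics; a sketch, not the paper's exact code.
import numpy as np

def metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    mape = np.mean(np.abs(err) / np.abs(y_true)) * 100   # assumes y_true != 0
    pcc = np.corrcoef(y_true, y_pred)[0, 1]               # Pearson correlation coefficient
    # Lin's concordance correlation coefficient (CCC)
    ccc = (2 * np.cov(y_true, y_pred, bias=True)[0, 1]
           / (y_true.var() + y_pred.var() + (y_true.mean() - y_pred.mean()) ** 2))
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "PCC": pcc, "CCC": ccc}

print(metrics([100, 120, 150], [110, 118, 160]))
```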
4.3. Forecasting Performance
4.4. Ablation Study
- (1) MPSTAN w/o Phy-All: Remove epidemiological domain knowledge from both model construction and the loss function; only the spatio–temporal module is used for epidemic forecasting.
- (2) MPSTAN w/o Phy-Loss: Remove epidemiological domain knowledge from the loss function; the knowledge is used only in model construction.
- (3) MPSTAN w/o Phy-Model: Remove epidemiological domain knowledge from model construction; the physical model parameters are predicted in the output layer, and the knowledge is used only in the loss function (see the loss sketch after this list).
- (4) MPSTAN w/o Mobility: Incorporate epidemiological domain knowledge that does not consider population mobility, i.e., use the SIR model instead of the MP-SIR model.
- (5) MPSTAN w/o MPG: Remove the multiple parameter generators (MPGs); all physical model parameters are generated by a single parameter generator fed with embeddings containing spatio–temporal information.
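To clarify the distinction probed by the Phy-Loss and Phy-Model variants, the sketch below shows one common way of adding an epidemiological-consistency term to the training loss: the network's predictions are penalized for deviating from the case curve implied by the (MP-)SIR dynamics with the learned parameters. The function names, the `phys_pred` input, and the weighting factor `lambda_phys` are assumptions for illustration only.

```python
# Hypothetical combined loss: data-fitting term + epidemiological-consistency term.
# `phys_pred` stands for the case curve propagated by the (MP-)SIR equations using
# the learned parameters; `lambda_phys` is an illustrative weighting factor.
import torch
import torch.nn.functional as F

def combined_loss(nn_pred, phys_pred, target, lambda_phys=0.5):
    data_loss = F.mse_loss(nn_pred, target)        # fit the observed cases
    phys_loss = F.mse_loss(nn_pred, phys_pred)     # stay close to the epidemic dynamics
    return data_loss + lambda_phys * phys_loss

# Under this reading, "w/o Phy-Loss" corresponds to lambda_phys = 0 (knowledge only in
# the architecture), while "w/o Phy-Model" keeps this loss but drops the epidemiology
# module from the network itself.
nn_pred, phys_pred, target = torch.rand(5), torch.rand(5), torch.rand(5)
print(combined_loss(nn_pred, phys_pred, target))
```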
4.5. Effect of Hyperparameters
4.6. Model Complexity
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- Kaye, A.D.; Okeagu, C.N.; Pham, A.D.; Silva, R.A.; Hurley, J.J.; Arron, B.L.; Sarfraz, N.; Lee, H.N.; Ghali, G.E.; Gamble, J.W.; et al. Economic impact of COVID-19 pandemic on healthcare facilities and systems: International perspectives. Best Pract. Res. Clin. Anaesthesiol. 2021, 35, 293–306. [Google Scholar] [CrossRef]
- Zeroual, A.; Harrou, F.; Dairi, A.; Sun, Y. Deep learning methods for forecasting COVID-19 time-Series data: A Comparative study. Chaos Solitons Fractals 2020, 140, 110121. [Google Scholar] [CrossRef] [PubMed]
- Yu, J.; Tan, M.; Zhang, H.; Rui, Y.; Tao, D. Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 563–578. [Google Scholar] [CrossRef]
- Yu, J.; Li, J.; Yu, Z.; Huang, Q. Multimodal transformer with multi-view visual representation for image captioning. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 4467–4480. [Google Scholar] [CrossRef]
- Kermack, W.O.; McKendrick, A.G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. London. Ser. A Contain. Pap. A Math. Phys. Character 1927, 115, 700–721. [Google Scholar]
- Efimov, D.; Ushirobira, R. On an interval prediction of COVID-19 development based on a SEIR epidemic model. Annu. Rev. Control 2021, 51, 477–487. [Google Scholar] [CrossRef]
- Liao, Z.; Lan, P.; Liao, Z.; Zhang, Y.; Liu, S. TW-SIR: Time-window based SIR for COVID-19 forecasts. Sci. Rep. 2020, 10, 22454. [Google Scholar] [CrossRef] [PubMed]
- López, L.; Rodo, X. A modified SEIR model to predict the COVID-19 outbreak in Spain and Italy: Simulating control scenarios and multi-scale epidemics. Results Phys. 2021, 21, 103746. [Google Scholar] [CrossRef]
- Alabdulrazzaq, H.; Alenezi, M.N.; Rawajfih, Y.; Alghannam, B.A.; Al-Hassan, A.A.; Al-Anzi, F.S. On the accuracy of ARIMA based prediction of COVID-19 spread. Results Phys. 2021, 27, 104509. [Google Scholar] [CrossRef]
- Parbat, D.; Chakraborty, M. A python based support vector regression model for prediction of COVID-19 cases in India. Chaos Solitons Fractals 2020, 138, 109942. [Google Scholar] [CrossRef]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Chen, R.T.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. Adv. Neural Inf. Process. Syst. 2018, 31. [Google Scholar]
- Hazarie, S.; Soriano-Paños, D.; Arenas, A.; Gómez-Gardeñes, J.; Ghoshal, G. Interplay between population density and mobility in determining the spread of epidemics in cities. Commun. Phys. 2021, 4, 191. [Google Scholar] [CrossRef]
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017. [Google Scholar]
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph Attention Networks. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Kapoor, A.; Ben, X.; Liu, L.; Perozzi, B.; Barnes, M.; Blais, M.; O’Banion, S. Examining COVID-19 forecasting using spatio-temporal graph neural networks. arXiv 2020, arXiv:2007.03113. [Google Scholar]
- Deng, S.; Wang, S.; Rangwala, H.; Wang, L.; Ning, Y. Cola-GNN: Cross-location attention based graph neural networks for long-term ILI prediction. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual, 19–23 October 2020; pp. 245–254. [Google Scholar]
- Zhang, H.; Xu, Y.; Liu, L.; Lu, X.; Lin, X.; Yan, Z.; Cui, L.; Miao, C. Multi-modal Information Fusion-powered Regional COVID-19 Epidemic Forecasting. In Proceedings of the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 9–12 December 2021; pp. 779–784. [Google Scholar]
- Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 3634–3640. [Google Scholar]
- Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Zhang, C. Graph wavenet for deep spatial-temporal graph modeling. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, 10–16 August 2019; pp. 1907–1913. [Google Scholar]
- Dong, E.; Du, H.; Gardner, L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 2020, 20, 533–534. [Google Scholar] [CrossRef] [PubMed]
- Marcilly, R. “Japan LIVE Dashboard” for COVID-19: A Scalable Solution to Monitor Real-Time and Regional-Level Epidemic Case Data. In Context Sensitive Health Informatics: The Role of Informatics in Global Pandemics; IOS Press: Amsterdam, The Netherlands, 2021; p. 21. [Google Scholar]
- Adiga, A.; Lewis, B.; Levin, S.; Marathe, M.V.; Poor, H.V.; Ravi, S.; Rosenkrantz, D.J.; Stearns, R.E.; Venkatramanan, S.; Vullikanti, A.; et al. AI Techniques for Forecasting Epidemic Dynamics: Theory and Practice. In Artificial Intelligence in COVID-19; Springer: Berlin/Heidelberg, Germany, 2022; pp. 193–228. [Google Scholar]
- Kamalov, F.; Rajab, K.; Cherukuri, A.; Elnagar, A.; Safaraliev, M. Deep Learning for COVID-19 Forecasting: State-of-the-art review. Neurocomputing 2022, 511, 142–154. [Google Scholar] [CrossRef] [PubMed]
- Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331. [Google Scholar] [CrossRef]
- La Gatta, V.; Moscato, V.; Postiglione, M.; Sperli, G. An epidemiological neural network exploiting dynamic graph structured data applied to the covid-19 outbreak. IEEE Trans. Big Data 2020, 7, 45–55. [Google Scholar] [CrossRef]
- Gao, J.; Sharma, R.; Qian, C.; Glass, L.M.; Spaeder, J.; Romberg, J.; Sun, J.; Xiao, C. STAN: Spatio-temporal attention network for pandemic prediction using real-world evidence. J. Am. Med. Inform. Assoc. 2021, 28, 733–743. [Google Scholar] [CrossRef]
- Wang, L.; Adiga, A.; Chen, J.; Sadilek, A.; Venkatramanan, S.; Marathe, M. Causalgnn: Causal-based graph neural networks for spatio-temporal epidemic forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; Volume 36, pp. 12191–12199. [Google Scholar]
- Cao, Q.; Jiang, R.; Yang, C.; Fan, Z.; Song, X.; Shibasaki, R. MepoGNN: Metapopulation Epidemic Forecasting with Graph Neural Networks. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Grenoble, France, 19–23 September 2022. [Google Scholar]
- Moein, S.; Nickaeen, N.; Roointan, A.; Borhani, N.; Heidary, Z.; Javanmard, S.H.; Ghaisari, J.; Gheisari, Y. Inefficiency of SIR models in forecasting COVID-19 epidemic: A case study of Isfahan. Sci. Rep. 2021, 11, 4725. [Google Scholar] [CrossRef]
- Cooper, I.; Mondal, A.; Antonopoulos, C.G. A SIR model assumption for the spread of COVID-19 in different communities. Chaos Solitons Fractals 2020, 139, 110057. [Google Scholar] [CrossRef]
- Benvenuto, D.; Giovanetti, M.; Vassallo, L.; Angeletti, S.; Ciccozzi, M. Application of the ARIMA model on the COVID-19 epidemic dataset. Data Brief 2020, 29, 105340. [Google Scholar] [CrossRef]
- Arora, P.; Kumar, H.; Panigrahi, B.K. Prediction and analysis of COVID-19 positive cases using deep learning models: A descriptive case study of India. Chaos Solitons Fractals 2020, 139, 110017. [Google Scholar] [CrossRef] [PubMed]
- Shahid, F.; Zameer, A.; Muneeb, M. Predictions for COVID-19 with deep learning models of LSTM, GRU and Bi-LSTM. Chaos Solitons Fractals 2020, 140, 110212. [Google Scholar] [CrossRef] [PubMed]
- Wang, L.; Chen, J.; Marathe, M. DEFSI: Deep learning based epidemic forecasting with synthetic information. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 9607–9612. [Google Scholar]
- Li, L.; Jiang, Y.; Huang, B. Long-term prediction for temporal propagation of seasonal influenza using Transformer-based model. J. Biomed. Inform. 2021, 122, 103894. [Google Scholar] [CrossRef]
- Jung, S.; Moon, J.; Park, S.; Hwang, E. Self-Attention-Based Deep Learning Network for Regional Influenza Forecasting. IEEE J. Biomed. Health Inform. 2021, 26, 922–933. [Google Scholar] [CrossRef] [PubMed]
- Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Philip, S.Y. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4–24. [Google Scholar] [CrossRef]
- Bui, K.H.N.; Cho, J.; Yi, H. Spatial-temporal graph neural network for traffic forecasting: An overview and open research issues. Appl. Intell. 2022, 52, 2763–2774. [Google Scholar] [CrossRef]
- Panagopoulos, G.; Nikolentzos, G.; Vazirgiannis, M. Transfer graph neural networks for pandemic forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 4838–4845. [Google Scholar]
- Tomy, A.; Razzanelli, M.; Di Lauro, F.; Rus, D.; Della Santina, C. Estimating the state of epidemics spreading with graph neural networks. Nonlinear Dyn. 2022, 109, 249–263. [Google Scholar] [CrossRef] [PubMed]
- Chen, P.; Fu, X.; Wang, X. A graph convolutional stacked bidirectional unidirectional-LSTM neural network for metro ridership prediction. IEEE Trans. Intell. Transp. Syst. 2021, 23, 6950–6962. [Google Scholar] [CrossRef]
- Fang, Z.; Long, Q.; Song, G.; Xie, K. Spatial-temporal graph ode networks for traffic flow forecasting. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Singapore, 14–18 August 2021; pp. 364–373. [Google Scholar]
- Wang, H.; Tao, G.; Ma, J.; Jia, S.; Chi, L.; Yang, H.; Zhao, Z.; Tao, J. Predicting the epidemics trend of COVID-19 using epidemiological-based generative adversarial networks. IEEE J. Sel. Top. Signal Process. 2022, 16, 276–288. [Google Scholar] [CrossRef]
- Truscott, J.; Ferguson, N.M. Evaluating the Adequacy of Gravity Models as a Description of Human Mobility for Epidemic Modelling. PLoS Comput. Biol. 2012, 8. [Google Scholar] [CrossRef]
- Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
- Jarynowski, A.; Belik, V. Access to healthcare as an important moderating variable for understanding the geography of COVID-19 outcomes-preliminary insights from Poland. Eur. J. Transl. Clin. Med. 2022, 5, 5–15. [Google Scholar] [CrossRef]
- Jarynowski, A.; Belik, V. Narrative review of infectious disease spread models developed in Poland during COVID-19 pandemic. In Proceedings of the XLII Max Born Symposium, Wroclaw, Poland, 14–16 September 2023. [Google Scholar]
- Nie, Y.; Nguyen, N.H.; Sinthong, P.; Kalagnanam, J. A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Zhang, Y.; Yan, J. Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Aktay, A.; Bavadekar, S.; Cossoul, G.; Davis, J.; Desfontaines, D.; Fabrikant, A.; Gabrilovich, E.; Gadepalli, K.; Gipson, B.; Guevara, M.; et al. Google COVID-19 community mobility reports: Anonymization process description (version 1.1). arXiv 2020, arXiv:2004.04145. [Google Scholar]
Dataset | Data Level | Data Size | Time Range | Min | Max | Mean | Std |
---|---|---|---|---|---|---|---|
US | State-level | | 2020.5.1–2020.12.31 | 0 | 838,855 | 40,438 | 75,691 |
Japan | Prefecture-level | | 2022.1.15–2022.6.14 | 104 | 198,011 | 11,458 | 21,188 |
The US dataset

Model | MAE (T = 5) | RMSE (T = 5) | MAPE (T = 5) | PCC (T = 5) | CCC (T = 5) | MAE (T = 10) | RMSE (T = 10) | MAPE (T = 10) | PCC (T = 10) | CCC (T = 10) |
---|---|---|---|---|---|---|---|---|---|---|
SIR | 5660 | 15,656 | 25.81% | 99.16% | 99.14% | 10,608 | 33,766 | 41.94% | 96.23% | 96.09% |
ARIMA | 6475 | 22,095 | 14.01% | 98.33% | 98.31% | 11,489 | 44,779 | 26.36% | 93.66% | 93.39% |
GRU | 18,348 | 32,950 | 21.88% | 97.88% | 95.63% | 26,749 | 47,328 | 32.52% | 95.66% | 90.39% |
GraphWaveNet | 13,875 | 22,559 | 17.85% | 99.46% | 97.82% | 9526 | 15,673 | 16.64% | 99.21% | 99.09% |
STGODE | 70,454 | 116,865 | 83.21% | 91.95% | 64.48% | 53,693 | 83,823 | 63.51% | 87.89% | 62.19% |
CovidGNN | 9453 | 21,612 | 9.91% | 99.07% | 98.17% | 16,052 | 37,586 | 15.03% | 96.87% | 94.00% |
ColaGNN | 66,005 | 111,622 | 77.57% | 53.54% | 41.79% | 51,822 | 91,680 | 57.61% | 80.18% | 62.46% |
STAN | 10,024 | 19,214 | 17.98% | 98.70% | 98.65% | 13,993 | 25,963 | 19.38% | 97.80% | 97.49% |
PatchTST | 5086 | 12,119 | 7.62% | 99.49% | 99.46% | 8033 | 17,283 | 11.70% | 99.02% | 98.89% |
Crossformer | 20,469 | 41,348 | 21.99% | 95.88% | 93.09% | 24,428 | 47,851 | 26.06% | 94.64% | 90.32% |
MPSTAN | 3960 | 8255 | 6.38% | 99.80% | 99.75% | 7711 | 14,463 | 10.73% | 99.55% | 99.20% |
Improvement | 22.14% | 31.88% | 16.27% | 0.31% | 0.29% | 4.01% | 7.72% | 8.29% | 0.34% | 0.11% |

Model | MAE (T = 15) | RMSE (T = 15) | MAPE (T = 15) | PCC (T = 15) | CCC (T = 15) | MAE (T = 20) | RMSE (T = 20) | MAPE (T = 20) | PCC (T = 20) | CCC (T = 20) |
---|---|---|---|---|---|---|---|---|---|---|
SIR | 16,573 | 60,984 | 57.38% | 89.04% | 88.26% | 23,963 | 101,612 | 76.12% | 76.44% | 73.21% |
ARIMA | 17,151 | 74,295 | 43.01% | 84.86% | 83.59% | 24,849 | 121,875 | 65.20% | 69.78% | 65.31% |
GRU | 33,968 | 59,804 | 41.21% | 92.67% | 83.94% | 38,202 | 65,762 | 45.54% | 90.44% | 80.61% |
GraphWaveNet | 47,020 | 76,735 | 51.64% | 90.76% | 72.64% | 48,154 | 82,098 | 51.19% | 84.43% | 68.47% |
STGODE | 72,622 | 117,611 | 107.65% | 82.26% | 50.75% | 72,132 | 109,536 | 84.84% | 85.16% | 42.81% |
CovidGNN | 21,660 | 48,169 | 19.85% | 94.68% | 89.71% | 26,985 | 57,085 | 24.57% | 92.64% | 84.95% |
ColaGNN | 33,419 | 55,424 | 41.36% | 92.43% | 79.63% | 47,837 | 77,656 | 52.48% | 92.49% | 70.90% |
STAN | 16,784 | 33,383 | 20.78% | 96.43% | 95.49% | 18,679 | 36,180 | 26.81% | 96.09% | 94.52% |
PatchTST | 10,120 | 19,986 | 15.39% | 98.79% | 98.48% | 14,178 | 28,244 | 20.10% | 97.67% | 96.84% |
Crossformer | 27,084 | 50,297 | 30.01% | 94.69% | 88.96% | 28,668 | 51,294 | 32.64% | 95.10% | 88.37% |
MPSTAN | 10,148 | 18,460 | 14.68% | 99.25% | 98.68% | 12,728 | 22,923 | 18.68% | 98.81% | 97.91% |
Improvement | - | 7.64% | 4.61% | 0.47% | 0.20% | 10.23% | 18.84% | 7.06% | 1.17% | 1.10% |
The Japanese dataset

Model | MAE (T = 5) | RMSE (T = 5) | MAPE (T = 5) | PCC (T = 5) | CCC (T = 5) | MAE (T = 10) | RMSE (T = 10) | MAPE (T = 10) | PCC (T = 10) | CCC (T = 10) |
---|---|---|---|---|---|---|---|---|---|---|
SIR | 896 | 1572 | 18.89% | 99.11% | 97.91% | 1703 | 2874 | 39.38% | 97.73% | 93.67% |
ARIMA | 1113 | 3137 | 24.33% | 91.74% | 91.37% | 2433 | 8719 | 59.59% | 63.42% | 57.19% |
GRU | 2156 | 3955 | 58.91% | 94.06% | 89.02% | 2702 | 5130 | 69.49% | 92.33% | 83.80% |
GraphWaveNet | 2048 | 4490 | 39.06% | 94.93% | 87.35% | 2744 | 6447 | 48.88% | 92.64% | 79.24% |
STGODE | 5420 | 13,057 | 103.14% | 83.94% | 57.16% | 8208 | 18,396 | 158.08% | 85.00% | 50.91% |
CovidGNN | 1042 | 2305 | 18.06% | 97.27% | 95.71% | 1887 | 3942 | 39.40% | 95.77% | 89.48% |
ColaGNN | 2566 | 5746 | 50.29% | 92.17% | 82.16% | 5294 | 10,402 | 101.50% | 86.60% | 63.78% |
STAN | 1070 | 2400 | 22.97% | 95.87% | 94.82% | 1623 | 3165 | 34.38% | 94.80% | 91.97% |
PatchTST | 828 | 2987 | 15.90% | 92.33% | 91.56% | 1324 | 2608 | 31.64% | 95.09% | 94.09% |
Crossformer | 1732 | 3826 | 34.91% | 94.82% | 89.70% | 2741 | 6161 | 58.48% | 88.98% | 78.61% |
MPSTAN | 1016 | 2311 | 16.91% | 96.74% | 95.60% | 1356 | 3016 | 24.34% | 93.38% | 92.27% |
Improvement | - | - | - | - | - | - | - | - | - | - |

Model | MAE (T = 15) | RMSE (T = 15) | MAPE (T = 15) | PCC (T = 15) | CCC (T = 15) | MAE (T = 20) | RMSE (T = 20) | MAPE (T = 20) | PCC (T = 20) | CCC (T = 20) |
---|---|---|---|---|---|---|---|---|---|---|
SIR | 2632 | 4373 | 66.60% | 95.22% | 87.05% | 3515 | 5883 | 92.93% | 92.08% | 79.20% |
ARIMA | 3443 | 7715 | 86.16% | 65.62% | 61.39% | 3757 | 7513 | 130.90% | 72.79% | 66.56% |
GRU | 2124 | 3758 | 59.84% | 88.58% | 87.70% | 2977 | 5343 | 68.13% | 71.75% | 70.72% |
GraphWaveNet | 2828 | 6520 | 49.39% | 93.62% | 79.34% | 2773 | 6547 | 46.11% | 92.96% | 79.38% |
STGODE | 10,330 | 23,345 | 195.76% | 82.22% | 38.62% | 12,156 | 27,407 | 221.51% | 83.58% | 33.33% |
CovidGNN | 2988 | 6515 | 66.73% | 90.20% | 77.42% | 3990 | 8805 | 94.82% | 84.97% | 67.12% |
ColaGNN | 4192 | 8688 | 93.21% | 84.31% | 67.68% | 7195 | 15,400 | 140.32% | 84.30% | 50.40% |
STAN | 2026 | 3887 | 51.03% | 93.86% | 88.92% | 2804 | 5238 | 72.10% | 90.59% | 82.24% |
PatchTST | 1654 | 2984 | 40.45% | 95.61% | 92.80% | 2321 | 5102 | 56.56% | 87.05% | 81.40% |
Crossformer | 3575 | 7896 | 84.07% | 82.25% | 69.05% | 5237 | 14,458 | 115.39% | 71.97% | 47.94% |
MPSTAN | 1465 | 3104 | 28.29% | 91.84% | 91.29% | 1854 | 4014 | 34.67% | 85.78% | 84.97% |
Improvement | 11.43% | - | 30.06% | - | - | 20.12% | 21.32% | 24.81% | - | 3.32% |
The US dataset

Model | MAE (T = 5) | RMSE (T = 5) | MAPE (T = 5) | PCC (T = 5) | CCC (T = 5) | MAE (T = 10) | RMSE (T = 10) | MAPE (T = 10) | PCC (T = 10) | CCC (T = 10) |
---|---|---|---|---|---|---|---|---|---|---|
MPSTAN w/o Phy-All | 14,865 | 34,756 | 10.96% | 96.19% | 95.18% | 22,911 | 54,185 | 17.38% | 91.55% | 86.63% |
MPSTAN w/o Phy-Loss | 18,908 | 39,201 | 14.62% | 94.53% | 92.97% | 15,201 | 27,700 | 16.09% | 98.53% | 96.57% |
MPSTAN w/o Phy-Model | 19,002 | 45,127 | 13.04% | 94.09% | 90.52% | 25,372 | 64,364 | 18.17% | 86.59% | 81.28% |
MPSTAN w/o Mobility | 5030 | 9845 | 7.09% | 99.78% | 99.65% | 8147 | 14,895 | 11.17% | 99.56% | 99.16% |
MPSTAN w/o MPG | 4399 | 9033 | 6.71% | 99.77% | 99.70% | 7640 | 14,456 | 10.70% | 99.55% | 99.21% |
MPSTAN | 3960 | 8255 | 6.38% | 99.80% | 99.75% | 7711 | 14,463 | 10.73% | 99.55% | 99.20% |

Model | MAE (T = 15) | RMSE (T = 15) | MAPE (T = 15) | PCC (T = 15) | CCC (T = 15) | MAE (T = 20) | RMSE (T = 20) | MAPE (T = 20) | PCC (T = 20) | CCC (T = 20) |
---|---|---|---|---|---|---|---|---|---|---|
MPSTAN w/o Phy-All | 22,876 | 58,160 | 19.10% | 88.68% | 85.34% | 27,659 | 60,632 | 24.83% | 89.30% | 83.16% |
MPSTAN w/o Phy-Loss | 18,526 | 32,033 | 20.19% | 99.15% | 95.58% | 22,138 | 37,753 | 24.06% | 97.54% | 93.59% |
MPSTAN w/o Phy-Model | 27,509 | 63,056 | 21.84% | 88.68% | 80.58% | 27,425 | 61,194 | 24.36% | 88.96% | 82.69% |
MPSTAN w/o Mobility | 11,054 | 20,240 | 15.33% | 99.19% | 98.39% | 11,859 | 22,477 | 18.37% | 98.71% | 98.02% |
MPSTAN w/o MPG | 10,441 | 18,984 | 14.92% | 99.25% | 98.59% | 13,064 | 23,702 | 18.98% | 98.87% | 97.75% |
MPSTAN | 10,148 | 18,460 | 14.68% | 99.25% | 98.68% | 12,728 | 22,923 | 18.68% | 98.81% | 97.91% |

The Japanese dataset

Model | MAE (T = 5) | RMSE (T = 5) | MAPE (T = 5) | PCC (T = 5) | CCC (T = 5) | MAE (T = 10) | RMSE (T = 10) | MAPE (T = 10) | PCC (T = 10) | CCC (T = 10) |
---|---|---|---|---|---|---|---|---|---|---|
MPSTAN w/o Phy-All | 3326 | 10,410 | 26.59% | 92.27% | 66.80% | 3201 | 9632 | 29.52% | 92.06% | 68.65% |
MPSTAN w/o Phy-Loss | 928 | 2024 | 15.81% | 96.05% | 95.90% | 1196 | 2620 | 22.31% | 93.70% | 93.50% |
MPSTAN w/o Phy-Model | 3674 | 11,714 | 28.11% | 91.84% | 62.93% | 3896 | 10,762 | 45.96% | 91.48% | 64.93% |
MPSTAN w/o Mobility | 1142 | 2309 | 21.93% | 98.46% | 95.60% | 1273 | 2633 | 27.24% | 96.77% | 94.33% |
MPSTAN w/o MPG | 1047 | 2339 | 19.18% | 96.52% | 95.44% | 1216 | 2630 | 23.49% | 94.52% | 93.90% |
MPSTAN | 1016 | 2311 | 16.91% | 96.74% | 95.60% | 1356 | 3016 | 24.34% | 93.38% | 92.27% |

Model | MAE (T = 15) | RMSE (T = 15) | MAPE (T = 15) | PCC (T = 15) | CCC (T = 15) | MAE (T = 20) | RMSE (T = 20) | MAPE (T = 20) | PCC (T = 20) | CCC (T = 20) |
---|---|---|---|---|---|---|---|---|---|---|
MPSTAN w/o Phy-All | 3435 | 9796 | 35.25% | 91.37% | 67.63% | 3054 | 7736 | 41.09% | 90.29% | 74.42% |
MPSTAN w/o Phy-Loss | 1774 | 3941 | 32.28% | 84.22% | 83.44% | 1928 | 4383 | 41.30% | 86.70% | 84.83% |
MPSTAN w/o Phy-Model | 3897 | 11,099 | 42.38% | 90.35% | 63.40% | 4273 | 10,664 | 69.34% | 86.83% | 63.25% |
MPSTAN w/o Mobility | 1100 | 2271 | 24.42% | 96.25% | 95.31% | 1319 | 2958 | 26.72% | 93.80% | 92.42% |
MPSTAN w/o MPG | 1391 | 3379 | 25.48% | 91.89% | 90.44% | 1786 | 4073 | 31.19% | 87.89% | 85.75% |
MPSTAN | 1465 | 3104 | 28.29% | 91.84% | 91.29% | 1854 | 4014 | 34.67% | 85.78% | 84.97% |
Model | Neural Network Parameters | Training Time Consumption |
---|---|---|
GRU | 32 K | 108 s |
GraphWaveNet | 270 K | 122 s |
STGODE | 456 K | 328 s |
CovidGNN | 119 K | 20 s |
ColaGNN | 277 K | 132 s |
STAN | 949 K | 1560 s |
PatchTST | 6310 K | 390 s |
Crossformer | 14,774 K | 1404 s |
MPSTAN | 24 K | 735 s |