Machine Learning Models to Predict Recoveries and Deaths from COVID-19 in Mexican Society in the Post-Pandemic Era
Abstract
1. Introduction
2. Materials and Methods
2.1. Artificial Neural Networks
2.1.1. Significant Features
2.1.2. Data Balancing, Training Data, and Test Data
2.1.3. Artificial Neural Network Generation and Evaluation
2.2. Logistic Regression Models
2.3. Classification Algorithms
3. Results
3.1. Artificial Neural Network Results
3.2. Logistic Regression Model Results
3.3. Results of Classification Algorithms
3.4. Comparison of Results
4. Discussion
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
ANN | Artificial Neural Network
CI | Confidence Interval
COPD | Chronic Obstructive Pulmonary Disease
DL | Deep Learning
ELM | Extreme Learning Machine
FP | False Positive
HR | Hazard Ratio
LSTM | Long Short-Term Memory
ML | Machine Learning
MLP | Multi-Layer Perceptron
MSE | Mean Squared Error
ReLU | Rectified Linear Unit
RMSE | Root Mean Squared Error
RNN | Recurrent Neural Network
SGD | Stochastic Gradient Descent
TP | True Positive
References
- Piccialli, F.; di Cola, V.S.; Giampaolo, F.; Cuomo, S. The Role of Artificial Intelligence in Fighting the COVID-19 Pandemic. Inf. Syst. Front. 2021, 23, 1467–1497.
- Vaishya, R.; Javaid, M.; Khan, I.H.; Haleem, A. Artificial Intelligence (AI) applications for COVID-19 pandemic. Diabetes Metab. Syndr. Clin. Res. Rev. 2020, 14, 337–339.
- Debnath, S.; Barnaby, D.P.; Coppa, K.; Makhnevich, A.; Kim, E.J.; Chatterjee, S.; Tóth, V.; Levy, T.J.; Paradis, M.D.; Cohen, S.L. Machine learning to assist clinical decision-making during the COVID-19 pandemic. Bioelectron Med. 2020, 6, 14.
- Zafar, N.; Ahamed, J. Emerging technologies for the management of COVID19: A review. Sustain. Oper. Comput. 2022, 3, 249–257.
- Podder, P.; Mondal, M.R.H. Machine learning to predict COVID-19 and ICU requirement. In Proceedings of the 2020 11th International Conference on Electrical and Computer Engineering, Dhaka, Bangladesh, 17–19 December 2020.
- Zhan, C.; Zheng, Y.; Zhang, H.; Wen, Q. Random-Forest-Bagging Broad Learning System with Applications for COVID-19 Pandemic. IEEE Internet Things J. 2021, 8, 15906–15918.
- Purwandari, T.; Zahroh, S.; Hidayat, Y.; Sukono; Mamat, M.; Saputra, J. Forecasting model of COVID-19 pandemic in Malaysia: An application of time series approach using neural network. Decis. Sci. Lett. 2022, 11, 35–42.
- Yenurkar, G.; Mal, S. Future forecasting prediction of COVID-19 using hybrid deep learning algorithm. Multimed. Tools Appl. 2022, 82, 22497–22523.
- Villavicencio, C.N.; Macrohon, J.J.E.; Inbaraj, X.A.; Jeng, J.H.; Hsieh, J.G. COVID-19 prediction applying supervised machine learning algorithms with comparative analysis using weka. Algorithms 2021, 14, 201.
- Goldbloom, A. Kaggle Datasets. Available online: https://www.kaggle.com/datasets/ (accessed on 21 August 2025).
- Frank, E.; Hall, M.A.; Witten, I.H. The WEKA Workbench Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann: San Francisco, CA, USA, 2016.
- Aledhari, M.; Razzak, R.; Parizi, R.M.; Dehghantanha, A. A Deep Recurrent Neural Network to Support Guidelines and Decision Making of Social Distancing. In Proceedings of the 2020 IEEE International Conference on Big Data, Atlanta, GA, USA, 10–13 October 2020.
- Roser, M. Our World in Data. Available online: https://ourworldindata.org/ (accessed on 21 August 2025).
- Apple. Mobility Trends Reports. Available online: https://covid19.apple.com/mobility (accessed on 21 August 2025).
- Parbat, D.; Chakraborty, M. A python based support vector regression model for prediction of COVID19 cases in India. Chaos Solitons Fractals 2020, 138, 109942.
- Lakshmanarao, A.; Babu, M.R.; Kiran, T.S.R. An Efficient Covid19 Epidemic Analysis and Prediction Model Using Machine Learning Algorithms. Int. J. Online Biomed. Eng. 2021, 17, 176–184.
- Zgheib, R.; Chahbandarian, G.; Kamalov, F.; El Labban, O. Neural Networks Architecture for COVID-19 Early Detection. In Proceedings of the 2021 International Symposium on Networks, Computers and Communications, Dubai, United Arab Emirates, 31 October–2 November 2021.
- Galván-Tejada, C.E.; Zanella-Calzada, L.A.; Villagrana-Bañuelos, K.E.; Moreno-Báez, A.; Luna-García, H.; Celaya-Padilla, J.M.; Galván-Tejada, J.I.; Gamboa-Rosales, H. Demographic and comorbidities data description of population in mexico with SARS-CoV-2 infected patients (COVID19): An online tool analysis. Int. J. Environ. Res. Public Health 2020, 17, 5173.
- Mancilla-Galindo, J.; Vera-Zertuche, J.M.; Navarro-Cruz, A.R.; Segura-Badilla, O.; Reyes-Velázquez, G.; Tepepa-López, F.J.; Aguilar-Alonso, P.; Vidal-Mayo, J.d.J.; Kammar-García, A. Development and Validation of the Patient History COVID-19 (PH-Covid19) Scoring System: A Multivariable Prediction Model of Death in Mexican Patients with COVID-19. Epidemiol. Infect. 2020, 148, 1–37.
- Gomez-Cravioto, D.A.; Diaz-Ramos, R.E.; Cantu-Ortiz, F.J.; Ceballos, H.G. Data Analysis and Forecasting of the COVID-19 Spread: A Comparison of Recurrent Neural Networks and Time Series Models. Cognit Comput. 2021, 16, 1794–1805.
- Ascencio-Montiel Ide, J.; Ovalle-Luna, O.D.; Rascón-Pacheco, R.A.; Borja-Aburto, V.H.; Chowell, G. Comparative epidemiology of five waves of COVID-19 in Mexico, March 2020–August 2022. BMC Infect. Dis. 2022, 22, 813.
- Almustafa, K.M. Covid19-Mexican-Patients’ Dataset (Covid19MPD) Classification and Prediction Using Feature Importance. Concurr Comput. 2022, 34, e6675.
- Dirección General de Epidemiología. Datos Abiertos de Epidemiología (Open Epidemiology Data). Available online: https://www.gob.mx/salud/documentos/datos-abiertos-152127 (accessed on 12 August 2025).
- Microsoft. Power BI Desktop version 2.141.1754.0. Available online: https://www.microsoft.com/es-es/power-platform/products/power-bi (accessed on 13 August 2025).
- Posit. RStudio Desktop Version 4.3.1. Available online: https://posit.co/download/rstudio-desktop/ (accessed on 13 August 2025).
- Microsoft. Microsoft Excel Version 16.101.3. Available online: https://www.microsoft.com/es-mx/microsoft-365/excel (accessed on 13 August 2025).
- van Rossum, G. Python Version 3.13.7. Available online: https://www.python.org/ (accessed on 13 August 2025).
- Apicella, A.; Donnarumma, F.; Isgrò, F.; Prevete, R. A survey on modern trainable activation functions. Neural Netw. 2021, 138, 14–32.
- Dubey, S.R.; Singh, S.K.; Chaudhuri, B.B. Activation functions in deep learning: A comprehensive survey and benchmark. Neurocomputing 2022, 53, 92–108.
- Haque, M.; Afsha, S.; Nyeem, H. Developing BrutNet: A New Deep CNN Model with GRU for Realtime Violence Detection. In Proceedings of the 2022 International Conference on Innovations in Science, Engineering and Technology, ICISET, Chittagong, Bangladesh, 25–28 February 2022.
- Dubey, S.R.; Chakraborty, S.; Roy, S.K.; Mukherjee, S.; Singh, S.K.; Chaudhuri, B.B. DiffGrad: An Optimization Method for Convolutional Neural Networks. IEEE Trans. Neural. Netw. Learn Syst. 2020, 31, 4500–4511.
- Yang, J.; Yang, G. Modified convolutional neural network based on dropout and the stochastic gradient descent optimizer. Algorithms 2018, 11, 28.
Best Neural Network Model

Class (actual) | Precision | Recall | f1-Score | Predicted as 0 | Predicted as 1 |
---|---|---|---|---|---|
0 | 0.93 | 0.93 | 0.93 | 1125 | 79 |
1 | 0.93 | 0.93 | 0.93 | 87 | 1094 |
accuracy | 0.93 | | | | |
macro avg | 0.93 | 0.93 | 0.93 | | |
weighted avg | 0.93 | 0.93 | 0.93 | | |

Worst Neural Network Model

Class (actual) | Precision | Recall | f1-Score | Predicted as 0 | Predicted as 1 |
---|---|---|---|---|---|
0 | 0.94 | 0.86 | 0.89 | 1436 | 242 |
1 | 0.72 | 0.86 | 0.78 | 96 | 611 |
accuracy | 0.86 | | | | |
macro avg | 0.83 | 0.86 | 0.84 | | |
weighted avg | 0.87 | 0.86 | 0.86 | | |
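
The macro and weighted averages reported for the neural network models differ only in how the per-class scores are combined: the macro average weights both classes equally, while the weighted average weights each class by its number of test samples. The following is a minimal sketch in plain Python, not the authors' code. It assumes that confusion-matrix rows are actual classes and columns are predicted classes (the reading under which the reported precision and recall values are reproduced), and the function name `precision_averages` is illustrative.

```python
def precision_averages(cm):
    """Return (per-class precision, macro avg, weighted avg) for a square confusion matrix.

    Assumed layout: cm[i][j] = samples of actual class i predicted as class j.
    """
    n = len(cm)
    support = [sum(row) for row in cm]                               # samples per actual class
    predicted = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # samples per predicted class
    per_class = [cm[k][k] / predicted[k] for k in range(n)]
    macro = sum(per_class) / n                                       # every class counts equally
    weighted = sum(p * s for p, s in zip(per_class, support)) / sum(support)
    return per_class, macro, weighted


# Confusion matrix of the worst neural network model (1436, 242; 96, 611).
worst_ann = [[1436, 242], [96, 611]]
per_class, macro, weighted = precision_averages(worst_ann)
print([round(p, 2) for p in per_class], round(macro, 2), round(weighted, 2))
# [0.94, 0.72] 0.83 0.87
```

Because class 0 has more than twice as many test samples as class 1 in this matrix, the weighted average (0.87) sits closer to the class 0 precision than the macro average (0.83) does.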
Class (actual) | Precision | Recall | f1-Score | Predicted as 0 | Predicted as 1 |
---|---|---|---|---|---|
0 | 0.90 | 0.96 | 0.93 | 1142 | 51 |
1 | 0.95 | 0.90 | 0.92 | 124 | 1068 |
accuracy | 0.93 | | | | |
macro avg | 0.93 | 0.93 | 0.93 | | |
weighted avg | 0.93 | 0.93 | 0.93 | | |
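
Every per-class figure in the report above can be recomputed from the four confusion-matrix counts, which is a convenient consistency check. The sketch below is illustrative rather than the authors' implementation. It assumes rows are actual classes and columns are predicted classes, and `binary_report` is a hypothetical helper name.

```python
def binary_report(cm):
    """Per-class (precision, recall, F1) and overall accuracy from a 2x2 confusion matrix.

    Assumed layout: cm[i][j] = samples of actual class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    report = {}
    for k in (0, 1):
        tp = cm[k][k]
        predicted_k = cm[0][k] + cm[1][k]   # column sum: everything predicted as k
        actual_k = cm[k][0] + cm[k][1]      # row sum: everything that really is k
        precision = tp / predicted_k
        recall = tp / actual_k
        f1 = 2 * precision * recall / (precision + recall)
        report[k] = (round(precision, 2), round(recall, 2), round(f1, 2))
    accuracy = round((cm[0][0] + cm[1][1]) / total, 2)
    return report, accuracy


# Counts from the table above (1142, 51; 124, 1068).
report, accuracy = binary_report([[1142, 51], [124, 1068]])
print(report)    # {0: (0.9, 0.96, 0.93), 1: (0.95, 0.9, 0.92)}
print(accuracy)  # 0.93
```

The recomputed values agree with the reported ones to two decimals, which supports the row and column reading assumed here.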
Classifier | Correctly classified instances | Incorrectly classified instances | Kappa statistic | Actual Dead: predicted Dead | Actual Dead: predicted Alive | Actual Alive: predicted Dead | Actual Alive: predicted Alive |
---|---|---|---|---|---|---|---|
J48 (C4.5 algorithm) | 2205 (92.45%) | 180 (7.55%) | 0.849 | 1087 | 99 | 81 | 1118 |
Random Forest | 2199 (92.20%) | 186 (7.80%) | 0.844 | 1096 | 90 | 96 | 1103 |
Naïve Bayes | 2195 (92.03%) | 190 (7.97%) | 0.8407 | 1096 | 90 | 100 | 1099 |
Bayes Net | 2190 (91.82%) | 195 (8.18%) | 0.8365 | 1083 | 103 | 92 | 1107 |
Random Tree | 2141 (89.77%) | 244 (10.23%) | 0.7954 | 1065 | 121 | 123 | 1076 |

Classifier | Class | TP rate | FP rate | Precision | Recall | F-Measure | MCC | ROC area | PRC area |
---|---|---|---|---|---|---|---|---|---|
J48 (C4.5 algorithm) | Dead | 0.917 | 0.068 | 0.931 | 0.917 | 0.924 | 0.849 | 0.949 | 0.932 |
J48 (C4.5 algorithm) | Alive | 0.932 | 0.083 | 0.919 | 0.932 | 0.925 | 0.849 | 0.949 | 0.928 |
J48 (C4.5 algorithm) | Weighted avg. | 0.925 | 0.076 | 0.925 | 0.925 | 0.925 | 0.849 | 0.949 | 0.930 |
Random Forest | Dead | 0.924 | 0.080 | 0.919 | 0.924 | 0.922 | 0.844 | 0.965 | 0.960 |
Random Forest | Alive | 0.920 | 0.076 | 0.925 | 0.920 | 0.922 | 0.844 | 0.965 | 0.956 |
Random Forest | Weighted avg. | 0.922 | 0.078 | 0.922 | 0.922 | 0.922 | 0.844 | 0.965 | 0.958 |
Naïve Bayes | Dead | 0.924 | 0.083 | 0.916 | 0.924 | 0.920 | 0.841 | 0.964 | 0.965 |
Naïve Bayes | Alive | 0.917 | 0.076 | 0.924 | 0.917 | 0.920 | 0.841 | 0.964 | 0.944 |
Naïve Bayes | Weighted avg. | 0.920 | 0.080 | 0.920 | 0.920 | 0.920 | 0.841 | 0.964 | 0.954 |
Bayes Net | Dead | 0.913 | 0.077 | 0.922 | 0.913 | 0.917 | 0.837 | 0.969 | 0.966 |
Bayes Net | Alive | 0.923 | 0.087 | 0.915 | 0.923 | 0.919 | 0.837 | 0.969 | 0.967 |
Bayes Net | Weighted avg. | 0.918 | 0.082 | 0.918 | 0.918 | 0.918 | 0.837 | 0.969 | 0.966 |
Random Tree | Dead | 0.898 | 0.103 | 0.896 | 0.898 | 0.897 | 0.795 | 0.905 | 0.871 |
Random Tree | Alive | 0.897 | 0.102 | 0.899 | 0.897 | 0.898 | 0.795 | 0.905 | 0.865 |
Random Tree | Weighted avg. | 0.898 | 0.102 | 0.898 | 0.898 | 0.898 | 0.795 | 0.905 | 0.868 |
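
The Kappa statistic and MCC reported for each classifier are both chance-corrected agreement measures that follow directly from the confusion matrix. As a hedged illustration (not the WEKA implementation), the sketch below applies the standard formulas for Cohen's kappa and the Matthews correlation coefficient to the J48 counts, assuming rows are actual classes and columns are predicted classes. The function name `kappa_and_mcc` is hypothetical.

```python
from math import sqrt


def kappa_and_mcc(cm):
    """Cohen's kappa and Matthews correlation coefficient for a 2x2 confusion matrix.

    Assumed layout: cm[i][j] = samples of actual class i predicted as class j.
    """
    (a, b), (c, d) = cm
    n = a + b + c + d
    observed = (a + d) / n  # observed agreement, i.e. accuracy
    # Agreement expected by chance: products of matching row and column totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (observed - expected) / (1 - expected)
    mcc = (a * d - b * c) / sqrt((a + b) * (a + c) * (c + d) * (b + d))
    return kappa, mcc


# J48 confusion matrix (Dead: 1087, 99; Alive: 81, 1118).
j48 = [[1087, 99], [81, 1118]]
kappa, mcc = kappa_and_mcc(j48)
print(round(kappa, 3), round(mcc, 3))  # 0.849 0.849
```

Both values match the J48 entries to three decimals. The two measures nearly coincide here because the test set is close to balanced between the Dead and Alive classes.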
Authors | Proposal in the Mexican Context | Reported Metrics | Period |
---|---|---|---|
Almustafa (2021) [22] | Classification algorithms to predict survival and death cases from COVID-19. | 94.41% accuracy in survival/death classification | Pandemic era |
Ascencio-Montiel et al. (2022) [21] | Logistic regression models to assess the association of demographic factors, comorbidities, waves, and vaccination with the risk of death from COVID-19. | Hospital case fatality rate for five infection waves: 45.1%, 50.8%, 43.6%, 34.7%, 17.7% (95% CI) | Pandemic era |
Gomez-Cravioto et al. (2021) [20] | Regression models to describe the growth of COVID-19 incidents and LSTM neural network to perform forecasting for daily cases and fatalities. | 275.35 RMSE in predicting daily incidents and 31.91 RMSE in predicting daily fatalities | Pandemic era |
Mancilla-Galindo et al. (2020) [19] | Multivariate prediction model of death in patients with COVID-19 based on Cox regression model. | From 1.05 HR to 1.86 HR for regression coefficients (95% CI, p value < 0.0001) | Pandemic era |
Luna-Ramírez et al. (2025) [this article] | Classification models to predict recoveries and deaths from COVID-19 based on ANN, logistic regression models and classification algorithms. | 93% accuracy and 5.6% loss in predicting recoveries/deaths | Post-pandemic era |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Luna-Ramírez, E.; Soria-Cruz, J.; Castillo-Zúñiga, I.; López-Veyna, J.I. Machine Learning Models to Predict Recoveries and Deaths from COVID-19 in Mexican Society in the Post-Pandemic Era. COVID 2025, 5, 174. https://doi.org/10.3390/covid5100174