Management of Severe COVID-19 Diagnosis Using Machine Learning
Abstract
1. Introduction
2. Materials and Methods
- Data acquisition and preprocessing—construction of the feature set, imputation of missing values, and reduction of redundancy through the elimination of highly correlated variables.
- Correlation analysis—application of statistical methods to assess associations among variables and preliminary selection of the most relevant predictors of COVID-19 severity.
- Model development and testing—implementation of multiple machine learning classification algorithms, including ensemble methods, naïve Bayes, logistic regression, and others, to address the multiclass classification task of COVID-19 severity.
- Model optimization and feature importance assessment—hyperparameter tuning of classifiers and estimation of each variable’s contribution to the final predictive decision.
- Interpretation of results—visualization of model performance using receiver operating characteristic (ROC) curves and explanation of classification logic derived from decision tree structures.
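The preprocessing step listed above can be illustrated with a minimal sketch. It assumes median imputation and a greedy filter that drops a feature once its absolute Pearson correlation with an already retained feature reaches 0.9. Neither the imputation strategy nor the threshold is stated in this section, and the toy records and helper names (`pearson`, `impute_median`, `drop_correlated`) are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def impute_median(rows, columns):
    """Return copies of the rows with None replaced by the column median."""
    filled = [dict(r) for r in rows]
    for col in columns:
        fill = median(r[col] for r in rows if r[col] is not None)
        for r in filled:
            if r[col] is None:
                r[col] = fill
    return filled


def drop_correlated(rows, columns, threshold=0.9):
    """Keep a column only if |r| with every already kept column is below threshold."""
    kept = []
    for col in columns:
        values = [r[col] for r in rows]
        redundant = any(
            abs(pearson(values, [r[k] for r in rows])) >= threshold for k in kept
        )
        if not redundant:
            kept.append(col)
    return kept


# Hypothetical toy records; "alt_dup" is close to a rescaled copy of "alt".
records = [
    {"il6": 4.0, "alt": 20.0, "alt_dup": 41.0, "platelets": 300.0},
    {"il6": 9.0, "alt": None, "alt_dup": 70.0, "platelets": 180.0},
    {"il6": 30.0, "alt": 55.0, "alt_dup": 111.0, "platelets": 250.0},
    {"il6": 55.0, "alt": 35.0, "alt_dup": 69.0, "platelets": 120.0},
    {"il6": 12.0, "alt": 80.0, "alt_dup": 160.0, "platelets": 210.0},
]
features = ["il6", "alt", "alt_dup", "platelets"]

clean = impute_median(records, features)
selected = drop_correlated(clean, features)
print(selected)
```

The greedy filter depends on column order: of two highly correlated features, the one listed first is kept. A study-specific rule (for example, keeping the clinically preferred marker) could replace that choice.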
3. Results
- Most influential predictors:
- IL-6: Emerged as the most important or a leading feature across classifiers, with normalized importance values at or near 1.0 (0.828 in HistGradientBoostingClassifier). This is concordant with its strong correlation with severity (r = 0.838), reinforcing its central role as a systemic inflammatory marker in COVID-19 pathogenesis.
- Depression: Assigned high importance by several models, including BernoulliNB (1.0), CalibratedClassifierCV (1.0), BaggingClassifier (0.632), ExtraTreesClassifier (0.683), and DecisionTreeClassifier (0.555). This finding mirrors its correlation coefficient (r = 0.738) and underscores the psychosomatic contribution of depression to disease progression.
- Lymphocytes: Particularly important for the HistGradientBoostingClassifier (1.0), RandomForestClassifier (0.581), CalibratedClassifierCV (0.623), and ExtraTreesClassifier (0.554), aligning with its correlation (r = 0.627) with severity.
- LDL-C, AST, ALT, and triglycerides: These biochemical indicators demonstrated moderate-to-high importance in ensemble methods such as RandomForestClassifier (0.480–0.687) and ExtraTreesClassifier (0.308–0.340), consistent with their correlations (0.663, 0.662, 0.611, and 0.602, respectively).
- Platelets: Retained predictive value in RandomForestClassifier (0.197), ExtraTreesClassifier (0.219), and CalibratedClassifierCV (0.038), in agreement with its negative correlation with severity (r = −0.628).
- Less influential predictors:
- Sex, vaccination status, genetic markers (FGB, NOS3, TMPRSS2), and PI%: These variables had minimal or no importance in most models, reflecting their weak correlations with severity (|r| < 0.1, except vaccination status at r = −0.168). For instance, sex contributed marginal importance (0.008–0.025) in only a few classifiers, while genetic variants often showed zero contribution.
- Age and BMI: Despite a moderate correlation for BMI (r = 0.324), both features demonstrated low importance (0.0–0.067), suggesting that their effects may be mediated through other covariates.
- Fibrinogen, D-dimer, ET-1, and GFR: These markers displayed modest importance in some ensemble models (e.g., RandomForestClassifier, ExtraTreesClassifier), but their contributions were consistently lower than those of IL-6 or depression.
- Model-specific patterns:
- Ensemble methods (ExtraTreesClassifier, RandomForestClassifier, BaggingClassifier) captured a broad spectrum of influential features, reflecting their ability to model non-linear interactions. For example, RandomForestClassifier assigned substantial weight to LDL-C (0.480), AST (0.510), and triglycerides (0.298).
- HistGradientBoostingClassifier focused almost exclusively on IL-6 (0.828) and lymphocytes (1.0), which may reflect either strong regularization or sensitivity to class imbalance.
- BernoulliNB and CalibratedClassifierCV prioritized depression (1.0), likely due to their sensitivity to categorical or binary predictors.
- LogisticRegressionCV and Linear Discriminant Analysis emphasized a narrower set of variables (IL-6, lymphocytes, depression), consistent with their linear structure and more limited capacity to account for complex feature interactions.
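The importance values quoted above are on a scale where the top feature of a model scores 1.0. A minimal sketch of one way to obtain such a scale is shown below; it assumes division by the largest raw importance, which is an assumption because the normalization used in the study is not described in this section. The raw values and the helper name `normalize_importances` are hypothetical.

```python
def normalize_importances(raw):
    """Scale non-negative importances so that the largest one equals 1.0."""
    top = max(raw.values())
    if top <= 0:
        # No feature carries any weight; avoid dividing by zero.
        return {name: 0.0 for name in raw}
    return {name: value / top for name, value in raw.items()}


# Hypothetical raw importances from a single fitted classifier.
raw_importances = {
    "IL-6": 0.40,
    "Depression": 0.25,
    "Lymphocytes": 0.20,
    "Sex": 0.01,
    "PI%": 0.0,
}

scaled = normalize_importances(raw_importances)
ranked = sorted(scaled.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranked:
    print(f"{name:12s} {score:.3f}")
```

Because each model is scaled by its own maximum, a value of 1.0 only marks the top feature within that model; values are not directly comparable in magnitude across classifiers.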
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
SNP | single-nucleotide polymorphism |
ML | machine learning |
ROC | receiver operating characteristic |
WHO | World Health Organization |
CDC | Centers for Disease Control and Prevention |
BMI | body mass index |
IL | interleukin |
MI | myocardial infarction |
PAD | peripheral artery disease |
BP | blood pressure |
GFR | glomerular filtration rate |
Patients/Observation, by COVID-19 severity course | Mild, n (%) | Moderate, n (%) | Severe, n (%) |
---|---|---|---|
Primary dataset (enrolled patients), n = 257 (%) | 60 (23.35) | 55 (21.40) | 142 (55.25) |
Final dataset, n = 226 (%) | 30 (13.27) | 54 (23.89) | 142 (62.83) |
Class/severity distribution | 0 | 1 | 2 |
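The class shares in the table can be reproduced from the counts, and the same counts show how imbalanced the three-class task is. The sketch below uses the final-dataset counts from the table; the majority-class baseline and the inverse-frequency class weights are illustrative additions and are not reported as part of the study's method.

```python
# Final-dataset counts per severity class, taken from the table above
# (0 = mild, 1 = moderate, 2 = severe).
final_counts = {0: 30, 1: 54, 2: 142}

total = sum(final_counts.values())

# Percentage share of each class, rounded as in the table.
shares = {label: round(100 * n / total, 2) for label, n in final_counts.items()}

# Accuracy of a trivial model that always predicts the largest class.
majority_baseline = max(final_counts.values()) / total

# Inverse-frequency weights, total / (n_classes * count): an assumed,
# commonly used heuristic for compensating class imbalance.
class_weights = {
    label: total / (len(final_counts) * n) for label, n in final_counts.items()
}

print(shares)
print(f"majority baseline accuracy: {majority_baseline:.3f}")
print({label: round(w, 3) for label, w in class_weights.items()})
```

With about 63% of patients in the severe class, accuracy figures are best read against that baseline rather than against chance level for three classes.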
Variable | Correlation |
---|---|
IL-6 | 0.838 |
Depression | 0.738 |
LDL-C | 0.663 |
AST | 0.662 |
Platelets | −0.628 |
Lymphocytes | 0.627 |
ALT | 0.611 |
Triglycerides | 0.602 |
GFR | −0.471 |
Fibrinogen | 0.329 |
BMI | 0.324 |
D-dimer | 0.319 |
TMPRSS2 | 0.252 |
Smoking | −0.205 |
Age, years | 0.176 |
Vaccination | −0.168 |
ET-1 | 0.141 |
Gene FGB | 0.087 |
Gene TMPRSS2 | 0.056 |
Gene NOS3 | 0.037 |
Sex | 0.018 |
PI, % | −0.002 |
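A preliminary selection of predictors from such a table can be sketched as a ranking by absolute correlation. The example below uses a subset of the coefficients listed above; the cut-off of 0.6 and the helper name `select_by_abs_correlation` are assumptions made for illustration, since the selection rule applied in the study is not given in this section.

```python
# Subset of the correlation coefficients with severity from the table above.
correlations = {
    "IL-6": 0.838,
    "Depression": 0.738,
    "AST": 0.662,
    "Platelets": -0.628,
    "Lymphocytes": 0.627,
    "ALT": 0.611,
    "Triglycerides": 0.602,
    "GFR": -0.471,
    "BMI": 0.324,
    "Sex": 0.018,
    "PI, %": -0.002,
}


def select_by_abs_correlation(corr, threshold):
    """Return feature names with |r| >= threshold, strongest first."""
    ranked = sorted(corr.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, r in ranked if abs(r) >= threshold]


# Assumed cut-off; the sign is ignored so negative predictors are retained.
strong = select_by_abs_correlation(correlations, threshold=0.6)
print(strong)
```

Ranking on the absolute value keeps inversely related markers such as platelets, which a signed threshold would discard.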
Model | Full dataset: Accuracy (Mean ± SD) | Full dataset: AUC-ROC | Optimized dataset: Accuracy (Mean ± SD) | Optimized dataset: AUC-ROC |
---|---|---|---|---|
ExtraTreesClassifier | 0.965 (±0.027) | 1.000 | 0.974 (±0.022) | 1.000 |
RandomForestClassifier | 0.945 (±0.050) | 1.000 | 0.960 (±0.035) | 1.000 |
HistGradientBoostingClassifier | 0.948 (±0.051) | 1.000 | 0.960 (±0.038) | 1.000 |
BernoulliNB | 0.945 (±0.044) | 1.000 | 0.956 (±0.037) | 1.000 |
BaggingClassifier | 0.944 (±0.043) | 1.000 | 0.951 (±0.036) | 1.000 |
CalibratedClassifierCV | 0.936 (±0.037) | 0.998 | 0.943 (±0.030) | 1.000 |
DecisionTreeClassifier | 0.932 (±0.051) | 0.996 | 0.938 (±0.043) | 1.000 |
GradientBoostingClassifier | 0.920 (±0.056) | 0.996 | 0.934 (±0.046) | 1.000 |
LogisticRegressionCV | 0.923 (±0.029) | 0.997 | 0.934 (±0.020) | 1.000 |
LinearDiscriminantAnalysis | 0.917 (±0.037) | 0.997 | 0.929 (±0.029) | 1.000 |
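The two metrics in this table can be illustrated with a short sketch: accuracy summarized as mean ± SD over cross-validation folds, and a multiclass AUC-ROC computed as the unweighted average of one-vs-rest AUCs. The fold accuracies, labels, and predicted probabilities below are hypothetical, and both the use of the sample standard deviation and the macro one-vs-rest averaging are assumptions, as the section does not state how the reported values were computed.

```python
from statistics import mean, stdev


def binary_auc(scores, positives):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, is_pos in zip(scores, positives) if is_pos]
    neg = [s for s, is_pos in zip(scores, positives) if not is_pos]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, classes):
    """Unweighted mean of one-vs-rest AUCs, one per class."""
    aucs = []
    for idx, cls in enumerate(classes):
        scores = [row[idx] for row in proba]
        positives = [y == cls for y in y_true]
        aucs.append(binary_auc(scores, positives))
    return mean(aucs)


# Hypothetical accuracies from five cross-validation folds.
fold_acc = [0.96, 1.00, 0.91, 0.96, 1.00]
summary = f"{mean(fold_acc):.3f} (±{stdev(fold_acc):.3f})"
print(summary)

# Hypothetical true labels and predicted class probabilities (columns: 0, 1, 2).
y_true = [0, 0, 1, 1, 2, 2]
proba = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.4, 0.4, 0.2],  # tie between classes 0 and 1
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]
auc = macro_ovr_auc(y_true, proba, classes=[0, 1, 2])
print(f"macro one-vs-rest AUC: {auc:.3f}")
```

AUC depends only on how scores rank positives against negatives, so it can reach 1.000 even when some individual predictions are ambiguous or wrong, as with the tied fourth row here. This is one reason accuracy and AUC-ROC are worth reporting side by side.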
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sydorchuk, L.; Sokolenko, M.; Škoda, M.; Lajcin, D.; Vyklyuk, Y.; Sydorchuk, R.; Sokolenko, A.; Martjanov, D. Management of Severe COVID-19 Diagnosis Using Machine Learning. Computation 2025, 13, 238. https://doi.org/10.3390/computation13100238