Article

Calculating the Risk of Admission to Intensive Care Units in COVID-19 Patients Using Machine Learning

by Mireia Ladios-Martin 1,*, María José Cabañero-Martínez 2, José Fernández-de-Maya 3, Francisco-Javier Ballesta-López 1, Ignacio Garcia-Garcia 4, Adrián Belso-Garzas 5, Francisco-Manuel Aznar-Zamora 5 and Julio Cabrero-García 2
1 Grupo Ribera, Edificio Sorolla Center, Avda Cortes Valencianas, 58, 46015 Valencia, Spain
2 Nursing Department, University of Alicante, 03690 San Vicente del Raspeig, Spain
3 Vinalopó University Hospital, 03293 Elche, Spain
4 Verne Technology Group, 03144 Alicante, Spain
5 Ribera Salud Technologies, 03203 Elche, Spain
* Author to whom correspondence should be addressed.
J. Clin. Med. 2025, 14(12), 4205; https://doi.org/10.3390/jcm14124205
Submission received: 29 April 2025 / Revised: 29 May 2025 / Accepted: 6 June 2025 / Published: 13 June 2025
(This article belongs to the Section Epidemiology & Public Health)

Abstract

Background: The COVID-19 pandemic posed a global challenge to healthcare systems, in which the allocation of limited resources had important logistical and ethical implications. Detecting and prioritizing patients at risk of intensive care unit (ICU) admission is the first step toward caring for the most vulnerable people and avoiding unnecessary consumption of resources by mildly ill patients. Objective: To create a model, using machine learning techniques, capable of identifying the risk of ICU admission throughout the hospital stay of COVID-19 patients, and to evaluate the performance of the model. Methods: A retrospective cohort design was used to develop and validate a classification model of adult COVID-19 patients with or without risk of ICU admission. Data from three hospitals in Spain were used to develop the model (n = 1272) and for subsequent external validation (n = 550). Sensitivity, specificity, positive and negative predictive value, accuracy, F1 score, Youden index and area under the curve of the model were evaluated. Results: The LightGBM model, incorporating 40 variables, was selected. The area under the curve obtained by the model on the test dataset was 1.00 (0.99–1.00), with specificity 0.99 (0.97–1.00) and sensitivity 0.92 (0.86–0.98). Conclusions: A model for predicting ICU admission of hospitalized COVID-19 patients was created with very good results. Identifying and prioritizing COVID-19 patients at risk of ICU admission allows the right care to be provided to those who are most in need when the healthcare system is under pressure.

1. Introduction

In December 2019, a new pneumonia (COVID-19) caused by the SARS-CoV-2 virus broke out in Wuhan (China). The most widespread symptoms were fever, dry cough, muscle weakness, and chest pain [1]. As of December 2021, the number of COVID-19 infections worldwide amounted to 290 million, with 5,400,000 deaths. In Spain, 6,290,000 people had been infected and 89,000 had died of the disease [2]. According to data collected from the beginning of the pandemic, 20% to 30% of COVID-19 patients required hospitalization, and 5% to 12% needed intensive care [3]. The situation posed a global challenge to health systems for several reasons: rapid spread of the disease; high concentration of cases; excessive consumption of resources; a higher percentage of severe cases than in other respiratory syndromes [4]; and a high mortality rate among the most severely affected [5].
In scenarios of such huge pressure on health systems, allocating limited resources such as intensive care unit (ICU) beds has important logistical and ethical implications [6]. In other words, it is crucial to detect and prioritize patients in need of intensive therapy to avoid the unnecessary consumption of medical resources by mild or asymptomatic patients [7,8].
No clear prognostic biomarkers have yet been defined that predict which patients will require ICU care. Indeed, many laboratory markers are affected by the disease, and their presentation varies with symptom severity and the speed of patient deterioration [9].
Nevertheless, multiple efforts have been made since the beginning of the pandemic to create artificial intelligence-based tools that help to screen, diagnose, and predict the prognosis of COVID-19 patients. These tools use radiological images and clinical laboratory data [10] because respiratory status, immune and inflammatory response, and coagulation, among others, are significantly altered during the disease [11]. Most studies of probable disease evolution use different aspects of clinical deterioration as dependent variables, whether grouped or separate, such as ICU admission, severe symptoms, shock, need for mechanical ventilation, or patient death [3,4,7,12,13,14,15,16,17,18,19,20,21,22,23,24,25]. Most studies that include ICU admission or analogous outcomes (such as the need for invasive mechanical ventilation) present predictive models that rely on data from the patient’s first contact with the hospital (static variables) [7,14,15,16,17,18], and few consider the patient’s evolution during the hospital stay (dynamic variables) [3,19,20,21]. Relying exclusively on static variables prevents a model from reflecting how the patient’s condition changes over the days, which leads to suboptimal predictions.
Based on the above, the objective of the study was to develop a predictive model of the risk of intensive care unit admission in patients with COVID-19.

2. Materials and Methods

2.1. Setting

The study was conducted with patients admitted to 3 hospitals in Spain, all of which were of medium size (200–250 beds) and had emergency, medical hospitalization, and ICU departments. All variables were retrieved from the electronic medical record (EMR).

2.2. Design

A retrospective cohort design was used to develop and validate the model for classifying patients with or without ICU admission risk. Data from the three hospitals were used to develop the model and for its subsequent external validation. Datasets from different time periods were used for each purpose (training and test).

2.3. Data Preparation

The eligible population met the following criteria: patients aged over 16 years who were confirmed COVID-19 positive by a laboratory test (polymerase chain reaction (PCR) test, rapid PCR test, or antigen test) at the time of admission or during the previous 7 days, who presented a respiratory-type main diagnosis (COVID-19, SARS-associated coronavirus infection, infection due to unspecified coronavirus, unspecified pneumonia), and whose hospitalization lasted 24 h or longer (including any stay in the emergency department). Patients whose medical record included therapeutic effort limitation were excluded (17% of the total number of COVID-19-diagnosed patients).
The data collection period spanned from 15 February 2020 to 30 April 2021. The total sample included 1822 patients. The model was developed on the first dataset from the 3 hospitals, corresponding to the sample collected between 15 February and 31 December 2020 (n = 1272). External validation used a dataset from the same hospitals but obtained over a different period, specifically between 1 January and 30 April 2021 (n = 550) (Figure 1).
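The temporal split described above can be sketched as follows. This is a minimal illustration only: it assumes each record carries an admission date, and the field name `admission_date`, the helper `split_by_period`, and the toy records are hypothetical and not taken from the study's code or data.

```python
from datetime import date

# Period boundaries as stated in the text.
DEV_START, DEV_END = date(2020, 2, 15), date(2020, 12, 31)
VAL_START, VAL_END = date(2021, 1, 1), date(2021, 4, 30)


def split_by_period(records):
    """Split records into development and validation sets by admission date.

    `records` is a list of dicts with a hypothetical `admission_date` key.
    Records falling outside both periods are dropped.
    """
    development, validation = [], []
    for record in records:
        admitted = record["admission_date"]
        if DEV_START <= admitted <= DEV_END:
            development.append(record)
        elif VAL_START <= admitted <= VAL_END:
            validation.append(record)
    return development, validation


# Toy usage with made-up records.
toy_records = [
    {"id": 1, "admission_date": date(2020, 3, 20)},
    {"id": 2, "admission_date": date(2021, 2, 1)},
    {"id": 3, "admission_date": date(2021, 6, 1)},  # outside both periods
]
dev, val = split_by_period(toy_records)
print(len(dev), len(val))
```

Splitting by calendar period rather than at random means the validation set contains only patients seen after every patient used for development.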

2.4. Dependent Variable

The dependent variable was defined as the ICU admission of a patient coming from any service and having been previously hospitalized for more than 24 h. If the patient had been admitted more than once to that unit, only the first admission was included in the analysis. Those who were not admitted to the ICU and whose end-destination was discharge home, transfer to another center, or death were considered as non-ICU patients in this study.

2.5. Independent Variables

We conducted a review of the literature to identify ICU admission predictors in patients with COVID-19 and other pneumonias. A total of 96 variables were selected. Next, the possibility of obtaining these variables from the EMR was evaluated and the criteria for doing so were defined for each variable, both for patients who were admitted to the ICU and for those who were not. The values of each variable were extracted based on different temporal strategies (on admission, during the stay, or at discharge). For variables generated during the stay, a single value was taken for each subject at the time closest to ICU admission or, for patients not admitted to the ICU, to the expected ICU admission day (the median).
Of the total number of variables, the following were excluded before imputation: 23 categorical variables, because they were already included as numerical variables; 30 variables, because they presented a high rate of null values (more than 10%); 7 due to high correlation with other variables already included; 5 removed by recursive feature elimination (RFE); and 1 due to inaccessibility. Missing values were imputed with the median in continuous variables and with the mode in categorical variables. Outliers were handled using the interquartile range (IQR). Additionally, categorical variables with response rates below 25% were recoded.
The extracted variables, to which we added new derived variables (created from relevant original variables), were processed to generate the database that we used to select the algorithm. The original variables on which the final model was based are detailed in Appendix A.1.
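The imputation and outlier steps can be illustrated with a short sketch using only the Python standard library. The text does not say how the IQR was applied, so clipping values to the fences Q1 − 1.5·IQR and Q3 + 1.5·IQR is an assumption made here for illustration; the function names and toy values are likewise hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None in a continuous variable with the median of observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def impute_mode(values):
    """Replace None in a categorical variable with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = statistics.mode(observed)
    return [mode if v is None else v for v in values]


def clip_iqr(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: clipping with k=1.5 is a common convention, not a detail
    confirmed by the study.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Toy usage with made-up values.
print(impute_median([1.0, None, 3.0]))
print(impute_mode(["a", None, "a", "b"]))
print(clip_iqr([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

In practice the median, mode, and IQR fences would be estimated on the development data and then reused unchanged on the validation data, so that no information from the later period leaks into preprocessing.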

2.6. Model Development

To select the predictive model algorithm, the training sample from February to December 2020 was used. The performance of four algorithms was evaluated: LightGBM, XGBoost, logistic regression and random forest. Recursive feature elimination cross validation (RFECV) was implemented to simplify the model by identifying the optimal number of variables without compromising its predictive capacity. To counteract the imbalance of the dependent variable, the Adaptive synthetic sampling approach for imbalanced learning (ADASYN) technique was applied. In addition, stratified k-fold cross validation was used, with k = 5, to prevent model overfitting and bias. The hyperparameters were optimized using the Bayesian search technique.
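To make the stratified cross-validation step concrete, the sketch below builds stratified folds in plain Python. It is not the study's implementation (which used Scikit-Learn); the function name `stratified_k_fold`, the seed, and the toy labels are assumptions for illustration. When oversampling such as ADASYN is combined with cross-validation, it is normally applied to the training indices of each fold only, so that synthetic samples never reach the held-out fold.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class proportions mirror `labels`."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy labels with an imbalanced outcome (12 positives out of 100).
labels = [1] * 12 + [0] * 88
splits = list(stratified_k_fold(labels, k=5))
for train_idx, test_idx in splits:
    positives_in_test = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), positives_in_test)
```

Stratification matters with an imbalanced outcome because a purely random split could leave a fold with very few positive cases, making its metrics unstable.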
Once the final model was established, its performance was analyzed by calculating sensitivity (the proportion of actual positives correctly classified as positive), specificity (the proportion of actual negatives correctly classified as negative), positive and negative predictive value (the proportions of positive and negative predictions that are true positives and true negatives, respectively), accuracy (the proportion of all cases correctly classified), F1-score (the harmonic mean of the model’s precision and sensitivity), Youden index (sensitivity plus specificity minus one), and area under the curve (AUC, which quantifies the ability of a model to distinguish between classes), with their confidence intervals. An external validation using the test dataset was performed, and Shapley additive explanations (SHAP) were used to improve model interpretability. The development was based on a machine learning framework in Python v3.7.9, using Scikit-Learn and LightGBM. Azure Machine Learning and Microsoft Kubernetes were used to move from the test environment to a production environment available to end users.
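The performance measures listed above follow directly from the confusion-matrix counts and the predicted scores. The sketch below shows the standard formulas; the function names, counts, and scores are made up for illustration and are not the study's results. The AUC is computed here through its pairwise interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case), which is an illustrative choice rather than the study's method.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    youden = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "youden": youden,
    }


def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy usage with made-up counts and scores.
metrics = classification_metrics(tp=9, fp=1, tn=89, fn=1)
toy_auc = auc(scores=[0.9, 0.8, 0.3, 0.1], labels=[1, 0, 1, 0])
print(metrics)
print(toy_auc)
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for samples of this size; rank-based implementations are preferable for large datasets.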

3. Results

Model development was based on a sample of 1272 subjects, 12% of whom were admitted to the ICU. Of the remaining 88%, who were not admitted to the ICU, 71.73% were discharged home, 21.63% were admitted to home hospitalization, 6.38% died, and 0.28% were transferred to another center. A total of 58% were men, and those aged over 60 accounted for 63% of the sample. Most patients were Spanish (67%). Patients remained hospitalized for 208 h on average, and the most common respiratory therapy was oxygen via nasal prongs. The median time to ICU admission was 72 h. The samples used to develop and validate the model presented significant differences across all variables except sex, place of birth, hours of hospitalization, hours of anticoagulant treatment, creatinine value, D-dimer value, ferritin value, PCO2 value, platelet value, and aspartate aminotransferase (AST) value (Table 1).
The performance of the four machine learning models studied is detailed in Table 2. LightGBM was the selected algorithm because it presented the best metrics (the higher the numerical value, the greater the predictive capacity).
The final model was composed of a total of 40 variables, 30 of which were original and 10 derived.
Only two of the variables that made up the model were collected statically (age and place of birth); the rest were collected dynamically. The classification of the type of variables is detailed in Appendix A.1. In descending order of gain, the most important variables were type of oxygen therapy, hours of hospitalization, the derived variable “hours of hospitalization and age”, hours of anticoagulant treatment, lymphocyte value, and oxygen saturation value (Figure 2).
The results of the model validation using the test dataset are presented in Table 3 and Figure 3. The area under the curve obtained by the LightGBM model on the test dataset was 1.00 (0.99–1.00), higher than that obtained on the training dataset, 0.95 (0.93–1.00). Specificity and sensitivity were very similar between the test and training datasets: 0.99 (0.97–1.00) vs. 0.99 (0.98–1.00) and 0.92 (0.86–0.98) vs. 0.91 (0.82–0.99), respectively.
Variable interpretability using SHAP was analyzed based on the test dataset. The results indicated that more intensive respiratory therapy, fewer hospitalization hours, especially in older people, low oxygen saturation, and low lymphocyte value were related to higher ICU admission risk (Figure 4).

4. Discussion

By identifying and prioritizing COVID-19 patients at risk of ICU admission, it is possible to provide appropriate care to those who are most vulnerable. Similarly, patients who likely do not require higher levels of care can be identified, thereby enhancing resource management efficiency during peak pressure on the health system.
Based on the above, we created and validated a model to predict ICU admission of hospitalized COVID-19 patients using machine learning techniques. The study obtained very good results.
Notably, the model did not rely solely on variables collected at a single time point, i.e., at admission, but mostly on dynamic variables (such as laboratory results or oxygen therapy, among others) collected throughout the patient’s stay. The goal was to accurately reflect the patient’s true trajectory during hospitalization. This approach avoids basing resource planning on predictions derived from static data. Indeed, the course of the disease may change during care, becoming more or less serious, which would make such planning suboptimal. Moreover, the external validity of the model was evaluated using data from a later period.
Studies predicting ICU admission or similar outcomes, such as the need for invasive mechanical ventilation, include those by Mauer et al. [19], Cheng F.-Y. et al. [3], Douville et al. [20], and Park et al. [21]. Our model showed superior performance to those of Cheng F.-Y. et al., Douville et al., and Park et al. Likewise, it obtained better results than other studies focusing on ICU admission risk in which only data from the time of hospital admission were used [7,14,15,16,17,18]. This is unsurprising, since our model was based on more accurate information about the patient’s true condition. Finally, a recent meta-analysis evaluated the joint performance of four predictive models and showed a slightly poorer overall result than that obtained in our study [8].
Regarding the variables that made up the different models which rested on the same methodology as ours, we found that all studies included variables relating to respiratory failure (e.g., respiratory rate, oxygen saturation, and type of oxygen therapy) as one of the major variables, as well as other variables linked to inflammation and/or infection, most of which were obtained from laboratory variables [3,19,20,21]. However, to the best of our knowledge, no study except the present one has included hospitalization-related variables in the final model such as “hours of hospitalization” or the derived variable “hours of hospitalization and age”, whose SHAP interpretation revealed that a lower number of hospitalization hours was related to ICU admission, especially in the elderly. Similarly, not all models included pharmacological variables in their final composition, such as the consumption of corticosteroids [21] or anticoagulants. Regarding anticoagulant treatment, the SHAP interpretation showed that a greater number of hours of treatment was related to ICU admission only in certain cases, and that in the rest, neither a greater number of hours nor fewer hours were related to this admission. In the case of corticosteroid treatment, greater use was related to ICU admission, as also mentioned in the study of Park et al.
In this work, we used a collection of dynamic variable values which reflected the patient’s actual condition and supported optimal planning according to that situation. Nevertheless, some study limitations are worth noting. Certain variables included in previous studies, such as the National Early Warning Score (NEWS) or respiratory rate, could not be integrated in this study because they presented a high number of null values. Others, such as bilateral infiltrates, could not be incorporated because the information was not available in the EMR. Although the polymerase chain reaction (PCR) test is the reference diagnostic test, patients diagnosed via antigen test were included in the study. The sample size also constitutes a limitation. Indeed, the number of outcome events in the training data was 154, which is below the minimum recommended by the criterion of at least ten events per parameter. However, we calculated confidence intervals for the measures selected to appraise the model performance, which allows us to assess the precision of our estimates. Furthermore, the fact that the performance of the model in the test dataset was even slightly higher than in the training data increases confidence in the reliability of the results. Finally, during the study, some patients were included in clinical trials of drugs to which the researchers were blinded. Moreover, the COVID-19 vaccination campaign began during the last phase of the study, and related information could not be systematically collected from the EMR.
This study has developed a machine learning-based model of ICU admission risk in COVID-19 patients, with predictors measured at admission and during the patient’s hospital stay, allowing a more realistic assessment of the patient’s condition and high predictive power. The use of SHAP techniques has facilitated the interpretability of the model, revealing the importance of predictors not previously examined in the literature, such as hours of hospitalization. The model is clinically valuable, as indicated by the fact that it has been applied routinely in the study hospitals since its validation. Its use was beneficial at the peak of the pandemic. In the current scenario, where COVID-19 coexists with other acute respiratory infections (ARI), the research team plans to re-evaluate and adjust the model for these pathologies. Thus, the model could be helpful in periods of high hospital occupancy and incidence of ARI.

Author Contributions

Conceptualization, M.L.-M., J.F.-d.-M. and F.-J.B.-L.; methodology, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., J.C.-G. and M.J.C.-M.; software, A.B.-G., F.-M.A.-Z. and I.G.-G.; validation, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., F.-M.A.-Z., I.G.-G., J.C.-G. and M.J.C.-M.; formal analysis, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., F.-M.A.-Z., I.G.-G., J.C.-G. and M.J.C.-M.; investigation, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., F.-M.A.-Z., I.G.-G., J.C.-G. and M.J.C.-M.; data curation, A.B.-G., F.-M.A.-Z. and I.G.-G.; writing—original draft preparation, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., F.-M.A.-Z., I.G.-G., J.C.-G. and M.J.C.-M.; writing—review and editing, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., F.-M.A.-Z., I.G.-G., J.C.-G. and M.J.C.-M.; visualization, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., F.-M.A.-Z., I.G.-G., J.C.-G. and M.J.C.-M.; supervision, M.L.-M., J.F.-d.-M., F.-J.B.-L., A.B.-G., F.-M.A.-Z., I.G.-G., J.C.-G. and M.J.C.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was approved by the Ethics Committee (University Hospital of Vinalopó and University Hospital of Torrevieja, code 2020.023, 20 May 2020) due to the use of patient information obtained from medical records, which was anonymized, by avoiding the use of personal data for data collection and subsequent model development.

Informed Consent Statement

Patient consent was waived because the data used was fully anonymized, avoiding the use of personal information during both data collection and subsequent model development. This approach ensures the protection of individual privacy and complies with applicable ethical and regulatory standards.

Data Availability Statement

Data is unavailable due to privacy restrictions.

Conflicts of Interest

Mireia Ladios-Martin and Francisco-Javier Ballesta-López were employed at Ribera Salud. Adrián Belso-Garzas and Francisco-Manuel Aznar-Zamora were employed at Futurs (the technology subsidiary of Ribera Salud). José Fernández-de-Maya was employed at the Vinalopó University Hospital (managed by Ribera Salud). Ignacio Garcia-Garcia was employed by Verne Technology Group. All authors declare that the research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ICU | Intensive Care Unit
EMR | Electronic Medical Record
CRP | C-Reactive Protein
RFE | Recursive Feature Elimination
IQR | Interquartile range
RFECV | Recursive Feature Elimination Cross Validation
ADASYN | Adaptive synthetic sampling approach for imbalanced learning
AUC | Area Under the Curve
SHAP | Shapley Additive Explanations
AST | Aspartate aminotransferase
NEWS | National Early Warning Score
ARI | Acute Respiratory Infections
LDH | Lactate dehydrogenase
PCO2 | Partial pressure CO2
PO2 | Partial pressure O2
aPTT | Activated partial thromboplastin time
PPV | Positive Predictive Value
NPV | Negative Predictive Value
F1-S | F1-Score

Appendix A

Appendix A.1

Table A1. Model variables.
Variable | No. Values | Type of Value | Type of Variable | Data Collection Timing
Number of antibiotics | 8 | Accumulated | Dynamic | Value prior to ICU admission or expected ICU admission (median)
Hours of anticoagulant treatment | All | Accumulated | Dynamic | Value prior to ICU admission or expected ICU admission (median)
Aspartate aminotransferase (AST) value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Bilirubin value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Creatine Kinase (CK) value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Hours of corticosteroid treatment | All | Accumulated | Dynamic | Value prior to ICU admission or expected ICU admission (median)
Creatinine value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
D-dimer value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Ferritin value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Heart rate | All | Individual | Dynamic | Last recorded value in the 24 h prior to ICU admission or expected ICU admission (median)
Age group | 6 | Individual | Static | On admission
Hours of hospitalization | All | Accumulated | Dynamic | Value from the emergency department admission to the ICU admission or to the discharge day
Hours of Emergency department | All | Accumulated | Dynamic | Value from the emergency department stay
Lactate dehydrogenase (LDH) value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Leukocytes value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Lymphocyte value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Place of birth | 2 | Individual | Static | On admission
Type of oxygen therapy | 6 | Individual | Dynamic | Last recorded value in the 24 h prior to ICU admission or expected ICU admission (median)
Partial Pressure CO2 (PCO2) value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Platelet value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Partial Pressure O2 (PO2) value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
C-Reactive Protein (CRP) value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Urea range | 2 | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Oxygen saturation value | All | Individual | Dynamic | Last recorded value in the 24 h prior to ICU admission or expected ICU admission (median)
Oxygen saturation Lab value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Temperature | All | Individual | Dynamic | Last recorded value in the 24 h prior to ICU admission or expected ICU admission (median)
Systolic blood pressure | All | Individual | Dynamic | Last recorded value in the 24 h prior to ICU admission or expected ICU admission (median)
Troponin value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Activated partial thromboplastin time (aPTT) value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)
Urea value | All | Individual | Dynamic | Last recorded value prior to ICU admission or expected ICU admission (median)

Appendix A.2

Table A2. Hyperparameters setting for the LightGBM model.
Hyperparameter | Value
boosting_type | gbdt
class_weight | balanced
colsample_bytree | 0.4750764935387468
importance_type | split
learning_rate | 0.28312968303160313
max_depth | -1
min_child_samples | 3
min_child_weight | 0.001
min_split_gain | 0.0
n_estimators | 76
n_jobs | 4
num_leaves | 326
objective | binary
random_state | 50
reg_alpha | 0.5985119228590867
reg_lambda | 0.5845451820124326
silent | True
subsample | 1.0
subsample_for_bin | 300000
subsample_freq | 0
bagging_fraction | 0.6087654298259604
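For readers who wish to reuse these settings, the table can be written as a Python dictionary. The values are copied from Table A2; the variable name `lightgbm_params` is hypothetical, and the commented-out lines show only one plausible way to pass the settings to LightGBM's scikit-learn wrapper, which is an assumption about usage rather than the authors' code. Some keys, such as `silent`, may not be recognized by newer LightGBM releases.

```python
# Hyperparameter values copied from Table A2.
lightgbm_params = {
    "boosting_type": "gbdt",
    "class_weight": "balanced",
    "colsample_bytree": 0.4750764935387468,
    "importance_type": "split",
    "learning_rate": 0.28312968303160313,
    "max_depth": -1,  # -1 means no depth limit in LightGBM
    "min_child_samples": 3,
    "min_child_weight": 0.001,
    "min_split_gain": 0.0,
    "n_estimators": 76,
    "n_jobs": 4,
    "num_leaves": 326,
    "objective": "binary",
    "random_state": 50,
    "reg_alpha": 0.5985119228590867,
    "reg_lambda": 0.5845451820124326,
    "silent": True,
    "subsample": 1.0,
    "subsample_for_bin": 300000,
    "subsample_freq": 0,
    "bagging_fraction": 0.6087654298259604,
}

# With LightGBM installed, a classifier could be configured like this (not run here):
#   from lightgbm import LGBMClassifier
#   model = LGBMClassifier(**lightgbm_params)
print(len(lightgbm_params))
```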

References

  1. Baj, J.; Karakuła-Juchnowicz, H.; Teresiński, G.; Buszewicz, G.; Ciesielka, M.; Sitarz, R.; Forma, A.; Karakuła, K.; Flieger, W.; Portincasa, P.; et al. COVID-19: Specific and non-specific clinical manifestations and symptoms: The current state of knowledge. J. Clin. Med. 2020, 9, 1753. [Google Scholar] [CrossRef] [PubMed]
  2. Johns Hopkins Coronavirus Resource Center. COVID-19 Map. 2022. Available online: https://coronavirus.jhu.edu/map.html (accessed on 15 October 2022).
  3. Cheng, F.-Y.; Joshi, H.; Tandon, P.; Freeman, R.; Reich, D.L.; Mazumdar, M.; Kohli-Seth, R.; Levin, M.A.; Timsina, P.; Kia, A. Using machine learning to predict ICU transfer in hospitalized COVID-19 patients. J. Clin. Med. 2020, 9, 1668. [Google Scholar] [CrossRef]
  4. Wu, G.; Yang, P.; Xie, Y.; Woodruff, H.C.; Rao, X.; Guiot, J.; Frix, A.-N.; Louis, R.; Moutschen, M.; Li, J.; et al. Development of a clinical decision support system for severity risk prediction and triage of COVID-19 patients at hospital admission: An international multicentre study. Eur. Respir. J. 2020, 56, 2001104. [Google Scholar] [CrossRef] [PubMed]
  5. Chen, N.; Zhou, M.; Dong, X.; Qu, J.; Gong, F.; Han, Y.; Qiu, Y.; Wang, J.; Liu, Y.; Wei, Y.; et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: A descriptive study. Lancet 2020, 395, 507–513. [Google Scholar] [CrossRef]
  6. White, D.B.; Lo, B. A framework for rationing ventilators and critical care beds during the COVID-19 pandemic. JAMA 2020, 323, 1773–1774. [Google Scholar] [CrossRef] [PubMed]
  7. Kim, H.-J.; Han, D.; Kim, J.-H.; Kim, D.; Ha, B.; Seog, W.; Lee, Y.-K.; Lim, D.; Hong, S.O.; Park, M.-J.; et al. An easy-to-use machine learning model to predict the prognosis of patients with COVID-19: Retrospective cohort study. J. Med. Internet Res. 2020, 22, e24225. [Google Scholar] [CrossRef]
  8. Chen, R.; Chen, J.; Yang, S.; Luo, S.; Xiao, Z.; Lu, L.; Liang, B.; Liu, S.; Shi, H.; Xu, J. Prediction of prognosis in COVID-19 patients using machine learning: A systematic review and meta-analysis. Int. J. Med. Inform. 2023, 177, 105151. [Google Scholar] [CrossRef]
  9. Hou, W.; Zhao, Z.; Chen, A.; Li, H.; Duong, T.Q. Machining learning predicts the need for escalated care and mortality in COVID-19 patients from clinical variables. Int. J. Med. Sci. 2021, 18, 1739–1745. [Google Scholar] [CrossRef]
  10. Adamidi, E.S.; Mitsis, K.; Nikita, K.S. Artificial intelligence in clinical care amidst COVID-19 pandemic: A systematic review. Comput. Struct. Biotechnol. J. 2021, 19, 2833–2850. [Google Scholar] [CrossRef]
  11. Myers, L.C.; Parodi, S.M.; Escobar, G.J.; Liu, V.X. Characteristics of hospitalized adults with COVID-19 in an integrated health care system in California. JAMA 2020, 323, 2195–2198. [Google Scholar] [CrossRef]
  12. Zhou, Y.; He, Y.; Yang, H.; Yu, H.; Wang, T.; Chen, Z.; Yao, R.; Liang, Z. Development and validation a nomogram for predicting the risk of severe COVID-19: A multi-center study in Sichuan, China. PLoS ONE 2020, 15, e0233328. [Google Scholar] [CrossRef] [PubMed]
  13. Liang, W.; Liang, H.; Ou, L.; Chen, B.; Chen, A.; Li, C.; Li, Y.; Guan, W.; Sang, L.; Lu, J.; et al. Development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients with COVID-19. JAMA Intern. Med. 2020, 180, 1081–1089. [Google Scholar] [CrossRef] [PubMed]
  14. Burian, E.; Jungmann, F.; Kaissis, G.A.; Lohöfer, F.K.; Spinner, C.D.; Lahmer, T.; Treiber, M.; Dommasch, M.; Schneider, G.; Geisler, F.; et al. Intensive care risk estimation in COVID-19 pneumonia based on clinical and imaging parameters: Experiences from the Munich cohort. J. Clin. Med. 2020, 9, 1514. [Google Scholar] [CrossRef] [PubMed]
  15. Patel, D.; Kher, V.; Desai, B.; Lei, X.; Cen, S.; Nanda, N.; Gholamrezanezhad, A.; Duddalwar, V.; Varghese, B.; A Oberai, A. Machine learning based predictors for COVID-19 disease severity. Sci. Rep. 2021, 11, 4673. [Google Scholar] [CrossRef]
  16. Statsenko, Y.; Al Zahmi, F.; Habuza, T.; Gorkom, K.N.-V.; Zaki, N. Prediction of COVID-19 severity using laboratory findings on admission: Informative values, thresholds, ML model performance. BMJ Open 2021, 11, e044500.
  17. Bolourani, S.; Brenner, M.; Wang, P.; McGinn, T.; Hirsch, J.S.; Barnaby, D.; Zanos, T.P. A machine learning prediction model of respiratory failure within 48 hours of patient admission for COVID-19: Model development and validation. J. Med. Internet Res. 2021, 23, e24246.
  18. Wendland, P.; Schmitt, V.; Zimmermann, J.; Häger, L.; Göpel, S.; Schenkel-Häger, C.; Kschischo, M. Machine learning models for predicting severe COVID-19 outcomes in hospitals. Inform. Med. Unlocked 2023, 37, 101188.
  19. Mauer, E.; Lee, J.; Choi, J.; Zhang, H.; Hoffman, K.L.; Easthausen, I.J.; Rajan, M.; Weiner, M.G.; Kaushal, R.; Safford, M.M.; et al. A predictive model of clinical deterioration among hospitalized COVID-19 patients by harnessing hospital course trajectories. J. Biomed. Inform. 2021, 118, 103794.
  20. Douville, N.J.; Douville, C.B.; Mentz, G.; Mathis, M.R.; Pancaro, C.; Tremper, K.K.; Engoren, M. Clinically applicable approach for predicting mechanical ventilation in patients with COVID-19. Br. J. Anaesth. 2021, 126, 578–589.
  21. Park, H.; Choi, C.-M.; Kim, S.-H.; Kim, S.H.; Kim, D.K.; Jeong, J.B. In-hospital real-time prediction of COVID-19 severity regardless of disease phase using electronic health records. PLoS ONE 2024, 19, e0294362.
  22. Subudhi, S.; Verma, A.; Patel, A.B.; Hardin, C.C.; Khandekar, M.J.; Lee, H.; McEvoy, D.; Stylianopoulos, T.; Munn, L.L.; Dutta, S.; et al. Comparing machine learning algorithms for predicting ICU admission and mortality in COVID-19. NPJ Digit. Med. 2021, 4, 87.
  23. Assaf, D.; Gutman, Y.; Neuman, Y.; Segal, G.; Amit, S.; Gefen-Halevi, S.; Shilo, N.; Epstein, A.; Mor-Cohen, R.; Biber, A.; et al. Utilization of machine-learning models to accurately predict the risk for critical COVID-19. Intern. Emerg. Med. 2020, 15, 1435–1443.
  24. Li, X.; Ge, P.; Zhu, J.; Li, H.; Graham, J.; Singer, A.; Richman, P.S.; Duong, T.Q. Deep learning prediction of likelihood of ICU admission and mortality in COVID-19 patients using clinical variables. PeerJ 2020, 8, e10337.
  25. Jimenez-Solem, E.; Petersen, T.S.; Hansen, C.; Hansen, C.; Lioma, C.; Igel, C.; Boomsma, W.; Krause, O.; Lorenzen, S.; Selvan, R.; et al. Developing and validating COVID-19 adverse outcome risk prediction models from a bi-national European cohort of 5594 patients. Sci. Rep. 2021, 11, 3246.
Figure 1. Design of the study.
Figure 2. Importance of the variables (training dataset).
Figure 3. Receiver operator characteristic curve (test dataset).
Figure 4. SHAP interpretation (test dataset).
Table 1. Patient characteristics.
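| Variable | Total | % | 2020 Training | % | 2021 Test | % | Chi2, t-Student or Comparison of Two Population Means | p Value |
|---|---|---|---|---|---|---|---|---|
| Age (categorized) | | | | | | | | |
| 0–20 | 19 | 0.90 | 15 | 1.18 | 4 | 0.73 | | |
| 21–40 | 124 | 5.85 | 90 | 7.08 | 22 | 4.00 | | |
| 41–60 | 583 | 27.50 | 361 | 28.38 | 138 | 25.09 | | |
| 61–75 | 832 | 39.24 | 504 | 39.62 | 228 | 41.45 | | |
| 76–85 | 427 | 20.14 | 239 | 18.79 | 116 | 21.09 | | |
| 86–150 | 135 | 6.37 | 63 | 4.95 | 42 | 7.64 | | |
| Total | 2120 | 100.00 | 1272 | 100.00 | 550 | 100.00 | 14.34 | 0.0136 * |
| Sex | | | | | | | | |
| Male | 1220 | 57.55 | 741 | 58.26 | 316 | 57.45 | | |
| Female | 900 | 42.45 | 531 | 41.74 | 234 | 42.55 | 0.0854 | 0.7702 |
| Total | 2120 | 100.00 | 1272 | 100.00 | 550 | 100.00 | | |
| Place of birth | | | | | | | | |
| Spain | 1230 | 58.02 | 855 | 67.22 | 375 | 68.1 | | |
| Outside Spain | 592 | 27.92 | 417 | 32.78 | 175 | 31.81 | | |
| Total | 2120 | 100.00 | 1272 | 100.00 | 550 | 100.00 | 0.1219 | 0.7269 |
| Emergency hours | | | | | | | | |
| 1st Qu | | | 5.00 | | 2.00 | | | |
| Mean (SD) | | | 13.19 (10.85) | | 8.68 (7.83) | | −41.065 (−43.73; −39.74) | <0.0001 * |
| 3rd Qu | | | 21.00 | | 12.00 | | | |
| Hours of hospitalization | | | | | | | | |
| 1st Qu | | | 24.00 | | 24.00 | | | |
| Mean (SD) | | | 207.86 (146.29) | | 202.00 (166.16) | | 0.751 (−9.42; 21.12) | 0.4523 |
| 3rd Qu | | | 261.2 | | 258.00 | | | |
| Type of oxygen therapy | | | | | | | | |
| No oxygen | 682 | 32.17 | 452 | 35.53 | 141 | 25.64 | | |
| Nasal prong | 912 | 43.02 | 535 | 42.07 | 244 | 44.36 | | |
| Venturi oxygen mask | 231 | 10.90 | 116 | 9.12 | 63 | 11.45 | | |
| Reservoir | 193 | 9.10 | 120 | 9.43 | 64 | 11.64 | | |
| High flow | 62 | 2.92 | 22 | 1.73 | 28 | 5.09 | | |
| Non-invasive mechanical ventilation | 40 | 1.89 | 27 | 2.12 | 10 | 1.82 | | |
| Total | 2120 | 100.00 | 1272 | 100.00 | 550 | 100.00 | 31.995 | <0.0001 * |
| Number of antibiotics | | | | | | | | |
| 0 | 1256 | 59.26 | 704 | 55.36 | 413 | 75.10 | | |
| 1 | 771 | 36.37 | 539 | 42.37 | 129 | 23.45 | | |
| 2 | 76 | 3.57 | 27 | 2.12 | 8 | 1.45 | | |
| 3 | 9 | 0.42 | 2 | 0.15 | 0 | 0.00 | | |
| 4 | 6 | 0.28 | 0 | 0.00 | 0 | 0.00 | | |
| 5 | 1 | 0.05 | 0 | 0.00 | 0 | 0.00 | | |
| 8 | 1 | 0.05 | 0 | 0.00 | 0 | 0.00 | | |
| Total | 2120 | 100.00 | 1272 | 100.00 | 550 | 100.00 | 63.664 | <0.0001 * |
| Hours of anticoagulant treatment | | | | | | | | |
| 1st Qu | | | 25.00 | | 13.50 | | | |
| Mean (SD) | | | 66.75 (47.82) | | 63.34 (60.16) | | | |
| 3rd Qu | | | 103.00 | | 92.00 | | 1.2814 (−1.79; 8.58) | 0.2000 |
| Hours of corticosteroid treatment | | | | | | | | |
| 1st Qu | | | 0.00 | | 0.00 | | | |
| Mean (SD) | | | 27.12 (42.83) | | 51.53 (47.57) | | | |
| 3rd Qu | | | 55.00 | | 80.00 | | −10.789 (−28.84; −19.97) | <0.0001 * |
| Systolic blood pressure | | | | | | | | |
| 1st Qu | | | 110.00 | | 115.00 | | | |
| Mean (SD) | | | 124.50 (16.47) | | 126.20 (16.17) | | | |
| 3rd Qu | | | 135.00 | | 140.00 | | −1.9996 (−3.31; −0.03) | 0.0457 * |
| Heart rate | | | | | | | | |
| 1st Qu | | | 67.00 | | 66.00 | | | |
| Mean (SD) | | | 77.23 (12.74) | | 75.44 (13.26) | | | |
| 3rd Qu | | | 86.00 | | 82.00 | | 2.7197 (0.50; 3.08) | 0.0066 * |
| Temperature | | | | | | | | |
| 1st Qu | | | 36.00 | | 35.00 | | | |
| Mean (SD) | | | 36.03 (0.75) | | 35.80 (0.74) | | | |
| 3rd Qu | | | 39.00 | | 36.00 | | 5.1661 (0.13; 0.28) | <0.0001 * |
| Urea value | | | | | | | | |
| 1st Qu | | | 29.00 | | 36.00 | | | |
| Mean (SD) | | | 44.83 (27.47) | | 54.97 (32.30) | | | |
| 3rd Qu | | | 63.00 | | 63.00 | | −6.8451 (−13.04; −7.23) | <0.0001 * |
| Bilirubin value | | | | | | | | |
| 1st Qu | | | 0.40 | | 0.30 | | | |
| Mean (SD) | | | 0.63 (0.49) | | 0.45 (0.35) | | | |
| 3rd Qu | | | 0.70 | | 0.50 | | 8.1922 (0.13; 0.22) | <0.0001 * |
| Creatine Kinase (CK) value | | | | | | | | |
| 1st Qu | | | 46.00 | | 63.00 | | | |
| Mean (SD) | | | 170.61 (230.82) | | 138.79 (145.72) | | 2.9863 (10.92; 52.73) | 0.0029 * |
| 3rd Qu | | | 160.0 | | 150.60 | | | |
| Creatinine value | | | | | | | | |
| 1st Qu | | | 0.69 | | 0.71 | | | |
| Mean (SD) | | | 0.98 (0.76) | | 0.99 (0.86) | | | |
| 3rd Qu | | | 1.03 | | 1.01 | | −0.2779 (−0.09; 0.07) | 0.7917 |
| D-dimer value | | | | | | | | |
| 1st Qu | | | 425 | | 484 | | | |
| Mean (SD) | | | 1621 (3833.99) | | 1920 (4688) | | −1.0543 (−0.15; 0.04) | 0.292 |
| 3rd Qu | | | 1621 | | 1920 | | | |
| Ferritin value | | | | | | | | |
| 1st Qu | | | 258 | | 304 | | | |
| Mean (SD) | | | 887.8 (1041.45) | | 831.89 (797.67) | | | |
| 3rd Qu | | | 1197 | | 1014 | | 0.0464 (−0.09; 0.10) | 0.963 |
| Lactate dehydrogenase (LDH) value | | | | | | | | |
| 1st Qu | | | 407 | | 454 | | | |
| Mean (SD) | | | 565.6 (286.26) | | 623.32 (353.85) | | −5.1556 (−0.13; −0.06) | |
| 3rd Qu | | | 633 | | 687 | | | <0.0001 * |
| Leukocytes value | | | | | | | | |
| 1st Qu | | | 4.99 | | 5.85 | | | |
| Mean (SD) | | | 7.31 (4.22) | | 8.67 (4.01) | | | |
| 3rd Qu | | | 8.71 | | 10.51 | | −6.5963 (−78; −0.96) | <0.0001 * |
| Partial pressure CO2 (PCO2) | | | | | | | | |
| 1st Qu | | | 33.10 | | 35.20 | | | |
| Mean (SD) | | | 41.30 (8.60) | | 41.53 (9.33) | | | |
| 3rd Qu | | | 45.55 | | 45.45 | | −0.496 (−1.14; 0.68) | 0.62 |
| Partial pressure O2 (PO2) | | | | | | | | |
| 1st Qu | | | 38.38 | | 55.95 | | | |
| Mean (SD) | | | 5976 (26.77) | | 63.59 (12.79) | | | |
| 3rd Qu | | | 72.20 | | 69.05 | | −4.1106 (−8.47; −2.99) | <0.0001 * |
| Platelet value | | | | | | | | |
| 1st Qu | | | 182 | | 176 | | | |
| Mean (SD) | | | 258 (108.97) | | 254 (110.66) | | | |
| 3rd Qu | | | 312 | | 323 | | 0.9963 (−0.02; 0.07) | 0.3194 |
| Troponin value | | | | | | | | |
| 1st Qu | | | 0.017 | | 0.008 | | | |
| Mean (SD) | | | 0.03 (0.07) | | 0.05 (0.073) | | | |
| 3rd Qu | | | 0.03 | | 0.046 | | −4.0135 (−0.34; −0.12) | <0.0001 * |
| Aspartate aminotransferase (AST) value | | | | | | | | |
| 1st Qu | | | 27 | | 27 | | | |
| Mean (SD) | | | 46 (33.92) | | 44 (28.71) | | 1.1084 (−1.32; 4.75) | 0.2679 |
| 3rd Qu | | | 54 | | 50 | | | |
| Activated partial thromboplastin (aPTT) value | | | | | | | | |
| 1st Qu | | | 24.40 | | 22.30 | | | |
| Mean (SD) | | | 29.94 (7.50) | | 26.74 (9.69) | | | |
| 3rd Qu | | | 34.40 | | 30.94 | | 8.2431 (2.14; 3.47) | <0.0001 * |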
Note: * Statistical difference with a p-value less than 0.05.
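As an illustration of the categorical comparisons in Table 1, the following is a minimal sketch of a Pearson chi-square test on the sex counts of the training and test cohorts. The function name `chi_square_2x2` is hypothetical, and the sketch assumes no continuity correction. The authors' software and test options are not stated here, so the result is not expected to reproduce the tabulated statistic exactly.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Sex counts from Table 1: rows = male, female; columns = 2020 training, 2021 test.
sex_counts = [[741, 316], [531, 234]]
chi2, p_value = chi_square_2x2(sex_counts)
print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}")
```

A p-value above 0.05 here is in line with the table's conclusion that the sex distribution does not differ significantly between the two cohorts.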
Table 2. Performance of the different predictive classification models.
| Model | PPV (95% CI) | NPV (95% CI) | Accuracy (95% CI) | AUC (95% CI) | Specificity (95% CI) | Sensitivity (95% CI) | F1-S | Youden Index |
|---|---|---|---|---|---|---|---|---|
| LightGBM | 0.93 (0.85–1.0) | 0.99 (0.98–1.0) | 0.98 (0.97–1.0) | 0.95 (0.93–0.97) | 0.99 (0.98–1.0) | 0.91 (0.82–0.99) | 0.94 | 0.91 |
| XGBoost | 0.93 (0.85–1.0) | 0.99 (0.98–1.0) | 0.98 (0.97–1.0) | 0.94 (0.91–0.97) | 0.99 (0.98–1.0) | 0.91 (0.82–0.99) | 0.91 | 0.91 |
| Logistic regression | 0.85 (0.75–0.96) | 0.97 (0.96–0.99) | 0.98 (0.97–1.0) | 0.92 (0.89–0.95) | 0.98 (0.97–1.0) | 0.80 (0.68–0.91) | 0.79 | 0.81 |
| Random Forest | 0.95 (0.88–1.0) | 0.98 (0.96–1.0) | 0.96 (0.94–0.98) | 0.90 (0.87–0.93) | 0.99 (0.98–1.0) | 0.84 (0.73–0.95) | 0.88 | 0.93 |
Note: PPV (positive predictive value), NPV (negative predictive value), accuracy, AUC (area under the curve), F1-S (F1-score).
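For readers less familiar with the metrics reported in Table 2, the sketch below shows how they follow from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration. They are not taken from the study data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    youden = sensitivity + specificity - 1            # Youden index (J statistic)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "youden": youden,
    }


# Hypothetical counts (NOT from the study): 50 ICU admissions among 250 patients.
metrics = classification_metrics(tp=45, fp=10, tn=190, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The area under the curve is not included because it depends on the ranking of predicted probabilities across all thresholds, not on a single confusion matrix.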
Table 3. Results of the models using training dataset and test dataset.
| Model | PPV (95% CI) | NPV (95% CI) | Accuracy (95% CI) | AUC (95% CI) | Specificity (95% CI) | Sensitivity (95% CI) | F1-S | Youden Index |
|---|---|---|---|---|---|---|---|---|
| LightGBM (Training) | 0.93 (0.85–1.00) | 0.99 (0.97–1.00) | 0.98 (0.97–1.00) | 0.95 (0.93–1.00) | 0.99 (0.98–1.00) | 0.91 (0.82–0.99) | 0.94 | 0.91 |
| LightGBM (Test) | 0.95 (0.90–1.00) | 0.99 (0.98–1.00) | 0.98 (0.97–0.99) | 1.00 (0.99–1.00) | 0.99 (0.97–1.00) | 0.92 (0.86–0.98) | 0.93 | 0.93 |
Note: PPV (Positive Predictive Value), NPV (Negative Predictive Value), AUC (Area Under the Curve), F1-S (F1 Score).
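The 95% confidence intervals in Table 3 accompany proportions such as sensitivity and specificity. One common way to obtain such an interval is the normal-approximation (Wald) interval, sketched below with the bounds clipped to the range 0 to 1. This is an assumption for illustration: the interval method used by the authors is not stated here, and the function name `wald_interval` and the counts are hypothetical.

```python
import math


def wald_interval(successes, total, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z=1.96 corresponds to a two-sided 95% interval. Bounds are clipped to [0, 1].
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical example (NOT study data): 46 of 50 true ICU admissions detected.
low, high = wald_interval(successes=46, total=50)
print(f"sensitivity = {46 / 50:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The Wald interval is known to behave poorly when the proportion is close to 0 or 1 or when the sample is small. Alternatives such as the Wilson score interval are often preferred in those cases.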
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ladios-Martin, M.; Cabañero-Martínez, M.J.; Fernández-de-Maya, J.; Ballesta-López, F.-J.; Garcia-Garcia, I.; Belso-Garzas, A.; Aznar-Zamora, F.-M.; Cabrero-García, J. Calculating the Risk of Admission to Intensive Care Units in COVID-19 Patients Using Machine Learning. J. Clin. Med. 2025, 14, 4205. https://doi.org/10.3390/jcm14124205

