Article

Multivariable Risk Modelling and Survival Analysis with Machine Learning in SARS-CoV-2 Infection

1 Nuclear Medicine Unit, Ospedale Civile Sant’Andrea, Via Vittorio Veneto 170, 19124 La Spezia, Italy
2 Oncology Unit, Ospedale Civile Sant’Andrea, 19124 La Spezia, Italy
3 Radiology Unit, Ospedale Civile Sant’Andrea, 19124 La Spezia, Italy
4 Infectious Diseases Unit, Ospedale Civile Sant’Andrea, 19124 La Spezia, Italy
5 Internal Medicine Unit, Ospedale San Bartolomeo, 19138 Sarzana, Italy
6 Pneumology Unit, Ospedale Civile Sant’Andrea, 19124 La Spezia, Italy
7 Intensive Care Unit, Ospedale Civile Sant’Andrea, 19124 La Spezia, Italy
8 Emergency Department, Ospedale Civile Sant’Andrea, 19124 La Spezia, Italy
9 Emergency Department, Ospedale San Bartolomeo, 19138 Sarzana, Italy
* Author to whom correspondence should be addressed.
J. Clin. Med. 2023, 12(22), 7164; https://doi.org/10.3390/jcm12227164
Submission received: 10 October 2023 / Revised: 3 November 2023 / Accepted: 15 November 2023 / Published: 18 November 2023
(This article belongs to the Section Infectious Diseases)

Abstract

Aim: To evaluate the performance of a machine learning model based on demographic variables, blood tests, pre-existing comorbidities, and computed tomography (CT)-based radiomic features to predict critical outcome in patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. Methods: We retrospectively enrolled 694 SARS-CoV-2-positive patients. Clinical and demographic data were extracted from clinical records. Radiomic data were extracted from CT. Patients were randomized to the training (80%, n = 556) or test (20%, n = 138) dataset. The training set was used to define the association between severity of disease and comorbidities, laboratory tests, demographic, and CT-based radiomic variables, and to implement a risk-prediction model. The model was evaluated using the C statistic and Brier scores. The test set was used to assess model prediction performance. Results: Patients who died (n = 157) were predominantly male (66%) and over the age of 50, with median [range] C-reactive protein (CRP) = 5 [1, 37] mg/dL, lactate dehydrogenase (LDH) = 494 [141, 3631] U/L, and D-dimer = 6006 [168, 152,015] ng/mL. Surviving patients (n = 537) had median [range] CRP = 3 [0, 27] mg/dL, LDH = 484 [78, 3745] U/L, and D-dimer = 1133 [96, 55,660] ng/mL. The strongest risk factors were D-dimer, age, and cardiovascular disease. The model built from the variables selected by LASSO Cox regression classified 90% of non-survivors as high-risk individuals in the testing dataset. In this sample, the estimated median survival in the high-risk group was 9 days (95% CI: 9–37), while median survival was not reached in the low-risk group (p < 0.001). Conclusions: A machine learning model based on combined data available in the first days of hospitalization (demographics, CT radiomics, comorbidities, and blood biomarkers) can identify SARS-CoV-2 patients at risk of serious illness and death.

1. Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had a significant economic and global health impact and continues to be a major concern as new variants are identified [1,2,3]. Moreover, global environmental changes could increase the probability of future pandemics, necessitating the development and application of accurate and innovative tools for risk stratification [4].
The clinical disease phenotype for SARS-CoV-2 is extremely heterogeneous. The infection can proceed asymptomatically or evolve with differing intensities up to the severe disease that is associated with a low survival rate [5,6].
The variability of clinical manifestations makes outcome prediction particularly difficult. This can be a major issue when the volume of patients is high and resources are limited, as occurs during a pandemic. Therefore, the identification of major risk factors and the implementation of an outcome-prediction model could support treatment planning and optimal resource allocation.
To date, several studies have reported on the association between the mortality rate and the subject’s age, pre-existing comorbidities, some blood biomarkers, and the degree of lung involvement (mainly based on computed tomography (CT) scans) [5,7,8].
The data published so far have highlighted greater frailty in older adults, who seem to have a higher rate of severe disease and mortality than young patients [9].
There is broad agreement in the literature that comorbidities are present in approximately half of patients with SARS-CoV-2. According to Richardson et al., coronary heart disease, hypertension, diabetes, and chronic obstructive lung disease are significantly associated with increased mortality [5,10].
Several blood biomarkers have been associated with SARS-CoV-2. High D-dimer levels have been reported as predictors of mortality in hospitalized patients [11]. Similarly, some blood biomarkers of inflammation, such as C-reactive protein (CRP), and cell damage, such as lactate dehydrogenase (LDH), appear to be significantly increased in the most severe forms of the disease [12].
The degree of lung involvement, as primarily assessed using CT, is a potential predictor of outcome [8,13,14,15,16,17]. Data of potential clinical interest contained in the medical images can be read by expert radiologists or extracted using dedicated software; the latter approach is known as radiomics. Recently, several studies have proposed that radiomics and deep learning methods can be used to distinguish normal lung parenchyma from that affected by SARS-CoV-2 pneumonia [18] or to predict patient diagnosis [19] and outcome [20]. These approaches offer high diagnostic performance, as evidenced by the area under the receiver-operating characteristic curve (AUC ≥ 89%) [20,21].
Radiomics data were extracted with Pyradiomics, which is an open-source software implemented in Python 3.6 able to extract radiomics features from two- or three-dimensional medical images [22]. This platform has been widely used by several researchers to evaluate the predictive value of radiomics in several diseases including SARS-CoV-2 pneumonia [23,24,25].
The current study aims to implement and validate a mortality-risk-prediction model for SARS-CoV-2 based on demographic data, blood biomarkers, baseline comorbidities, and radiomic CT data using machine learning methods.

2. Materials and Methods

2.1. Population

This retrospective study was based on clinical records from patients admitted to hospital services through the emergency department. These patients presented with fever, sore throat, dry cough, diarrhoea, loss of taste or smell, chest pain, and/or shortness of breath or breathing difficulty between 1 March 2020 and 31 December 2020. The regional review committee granted ethical approval (CER Liguria: 553/2020/10988) for this study, and written informed consent was waived. We deidentified the data to avoid any potential breach of patient privacy and processed them for research purposes from 1 April 2021 to 31 December 2021. We retrieved CT images from the hospital’s picture archiving and communication system (PACS). Inclusion criteria were (i) a positive RT-PCR assay for COVID-19 and (ii) at least one non-contrast chest CT. For patients with multiple RT-PCR tests or CT scans, we used the test closest to the time of initial presentation to the emergency department. Exclusion criteria were (i) no baseline CT, (ii) unconfirmed diagnosis of SARS-CoV-2 pneumonia, and (iii) CT images degraded by motion artifact. The study cohort consisted of 694 subjects with an RT-PCR-confirmed diagnosis of COVID-19 pneumonia, comprising 447 males and 247 females.
Overall survival (OS) was defined as the time from the first hospital presentation to the date of death or censoring. Patients who were alive were censored at their last follow-up, up to 31 December 2020. We used hospital records to determine the status of the patients.

2.2. CT-Acquisition Parameters and Interpretation

All patients underwent non-enhanced chest CT. Images were acquired in supine position on Aquilion (Toshiba Medical Systems, Tokyo, Japan) and Optima CT660 (GE Healthcare, Milwaukee, WI, USA) multi-detector CT scanners (120 kVp; 120–440 mAs; thickness: 5–7 mm; slice interval: 5 mm; rotation speed: 0.5–1.0 s; helical pitch 1.0875:1 or 1.375:1). Images were reconstructed at 512 × 512 pixels, with a section width of 0.625 mm. CT images were reviewed in S. Andrea Hospital’s imaging laboratory by two board-certified radiologists with approximately 5 years of experience in chest-CT reading. CT images were classified according to the criteria proposed by the Radiological Society of North America (RSNA) into two classes: “typical” and “atypical” findings, as defined by Simpson and colleagues [26].

2.3. Image Analysis and Texture Features Extraction

Lung images were segmented using the 3D Slicer software v4.11 [27]. Two certified radiologists reviewed all segmented images to rule out segmentation errors.
In compliance with the Imaging Biomarker Standardisation Initiative (IBSI) protocols (https://arxiv.org/abs/1612.07003 accessed on 24 July 2023), we applied intensity discretisation and spatial resampling before feature extraction. Images were discretised with a 64-bin width and resampled to 2 × 2 × 2 mm3 voxel size with B-spline interpolation. Before analysis, we applied the ComBat harmonisation method [28] to the extracted features to remove batch effects between images from different scanners, using the “neuroComBat 1.0.13” package in R. We estimated the ComBat harmonisation parameters on the training dataset alone and subsequently applied these estimates to the test set. Radiomic features were extracted with the open-source software package Pyradiomics (v3.1.0; https://github.com/Radiomics/pyradiomics/releases, accessed on 9 October 2023), and a total of 43 features were extracted from each CT lung image [22]. These comprised three first-order statistical features, nine gray-level co-occurrence matrix (GLCM) features, thirteen gray-level run-length matrix (GLRLM) features, thirteen gray-level size-zone matrix (GLSZM) features, and five neighbouring gray-tone difference matrix (NGTDM) features.
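As an illustration of the intensity-discretisation step, the sketch below implements the fixed-bin-number scheme described in the IBSI reference manual. It assumes that "64-bin" refers to 64 gray levels; the text could also be read as a fixed bin width, in which case a different rule would apply. The function name and the toy values are hypothetical, and this is not the authors' Pyradiomics configuration.

```python
import math


def discretise_fixed_bin_number(intensities, n_bins=64):
    """Map raw intensities to integer gray levels 1..n_bins (fixed bin number)."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # A constant region carries no texture; put every voxel in the first bin.
        return [1 for _ in intensities]
    levels = []
    for x in intensities:
        level = math.floor(n_bins * (x - lo) / (hi - lo)) + 1
        # The maximum intensity would land in bin n_bins + 1, so clamp it.
        levels.append(min(level, n_bins))
    return levels


# Toy example with 4 bins so the mapping is easy to follow.
toy_levels = discretise_fixed_bin_number([0.0, 1.0, 2.0, 3.0, 4.0], n_bins=4)
```

Texture matrices such as GLCM or GLRLM are then computed on the discretised gray levels rather than on the raw intensities.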

2.4. Feature Selection and Classification

Least absolute shrinkage and selection operator (LASSO) regularized Cox regression [29] was used to build the model for predicting overall patient survival from the demographic, laboratory, and radiomic features [30]. LASSO was chosen for its ability to reduce the number of predictors by retaining only those with the highest predictive performance [31]. Moreover, LASSO regression is among the most commonly used techniques for radiomic-feature selection [32].
LASSO regularized Cox regression was used to choose the most relevant predictors under the Cox proportional-hazards model, using a penalty term in the estimation of the partial maximum likelihood. This penalty shrinks the coefficients, forcing those that affect the model least close to or exactly to zero. The strength of the penalty is tuned using a constant called lambda (λ). The best lambda was defined as the lambda that minimizes the 10-fold cross-validation prediction error. We performed regularised Cox regression with the cv.glmnet function of the glmnet package 4.1-4 [33] under R 4.1.3 (http://www.r-project.org, accessed on 9 October 2023).
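For reference, the penalised estimation problem can be written in its standard textbook form as below. The notation is introduced here for illustration and is not the authors': x_i is the predictor vector of patient i, δ_i is the event indicator (1 = death), R(t_i) is the set of patients still at risk at time t_i, and p is the number of candidate predictors. The expression assumes no tied event times, and glmnet applies its own scaling to the likelihood term.

```latex
\hat{\beta}(\lambda) = \arg\max_{\beta}
\left\{
  \sum_{i:\,\delta_i = 1}
  \left[
    x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)
  \right]
  - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert
\right\}
```

Larger values of λ set more coefficients exactly to zero, which is what makes the method usable for variable selection.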
The cv.glmnet function provides the cross-validated mean C-index and its standard error for each value of lambda. The function also reports the value of lambda giving the minimum mean cross-validated error (lambda.min) and the largest value of lambda whose cross-validated error lies within 1 standard error of that minimum (lambda.1se), which yields the most regularised model [30]. The C-statistic was used to assess the predictive performance of the LASSO–Cox regression model.
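The two selection rules can be sketched as follows when the cross-validation measure is the C-index, where higher is better. This is a minimal re-implementation for illustration, not glmnet code; the function name and the numbers in the example are hypothetical.

```python
def select_lambdas(lambdas, mean_c_index, se_c_index):
    """Return (lambda_min, lambda_1se) from cross-validated C-index values.

    lambda_min: the lambda with the best (highest) mean C-index.
    lambda_1se: the largest lambda whose mean C-index is within one
                standard error of the best value (most regularised model).
    """
    c_best = max(mean_c_index)
    # If several lambdas tie for the best score, prefer the largest one.
    lambda_min = max(
        lam for lam, c in zip(lambdas, mean_c_index) if c == c_best
    )
    se_at_best = se_c_index[lambdas.index(lambda_min)]
    threshold = c_best - se_at_best
    lambda_1se = max(
        lam for lam, c in zip(lambdas, mean_c_index) if c >= threshold
    )
    return lambda_min, lambda_1se


# Hypothetical cross-validation path (values invented for the example).
path_lambdas = [0.20, 0.10, 0.05, 0.02, 0.01]
path_c = [0.80, 0.86, 0.875, 0.88, 0.87]
path_se = [0.020, 0.015, 0.014, 0.012, 0.013]
lam_min, lam_1se = select_lambdas(path_lambdas, path_c, path_se)
```

Choosing lambda.1se rather than lambda.min trades a small loss in cross-validated performance for a sparser model.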

2.5. Model Design

The initial candidate set submitted to LASSO comprised 57 predictors. Among these were two demographic factors (age and gender), three laboratory tests (C-reactive protein, lactate dehydrogenase, and D-dimer), nine comorbidities (cancer, blood cancer, diabetes, obesity, haematological disease, cardiovascular disease, cerebrovascular disease, and chronic obstructive pulmonary disease), and forty-three radiomic features. We then fitted the predictive model by multivariable Cox regression using the demographic, metabolic, and radiomic characteristics that survived the LASSO analysis.
To evaluate the model’s performance on new data not used for training, we randomly divided the cohort so that 80% of the sample formed the training set and 20% the test set [34]. We subsequently evaluated the performance by applying the parameters estimated on the training data to the test data (Figure 1).
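A random 80/20 partition of this kind can be sketched as follows. The seed and the rounding rule (rounding the training size up) are assumptions made here so that the sketch reproduces the reported group sizes of 556 and 138; the authors' actual splitting procedure is not described in more detail.

```python
import math
import random


def split_train_test(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle the identifiers and cut them into training and test lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    n_train = math.ceil(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# 694 hypothetical patient identifiers, matching the cohort size.
train_ids, test_ids = split_train_test(range(694))
```

Any preprocessing that learns parameters from data (for example feature harmonisation or feature selection) should then be fitted on `train_ids` only.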

2.6. Model Validation and Calibration

Regression Modeling Strategies (rms v6.6) is an open-source R package containing a collection of functions for evaluating the performance of predictive models [35]. The model obtained from the LASSO–Cox regression was validated and calibrated with the validate and calibrate functions of the rms package. We used calibration to evaluate the performance of the prediction model by comparing the predicted with the observed probabilities. To reduce overfitting and quantify optimism, we internally validated the model by computing an optimism-corrected C-statistic from 1000 bootstrap resamples. The model was additionally validated on the test dataset. Model calibration and validation were based on C-index and Brier score metrics. After validation, we calculated each patient’s individual risk score using the ggrisk v1.3 package. Subjects were placed in high- and low-risk groups based on the median risk score.
To evaluate the ability of the risk score to stratify patients into clinically relevant classes, we used Kaplan–Meier to estimate the fraction of subjects who survived in high- and low-risk groups.
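The median split and the Kaplan–Meier estimate can be sketched in a few lines. This is an illustrative re-implementation under stated assumptions, not the R code used in the study: scores equal to the median are assigned to the low-risk group, and the median survival is taken as the first event time at which the estimated survival falls to 0.5 or below. All function names and data are hypothetical.

```python
from statistics import median


def assign_risk_groups(risk_scores):
    """Label each score 'high' if above the median score, else 'low'."""
    cutoff = median(risk_scores)
    return ["high" if s > cutoff else "low" for s in risk_scores]


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events: 1 = death observed, 0 = censored.
    """
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time with S(t) <= 0.5, or None if the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data: follow-up in days and event indicators (invented values).
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
toy_median = median_survival(toy_curve)
```

A group in which fewer than half of the patients die during follow-up returns `None`, which corresponds to a median survival that is "not reached".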

2.7. Statistical Analysis

We used R software (version 4.1.3, http://www.r-project.org, accessed on 9 October 2023) for data analysis and graphics. We tested continuous data using independent t-tests, with degrees of freedom adjusted for inequality of variance where appropriate. Wald’s test was used to evaluate the relative importance of each predictor with the outcome. This measure ranges from 0 to ∞ as the association of the predictors with outcome increases, allowing comparison of continuous and categorical variables [35,36].
We conducted LASSO Cox regression analysis using the glmnet package in R. The survival curves were generated using the Kaplan–Meier method and plotted with the ggsurvplot function. Validation plots were produced using the Regression Modeling Strategies (rms) package.
We used the C-statistic to evaluate the discrimination ability of the model. The C-statistic is defined as the proportion of comparable pairs of subjects in whom the rankings of predicted and observed survival times agree [35]. C-statistics of > 0.80, between 0.70 and 0.80, and ≤ 0.50 indicate good, acceptable, and low model discrimination ability, respectively.
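A minimal sketch of the C-statistic for right-censored data (Harrell's formulation) is given below. A pair is treated as comparable when the patient with the shorter follow-up time had an observed event; ties in predicted risk count as half-concordant. The function name and the example data are hypothetical, and packaged implementations handle tied times in more detail.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A higher risk score is expected to go with a shorter survival time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: 5 comparable pairs, 4 of them ranked correctly.
toy_c = concordance_index(
    times=[1, 2, 3, 4],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.3, 0.2, 0.4],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.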
The Brier score is defined as the average squared difference between actual events and predicted probabilities. The values range from 0 to 1, with 0 indicating perfect agreement and 1 indicating total disagreement between predicted and observed events [37].
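The definition translates directly into code. The sketch below ignores censoring, which is a simplification: for survival data the score is usually computed at a fixed time point with weights that account for censored patients. The function name and values are hypothetical.

```python
def brier_score(observed, predicted):
    """Mean squared difference between outcomes (0/1) and predicted probabilities."""
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    squared_errors = [(p - y) ** 2 for y, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Toy example: two deaths (1) and two survivors (0) with predicted death risks.
toy_brier = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
```

An uninformative model that always predicts 0.5 scores 0.25, which is a useful reference point when reading values such as those reported in the Results.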
We used chi-square analysis for categorical variables. We calculated the 95% confidence intervals (CIs) for sensitivity (SS), specificity (SP), odds ratio (OR), positive predictive value (PPV), and negative predictive value (NPV) to estimate how strongly the model-predicted diagnosis was associated with clinical outcome. Two-tailed p values of less than 0.05 were considered statistically significant.
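The classification metrics listed above all derive from a 2 × 2 table of predicted risk group against observed outcome. The sketch below shows the arithmetic with invented counts; it assumes all four cells are non-zero and uses the log-based (Woolf) interval for the odds ratio, which may differ from the interval method used by the authors.

```python
import math


def two_by_two_metrics(tp, fp, fn, tn, z=1.96):
    """Sensitivity, specificity, PPV, NPV and odds ratio with a 95% CI.

    tp: deaths classified high-risk     fp: survivors classified high-risk
    fn: deaths classified low-risk      tn: survivors classified low-risk
    """
    odds_ratio = (tp * tn) / (fp * fn)
    # Standard error of log(OR) for the Woolf confidence interval.
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    log_or = math.log(odds_ratio)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "odds_ratio": odds_ratio,
        "or_ci": (math.exp(log_or - z * se_log_or),
                  math.exp(log_or + z * se_log_or)),
    }


# Hypothetical counts, not taken from Table 3.
toy_metrics = two_by_two_metrics(tp=9, fp=10, fn=1, tn=20)
```

With small cell counts the interval around the odds ratio becomes wide, which is why it is reported alongside the point estimate.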

3. Results

We recruited a total of 694 patients and randomized them to include 80% (n = 556) in training and 20% (n = 138) in test datasets. Table 1 summarizes the patient characteristics. The median age was 64 years (age range: 20–107 years). The study sample consisted predominantly of males (64%). Most patients were residents of northeastern Italy. The median hospital stay was 11 days (range: 3–86 days). Patients had a median CRP of 3.00 mg/dL (range: 0.11–37.00 mg/dL) and a median LDH of 486 U/L (range: 78–3745 U/L). Patients had a median D-dimer of 1133 ng/mL (range: 96–152,015 ng/mL). Deceased patients were predominantly male (66%) and more than 50 years old. As expected, D-dimer, CRP, and LDH were significantly higher in non-survivors than in survivors (Table 1). Visual assessment of CT images according to RSNA guidelines [26] identified typical findings in 111 of 157 non-survivors and 299 of 537 survivors. Table 2 shows the impact of pre-existing comorbidities on mortality in patients with SARS-CoV-2. Cardiovascular and cerebrovascular diseases, cancer, haematological diseases, and chronic obstructive pulmonary disease all significantly increased the probability of death in the study sample.
Figure S1 (see Supplemental Material) shows the relevance of each predictor based on the Wald test. This relevance was obtained from the multivariable logistic regression used for modelling patient mortality. Predictors are sorted by decreasing importance, and only those with p ≤ 0.05 are shown. The most important predictor overall was D-dimer, which ranked above the other laboratory tests and the demographic variables used to define the prediction model. We also found important outcome predictors among textural features belonging to the GLOBAL, GLCM, GLSZM, GLRLM, and NGTDM families. Among the comorbidities, cardiovascular disease was the most significant predictor of mortality.
Ten of the fifty-seven candidate variables retained non-zero coefficients after LASSO regression and were thus included in the predictive model. The value of lambda giving a C-index within one standard error of the optimum was 0.044, corresponding to a C-index of 0.87 (standard error = 0.014) (Figure 2A,B). Selected variables included age, D-dimer, LDH, three groups of comorbidities, and four radiomic variables (Figure 2C).
Validation on the training dataset showed high agreement between the predicted and observed survival curves (Figure S2 in Supplemental Material). The unadjusted and bias-adjusted curves were similar and aligned with the dashed curve that represents the best possible relationship between the observed and predicted outcomes as estimated by the mean absolute error (MAE) of 0.01 (Figure S2, left panel; see Supplemental Material). The C-index and Brier scores were 0.872 and 0.0708, respectively. On the test dataset, the C-index, Brier score, and MAE estimated between the predicted and observed curves were 0.885, 0.056, and 0.03, respectively (Figure S2, right panel; see Supplemental Material).
We used the median individual risk score assessed using Cox regression as a cut-off point to classify patients into high- and low-risk groups in each dataset. In the training sample, median survival was 12 days (95% CI: 10–14) in high-risk patients and was not reached in low-risk patients (Figure 3A; HR = 6.59, 95% CI = 4.34–10.0, p < 0.001). In the test set, median survival was 9 days (95% CI: 6–37) in high-risk patients and was not reached in low-risk patients (Figure 3B; HR = 4.23, 95% CI = 1.93–9.26, p < 0.001).
Table 3 shows the comparison between the observed outcome and the estimated risk in the training and testing datasets. In the testing dataset, the prediction model classified 90% of non-survivors as high-risk (true positives) and 66% of survivors as low-risk (true negatives). The risk of mortality was significantly higher in the high-risk group than in the low-risk group (odds ratio = 13 [95% CI: 4–42], p < 0.0001).

4. Discussion

Prediction of disease severity and progression in SARS-CoV-2 patients is relevant because early intervention is significantly associated with reduced mortality [38,39]. In this study, we developed and validated a risk-scoring model based on demographics, laboratory tests, and radiomic features to predict the disease progression and survival of hospitalized patients with SARS-CoV-2.
We implemented the model with 10 out of 57 variables selected using LASSO Cox regression and the C-index metric. The proposed model is highly predictive, identifying 90% of deceased patients in the testing set as high-risk and 66% of surviving patients as low-risk. In addition, the estimated C-index of 0.885 summarizes how well the model-predicted risk describes the observed sequence of events. The risk-estimation model included age, laboratory tests (D-dimer and LDH), three groups of comorbidities, and four radiomic features.
To date, numerous studies have been conducted on CT-based radiomics features aimed at building diagnostic models to detect SARS-CoV-2 infection or predict hospital stay and outcome.
Shiri et al. demonstrated that a mixed model including CT-based radiomic features and clinical data could be used to predict survival in SARS-CoV-2 patients. Among the radiomics variables, HGLZE from GLSZM and RLNU from GLRLM showed the highest diagnostic performance (AUC = 0.73) [20]. Yang et al. reported a high diagnostic performance of radiomic variables in distinguishing SARS-CoV-2 patients from those with other pneumonias, with AUC values of 91.5%, 90.0%, 89.0%, and 87.6% for the GLCM, GLRLM, NGLDM, and GLZLM classes, respectively [40].
Radiomics variables significantly associated with outcome are able to describe the distribution of voxel intensities within the image included in a mask that defines the region of interest. The most significant predictors are those sensitive to changes in the distribution of voxel intensity associated with disease progression. In SARS-CoV-2 pneumonia, the extent of ground glass opacity is generally greater in patients who progress to more severe stages of the disease. In these cases, the voxel-intensity distribution is more uniform and leads to a different estimate of the texture parameters compared to less-involved patients.
The variables used to estimate the risk of developing critical illness due to SARS-CoV-2 infection are generally available in the early stages of hospitalization. Risk estimation in this phase could support clinicians in planning a treatment strategy by allowing them, in higher-risk cases, to allocate resources for more aggressive treatments or admit patients to intensive care units; equally, it would enable physicians to adopt a “watch and wait” approach to low-risk cases.
Previous studies have reported the impact of age on SARS-CoV-2 mortality. A meta-analysis demonstrated age’s decisive effect on mortality [9]. A 60% higher risk of mortality was reported in subjects aged >80 years [9]. As expected, the age of non-surviving patients in our study was significantly higher than that of survivors (median 80 years, 95% CI: 51–107 vs. 59 years, 95% CI: 20–94; p < 0.001). Age was an important predictor of disease outcome (Figure 1) and survived the LASSO regression. It thus contributed to the implemented prediction model.
D-dimer was associated with poor outcomes for patients with SARS-CoV-2, presumably due to the increased likelihood of their developing pulmonary embolisms when they had D-dimer levels above 2590 ng/mL [41]. In line with recently published papers [42,43], D-dimer was the variable most strongly associated with patient outcome in our sample (Figure S1), with median levels of 6006 vs. 1133 ng/mL for deceased patients and survivors, respectively.
Similarly, elevated lactate dehydrogenase (LDH) levels have been associated with worse outcomes in patients with viral infections [44,45]. Deceased patients in our study had significantly higher LDH levels than survivors (Table 1), and LDH was selected by LASSO regression for the survival-prediction model.
Reports in the literature have documented that chronic comorbidities are associated with an increased risk of poor prognosis and a fatal outcome associated with SARS-CoV-2 [10]. Similarly, in our model, pre-existing comorbidities (including cardiovascular and cerebrovascular diseases, cancer, haematological diseases, and chronic obstructive pulmonary disease) were significant predictors of severity of disease and death following SARS-CoV-2 infection. Among the comorbidities, cardiovascular disease was the strongest predictor of mortality in our study sample, with a 4.42-fold higher risk of poor prognosis, in line with the findings of several meta-analyses [46,47,48,49].
CT is the most widespread imaging modality to play a key role in the diagnosis and assessment of the prognosis of patients with SARS-CoV-2 [50]. However, CT findings (such as ground-glass opacities or consolidation) are not specific to SARS-CoV-2, as these can also be found in other diseases associated with a lower risk of death, such as seasonal influenza.
Innovative methods of quantitative image analysis (such as radiomics) can provide an operator-independent semi-quantitative approach by describing spatial and temporal information derived from images (CT, MRI, and PET/CT). Until now, radiomics have been applied in medical fields such as oncology and to specific disorders such as neurodegenerative disease [51,52,53]. Lately, it has also been used to support “digital biopsy”, a non-invasive tissue-characterization technique.
Previous studies have reported the potential use of CT radiomic features to better characterize pulmonary involvement in patients with SARS-CoV-2. Spatial information measured with radiomic features can be used to support differential diagnosis between COVID and non-COVID disease [54], as well as for modelling risk of death and predicting survival [24].
In our study, we selected four radiomic features (Global_Skewness, GLCM_Correlation, GLSZM_LZE, NGTDM_Busyness) to model a risk profile with significant discriminative capabilities for patient outcome. Indeed, selected variables were significantly associated with patient outcome in multivariate logistic regression (p < 0.001). These features contribute to risk modelling by providing quantitative information on lung CT-signal intensity and heterogeneity in SARS-CoV-2 patients.
A systematic review of existing prognostic models identified several that were designed to support diagnosis and predict mortality among patients hospitalized for SARS-CoV-2 [7]. In most of these studies, the predictive models combined CT images and/or clinical variables in different ways depending on the available data.
Only a few studies combined radiomics, demographics, comorbidities, and laboratory tests as potential predictor candidates. The main disadvantage of these studies is their small sample size, which exposes the results to a high risk of bias due to inappropriate evaluation of the predictive performance of the test dataset.
Our study included 694 patients with complete radiomic and clinical datasets. The predictors needed to calculate the risk of developing serious disease are usually available within the first few hours of hospital admission. Using these variables, the model can estimate the risk of mortality, identifying 90% of non-survivors in the study sample. The availability of this information could be useful for optimising treatment planning according to the estimated risk when patients are admitted to hospital.

5. Study Limitations

The major limitation of the study lies in the lack of external validation using a dataset obtained from another hospital. Although our validation was performed on a test set not used for training, to build a robust model and obtain reliable performance evaluation, it would be advisable to validate the model using data from different sources.
Our model is not available as a ready-to-use software package. The study was designed to define and validate a predictive risk model to be subsequently produced as a usable application in clinical practice. To this end, we used commonplace open-source statistical software. These packages facilitate the easy transfer of the method into clinical practice.
Our study lacks information on out-of-hospital mortality. Therefore, mortality may be underestimated due to the death of patients after discharge.
Future development of the results and findings of this study could include (i) validating the predictive model on an external dataset and (ii) re-evaluating the time-to-event analysis if new data become available. Furthermore, considering the availability of different machine learning algorithms used in different clinical settings including the SARS-CoV-2 pandemic, a useful future aim will be to compare the performance of predictive models based on different approaches.

6. Conclusions

We developed a predictive model of mortality in a sample of 694 SARS-CoV-2 patients using demographic data, CT radiomic features, comorbidities, and laboratory tests. We calibrated and validated the model by randomly splitting the sample into training and test datasets. We implemented the final model with a combination of 10 variables including age, D-dimer, LDH, pre-existing comorbidities such as cancer and cardiovascular and cerebrovascular diseases, and four radiomic features. The model was able to correctly identify 90% of non-survivors. Identifying high-risk individuals with predictors usually available within the first few hours of hospital admission could be useful in cases of widespread disease to enable more effective allocation of available resources.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/jcm12227164/s1, Figure S1: Predictors' impact ranking on mortality. Higher chi-squared estimates are suggestive of a greater association with clinical outcome. Labels on the Y axis show predictor name and p value; Figure S2: Calibration curves from regression model predictions by dataset.

Author Contributions

Methodology, A.C.; Validation, R.S.; Formal analysis, F.T.; Investigation, D.C., M.S. (Massimiliano Sivori), R.S. and T.S.; Data curation, E.G., A.M., M.B., M.S. (Maurizio Setti), C.S., A.B. and S.A.; Writing—original draft, A.C.; Writing—review & editing, F.T. and G.G.; Visualization, N.Y.; Supervision, A.C.; Project administration, N.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee CER Liguria (Approval code: 10988, Approval date: 1 March 2021).

Informed Consent Statement

Patient consent was waived due to the COVID-19 pandemic.

Data Availability Statement

Data not available due to privacy issues.

Acknowledgments

The authors would like to thank all study participants, including Manuele Sicuteri, head of the information and communication technology unit of the S. Andrea hospital, whose support in data retrieval and organisation was fundamental.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wu, Z.; McGoogan, J.M. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: Summary of a report of 72,314 cases from the Chinese center for disease control and prevention. JAMA 2020, 323, 1239–1242. [Google Scholar] [CrossRef]
  2. Lambrou, A.S.; Shirk, P.; Steele, M.K.; Paul, P.; Paden, C.R.; Cadwell, B.; Reese, H.E.; Aoki, Y.; Hassell, N.; Zheng, X.Y.; et al. Genomic surveillance for SARS-CoV-2 variants: Predominance of the delta (b.1.617.2) and omicron (b.1.1.529) variants—United states, June 2021–January 2022. Morb. Mortal. Wkly. Rep. 2022, 71, 206–211. [Google Scholar] [CrossRef] [PubMed]
  3. Colson, P.; Delerce, J.; Burel, E.; Dahan, J.; Jouffret, A.; Fenollar, F.; Yahi, N.; Fantini, J.; La Scola, B.; Raoult, D. Emergence in southern france of a new SARS-CoV-2 variant harbouring both n501y and e484k substitutions in the spike protein. Arch. Virol. 2022, 167, 1185–1190. [Google Scholar] [CrossRef] [PubMed]
  4. Vadiati, M.; Beynaghi, A.; Bhattacharya, P.; Bandala, E.R.; Mozafari, M. Indirect effects of covid-19 on the environment: How deep and how long? Sci. Total Environ. 2022, 810, 152255. [Google Scholar] [CrossRef]
  5. Richardson, S.; Hirsch, J.S.; Narasimhan, M.; Crawford, J.M.; McGinn, T.; Davidson, K.W.; Northwell COVID-19 Research Consortium. Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area. JAMA 2020, 323, 2052–2059. [Google Scholar] [CrossRef]
  6. Ceasovschih, A.; Sorodoc, V.; Shor, A.; Haliga, R.E.; Roth, L.; Lionte, C.; Onofrei Aursulesei, V.; Sirbu, O.; Culis, N.; Shapieva, A.; et al. Distinct Features of Vascular Diseases in COVID-19. J. Inflamm. Res. 2023, 16, 2783–2800. [Google Scholar] [CrossRef]
  7. Wynants, L.; Van Calster, B.; Collins, G.S.; Riley, R.D.; Heinze, G.; Schuit, E.; Bonten, M.M.J.; Dahly, D.L.; Damen, J.A.A.; Debray, T.P.A.; et al. Prediction models for diagnosis and prognosis of Covid-19: Systematic review and critical appraisal. BMJ 2020, 369, m1328. [Google Scholar] [CrossRef]
  8. Esposito, A.; Palmisano, A.; Cao, R.; Rancoita, P.; Landoni, G.; Grippaldi, D.; Boccia, E.; Cosenza, M.; Messina, A.; La Marca, S.; et al. Quantitative assessment of lung involvement on chest CT at admission: Impact on hypoxia and outcome in COVID-19 patients. Clin. Imaging 2021, 77, 194–201. [Google Scholar] [CrossRef]
  9. Bonanad, C.; Garcia-Blas, S.; Tarazona-Santabalbina, F.; Sanchis, J.; Bertomeu-Gonzalez, V.; Facila, L.; Ariza, A.; Nunez, J.; Cordero, A. The Effect of Age on Mortality in Patients With COVID-19: A Meta-Analysis With 611,583 Subjects. J. Am. Med. Dir. Assoc. 2020, 21, 915–918. [Google Scholar] [CrossRef] [PubMed]
  10. Zhou, F.; Yu, T.; Du, R.; Fan, G.; Liu, Y.; Liu, Z.; Xiang, J.; Wang, Y.; Song, B.; Gu, X.; et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: A retrospective cohort study. Lancet 2020, 395, 1054–1062. [Google Scholar] [CrossRef]
  11. Huang, I.; Pranata, R.; Lim, M.A.; Oehadian, A.; Alisjahbana, B. C-reactive protein, procalcitonin, D-dimer, and ferritin in severe coronavirus disease-2019: A meta-analysis. Ther. Adv. Respir. Dis. 2020, 14, 1753466620937175. [Google Scholar] [CrossRef]
  12. Ponti, G.; Maccaferri, M.; Ruini, C.; Tomasi, A.; Ozben, T. Biomarkers associated with COVID-19 disease progression. Crit. Rev. Clin. Lab. Sci. 2020, 57, 389–399. [Google Scholar] [CrossRef] [PubMed]
  13. Colombi, D.; Bodini, F.C.; Petrini, M.; Maffi, G.; Morelli, N.; Milanese, G.; Silva, M.; Sverzellati, N.; Michieletti, E. Well-aerated Lung on Admitting Chest CT to Predict Adverse Outcome in COVID-19 Pneumonia. Radiology 2020, 296, E86–E96. [Google Scholar] [CrossRef]
  14. Huang, L.; Han, R.; Ai, T.; Yu, P.; Kang, H.; Tao, Q.; Xia, L. Serial Quantitative Chest CT Assessment of COVID-19: A Deep Learning Approach. Radiol. Cardiothorac. Imaging 2020, 2, e200075. [Google Scholar] [CrossRef] [PubMed]
  15. Revel, M.P.; Boussouar, S.; de Margerie-Mellon, C.; Saab, I.; Lapotre, T.; Mompoint, D.; Chassagnon, G.; Milon, A.; Lederlin, M.; Bennani, S.; et al. Study of Thoracic CT in COVID-19: The STOIC Project. Radiology 2021, 301, E361–E370. [Google Scholar] [CrossRef] [PubMed]
  16. Zhan, J.; Li, H.; Yu, H.; Liu, X.; Zeng, X.; Peng, D.; Zhang, W. 2019 novel coronavirus (COVID-19) pneumonia: CT manifestations and pattern of evolution in 110 patients in Jiangxi, China. Eur. Radiol. 2021, 31, 1059–1068. [Google Scholar] [CrossRef]
  17. Zhao, C.; Xu, Y.; He, Z.; Tang, J.; Zhang, Y.; Han, J.; Shi, Y.; Zhou, W. Lung Segmentation and Automatic Detection of COVID-19 Using Radiomic Features from Chest CT Images. Pattern Recognit. 2021, 119, 108071. [Google Scholar] [CrossRef]
  18. Jiao, Z.; Choi, J.W.; Halsey, K.; Tran, T.M.L.; Hsieh, B.; Wang, D.; Eweje, F.; Wang, R.; Chang, K.; Wu, J.; et al. Prognostication of patients with COVID-19 using artificial intelligence based on chest x-rays and clinical data: A retrospective study. Lancet Digit. Health 2021, 3, e286–e294. [Google Scholar] [CrossRef]
  19. Tan, H.B.; Xiong, F.; Jiang, Y.L.; Huang, W.C.; Wang, Y.; Li, H.H.; You, T.; Fu, T.T.; Lu, R.; Peng, B.W. The study of automatic machine learning base on radiomics of non-focus area in the first chest CT of different clinical types of COVID-19 pneumonia. Sci. Rep. 2020, 10, 18926. [Google Scholar] [CrossRef]
  20. Shiri, I.; Sorouri, M.; Geramifar, P.; Nazari, M.; Abdollahi, M.; Salimi, Y.; Khosravi, B.; Askari, D.; Aghaghazvini, L.; Hajianfar, G.; et al. Machine learning-based prognostic modeling using clinical data and quantitative radiomic features from chest CT images in COVID-19 patients. Comput. Biol. Med. 2021, 132, 104304. [Google Scholar] [CrossRef]
  21. Guiot, J.; Vaidyanathan, A.; Deprez, L.; Zerka, F.; Danthine, D.; Frix, A.N.; Thys, M.; Henket, M.; Canivet, G.; Mathieu, S.; et al. Development and Validation of an Automated Radiomic CT Signature for Detecting COVID-19. Diagnostics 2020, 11, 41. [Google Scholar] [CrossRef] [PubMed]
  22. van Griethuysen, J.J.M.; Fedorov, A.; Parmar, C.; Hosny, A.; Aucoin, N.; Narayan, V.; Beets-Tan, R.G.H.; Fillion-Robin, J.C.; Pieper, S.; Aerts, H. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res. 2017, 77, e104–e107. [Google Scholar] [CrossRef] [PubMed]
  23. Wang, L.; Kelly, B.; Lee, E.H.; Wang, H.; Zheng, J.; Zhang, W.; Halabi, S.; Liu, J.; Tian, Y.; Han, B.; et al. Multi-classifier-based identification of COVID-19 from chest computed tomography using generalizable and interpretable radiomics features. Eur. J. Radiol. 2021, 136, 109552. [Google Scholar] [CrossRef] [PubMed]
  24. Shiri, I.; Salimi, Y.; Pakbin, M.; Hajianfar, G.; Avval, A.H.; Sanaat, A.; Mostafaei, S.; Akhavanallaf, A.; Saberi, A.; Mansouri, Z.; et al. COVID-19 prognostic modeling using CT radiomic features and machine learning algorithms: Analysis of a multi-institutional dataset of 14,339 patients. Comput. Biol. Med. 2022, 145, 105467. [Google Scholar] [CrossRef]
  25. Zorzi, G.; Berta, L.; Rizzetto, F.; De Mattia, C.; Felisi, M.M.J.; Carrazza, S.; Nerini Molteni, S.; Vismara, C.; Scaglione, F.; Vanzulli, A.; et al. Artificial intelligence for differentiating COVID-19 from other viral pneumonias on CT: Comparative analysis of different models based on quantitative and radiomic approaches. Eur. Radiol. Exp. 2023, 7, 3. [Google Scholar] [CrossRef]
  26. Simpson, S.; Kay, F.U.; Abbara, S.; Bhalla, S.; Chung, J.H.; Chung, M.; Henry, T.S.; Kanne, J.P.; Kligerman, S.; Ko, J.P.; et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA—Secondary Publication. J. Thorac. Imaging 2020, 35, 219–227. [Google Scholar] [CrossRef]
  27. Fedorov, A.; Beichel, R.; Kalpathy-Cramer, J.; Finet, J.; Fillion-Robin, J.C.; Pujol, S.; Bauer, C.; Jennings, D.; Fennessy, F.; Sonka, M.; et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn. Reson. Imaging 2012, 30, 1323–1341. [Google Scholar] [CrossRef]
  28. Orlhac, F.; Boughdad, S.; Philippe, C.; Stalla-Bourdillon, H.; Nioche, C.; Champion, L.; Soussan, M.; Frouin, F.; Frouin, V.; Buvat, I. A Postreconstruction Harmonization Method for Multicenter Radiomic Studies in PET. J. Nucl. Med. Off. Publ. Soc. Nucl. Med. 2018, 59, 1321–1328. [Google Scholar] [CrossRef] [PubMed]
  29. Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Society. Ser. B 1996, 58, 267–288. [Google Scholar] [CrossRef]
  30. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22. [Google Scholar] [CrossRef] [PubMed]
  31. Ge, G.; Zhang, J. Feature selection methods and predictive models in CT lung cancer radiomics. J. Appl. Clin. Med. Phys. 2023, 24, e13869. [Google Scholar] [CrossRef] [PubMed]
  32. Song, J.; Yin, Y.; Wang, H.; Chang, Z.; Liu, Z.; Cui, L. A review of original articles published in the emerging field of radiomics. Eur. J. Radiol. 2020, 127, 108991. [Google Scholar] [CrossRef] [PubMed]
  33. Glmnet. Available online: http://cran.r-project.org/web/packages/glmnet (accessed on 1 March 2020).
  34. Dobbin, K.K.; Simon, R.M. Optimally splitting cases for training and testing high dimensional classifiers. BMC Med. Genom. 2011, 4, 31. [Google Scholar] [CrossRef] [PubMed]
  35. Harrell, F.E., Jr. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  36. Steyerberg, E.W. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating; Springer International Publishing: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
  37. Steyerberg, E.W.; Vickers, A.J.; Cook, N.R.; Gerds, T.; Gonen, M.; Obuchowski, N.; Pencina, M.J.; Kattan, M.W. Assessing the performance of prediction models: A framework for traditional and novel measures. Epidemiology 2010, 21, 128–138. [Google Scholar] [CrossRef]
  38. Sun, Q.; Qiu, H.; Huang, M.; Yang, Y. Lower mortality of COVID-19 by early recognition and intervention: Experience from Jiangsu Province. Ann. Intensive Care 2020, 10, 33. [Google Scholar] [CrossRef] [PubMed]
  39. Goyal, D.K.; Mansab, F.; Iqbal, A.; Bhatti, S. Early intervention likely improves mortality in COVID-19 infection. Clin. Med. 2020, 20, 248–250. [Google Scholar] [CrossRef]
  40. Yang, N.; Liu, F.; Li, C.; Xiao, W.; Xie, S.; Yuan, S.; Zuo, W.; Ma, X.; Jiang, G. Diagnostic classification of coronavirus disease 2019 (COVID-19) and other pneumonias using radiomics features in CT chest images. Sci. Rep. 2021, 11, 17885. [Google Scholar] [CrossRef]
  41. Mouhat, B.; Besutti, M.; Bouiller, K.; Grillet, F.; Monnin, C.; Ecarnot, F.; Behr, J.; Capellier, G.; Soumagne, T.; Pili-Floury, S.; et al. Elevated D-dimers and lack of anticoagulation predict PE in severe COVID-19 patients. Eur. Respir. J. 2020, 56, 2001811. [Google Scholar] [CrossRef]
  42. Soni, M.; Gopalakrishnan, R.; Vaishya, R.; Prabu, P. D-dimer level is a useful predictor for mortality in patients with COVID-19: Analysis of 483 cases. Diabetes Metab. Syndr. 2020, 14, 2245–2249. [Google Scholar] [CrossRef]
  43. Poudel, A.; Poudel, Y.; Adhikari, A.; Aryal, B.B.; Dangol, D.; Bajracharya, T.; Maharjan, A.; Gautam, R. D-dimer as a biomarker for assessment of COVID-19 prognosis: D-dimer levels on admission and its role in predicting disease outcome in hospitalized patients with COVID-19. PLoS ONE 2021, 16, e0256744. [Google Scholar] [CrossRef]
  44. Henry, B.M.; Aggarwal, G.; Wong, J.; Benoit, S.; Vikse, J.; Plebani, M.; Lippi, G. Lactate dehydrogenase levels predict coronavirus disease 2019 (COVID-19) severity and mortality: A pooled analysis. Am. J. Emerg. Med. 2020, 38, 1722–1726. [Google Scholar] [CrossRef] [PubMed]
  45. Tao, R.J.; Luo, X.L.; Xu, W.; Mao, B.; Dai, R.X.; Li, C.W.; Yu, L.; Gu, F.; Liang, S.; Lu, H.W.; et al. Viral infection in community acquired pneumonia patients with fever: A prospective observational study. J. Thorac. Dis. 2018, 10, 4387–4395. [Google Scholar] [CrossRef] [PubMed]
  46. Li, X.; Guan, B.; Su, T.; Liu, W.; Chen, M.; Bin Waleed, K.; Guan, X.; Gary, T.; Zhu, Z. Impact of cardiovascular disease and cardiac injury on in-hospital mortality in patients with COVID-19: A systematic review and meta-analysis. Heart 2020, 106, 1142–1147. [Google Scholar] [CrossRef] [PubMed]
  47. Borges do Nascimento, I.J.; Cacic, N.; Abdulazeem, H.M.; von Groote, T.C.; Jayarajah, U.; Weerasekara, I.; Esfahani, M.A.; Civile, V.T.; Marusic, A.; Jeroncic, A.; et al. Novel Coronavirus Infection (COVID-19) in Humans: A Scoping Review and Meta-Analysis. J. Clin. Med. 2020, 9, 941. [Google Scholar] [CrossRef]
  48. Nishiga, M.; Wang, D.W.; Han, Y.; Lewis, D.B.; Wu, J.C. COVID-19 and cardiovascular disease: From basic mechanisms to clinical perspectives. Nat. Rev. Cardiol. 2020, 17, 543–558. [Google Scholar] [CrossRef] [PubMed]
  49. Wang, B.X. Susceptibility and prognosis of COVID-19 patients with cardiovascular disease. Open Heart 2020, 7, e001310. [Google Scholar] [CrossRef]
  50. Ai, T.; Yang, Z.; Hou, H.; Zhan, C.; Chen, C.; Lv, W.; Tao, Q.; Sun, Z.; Xia, L. Correlation of Chest CT and RT-PCR Testing for Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases. Radiology 2020, 296, E32–E40. [Google Scholar] [CrossRef] [PubMed]
  51. Sun, R.; Limkin, E.J.; Vakalopoulou, M.; Dercle, L.; Champiat, S.; Han, S.R.; Verlingue, L.; Brandao, D.; Lancia, A.; Ammari, S.; et al. A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: An imaging biomarker, retrospective multicohort study. Lancet. Oncol. 2018, 19, 1180–1191. [Google Scholar] [CrossRef]
  52. Ciarmiello, A.; Giovannini, E.; Pastorino, S.; Ferrando, O.; Foppiano, F.; Mannironi, A.; Tartaglione, A.; Giovacchini, G.; Alzheimer’s Disease Neuroimaging, I. Machine Learning Model to Predict Diagnosis of Mild Cognitive Impairment by Using Radiomic and Amyloid Brain PET. Clin. Nucl. Med. 2023, 48, 1–7. [Google Scholar] [CrossRef] [PubMed]
  53. Ciarmiello, A.; Giovannini, E.; Florimonte, L.; Bonatto, E.; Bareggi, C.; Milano, A.; Aschele, C.; Castellani, M. Machine learning radiomics for prediction of survival in non-small cell lung cancer patients studied with PET/CT and FDG. Ann. Oncol. 2021, 32, S926. [Google Scholar] [CrossRef]
  54. Hu, Z.; Yang, Z.; Lafata, K.J.; Yin, F.F.; Wang, C. A radiomics-boosted deep-learning model for COVID-19 and non-COVID-19 pneumonia classification using chest x-ray images. Med. Phys. 2022, 49, 3213–3222. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Flowchart for machine learning model development.
Figure 2. Predictors of outcome. (A) Coefficient profiles plotted versus log(λ). Each colored line represents the coefficient of one feature. (B) C-index plotted versus log(λ). The green circle and line mark the λ with the minimum cross-validation error. The blue circle and line mark the λ at the minimum cross-validation error plus one standard deviation. (C) Variables that survived the LASSO regression, including age, D-dimer, LDH, three comorbidities, and four radiomic variables.
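The two λ values marked in panel (B) correspond to a common selection rule, available for example in glmnet's cv.glmnet as lambda.min and lambda.1se. The following is a minimal Python sketch of that rule only, not the authors' code. The function name and all numbers are hypothetical, and the cross-validation error is assumed here to be 1 minus the C-index.

```python
def select_lambdas(lambdas, cv_error, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation results.

    lambda_min: the lambda with the lowest mean cross-validation error.
    lambda_1se: the largest lambda whose error is still within one
                standard error of that minimum (a sparser model).
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_error[i])
    threshold = cv_error[i_min] + cv_se[i_min]
    lambda_min = lambdas[i_min]
    lambda_1se = max(
        lam for lam, err in zip(lambdas, cv_error) if err <= threshold
    )
    return lambda_min, lambda_1se


# Hypothetical cross-validation results (illustrative values, not study data).
lambdas = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5]
cv_error = [0.24, 0.22, 0.21, 0.225, 0.26, 0.40]  # assumed: 1 - C-index
cv_se = [0.02, 0.02, 0.02, 0.02, 0.02, 0.02]

lambda_min, lambda_1se = select_lambdas(lambdas, cv_error, cv_se)
print(lambda_min, lambda_1se)
```

A larger λ shrinks more coefficients to zero, so the one-standard-error choice trades a small loss in cross-validated performance for a model with fewer predictors.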
Figure 3. Survival curves. Training dataset (A): the survival time of SARS-CoV-2 patients in the high-risk group differed significantly from that of the low-risk group, with a median of 12 days (95% CI; 10–14). The low-risk group did not reach median survival (survival stayed above 50%). Test dataset (B): the median survival of the high-risk group was 9 days (95% CI; 6–37), and the low-risk group did not reach median survival.
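Median survival times like those in Figure 3 are typically read off a Kaplan-Meier curve as the first time at which estimated survival falls to 50% or below. The sketch below shows that calculation in plain Python under stated assumptions: the follow-up times and event indicators are hypothetical toy data, not the study data, and the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times:  follow-up time per patient (e.g., days)
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy groups (illustrative only).
high_times, high_events = [3, 5, 7, 9, 9, 12, 20, 30], [1, 1, 1, 1, 1, 0, 1, 0]
low_times, low_events = [10, 15, 20, 25, 30, 40], [0, 1, 0, 0, 0, 0]

high_median = median_survival(kaplan_meier(high_times, high_events))
low_median = median_survival(kaplan_meier(low_times, low_events))
print(high_median, low_median)
```

In the toy low-risk group the estimated survival never drops to 50%, so no median is returned, which is the same situation the caption describes for the low-risk patients.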
Table 1. Clinical characteristics of SARS-CoV-2 population in living and deceased subjects.

| Variable | Overall, N = 694 | Alive, N = 537 (77%) | Deceased, N = 157 (23%) | Statistic | p-Value 1 |
|---|---|---|---|---|---|
| Age, median (range) | 64 (20–107) | 59 (20–94) | 80 (51–107) | 213 | <0.001 |
| Gender, n (%) | | | | 0.07 | 0.8 |
| Female | 247 (36%) | 193 (36%) | 54 (34%) | | |
| Male | 447 (64%) | 344 (64%) | 103 (66%) | | |
| Hospital stay, median (range) | 11 (3–86) | 10 (3–86) | 13 (3–62) | 10 | 0.001 |
| CT findings, n (%) | | | | 11 | 0.001 |
| Negative/Atypical | 284 (41%) | 238 (44%) | 46 (29%) | | |
| Typical | 410 (59%) | 299 (56%) | 111 (71%) | | |
| C-reactive protein, median (range) | 3 (0–37) | 3 (0–27) | 5 (1–37) | 19 | <0.001 |
| Lactate dehydrogenase, median (range) | 486 (78–3745) | 484 (78–3745) | 494 (141–3631) | 97 | <0.001 |
| D-dimer, median (range) | 1133 (96–152,015) | 1133 (96–55,660) | 6006 (168–152,015) | 68 | <0.001 |

1 One-way ANOVA; Pearson’s chi-squared test.
Table 2. Diseases associated with a high risk of mortality in SARS-CoV-2 infection.

| Comorbidity | Alive, N = 537 1 | Deceased, N = 157 1 | Odds Ratio 2 | 95% CI 2,3 | p-Value 2 |
|---|---|---|---|---|---|
| Cardiovascular disease | 90 (17%) | 74 (47%) | 4.42 | 2.95, 6.63 | <0.001 |
| Cancer | 2 (0%) | 9 (6%) | 16.2 | 3.30, 155 | <0.001 |
| Cerebrovascular disease | 9 (2%) | 14 (9%) | 5.72 | 2.25, 15.3 | <0.001 |
| Haematological disease | 14 (3%) | 13 (8%) | 3.36 | 1.42, 7.91 | 0.003 |
| Chronic obstructive pulmonary disease | 24 (4%) | 16 (10%) | 2.42 | 1.17, 4.90 | 0.011 |
| Blood cancer | 11 (2%) | 1 (1%) | 0.31 | 0.01, 2.14 | 0.3 |
| Hypertension | 66 (12%) | 16 (10%) | 0.81 | 0.42, 1.47 | 0.6 |
| Type 2 diabetes | 77 (14%) | 20 (13%) | 0.87 | 0.49, 1.50 | 0.7 |
| Obesity | 11 (2%) | 3 (2%) | 0.93 | 0.16, 3.59 | >0.9 |

1 n (%); 2 Fisher’s Exact Test for Count Data; and 3 CI = Confidence Interval.
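As a rough cross-check of Table 2, an odds ratio can be approximated directly from the counts. The sketch below computes the sample (cross-product) odds ratio with a Wald 95% confidence interval for the cardiovascular disease row. This is an assumption-laden approximation and not the method behind the table, which reports estimates from Fisher's exact test, so small differences from the tabulated values are expected. The function name is illustrative.

```python
import math


def odds_ratio_wald(exposed_dead, total_dead, exposed_alive, total_alive, z=1.96):
    """Sample odds ratio with a Wald confidence interval from 2x2 counts.

    Approximation only: Table 2 reports Fisher's exact test estimates,
    which differ slightly from the cross-product ratio used here.
    """
    a = exposed_dead                  # deceased with the comorbidity
    b = total_dead - exposed_dead     # deceased without it
    c = exposed_alive                 # alive with the comorbidity
    d = total_alive - exposed_alive   # alive without it
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Cardiovascular disease row of Table 2: 74 of 157 deceased, 90 of 537 alive.
or_cvd, lo_cvd, hi_cvd = odds_ratio_wald(74, 157, 90, 537)
print(f"OR = {or_cvd:.2f}, 95% CI ({lo_cvd:.2f}, {hi_cvd:.2f})")
```

The Wald interval needs all four cell counts to be nonzero and becomes unreliable for sparse rows such as cancer (2 versus 9 cases), which is one reason an exact method suits this table better.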
Table 3. Bivariate analysis of model performance by dataset.

| Dataset / predicted risk | N | Observed: Deceased 1 | Observed: Alive 1 | χ² | p 2 | SS (95% CI) 3 | SP (95% CI) 4 | PPV (95% CI) 5 | NPV (95% CI) 6 | OR (95% CI) 7 |
|---|---|---|---|---|---|---|---|---|---|---|
| Training | 556 | | | 140 | 2.1 × 10^−32 | 97 (92, 99) | 64 (59, 68) | 44 (38, 50) | 99 (96, 100) | 48 (18, 125) |
| High risk | | 122 (97%) | 156 (36%) | | | | | | | |
| Low risk | | 4 (3.2%) | 274 (64%) | | | | | | | |
| Test | 138 | | | 24 | 9.8 × 10^−7 | 90 (74, 98) | 62 (52, 71) | 41 (29, 53) | 96 (88, 99) | 13 (4, 42) |
| High risk | | 28 (90%) | 41 (38%) | | | | | | | |
| Low risk | | 3 (9.7%) | 66 (62%) | | | | | | | |

1 n (%), 2 Pearson’s chi-squared test, 3 sensitivity, confidence interval, 4 specificity, confidence interval, 5 positive predictive value, confidence interval, 6 negative predictive value, confidence interval, 7 odds ratio, and confidence interval.
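The sensitivity, specificity, and predictive values in Table 3 follow from the four cell counts of each dataset. The short Python sketch below recomputes the point estimates from the counts in the table, treating "high risk" as a positive prediction and death as the positive outcome. The function name is illustrative, and confidence intervals are not reproduced.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Point estimates of standard diagnostic metrics from 2x2 counts.

    tp: deceased patients predicted high risk
    fn: deceased patients predicted low risk
    fp: surviving patients predicted high risk
    tn: surviving patients predicted low risk
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Cell counts taken from Table 3.
training = diagnostic_metrics(tp=122, fn=4, fp=156, tn=274)
test = diagnostic_metrics(tp=28, fn=3, fp=41, tn=66)

for name, metrics in (("Training", training), ("Test", test)):
    summary = ", ".join(f"{k} = {100 * v:.0f}%" for k, v in metrics.items())
    print(f"{name}: {summary}")
```

The rounded percentages agree with the tabulated point estimates, for example 28 of 31 non-survivors (90%) classified as high risk in the test dataset.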
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ciarmiello, A.; Tutino, F.; Giovannini, E.; Milano, A.; Barattini, M.; Yosifov, N.; Calvi, D.; Setti, M.; Sivori, M.; Sani, C.; et al. Multivariable Risk Modelling and Survival Analysis with Machine Learning in SARS-CoV-2 Infection. J. Clin. Med. 2023, 12, 7164. https://doi.org/10.3390/jcm12227164
