Exploring the Synergistic Potential of Radiomics and Laboratory Biomarkers for Enhanced Identification of Vulnerable COVID-19 Patients

Background: Severe courses and high hospitalization rates were ubiquitous during the first pandemic SARS-CoV-2 waves. Thus, we aimed to examine whether integrative diagnostics may aid in identifying vulnerable patients using data and materials obtained from COVID-19 patients hospitalized between 2020 and 2021 (n = 52). Accordingly, we investigated the potential of laboratory biomarkers, specifically the dynamic cell decay marker cell-free DNA (cfDNA), and radiomics features extracted from chest CT. Methods: Separate forward and backward feature selection was conducted for linear regression with the Intensive-Care-Unit (ICU) period as the initial target. Three-fold cross-validation was performed, and collinear parameters were reduced. The model was adapted to a logistic regression approach and verified in a validation-naïve subset to avoid overfitting. Results: The adapted integrated model classifying patients into “ICU/no ICU demand” comprises six radiomics and seven laboratory biomarkers. The models’ accuracy was 0.54 for radiomics, 0.47 for cfDNA, 0.74 for routine laboratory, and 0.87 for the combined model with an AUC of 0.91. Conclusion: The combined model outperformed the individual models. Thus, integrating radiomics and laboratory data shows synergistic potential to aid clinical decision-making in COVID-19 patients. Pending evaluation in larger cohorts, including patients with other SARS-CoV-2 variants, the identified parameters might contribute to the triage of COVID-19 patients.


Introduction
The pandemic spread of severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) and the emergence of coronavirus disease 2019 (COVID-19) have had enormous global health and socio-economic consequences. High infectivity and hospitalization rates put hospital bed and Intensive-Care-Unit (ICU) capacities under enormous stress during the first pandemic waves [1]. Accordingly, the rapid identification of disease severity enabling triaging of patients is an essential clinical aspect and requires a multidisciplinary approach to optimize the diagnostic potential. Various routine laboratory parameters associated with disease severity have already been described, among them C-reactive protein (CRP), activated partial thromboplastin time (PTT), D-dimer, and lactate dehydrogenase (LDH) [2][3][4]. Moreover, previous studies have shown an association between increased cell-free DNA (cfDNA) and a severe course of COVID-19 [5,6]. However, an integrative approach including radiomics and cfDNA has been missing so far. In this study, we intended to identify suitable markers associated with intensive care requirements. The routine laboratory parameters examined in other studies were augmented in our work by quantified cfDNA as a dynamic marker of cell decay. Since we anticipated increased cell death of lung tissue, especially in the presence of lung consolidations, this study was called "Laboratory Assessment of Ground Glass Opacities" (LAGGO), emphasizing the interdisciplinary aspect of the work.
While blood-borne laboratory parameters serve as surrogate markers for monitoring various organ functions, chest computed tomography (CT) adds important diagnostic topological information on lung involvement in COVID-19 patients [7]. Tsang et al. have developed the SARS severity score to estimate the severity of lung involvement semiquantitatively [8]. Additionally, the Radiological Society of North America has developed a structured reporting system that classifies findings related to COVID-19 [9]. Radiomics analysis of COVID-19 CTs aims to quantify lung involvement in a fully automatic and reader-independent fashion. Thanks to recent advances in deep-learning-based machine vision, software can aid the image segmentation necessary for radiomics analyses [10]. Radiomics is an innovative and rapidly evolving field that comprises the extraction and analysis of quantitative features from medical imaging. By converting an image into mineable data, radiomics complements the traditional visual interpretation and enables a quantitative evaluation of radiological images. In this manner, radiological data can be leveraged not only for qualitative evaluation but also in the form of diverse quantitative datasets to enable personalized patient predictions. This presents many opportunities for analysing radiological data, particularly in assessing tumor diseases, where it is commonly applied. However, ongoing research is necessary to prove the promising potential of radiomics with regard to acquisition protocols, segmentations, and feature extractions [11].
Therefore, radiomics has not yet been widely adopted in the clinical setting [12]. Both laboratory medicine and radiology provide complementary diagnostic value in various stages of COVID-19. Thus, we investigated the potential value of integrated diagnostics in estimating the likelihood of ICU admission to aid in planning ICU capacities in managing COVID-19 cases.
For this purpose, we utilized conserved residual specimens obtained during the initial SARS-CoV-2 pandemic outbreaks to quantify cfDNA and reanalyzed previously acquired data to retrospectively evaluate the significance of specific biomarkers in predicting a severe hospitalized COVID-19 course. Thus, in this study, we present biomarkers that potentially allow discrimination between ICU requirements and normal inpatient treatment in cases of infection with the first SARS-CoV-2 variants in Germany.
The primary objective of this investigation is to establish a suitable algorithm for identifying distinct laboratory and radiology parameters correlated with the need for intensive care unit (ICU) admission (aim I). Subsequently, a verification of the selected parameters via an alternative method is required (aim II). Furthermore, in case of a substantial number of parameters, selecting the most significant ones has to be performed via an algorithm (aim III). Finally, the individual radiomics, RSNA Score, routine laboratory, cfDNA and combined variables have to be compared in their predictive power (aim IV).

Participant Recruitment
From May 2020 to September 2021, SARS-CoV-2 patients aged 18 or older previously confirmed by qPCR were enrolled in the LAGGO (Laboratory Assessment of Ground Glass Opacities) study at the University Medical Center Mannheim, Germany (see Figure 1). Informed written consent was obtained from each subject (n = 52). The Institutional Review Board (2020-541N) approved the study protocol, and the study was conducted in accordance with the Declaration of Helsinki. During the initial wave of the SARS-CoV-2 pandemic, we deemed it inappropriate to obtain informed consent when requiring intensive care treatment based on ethical considerations. Therefore, the study inclusion was conducted retrospectively after the completion of treatment. Considering this aspect and the high mortality rate, this accounts for the limited number of participants. We have to address this point in the study's limitations.

Figure 1: Presentation of the study concept and the research objectives. The inclusion criterion in the study was the diagnosis of COVID-19 based on a positive qRT-PCR result of a nasopharyngeal swab. Radiological chest CT data were segmented and radiomically analyzed. In addition, the patient's routine laboratory was evaluated, and cfDNA was prospectively isolated and quantified. Radiological and laboratory features were selected separately for predicting the duration of intensive care. Before inclusion in an integrated prediction model, the existence of collinearities was reduced using a minimal redundancy algorithm. The final model intends to indicate the patient outcome by predicting an intensive care requirement and facilitating clinical decisions.

Routine Laboratory Analysis
Blood count was measured on a Sysmex XN-9000 platform (Sysmex, Hamburg, Germany). Hemostaseological parameters were determined on the CS-5100 analyzer (Sysmex, Hamburg, Germany). Clinical chemistry biomarkers were measured on an Atellica-CH Analyzer (Siemens Healthcare GmbH, Eschborn, Germany). For all measurements, the dedicated reagent systems were used according to the manufacturers' recommendations and after internal verification in compliance with DIN EN ISO 15189 in an accredited laboratory. Pre-analytical quality was subsequently judged after centrifugation using the hemolysis assessment system of the analyzer platform on an ordinal scale ranging from no (0) to significant hemolysis (5). For samples exceeding the value "1", the results for LDH and ASAT were not used, since hemolysis is described to increase these values [13,14]. Although the manufacturer does not specify any restrictions in the corresponding instructions, we decided to enhance the preanalytical quality by this procedure. Blood gas analyses (BGA) from arterial and venous blood were conducted under point-of-care-testing conditions.

Sample Collection and cfDNA Analysis
For the isolation of cfDNA, ethylene diamine tetraacetic acid (EDTA) plasma obtained when clinically indicated was processed within 4 h of blood collection. Specimens were centrifuged at 1600× g for 10 min at 20 °C. The supernatant was transferred to a new 15 mL tube and centrifuged at 3000× g for 10 min. Optical control for hemolysis was performed, and samples with visually detectable hemolysis were excluded. The final supernatant was stored at −80 °C until the isolation of the cfDNA. CfDNA was isolated using the Qiagen QIAamp Circulating Nucleic Acid Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions without modifications. For cfDNA isolation, the maximal plasma volume available from the subject's specimen was processed (range 0.4 to 1.5 mL). The quantification of the cfDNA was performed by means of a Qubit Fluorometer and Qubit cfDNA HS Assay Kit (Invitrogen, Los Angeles, CA, USA), and the results were normalized via a control with known concentration included in each measurement. In addition, the determined concentration was recalculated in relation to the input volume and reported as ng per mL plasma.
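The recalculation to the plasma input volume can be illustrated with a short calculation. This is a minimal sketch and not the authors' procedure: the elution volume, the linear control-based correction, and all function and parameter names are assumptions introduced for illustration only.

```python
def cfdna_ng_per_ml_plasma(measured_ng_per_ul, elution_volume_ul, plasma_volume_ml,
                           control_measured=1.0, control_nominal=1.0):
    """Convert a fluorometric eluate reading into ng cfDNA per mL plasma.

    Assumption: the control with known concentration is used as a simple
    linear correction factor (nominal value / measured value).
    """
    if plasma_volume_ml <= 0 or control_measured <= 0:
        raise ValueError("volumes and control readings must be positive")
    correction = control_nominal / control_measured
    total_ng = measured_ng_per_ul * correction * elution_volume_ul
    return total_ng / plasma_volume_ml


# Hypothetical example: 0.5 ng/uL in a 50 uL eluate obtained from 1.0 mL plasma.
example = cfdna_ng_per_ml_plasma(0.5, 50, 1.0)
```

Reporting per mL plasma makes samples comparable although the processed plasma volume varied between subjects.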

Chest CT Imaging
All patients in this study underwent native or contrast-enhanced CT imaging of the chest. The scans were performed on either a SOMATOM Definition AS, SOMATOM Definition Flash or a SOMATOM Definition 64 (Siemens Healthcare GmbH, Erlangen, Germany). Depending on the history, clinical presentation and possible comorbidities, patients were scanned using one of the following protocols: Low-dose CT, routine noncontrast-enhanced CT, contrast-enhanced CT or CT pulmonary angiography. In total, 74.54% of scans were performed with contrast agents, of which 58.54% were performed as arterial phase CT. Imeron 300 (Bracco Imaging S.p.A., Milan, Italy) was used as a contrast agent in a dose adjusted for CT protocol and weight.

Chest CT Imaging Analysis
CTs were analyzed by a resident radiologist using a semi-quantitative score to quantify pathological changes in the lung parenchyma. To calculate the score, each lung is divided into three sections, and each section is scored from 0 to 4 according to severity, with one point given per 25% involvement. The scores of all six sections are then summed, resulting in a total score from 0 to 24 [9]. Furthermore, CTs were analyzed quantitatively with radiomics methods using the research application MM Radiomics Frontier Prototype 1.2.6. (August 2016, Siemens Healthcare GmbH, Erlangen, Germany) within syngo.via VB60A (May 2021, Siemens Healthcare GmbH, Erlangen, Germany). To extract radiomics features, segmentation of CT scans is necessary [13]. Segmentation was performed in an automated fashion using the deep-learning-based research segmentation application CT Pneumonia Analysis prototype 2.5.2 (April 2021, Siemens Healthcare GmbH, Erlangen, Germany). This software is currently classified as "for research use only". A binWidth of 25, a 512 × 512 matrix, and a voxelArrayShift of 0 were applied. For the analysis, pyradiomics version 2.1.0 was used. Only original radiomics features were included in the analysis.
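The arithmetic of the semi-quantitative score can be sketched as follows. The handling of the bin edges (for example, exactly 25% involvement counting as one point) is an assumption, as are the function names; the text only states one point per 25% involvement and six sections.

```python
import math


def section_points(involvement_percent):
    """Points for one lung section: 0 without involvement, then one point per started 25%."""
    if not 0 <= involvement_percent <= 100:
        raise ValueError("involvement must be between 0 and 100 percent")
    return math.ceil(involvement_percent / 25)


def ct_severity_score(section_involvements):
    """Sum the points of the six lung sections (three per lung); range 0 to 24."""
    if len(section_involvements) != 6:
        raise ValueError("expected six sections (three per lung)")
    return sum(section_points(p) for p in section_involvements)


# Hypothetical involvement estimates in percent for the six sections.
example_score = ct_severity_score([10, 30, 0, 60, 80, 25])
```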

Performance of Feature Selection and Statistical Analysis
For identifying adequate parameters associated with a severe course, we opted for algorithm-based training of a model. A multivariate linear regression with internal 3-fold cross-validation was performed to construct a linear model initially predicting the duration of intensive care in days. It was subsequently adapted into a categorizing model dividing subjects into "ICU" versus "no-ICU-demand". This was realized separately for laboratory and radiological data based on a stepwise forward and backward feature selection to create a linear regression model with "ICU period in days" as the initial training target. With regard to the routine laboratory, all mentioned parameters excluding cfDNA were used for internal cross-validation comprising three randomly split sub-datasets (each consisting of n = 22 for training and n = 10 for validation). The training was always performed on 22 subjects and validated in the remaining subjects not used for training. This was repeated successively with different cohort formations to obtain a more representative selection despite the limited number of participants. In addition, we used two selection methods: forward and backward selection. Forward selection trains the model on subsets of features, starting with one variable and adding further variables in each iteration until no model improvement can be achieved. In backward selection, all parameters are used initially and then removed successively until the model deteriorates due to the omission of variables.
Due to the high number of variables identified, especially for radiomics, a further reduction before integration into a model was essential. Therefore, a ranking was implemented via the frequencies of feature selection in the sub-datasets, resulting in values between 0 and 3 (0: not selected in any sub-dataset; 3: selected in all three sub-datasets). We excluded parameters selected once or not at all.
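The forward variant of the selection can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' R implementation: the improvement criterion (validation RMSE), the threshold `min_gain`, the synthetic data, and all function names are hypothetical, and the small least-squares solver does not guard against singular designs.

```python
import random


def fit_ols(X, y):
    """Least-squares fit with intercept, solved via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def take(X, idx):
    """Restrict every row to the chosen column indices."""
    return [[r[i] for i in idx] for r in X]


def val_rmse(beta, X, y):
    """Root mean square error of the linear prediction on held-out data."""
    preds = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    return (sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)) ** 0.5


def forward_select(X_tr, y_tr, X_va, y_va, min_gain=1e-6):
    """Add, one at a time, the feature that lowers validation RMSE most."""
    selected = []
    candidates = list(range(len(X_tr[0])))
    best = val_rmse(fit_ols(take(X_tr, []), y_tr), take(X_va, []), y_va)
    while candidates:
        scored = []
        for j in candidates:
            idx = selected + [j]
            beta = fit_ols(take(X_tr, idx), y_tr)
            scored.append((val_rmse(beta, take(X_va, idx), y_va), j))
        score, j = min(scored)
        if best - score < min_gain:  # stop when no further improvement
            break
        selected.append(j)
        candidates.remove(j)
        best = score
    return selected


# Synthetic data (not study data): the target depends on features 0 and 2 only.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(32)]
y = [3.0 * r[0] - 2.0 * r[2] + random.gauss(0, 0.1) for r in X]
selected = forward_select(X[:22], y[:22], X[22:], y[22:])  # 22 training, 10 validation
```

Backward selection would run the same loop in reverse, starting from all features and dropping the one whose removal harms the validation error least.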
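The frequency ranking amounts to counting in how many sub-datasets a feature was selected. The following sketch uses hypothetical selections (not the study's results), and the function name and the `min_count` parameter are assumptions.

```python
from collections import Counter


def rank_by_selection_frequency(selections, min_count=2):
    """Count per feature the number of sub-datasets selecting it; keep frequent ones."""
    counts = Counter(f for chosen in selections for f in set(chosen))
    kept = sorted(f for f, c in counts.items() if c >= min_count)
    return counts, kept


# Hypothetical selections from three sub-datasets.
selections = [
    ["CRP", "PTT", "albumin"],
    ["CRP", "PTT", "LDH"],
    ["CRP", "albumin", "WBC"],
]
counts, kept = rank_by_selection_frequency(selections)
```

Features that never appear receive a count of 0 when looked up in `counts`, matching the 0 to 3 scale described above.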
Moreover, we used a random forest algorithm to verify the selected laboratory parameters and to examine the relevance of cfDNA predicting the regression target "ICU period". The algorithm creates shadow variables for each real variable by permutation and compares the importance of the real variable with the maximum importance of all shadow variables. If the real variable shows higher importance than the corresponding shadow variable, the algorithm assigns high importance to the feature [14,15]. The feature selection was performed identically for the radiological parameters.
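The shadow-variable comparison can be illustrated in simplified form. Note the deliberate substitution: instead of random forest importance, this sketch uses the absolute Pearson correlation with the target as a stand-in importance measure, and it performs a single permutation round rather than repeated iterations. All names and the synthetic data are assumptions.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def shadow_screen(features, target, rng):
    """Flag features whose importance exceeds the maximum shadow importance."""
    importance = {name: abs(pearson(col, target)) for name, col in features.items()}
    shadow_max = 0.0
    for col in features.values():
        shadow = col[:]      # copy of the real variable
        rng.shuffle(shadow)  # permutation destroys any link to the target
        shadow_max = max(shadow_max, abs(pearson(shadow, target)))
    flags = {name: imp > shadow_max for name, imp in importance.items()}
    return flags, shadow_max


# Synthetic data (not study data): one informative and one noise variable.
rng = random.Random(1)
icu_days = [rng.uniform(0, 30) for _ in range(40)]
features = {
    "informative": [d + rng.gauss(0, 1) for d in icu_days],
    "noise": [rng.gauss(0, 1) for _ in icu_days],
}
confirmed, shadow_max = shadow_screen(features, icu_days, rng)
```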
Furthermore, we first created separate correlation plots for radiomics and laboratory data in RStudio to identify collinearities using "library(corrplot)". Subsequently, we applied a minimal redundancy algorithm (utilizing, among others, the commands "findCorrelation" and "library(heatmap)") to reduce redundant parameters, as a combination of collinear variables would not enhance the predictive potential. The selection of initial parameters, including clustering of strongly correlated variables (shown in dark brown), is presented as the first correlation plot in the results. Following the reduction of parameters using the algorithm, a second visualization in the form of a correlation plot is provided. These parameters were then used for singular radiomics or laboratory models and the integrative model.
After the variable reduction, the retained potential to classify subjects was illustrated by a heatmap performing unsupervised clustering based on the final parameters (R package "pheatmap"). The application to the validation dataset served to confirm that the classification potential was maintained and not to determine the model's power, as this would lead to overfitting (Supplemental Material). Due to the Root Mean Square Error (RMSE) between predicted and actual days in ICU, even in our validation cohort, the model was adapted to a logistic regression approach with the clinical decision endpoint "ICU stay yes/no", and a cut-off for this categorization was selected based on this RMSE.
The final verification and the determination of the accuracies of the integrative model were realized with a test cohort independent of training and validation. In addition to establishing an integrative model, we compared individual cfDNA, RSNA score, radiomics or routine laboratory models with the combined model. The prediction of ICU needs was performed using the test cohort in RStudio. To accomplish this, we applied the previously trained and validated models to the test cohort as a logistic model. The algorithm employed classified values above six as indicating "ICU need" and values below six as indicating "regular inpatient treatment".
Additionally, we conducted a ROC analysis to compare true positives with false positives based on the test cohort ("library(ROCR)"). This analysis was performed for different models, and the Area under the Curve (AUC) was calculated. Patients' symptoms were not included in the model but were compared between ICU and non-ICU cohorts. All statistical analyses, including comparing demographics, COVID-19 symptoms, treatment and laboratory parameters of ICU and non-ICU cohorts, were performed using R statistics software (Version 4.1.2) [16]. Cohort comparisons of non-normally distributed continuous variables were performed by the Kruskal-Wallis rank sum test, and normally distributed continuous variables were compared via ANOVA. Categorical variables are presented as frequency and percentage. For the comparison of categorical variables, a Fisher exact test was performed. p-values < 0.05 were considered significant.
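The idea behind the redundancy reduction can be sketched as follows. The sketch follows the general principle of correlation-based filtering (for each pair above a cutoff, drop the variable with the larger mean absolute correlation to the others); it is not a reimplementation of the R function, and the cutoff of 0.9, the function names, and the example values are assumptions.

```python
def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def reduce_collinear(features, cutoff=0.9):
    """Drop one variable of each highly correlated pair until none exceeds the cutoff."""
    names = list(features)
    corr = {(a, b): abs(pearson(features[a], features[b]))
            for a in names for b in names if a != b}
    kept = names[:]

    def mean_corr(name):
        others = [m for m in kept if m != name]
        return sum(corr[(name, m)] for m in others) / len(others)

    while True:
        pairs = [(a, b) for i, a in enumerate(kept) for b in kept[i + 1:]
                 if corr[(a, b)] > cutoff]
        if not pairs:
            return kept
        a, b = max(pairs, key=lambda pair: corr[pair])
        # Remove the member of the pair that is more correlated with everything else.
        kept.remove(a if mean_corr(a) >= mean_corr(b) else b)


# Hypothetical values: "a" and "b" are nearly collinear, "c" is not.
features = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [2.1, 3.9, 6.2, 8.1, 9.8, 12.1, 14.2, 15.9],
    "c": [5, 1, 4, 2, 8, 3, 7, 6],
}
kept = reduce_collinear(features)
```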
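For reference, the RMSE between predicted and actual ICU days is the square root of the mean squared deviation. A minimal sketch with hypothetical values (not study data) follows; the function name is an assumption.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and actual ICU days."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predictions and observations in days.
example_rmse = rmse([2, 8, 14], [0, 10, 12])
```

Because the error is expressed in days, it can be compared directly with the length of ICU stay, which motivates deriving a categorical cut-off from it.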
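The categorization step and the resulting accuracy can be sketched as below. The text does not specify how a predicted value of exactly six is handled; this sketch assigns it to regular inpatient treatment, which is an assumption, as are the function names and the example values.

```python
ICU_CUTOFF_DAYS = 6  # decision cut-off stated in the text


def classify_icu(predicted_days, cutoff=ICU_CUTOFF_DAYS):
    """True means "ICU need"; values at or below the cutoff mean regular treatment."""
    return predicted_days > cutoff


def accuracy(predicted_days, actual_icu):
    """Fraction of subjects whose predicted category matches the observed one."""
    hits = sum(classify_icu(p) == a for p, a in zip(predicted_days, actual_icu))
    return hits / len(actual_icu)


# Hypothetical predictions (days) and observed ICU admission (True/False).
predicted = [1.2, 7.5, 12.0, 3.3, 6.8]
observed = [False, True, True, False, False]
example_accuracy = accuracy(predicted, observed)
```

In this hypothetical example the last subject is a false positive, so four of five classifications are correct.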
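The AUC can be understood as the probability that a randomly chosen ICU patient receives a higher model score than a randomly chosen non-ICU patient. The sketch below computes it by pairwise comparison; it does not use or mimic the ROCR package, and the function name and example values are assumptions.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative cases (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and observed ICU admission (1 = ICU, 0 = no ICU).
example_auc = roc_auc([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0])
```

Here five of the six positive-negative pairs are ordered correctly, giving an AUC of 5/6.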

Demographics and Clinical Aspects
For the assessment of the diversity of the disease in COVID-19 severity and treatment, a comparison between the ICU and non-ICU cohorts was performed. Moreover, this comparison revealed significantly elevated laboratory parameters in cases requiring ICU admission (Table 1, Figure 2). Participants in whom CT could be assessed for pulmonary embolism were not observed to have a central or distal embolism.

Figure 2: Exemplary presentation of two test persons with severe and mild progression.

Creation of the Laboratory Prediction Model
Differences in laboratory parameters between the ICU and non-ICU cohorts are summarised in Table 1. Moreover, the training and cross-validation described in more detail in the methods were performed with a dataset comprising three sub-datasets (Table 2). The most frequent parameters, PTT, albumin, GGT and CRP, and ALT, platelets, and WBC, were selected in two training sets and were included in further analysis.
The difference in the number of subjects in the laboratory (n = 32) and radiological (n = 30) cross-validation was because two subjects did not receive a chest CT during routine care. p-values for the "Laboratory values" and "Radiomics" are presented, and variables were excluded from further analysis at a frequency of 1.
In addition, the feature selection was methodically verified using a random forest analysis with the same target as our regression model (ICU stay). This was done to verify the importance of the variables selected by forward and backward feature selection. In the following, the previously selected parameters were used, but due to the high importance of cfDNA, cfDNA was included in the further establishment of the prediction model (Figure 3).

Figure 3: High importance is illustrated by green, medium by yellow and low by red. In addition, the minimum, mean, and maximum importance of the shadow variables are shown in blue. Parameters with a lower relevance for predicting the duration of intensive care requirements than the maximum shadow variable have been assigned low importance.
Furthermore, we examined the first BGA results of the subjects. The parameters were investigated for their suitability as predictors of ICU admission via unsupervised clustering (Supplemental S1). Here, no clear differentiation between normal inpatients and long-term intensive care patients could be observed, as the values were either similar among the groups (see pH variation) or showed heterogeneities within all subcohorts.

Prognostic Value of Radiological Parameters
Creation of the Radiological Prediction Model
The radiomics data were equally cross-validated (n = 30), and the ranking was performed equivalently as previously described. Due to the high diversity of radiomics, the algorithm selected more parameters per dataset than for the laboratory data. Details of all identified parameters of each sub-dataset are presented in Table 2. Sixteen parameters were selected for establishing a model predicting ICU stay (original_firstorder_10Percentile, original_gldm_LargeDependenceLowGrayLevelEmphasis, original_shape_Maximum2DDiameterSlice, original_firstorder_Energy, original_firstorder_TotalEnergy, original_glcm_ClusterShade, original_glcm_DifferenceVariance, original_glrlm_RunEntropy, original_glrlm_RunLengthNonUniformity, original_ngtdm_Busyness, original_ngtdm_Contrast, original_shape_Elongation, original_shape_Flatness, original_shape_LeastAxisLength, original_shape_MajorAxisLength, original_shape_Maximum3DDiameter). Therefore, a reduction of the selected parameters was essential, as described in the following.
In addition, the CT COVID Severity (RSNA) score was used as a variable to predict ICU stay.

Prognostic Value of Integrated Diagnostics
The radiological and laboratory parameters were examined for collinearities before integration into the final models, as a reduction of features was essential. Since various correlations were identified, we applied a redundancy reduction algorithm (Figure 4). For this purpose, separate correlation matrices were initially created for radiomic parameters (Figure 4A) and laboratory parameters (Figure 4C). A high correlation between parameters is illustrated by a dark color. After applying the "findCorrelation" command, which identifies collinear parameters and removes one of the two variables, updated correlation plots were generated for the remaining radiomic parameters (Figure 4B) and laboratory parameters (Figure 4D). Finally, seven laboratory (albumin, ALT, GGT, platelets, PTT, CRP and cfDNA) and six radiomics parameters (original_glcm_ClusterShade, original_gldm_LargeDependenceLowGrayLevelEmphasis, original_glrlm_RunEntropy, original_shape_Elongation, original_shape_MajorAxisLength and original_ngtdm_Busyness) were integrated in the combined model.
After successfully identifying suitable parameters, we created a heatmap illustrating the unsupervised clustering of the combined dataset. We applied the model to our cross-validation dataset (Supplemental S1) to verify the selected variables even after the previously described variable reduction.
Moreover, the period in ICU predicted by the integrative model was compared to the actual days in an independent test cohort (n = 15, Figure 5). The Root Mean Square Error (RMSE) of the deviations between the actual and predicted days was 5.3 days in the cross-validation set and 12.3 days in the test cohort (outlier V5 is excluded as the training set is not representative of values above 40 days). The application to the validation cohort only served to verify the variable selection even after reducing the initial parameters and not to assess the model's power, as this would cause overfitting. Based on these results, which revealed limitations in the linear prediction of shorter ICU stays even in the validation cohort, the linear approach had to be adapted via a categorization into likely ICU and unlikely ICU with six days as a decision cut-off between intensive care and normal care treatment.

Figure 5: Correlation between actual and predicted ICU treatment applied to a second dataset not affected by cross-validation. The predicted days are compared to the actual days, and the model is categorized as described previously. X-axis: predicted ICU days, y-axis: actual ICU days for the validation-naïve patients V1-V15. Only the categorizing version represents the final model. Thus, the light-colored patients would have a recommendation for normal inpatient treatment, and the dark-colored patients would have a referral for ICU treatment.

The cut-off was based on the RMSE in the validation set and was finally tested in a validation-independent cohort. Two false positives were identified in the test cohort, resulting in an AUC of 0.91 in ROC analysis (Figure 6). Compared to singular models (RSNA, cfDNA, Radiomics, Routine lab), the integrated model demonstrates the highest predictive potency for intensive care requirements (accuracy = 0.87, Table 3). The accuracies were determined using the independent test set not used for prior cross-validation.

Figure 6: ROC analysis of the cfDNA model, the Radiomics model, and the integrated model applying the categorized approach predicting ICU demand (yes/no).

Discussion
This study assessed several diagnostic models for predicting the requirement of intensive care treatment in COVID-19 patients. The special aspect of our model is the integration of routine laboratory, cfDNA, and radiomics, which was intended to increase the diagnostic potential and has thus been trained, validated, and verified in independent datasets. Our results show a synergistic potential of laboratory and radiological parameters to support clinical decision-making in COVID-19 patients.
These results align with the published literature but provide additional insights through a truly interdisciplinary diagnostic assessment. Some studies have focused on differentiating COVID-19 pneumonia from other lung conditions [17]. Subsequently, the prognostic value of radiomics based on initial CT scans was investigated by Zhang et al., who proposed an AI-based radiomics nomogram to predict disease progression in COVID-19 patients [18]. Similarly, Lassau et al. have shown via AI-based deep learning that the severity of COVID-19 can be predicted by integrating CT scan data with biological and clinical parameters [19]. Other studies have also dealt with outcome parameters such as hospitalization, patient management, or organ involvement, such as acute renal failure in COVID-19. In some cases, as in our work, cell decay markers (LDH instead of cfDNA), acute phase parameters (CRP, WBC), and quantitative lung parenchyma data were identified as possible predictors. Thus, the selection of our potential predictors is partially supported by results published in other studies [20][21][22]. Nevertheless, combining radiomics, routine laboratory, and cfDNA represents a new aspect.
Among the initial routine laboratory values, the prognostic value of D-dimers has been described extensively [23,24]. Initial hypercoagulability with the transition to the consumptive stage of disseminated intravascular coagulopathy (DIC) has been reported [25][26][27]. Moreover, Gatto et al. showed a frequent occurrence of pulmonary embolism between days 1 and 47 of hospitalization, occurring in the majority of cases on day 10 [27]. We evaluated the CT images in temporal proximity to the first available blood sampling for the presence of central or distal pulmonary embolism. In the scans that could be assessed for pulmonary embolism, none was demonstrated. As we used the closest available initial laboratory values to identify appropriate treatment predictors, this could explain why pulmonary embolism was not present at that time. Gatto et al. also described a high variability in the incidence of pulmonary embolism in COVID-19, which is consistent with our result [27]. This may explain why D-dimers were not identified as a marker for predicting the need for intensive care treatment in our study. Still, platelets and PTT were included in the final integrative model, emphasizing the importance of hemostaseological diagnostic findings in COVID-19. In addition, models for predicting mortality that combine laboratory or radiological parameters with clinical aspects have already been established via comparable machine-learning approaches [28,29]. Thus, the potential of AI-based algorithms has been demonstrated and can be expanded to interdisciplinary approaches combining laboratory and radiomics values [28]. Predictive endpoints in the previous study by Yu et al. were the need for ventilation and, ultimately, patient mortality. In contrast, our study presents a tool that might help clinicians triage COVID-19 patients upon initial presentation.
For this reason, we have adapted the initial target of predicting ICU duration into a categorizing approach and propose a model that might help physicians in emergency departments to distinguish between ICU and normal care demand. Chieregato et al. emphasize that classification into ICU/non-ICU depicts an endpoint representing clinical decision support, a conclusion that our results support. In addition, the high variation of clinical symptoms was addressed by Chieregato et al. Thus, we adopted a purely instrument-based diagnostic approach [30]. Moreover, our results are consistent with this, as there was no significant difference in the initial symptom-based severity score between the ICU and non-ICU cohorts.
Furthermore, in this study, we augmented the routine laboratory parameters used in the previous studies mentioned above with cfDNA, a dynamic marker of cell decay. Regarding cfDNA, Cavalier et al. have already described nucleosomal cfDNA as a predictor of ICU care requirements, underlining the relevance of cfDNA for ICU prediction [5]. We demonstrated significant differences between the cfDNA concentrations of patients in normal care and intensive care units (p < 0.001), and its importance for predicting ICU demand was verified via the random forest approach.
Moreover, a study by Giraudo et al. presents a radiomics model for predicting ICU transfer [30]. Our results may indicate the elevated diagnostic accuracy of an integrated, multimodal approach for COVID-19 diagnostic evaluation. This can be explained by the additional information that routine laboratory values offer on extrapulmonary organ damage, which compounds the risk of requiring ICU treatment. The results highlight the need for an integrated assessment of interdisciplinary diagnostic data to better stratify the planning of treatment capacities and potentially achieve better clinical outcomes.
Yet, this study must be interpreted with some limitations. First, the results presented are from a small cohort, because many patients died during the first global spread of COVID-19 and explicit patient consent was required for cfDNA testing from residual routine care material. Due to the prospective part of the laboratory analysis, not all in-house data could be used for our model; including them would have increased the generalizability of our results. In particular, the assessment in a naïve cohort has to be expanded in follow-up studies, as we had to minimize this cohort in favor of the training and validation cohort. Still, we extracted important material and data from the first pandemic waves. We were able to present the applicability of cfDNA and machine-learning algorithms for the stratification of ICU capacities. Thus, information from past scenarios can now be extrapolated to future variants that may again be associated with elevated hospitalization rates. In a subsequent approach with significantly larger datasets, we aim to evaluate the model's potential for other SARS-CoV-2 variants. To ensure a higher number of participants, we will assess the necessity of cfDNA in this model. A potential focus on radiomics and routine laboratory parameters could enable its applicability in smaller centers that may not perform cfDNA isolation and quantification. A purely retrospective analysis would enable the utilization of a larger dataset for testing the model, enhancing the significance, robustness, and generalizability of the potential predictors presented in this study. Additionally, we are considering testing the model on other respiratory diseases to determine whether it is a general model for infectious respiratory diseases or specific to COVID-19.
Considering the selected parameters for the model, further limitations can be discussed. One aspect is the influence of anticoagulant medication, such as heparin, used in intensive care cohorts. However, when considering the aPTT of both cohorts, no significant difference could be observed, suggesting no influence of heparin on the variable selection. In addition to indicating organ damage, a simultaneous elevation of AST, LDH, and cfDNA may also point to in-vitro hemolysis as a potential confounding factor. To minimize this pre-analytical factor, visual control and photometric evaluation of the sample quality were performed before the analyses to support the validity of the results.
Furthermore, there was a certain degree of heterogeneity with regard to the imaging data. Most cases were scanned on one scanner with one standardized protocol; however, a remaining bias cannot be fully ruled out.
With regard to the cfDNA isolation method, it has been shown that the final elution volume and elution steps can be adjusted when the initial plasma volume is low [31]. Since we did not apply these adjustments, this influence on concentration values should be considered when comparing the results with other studies. In addition, a fluorometric approach was used for the concentration measurements, as this is a more cost-effective method and easier to implement in routine diagnostic procedures. In this context, the influence of the carrier RNA contained in the isolation kit may be considered. However, as all samples were treated identically, this does not affect the comparison of our ICU and non-ICU cohorts, so the significant differences remain valid. In addition, it is well known that patients with significantly elevated BMI suffer from more severe disease requiring intensive care treatment [32]. As cfDNA levels are known to correlate with body weight, this might also elevate cfDNA levels in addition to infection-associated cell damage [33].
Moreover, pathognomonic symptoms have been included in various other scoring systems [34,35]. In our cohort, the clinical data on initial symptoms revealed no significant difference among the sub-cohorts. Due to this low discrimination potential, the clinical variables were not included. Furthermore, it could be argued that the pulmonary oxygenation capacity influences clinical disease progression. However, reliable blood gas analysis (BGA) requires arterial blood sampling, which is often difficult to achieve outside intensive care environments, suggesting limited applicability. Equally, including BGA data from non-arterial samples, which are regularly obtained during hospital admission, did not provide relevant information for predicting ICU stay. For these reasons, we did not include BGA and instead focused our model on the radiomics evaluation of lung imaging and the laboratory parameters.
Finally, we established an integrative linear prediction model for the requirement of ICU admission. Based on our cross-validation set, the model was not capable of correctly predicting impending shorter ICU stays of up to 6 days. For this reason, we have resorted to a categorization model of "ICU stay likely/unlikely" with a cut-off at six days of predicted ICU stay. The cut-off is based on the mean deviations of predicted and actual days in ICU in our cross-validation set and was further applied to a validation-naïve set with high accuracy. This approach is intended to assist medical staff in assessing the probable demand for intensive treatment during their patients' initial presentation. The ROC analysis shows a clear superiority of the integrated model compared to the isolated assessment of the biomarkers. We also conclude that cfDNA is a complementary but not essential parameter in this categorized approach. Nevertheless, cfDNA was found to be significantly elevated in patients requiring intensive care, confirming its potential as a dynamic marker of severe disease.
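The categorization step described above can be illustrated with a minimal Python sketch. It assumes that a predicted ICU stay strictly above the six-day cut-off is labeled "ICU stay likely" (how the boundary value itself is handled is an assumption), and it computes accuracy and a rank-based AUC. The function names and all example values are hypothetical and are not taken from the study data.

```python
from typing import List, Sequence

CUTOFF_DAYS = 6.0  # cut-off on predicted ICU days, as stated in the text


def classify_icu(predicted_days: Sequence[float],
                 cutoff: float = CUTOFF_DAYS) -> List[int]:
    """Return 1 ("ICU stay likely") if the predicted stay exceeds the cut-off.

    Assumption: the comparison is strict (> cutoff); the text does not say
    how a prediction of exactly six days is handled.
    """
    return [1 if days > cutoff else 0 for days in predicted_days]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of patients whose predicted class matches the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute the AUC.")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: predicted ICU days from a linear model and the
# observed ICU demand (1 = ICU, 0 = no ICU) for six made-up patients.
predicted_days = [1.5, 3.0, 7.2, 9.8, 5.0, 12.4]
needed_icu = [0, 1, 1, 1, 0, 1]

labels = classify_icu(predicted_days)
print("predicted class:", labels)
print("accuracy:", round(accuracy(needed_icu, labels), 2))
print("AUC:", roc_auc(needed_icu, predicted_days))
```

In this sketch the AUC is computed on the continuous predicted days, while the accuracy depends on the chosen cut-off; moving the cut-off changes the accuracy but leaves the AUC unchanged.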

Conclusions
In conclusion, we have identified radiomics and laboratory biomarkers associated with a severe COVID-19 course via feature selection algorithms (aim I). The selected parameters have been verified by a random forest approach (aim II), and collinear parameters were reduced via a minimal redundancy algorithm (aim III). Moreover, our results might suggest a solution for a difficult clinical decision-making problem in patients experiencing severe COVID-19, namely, predicting at the time of hospital admission whether a patient might soon need ICU treatment. An interdisciplinary approach of integrated diagnostics using laboratory medicine and radiology biomarkers was used to establish this clinical prediction model and was superior to single models (aim IV). Therefore, we propose to study whether prediction algorithms can improve the efficient use of ICU capacities. Particularly in scenarios of rapidly rising global infection rates and concomitant hospitalizations, this approach of facilitating the triage of vulnerable patients might be relevant.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/microorganisms11071740/s1, Supplemental S1: Blood gas analysis to predict ICU days; Supplemental S2: Verification of reduced features via unsupervised clustering in our validation set.
Institutional Review Board Statement: The Institutional Review Board (2020-541N) approved the study protocol, and the study was conducted in accordance with the Declaration of Helsinki.
Informed Consent Statement: Informed written consent was obtained from each subject.