Simplified Acute Physiology Score 3 Performance in Austrian COVID-19 Patients Admitted to Intensive Care Units with and without Diabetes

This study evaluated and compared the performance of the simplified acute physiology score 3 (SAPS 3) for predicting in-hospital mortality in COVID-19 patients admitted to intensive care units (ICUs) with and without diabetes in Austria. The Austrian national public health institute (GÖG) data of COVID-19 patients admitted to ICUs (n = 5850) were analyzed. Three versions of SAPS 3 were used: the standard equation, the Central European equation, and the Austrian equation customized for COVID-19 patients. The observed in-hospital mortality was 38.9%, 42.9%, and 37.3% in all, diabetes, and non-diabetes patients, respectively. The overall C-statistic was 0.69, with a non-significant (p = 0.193) difference between diabetes (0.70) and non-diabetes (0.68) patients. The Brier score was > 0.20 for all SAPS 3 equations in all cohorts. Calibration was unsatisfactory for both the standard and Central European equations in all cohorts, whereas it was satisfactory for the Austrian equation in diabetes patients only. The SAPS 3 score demonstrated low discrimination and accuracy in Austrian COVID-19 patients, with a non-significant difference between diabetes and non-diabetes patients. All equations were miscalibrated, particularly in non-diabetes patients, while the Austrian equation showed satisfactory calibration in diabetes patients only. Both uncalibrated and calibrated versions of SAPS 3 should be used with caution in COVID-19 patients.


Introduction
Coronavirus disease (COVID-19) has caused a devastating pandemic with a high hospitalization rate and mortality. As of 6 December 2021, more than 272 million cases of COVID-19 and more than 5 million deaths have been reported worldwide [1]. This health crisis has severely challenged the capacity of healthcare systems to treat hospitalized

Study Design and Data Source
The "Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD)" checklist was used for reporting this study [7]. This study retrospectively analyzed the cohort of patients with and without diabetes mellitus admitted to ICUs following primary or secondary diagnosis of SARS-CoV-2 infection from March 2020 to March 2021 in Austria. These data are collected and maintained by the "Data platform COVID-19" commissioned by the Austrian National Public Health Institute (Gesundheit Österreich GmbH, Vienna, Austria). This platform gathers nationally representative countrywide epidemiological and clinical data of COVID-19 patients to provide updated evidence on SARS-CoV-2 infection in Austria. The details of this data platform can be accessed at: https://datenplattform-covid.goeg.at/english (accessed on 3 May 2021).

Data Extraction
For this study, two anonymized datasets were received from the Austrian data platform: (1) hospital data that comprised variables on demographic characteristics, comorbidities, ICU stay, and in-hospital mortality and (2) SAPS 3 data that comprised the variables for calculating the SAPS 3 score and the number of readmissions to the ICU. Data of patients admitted to ICU were extracted from the hospital data and then matched and merged with the SAPS 3 data after removing readmissions. Afterwards, patients aged less than 20 years were removed from the merged data, as only adults were considered in the study. A total of 5850 patients were included in the final analysis (Figure 1).
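The extraction steps described above can be sketched as follows. This is a minimal illustration only: the field names (`patient_id`, `icu`, `icu_episode`, `age`, `saps3`) and the toy records are hypothetical, since the actual schema of the data platform is not described here.

```python
# Toy records with hypothetical field names; not the real platform schema.
hospital_rows = [
    {"patient_id": 1, "icu": True, "age": 67, "died": False},
    {"patient_id": 2, "icu": False, "age": 54, "died": False},
    {"patient_id": 3, "icu": True, "age": 17, "died": False},
    {"patient_id": 4, "icu": True, "age": 81, "died": True},
]
saps_rows = [
    {"patient_id": 1, "icu_episode": 1, "saps3": 52},
    {"patient_id": 1, "icu_episode": 2, "saps3": 60},  # ICU readmission
    {"patient_id": 3, "icu_episode": 1, "saps3": 41},
    {"patient_id": 4, "icu_episode": 1, "saps3": 71},
]


def build_cohort(hospital_rows, saps_rows, min_age=20):
    """Keep ICU patients, drop readmissions, merge on patient_id, keep adults."""
    icu = {r["patient_id"]: r for r in hospital_rows if r["icu"]}
    first_episode = {r["patient_id"]: r for r in saps_rows if r["icu_episode"] == 1}
    cohort = []
    for pid in sorted(icu.keys() & first_episode.keys()):
        merged = {**icu[pid], **first_episode[pid]}
        if merged["age"] >= min_age:
            cohort.append(merged)
    return cohort


cohort = build_cohort(hospital_rows, saps_rows)
print([row["patient_id"] for row in cohort])
```

In this toy input, patient 2 is excluded for not being admitted to ICU and patient 3 for being younger than 20 years.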

Study Variables
The outcome variable was in-hospital mortality, defined as death in the hospital (versus discharge alive) following hospitalization for primary or secondary SARS-CoV-2 infection. Diabetes was recorded in the database as a comorbidity (insulin and non-insulin dependent diabetes) for the SAPS 3 and as per International Classification of Diseases, 10th Revision (ICD-10) codes (E10, E11, E12, E13, E14).
The SAPS 3 score consists of 20 variables that were recorded at the time of ICU admission. These variables were classified as patient characteristics, reasons for ICU admission, and acute physiological disruptions. Patient characteristics included age deciles (20-90+ years), gender, previous health status, comorbidities, intra-hospital location before ICU, length of stay in the hospital before ICU admission, and major therapeutic interventions before ICU admission. Reasons for ICU admission included health conditions, status and site of surgery, and the presence of infection at ICU admission. Acute physiological disruptions were measured in terms of vital signs, neurological status, serum creatinine, leukocytes, platelets, blood pH, partial pressure of oxygen (PaO2), and fraction of inspired oxygen (FiO2). Detailed information regarding the calculation of the SAPS 3 score is published elsewhere [4].

Summary Statistics
Data were received in Microsoft Excel and analyzed in R version 1.4.1 and Stata version 17.0 (Stata Corp, Houston, TX, USA). Missing values of SAPS 3 variables were replaced with either reference or normal categories as recommended in the SAPS 3 publication. Continuous variables were reported as mean ± standard deviation (SD) or median and interquartile range (IQR) if not normally distributed. Categorical variables were reported as frequencies with corresponding percentages (%).

Calculation of SAPS 3 Score and Predicted in-Hospital Mortality
The SAPS 3 score was calculated only for the first episode of ICU admission using the variables and algorithm recommended in the original publication [4]. The predicted in-hospital mortality was estimated from the SAPS 3 score using three logit regression equations: (1) the standard equation, (2) the Central European equation, and (3) the Austrian equation customized for COVID-19 patients [4,6]. In addition, the standardized mortality ratio (SMR) was estimated by dividing the observed mortality rate by the predicted mortality rate, with corresponding 95% confidence intervals (CI).
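As an illustration of how a SAPS 3 score is converted into a predicted probability and an SMR, the sketch below assumes the general form logit = b0 + ln(score + b1) × b2. The coefficients shown are those commonly cited for the standard (global) SAPS 3 equation and are an assumption here; they should be verified against [4]. The Central European and Austrian coefficients are not reproduced. The confidence interval uses a simple Poisson-based normal approximation, which is not necessarily the method used in this study.

```python
import math

# Assumed coefficients of the standard (global) SAPS 3 equation; verify against [4].
STANDARD = (-32.6659, 20.5958, 7.3068)


def predicted_mortality(saps3_score, coefficients=STANDARD):
    """Convert a SAPS 3 score into a predicted probability of in-hospital death."""
    intercept, offset, slope = coefficients
    logit = intercept + math.log(saps3_score + offset) * slope
    return 1.0 / (1.0 + math.exp(-logit))


def smr_with_ci(observed_deaths, predicted_probs, z=1.96):
    """SMR = observed / expected deaths, with an approximate 95% CI.

    The standard error SMR / sqrt(observed) is a Poisson-based approximation
    chosen for illustration only.
    """
    expected_deaths = sum(predicted_probs)
    smr = observed_deaths / expected_deaths
    half_width = z * smr / math.sqrt(observed_deaths)
    return smr, smr - half_width, smr + half_width


# Hypothetical scores and outcome count, for illustration only.
scores = [45, 57, 70, 62, 51]
probs = [predicted_mortality(s) for s in scores]
smr, lower, upper = smr_with_ci(observed_deaths=2, predicted_probs=probs)
print(f"expected deaths = {sum(probs):.2f}, SMR = {smr:.2f} ({lower:.2f} to {upper:.2f})")
```

An SMR above 1 means that more deaths were observed than predicted, i.e., the equation underestimates mortality.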

Assessment of Predictive Performance of SAPS 3
The predictive performance of each SAPS 3 equation was assessed in terms of discrimination, calibration, and accuracy. Discrimination was assessed by estimating the area under the receiver operating characteristic curve (AUC) or C-statistic with corresponding 95% CI. The AUC was compared between patients with and without diabetes using the DeLong test, and a p-value of <0.05 was chosen to determine statistical significance. The Youden index was estimated to select the optimal cut-off value of the SAPS 3 score for the overall, diabetes, and non-diabetes cohorts. The identified cut-off values were then used to calculate the sensitivity, specificity, and predictive values of the SAPS 3 score.
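The discrimination measures described above can be sketched in a few lines. The example below computes the AUC as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, and selects the cut-off that maximizes the Youden index J = sensitivity + specificity − 1. The scores and outcomes are made-up values, and the DeLong test is not implemented here.

```python
def auc(scores, outcomes):
    """AUC via pairwise comparison of non-survivors (1) against survivors (0)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, outcomes):
    """Return (J, cut-off, sensitivity, specificity) for the best cut-off.

    A patient is classified as high risk when score >= cut-off.
    """
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    best = None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cut)
        sensitivity, specificity = tp / n_pos, tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cut, sensitivity, specificity)
    return best


# Hypothetical SAPS 3 scores and outcomes (1 = died in hospital).
scores = [40, 48, 52, 55, 58, 61, 66, 72]
died = [0, 0, 0, 1, 0, 1, 1, 1]
print(auc(scores, died), youden_cutoff(scores, died))
```

When several cut-offs share the same Youden index, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.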
Calibration was assessed by comparing the predicted probability against the observed probability using the Hosmer-Lemeshow (H-L) goodness-of-fit test and the calibration plot.
In the H-L test, a p-value > 0.05 indicates a good fit. In the calibration plot, a calibration slope close to 1, a calibration intercept (calibration-in-the-large, CITL) close to 0, and alignment of the lowess calibration curve with the reference line each indicate good calibration. Accuracy of the SAPS 3 in predicting in-hospital mortality was assessed using the Brier score. The Brier score ranges from 0 (perfect accuracy) to 1; a value of 0.25 corresponds to a non-informative model that assigns a probability of 0.5 to every patient.
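To make the accuracy and calibration measures concrete, the sketch below computes the Brier score (mean squared difference between predicted probability and outcome) and the Hosmer-Lemeshow statistic over groups of patients ordered by predicted risk. It is an illustration under simple assumptions (equal-sized groups, no p-value computation); the statistic would normally be compared against a chi-square distribution with (groups − 2) degrees of freedom.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def hosmer_lemeshow_statistic(probs, outcomes, groups=10):
    """Sum over risk groups of (observed - expected)^2 / (n * pbar * (1 - pbar))."""
    # Sort by predicted probability only, so ties keep their original order.
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1 - expected / len(chunk))
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic


# A model that predicts 0.5 for everyone is non-informative: Brier score 0.25.
print(brier_score([0.5, 0.5, 0.5, 0.5], [0, 1, 0, 1]))
```

A larger Hosmer-Lemeshow statistic indicates a larger gap between observed and expected deaths across risk groups, which corresponds to a smaller p-value and poorer calibration.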

Ethical Considerations
This study was approved by the Ethics Committee of the Medical University of Graz, Graz, Austria (ethics number 32-355 ex 19/20). This study followed the guidelines of good clinical practice and the Declaration of Helsinki 1964. No consent forms were obtained from the study participants, as it was a retrospective analysis of pseudonymized data.

Characteristics of Patients
Table 1 shows the distribution of characteristics, SAPS 3 variables, and SAPS 3 score in COVID-19 patients admitted to ICU in all, diabetes, and non-diabetes patients. Of the 5850 patients admitted to ICU, 1667 (28.50%) had diabetes. Most patients were males (66.07%) and aged above 60 years. The mean ± SD SAPS 3 score was 57.39 ± 13.18 in the overall cohort and was significantly higher in patients with diabetes than in those without diabetes (58.78 ± 12.92 vs. 56.84 ± 13.23, p < 0.001).
In Table 1, under reasons for admission, the category "all others" in each system includes reasons that are either not related to that particular system or do not fall into the specified categories within that system. Pearson's chi-square or Fisher's exact test was applied to compare qualitative variables by diabetes status. Two-sample t-tests or Wilcoxon rank sum tests were applied to compare quantitative variables by diabetes status.

Observed In-Hospital Mortality and Its Comparison with Variables
Table 2 shows that the overall observed in-hospital mortality was 38.91%, and it was significantly higher in patients with diabetes than in those without diabetes (42.95% vs. 37.29%, p < 0.001). Patients who died in the hospital had significantly higher mean ± SD SAPS 3 scores than those who survived in all (62.57 ± 12.86 vs. 54.10 ± 12.29, p < 0.001), diabetes (63.96 ± 13.15 vs. 54.87 ± 11.28, p < 0.001), and non-diabetes patients (61.92 ± 12.68 vs. 53.82 ± 12.62, p < 0.001).
In Table 2, under reasons for admission, the category "all others" in each system includes reasons that are either not related to that particular system or do not fall into the specified categories within that system. Pearson's chi-square or Fisher's exact test was applied to compare qualitative variables by in-hospital mortality status. Two-sample t-tests or Wilcoxon rank sum tests were applied to compare quantitative variables by in-hospital mortality status.
The standard and Central European equations significantly underestimated the in-hospital mortality in all three patient populations, i.e., all, diabetes, and non-diabetes patients. The Austrian equation concorded well with the observed mortality in the overall and non-diabetes cohorts, whereas it slightly underpredicted the mortality in patients with diabetes.

Discrimination and Accuracy of SAPS 3
The optimal cut-off SAPS 3 score was 55, 55, and 58 for the overall, non-diabetes, and diabetes cohorts, respectively. Based on these cut-off scores, sensitivity was 72.4%, 70.6%, and 66.8%; specificity was 54.5%, 56.0%, and 60.6%; positive predictive value was 50.3%, 48.8%, and 56.0%; and negative predictive value was 75.6%, 76.2%, and 70.8% for the overall, non-diabetes, and diabetes cohorts, respectively. The SAPS 3 showed unsatisfactory discrimination for all three equations (AUC = 0.69), with non-significantly higher discrimination (p = 0.193) in patients with diabetes (AUC = 0.70) compared to those without diabetes (AUC = 0.68) for each equation (Table 3 and Figure 2). The Brier score was > 0.20 for all three equations in all three patient cohorts, which indicated poor accuracy in COVID-19 patients (Table 3).
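The relation between these figures can be checked with Bayes' rule: given sensitivity, specificity, and the observed mortality as prevalence, the predictive values follow directly. The short sketch below uses the overall-cohort values reported above (sensitivity 72.4%, specificity 54.5%, observed mortality 38.91%); the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Overall cohort: sensitivity 72.4%, specificity 54.5%, observed mortality 38.91%.
ppv, npv = predictive_values(0.724, 0.545, 0.3891)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

For the overall cohort this yields approximately 50.3% and 75.6%, in line with the reported predictive values.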

Calibration of SAPS 3
The SAPS 3 standard and Central European equations were miscalibrated in all three patient cohorts. Both equations underpredicted the mortality in low- and medium-risk groups and overpredicted the mortality in high-risk groups of all and non-diabetes patients. In patients with diabetes, these equations underpredicted the mortality in low- and medium-risk strata. In comparison, the Austrian recalibrated equation overpredicted the mortality in high-risk groups in the entire cohort and non-diabetes patients but had good calibration in low- and medium-risk strata. It showed reasonable calibration across all risk strata of diabetes patients, as indicated by the calibration curve and H-L test (p = 0.339) (Table 3, Figure 3).

Discussion
This countrywide retrospective cohort analysis assessed and compared the performance of SAPS 3 for predicting the mortality in COVID-19 patients with and without diabetes using the standard and customized equations for Central Europe and Austrian COVID-19 patients. The standard and Central European equations significantly underestimated the in-hospital mortality in all three patient populations, while the Austrian equation accurately predicted in-hospital mortality in all three patient populations. The discrimination of all SAPS 3 equations was unsatisfactory in all patient cohorts and was insignificantly higher in patients with diabetes compared to those without diabetes. Likewise, the forecasting accuracy of all SAPS 3 equations was low in all cohorts. The calibration was poor for SAPS 3 standard and Central Europe equations in all three patient cohorts, and it was the worst in non-diabetes patients. The Austrian equation showed superior calibration to other SAPS 3 equations in all three populations; however, its calibration was satisfactory in diabetes patients only.
Our analysis revealed that both uncalibrated and calibrated versions of SAPS 3 equations demonstrated unsatisfactory discriminatory performance (AUC = 0.69) and accuracy (Brier score > 0.20) in patients with COVID-19. Although the performance of various prognostication scores has been evaluated in COVID-19 patients, surprisingly, only a few studies have validated the SAPS 3 in this patient population. Compared to our study, a recent research letter reported higher discrimination of SAPS 3 (AUC = 0.75) in 1464 patients admitted to ICUs in Austria. However, the score remarkably underestimated the in-hospital mortality (SMR = 1.20), especially in low-risk groups, thereby questioning its clinical applicability [6]. Another research letter showed that the discrimination of the SAPS 3 regional equation was good (AUC = 0.83) with a well-concorded SMR (0.95) in Brazilian COVID-19 patients [5]. We speculate that the SAPS 3 tool has yielded different discrimination in Brazilian and Austrian patients due to differences in healthcare infrastructure and other healthcare-related factors, treatment regimens, the severity of disease, and the distribution of these risk factors in the population under study [8]. In addition, the underlying risk factors and the magnitude of their coefficients that comprise the tool are central to the discriminatory performance of a tool [9]. Furthermore, the SAPS 3 simplifies significant factors such as old age [10] and physiological disturbances into categories, which may provide inappropriate coefficients of association for predicting in-hospital mortality in COVID-19 patients [4]. This particular issue for some SAPS 3 risk factors was highlighted by a multicenter European study [11]. Moreover, COVID-19 is more prevalent in people with multimorbidity and affects multiple body systems, and several inflammatory, coagulation, and cardiac markers have been shown to predict its severity and adverse outcomes [12]. However, the SAPS 3 score does not incorporate all these factors and markers into its equation, which might have resulted in underpredicting the mortality and hence in its poor predictive performance in COVID-19 patients [13,14].
While validating the performance of risk tools in a specific population, satisfactory discrimination alone does not guarantee that the tool performs well in different risk strata of patients. For this reason, achieving an optimal calibration is equally important for accurately classifying patients into risk strata and hence making accurate clinical decisions. Considering that COVID-19 is a debilitating infection with a high mortality rate, the accurate identification of high-risk COVID-19 patients could be vital for their clinical management and prognosis. However, in our study, the SAPS 3 standard and Central European equations were extremely miscalibrated, particularly in low- and medium-risk strata of patients. These findings are not surprising, as previous validation studies also found inadequate calibration for SAPS 3 in Austrian and Brazilian COVID-19 patients. However, in the Brazilian COVID-19 patients, the miscalibration was more obvious in high-risk groups, while in the Austrian COVID-19 patients, similar to our findings, it was more apparent in low-risk groups [5,6]. The issue of miscalibration for SAPS 3 standard equations has been well documented in various patient populations [11,15,16], which indicates that this tool does not perform well in specific populations due to various patient characteristics, healthcare-related factors, variability in the coefficient of association between some risk factors and mortality, and the level of predicted outcome in the population [11]. Consequently, poor calibration of SAPS 3 compromises its clinical utility in COVID-19 patients, a fact that clinicians should be aware of.
Given the above-mentioned reasons, recalibration of the SAPS 3 has been recommended prior to applying it to any patient population [5,6]. Therefore, we also adopted the recently published SAPS 3 equation for COVID-19 patients to evaluate its predictive performance in our cohort of COVID-19 patients [6]. As expected, this equation was superior to the standard and Central European equations for predicting the mortality, as indicated by the SMR close to 1. However, interestingly, this equation overpredicted the mortality in high-risk groups in the entire cohort and non-diabetes patients but exhibited satisfactory calibration in patients with diabetes. As mentioned earlier, the uncalibrated equations showed a similar pattern of miscalibration in the Austrian COVID-19 patients in our study and the previous study [6]. Hence, it is possible that recalibrating the equation specifically for low-risk groups might have induced the miscalibration in high-risk groups as shown in our study. The selective adequate calibration of this SAPS 3 equation in diabetes patients is a conundrum when, in fact, diabetes is not included as a risk factor in this tool. We can only conjecture that people with diabetes are more likely to have severe COVID-19 disease, a higher burden of multimorbidity and risk factors, and pronounced physiological disturbances than their counterparts [13,15]. Perhaps that is why even uncalibrated equations showed better calibration in these patients. Nevertheless, our findings suggest that even the recalibrated equation of SAPS 3 performed inadequately in COVID-19 patients, and therefore, this tool ought to be used with caution in this population.
As stated above, people with diabetes may experience severe COVID-19 infection, its complications, and mortality due to compromised immune and inflammatory response, advanced age, multimorbidity, and metabolic derangements [13,17,18]. Hence, we expected that the SAPS 3 would exhibit superior discriminatory performance in patients with diabetes in comparison with those without diabetes. On the contrary, the discrimination was only ~2% (p = 0.193) higher in patients with diabetes than in those without diabetes. One probable reason for the similar discrimination might be related to the inherent risk factors that are considered in the calculation of the SAPS 3 score. To elaborate, the SAPS 3 is not designed for any specific disease. Rather, it is based on comprehensive patient characteristics, previous health status and therapeutic interventions, surgical status, and physiological markers, which are not specific to diabetes and hence could be altered in ICU patients with any pathophysiological condition [4]. These aforementioned reasons further support our findings that SAPS 3 may not be an appropriate prognostic tool for many clinical conditions including COVID-19.
This study has several limitations. First, diabetes was not classified into type 1 and type 2 diabetes because of the issue of miscoding in ICD-10 codes. Second, in-hospital mortality was defined as death occurring from any underlying causes. This could have included non-COVID-19-related deaths. Nevertheless, as this database captures data for COVID-19 patients only, the probability of including other causes of death is minimal. Third, the predictive performance of the SAPS 3 is significantly influenced by characteristics of patients, distribution of risk factors comprising the SAPS 3 score, and the healthcare system under study. Therefore, the findings of our study may not be transferable to other COVID-19 cohorts.

Conclusions
To conclude, SAPS 3 showed low discrimination and accuracy in Austrian COVID-19 patients, with no significant difference between diabetes and non-diabetes patients. Both the uncalibrated and European calibrated equations of SAPS 3 were extremely miscalibrated, especially in non-diabetes patients. We therefore recommend investigating specific determinants of SAPS 3 discrimination and calibration in COVID-19 patients. Moreover, even though the Austrian equation calibrated for COVID-19 patients demonstrated better calibration, especially in patients with diabetes, its low discrimination and forecasting power suggest that even calibrated SAPS 3 versions should be applied with caution in COVID-19 patients and revalidated locally. In addition, it would be prudent to re-evaluate its predictive performance periodically and update it as required to incorporate the impact of changes in the SARS-CoV-2 virus characteristics and treatment regimens. Furthermore, as both standard and recalibrated equations of SAPS 3 demonstrated better predictive performance in COVID-19 patients with diabetes compared to non-diabetes patients, we recommend further studies investigating this phenomenon.

Informed Consent Statement:
No consent forms were obtained from the study participants, as this was a retrospective analysis of pseudonymized data.

Data Availability Statement:
The dataset used in this study is a property of the Austrian National Public Health Institute (Gesundheit Österreich GmbH). Further information regarding data access is available at: https://datenplattform-covid.goeg.at/english, accessed on 3 May 2021.