A Simple Bacteremia Score for Predicting Bacteremia in Patients with Suspected Infection in the Emergency Department: A Cohort Study

Bacteremia is a life-threatening condition that has increased in prevalence over the past two decades. Prompt recognition of bacteremia is important; however, identification of bacteremia requires 1 to 2 days. This retrospective cohort study, conducted from 10 November 2014 to 10 November 2019 among patients with suspected infection who visited the emergency department (ED), aimed to develop and validate a simple tool for predicting bacteremia. The study population was randomly divided into derivation and validation cohorts. Predictors of bacteremia based on the literature and logistic regression were assessed. A weighted value was assigned to each predictor to develop a prediction model for bacteremia using the derivation cohort; discrimination was then assessed using the area under the receiver operating characteristic curve (AUC). Among the 22,519 patients enrolled, 18,015 were assigned to the derivation group and 4504 to the validation group. Sixteen candidate variables were selected, and all sixteen were used as significant predictors of bacteremia (model 1). Among the sixteen variables, the five with the highest odds ratios, namely procalcitonin, neutrophil–lymphocyte ratio (NLR), lactate level, platelet count, and body temperature, were used for the simple bacteremia score (model 2). The proportion of bacteremia increased with the simple bacteremia score in both cohorts. The AUC was 0.805 (95% confidence interval [CI] 0.785–0.824) for model 1 and 0.791 (95% CI 0.772–0.810) for model 2. The simple bacteremia prediction score using only five variables demonstrated performance comparable to that of the model including sixteen variables using all laboratory results and vital signs. This simple score is useful for predicting bacteremia and assisting clinical decisions.


Introduction
Bacteremia is a major cause of morbidity and requires early detection and appropriate antibiotics [1][2][3]. Blood culture sampling is a mandatory method for detecting bacteremia, and it is commonly performed in emergency departments (EDs) for patients ranging from those with less severe infection to those with septic shock [4,5]. However, the prevalence of bacteremia is 7-20%, with a high rate of false positives, and the indications for performing blood cultures are not well established and thus remain controversial [6,7]. This results in unnecessary invasive procedures, consumption of resources, increased costs, inappropriate or delayed use of antibiotics, and prolonged hospital admission [8,9]. The rate of false-positive blood cultures, or contamination, is often highest in EDs. Robertson et al. reported contamination rates of 11.7% in the ED versus 2.5% in other hospital areas [10].
Several clinical tools have been developed to predict bacteremia using biomarkers and clinical scores [11]. Prediction tools that enable exclusion of bacteremia are therefore highly desirable to increase the cost-effectiveness of microbiological tests [12]. Shapiro et al. suggested that blood culture is indicated if at least one major or two minor criteria are present among 13 clinical parameters associated with high risk; otherwise, patients are classified as "low risk" and unnecessary blood cultures may be omitted [13,14]. In addition to clinical findings, many studies have suggested that laboratory investigations, such as procalcitonin (PCT) and neutrophil-lymphocyte ratio (NLR), may play a useful role in predicting bacteremia [15,16].
Although many efforts have been made to predict bacteremia, there are no detailed guidelines specifying which patients should undergo blood culture testing, and no simple prediction score for bacteremia has yet been developed. To identify patient groups at low risk for bacteremia and optimize blood culture practice, we aimed to develop a simple scoring system that has discriminatory value for predicting bacteremia and can help physicians classify bacteremia risk.

Materials and Methods
This large retrospective cohort study was conducted at the Samsung Medical Center, a university-affiliated, tertiary care referral hospital located in Seoul, South Korea, from 10 November 2014 to 10 November 2019. This study was approved by the Institutional Review Board (IRB, 2023-09-144) of Samsung Medical Center. Given the retrospective nature of the study and the use of anonymized patient data, requirements for informed consent were waived. The study population comprised patients >18 years of age with suspected infection who underwent blood culture sampling and received antibiotics at ED admission, excluding those with septic shock [17].

Study Design
The primary goal was to develop a simple bacteremia score (model 2) and to compare its predictive accuracy with a reference model (model 1).
Data were retrospectively collected from electronic medical records. The population was randomly divided into a derivation cohort (80% of samples) and a validation cohort (20% of samples) using R statistical programming. The R sampling function randomly selects 80% of the rows of the dataset without replacement for the derivation (training) set; the remaining 20% form the validation (test) set. Sixteen candidate variables possibly associated with bacteremia, including epidemiological factors, vital signs, and laboratory results, were analyzed in the derivation cohort using simple comparison and univariable and multivariable logistic regression; variables with a p value < 0.05 were identified as risk factors. The cut-off values for each variable were determined using the area under the receiver operating characteristic (ROC) curve (AUC) and a literature review. A reference model (model 1) was developed using the derivation cohort. Model 1 comprised variables that were found to be risk factors for bacteremia in multivariable logistic regression. Among the variables used in model 1, the five with the highest odds ratios (ORs) were selected to develop a simple bacteremia score model (model 2) using a regression coefficient-based scoring method. Predictive factors for bacteremia were identified using multivariable analysis, and each factor was assigned a weighted value based on its β coefficient, which reflects predictive power. The β coefficients were rounded to the nearest whole number, and the rounded number for each predictive factor was assigned as its points in the bacteremia score. The overall risk score was calculated as the sum of these points.
Finally, an ROC curve was generated, and the AUC was used to assess the performance of the prediction model in the validation cohort. The predictive performances of the two models were compared with each other and with that of the single variable (PCT) that exhibited the strongest association with bacteremia.
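The random 80/20 split described above can be sketched in a few lines. The study used R; the Python version below is only an illustrative equivalent, and the function name `split_cohort`, the seed value, and the use of sequential integers as patient identifiers are assumptions made for this example.

```python
import random


def split_cohort(patient_ids, derivation_fraction=0.8, seed=2014):
    """Randomly split IDs into derivation and validation cohorts.

    Sampling is done without replacement, so no patient appears in both cohorts.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed (arbitrary) seed so the split is reproducible
    n_derivation = round(len(ids) * derivation_fraction)
    # random.sample draws without replacement
    derivation = set(rng.sample(ids, n_derivation))
    validation = [pid for pid in ids if pid not in derivation]
    return sorted(derivation), validation


# Hypothetical identifiers 0..22518 standing in for the 22,519 enrolled patients
derivation_ids, validation_ids = split_cohort(range(22519))
print(len(derivation_ids), len(validation_ids))  # 18015 4504
```

With 22,519 identifiers, an 80% derivation fraction reproduces the cohort sizes reported in the study (18,015 and 4504).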
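As a minimal sketch of what the AUC measures, the snippet below computes it as the probability that a randomly chosen patient with bacteremia has a higher score than a randomly chosen patient without it, counting ties as one half. The function name `auc` and the toy data are assumptions for illustration; this is not the study's Stata procedure, and the pairwise loop is meant for clarity rather than speed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 = bacteremia, 0 = no bacteremia.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive case ranked above negative case
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): risk scores and blood culture results for six patients
toy_scores = [0, 1, 1, 2, 3, 4]
toy_labels = [0, 0, 1, 0, 1, 1]
print(round(auc(toy_scores, toy_labels), 3))  # 0.833
```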

Statistical Analysis
Standard descriptive statistics were used for all variables, including baseline demographics and outcomes. The results are expressed as the median and interquartile range (IQR) for continuous variables and as the number with percentage for categorical data. Continuous variables were compared using the Wilcoxon rank-sum test. Categorical variables were compared using the chi-squared test. Univariable and multivariable logistic regression analyses were performed to assess variables related to bacteremia. Multivariable logistic regression was used to evaluate independent predictors of bacteremia, as measured by the estimated OR with the corresponding 95% confidence interval (CI). Adjusted variables were selected based on their clinical relevance to bacteremia, and variables with significant associations in the univariable analysis were entered into a stepwise logistic regression model. In the stepwise logistic regression model, the p value threshold was set at 0.25 for entry into the model and at 0.1 for exclusion from the model. The goodness-of-fit of the final logistic regression model was assessed using the Hosmer-Lemeshow test. The variables entered into the model were assigned a score based on the ORs to produce a simple and easy clinical prediction scale. The discrimination performance of the risk index was assessed using the AUC, and the optimum cut-off value was chosen for optimal sensitivity and specificity. DeLong's test was used to compare ROC curves between the models. Differences with p < 0.05 were considered statistically significant. Statistical analysis was performed using Stata version 18.0 (StataCorp LLC, College Station, TX, USA).
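The text does not state which criterion was used to balance sensitivity and specificity when choosing the optimum cut-off. One common choice is Youden's index (J = sensitivity + specificity - 1); the sketch below assumes that criterion purely for illustration, and the function name `best_cutoff` and the toy data are hypothetical.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) that maximizes Youden's J.

    A patient is classified as positive when score >= cutoff.
    Assumes both classes are present; ties in J keep the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1:]


# Toy data (hypothetical): six patients
cutoff, sens, spec = best_cutoff([0, 1, 1, 2, 3, 4], [0, 0, 1, 0, 1, 1])
print(cutoff, round(sens, 3), round(spec, 3))  # 3 0.667 1.0
```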

Score Development and Developing a Simple Bacteremia Score (Model 2)
Among the sixteen variables, PCT, NLR, lactate (Lac), platelet count (PLT), and body temperature (BT) were the top five variables associated with bacteremia in the derivation cohort. Subsequently, a final logistic regression was performed to develop a simple bacteremia score (model 2) with the top five variables in the derivation cohort (Table 2). More specifically, these top five variables were significantly associated with an increased risk for bacteremia in a multivariable logistic regression: PCT (adjusted odds ratio [  Using these top five variables, a simple score was developed. To simplify the assessment of bacteremia risk, we used a points-based scoring system, which enables a rapid risk assessment without the use of computers or electronic devices. To develop the points-based scoring system, the β coefficients (log ORs) of this model were converted into integer risk points by rounding to the nearest whole number. The points associated with each level of each risk factor are defined relative to the points associated with an increase in a specified continuous variable. The calculated points were assigned to the independent variables. The simple bacteremia score was calculated by summing the points of the component variables; the total score ranged from 0 to 6 points (Table 3).
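The conversion from regression coefficients to an integer score can be sketched as follows. The β values and variable names below are hypothetical placeholders, chosen only so that the maximum total is 6; they are NOT the coefficients in Table 2 or the point assignments in Table 3.

```python
import math

# Hypothetical β coefficients (log odds ratios) for illustration only;
# these are NOT the values reported in Table 2.
HYPOTHETICAL_BETAS = {
    "procalcitonin_high": 1.6,
    "nlr_high": 0.9,
    "lactate_high": 0.7,
    "platelet_low": 0.6,
    "temperature_high": 0.8,
}


def points_from_betas(betas):
    """Round each β coefficient to the nearest whole number (halves round up)."""
    return {name: math.floor(beta + 0.5) for name, beta in betas.items()}


def simple_score(patient_flags, points):
    """Sum the points of every risk factor that is present in the patient."""
    return sum(points[name] for name, present in patient_flags.items() if present)


points = points_from_betas(HYPOTHETICAL_BETAS)

# Hypothetical patient with elevated procalcitonin and lactate only
patient = {
    "procalcitonin_high": True,
    "nlr_high": False,
    "lactate_high": True,
    "platelet_low": False,
    "temperature_high": False,
}
print(points["procalcitonin_high"], sum(points.values()), simple_score(patient, points))  # 2 6 3
```

Because the points are small integers, the total can be computed at the bedside by simple addition, which is the stated purpose of the points-based system.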

Validation
Prediction models were proposed, and the performance of each was assessed using ROC curves and calibration plots. In the derivation cohort, the AUC for predicting bacteremia was 0.803 (95% CI 0.794-0.813) for model 1, 0.790 (95% CI 0.781-0.800) for model 2, and 0.717 (95% CI 0.708-0.727) for PCT alone (p < 0.0001 [DeLong's test]). The predictive accuracy of model 2 (simple bacteremia risk score) was 0.87 (95% CI 0.87-0.88), with a sensitivity of 0.958, specificity of 0.215, positive predictive value of 0.900, and negative predictive value of 0.411. Using the validation cohort, an internal validation comparing the predictive value of model 1, model 2, and PCT alone was performed. A total of 4504 patients were included in the internal validation. The AUC for predicting bacteremia was 0.805 (95% CI 0.785-0.824) for model 1, 0.791 (95% CI 0.772-0.810) for model 2, and 0.753 (95% CI 0.773-0.774) for PCT alone (p < 0.0001 [DeLong's test]) (Table 4, Figure 4). The predictive accuracy of model 2 (simple bacteremia risk score) was 0.86 (95% CI 0.84-0.87), with a sensitivity of 0.93, specificity of 0.310, positive predictive value of 0.905, and negative predictive value of 0.389. The calibration plot of the constructed model is presented in Figure 3, showing that predicted probabilities were close to the observed frequencies of bacteremia.
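For readers less familiar with the reported measures, the sketch below shows how accuracy, sensitivity, specificity, and predictive values follow from a 2 × 2 table. The counts are hypothetical and are not taken from the study; the function name `diagnostic_metrics` is likewise an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2 x 2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the condition
        "specificity": tn / (tn + fp),  # true negatives among all without the condition
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only
metrics = diagnostic_metrics(tp=90, fp=10, fn=20, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```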

Subgroup Analysis with Missing Data Imputation
To address missing data, a traditional approach was used: missing values were imputed using the median of the observed values. After imputation of missing data, 43,294 patients were included in the second analysis. The population was randomly divided into a derivation cohort (n = 30,305 [70% randomly selected sample]) and a validation cohort (n = 12,989 [30% randomly selected sample]). The AUCs of model 1, model 2, and PCT in the validation cohort were 0.797 (95% CI 0.783-0.811), 0.778 (95% CI 0.764-0.793), and 0.706 (95% CI 0.690-0.772), respectively (p < 0.0001 [DeLong's test]) (Supplementary Figure S1). In addition, 13,832 of the 43,294 patients had a normal PCT (<0.5 ng/mL). The proportion of patients with bacteremia was also higher at higher score levels, even in the subgroup with normal PCT (Supplementary Figure S2). The AUC of the simple bacteremia score was 0.686 (95% CI 0.664-0.709) in the derivation cohort and 0.671 (95% CI 0.624-0.718) in the normal PCT group.
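Median imputation as described above can be sketched as follows. The function name `impute_median`, the use of `None` to mark missing values, and the example measurements are assumptions for illustration, not the study's actual code or data.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical lactate measurements with two missing entries
lactate = [1.0, None, 3.0, 2.0, None, 4.0]
print(impute_median(lactate))  # [1.0, 2.5, 3.0, 2.0, 2.5, 4.0]
```

Every missing entry receives the same value, so the imputed variable has less variance than the true one; this is one reason single imputation can yield biased estimates and incorrect standard errors.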

Discussion
In this derivation and validation analysis of 22,519 patients with suspected infection, we compared the predictive performance of model 1, comprising 16 variables associated with bacteremia; model 2, comprising five variables; and PCT alone. This study derived a simple bacteremia prediction score (i.e., model 2) to simplify the prediction of bacteremia. The simple bacteremia score (model 2) demonstrated performance similar to that of model 1 (AUC of 0.791 vs. 0.805), whereas PCT, the best individual variable, yielded a lower AUC of 0.753. The risk for bacteremia increased with the score.
Strengths of the present study include its large population size with validation, risk stratification to guide blood cultures, applicability to a wide range of populations, including low-risk patients and the heterogeneous ED population, and simplicity of the score. We identified significant predictors of bacteremia in a large derivation cohort and validated the performance of the model. An increased risk for bacteremia has been an important issue among patients with sepsis; in addition, false-positive blood cultures are associated with prolonged hospital stays and increased costs, and there are no definitive guidelines for blood cultures [5,7]. Several studies have explored bacteremia prediction tools; however, these have been limited by their restriction to specific diseases and by their complexity [18][19][20][21][22]. Therefore, risk stratification of bacteremia using this simple tool may help identify patients who require a blood culture. Conversely, a score of 0 can support a decision not to perform a blood culture because the probability of bacteremia was <2% at this score. Miquel et al. established a bacteremia rate < 8% for patients with pneumonia with a score ≤1 using six variables [23]. Potentially, the application of our simple bacteremia score (model 2) may better eliminate unnecessary blood cultures and the misuse/abuse of antibiotics.
Moreover, the simplicity of the bacteremia prediction score makes it convenient and useful for clinicians. In a recent study, David et al. used a modified Shapiro score (MSS) ≥ 3 and NLR > 12, which demonstrated similar abilities to predict bacteremia, with AUCs of 0.71 and 0.74, respectively; however, combining MSS and NLR did not increase the predictive performance [24]. Chun-Yuan et al. established an AUC of 0.867 (95% CI 0.806-0.928) using a combination of four factors (age ≥ 65 years, involvement area, liver cirrhosis, systemic inflammatory response syndrome); however, this score was limited to patients with cellulitis [22]. Lars et al. reported an AUC of 0.86 (95% CI 0.83-0.89) using a combination of four biomarkers (NLR, CRP, Lac, and PCT) [25]. However, it was likely easier to distinguish bacteremia in that population, which had a relatively high risk for bacteremia because verified bacterial infection was present in 55.6% of enrolled patients. In contrast, the present study yielded an AUC of approximately 0.80 using simple variables, even in a heterogeneous population (12% bacteremia). Therefore, our simple bacteremia score would be applicable to a wide range of populations, including low-risk patients and the heterogeneous ED population, with the advantage of simplicity.
Regarding the risk factors for bacterial infection, the Shapiro score, which was originally developed to rule out patients with a low risk for positive blood cultures, is commonly used [14,26]. Our variables are consistent with the Shapiro score and the previous literature. Among the variables analyzed in this study, PCT was the most influential independent predictor. The AUC for PCT was 0.781 in a study by Afshan et al. [27] and 0.70 in a study by Sibtain et al. [28], outperforming CRP in both studies. Abderrahim et al. reported that a PCT threshold ranging from ≤0.4 to ≥0.75 ng/mL demonstrated high diagnostic accuracy for bacteremia in a cross-sectional study [29]. Marik et al. also suggested PCT < 0.5 ng/mL as an effective screening tool to exclude bacteremia, and NLR as a screening test for bacteremia when PCT is unavailable [30]. The NLR has been described as a predictor of bacteremia [14]. Ratzinger et al. found that neutrophils were the best individual variable to predict bacteremia, with an AUC of 0.694 [31]. Thrombocytopenia is also known to be associated with bacteremia and to be a prognostic marker for it [32][33][34][35]. Lac, a prognostic biomarker for sepsis, is not considered to be specific for diagnosing sepsis [36]; however, several studies have shown that Lac is a biomarker for diagnosing bacterial sepsis [25].
Previous studies have proposed several models to predict bacteremia, using not only simple predictors but also >10 variables. In a cross-sectional study, models with 20 and 10 variables were established with AUCs of 0.767 and 0.759, respectively [31]. Paul et al. reported that a computerized decision support system (TREAT) yielded an AUC of 0.68 (95% CI 0.63-0.73) in the first cohort and 0.70 (95% CI 0.67-0.73) in the second cohort in predicting bacteremia [37]. Another study by Ratzinger et al. proposed 29 parameters to predict bacteremia, with an AUC of 0.729 (95% CI 0.679-0.779); PCT alone exhibited a similar AUC, indicating that machine learning methods failed to improve on the moderate diagnostic accuracy of PCT [38].
Our study had several limitations. First, it had a single-center cohort design; as a result, the proposed predictive model requires external validation to confirm model fit. Nevertheless, the scoring algorithm is easy to use. Second, although we attempted to identify risk factors for bacteremia, other possible confounding factors should be considered, and significant predictors that have clinical validity should be identified. Third, in the subgroup analysis with imputation of missing data, most single-imputation methods, including the median imputation used here, provide biased estimates and incorrect standard errors. Fourth, antibiotic use before the ED visit was not investigated. This could have affected the detection of bacteremia, although it could also have made the models more practical. Fifth, this study lacks data on underlying disease states, such as diabetes mellitus or immunosuppression, that may affect rates of bacteremia.

Conclusions
In this study, we developed and validated a simple bacteremia prediction score that, using only five variables, demonstrated performance similar to that of the model with sixteen variables using all laboratory results and vital signs. This simple score is useful in predicting bacteremia and assisting in clinical decisions.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (IRB 2023-09-144) of Samsung Medical Center.
Informed Consent Statement: Patient consent was waived due to the retrospective and anonymous nature of the study.

Figure 2. Distribution of bacteremia according to the simple bacteremia score levels in the derivation and validation cohorts. (a) Derivation cohort. (b) Validation cohort.

Figure 3. Calibration of the simple bacteremia score in the derivation and validation cohorts. Calibration plot indicating the agreement between model predictions (predicted probabilities) and observed frequencies. Individual data points are shown for the derivation and validation cohorts. (a) Derivation cohort. (b) Validation cohort.

Table 4. Area under the receiver operating characteristic curve (AUC) of model 1, simple score, and procalcitonin to predict bacteremia in the derivation and validation data. Model 1: bacteremia prediction model based on 16 predictors associated with bacteremia. Model 2: simple bacteremia score using the top five predictors associated with bacteremia among sixteen variables.
J. Pers. Med. 2024, 14, x FOR PEER REVIEW

Figure 4. Receiver operating characteristic (ROC) curves of model 1, simple score, and procalcitonin to predict bacteremia in the validation data.

Table 1. The baseline characteristics of the study population.

Table 2. Multivariable logistic regression of top five predictors of bacteremia in all variables, derivation cohort, and validation cohort.

Table 3. Clinical prediction scale (simple bacteremia risk score of the final model).