A Machine Learning Predictive Model of Bloodstream Infection in Hospitalized Patients

The aim of this study was to build a machine learning-based predictive model to discriminate between hospitalized patients at low and high risk of bloodstream infection (BSI). A Data Mart including all patients hospitalized between January 2016 and December 2019 with suspected BSI was built. Multivariate logistic regression was applied to develop a clinically interpretable machine learning predictive model. The model was trained on 2016–2018 data and tested on 2019 data. Univariate logistic regression was first used to select candidate predictors of BSI. Multivariate logistic regression with stepwise feature selection in five-fold cross-validation was then applied to estimate the risk of BSI. A total of 5660 hospitalizations (4026 in the training subset and 1634 in the validation subset) were included. Eleven predictors of BSI were identified. The performance of the model in terms of AUROC was 0.74. Based on the interquartile predicted risk score, 508 (31.1%) patients were classified as being at low risk, 776 (47.5%) at medium risk, and 350 (21.4%) at high risk of BSI; of these, 14.2% (72/508), 30.8% (239/776), and 64% (224/350) had a BSI, respectively. The performance of the predictive model of BSI is promising. Computational infrastructure and machine learning models can help clinicians identify patients at low risk of BSI, ultimately supporting an antibiotic stewardship approach.


Introduction
Hospital-acquired (HA) bloodstream infection (BSI) is a frequent and challenging clinical condition worldwide, with a documented, considerable impact on hospitalization length and healthcare costs. In 2019, a systematic review of the literature estimated 4.95 million deaths associated with bacterial antimicrobial resistance (AMR) and 1.27 million deaths attributable to bacterial AMR, BSI being the second most frequent infectious syndrome [1].
Without treatment, evolution from BSI to sepsis, a complex life-threatening syndrome caused by a dysregulated host response to infection, is highly probable [2,3]. As the pooled mortality of sepsis can reach 40% in critically ill patients [4,5], early recognition of sepsis and initiation of antibiotic and supportive therapies are recommended to increase life expectancy [6]. On the other hand, the clinical recognition of BSI or sepsis can be challenging. Therefore, clinicians, guided by the fear of an unfavorable evolution of the patient, are prone to prescribe empirical antibiotic overtreatment, with possible consequences such as adverse drug effects, the emergence of multidrug-resistant infections or Clostridioides difficile infection, and increased healthcare costs [7], rather than opting for a watchful waiting approach [8] that spares unnecessary antibiotics when BSI is not confirmed. Given that antibiotic therapy is lifesaving and, at the same time, the major driver of the selection of antibiotic-resistant microorganisms, it is of great importance to distinguish the clinical situations in which antibiotic therapy should be initiated as soon as possible from those in which it could be delayed or even avoided.
Machine learning (ML)-based approaches are considered more accurate than traditional clinical scores [9,10], thanks to the larger amount of data split into two different datasets (training and validation) and the flexible nature of the algorithms. While numerous ML-based predictive models for early sepsis identification have been developed, showcasing impressive accuracy, those specifically designed for patients with BSI are relatively scarce. Some studies have focused on predicting the mortality of patients with BSI. The BLOOMY score [11], an ML-based model, demonstrated good performance in predicting 14-day and 6-month mortality in patients with BSI caused by multidrug-resistant organisms. Other studies showed similarly good performance of ML-based models for the prediction of mortality in people with BSI [10,12-14]. Few studies have investigated the accuracy of models predicting the probability of having a BSI in people undergoing blood cultures (BCs). These models demonstrated acceptable performance in patients admitted to the triage of an Emergency Department [15,16], in patients admitted to the Intensive Care Unit [17], in febrile children [18], and in hemodialysis patients [19]. In a study of an ML-based model, the authors predicted BSI and estimated the performance in subgroups stratified by causative pathogen, where Acinetobacter baumannii, Escherichia coli, and Klebsiella pneumoniae showed high accuracy for bacteremia prediction [20].
Therefore, the objective of the study was to build a predictive model for HA-BSI in patients admitted to ordinary wards. To achieve this aim, (i) several parameters were automatically extracted from electronic health records (EHRs) to build a multidimensional BSI Data Mart; (ii) a machine learning-based pipeline was implemented, and several predictors of HA-BSI were identified; and (iii) a probabilistic risk level for HA-BSI (categorized as low or high risk) was assigned to each hospitalization. This predictive model serves to support an antibiotic stewardship approach in managing and optimizing the use of antibiotics.

BSI Data Mart Building Procedures
The implementation of the ML-based pipeline was performed by the Generator Center at the Fondazione Policlinico Universitario A. Gemelli IRCCS (FPG), Rome, Italy. The Generator Real World Data (RWD) lab is responsible for extracting, standardizing, and integrating the huge amount of both structured and unstructured healthcare data, which are heterogeneously stored in the hospital's Data Warehouse (DWH) and in the archives of individual departments. Both ontology-based systems and information technology (IT) procedures were implemented to build an integrated, pseudo-anonymized database, ensuring data ownership and patient privacy. Then, predictive models were developed and implemented with the aim of supporting clinical diagnosis [21]. The first step of the pipeline construction was the development of an integrated and multidimensional database (Data Mart). The Data Mart was based on an ontology defined by an interdisciplinary group of clinicians, who identified an extensive list of variables usually considered in clinical practice and in the treatment of possible BSI. Starting from these selected variables, the corresponding data sources in the EHR archives were identified. Specific extraction, transformation, and loading (ETL) procedures were developed for structured and free-text reports to integrate all the relevant data sources and save the variables under study in a single multidimensional database. In the case of structured variables, such as laboratory analyses and unique identification codes, material and units of measurement were defined and extracted for each variable. For the variables generally reported in free-text clinical reports, such as clinical/nursing diaries, a dedicated extraction process was developed. Natural Language Processing (NLP) was implemented based on text mining procedures, including sentence/word tokenization, a rule-based approach supported by annotations defined by clinical experts, and the use of semantic/syntactic corrections. The Data Mart was developed using the SAS Institute software analysis tool, the SAS® Viya® environment V.03.05. The open-source R environment, version 4.0.4, was used for rapid prototyping and modeling.
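The rule-based NLP step can be illustrated with a toy extractor over a free-text diary entry. The patterns, sign names, and thresholds below are purely hypothetical and far simpler than the hospital's actual annotation-supported pipeline; they only sketch how regular-expression rules can turn narrative text into structured flags.

```python
import re

# Hypothetical rule set: each clinical sign maps to a regex. These patterns and
# cut-offs are illustrative assumptions, not the pipeline's actual rules.
RULES = {
    "fever": re.compile(r"\b(fever|temperature\s+(\d{2}(?:\.\d)?))", re.IGNORECASE),
    "hypotension": re.compile(r"\b(hypotension|bp\s+(\d{2,3})/(\d{2,3}))", re.IGNORECASE),
}

def extract_signs(note: str) -> dict:
    """Return a boolean flag per clinical sign found in the note."""
    flags = {}
    for sign, pattern in RULES.items():
        m = pattern.search(note)
        if sign == "fever" and m and m.group(2):
            flags[sign] = float(m.group(2)) >= 38.0  # illustrative febrile threshold
        elif sign == "hypotension" and m and m.group(2):
            flags[sign] = int(m.group(2)) < 90       # illustrative systolic cut-off
        else:
            flags[sign] = m is not None
    return flags

note = "Patient febrile overnight, temperature 38.5; BP 85/60, started fluids."
print(extract_signs(note))  # {'fever': True, 'hypotension': True}
```

A production pipeline would add tokenization, negation handling, and clinician-defined annotations on top of such rules.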

Ontology and Study Design
The Data Mart consists of all patients for whom BCs were performed during the hospitalization. All BCs performed from peripheral access (PA) and/or central venous catheter (CVC) during the same calendar day were considered representative of a single event. Only the first event per single patient, i.e., collected within the first 24 h of the onset of signs and symptoms, was included, whereas those following the initial event were a priori excluded. Second and further episodes of BSI during the same hospitalization were excluded because the risk of further BSI could have been influenced by the first episode. The presence of BSI was defined as the growth of a clinically relevant microorganism in at least one BC or as the growth of a potential contaminant (i.e., coagulase-negative staphylococci, Bacillus spp., viridans group streptococci, Corynebacterium spp., Propionibacterium spp., and Micrococcus spp.) in multiple BCs. For each hospitalization, demographics, comorbidities, vital signs, devices, and laboratory data were queried from the Data Mart. The information gathered on clinical signs, devices, and laboratory values was the closest to the date of the BC request. Specifically, only values recorded within two days before or after the BC date were considered. Preference was given to data on the same day as the request or on previous days.
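The event definition above (all BCs for a patient on one calendar day form a single event; only the first event per patient is kept) can be sketched as follows. The record fields are assumptions for illustration only.

```python
from datetime import date

# Hypothetical BC draw records: several draws can share a patient and calendar day.
draws = [
    {"patient": "P1", "day": date(2017, 3, 4), "source": "PA"},
    {"patient": "P1", "day": date(2017, 3, 4), "source": "CVC"},  # same-day draw: same event
    {"patient": "P1", "day": date(2017, 3, 9), "source": "PA"},   # later event: excluded
    {"patient": "P2", "day": date(2017, 5, 1), "source": "PA"},
]

# Collapse draws into unique (patient, calendar day) events.
events = sorted({(d["patient"], d["day"]) for d in draws})

# Keep only the first event per patient; later events are a priori excluded.
first_event = {}
for patient, day in events:
    if patient not in first_event:
        first_event[patient] = day

print(first_event)  # {'P1': datetime.date(2017, 3, 4), 'P2': datetime.date(2017, 5, 1)}
```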

Cohort Selection
To identify clinical predictors of BSI, hospitalizations were selected from the entire Data Mart with the aim of analyzing as homogeneous a group as possible. Cohort selection criteria were defined by clinicians and are summarized in Figure 1. The cohort included all patients (age ≥ 18 years) hospitalized at FPG for whom BCs were performed within the period from 1 January 2016 to 31 December 2019 and presenting clinical diary information within two days from the date of BC collection. Each patient was included at the time of the first negative or positive BC during the study period. All BCs without procalcitonin information within two days of the BC collection date were excluded from the analysis. Moreover, single BCs with contaminants were excluded. To have a cohort as homogeneous as possible, only hospital-acquired BSIs (HA-BSIs) were considered for the study. Therefore, all BCs performed within the first 48 h of hospital admission were excluded.
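The main inclusion filters (adult age, BC drawn more than 48 h after admission, procalcitonin available within two days of collection) can be sketched as a single predicate. Field names and the example record are assumptions for illustration.

```python
from datetime import datetime, timedelta

def eligible(rec) -> bool:
    """Hedged sketch of the cohort filters; `rec` keys are hypothetical."""
    age_ok = rec["age"] >= 18
    # Hospital-acquired only: BC drawn more than 48 h after admission.
    ha_ok = rec["bc_time"] - rec["admission_time"] > timedelta(hours=48)
    # Procalcitonin must be available within two days of BC collection.
    pct_ok = (rec["pct_time"] is not None
              and abs(rec["pct_time"] - rec["bc_time"]) <= timedelta(days=2))
    return age_ok and ha_ok and pct_ok

rec = {
    "age": 67,
    "admission_time": datetime(2018, 6, 1, 10, 0),
    "bc_time": datetime(2018, 6, 5, 9, 0),
    "pct_time": datetime(2018, 6, 4, 8, 0),
}
print(eligible(rec))  # True
```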

Statistical Analysis
The pipeline for the BSI prediction model is illustrated in Figure 2. All variables included in the study were first analyzed using descriptive statistical techniques. Qualitative variables were described as absolute and relative frequencies. Quantitative variables were described as medians and interquartile ranges (IQRs). Numerical variables with a percentage of missing data < 15% were imputed with the median imputation technique. To understand the effect of imputation, we analyzed the distribution of the percentages of missing values and the percentage change in the correlation coefficient with the BSI outcome between the imputed and non-imputed variables. Comorbidities were aggregated by means of an index considering 5 macro-groups (Index CM). The macro-groups were immunodepression (one or more among cirrhosis, connective tissue disease, HIV, autoimmune disease, and transplantation), neurological pathology (one or more among neurological disease, dementia, hemiplegia, stroke, and cerebrovascular disease), neoplasia (one or more among cancer, hematological neoplasia, leukemia, lymphoma, or treatment with chemotherapy or radiotherapy), renal insufficiency (kidney failure and/or dialysis), and diabetes. Membership in each macro-group carried the same weight; the index was the sum of macro-group memberships and ranged from 1 to 5.
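The comorbidity index can be computed as a sum over the five equally weighted macro-groups, each counting at most once regardless of how many member conditions are present. The condition names follow the text; the mapping structure itself is an illustrative assumption.

```python
# Five macro-groups of the comorbidity index (Index CM), as listed in the text.
MACRO_GROUPS = {
    "immunodepression": {"cirrhosis", "connective tissue disease", "HIV",
                         "autoimmune disease", "transplantation"},
    "neurological":     {"neurological disease", "dementia", "hemiplegia",
                         "stroke", "cerebrovascular disease"},
    "neoplasia":        {"cancer", "hematological neoplasia", "leukemia",
                         "lymphoma", "chemotherapy", "radiotherapy"},
    "renal":            {"kidney failure", "dialysis"},
    "diabetes":         {"diabetes"},
}

def index_cm(comorbidities: set) -> int:
    """Sum of macro-group memberships; each group contributes at most 1."""
    return sum(1 for members in MACRO_GROUPS.values() if comorbidities & members)

# A patient with cirrhosis, stroke, and dialysis belongs to three macro-groups.
print(index_cm({"cirrhosis", "stroke", "dialysis"}))  # 3
```

Note that two neurological conditions (e.g., dementia and hemiplegia) still yield an index contribution of 1, since weighting is per group, not per condition.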
A univariate statistical analysis was performed on the entire dataset using the chi-square test for categorical variables and the Mann-Whitney test for numerical variables. The numerical variables were made categorical using both cut-offs derived from clinical practice and cut-offs derived from the distribution of each variable with respect to the BSI outcome. In particular, the Kaplan-Meier estimator was used to identify the value that maximizes the statistical difference between positive and negative BSI. All cut-offs were then clinically validated. Several steps were followed to build the model. Initially, the entire dataset was divided into two groups for training and model validation, respectively. In the training group, a univariate analysis was performed to select features by evaluating the relationship of each variable with the outcome (univariate logistic regression), and the correlation between variables was then analyzed to remove redundancy. Multivariate logistic regression was used to construct a clinically interpretable multivariate predictive model. Logistic regression is considered a standard machine learning model in the clinical setting, as it has shown a good trade-off between predictive performance and clinical interpretability in various contexts [22,23]. The different steps are summarized in Figure 2 and detailed below.
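The three modeling steps (univariate screening, correlation pruning, multivariate logistic regression with five-fold cross-validated AUROC) can be sketched on synthetic data. This is a simplified stand-in: the univariate screen here keeps features whose single-variable logistic coefficient is positive (odds ratio > 1), omitting the paper's p-value criterion, and the data are random, not the study's.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the training set: x2 nearly duplicates x1, so the
# correlation-pruning step (Pearson |r| > 0.8) should drop it.
n = 600
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # Pearson r with x1 well above 0.8
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = (x1 + 0.5 * x3 + rng.normal(size=n) > 0).astype(int)

# Step 1 (simplified univariate screen): keep features with a positive
# single-variable logistic coefficient, i.e., odds ratio > 1.
kept = [j for j in range(X.shape[1])
        if LogisticRegression().fit(X[:, [j]], y).coef_[0, 0] > 0]

# Step 2: among highly correlated pairs, keep only the first feature retained.
pruned = []
for j in kept:
    if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= 0.8 for k in pruned):
        pruned.append(j)

# Step 3: multivariate logistic regression, evaluated by five-fold CV AUROC.
auc = cross_val_score(LogisticRegression(), X[:, pruned], y,
                      cv=5, scoring="roc_auc").mean()
print(pruned, round(auc, 2))
```

In the paper, the pruning step instead keeps the member of each correlated pair with the higher correlation with the outcome; the sketch keeps whichever was retained first, which suffices to show the mechanics.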


Results
The eligible cohort included a total of 5660 hospitalizations with at least one BC request during the timeframe from 1 January 2016 to 31 December 2019, performed more than 48 h after hospital admission. In the entire cohort, the number of hospitalizations with HA-BSI was 1904 (33.6%). The numerical variables that were imputed were as follows: white blood cells, platelets, creatinine, blood urea, neutrophils, C-reactive protein (missing values < 8%), and total bilirubin (missing values < 15%). The percentage of missing values in the positive and negative populations was comparable (difference < 2.25% for all variables), and the percentage difference in the correlation coefficient with the BSI outcome between the imputed and non-imputed variables was < 10% for all variables analyzed. Clinical and demographic information of the study population is shown in Table 1. All numerical variables were categorized according to the cut-offs described in Table S1. The dataset was divided into two groups for training and validation: 4026 (71%) hospitalizations with BC requests between 1 January 2016 and 31 December 2018 were used to train the model, while 1634 (29%) hospitalizations with BC requests between 1 January 2019 and 31 December 2019 were used to test the model. The occurrence of HA-BSI was 34.0% in the training set and 32.7% in the test set. The characteristics of the population belonging to the two groups are shown in Table 1. The subset of univariate significant variables and the final predictors of the multivariate analysis are shown in Table 2.
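The imputation rule (fill with the column median only when fewer than 15% of values are missing) can be sketched with pandas. Column names and values are illustrative, not the study's data.

```python
import numpy as np
import pandas as pd

# Illustrative frame: one column under the 15% missing threshold, one far above it.
df = pd.DataFrame({
    "creatinine": [0.9, 1.2, np.nan, 1.1, 0.8, 1.0, 1.3, 0.9],              # 12.5% missing
    "bilirubin":  [0.7, np.nan, np.nan, 1.5, np.nan, 0.9, np.nan, np.nan],  # 62.5% missing
})

for col in df.columns:
    if df[col].isna().mean() < 0.15:        # impute only if < 15% missing
        df[col] = df[col].fillna(df[col].median())

# creatinine is imputed (median of observed values = 1.0); bilirubin is left as is.
print(df["creatinine"].isna().sum(), df["bilirubin"].isna().sum())  # 0 5
```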
Age > 80 years, fever, hypotension, altered mental status, central venous catheter, blood urea nitrogen > 13 mg/dL, procalcitonin > 1 ng/mL, total bilirubin > 2 mg/dL, time from admission to BSI > 12 days, 2 or more index comorbidities, and platelets < 50,000/mm³ were associated with an increased risk of having a BSI. Five-fold cross-validation resulted in an AUROC of 0.74 on the training set. The respective statistics of the classification matrix are as follows: accuracy 0.66, sensitivity 0.74, specificity 0.62, NPV 0.82, and PPV 0.50. The model was then tested on the validation set. The performance in terms of AUROC was 0.74, and the confusion matrix was as follows: accuracy 0.69, sensitivity 0.69, specificity 0.69, NPV 0.82, and PPV 0.52. No statistically significant difference (p > 0.05) was observed between actual and predicted BSI, as shown in the corresponding calibration plot in Figure S1. A lift and gain analysis of the validation set is shown in Figure S2. The lift plot on the testing data showed that, for the first decile of predictions, the model performs more than two times better than random guessing based on prevalence alone.
Three risk groups were selected based on the interquartile predicted risk score on the training set to better categorize patient risk and minimize antibiotic treatment of those without a true BSI. Of the entire validation set, 508 (31.1%) patients were classified at low risk, 776 (47.5%) at medium risk, and 350 (21.4%) at high risk of BSI (Figure 3).
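The interquartile stratification can be sketched as follows: the 25th and 75th percentiles of the predicted risk scores on the training set define the low/medium/high boundaries. The scores below are simulated stand-ins, not the model's actual predictions.

```python
import numpy as np

# Simulated predicted risk scores for the 4026 training hospitalizations
# (a Beta distribution is an arbitrary stand-in for the model's outputs).
rng = np.random.default_rng(1)
train_scores = rng.beta(2, 4, size=4026)

# Interquartile thresholds on the training set.
q1, q3 = np.quantile(train_scores, [0.25, 0.75])

def risk_class(score: float) -> str:
    """Below Q1 -> low; within the IQR -> medium; above Q3 -> high."""
    if score < q1:
        return "low"
    if score <= q3:
        return "medium"
    return "high"

classes = [risk_class(s) for s in train_scores]
print({c: classes.count(c) for c in ("low", "medium", "high")})
```

By construction, roughly a quarter of the training set falls in each tail class and half in the middle; the validation-set proportions reported above (31.1%/47.5%/21.4%) differ because the 2019 score distribution shifts relative to the training thresholds.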

The percentages of BSI associated with each risk class were 14.2% (72/508 patients) for low risk, 30.8% (239/776 patients) for medium risk, and 64% (224/350 patients) for high risk. Among low-risk patients in the validation set, 436 patients (85.8%) classified as negative were true negatives, while 72 patients (14.2%) had a BSI and were classified as false negatives. In the medium-risk class, 324 patients were true negatives (41.7%), 93 patients were false negatives (12%), 146 patients were true positives (18.8%), and 213 were false positives (27.5%). Among high-risk patients, all of whom were predicted as positive, 224 patients (64%) were true positives, while 126 patients (36%) had no BSI and were classified as false positives (Figure 4).
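As a consistency check, the validation-set confusion matrix can be reconstructed from these per-class counts, and the summary statistics reported in the Results recomputed from it.

```python
# True/false positives and negatives, summed across the three risk classes
# (counts taken from the per-class breakdown reported in the Results).
tp = 146 + 224   # medium- and high-risk true positives
fp = 213 + 126   # medium- and high-risk false positives
tn = 436 + 324   # low- and medium-risk true negatives
fn = 72 + 93     # low- and medium-risk false negatives

accuracy    = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
npv         = tn / (tn + fn)
ppv         = tp / (tp + fp)

print([round(v, 2) for v in (accuracy, sensitivity, specificity, npv, ppv)])
# [0.69, 0.69, 0.69, 0.82, 0.52]
```

The totals sum to the 1634 validation hospitalizations, and the recomputed statistics match the reported accuracy 0.69, sensitivity 0.69, specificity 0.69, NPV 0.82, and PPV 0.52.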

Discussion
In the present study, a machine learning-based model was built to predict the probability of having an HA-BSI at the time a BC was requested. In 5660 patients hospitalized from 1 January 2016 to 31 December 2019 for whom at least one BC was drawn, the model showed that age > 80 years, fever, hypotension, altered mental status, the presence of a CVC, BUN > 13 mg/dL, procalcitonin > 1 ng/mL, total bilirubin > 2 mg/dL, time from admission to BSI > 12 days, 2 or more index comorbidities, and platelets < 50,000/mm³ were associated with an increased risk of having an HA-BSI. Three risk level groups were identified: low risk, with a BSI prevalence of 14.2%; medium risk, with a BSI prevalence of 30.8%; and high risk, with a BSI prevalence of 64%. The AUROC of the model in the validation set was 0.74, suggesting moderate discriminatory ability. The AUROC of our study is in line with those reported in other studies, ranging from 0.54 [26] to 0.82 [10]. However, this value, together with the NPV of 0.82, describes a good prediction of true negatives, which is of considerable importance from an antibiotic stewardship point of view. Indeed, the good reliability of true negatives may allow antibiotic therapy not to be administered or, at most, to be delayed in at least the 31.1% of patients considered at low risk. Interestingly, the variables included in our model coincide with two of the three major criteria and four of the eight minor criteria of the non-machine learning-based decision rule proposed by Shapiro et al. in 2008 [27], which remains one of the highest-performing predictive models.
As previously noted, physicians tend to overestimate the probability of BSI for many patients [28]. From a practical point of view, assuming all patients for whom a BC was requested would have been treated, applying the present model would have spared antibiotic therapy in 86% of low-risk patients and delayed it in 14% of them. In the medium-risk group, 41.7% of antibiotic therapies would have been spared, but treatment would have been delayed in 12% of patients. In the high-risk group, no antibiotic therapies would have been delayed.
While machine learning-based models built for the early prediction of sepsis are numerous [29,30], especially in Intensive Care Unit (ICU) populations, machine learning-based prediction models of BSI among people for whom a BC was requested are rare [20]. In a recently published paper, Mahmoud et al. [26] presented a machine learning-based model with very high specificity but low sensitivity. Unfortunately, almost 90% of the study patients started antibiotic therapy at least 24 h before BC, probably influencing the result. Surprisingly, in the model by Mahmoud et al. [26], the procalcitonin level was not correlated with a positive BC. In a recent paper, Ratzinger et al. [31] screened 3370 patients with Systemic Inflammatory Response Syndrome (SIRS) in a prospective cohort study and built a random forest model with good performance (AUC 0.738). However, the model did not perform better than procalcitonin alone (AUC 0.729). Other machine learning-based models with high accuracy have been built [32], although only in ICU patients.
When models such as the one presented in this paper are implemented into Electronic Health Records (EHRs), real-time processing of the data provides an immediate and seamless calculation of the likelihood of having a BSI. The instant translation of a mathematical model into an explainable and implementable score for clinical decisions enhances its usability in daily practice. This can be especially useful in settings where the healthcare system is overloaded or decisions need to be made very quickly, such as in the Emergency Department. Machine learning-based models allow us to analyze a large amount of data directly extracted from EHRs, overcoming the limitations of many published BSI probability models [33].
A predictive model may assist the clinician in investigating a suspected BSI or might be useful in identifying patients for more expensive diagnostic techniques. However, the present model was not designed as a warning system (detecting the onset of sepsis as early as possible) but as a support for clinicians to decrease unnecessary exposure to antibiotics when the probability of having a BSI is very low. The main driver of antimicrobial resistance (AMR) is the inappropriate use of antimicrobials [7]. From an antibiotic stewardship perspective, a predictive model of BSI, such as that of the present study, has the potential to support a wise watchful waiting approach [8]. The present model may also contribute to better selecting patients with a high pre-test probability of BSI for whom a BC might be requested [34].
This study has some limitations. First, the model was built on retrospective data from a single clinical center. Even though the amount of analyzed data is relevant, the results of the study cannot be generalized to other clinical centers. Moreover, the study is observational, and the impact of its use on daily clinician decision-making (external validation) was not assessed. Similarly, the cut-offs of the numerical variables that were categorized using thresholds derived directly from the data need external validation. Further studies should evaluate whether the routine implementation of this model in daily practice may result in significant savings of unnecessary antibiotics and a reduction in costs. Finally, due to the variable reliability of data capture within our EHR, we did not include the source of infection, a component correlated with the likelihood of a positive BC.

Conclusions
Our predictive model gives an example of how EHR-based clinical decision support (CDS) systems are promising tools in an antibiotic stewardship approach to reduce unnecessary antibiotic treatments. The study highlights how computational infrastructure and machine learning models, updated in real time, can continuously inform clinicians of the best clinical decisions, supplementing but never replacing clinical judgment. If the low number of patients with false negative results is confirmed by future studies, clinicians may be supported, in situations of uncertainty or in low-risk patients, in not administering antibiotic therapy or, at most, delaying it. Simple prognostic scores are probably dated, and more advanced multimorbidity models should be considered. We strongly support the 3PM approach: predictive, preventive, and personalized medicine. The availability of tools to prescribe antibiotics more precisely could be a step towards this purpose. Finally, data and models may be shared among centers to refine analyses and improve the fight against antimicrobial resistance using methodology that preserves data ownership and patient privacy [35].

Figure 1.
Figure 1. Cohort selection.

Figure 2.
Figure 2. Pipeline for the predictive model of hospital-acquired bloodstream infection (BSI). The initial dataset was divided into two groups for model training and validation based on the BC request date. The model was trained on the first three years of data (1 January 2016-31 December 2018) and tested on the following year (1 January 2019-31 December 2019). A univariate logistic regression was first used on the training set to select candidate predictors of BSI. The p-value and Odds Ratio (OR) were estimated, and only the features with p-value ≤ 0.05 and Odds Ratio > 1 were included. A linear cross-correlation analysis (Pearson's correlation method) of the significant variables was performed to exclude linearly related variables. Among the pairs of linearly correlated variables (Pearson's correlation coefficient > 0.8), we removed the variable with the lower correlation with the outcome.


Figure 3.
Figure 3. Distribution of the training set prediction index and definition of risk classes.


Figure 4.
Figure 4. Risk groups with prediction information for positive and negative bloodstream infection (BSI) on the validation set. Positive BSIs are represented by the blue outline, of which the predicted positive ones are represented by the purple fill and the predicted negative ones by the light blue fill. Negative BSIs are represented by the red outline, of which the predicted negative ones are represented by the green fill and the predicted positive ones by the red fill.


Table 1.
Characteristics of patients included in the training and validation subsets.

Table 2.
Predictors of bloodstream infection (BSI) positivity. Features selected with univariate logistic regression (p-value ≤ 0.05 and Odds Ratio > 1) and multivariate logistic regression with stepwise feature selection in five-fold cross-validation.