Using Machine Learning to Predict Bacteremia in Febrile Children Presented to the Emergency Department

Blood culture is frequently used to detect bacteremia in febrile children. However, a high rate of negative or false-positive blood culture results is common at the pediatric emergency department (PED). The aim of this study was to use machine learning to build a model that could predict bacteremia in febrile children. We conducted a retrospective case-control study of febrile children who presented to the PED from 2008 to 2015. We adopted machine learning methods and cost-sensitive learning to establish a predictive model of bacteremia. We enrolled 16,967 febrile children with blood culture tests during the eight-year study period. Only 146 febrile children had true bacteremia, and more than 99% of febrile children had a contaminant or negative blood culture result. The maximum areas under the curve of logistic regression and support vector machines for predicting bacteremia were 0.768 and 0.832, respectively. Using the predictive model, we can categorize febrile children by risk value into five classes. Class 5 had the highest probability of having bacteremia, while class 1 had no risk. Obtaining blood cultures in febrile children at the PED rarely identifies a causative pathogen. Prediction models can help physicians determine whether patients have bacteremia and may reduce unnecessary expenses.


Introduction
Fever is one of the most frequent reasons for visits to the Pediatric Emergency Department (PED) [1,2], and estimates say that up to 10% to 25% of cases with febrile illness had bacterial infection [3,4]. Determining the appropriate method for evaluating febrile children remains a challenge, especially due to fears regarding occult bacteremia in febrile children who appear well without an obvious infection focus [5]. Bacteremia is a severe bacterial infection that can be detected by a blood culture, one of the most frequently ordered microbiological tests in the PED [6]. Blood cultures remain the gold standard test for detecting patients with bacteremia. Isolating the organism from the blood can confirm a diagnosis, which helps physicians identify the cause of the infection and then administer the appropriate antimicrobial agents. Upon receiving a blood culture result, physicians must decide whether the organism represents a clinically significant infection [6][7][8]. However, seeing a high rate of negative, false-positive, or contaminated blood cultures is quite common in children visiting the PED [9]. All of the above conditions result in the unnecessary use of healthcare resources and costs, including additional invasive testing, the inappropriate use of antibiotics, prolonged hospital admission, and parental anxiety [10][11][12][13].
Over the past few decades, several algorithms have been developed to identify children at a higher risk of severe bacterial illness [3,4,14]. In addition to clinical findings, some studies have suggested that laboratory findings, such as white blood cell count (WBC), absolute neutrophil count (ANC), C-reactive protein (CRP), and procalcitonin (PCT), may be useful in helping physicians recognize children with severe bacterial infection [4,14][15][16][17][18]. Although these parameters can help pediatric clinicians identify febrile children at high risk of severe bacterial infection, are they capable of predicting bacteremia in febrile children? Many studies also debate the appropriate role of blood cultures in the PED. According to published guidelines and studies, obtaining blood cultures is only recommended for children with extensive infections, for immunocompromised patients, or for moderately or severely ill children [19,20].
In this study, we tried to build machine learning models to predict bacteremia in children with fever who visit the PED. Our findings can be clinically important as they may help physicians in the PED either order the appropriate blood culture or manage treatment depending on whether bacteremia is predicted.

Study Population
Patients younger than 18 years of age who presented to the PED of the Kaohsiung Chang Gung Memorial Hospital in Taiwan with fever during the period of January 2008 through December 2015 were evaluated. Febrile children from whom a blood test and blood culture were obtained formed our retrospective cohort. In this cohort, each febrile child with true bacteremia was randomly matched with 10 febrile children without bacteremia according to gender and age in order to form a case control study. All the blood cultures were collected by nurses in accordance with our institution's standard procedures. Obtaining two sets of blood cultures was quite difficult in pediatric patients, so one set of blood culture for children with fever was generally practiced in our PED. This study was approved by the Institutional Review Board of Chang Gung Medical Foundation.

Blood Culture Criteria
The following organisms isolated from a blood sample were considered true pathogens: Staphylococcus aureus, Streptococcus pneumoniae, Salmonella enterica, group A streptococci, Pseudomonas aeruginosa, Haemophilus influenzae, Escherichia coli (E. coli), and Candida species, among others [21][22][23][24]. Certain organisms isolated from blood samples have been found to represent contamination. These organisms include coagulase-negative staphylococci, Staphylococcus epidermidis, Corynebacterium spp., Gram-positive Bacillus, Micrococcus, etc. [21][22][23]. In the case of any doubt about the potential pathogenicity of an isolated species, the research coordinator reviewed the case to determine whether the corresponding blood culture represented an actual infection.

Statistical Analysis
Continuous variables were expressed as mean ± standard deviation, and categorical variables were reported as percentages. To compare clinical characteristics between children with and without bacteremia, Student's t test was used for continuous variables, and Fisher's exact test or the χ² test was used for categorical variables. Univariate and multivariate binary logistic regression analyses were used to identify the significant risk factors. A P-value less than 0.05 was considered statistically significant. IBM SPSS statistical software for Windows, version 22.0 (Chicago, IL, USA) was used for statistical analyses.

Experimental Methodology
Age, gender, and laboratory values obtained from medical records were evaluated as predictive variables by the models. For every laboratory variable, only values obtained simultaneously with the blood culture sample were used. In addition to gender and age, 17 laboratory values were included for our study population. The normal reference ranges of these values may differ by age or gender, so a case-control design was used to eliminate these confounding factors. Because the numbers of cases in the bacteremia group and the non-bacteremia group differed greatly, each case with bacteremia was randomly matched by age and gender with ten cases without bacteremia. To avoid sample selection bias, the matching procedure was repeated 100 times for the characteristic variable analysis. The features selected for machine learning were determined according to the results of these 100 characteristic variable analyses. The selected features were then used to establish the predictive model.
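The repeated 1:10 matching described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the function name `match_controls`, the record fields, the toy data, exact matching on age in whole years, and sampling without replacement within a single draw are all assumptions, since the text does not specify them.

```python
import random
from collections import defaultdict

def match_controls(cases, controls, ratio=10, seed=None):
    """Randomly pick `ratio` controls per case from the same (age, gender) stratum.

    Hypothetical helper: exact matching and no reuse of a control within one
    draw are assumptions, not details stated in the text.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for control in controls:
        pool[(control["age"], control["gender"])].append(control)
    for members in pool.values():
        rng.shuffle(members)  # random order within each stratum
    matched = []
    for case in cases:
        members = pool[(case["age"], case["gender"])]
        if len(members) < ratio:
            raise ValueError("not enough controls in this age/gender stratum")
        picked = [members.pop() for _ in range(ratio)]
        matched.append((case, picked))
    return matched

# Toy data, invented for illustration only.
cases = [
    {"id": "c1", "age": 2, "gender": "M"},
    {"id": "c2", "age": 5, "gender": "F"},
]
controls = [
    {"id": f"n{i}", "age": 2 if i % 2 == 0 else 5, "gender": "M" if i % 2 == 0 else "F"}
    for i in range(60)
]

# Repeat the matching 100 times with different seeds, as in the text.
draws = [match_controls(cases, controls, ratio=10, seed=s) for s in range(100)]
```

Each element of `draws` is one matched dataset on which the characteristic variable analysis could be rerun, so that the selected features do not depend on a single random choice of controls.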
Python, R, and two machine learning methods, logistic regression (LR) and support vector machines (SVM), were used to establish a predictive model for bacteremia. Cost-sensitive learning was applied to make a trade-off between false negative predictions and cost reduction, and thereby increase the usability of the predictive model. In the dataset, 97.5% of the cases were used for training, and 2.5% were used as testing data. A risk value for each case was calculated using the SVM or LR predictive model. Because the dataset is imbalanced, cost-sensitive learning was used for classification: the costs of prediction errors (and potentially other costs) are taken into account when training the machine learning model. We set bacteremia as positive and non-bacteremia as negative; the numbers of positive and negative cases were quite different. Considering the confusion matrix, the costs of a true positive and a true negative were set to zero; that is, correctly predicted cases incurred no cost. In this study, a false negative was more important than a false positive, because predicting bacteremia as non-bacteremia costs more. We therefore fixed the cost of a false positive at 1 and carried out experiments that varied the cost of a false negative (an important parameter that can significantly affect prediction performance) to find the optimal value, which was in the range of 7 to 13. The risk is defined as the cost of a negative prediction (the cost of predicting a non-bacteremia result) minus the cost of a positive prediction (the cost of predicting a bacteremia result), i.e., risk = negative cost − positive cost. If risk > 0, we predict bacteremia; otherwise, we predict non-bacteremia. However, this risk-based binary prediction (i.e., risk > 0 versus risk ≤ 0) did not perform well.
Instead of predicting by the sign of the risk value, we partitioned the range of risk values into several segments, for example by quartiles, and made a prediction for each segment, which clearly improved performance.
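The risk definition above (negative cost minus positive cost) can be illustrated with a small sketch. It assumes an expected-cost reading in which the model outputs a probability of bacteremia; the text does not give the exact formula, so `risk_value`, `predict_bacteremia`, and the default false-negative cost of 10 (chosen from the reported 7 to 13 range) are illustrative assumptions only, and the resulting numbers are not on the same scale as the risk values reported in the Results.

```python
def risk_value(p_bacteremia, cost_fn=10.0, cost_fp=1.0):
    """Risk = expected cost of predicting negative - expected cost of predicting positive.

    Assumption: true positives and true negatives cost zero (as in the text),
    and the expected-cost form below is one plausible reading of the definition.
    """
    cost_if_predict_negative = p_bacteremia * cost_fn          # we would miss a true case
    cost_if_predict_positive = (1.0 - p_bacteremia) * cost_fp  # we would raise a false alarm
    return cost_if_predict_negative - cost_if_predict_positive

def predict_bacteremia(p_bacteremia, cost_fn=10.0, cost_fp=1.0):
    """Binary rule from the text: predict bacteremia when the risk is positive."""
    return risk_value(p_bacteremia, cost_fn, cost_fp) > 0

# With cost_fn = 10 and cost_fp = 1, the rule flips at p = 1 / 11.
for p in (0.05, 0.2, 0.5):
    print(p, round(risk_value(p), 3), predict_bacteremia(p))
```

Under this reading, raising the false-negative cost lowers the probability at which a case is flagged, which is the trade-off between sensitivity and specificity that the cost sweep explores.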

Patient Characteristics
Among a total of 266,679 children that visited the PED during the 8-year period, 16,967 febrile children with blood culture tests were enrolled for further analysis. The mean age of the total study population was 5.16 ± 3.97 years old. Of these 16,967 children, 55.3% (n = 9388) were male. We observed three kinds of blood culture results: bacteremia indicating true infection, contamination indicating a false-positive result, and negative culture (no growth of either aerobic or anaerobic pathogens). Patients were categorized into two groups according to the blood culture results: the bacteremia group and the non-bacteremia group (including contamination and negative results). In this study cohort, 146 (0.86%) febrile children had true bacteremia, 405 (2.39%) children had a contaminant result, and 16,416 (96.75%) children had a negative result. Gender, age, and laboratory tests were obtained for further analysis and compared according to the blood culture results, as shown in Table 1. Age, percentage of neutrophils and bands, ANC, Hb, platelet count, and CRP were statistically different between bacteremia and non-bacteremia encounters. The three most common pathogens in the bacteremia group were Salmonella enterica (28/146, 19.18%), Escherichia coli (22/146, 15.07%), and Streptococcus pneumoniae (14/146, 9.59%). The three most common organisms isolated from blood culture and considered contamination were Staphylococcus epidermidis (137/405, 33.83%), coagulase-negative staphylococcus (129/405, 31.85%), and Micrococcus (45/405, 11.11%). The age distribution of the bacteremia group is shown in Figure 1. More than half (55.7%) of the febrile children with bacteremia were under the age of 3 years old.

Feature Selection and Risk Classification
After repeating the characteristic variable analysis 100 times, WBC, MCV, MCH, MCHC, Monocyte, Eosinophil, CRP, Band percentage, Segment + Band percentage, and ANC were each consistently positively or consistently negatively correlated with bacteremia (Table 2), and their odds ratios are shown in Table 3. These 10 features were selected to establish the predictive model. Among these 10 features, the significant risk factors were also entered into a multivariate binary logistic regression analysis, and their coefficients are shown in Table 4. We used the cost-sensitive approach in machine learning to handle the imbalanced dataset collected in this study. The curves of recall (sensitivity), true-negative rate (specificity), and AUC (area under the ROC curve) for the different costs we applied are shown in Figure 2 (each point represents the average result of 100 analyses). These three performance indexes were considered in our study, so that we could choose the cost value that best balances them.
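The three performance indexes named above can be computed as in the following sketch. The function names, the toy labels and scores, and the 0.35 cut-off are invented for illustration; the AUC is computed with the standard pairwise (rank-based) definition rather than any particular library routine.

```python
def sensitivity_specificity(y_true, y_pred):
    """Recall (sensitivity) and true-negative rate (specificity) from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = bacteremia, 0 = non-bacteremia (invented values).
y_true = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.3, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s > 0.35 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec, auc(y_true, scores))  # 0.5 0.75 0.875
```

Repeating this computation for each candidate false-negative cost, and averaging over the 100 matched datasets, would give curves of the kind summarized in Figure 2.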
The maximum values of the different performance indexes obtained with LR are shown in Table 5. With cost-sensitive learning, the maximum areas under the curve (AUC) of LR and SVM for predicting bacteremia were 0.768 and 0.832, respectively.

Table 2. The maximum value, minimum value, mean value, and standard deviation of logistic regression coefficients of characteristic variables.
Diagnostics 2020, 10, x FOR PEER REVIEW 5 of 13

Considering the bacteremia samples in the dataset, we calculated a risk value for each sample; the quartile ranges are shown in Figure 3. We did the same for the risk values of the non-bacteremia samples and show their quartile range. Using the minimum risk value of bacteremia (−0.640) and the maximum risk value of non-bacteremia (0.644) as boundaries, the range of risk values is divided into three categories. Looking further at the quartile ranges, the bacteremia data lie within [−0.014, 0.15], while the non-bacteremia data lie within [−0.061, 0.132]. Using the 1st quartile of bacteremia and the 3rd quartile of non-bacteremia, we can divide the range of risk values into five blocks (as shown in Figure 4). With these five classes, we can predict class 1 as being non-bacteremia, class 2 as low risk of bacteremia, class 3 as medium risk, class 4 as high risk, and class 5 as being bacteremia.
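The five-class partition can be sketched with the four cut points quoted above. Because Figure 4 is not reproduced here, the ordering of the cut points and the handling of values that fall exactly on a boundary are assumptions, and `risk_class` is a hypothetical helper name.

```python
# Cut points taken from the text; their use as class boundaries is an assumption.
MIN_BACTEREMIA = -0.640     # minimum risk value among bacteremia samples
Q1_BACTEREMIA = -0.014      # 1st quartile of bacteremia risk values
Q3_NON_BACTEREMIA = 0.132   # 3rd quartile of non-bacteremia risk values
MAX_NON_BACTEREMIA = 0.644  # maximum risk value among non-bacteremia samples

LABELS = {
    1: "non-bacteremia",
    2: "low risk",
    3: "medium risk",
    4: "high risk",
    5: "bacteremia",
}

def risk_class(risk):
    """Map a risk value to one of the five classes (boundary handling is assumed)."""
    if risk < MIN_BACTEREMIA:
        return 1  # below every observed bacteremia case
    if risk > MAX_NON_BACTEREMIA:
        return 5  # above every observed non-bacteremia case
    if risk < Q1_BACTEREMIA:
        return 2
    if risk <= Q3_NON_BACTEREMIA:
        return 3
    return 4

for value in (-0.70, -0.30, 0.05, 0.30, 0.70):
    print(value, risk_class(value), LABELS[risk_class(value)])
```

Classes 1 and 5 correspond to the regions where only one of the two groups was observed, which is why they are read as definite predictions, while classes 2 to 4 grade the overlapping region.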

Subgroup Study
Most febrile children with bacteremia were under the age of 3 years old. Therefore, we conducted a subgroup study that included only these younger children. In this subgroup, 1.58% (n = 81) of febrile children had true bacteremia, 4.7% (n = 238) had a contaminant result, and 93.8% (n = 4795) had a negative result. Gender, age, and laboratory tests were obtained for further analysis and compared according to the blood culture results, as shown in Table 6. Age, percentage of bands and eosinophils, hemoglobin, and CRP differed statistically between bacteremia and non-bacteremia encounters in the subgroup study. Of these young children, the most commonly isolated pathogens were Salmonella enterica (26/81, 32.10%), Escherichia coli (19/81, 23.46%), and Group B Streptococcus (6/81, 7.41%). After repeating the characteristic variable analysis 100 times, the significant factors and their ORs are shown in Supplemental Table S1. The multivariate binary logistic regression coefficients of significant risk factors are shown in Supplemental Table S2.
In the model we developed, the AUC for predicting bacteremia in children under the age of 3 years old ranged from 0.616 to 0.750.

Discussion
The main findings of this study were as follows: (1) the bacteremia rate in febrile children that presented to the PED was low, (2) CRP was significantly higher and hemoglobin was significantly lower in children with bacteremia, (3) younger children (<3 years of age) with fever are more likely to have bacteremia than older children, and (4) machine learning can help us classify the risk of bacteremia in febrile children.
In the prevaccine era, as many as 3-10% of well-appearing children under the age of 3 years old with fever without a source were found to have occult bacteremia. Because of the concern that bacteremia could progress to invasive illness, many practitioners recommended routine blood tests, including blood culture, followed by antibiotic therapy based on WBC results, as part of the management strategy for these children [3,14]. However, since the introduction of the Haemophilus influenzae type b (Hib) vaccine in the late 1980s and the pneumococcal conjugate vaccine (PCV) in the 2000s, the rate of bacteremia in children has declined dramatically, to as low as 0.25% to 1.43% [25][26][27][28]. Irwin et al. also reported an annual reduction of 10.6% in vaccine-preventable bacteremia and found that PCV was associated with a 49% reduction in pneumococcal bacteremia between 2001 and 2011 [29]. The Hib vaccine and PCV were first introduced to Taiwan in 1996 and 2005, respectively. In our current study, the overall bacteremia rate was about 0.86% in all febrile children and about 1.58% in younger febrile children (less than 3 years) that presented to the PED. Furthermore, fewer than 0.1% (n = 14) of febrile children were identified as having bacteremia with Streptococcus pneumoniae, and none of the blood culture results yielded Hib. This result is in agreement with the low bacteremia rate in the post-Hib vaccine and post-pneumococcal vaccine eras. Although adequate aseptic techniques can substantially reduce the risk of contaminating blood culture specimens, contamination rates of 2% to 3% are considered acceptable [30]. The overall contamination rate was 2.39% in our study, and the isolation of contaminant organisms from a blood culture has a significant negative impact on patient management, including misdiagnosis, unnecessary antibiotics, additional and unnecessary diagnostic tests, additional costs, and prolonged hospital stays [11][12][13].
A low rate of positive blood cultures combined with a high rate of contaminant results has made physicians doubt the usefulness of blood cultures in children with fever that visit the PED. Reducing unnecessary blood cultures will become an important issue for healthcare systems in the postvaccine era.
CRP is an acute-phase reactant protein synthesized by the liver in response to elevated cytokine levels and has been studied as a sensitive marker of bacterial infection [31,32]. Many studies have proposed that high CRP concentration may be associated with severe bacterial infection in febrile infants and children [15,18,33][34][35]. In both the complete study population and the younger age subgroup (<3 years) in our study, CRP concentration was significantly higher in patients with bacteremia. Our results support the finding of high CRP levels in children with bacterial infection. Using CRP to properly manage children with fever may help identify true bacteremia and reduce unnecessary antibiotic therapy. Some studies have used CRP together with other parameters to predict severe bacterial infection in children. In two recent studies, CRP with extreme leukocytosis was proposed to be useful in predicting severe bacterial infection in children [33,34]. Buendia et al. showed that the Rochester criteria plus CRP testing was the most cost-effective strategy for detecting serious bacterial infections in children one to three months old with fever without a source [36]. However, the Rochester criteria apply specifically to young infants, not to older infants and children. To the best of our knowledge, no study has used more than two clinical parameters together with CRP to predict bacteremia in febrile children. We proposed a useful model for predicting bacteremia in febrile children that uses not only CRP but also other common laboratory parameters in the PED setting.
Anemia due to disease is often seen in various inflammatory states, including acute or chronic infections, autoimmune problems, chronic kidney disease and inflammation, and certain cancers [37]. Anemia has commonly been associated with infections that are typically seen in a pediatric primary care setting [38]. In 2009, Ballin et al. also reported that bacteremia and pyelonephritis are accompanied by a significant drop in hemoglobin levels without evidence of hemolytic anemia [39]. When infection occurs, inflammatory cytokines can induce hepcidin production in the liver, increase macrophage activation and red blood cell (RBC) destruction, and suppress erythropoiesis. Therefore, inflammation-related anemia may result from hepcidin-induced hypoferremia combined with the cytokine-mediated suppression of erythropoiesis and a decreased lifespan of erythrocytes [40]. This phenomenon can explain the finding of lower Hb in children with bacteremia in our cohort.
Significant differences in the percentage of neutrophils, ANC, platelet count, and eosinophils were also observed in our study. However, most reference intervals of pediatric hematology analytes are age-dependent, especially WBC and its differential count [41,42]. The changes in either the absolute count or the percentage of neutrophils are dynamic, particularly in young infants and during the first years of life. The mean percentage of neutrophils may be as low as 31-33%, and the mean neutrophil count also reaches its nadir at an age of 6 months to 2 years old [42]. Both the lower percentage of neutrophils and the lower ANC in children with bacteremia may be due to the younger age of these children in our study population (81 of 146 cases were younger than 3 years of age). Reactive thrombocytosis has diverse etiologies, including inflammatory, neoplastic, and infectious diseases [43]. In most patient series, acute infections represent the most common cause of reactive thrombocytosis [44,45]. In addition to inducing CRP, interleukin-6 also plays a pivotal role in the thrombocytosis of inflammation [46]. In children with bacteremia, the inflammation-associated cytokines produced primarily by WBC at inflammatory sites may further elevate the CRP level and induce reactive thrombocytosis. This can explain the finding of a higher platelet count in our bacteremia group. The pathophysiology of eosinopenia is related to the migration of eosinophils to the inflammatory site, presumably as a result of chemotactic substances secreted during the acute phase of inflammation [47]. A decreased number of circulating eosinophils is regarded as a consequence of acute bacterial infection, and several studies have used eosinophil count as an indicator of bacteremia [48][49][50][51]. Our finding supports the view that eosinophil count is low in patients with bacteremia.
Zeretzke et al. reported that the children most at risk for occult bacteremia are those younger than 36 months of age with a fever of 39 °C or higher [52], due to the high probability of developing serious bacterial infections, such as meningitis, sepsis, pneumonia, septic arthritis, osteomyelitis, and pyelonephritis [5,14,53]. Therefore, obtaining blood cultures for young febrile children is reasonable. In our study, bacteremia was mostly seen in febrile children under the age of 3 years old, with a bacteremia rate of 1.58% versus 0.55% in children more than 3 years old. In other words, older febrile children, who have a lower probability of bacteremia, may undergo more unnecessary blood cultures, which may waste medical resources. Reducing the frequency of blood cultures in febrile children without missing cases of bacteremia is therefore an important issue.
The quality and cost of healthcare have become an increasingly important issue worldwide. This concern has led to a focus on how we can achieve equal or better quality outcomes with fewer health resources or less money. Segal et al. described contaminant blood cultures in 85 children that added more than $78,000 in unnecessary charges [13]. A recent study also demonstrated a yearly savings of ∼$250,000 in hospital charges when the blood culture contamination rate was reduced from 3.9% to 1.6% [11]. Some guidelines for inpatient community-acquired pneumonia (CAP) management recommend considering blood culture testing for inpatients with moderate to severe bacterial pneumonia [19,20]. However, obtaining blood cultures in children hospitalized with CAP rarely identifies a causative pathogen, which makes blood cultures less useful [54,55]. The high rates of negative culture results can also represent overuse. In our current study, we also found a high negative culture rate and a low bacteremia rate, which indicates an overuse of blood cultures and a waste of healthcare resources in febrile children. Therefore, reducing the overuse of blood cultures without missing patients with dangerous bacteremia is important. With the "cost-sensitive learning" model that we proposed with machine learning, we can identify those febrile children with no risk of bacteremia (class 1 risk value) and avoid unnecessary blood cultures to save healthcare resources.
Blood culture remains the gold standard for diagnosing bacteremia, but it is a time-consuming diagnostic tool. After the blood sample is collected, it may take a couple of days to obtain the initial Gram stain result, such as Gram-positive cocci or Gram-negative bacilli. Physicians may be informed of the final blood culture result a few days after that. The features used in our prediction model are laboratory data that can be available within one hour after the blood sample is collected. In clinical practice, we can implement a decision-making application running on a personal computer as a decision support tool. The laboratory variables are the input data, and the tool can produce a report giving a risk probability of bacteremia for clinical reference. In addition, the report can visualize the distribution of each variable among the patients in the database for comparison. Therefore, our prediction model can be part of a clinical decision support system that helps physicians determine whether patients are at risk of bacteremia and arrange adequate medical treatment accordingly.
The results of this study should be interpreted with respect to certain limitations. First, procalcitonin (PCT), a useful biomarker proposed for predicting bacterial infection [56], was not commonly used in our hospital during the study period. Therefore, PCT was not used as a variable in our models. Second, febrile patients who visited out-patient departments and were hospitalized were not included. These patients may also have had bacteremia. Third, the models that we used relied mostly on laboratory tests, and the information contained within medical notes was not used. Fourth, some information recorded in medical notes, such as clinical symptoms, location of infection, vital signs (heart rate, respiratory rate, blood pressure, and oxygen saturation), respiratory pattern, and general appearance of the patient, is important in helping physicians determine the severity of illness in a febrile patient. However, these data were not available in our database to improve our prediction model, which might have limited our model's performance. Natural language processing techniques that extract bacteremia-relevant information from unstructured medical notes are expected to improve the predictive models.

Conclusions
Blood culture provides the definitive diagnosis of bacteremia, but blood cultures obtained from febrile children at the PED rarely identify a causative pathogen. Moreover, overusing blood cultures and waiting for their results have been described as a financial burden on the healthcare system. Our machine learning prediction model can be part of a clinical decision support system, help physicians determine whether patients are at risk of bacteremia, and may reduce unnecessary expenses.
Supplementary Materials: The following are available online at http://www.mdpi.com/2075-4418/10/5/307/s1, Table S1: The odds ratio of each variable and its 95%CI after repeating 100 times univariate logistic regression in subgroup study, Table S2: The maximum value, minimum value, mean value, and standard deviation of multivariate binary logistic regression coefficients of significant risk factors in subgroup study.