Classification Model for Diabetic Foot, Necrotizing Fasciitis, and Osteomyelitis

Simple Summary: Necrotizing fasciitis (NF) and osteomyelitis (OM) are severe complications in patients with diabetic foot ulcers (DFUs). Although NF and OM often lead to outcomes such as limb amputation and death, their definitive diagnosis is challenging. To aid the prompt and proper diagnosis of NF and OM in patients with DFU, we developed and evaluated a novel prediction model based on machine learning technology. In summary, our prediction model appropriately discriminated NF and OM from diabetic foot. Moreover, the model has the advantage of relying only on demographic data and routine laboratory results, so it requires no additional complicated or expensive examinations.

Abstract: Diabetic foot ulcers (DFUs) and their life-threatening complications, such as necrotizing fasciitis (NF) and osteomyelitis (OM), increase healthcare costs, morbidity, and mortality in patients with diabetes mellitus. While the early recognition of these complications could improve the clinical outcome of diabetic patients, it is not straightforward to achieve in usual clinical settings. In this study, we propose a classification model for diabetic foot, NF, and OM. To select features for the classification model, multidisciplinary teams were organized, and data were collected based on a literature search and an automatic platform. A dataset of 1581 patients (728 diabetic foot, 76 NF, and 777 OM) was divided into training and validation datasets at a ratio of 7:3. The final prediction models, built on the training dataset, exhibited areas under the receiver operating characteristic curve (AUC) of 0.80 and 0.73 for the NF and OM models, respectively, in the validation sets. In conclusion, our classification models for NF and OM showed remarkable discriminatory power and easy applicability in patients with DFU.


Introduction
Diabetic foot ulcers (DFUs), one of the most common complications of diabetes mellitus (DM), lead to increased morbidity, mortality, and healthcare costs. Approximately 19-34% of patients with diabetes develop a DFU in their lifetime, and these patients have a 2.5-fold increased risk of death at five years compared with those without foot ulcers [1]. Surprisingly, the total costs for diabetic foot care exceed those for many common cancers, including breast, colorectal, and lung cancers [2]. DFU is preceded by repetitive stress on the foot surface, with peripheral neuropathy and/or peripheral artery disease. The European Study Group on Diabetes and the Lower [...]
This study was approved by the Institutional Review Board (IRB) of WSCH (IRB No. CR322026). As the study was performed retrospectively with pre-existing medical records, the requirement for written consent from patients was waived, which was confirmed by the IRB. This study was conducted in accordance with the ethical principles of the Declaration of Helsinki. All enrolled individuals were processed anonymously and de-identified.

Selection of Predictors for Necrotizing Fasciitis and Osteomyelitis
To establish a classification model for DFU infections, we used the Bayesian approach, which provides two major advantages. First, Bayesian methods yield transparency by offering the complete probability distributions for the estimated model parameters, statistical metrics, and predictions. Second, Bayesian models can easily incorporate previously available scientific information into new data [15]. The selection of plausible predictors (also termed features) was then treated as a crucial task, in line with previous studies. In fact, several studies used literature-search approaches to identify disease-related features or predictors [22][23][24]. Motivated by these studies, two experts in plastic surgery and family medicine reviewed the literature and identified approximately 30 variables known to be related to NF and/or OM (Table S1). Then, the database administrator extracted the automatic platform-based features from the electronic health records (EHR) at the WSCH. Finally, the candidate features obtained from the literature search and EHR were evaluated using a statistical model (Figure 1B).
Among the numerous statistical methods for selecting features, stepwise feature selection can be divided into forward selection and backward elimination [25]. We implemented a modified version of backward elimination for feature selection. Whereas typical backward stepwise elimination sequentially removes the least significant feature one at a time, our modified backward elimination method removed all features with insignificant results (p ≥ 0.1 in multivariate LR) at once. The modified backward elimination method used in our study has been applied in previous studies [22,23,26].
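The single-pass elimination described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name and the example p-values are hypothetical, and the p-values are assumed to come from one multivariate LR fit performed elsewhere. Only the p < 0.1 retention threshold is taken from the text.

```python
def modified_backward_elimination(p_values, threshold=0.1):
    """Split features into kept (p < threshold) and dropped (p >= threshold).

    Unlike classic backward elimination, which removes one feature per
    refit, every insignificant feature is removed in a single pass.
    """
    kept = [name for name, p in p_values.items() if p < threshold]
    dropped = [name for name, p in p_values.items() if p >= threshold]
    return kept, dropped


# Hypothetical p-values from a single multivariate LR fit (illustration only).
example_p = {"CRP": 0.001, "ESR": 0.45, "NLR": 0.03, "MPXI": 0.22, "DNI": 0.08}

kept, dropped = modified_backward_elimination(example_p)
print(kept)     # ['CRP', 'NLR', 'DNI']
print(dropped)  # ['ESR', 'MPXI']
```

Because all insignificant features leave together, the model is refit only once, at the cost of not re-checking significance after each individual removal.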

Establishment of a Prediction Model
Numerous statistical approaches have been used for feature identification. Lee and Lee [27] integrated multiple statistical methods, such as the t-test and correlation method (i.e., the biweight midcorrelation method), to identify features. Moon et al. [26] initially screened risk factors based on expert knowledge and finally determined predictors using multiple steps of statistical methods, including logistic regression (LR). LR is a frequently used approach for predicting DFU infections [28,29]. Furthermore, efforts have been made to establish prediction models for NF and OM using LR [30,31]. Likewise, we established the classification model for DFU infections using LR as the ML technique. The LR model consists of a linear unit (a weighted sum of the input features) followed by a non-linear unit referred to as the 'sigmoid function'.
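As a minimal sketch of the two parts of an LR model, the linear unit and the sigmoid function can be written as below. The feature names are borrowed from the study, but the weights, intercept, and patient values are hypothetical and chosen only for illustration. They are not the coefficients reported in Table 4.

```python
import math


def sigmoid(z):
    """Non-linear unit: map a real-valued score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, intercept):
    """Linear unit (intercept plus weighted sum) followed by the sigmoid."""
    z = intercept + sum(weights[name] * features[name] for name in weights)
    return sigmoid(z)


# Hypothetical weights and patient values (assumptions, not study results).
weights = {"CRP": 0.02, "NLR": 0.05}
patient = {"CRP": 100.0, "NLR": 8.0}

probability = predict_probability(patient, weights, intercept=-3.0)
print(round(probability, 3))  # 0.354
```

Here the linear unit gives z = -3.0 + 0.02 * 100 + 0.05 * 8 = -0.6, and the sigmoid converts that score into a probability between 0 and 1.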

Statistics
Differences in variables were analyzed based on DFU infection status using Student's t-test and the Chi-square test for continuous and categorical variables, respectively. The performance of the DFU infection prediction model was evaluated using the receiver operating characteristic (ROC) curve and the area under the curve (AUC), which summarizes sensitivity and specificity across all possible cut-offs. Statistical analysis was performed using R (ver. 4.1.2, R Foundation for Statistical Computing, Vienna, Austria). p-values < 0.05 were considered statistically significant.
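To make the AUC concrete, the sketch below computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. This is a standard equivalent of the area under the ROC curve, not necessarily the routine used in the study, which was carried out in R. The labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical validation labels (1 = infection) and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(labels, scores)
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.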

Results
The general characteristics of the datasets are presented in Table 1. NF showed the following differential characteristics compared to diabetic foot: low ratio of females; increased levels of C-reactive protein (CRP), white blood cells (WBC), mean platelet volume (MPV), delta neutrophil index (DNI), myeloperoxidase index (MPXI), and neutrophil-lymphocyte ratio (NLR), and decreased levels of creatinine (Cr), total protein (TP), Ca, K, HbA1c, and platelets (PLT). In addition, the following OM-related characteristics were observed: higher TP, Ca, Na, Cl, red blood cells (RBC), hemoglobin (Hb), and hematocrit (Hct), and lower age, female ratio, CRP, blood urea nitrogen (BUN), Cr, K, HbA1c, erythrocyte sedimentation rate (ESR), WBC, MPV, DNI, and NLR. Among the 28 risk factors (referred to as literature search-based features) related to NF or OM obtained from the literature-based search, 22 predictors were present in the automatic platform (Table 1). Therefore, we processed these 22 variables using univariate and multivariate LR analyses (also termed stepwise LR) to identify the features of NF or OM prediction models (Tables 2 and 3).
In the univariate LR for NF status, 11 of the 22 predictors were preliminarily selected (Table 2) and processed into a multivariate model. Notably, when identifying predictors for the classification model, a threshold of p < 0.1 was applied in both the univariate and multivariate LR models. Seven predictors were identified as final input variables for the NF prediction model. Female sex, CRP, DNI, and NLR were positively associated with NF status (vs. diabetic foot), and the remaining three variables were negatively related to NF (Table 2). In the OM prediction model, 12 predictors were selected as final input variables. Among these 12 variables, younger age, female sex, TP, Cl, Hct, and PLT were positively correlated with OM status (Table 3).
The odds ratios for each feature in NF and OM (Tables 2 and 3) were log-transformed, followed by the establishment of the DFU infection prediction model described in Table 4. From these weights (coefficients), index values for the probability of DFU infections, ranging from 0 to 1, were calculated and applied to the validation dataset. As a result, areas under the receiver operating characteristic curve (AUC) of 0.80 and 0.73 were obtained from the NF and OM prediction models for the validation sets, respectively (Figure 2). Based on the maximum value of the F-measure, the optimal cut-offs for NF and OM were determined as 0.13 and 0.3, respectively.
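The two steps in this paragraph, turning odds ratios into weights and choosing a cut-off by the F-measure, can be sketched as follows. The odds ratios, labels, and scores are hypothetical. The sketch also assumes that the F-measure is the F1 score, that a case is called positive when its index is at or above the cut-off, and that candidate cut-offs are the observed index values. None of these details are stated in the text.

```python
import math


def coefficients_from_odds_ratios(odds_ratios):
    """LR weight for each feature is the natural log of its odds ratio."""
    return {name: math.log(value) for name, value in odds_ratios.items()}


def f_measure(labels, scores, cutoff):
    """F1 score when cases with score >= cutoff are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_cutoff(labels, scores):
    """Return the observed score that maximizes the F-measure."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: f_measure(labels, scores, c))


# Hypothetical odds ratios, labels, and index values (illustration only).
coefs = coefficients_from_odds_ratios({"CRP": 2.0, "HbA1c": 0.5})
labels = [0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.40, 0.70]

cutoff = best_cutoff(labels, scores)
print(cutoff)  # 0.15
```

An odds ratio above 1 yields a positive weight and one below 1 a negative weight, which matches the positive and negative associations reported for the predictors. Maximizing the F-measure balances precision and recall, which can favor a low cut-off when the positive class is rare, as with NF here.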

Discussion
We identified predictors for DFU infections and proposed a novel classification model based on a literature search and automatic platform data using the LR method. Although Wong et al. [32] proposed the Laboratory Risk Indicator for Necrotizing Fasciitis (LRINEC) scoring system for differentiating NF from other infections in 2004, this system could not maintain outstanding performance in various studies [33,34]. Considering the fulminant and dismal course of NF, a reliable and robust method is required for early differential diagnosis of patients with DFU. Further, OM, which is more prevalent and challenging than NF for clinicians managing patients with diabetic foot, should not be overlooked. Our classification model could discriminate NF and OM from other DFU infections, despite being made of easily and cheaply obtainable parameters including demographic data and routine laboratory results, rather than state-of-the-art or costly markers.
The IWGDF guidelines recommend the use of inflammatory biomarkers such as WBC, ESR, CRP, and procalcitonin to establish a diagnosis of diabetic foot infection [20]. Among the inflammatory markers included in our study, CRP, DNI, and NLR showed a significant positive correlation with NF in the multivariate analysis, whereas ESR and MPXI did not. In general, CRP is known to have a higher diagnostic accuracy for infection than WBC or ESR [20]. Moreover, a recent study using clustering analysis demonstrated that the DNI and NLR are robust predictors of equivocal septic conditions [24]. In contrast, the OM group exhibited significantly negative associations only with ESR and NLR. Several studies have reported that ESR is the most useful marker for bone infection [35,36], which is opposite to the trend observed in our data. Similarly, Serban et al. [37] found that elevated NLR is correlated with OM in DFU infections. The negative correlation of ESR and NLR with OM in our study, which is inconsistent with previous results, should thus be evaluated further.
In the EURODIALE study, male sex was found to be an independent predictor of non-healing DFU [38]. Similar findings have been reported in other populations [39,40]. This could be partially explained by the hypothesis that a prolonged non-healing DFU, which is more common in male patients, is likely to serve as a point of pathogen entry and consequently cause infection [28,41]. Likewise, male predominance was observed in all three groups in our study, while both the NF and OM groups exhibited a relative female predilection compared with the diabetic foot group. There are conflicting data regarding whether sex affects the development of DFU infection. An Australian cohort study proposed that female sex was a risk factor for DFU infection, consistent with the results of our study [42]. Nevertheless, a recent meta-analysis argued that sex does not affect the development of osteomyelitis in patients with DFU [43].
Further, age showed a significantly negative association in the OM group, consistent with a previous study by Lavery et al. [44] showing that older patients (age ≥ 70 years) had a reduced risk of osteomyelitis (relative risk = 0.46). Similarly, the aforementioned Australian study revealed that younger age is a risk factor for developing DFU infections [42]. Although the major routes of infection and the causative organisms of OM differ distinctly according to patient age, our dataset did not allow these factors to be considered [45]. Therefore, future studies that consider these factors are needed to verify the negative correlation between OM and age observed in our study.
Researchers of ML-based studies should understand the trade-off relationship between explainability and training accuracy among many ML models and select the relevant model for the intended goals [46]. Owing to insufficient research on ML-based classification models for DFU infections, we used the LR model, which is highly interpretable despite its lower accuracy, in our study. Therefore, several discordant findings with previous studies require verification through further research using more accurate ML models such as support vector machines, random forests, and deep neural networks [47].
For example, although increased serum creatinine was a positive predictor in the LRINEC score, it was a negative predictor for both NF and OM in our study population [32]. This result is contrary to Game's theory that inflammation associated with DFU induces a decline in renal function [48]. We postulated that the discrepancy with Wong's study [32] could partly have resulted from differences in the control group (e.g., cellulitis or abscesses vs. DFU). Furthermore, our study revealed that HbA1c levels are negatively associated with both NF and OM. Increased levels of HbA1c, a surrogate marker for poor glycemic control, have usually been employed to predict adverse outcomes such as lower extremity amputation or mortality in diabetic infections [49,50]. The cross-sectional nature of our study could be implicated in this questionable result. Given that all three groups in our study showed HbA1c levels well above 7%, the optimal glycemic target recommended by the American Diabetes Association [51], the association of HbA1c with NF and/or OM could be different in a prospective cohort with well-controlled glycemia. Lastly, lower MPV was associated with OM in our study, whereas MPV is often increased in many inflammatory conditions, including cardiovascular diseases, cerebral stroke, respiratory diseases, chronic renal failure, intestinal diseases, rheumatoid diseases, diabetes, and various cancers [52].
The present study has some limitations. First, the patients enrolled in this study were from a single tertiary hospital in South Korea, which makes it difficult to apply this classification model to every region or ethnicity. Further studies based on multi-center or multi-ancestry cohorts are required to obtain more generalized and improved prediction models for NF and OM. Second, we designed the study using only a cross-sectional dataset and did not include longitudinal trends or changes in predictors. Therefore, the diagnostic performance of the proposed classification model should be evaluated in a prospective cohort study to identify the causality of the predictors in our models and to validate whether the cross-sectional data-derived model could truly predict new-onset cases [23]. Third, a relatively small number of patients with NF were included compared to those with diabetic foot and OM. However, this rarity is natural given the low incidence of NF (0.03-2.17 per 100,000 population) in Korea [53]. Moreover, the number of patients with NF in our cohort is large enough to establish a robust prediction model, compared with previous NF studies [34]. Fourth, the microbiological results and clinical outcomes such as lower extremity amputation rates and mortality were not analyzed owing to the incompleteness and heterogeneity of the data. In our opinion, the proposed prediction model could be combined with microbiological and clinical outcome profiles in a well-designed prospective cohort in the future. By extension, a further study based on a clustering analysis method comparing the combined model with the preexisting etiology-based NF classification (types I-IV) is feasible and promising [24,54]. Lastly, we did not consider other comorbidities affecting the infection status. For instance, Furuse et al. [55] suggested that chronic hepatitis could be a risk factor for NF and should be included in the LRINEC scoring system.

Conclusions
In conclusion, we propose a novel classification model for DFU infections, composed of several common indices in routine clinical settings based on multidisciplinary approaches, including various departments of clinicians and experts on automatic platforms. This classification model might pave the way for the early discrimination of severe infectious complications and improve the clinical outcome and prognosis in patients with DFU.
Data Availability Statement: Not applicable.

Conflicts of Interest:
The authors declare no conflict of interest.