Integrated Machine Learning Approach for the Early Prediction of Pressure Ulcers in Spinal Cord Injury Patients

(1) Background: Pressure ulcers (PUs) substantially impact the quality of life of spinal cord injury (SCI) patients and require prompt intervention. This study used machine learning (ML) techniques to develop advanced predictive models for the occurrence of PUs in patients with SCI. (2) Methods: By analyzing the medical records of 539 patients with SCI, we observed a 35% incidence of PUs during hospitalization. Our analysis included 139 variables, including baseline characteristics, neurological status (International Standards for Neurological Classification of Spinal Cord Injury [ISNCSCI]), functional ability (Korean version of the Modified Barthel Index [K-MBI] and Functional Independence Measure [FIM]), and laboratory data. We used a variety of ML methods—a graph neural network (GNN), a deep neural network (DNN), a linear support vector machine (SVM_linear), a support vector machine with radial basis function kernel (SVM_RBF), K-nearest neighbors (KNN), a random forest (RF), and logistic regression (LR)—focusing on an integrative analysis of laboratory, neurological, and functional data. (3) Results: The SVM_linear algorithm using these composite data showed superior predictive ability (area under the receiver operating characteristic curve (AUC) = 0.904, accuracy = 0.944), as demonstrated by a 5-fold cross-validation. The critical discriminators of PU development were identified based on limb functional status and laboratory markers of inflammation. External validation highlighted the challenges of model generalization and provided a direction for future research. (4) Conclusions: Our study highlights the importance of a comprehensive, multidimensional data approach for the effective prediction of PUs in patients with SCI, especially in the acute and subacute phases. The proposed ML models show potential for the early detection and prevention of PUs, thus contributing substantially to improving patient care in clinical settings.


Introduction
Spinal cord injury (SCI) results primarily from traumatic events and can cause considerable sensory and motor impairments and complications [1]. Among the myriad challenges that patients with SCI encounter, pressure ulcers (PUs) are notable; previous studies revealed that over 20% of SCI individuals develop PUs, with significant implications for morbidity, mortality, and quality of life, especially in developing countries [2,3]. Untreated PUs have a significant impact on patient well-being and place a high financial burden on healthcare systems. These ulcers exacerbate physical and emotional distress and reduce patients' quality of life [4]. Economically, the treatment of PUs is costly, with the U.S. healthcare system spending approximately USD 26.8 billion annually [5].
These PUs typically occur over bony prominences due to prolonged pressure, and key sites for PUs include the sacrum, heels, and ischial tuberosities, with complications ranging from infections to delayed rehabilitation, underscoring the need for early prediction and intervention [6]. Prevention is crucial in the management of pressure ulcers, especially for individuals at higher risk such as those with spinal cord injuries, and regular repositioning, careful skin inspection, and the use of pressure-relieving devices are key strategies [7]. The prediction and early identification of pressure ulcers are vital, as early-stage ulcers can often be managed more easily and heal faster compared to advanced ulcers, thus highlighting the importance of innovative prediction and intervention strategies in healthcare [8]. The severity of SCI varies, with more severe cases, such as complete tetraplegia, showing a higher PU risk due to an extensive loss of sensory and motor functions [9]. This increased risk is attributed to prolonged immobility and areas prone to sores, like the sacrum and heels [10]. The management and prediction of PUs in patients with SCI have advanced through the use of traditional clinical assessments and monitoring tools [11]. While the Braden Scale, Norton Scale, and Spinal Cord Injury Pressure Ulcer Scale (SCIPUS) are commonly used in clinical settings, their predictive accuracy varies considerably among individual patients with SCI, as highlighted in previous studies [12,13]. The Braden Scale evaluates factors like sensory perception and moisture, the Norton Scale focuses on physical condition and activity, and the Spinal Cord Injury Pressure Ulcer Scale is tailored specifically to patients with SCI, considering aspects like spasticity and sweating [12]. In previous studies, the Braden Scale has demonstrated the highest overall accuracy, whereas the Norton Scale has exhibited greater specificity [12,14]. However, another study reported that functional assessments, such as the Functional Independence Measure (FIM), outperformed both the SCIPUS and Braden Scales in terms of accuracy [15]. This variability highlights the need for more individualized and effective assessment tools for pressure ulcer risk assessment in this patient population. Molecular markers, including proinflammatory cytokines like interleukin (IL)-1α, show promise in early pressure ulcer detection, though their clinical application remains exploratory [16,17]. The challenges encompass enhancing predictive accuracy and ensuring that methods are cost-effective, accessible, and universally applicable. Tackling the challenges of pressure ulcer management requires a combination of clinical expertise and cutting-edge technologies, including alternative support surfaces and wireless patient monitoring systems. These technologies are integral for risk identification, effective repositioning, and microclimate control, thereby emphasizing the need for a patient-centric care approach [18].
In recent years, machine learning (ML) has become a significant factor in healthcare, particularly in areas such as diagnosis, prognosis, and personalized treatment [19]. ML uses algorithms, ranging from simple decision trees to sophisticated deep learning models, to uncover complex patterns and correlations, leveraging increased computational power and extensive healthcare datasets [20]. For instance, decision trees use a tree-like structure to represent decisions and their potential outcomes, which makes them highly interpretable and adaptable to different data types [21]. On the other hand, deep learning models employ layered neural networks to analyze data in a complex manner, making them particularly effective at identifying subtle patterns in large datasets [22]. Advanced algorithms are currently being utilized in spinal cord injury (SCI) care to predict neurological and functional recovery through the analysis of medical records and imaging data [22,23]. Machine learning techniques, including these algorithms, are being explored for patients with SCI to identify risk factors for pressure ulcers (PUs) [24]. This addresses the challenge of limited clinical integration due to previously undefined risk factors.
The primary goal of our study is to establish optimal prediction models by comprehensively integrating clinical, physical, and biological parameters, with a focus on improving the accuracy of prognostic predictions for pressure ulcers during the acute and subacute phases of hospitalization in patients with SCI. The secondary goal is to translate these models into practical tools for clinical application to enable the early intervention and effective prevention of pressure ulcers, thereby significantly improving patient outcomes during their hospital stay.

Ethics and Study Design
This retrospective observational study was approved by the institutional review board (IRB) of Dankook University Hospital (IRB No. 2021-05-021) and was conducted in accordance with the ethical guidelines of the 1975 Declaration of Helsinki. We reviewed the medical records of 1117 patients with SCI from Dankook University Hospital (DKUH) and Chungnam National University Hospital (CNUH) in South Korea. Patients were included if they underwent surgical or conservative treatment for traumatic or nontraumatic SCI with confirmed spinal cord signal changes by spinal magnetic resonance imaging (MRI) as demonstrated in previous studies [25,26] from May 1996 to May 2021. The clinical data during the initial hospitalization period for SCI were collected by three researchers, who were specifically assigned to ensure impartiality and minimize bias. These researchers were not involved in the statistical analysis or development of the ML model due to a separation of roles that was implemented to maintain the objectivity and integrity of both the data collection and analysis phases. The clinical parameters included baseline characteristics, such as sex, age, height, weight, alcohol consumption, smoking status, and medical history; subscale and total score of the International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI), the Korean version of the Modified Barthel Index (K-MBI), and Functional Independence Measure (FIM), which were initially assessed during the initial hospitalization period for SCI. We obtained all the laboratory data from the laboratory medicine department of each hospital. The laboratory parameters included complete blood count (CBC), electrolytes, lipid battery, glucose, albumin, protein, C-reactive protein (CRP), the erythrocyte sedimentation rate (ESR), procalcitonin, blood urea nitrogen (BUN), creatinine, aspartate transaminase (AST), alanine transaminase (ALT), and total bilirubin. The patients with SCI who experienced PUs at least once during the hospitalization period were classified into the PU group, while the patients who never experienced PUs were classified into the non-PU group. For the PU group, we extracted only clinical and laboratory data from 3 days to 60 days before the onset of PU.
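The pre-onset extraction window described above can be illustrated with a small sketch. This is not the authors' code: the column names (`patient_id`, `sample_date`, `onset_date`), the toy values, and the choice to treat both window boundaries as inclusive are assumptions made only for illustration.

```python
import pandas as pd

def select_pre_onset_window(labs, onset, min_days=3, max_days=60):
    """Keep rows sampled between max_days and min_days before PU onset.

    Hypothetical helper: column names and inclusive bounds are assumptions.
    """
    merged = labs.merge(onset, on="patient_id", how="inner")
    days_before = (merged["onset_date"] - merged["sample_date"]).dt.days
    mask = days_before.between(min_days, max_days)  # inclusive on both ends
    return merged.loc[mask].drop(columns="onset_date")

# Toy laboratory records (values are made up).
labs = pd.DataFrame({
    "patient_id": [1, 1, 1, 2],
    "sample_date": pd.to_datetime(
        ["2020-01-01", "2020-02-25", "2020-03-01", "2020-01-10"]),
    "crp": [1.2, 3.4, 5.6, 0.8],
})
# Only patient 1 developed a PU in this toy example.
onset = pd.DataFrame({
    "patient_id": [1],
    "onset_date": pd.to_datetime(["2020-03-02"]),
})

window = select_pre_onset_window(labs, onset)
print(window)
```

In this toy example only the sample taken 6 days before onset is retained; the samples taken 61 days and 1 day before onset fall outside the window.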

Machine Learning Analysis
In this study, the recursive feature elimination (RFE) technique was used for feature selection. In the RFE method, a given machine learning algorithm is trained on the initial set of baseline features, after which the importance of each feature is computed. The least important feature is then eliminated, and the model is retrained on the remaining features. This elimination process is repeated until only the set of features that contributes significantly to the model remains. The linear support vector machine (SVM_linear) classifier was selected as the training algorithm. The selection process was implemented using the RFE with cross-validation (RFECV) module provided by the scikit-learn library [27], which ensures robust feature selection by considering the cross-validation performance during the elimination process.
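A minimal sketch of this selection step, using scikit-learn's `RFECV` with a linear-kernel `SVC`, is given below. The synthetic data, the `roc_auc` scoring choice, the fold settings, and the hyperparameters are assumptions for illustration; the study's actual configuration is not specified in the text.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for the clinical feature matrix (assumption).
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, n_redundant=2,
    weights=[0.65, 0.35], random_state=0,
)

# A linear-kernel SVC exposes coef_, which RFECV uses to rank features.
selector = RFECV(
    estimator=SVC(kernel="linear", C=1.0),
    step=1,  # drop one feature per elimination round
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",  # assumed metric; the source does not name one
)
selector.fit(X, y)

X_selected = selector.transform(X)
n_selected = selector.n_features_
print("features kept:", n_selected)
```

The boolean mask `selector.support_` identifies which columns survived, which is what would be mapped back to parameter names in practice.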
We utilized various machine learning methods, combining advanced deep learning with traditional techniques. We employed the graph neural network-graph convolutional network (GNN-GCN) and deep neural network (DNN), which are complex artificial neural networks. GNN-GCN analyzes data structured in graphs, while DNN processes data through interconnected layers [28]. Traditional methods used for classification include linear support vector machine (SVM_linear) and support vector machine with a radial basis function kernel (SVM_RBF) for decision boundary-based classification, K-nearest neighbors (KNN) for proximity-based classification, random forest (RF) as an ensemble of decision trees to improve prediction accuracy, and logistic regression (LR) for binary outcome probabilities.
The GNN-GCN model was trained using a graph matrix computed by Euclidean distances between input data, while the other models were trained directly on input data. The GNN-GCN and DNN models were implemented using PyTorch 1.10 [29], and the other ML methods were implemented using scikit-learn 0.24.
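One common way to derive a graph matrix from Euclidean distances is a k-nearest-neighbor adjacency with the symmetric normalization typically used in graph convolutional networks. The sketch below illustrates that idea in NumPy; the value of `k`, the symmetrization, and the normalization are assumptions, since the text does not state how the graph was constructed.

```python
import numpy as np

def knn_graph_adjacency(X, k=5):
    """Build a symmetric-normalized kNN adjacency from Euclidean distances.

    Hypothetical construction: k and the normalization are assumptions.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)            # exclude self from neighbors
    neighbors = np.argsort(dist, axis=1)[:, :k]

    n = X.shape[0]
    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    A[rows, neighbors.ravel()] = 1.0
    A = np.maximum(A, A.T)                    # make the graph undirected

    A_hat = A + np.eye(n)                     # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))                  # 30 toy patients, 8 features
A_norm = knn_graph_adjacency(X, k=5)
H = A_norm @ X                                # one weight-free propagation step
print(A_norm.shape, H.shape)
```

In a GCN layer, the product `A_norm @ X` would additionally be multiplied by a learned weight matrix and passed through a nonlinearity.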
Fivefold cross-validation was performed to evaluate the model performance. The dataset was randomized and divided into five partitions, one of which was used for testing and the remaining four for training. To ensure balanced case-control ratios in each partition, a stratified K-fold cross-validation method was used. Cross-validation was repeated five times to ensure a robust and reliable evaluation. Model performance was evaluated by accuracy, area under the receiver operating characteristic curve (AUC), and F1-score.
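The evaluation protocol can be sketched with scikit-learn's `RepeatedStratifiedKFold` and `cross_validate`. The synthetic data, the standardization step, and the choice of a linear SVM as the example model are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with roughly a 35% positive class (assumption).
X, y = make_classification(
    n_samples=300, n_features=15, n_informative=6,
    weights=[0.65, 0.35], random_state=0,
)

# Stratified 5-fold cross-validation, repeated 5 times (25 fits in total).
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))

scores = cross_validate(
    model, X, y, cv=cv, scoring=["accuracy", "roc_auc", "f1"],
)
summary = {
    name: scores[f"test_{name}"].mean()
    for name in ("accuracy", "roc_auc", "f1")
}
print(summary)
```

Placing the scaler inside the pipeline keeps the scaling parameters fitted on the training folds only, which avoids leaking test-fold information.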
To improve the interpretability of the results, we additionally trained a decision tree. The decision tree was trained with entropy as the partitioning criterion. The resulting tree diagram shows which features are used for prediction and the corresponding split criteria.
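As an illustration, a decision tree with the entropy criterion can be fitted and printed as text rules with scikit-learn. The synthetic data, the feature names, and the depth limit are assumptions; the study's tree was trained on the clinical dataset.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data with hypothetical feature names.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, n_redundant=0,
    random_state=0,
)
names = [f"feature_{i}" for i in range(6)]

# Entropy (information gain) as the split criterion; depth capped for readability.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)

rules = export_text(tree, feature_names=names)
print(rules)
```

Each printed line corresponds to a threshold test on one feature, which is the same kind of information a tree diagram conveys graphically.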

Statistics
All the laboratory, neurological, and functional data were compared between the two hospitals using PASW 20.0 (IBM Corp., New York, NY, USA). The Shapiro-Wilk test was used to assess the normality of the distribution of all the numerical data from each group. The chi-squared test was used for categorical parameters, and the independent t test was used for continuous parameters to compare the differences between the two groups. p < 0.05 indicated statistical significance.
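The same three tests are available in SciPy, and a brief sketch is shown here for readers who do not use PASW. The simulated values and the contingency table are invented for illustration and do not reflect the study data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated continuous parameter for the two groups (made-up values).
non_pu = rng.normal(loc=13.0, scale=1.5, size=60)
pu = rng.normal(loc=11.5, scale=1.5, size=40)

# Shapiro-Wilk normality test for each group.
shapiro_non_pu = stats.shapiro(non_pu)
shapiro_pu = stats.shapiro(pu)

# Independent two-sample t test for the continuous parameter.
t_stat, t_p = stats.ttest_ind(non_pu, pu)

# Chi-squared test on a 2x2 table (rows: group, columns: category; made up).
table = np.array([[40, 20], [30, 10]])
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

print(f"Shapiro p: {shapiro_non_pu.pvalue:.3f}, {shapiro_pu.pvalue:.3f}")
print(f"t test p: {t_p:.4f}; chi-squared p: {chi_p:.4f}")
```

Note that `chi2_contingency` applies Yates' continuity correction to 2x2 tables by default, which may differ from the defaults in other statistical packages.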

Flow of the Machine Learning Algorithm
Figure 1 outlines the flow of our machine learning algorithm. Clinical data, sourced from the two participating hospitals, underwent several preprocessing stages. These involved imputation using the KNN method and filtering of missing values (NaN). Data from patients with a feature coverage exceeding 80% were subjected to imputation, while those with a feature coverage less than 70% were discarded during the NaN filtering stage. For the PU cohort, data were further curated to capture only the interval from 60 days to 3 days before PU onset. Concurrently, laboratory data pertaining to the PU group were processed through date filtering and imputed using mean values. The processed clinical and laboratory datasets were subsequently combined and passed to feature selection. Using these consolidated data, seven distinct machine learning models were developed. The performance of each dataset and algorithm combination was determined through a 5-fold cross-validation procedure, with performance metrics presented in terms of accuracy, AUC, and F1-score.
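The coverage filtering and KNN imputation steps might look like the following sketch. The single `min_coverage` threshold, the neighbor count, the column names, and the toy values are assumptions; the text reports two different coverage percentages and does not say how they interact.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

def filter_and_impute(df, min_coverage=0.8, n_neighbors=2):
    """Drop rows with too many missing features, then KNN-impute the rest.

    Hypothetical helper: threshold and neighbor count are assumptions.
    """
    coverage = df.notna().mean(axis=1)        # fraction of observed features
    kept = df.loc[coverage >= min_coverage]
    imputer = KNNImputer(n_neighbors=n_neighbors)
    values = imputer.fit_transform(kept)
    return pd.DataFrame(values, columns=kept.columns, index=kept.index)

# Toy records (values are made up); row 4 has only one observed feature.
df = pd.DataFrame({
    "hemoglobin": [13.1, np.nan, 11.8, 12.4, np.nan, 12.9],
    "albumin": [4.0, 3.5, np.nan, 3.8, np.nan, 4.1],
    "crp": [0.5, 2.1, 1.7, 0.9, np.nan, 0.4],
    "platelet": [250, 310, 280, 265, 300, 240],
    "fim_eating": [7, 5, 4, 6, np.nan, 7],
})

clean = filter_and_impute(df)
print(clean)
```

In a cross-validated setting, the imputer would normally be fitted inside each training fold rather than on the full dataset.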

Data Characteristics and Dataset Selection for Each Hospital
Table 1 presents the baseline characteristics of patients from Dankook University Hospital (DKUH) and Chungnam National University Hospital (CNUH), categorized by the presence or absence of pressure ulcers. While the DKUH data were used for the formulation and refinement of machine learning algorithms, the CNUH data were utilized exclusively for external validation. Notably, patients with pressure ulcers exhibited longer hospital stays across both institutions. For our machine learning models, we utilized parameters from the ISNCSCI, K-MBI, FIM, and 20 laboratory indicators. A detailed analysis revealed significant differences in certain metrics. Within the ISNCSCI, total motor scores, especially for the right and left lower extremities, showed marked disparities. Sensory scores, both light touch and pinprick, varied notably between groups. According to the K-MBI metrics, distinctions were evident in toileting, stair climbing, dressing, and ambulation, among others. The FIM highlighted differences in bladder and bowel controls. Finally, laboratory results revealed contrasting hemoglobin levels, hematocrit levels, and platelet counts between the groups. The distributions of patient data across the different datasets from the two hospitals are shown in Table 2. Notably, the "Lab" dataset was the most voluminous of all the datasets. Nevertheless, all the datasets were rigorously evaluated to optimize the machine learning models. The data revealed a marked difference in the composition of the datasets between the two hospitals. Primary analysis was performed using the comprehensive "Lab + ISNCSCI + K-MBI + FIM" dataset from DKUH. To increase the precision of external validation, distinct training and validation datasets were derived from the "Lab + ISNCSCI" collection of both DKUH and CNUH, thus facilitating cross-institutional validation. For DKUH, the performance of machine learning models was evaluated using different dataset combinations, such as "Lab", "Lab + ISNCSCI", "Lab + ISNCSCI + K-MBI", and "Lab + ISNCSCI + K-MBI + FIM". In contrast, the CNUH evaluations focused primarily on the "Lab", "ISNCSCI", and "Lab + ISNCSCI" datasets due to significant data gaps in the K-MBI and FIM metrics.

Predictive Performance of the Machine Learning Models
Table 3 delineates the predictive properties of various machine learning (ML) models across distinct dataset combinations at Dankook University Hospital (DKUH). A comparison across identical algorithms revealed that the combined "Lab + ISNCSCI + K-MBI + FIM" dataset consistently surpassed the other datasets in terms of AUC values. Within the same dataset, algorithmic variations demonstrated different levels of performance; the GNN-GCN algorithm outperformed the others in the "Lab" dataset, and the KNN algorithm outperformed the others in the "Lab + ISNCSCI" dataset, whereas the SVM algorithm consistently stood out in the "Lab + ISNCSCI + K-MBI" and "Lab + ISNCSCI + K-MBI + FIM" datasets. Remarkably, the SVM_linear algorithm in the "Lab + ISNCSCI + K-MBI + FIM" dataset achieved the highest AUC of 0.904, an accuracy of 0.944, and an F1-score of 0.907. Feature selection was used strategically to improve the predictive accuracy of our models. Figure 2 shows the t-SNE plots before and after this important feature selection process. As shown, the demarcation between the non-PU and PU groups became much more pronounced after feature selection, illustrating the critical role of feature selection in achieving clearer data differentiation.
Figure 3 shows the importance scores of the top 39 parameters identified by the SVM_linear model. These parameters are spread across several categories, including Lab, ISNCSCI, K-MBI, and FIM. The representation of each category in this ranking indicates its integral role in influencing the model's predictions. The most prominent parameter in this evaluation was "Ambulation" in the K-MBI, which received the highest importance score of 1.212, followed by lower body dressing, transfer to bed, chair, and wheelchair, eating, and bathing in the FIM. This score demonstrated the importance of the functional status of the upper and lower extremities in the modeling process. In addition, the balanced presence of clinical scales such as the K-MBI and FIM, combined with laboratory and neurological data from Lab and ISNCSCI, highlights the variety of factors considered important in this model. This combination of factors demonstrates the intricate blend of physical, cognitive, and biological considerations that the model accounts for when analyzing outcomes.
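The text does not state how the importance scores were derived from the SVM_linear model. One common approach for linear SVMs, shown here purely as an assumed illustration, is to rank features by the absolute value of their coefficients after standardizing the inputs. The data and feature names below are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data with hypothetical feature names.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, n_redundant=0,
    random_state=0,
)
names = [f"feature_{i}" for i in range(8)]

# Standardize so coefficient magnitudes are comparable across features.
X_std = StandardScaler().fit_transform(X)
svm = SVC(kernel="linear").fit(X_std, y)

# Assumed importance definition: absolute weight in the decision function.
importance = np.abs(svm.coef_).ravel()
ranking = sorted(zip(names, importance), key=lambda p: p[1], reverse=True)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Coefficient magnitudes are only meaningful on a common scale, which is why the standardization step precedes the fit in this sketch.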
For the CNUH dataset, Table 4 describes the predictive effectiveness of various machine learning models across different combinations of datasets and algorithms. Within each algorithm, the "ISNCSCI" dataset predominantly registered the highest AUC for the GNN-GCN, DNN, KNN, and random forest algorithms. Conversely, the "Lab + ISNCSCI" dataset yielded the highest AUC for the SVM_linear, SVM_RBF, and logistic regression algorithms. When comparing different algorithms on the same dataset, the KNN algorithm consistently had the highest AUC in all three datasets. Overall, the KNN algorithm showed the best predictive performance on the "ISNCSCI" dataset, achieving an AUC of 0.737, an accuracy of 0.891, and an F1-score of 0.661. The t-SNE plot before and after feature selection in the CNUH model is shown in Supplementary Figure S1. The non-PU and PU groups were more clearly separated after feature selection. The eleven feature parameters were ranked by importance scores based on the outcome of the KNN model (Supplementary Figure S2), and the ASIA impairment scale (AIS) item of the ISNCSCI had the highest importance score of 0.19.
Figure 4 shows the receiver operating characteristic (ROC) curves for the optimal datasets from both DKUH and CNUH. With respect to the DKUH dataset, which comprises the Lab + ISNCSCI + K-MBI + FIM variables, the SVM_linear algorithm distinctly surpassed the other methods, with an AUC of 0.904, an accuracy of 94.4%, a sensitivity of 0.840, a specificity of 0.968, and an F1-score of 0.907. The CNUH dataset, which was based exclusively on ISNCSCI variables, exhibited more uniform results across different algorithms. Notably, the KNN algorithm had an AUC of 0.737, an accuracy of 89.1%, a sensitivity of 0.562, a specificity of 0.913, and an F1-score of 0.661. Overall, the performance at CNUH was more restrained than that at DKUH. The SVM_linear algorithm maintained its superior performance with only the ISNCSCI variables in the DKUH dataset, but its AUC and accuracy were considerably lower than those of the combination of the Lab + ISNCSCI + K-MBI + FIM variables and even the CNUH results (Supplementary Figure S3).
Figure 5 shows the decision tree model employed to distinguish between the non-PU and PU groups. The primary discriminator is "K-MBI: ambulation", with a threshold value of 2.338. Subjects who scored below this threshold were predominantly categorized using subsequent discriminators, notably "ISNCSCI: motor score of Rt. lower extremity" (≤18.314) and "FIM: eating" (≤5.033). In contrast, for those surpassing the "K-MBI: ambulation" threshold, "FIM: walk/wheelchair" (≤3.445) and "ISNCSCI: motor score of Rt. lower extremity" (≤7.388) emerged as salient discriminators. The tree further expands to encompass laboratory parameters such as platelet count, mean corpuscular hemoglobin, and eosinophil count, as well as multiple neurological and functional metrics. Each branch point denotes a unique criterion that aids in efficiently classifying subjects into the non-PU and PU groups.

Discussion
In our quest to improve outcomes for patients with SCI, our study's application of ML techniques marks a shift from traditional areas of focus, including neurological and functional outcomes [30][31][32][33][34][35], to the proactive prevention of PUs. These prevalent yet preventable complications have a profound impact on the recovery and quality of life of patients with SCI [36]. Our innovative use of ML, ranging from SVM to DNN to GNN, has enabled us to delve into complex datasets and extract critical insights from nonlinear relationships for more accurate PU predictive modeling. This methodology underscores the potential of ML to go beyond the boundaries of conventional statistical analysis.
A key finding of our study was the differential performance of the linear SVM model across different datasets. The DKUH dataset, with its larger sample size and diverse baseline characteristics such as age range, injury severity, and neurological status, provided a different context for ML application than the CNUH dataset. This contrast in performance underscores the influence of specific dataset attributes on the success of ML models and highlights the need for data that encapsulate a broad range of patient scenarios to improve predictive accuracy in diverse clinical settings.
Furthermore, our analysis revealed the paramount importance of functional parameters, including walking (K-MBI), lower body dressing, transfers, eating, and bathing (FIM), in predicting PUs (Figure 3). This finding highlights the dominance of functional data over neurological factors in risk assessment. The need for functional assessment is particularly pronounced in conditions such as spinal shock, where the neurological status may be uncertain. In addition to these functional indicators, our study also draws attention to the importance of pre-onset laboratory markers related to inflammation and anemia, such as lymphocyte, neutrophil, and eosinophil counts, as well as MCHC and RBC counts, which is consistent with the findings of a previous study [37]. Although not primary predictors, their association with increased PU risk is consistent with previous research and underscores their importance in PU risk assessment.
As we move toward clinical application, we have developed a decision tree algorithm based on the results of our study. This algorithm incorporates key parameters identified as significant in predicting PUs, such as functional status indicators (e.g., mobility and self-care ability) and relevant laboratory markers (e.g., inflammatory and hematological parameters). Designed as a user-friendly decision support tool, it systematically evaluates these factors to estimate PU risk, providing clinicians with a structured framework for early intervention. While the algorithm is promising, extensive validation in diverse clinical settings is essential to determine its utility and efficacy. Our preliminary external validation efforts have revealed variability in performance across datasets from different institutions, underscoring the challenges of creating a universally applicable ML-based prediction tool. These observations not only highlight the intricacies of ML model generalization but also pave the way for further research to refine and adapt the algorithm for broader clinical use [38,39].
The limitations of our study are openly acknowledged, particularly with respect to the limited sample size and the brevity of the observation period. We recognize the potential of expansive data sources such as the National Spinal Cord Injury Statistical Center (NSCISC; https://www.nscisc.uab.edu/, accessed on 19 October 2021) and the National Institutes of Health (NIH) National Institute of Neurological Disorders and Stroke (NINDS; https://www.commondataelements.ninds.nih.gov/Spinal%20Cord%20Injury, accessed on 19 October 2021), although their use was limited by their mismatch with the acute and subacute phase specificity of our research, particularly the lack of time-sensitive laboratory data relevant to the onset of PUs. In addition, the design of our study needed a rigorous selection process to include only participants with unique records, limiting our dataset.
We envision that future studies will include a wider network, integrate data from multiple centers, and account for the temporal progression of PUs. The exploration of hybrid machine learning frameworks that combine the strengths of different algorithms may

Figure 1. Flow of the machine learning process.

Figure 2. t-SNE plot before (left) and after (right) feature selection in the DKUH data.

Figure 3. Importance scores of the top 39 featured parameters based on the outcome of the SVM_linear model. Abbreviations: K-MBI = Korean version of the Modified Barthel Index; FIM = Functional Independence Measure; ISNCSCI = International Standards for Neurological Classification of Spinal Cord Injury; LEMS = lower extremity, motor subscore; LTL = light touch, left; MCHC = mean corpuscular hemoglobin concentration; RBC = red blood cell; AIS = ASIA impairment scale; PPL = pinprick, left; LER = lower extremity, right; LEL = lower extremity, left; LT, total = light touch, total; MCH = mean corpuscular hemoglobin.

Figure 4. Receiver operating characteristic (ROC) curve of each machine learning algorithm in (A) DKUH using the Lab + ISNCSCI + K-MBI + FIM dataset and (B) CNUH using the ISNCSCI dataset.

Figure 5. Decision tree of the SVM_linear model trained with the "Lab + ISNCSCI + K-MBI + FIM" dataset of DKUH to classify the non-PU and PU groups. The "ambulation" subscale of the K-MBI was identified as the first single discriminator for determination of the two groups. The red line indicates "yes", and the blue line indicates "no". Abbreviations: K-MBI = Korean version of the Modified Barthel Index; ISNCSCI = International Standards for Neurological Classification of Spinal Cord Injury; FIM = Functional Independence Measure; LER = lower extremity, right; UER = upper extremity, right; AIS = ASIA Impairment Scale; LTL = light touch, left; LEL = lower extremity, left; PPL = pinprick, left; UEMS = Upper Extremity Motor Subscore; LT, total = light touch, total; MCH = mean corpuscular hemoglobin; MCHC = mean corpuscular hemoglobin concentration.

Table 1. Baseline characteristics of patients with and without pressure ulcers at DKUH and CNUH.
Values are presented as the number of subjects (%) or means ± standard deviations. The p values of the non-PU and PU groups were determined by the chi-squared test and independent t test; * p < 0.05. Abbreviations: PU = pressure ulcer; ISNCSCI = International Standards for Neurological Classification of Spinal Cord Injury; K-MBI = Korean version of the Modified Barthel Index; FIM = Functional Independence Measure.

Table 2. Number of patients in the dataset category.
The numbers of patients satisfying each dataset in the two hospitals are presented. The numbers in parentheses represent the counts of PU events. Abbreviations: DKUH = Dankook University Hospital; CNUH = Chungnam National University Hospital; PU = pressure ulcer; Lab = laboratory data; ISNCSCI = International Standards for Neurological Classification of Spinal Cord Injury; K-MBI = Korean version of the Modified Barthel Index; FIM = Functional Independence Measure.

Table 3. Performance comparison of machine learning algorithms in each dataset from DKUH.
Abbreviations: Lab = laboratory data; ISNCSCI = International Standards for Neurological Classification of Spinal Cord Injury; K-MBI = Korean version of the Modified Barthel Index; FIM = Functional Independence Measure.

Table 4. Performance comparison of machine learning algorithms in each dataset of CNUH data.
The KNN model trained with the "ISNCSCI" dataset from CNUH demonstrated the highest performance, with an AUC of 0.737. Abbreviations: Lab = laboratory data; ISNCSCI = International Standards for Neurological Classification of Spinal Cord Injury.