Injury Prediction in Korean Adult Field Hockey Players Using Machine Learning and SHAP-Based Feature Importance Analysis

Choi, Minkyung; Lee, Kumju; Lee, Kihyuk

doi:10.3390/app15168946

by Minkyung Choi ¹, Kumju Lee ² and Kihyuk Lee ¹,*

¹ Department of Sport Culture, Dongguk University, 30, Pildong-ro 1gil, Jung-gu, Seoul 04620, Republic of Korea

² Department of Physical Education, Korea National Sport University, Seoul 05541, Republic of Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(16), 8946; https://doi.org/10.3390/app15168946

Submission received: 18 July 2025 / Revised: 5 August 2025 / Accepted: 11 August 2025 / Published: 13 August 2025

(This article belongs to the Special Issue Sports Injuries: Prevention and Rehabilitation)

Abstract

Field hockey involves repetitive high-intensity movements and physical contact, posing a high risk of injury. However, studies developing injury prediction models without relying on expensive tools such as GPS remain limited. This study aimed to develop an explainable AI model that predicts injury occurrence using only simple questionnaire-based data and visually identifies key predictors. Survey data were collected from 239 adult players registered with the Korea Hockey Association in 2024, including university and semi-professional team athletes. Ten variables were used: sex, team affiliation, playing experience, player level, warm-up duration, weekly training hours and days, and physical indicators (age, height, weight). Injury was defined as an event within the past year that resulted in being unable to train for more than 24 h. Logistic Regression, Random Forest, and XGBoost models were compared. The final model, Logistic Regression, underwent SHAP-based visualization for interpretability. The Logistic Regression model showed the best performance in recall (0.6810 ± 0.0983), F1-score (0.6260 ± 0.0499), and AUC (0.6515 ± 0.0393). SHAP analysis identified Group, Training Time, Weight, and Player Level as key predictors, and visualized their contributions to individual predictions. This study demonstrates that a lightweight, interpretable injury prediction model using only simple survey data can achieve practical performance. This approach offers valuable insights for real-world applications and the development of injury prevention strategies.

Keywords:

field hockey; injury prediction; machine learning; SHAP; explainable AI

1. Introduction

Field hockey is a sport characterized by short bursts of high-intensity movement, abrupt changes in direction, and frequent physical contact between players, making it a sport with a high risk of injury [1,2]. Such injuries not only lead to a decline in individual athletic performance, but also result in extended rehabilitation periods and interruptions in training, which can have both direct and indirect impacts on team performance. Therefore, the importance of systematic injury risk management by coaches and trainers in the field has been consistently emphasized [3,4]. In recent years, the field of sports science has seen a growing interest in using machine learning to predict athletic injuries, with most studies focusing on improving model performance indicators such as prediction accuracy and AUC [5,6]. While these metrics are useful for evaluating a model’s classification ability, they fall short of offering practical guidance on which specific factors—such as survey responses or physical measurements—should be addressed to effectively reduce injury risk in real-world settings.

Meanwhile, in the case of field hockey, some studies have examined injury-related factors based on self-reported questionnaires and physical indicators. For example, experiences of pain in specific body parts, levels of fatigue, sleep quality, and annual training volume have been reported to be significantly associated with injury risk [7,8]. According to recent findings, around 60% of injury prediction studies still rely on regression-based methods, with most using univariate statistics or regression-centered approaches. However, such models are limited in their ability to capture the complexity of injury mechanisms, especially when multiple risk factors can lead to similar injury outcomes. They also fall short of reflecting interactions between variables and potential non-linear relationships [9]. Therefore, there is a growing need to build injury prediction models that incorporate a variety of characteristics, including self-reported data and physical measures, and to adopt analytical approaches capable of capturing complex interactions and non-linearities. Beyond improving prediction, such research should aim to identify and interpret key risk factors that can guide effective interventions in practice [10,11].

Accordingly, this study has two main objectives. First, it aims to develop a predictive model for injury occurrence among field hockey players using machine learning techniques. Second, by applying the SHAP (Shapley Additive Explanations) interpretability method, it seeks to visually present the importance of each variable within the model, thereby providing practical insights that can be used for injury prevention and management in real-world field settings.

2. Materials and Methods

2.1. Participants and Procedures

This study was conducted with 239 adult field hockey players registered with the Korea Hockey Association as of 2024, all of whom were affiliated with either university or semi-professional teams. All participants were aged 19 years or older and voluntarily completed a survey that included information on sex, team type (university or semi-professional), playing experience, warm-up duration, weekly training hours, and number of training days per week. Participation was limited to individuals who provided informed consent after being fully informed about the purpose and use of personal data. All procedures were carried out in accordance with the ethical principles of the Declaration of Helsinki.

The survey took place in 2024, from 1 September to 30 September. No incomplete or duplicate responses were found, and all 239 responses were included in the final analysis. Injury was defined as an event that occurred during training or competition in the past year and resulted in an inability to participate in normal athletic activities for more than 24 h. The participants included 120 males (50.2%) and 119 females (49.8%), with 109 athletes (45.6%) from university teams and 130 (54.4%) from semi-professional teams. To enhance the clarity of the methodological process, Figure 1 illustrates the overall analytical pipeline from participant recruitment and survey data collection to model development and SHAP-based interpretation.

2.2. Survey Content

The questionnaire used in this study was designed to comprehensively identify risk factors associated with injuries and to support injury prevention among field hockey players. The survey items were developed with reference to previous studies [12,13,14].

It included questions on basic physical characteristics, training habits, team-related attributes, and injury history, all of which were completed via self-report. Injury experience was assessed with the following question: “Have you experienced an injury during training or competition in the past year that prevented you from participating in normal physical activity for more than 24 h?” Participants who answered “yes” were coded as 1 (injured), and those who answered “no” were coded as 0 (not injured).

To assess physical characteristics, Age, Height, and Weight were treated as continuous variables and used in their original form without categorization. All other variables were coded categorically according to predefined criteria. Sex was coded as 1 for male and 2 for female. Team affiliation (Group) was coded as 1 for university teams and 2 for semi-professional teams. Playing experience (Career) was categorized into five levels: less than 1 year (1), 1–2 years (2), 3–4 years (3), 5–6 years (4), and more than 6 years (5). Player level (Level) was defined as general player (1), intermediate-to-advanced player, which included youth or national team candidates (2), and former national team member (3). Warm-up duration per session (Warm-up) was categorized as none (1), less than 5 min (2), 5–10 min (3), 10–20 min (4), 20–30 min (5), or more than 30 min (6). Training duration per session (Training Time) was categorized as 1–2 h (1), 2–3 h (2), 3–4 h (3), 5–6 h (4), or more than 6 h (5). Weekly training days (Training Day) were classified on a 5-point scale from 1 day (1) to 5 or more days (5). All variables were used as independent variables in relation to injury status, and each participant completed the survey only once to avoid duplicate responses.
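The coding scheme described above can be written down as a small lookup table. The sketch below is illustrative only: the dictionary layout, the ASCII response labels, the intermediate Training Day labels, and the `encode_response` helper are assumptions made for this example, not the authors' code. Only the numeric codes follow the description in this section.

```python
# Numeric codes follow Section 2.2; the response labels are hypothetical
# spellings chosen for this sketch.
CODEBOOK = {
    "Sex": {"male": 1, "female": 2},
    "Group": {"university": 1, "semi-professional": 2},
    "Career": {"<1 y": 1, "1-2 y": 2, "3-4 y": 3, "5-6 y": 4, ">6 y": 5},
    "Level": {"general": 1, "intermediate-to-advanced": 2,
              "former national team": 3},
    "Warm-up": {"none": 1, "<5 min": 2, "5-10 min": 3, "10-20 min": 4,
                "20-30 min": 5, ">30 min": 6},
    "Training Time": {"1-2 h": 1, "2-3 h": 2, "3-4 h": 3, "5-6 h": 4,
                      ">6 h": 5},
    # Labels for 2 to 4 days are assumed; the text gives only the endpoints.
    "Training Day": {"1 day": 1, "2 days": 2, "3 days": 3, "4 days": 4,
                     "5+ days": 5},
}
CONTINUOUS = ("Age", "Height", "Weight")


def encode_response(response):
    """Turn one questionnaire response (dict of raw answers) into numbers."""
    row = {}
    for name, value in response.items():
        if name in CONTINUOUS:
            row[name] = float(value)               # used as-is, no binning
        elif name == "Injury":
            row[name] = 1 if value == "yes" else 0  # 1 = injured, 0 = not
        else:
            row[name] = CODEBOOK[name][value]
    return row


example = {"Sex": "female", "Group": "semi-professional", "Career": ">6 y",
           "Level": "general", "Warm-up": "10-20 min",
           "Training Time": "2-3 h", "Training Day": "5+ days",
           "Age": 24, "Height": 165, "Weight": 60, "Injury": "yes"}
encoded = encode_response(example)
```

Keeping the codes in one table makes it easy to check that the ordinal categories are numbered in increasing order, which matters when a linear model treats them as numeric inputs.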

2.3. Machine Learning Approach

This study employed machine learning-based classification models to predict injury occurrence among adult field hockey players. The dependent variable was injury status, coded as 0 (no injury) or 1 (injury). Independent variables included continuous features (age, height, and weight) as well as categorical variables such as sex, team affiliation, playing experience, player level, warm-up duration, weekly training hours, and training days per week. All categorical variables were converted into a numerical format using LabelEncoder. To evaluate multicollinearity among the predictors, we conducted Pearson correlation and Variance Inflation Factor (VIF) analyses. No pair of predictors was strongly correlated (no |r| exceeded 0.8), and all VIF values were below 5 (Figure 2), indicating no problematic multicollinearity.
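For reference, the VIF threshold can be read through the standard definition of the statistic. The notation is an assumption of this note, since the text reports only the threshold: R_j^2 denotes the coefficient of determination obtained when predictor j is regressed on all remaining predictors.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}, \qquad \mathrm{VIF}_j < 5 \iff R_j^{2} < 0.8
```

Under this definition, a VIF below 5 means that less than 80% of the variance of a predictor is explained by the other predictors.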

The dataset included responses from 239 players, with the class distribution between injured and non-injured players being relatively balanced (approximately 51:49). Therefore, no over-sampling techniques such as SMOTE (Synthetic Minority Over-sampling Technique) were applied. Applying SMOTE in a balanced dataset may introduce artificial bias, distort the natural distribution, and compromise model interpretability [15]. To maintain the integrity of the data and avoid such potential distortions, we deliberately chose not to apply SMOTE.

The following three classification algorithms were applied: Logistic Regression, Random Forest, and XGBoost [16].

Logistic Regression (LR) estimates the probability of injury occurrence using a logistic (sigmoid) function:

P(y = 1 \mid X) = \frac{1}{1 + e^{-\beta^{T} X}}

It models the log-odds of the outcome as a linear function of the predictors.
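A minimal sketch of this mapping from log-odds to probability follows. The function name and the coefficient values are hypothetical and are not fitted values from this study. The intercept is written as a separate argument, which is equivalent to absorbing it into β with a constant feature.

```python
import math


def predict_injury_probability(coefficients, intercept, features):
    """Logistic regression: sigmoid of the linear log-odds beta^T x + b."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients for two features; not values from the study.
beta = [0.4, -0.2]
p_zero = predict_injury_probability(beta, 0.0, [0.0, 0.0])  # log-odds = 0
p_high = predict_injury_probability(beta, 0.0, [3.0, 1.0])  # log-odds = 1
```

A log-odds of 0 corresponds to a probability of 0.5, and each unit increase in a feature shifts the log-odds by its coefficient.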

Random Forest (RF) is an ensemble learning method that combines the predictions of multiple decision trees to enhance generalizability. The final prediction is based on majority voting:

\hat{y} = \operatorname{mode}\left(T_{1}(X), T_{2}(X), \dots, T_{K}(X)\right)
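The majority-vote rule can be sketched in a few lines. The helper name and the vote vector are made up for illustration, and this is not the authors' implementation.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the most common class label among the individual tree outputs."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical outputs of K = 5 trees for one player (1 = injured, 0 = not).
votes = [1, 0, 1, 1, 0]
prediction = majority_vote(votes)
```

Some implementations, such as scikit-learn's `RandomForestClassifier`, average the class probabilities of the trees instead of counting hard votes, so the two rules can disagree on borderline cases.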

XGBoost (Extreme Gradient Boosting) builds an additive model in a forward stage-wise manner by minimizing a regularized loss function. The prediction at stage t is as follows:

\hat{y}^{(t)} = \sum_{k=1}^{t} f_{k}(X), \quad f_{k} \in \mathcal{F}
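The regularized loss is not written out in the text. As a reference point, the objective in the standard XGBoost formulation has the form below. The symbols l, Ω, γ, λ, T, and w are introduced here for illustration, and the text does not state which penalty settings the authors used.

```latex
\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\left(y_{i}, \hat{y}_{i}^{(t)}\right) + \sum_{k=1}^{t} \Omega(f_{k}), \qquad \Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^{2}
```

Here l is a differentiable loss (log loss for binary classification), T is the number of leaves in a tree, and w is the vector of leaf weights, so the penalty grows with tree size and with large leaf values.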

A fixed random seed was used to ensure consistency and comparability across models. Hyperparameters were selected based on commonly used defaults and internal validation, without an extensive parameter search. The final hyperparameter settings for each model are summarized in Table 1.

Model evaluation was conducted using 5-fold cross-validation via StratifiedKFold. For each model, the performance metrics—including accuracy, area under the curve (AUC), precision, recall, and F1-score—were calculated and reported as a mean ± standard deviation.
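The evaluation procedure can be illustrated with a standard-library stand-in. The authors used StratifiedKFold. The round-robin splitter below is a simplified substitute rather than that algorithm, and all function names, the seed, and the use of the population standard deviation are assumptions of this sketch.

```python
import random
import statistics


def stratified_folds(labels, k=5, seed=42):
    """Assign sample indices to k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)      # deal each class out round-robin
    return folds


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def mean_sd(values):
    """Mean and population standard deviation of per-fold scores."""
    return statistics.mean(values), statistics.pstdev(values)


# Class sizes reported in the paper: 122 injured, 117 non-injured.
labels = [1] * 122 + [0] * 117
folds = stratified_folds(labels, k=5)
injured_per_fold = [sum(labels[i] for i in fold) for fold in folds]

# Toy check of the metric helper on a hand-made prediction vector.
toy = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

In a full run, a model would be fitted on four folds and scored on the fifth, and `mean_sd` would be applied to the five per-fold values of each metric to obtain the mean ± standard deviation format used in the tables.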

2.4. SHAP Analysis

To visualize feature contributions, several SHAP plots were generated: a violin plot (summary plot), waterfall plot (individual prediction explanation), and force plot (feature impact relative to the base value). SHAP values were computed across each fold and aggregated to derive the overall contribution of each feature, enabling both quantitative and visual interpretation. Although SHAP analysis can be applied to all models, we selected Logistic Regression for final interpretation based on both its predictive performance and explainability. Among the three models tested, Logistic Regression achieved the highest recall and AUC, which were the most relevant metrics for our goal of identifying at-risk individuals. Furthermore, SHAP values derived from linear models are often more stable and globally interpretable, especially when using small-to-medium-sized datasets like ours [17].

We also cross-validated the consistency of SHAP-based feature importance across models. Training time, playing career, and team affiliation were among the top-ranked predictors in all three models (Logistic Regression, Random Forest, and XGBoost), although their exact order and magnitude varied slightly. This consistency supports the robustness of our findings and justifies the use of Logistic Regression for explainability. All analyses were performed in the Google Colab environment using Python (version 3.11.13).
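For a linear model, SHAP values have a closed form when features are treated as independent: each contribution is the coefficient times the deviation of the feature from its average. The sketch below illustrates this and the additivity property in log-odds space. The coefficients, means, and player values are hypothetical, and this is not the authors' pipeline.

```python
def linear_shap_values(coefficients, x, background_means):
    """SHAP values of a linear model under feature independence:
    phi_j = beta_j * (x_j - E[x_j])."""
    return [b * (xj - mu)
            for b, xj, mu in zip(coefficients, x, background_means)]


def log_odds(coefficients, intercept, x):
    """Linear model output before the sigmoid."""
    return intercept + sum(b * xj for b, xj in zip(coefficients, x))


# Hypothetical model and data (not fitted coefficients from this study).
beta = [0.5, -0.3, 0.1]
intercept = -0.2
means = [1.5, 3.0, 70.0]      # average feature values in the background data
player = [2.0, 3.0, 76.0]

phi = linear_shap_values(beta, player, means)
base_value = log_odds(beta, intercept, means)   # expected model output
reconstructed = base_value + sum(phi)           # equals the player's log-odds
```

The base value plus the sum of the contributions reproduces the model output for the player, which is the relationship that waterfall and force plots display.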

2.5. Statistical Analysis

To analyze group differences based on injury status (non-injured vs. injured), continuous variables (Age, Height, and Weight) were first tested for normality using the Shapiro–Wilk test and for homogeneity of variance using Levene’s test. After confirming that both assumptions were met, independent t-tests were conducted. For categorical variables (Sex, Group, Career, Level, Warm-up, Training Time, and Training Day), chi-square tests were used to examine differences in distribution between the two groups. All statistical analyses were performed in the Google Colab environment using Python libraries such as scipy, pandas, and statsmodels. The significance level was set at α = 0.05.
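As an illustration of the categorical comparison, the sketch below computes the Pearson chi-square statistic for a contingency table using only the standard library. The cell counts are invented: only the row and column totals (120/119 and 117/122) match the reported sample. The helper names are hypothetical, and the p-value shortcut holds only for one degree of freedom.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


def p_value_one_dof(chi2):
    """Upper-tail p-value for a chi-square statistic with exactly 1 dof."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical 2 x 2 counts (rows: male/female; columns: non-injured/injured).
table = [[61, 59], [56, 63]]
chi2, dof = chi_square_statistic(table)
p = p_value_one_dof(chi2)
```

No continuity correction is applied here. SciPy's `chi2_contingency` applies Yates' correction to 2 × 2 tables by default, so its p-value for the same counts would be somewhat larger.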

Although power analysis is not commonly applied in machine learning studies, we assessed the adequacy of the sample size by examining both model stability (via 5-fold cross-validation) and effect sizes from the group comparisons in Table 2. Based on the significant difference in playing experience (Career; p = 0.014) between the injured and non-injured groups, a post hoc power analysis using G*Power (version 3.1.9.7; effect size = 0.3, α = 0.05, total n = 239) yielded a power of 0.91, indicating an adequate sample size.

3. Results

3.1. Descriptive Statistics and Group Comparison

In this study, the 239 adult field hockey players were divided into two groups based on injury experience: the non-injured group (n = 117) and the injured group (n = 122). Group differences in key variables were analyzed (Table 2). Among the continuous variables, no statistically significant differences were found between the two groups in terms of age, height, or weight (Age: p = 0.4438; Height: p = 0.4298; Weight: p = 0.1858).

On the other hand, several categorical variables showed significant differences between groups. Playing experience (Career: p = 0.014), player level (Level: p = 0.023), weekly training hours (Training Time: p = 0.046), and weekly training days (Training Day: p = 0.017) differed significantly by injury status. These findings suggest a potential association between injury occurrence and variables related to training volume and athletic background. In contrast, no significant differences were found in terms of sex (Sex: p = 0.476), team affiliation (Group: p = 1.000), or warm-up duration (Warm-up: p = 0.214).

3.2. Performance of Machine Learning Models

To predict injury occurrence among field hockey players, the performance of three models—Logistic Regression, Random Forest, and XGBoost—was compared (Table 3). The analysis showed that the Logistic Regression model achieved the best performance in terms of recall for the injured class (1) (0.6810 ± 0.0983), F1-score (0.6260 ± 0.0499), and AUC (0.6515 ± 0.0393). In contrast, the Random Forest model demonstrated relatively lower overall accuracy (0.5316 ± 0.0798) and AUC (0.6144 ± 0.0627), and its recall for injured players was also limited (0.5650 ± 0.0600), indicating reduced effectiveness in detecting actual injury cases. The XGBoost model showed the lowest overall performance, with both its accuracy (0.5777 ± 0.0934) and its AUC (0.5973 ± 0.0799) being the lowest among the three models.

According to the 5-fold cross-validation results of the Logistic Regression model, which demonstrated the best overall performance (Table 4), the binary classification model achieved an average accuracy of 0.5902 (±0.0366), average precision of 0.5863 (±0.0360), average recall of 0.6810 (±0.0983), average F1-score of 0.6260 (±0.0499), and average AUC of 0.6515 (±0.0393). Notably, the relatively low standard deviations in accuracy and precision (±0.0366 and ±0.0360, respectively) indicate that the model performance was stable across different data splits. In addition, the AUC of approximately 0.65 indicates modest discriminative power, above the chance level of 0.5, and the average recall of 0.6810 for the injured class (1) indicates a satisfactory level of sensitivity.
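To make the AUC figure concrete, it can be read as the probability that a randomly chosen injured player receives a higher predicted score than a randomly chosen non-injured player. The pairwise sketch below computes AUC in exactly this way. The function name and the toy scores are illustrative and are not study data.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly
    (ties count as half a correct pair)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy predicted probabilities for six players (1 = injured); not study data.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(y_true, scores)
```

On this reading, an AUC of about 0.65 means that roughly 65% of injured and non-injured pairs are ordered correctly, compared with 50% for random scoring.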

3.3. Interpretation of Feature Importance Using SHAP

According to the SHAP summary plot (Figure 3A), the Group variable had the strongest influence on the model’s output. In particular, belonging to a specific group (e.g., semi-professional team) was associated with higher SHAP values in the positive direction, indicating an increased likelihood of injury (i.e., a higher predicted probability of the positive class) for that group. This was followed by Training Time and Weight, which also showed substantial influence. Overall, the distribution of SHAP values across variables showed relatively low variance, suggesting stable contribution patterns. In contrast, variables such as Warm-up, Height, and Sex had SHAP values mostly near zero with narrow distributions, indicating their limited impact on the model’s predictions.

As shown in the SHAP bar plot of mean absolute contributions (Figure 3B), Group, Training Time, Weight, and Level emerged as the top-ranking features in terms of mean absolute SHAP value. On average, these variables shifted the model output (log-odds) by approximately 0.2 to 0.3 in either direction, reflecting a relatively strong predictive influence.
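The global ranking in a SHAP bar plot is obtained by averaging the absolute SHAP values of each feature over all samples. A minimal sketch of that aggregation follows. The matrix of SHAP values is hypothetical and the helper name is an assumption.

```python
def mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across samples."""
    n = len(shap_matrix)
    importance = {
        name: sum(abs(row[j]) for row in shap_matrix) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values (log-odds units): three players, three features.
features = ["Group", "Training Time", "Warm-up"]
shap_matrix = [
    [0.30, -0.20, 0.01],
    [-0.25, 0.25, -0.02],
    [0.35, -0.15, 0.00],
]
ranking = mean_abs_shap(shap_matrix, features)
```

Because absolute values are averaged, the ranking reflects how strongly a feature moves predictions, not whether it raises or lowers the predicted injury risk. The direction is read from the summary plot instead.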

In the SHAP waterfall plot (Figure 3C), the example sample showed that Training Day = 3 had the largest negative contribution (−0.96), lowering the model’s prediction substantially. On the other hand, Level = 2 contributed positively (+0.47), raising the prediction, while Weight = 76, Age = 34, and Group = 1 showed minor positive or negative contributions. The waterfall plot structure allows for a clear quantitative explanation of both the direction and magnitude of each variable’s effect.

The SHAP force plot (Figure 3D) visually illustrates how the final model output is derived from the base value (e.g., average prediction) through the cumulative contributions of individual variables. For instance, in the given example, Training Day and Group contributed negatively, while Level and Training Time had positive effects, ultimately shifting the prediction lower than the baseline. This suggests a higher probability of being classified into the negative class (i.e., no injury) for that particular sample.

4. Discussion

This study focused not on developing a top-performing injury prediction model for field hockey players, but rather on demonstrating that a predictive model can be constructed using only simple, survey-based variables, and on identifying key factors that influence injury occurrence. To achieve this, three machine learning models (Logistic Regression, Random Forest, and XGBoost) were compared and evaluated. Given the importance of detecting injured cases (the positive class), model selection was primarily based on recall and AUC.

The performance comparison showed that the Logistic Regression model achieved the best results in terms of recall for injured players (0.6810 ± 0.0983), F1-score (0.6260 ± 0.0499), and AUC (0.6515 ± 0.0393). Similarly, Leckey et al. [18] reported that among Logistic Regression, Random Forest, and XGBoost models, Logistic Regression demonstrated the highest recall and AUC under imbalanced data conditions. This finding aligns with the descriptive review by Ruddy et al. [5], which noted that although tree-based models outperformed others in about 60% of cases, Logistic Regression outperformed machine learning techniques in 4 out of 12 studies. Meanwhile, Random Forest showed the highest overall accuracy (0.6138 ± 0.0300), but its low recall (0.4444 ± 0.0675) suggests a high likelihood of missing actual injury cases, limiting its practical applicability in field settings. Similarly, Jauhiainen et al. [10], in a study predicting ACL injury among female elite athletes, reported that although all evaluated machine learning models outperformed chance, their predictive performance was modest, with the highest AUC-ROC of 0.63 achieved by linear support vector machine. These results suggest that model complexity may actually hinder predictive accuracy in the context of imbalanced datasets. This study prioritized recall (sensitivity) over precision when selecting the final model, as the primary goal was to maximize the detection of injury cases for early intervention. This recall-oriented strategy reflects a preventive approach, where capturing as many true positives as possible is considered more valuable than minimizing false positives in field settings.

According to the 5-fold cross-validation results (Section 3.2, Table 4), the Logistic Regression model achieved an average accuracy of 0.5902 ± 0.0366, precision of 0.5863 ± 0.0360, recall of 0.6810 ± 0.0983, F1-score of 0.6260 ± 0.0499, and AUC of 0.6515 ± 0.0393. Notably, the relatively low standard deviations in accuracy and precision (±0.0366 and ±0.0360, respectively) indicate a consistent predictive performance that is not heavily affected by how the data are split. This suggests that the model could provide a reliable performance under various sampling conditions in real-world applications. In relation to this, Jauhiainen et al. [10] reported that repeated cross-validation can reduce performance variability and contribute to more robust prediction. Rossi et al. [19] also emphasized that reporting both the mean and standard deviation of performance metrics during cross-validation is crucial for enhancing the reliability and practical applicability of a model. Therefore, the low standard deviations observed in this study support the conclusion that the model can maintain a consistent performance across different sampling scenarios, reinforcing its potential utility as a trustworthy injury prediction tool in field settings.

In the SHAP analysis conducted to enhance model interpretability, the summary plot revealed that variables such as Group, Training Time, Weight, and Level were the most influential global features contributing to the model’s predictions. Notably, players affiliated with a specific team type (e.g., semi-professional teams), those with longer training durations, and those with a higher body weight were more likely to be predicted as injured. In contrast, variables such as Height, Sex, and Warm-up had limited influence on the model’s predictions.

Using waterfall and force plots, the cumulative impact of key variables on the final prediction for individual athletes was visualized step by step, allowing for a transparent and detailed interpretation of the model’s decision-making process.

These findings offer several implications. First, the results demonstrate that Logistic Regression, a relatively simple and interpretable model, can satisfy the requirements of predictive performance, consistency, and explainable AI. Furthermore, the key variables identified through SHAP analysis can serve as practical indicators for injury prevention strategies, such as enhanced monitoring of less experienced players and risk assessments based on team and training plans.

In previous field hockey studies, objective indicators such as GPS-based training load and high-intensity repetitive movements have been reported to be significantly associated with injury risk [20]. In disciplines such as middle- and long-distance running, machine learning models utilizing these indicators have achieved AUC values exceeding 0.70 [21]. Furthermore, other prior studies have demonstrated that multimodal machine learning models combining GPS, wearable sensor data, physiological metrics, and injury history can reach AUC values between 0.70 and 0.80 or higher [19,20]. For example, Rossi et al. [22] improved model interpretability and practical utility by using GPS-based training data, while Ye et al. [9] achieved an AUC of over 0.89 by combining time series image encoding with deep learning. Huang et al. [23] also found that models integrating multimodal data showed significantly better predictive performance than single-modal models.

In addition, recent advances in technologies such as wearable biosensors based on nanomaterials [24,25], ultrasensitive biochemical sensors [26], and energy-efficient health monitoring systems [27] have opened new possibilities for injury prediction and personalized healthcare monitoring. In fact, a machine learning-based injury prediction model utilizing wearable biosignal data has also been reported [28]. In contrast, this study used only 10 self-reported variables collected through surveys, yet achieved an AUC of 0.65, and demonstrated even better sensitivity (recall) in detecting injured players than some multimodal data-based models. These findings suggest that while integrating diverse data sources can improve performance, sensitive injury prediction may still be achievable in practice using only simple survey-based data, especially when access to high-end sensing systems is limited.

This study has several limitations. First, due to the cross-sectional nature of the data, it is not possible to infer causal relationships between predictors and injury occurrence. For instance, reduced training time may be a consequence of injury, rather than a contributing risk factor. Longitudinal (cohort) studies are needed to clarify such causal directions. Furthermore, survivor bias is unlikely to influence the current findings, as the outcome variable was strictly defined as injury occurring within the previous 12 months, and athletes who retired due to earlier injuries were not included in the sample. However, this limitation should be carefully addressed in future longitudinal studies. Second, the use of self-reported questionnaire data may introduce recall bias, as responses rely on the participants’ memory rather than objective measurement. Variables such as warm-up time, injury occurrence, and weekly training volume may be particularly affected by this limitation. In particular, the limited design of the warm-up variable, which did not consider warm-up type, intensity, or timing, may have contributed to its weak predictive value in both statistical and SHAP analyses, despite its established importance in sports medicine. Third, the number of variables included in the model was limited, which restricts the ability to account for several known risk factors. Important predictors such as training intensity (e.g., RPE), previous injury history, psychological stress (e.g., GAD-7), and sleep quality (e.g., PSQI) were not included in this analysis. Lastly, the sample consisted of 239 Korean university and semi-professional field hockey players recruited nationwide, which may limit the generalizability of the findings to athletes from other sports, regions, or competitive levels. Although 5-fold cross-validation was applied to mitigate the risk of overfitting, further studies with larger and more diverse samples are needed to enhance external validity and model robustness.

Future studies should also consider adopting a longitudinal (cohort) design, integrating diverse data sources such as wearable sensors and EMR, and optimizing UI/UX based on user feedback to enhance the model’s real-world applicability and accuracy. Wearable-based monitoring and multi-modal data integration may also be particularly beneficial for continuous tracking and large-scale deployment. This approach may also help address issues such as cumulative exposure or survivor bias, which are difficult to capture in cross-sectional designs.

5. Conclusions

This study developed an injury prediction model using only 10 survey-based variables and demonstrated that even with simple input features, meaningful predictive performance could be achieved—Logistic Regression analysis yielded a recall of 0.6810 and an AUC of 0.6515. Through 5-fold cross-validation, the standard deviations of overall accuracy and precision were both within ±0.04, indicating the model’s consistent predictive performance across various sampling conditions. SHAP analysis identified Group, Training Time, Weight, and Level as key predictors, suggesting their potential utility in developing personalized injury prevention strategies that reflect individual athlete characteristics and training patterns. In conclusion, this study shows that injury risk can be effectively predicted using only simple survey-based variables, while SHAP-based interpretation ensures both the transparency and the practical applicability of the model. These findings provide a foundation for advancing injury prevention and player management in field hockey and other team sports.

Author Contributions

Conceptualization, K.L. (Kihyuk Lee) and M.C.; methodology, M.C.; investigation, M.C. and K.L. (Kumju Lee); data curation, M.C. and K.L. (Kumju Lee); writing—original draft preparation, M.C.; writing—review and editing, K.L. (Kihyuk Lee) and M.C.; visualization, K.L. (Kihyuk Lee); supervision, M.C.; project administration, K.L. (Kihyuk Lee) and M.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study analyzed de-identified, non-sensitive, self-reported survey data. All participants voluntarily consented after being fully informed of the study’s purpose. No sensitive or personally identifiable information was collected, and all responses were anonymized to ensure confidentiality. The level of risk was minimal. The study was conducted in accordance with the ethical principles of the Declaration of Helsinki and, under Articles 2.1 and 15.1 of the South Korean Life Ethics and Safety Act, qualifies for exemption from IRB review.

Informed Consent Statement

Informed consent has been obtained from the participants for the publication of our study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We thank all the subjects for participating in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Gabbett, T.J. The training-injury prevention paradox: Should athletes be training smarter and harder? Br. J. Sports Med. 2016, 50, 273–280.
Murtaugh, K. Injury patterns among female field hockey players. Med. Sci. Sports Exerc. 2001, 33, 201–207.
Ekstrand, J.; Hägglund, M.; Waldén, M. Injury incidence and injury patterns in professional football: The UEFA injury study. Br. J. Sports Med. 2011, 45, 553–558.
Bahr, R.; Krosshaug, T. Understanding injury mechanisms: A key component of preventing injuries in sport. Br. J. Sports Med. 2005, 39, 324–329.
Ruddy, J.D.; Cormack, S.J.; Whiteley, R.; Williams, M.D.; Timmins, R.G.; Opar, D.A. Modeling the risk of team sport injuries: A narrative review of different statistical approaches. Front. Physiol. 2019, 10, 829.
Van Eetvelde, H.; Mendonça, L.D.; Ley, C.; Seil, R.; Tischer, T. Machine learning methods in sport injury prediction and prevention: A systematic review. J. Exp. Orthop. 2021, 8, 27.
Mason, J.; Wellmann, K.; Groll, A.; Braumann, K.-M.; Junge, A.; Hollander, K.; Zech, A. Game exposure, player characteristics, and neuromuscular performance influence injury risk in professional and youth field hockey players. Orthop. J. Sports Med. 2021, 9, 1–9.
Veiga, G.; Torres, G.; Maposa, I. Association of the acute:chronic workload ratio and wellness scores in premier league male hockey players. S. Afr. J. Sports Med. 2021, 33, 1–7.
Ye, X.; Huang, Y.; Bai, Z.; Wang, Y. A novel approach for sports injury risk prediction: Based on time-series image encoding and deep learning. Front. Physiol. 2023, 14, 1174525.
Jauhiainen, S.; Kauppi, J.-P.; Krosshaug, T.; Bahr, R.; Bartsch, J.; Äyrämö, S. Predicting ACL injury using machine learning on data from an extensive screening test battery of 880 female elite athletes. Am. J. Sports Med. 2022, 50, 2917–2924.
Rhon, D.I.; Teyhen, D.S.; Collins, G.S.; Bullock, G.S. Predictive models for musculoskeletal injury risk: Why statistical approach makes all the difference. BMJ Open Sport Exerc. Med. 2022, 8, e001388.
Choi, M.; Lee, K. Training-related sports injury patterns among elite middle and high school field hockey players in Korea. Sports 2025, 13, 117.
Chung, J.-W.; Song, H.-S.; Kim, E.-H.; Cho, J.-H.; Park, J.-Y.; Lee, K.-H. Incidence of sports injury in middle and high school fencers by gender, grade and type during training. Korean Acad. Kinesiol. 2017, 19, 65–72.
Oh, S.W.; Lee, Y.K.; Lee, K.H. Incidence of sports injury according to the gender and age group of Korean elite handball players. J. Coach. Dev. 2022, 24, 208–217.
Sakho, A.; Malherbe, E.; Scornet, E. Do we need rebalancing strategies? A theoretical and empirical study around SMOTE and its variants. arXiv 2024, arXiv:2402.03819.
Chehreh Chelgani, S.; Fatahi, R.; Pournazari, A.; Nasiri, H. Modeling energy consumption indexes of an industrial cement ball mill for sustainable production. Sci. Rep. 2025, 15, 18514.
Roberts, C.V.; Elahi, E.; Chandrashekar, A. On the bias-variance characteristics of LIME and SHAP in high sparsity movie recommendation explanation tasks. arXiv 2022, arXiv:2206.04784.
Leckey, C.; Van Dyk, N.; Doherty, C.; Lawlor, A.; Delahunt, E. Machine learning approaches to injury risk prediction in sport: A scoping review with evidence synthesis. Br. J. Sports Med. 2025, 59, 491–500.
Rossi, A.; Pappalardo, L.; Cintia, P. A narrative review for a machine learning application in sports: An example based on injury forecasting in soccer. Sports 2021, 10, 5.
Kim, T.; Park, J.-C.; Park, J.-M.; Choi, H. Optimal relative workload for managing low-injury risk in lower extremities of female field hockey players: A retrospective observational study. Medicine 2021, 100, e27643.
Lövdal, S.S.; Den Hartigh, R.J.; Azzopardi, G. Injury prediction in competitive runners with machine learning. Int. J. Sports Physiol. Perform. 2021, 16, 1522–1531.
Rossi, A.; Pappalardo, L.; Cintia, P.; Iaia, F.M.; Fernández, J.; Medina, D. Effective injury forecasting in soccer with GPS training data and machine learning. PLoS ONE 2018, 13, e0201264.
Huang, Y.; Huang, S.; Wang, Y.; Li, Y.; Gui, Y.; Huang, C. A novel lower extremity non-contact injury risk prediction model based on multimodal fusion and interpretable machine learning. Front. Physiol. 2022, 13, 937546.
Wang, Y.; Gao, Z.; Wu, W.; Xiong, Y.; Luo, J.; Sun, Q.; Mao, Y.; Wang, Z.L. TENG-Boosted Smart Sports with Energy Autonomy and Digital Intelligence. Nano-Micro Lett. 2025, 17, 265.
Sun, F.; Zhu, Y.; Jia, C.; Wen, Y.; Zhang, Y.; Chu, L.; Zhao, T.; Liu, B.; Mao, Y. Deep-Learning-Assisted Neck Motion Monitoring System Self-Powered Through Biodegradable Triboelectric Sensors. Adv. Funct. Mater. 2024, 34, 2310742.
Liu, D.; Wen, Y.; Xie, Z.; Zhang, M.; Wang, Y.; Feng, Q.; Cheng, Z.; Lu, Z.; Mao, Y.; Yang, H. Self-Powered, Flexible, Wireless and Intelligent Human Health Management System Based on Natural Recyclable Materials. ACS Sens. 2024, 9, 6236–6246.
Zhu, Y.; Zhao, T.; Sun, F.; Jia, C.; Ye, H.; Jiang, Y.; Wang, K.; Huang, C.; Xie, Y.; Mao, Y. Multi-Functional Triboelectric Nanogenerators on Printed Circuit Board for Metaverse Sport Interactive System. Nano Energy 2023, 113, 108520.
Wen, Y.; Zhang, M.; Xie, Z.; An, Z.; Liu, B.; Sun, F.; Zhao, T.; Yu, Z.; Wang, F.; Mao, Y. Intelligent Interaction System Based on Multimodal Conformal Triboelectric Nanogenerator Patch for Disabled Sports and Life. Sci. China Technol. Sci. 2025, 68, 1221101.

Figure 1. AI-based injury risk prediction pipeline.

Figure 2. Correlation matrix of predictor variables used in injury prediction model. (A) Heatmap of Pearson correlation coefficients among predictor variables. (B) Mean VIF (Variance Inflation Factor) and Tolerance values for each variable. VIF and Tolerance are unitless statistical indicators for multicollinearity. Categorical variables were numerically encoded prior to analysis.
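
For readers unfamiliar with the two indicators in panel (B), the standard textbook definitions are given below. Here $R_j^2$ denotes the coefficient of determination obtained by regressing predictor $j$ on all remaining predictors; the symbol is introduced for illustration and is not the authors' notation, and the block does not describe how the authors computed the mean values.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{Tolerance}_j = 1 - R_j^2 = \frac{1}{\mathrm{VIF}_j}
```

A predictor that is uncorrelated with the others has $R_j^2 = 0$, so both indicators equal 1; VIF grows and Tolerance shrinks as collinearity increases.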

Figure 3. SHAP-based interpretation of feature contributions in Logistic Regression model. SHAP analysis results visualizing contribution of each variable to prediction of injury risk. (A) Summary plot showing distribution and direction of SHAP values for each feature, highlighting Group, Training Time, and Weight as most influential variables. (B) Mean absolute SHAP value bar plot indicating average magnitude of each feature’s impact on model output. (C) SHAP waterfall plot explaining how each feature contributes to predicted value for sample classified as non-injured. (D) SHAP force plot showing cumulative contribution of each feature toward final prediction relative to model’s base value.
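
The additive reading of panels (C) and (D) can be sketched for a linear model as follows. When features are treated as independent, the SHAP value of feature j for a logistic regression, in log-odds units, is its coefficient times the feature's deviation from its mean, and the base value plus all contributions equals the model output. Every number below (intercept, coefficients, feature means, and the example player) is a hypothetical placeholder, not a fitted value from this study.

```python
import math

# Hypothetical coefficients and feature means (NOT the fitted values from the paper).
intercept = 0.10
coef = {"Group": -0.40, "Training Time": 0.35, "Weight": -0.02, "Level": 0.30}
feature_mean = {"Group": 1.5, "Training Time": 3.4, "Weight": 65.5, "Level": 2.0}


def log_odds(x):
    """Linear predictor of the logistic regression (log-odds of injury)."""
    return intercept + sum(coef[name] * x[name] for name in coef)


def linear_shap(x):
    """SHAP values of a linear model under feature independence:
    phi_j = coef_j * (x_j - mean_j), expressed in log-odds units."""
    return {name: coef[name] * (x[name] - feature_mean[name]) for name in coef}


# Base value: model output at the mean feature vector.
base_value = log_odds(feature_mean)

# One hypothetical player, using the category codes from Table 2.
player = {"Group": 2, "Training Time": 4, "Weight": 70.0, "Level": 3}
phi = linear_shap(player)

# Additivity: base value plus the sum of contributions equals the model output.
reconstructed = base_value + sum(phi.values())
probability = 1.0 / (1.0 + math.exp(-reconstructed))
```

A waterfall or force plot is a visual stacking of the entries of `phi` on top of `base_value`; positive entries push the prediction toward "injured" and negative entries push it away.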

Table 1. Hyperparameter settings for each model.

Model	Hyperparameter	Description/Role	Value
Logistic Regression	penalty	Regularization type	l2
	solver	Optimization algorithm	lbfgs
	max_iter	Maximum number of iterations	1000
	random_state	Random seed for reproducibility	7777
Random Forest	n_estimators	Number of trees	100
	max_depth	Maximum depth of trees	None
	criterion	Splitting criterion	gini
	random_state	Random seed for reproducibility	7777
XGBoost	n_estimators	Number of boosting rounds	100
	max_depth	Maximum depth of each tree	3
	learning_rate	Step size shrinkage	0.1
	use_label_encoder	Disable legacy encoder	False
	eval_metric	Evaluation metric	logloss
	random_state	Random seed for reproducibility	7777

Table 2. Descriptive statistics and group comparisons by injury status (non-injured vs. injured).

Variable	Not Injured (n = 117)	Injured (n = 122)	p-Value
Age	23.44 ± 4.73	23.90 ± 4.66	0.4438
Height	169.93 ± 8.33	169.08 ± 8.24	0.4298
Weight	66.50 ± 13.16	64.52 ± 9.52	0.1858
Sex—1	62 (51.67%)	58 (48.33%)	0.476
Sex—2	55 (46.22%)	64 (53.78%)	0.476
Group—1	53 (48.62%)	56 (51.38%)	1.000
Group—2	64 (49.23%)	66 (50.77%)	1.000
Career—1	-	-	0.014
Career—2	1 (100.00%)	0 (0.00%)
Career—3	7 (100.00%)	0 (0.00%)
Career—4	8 (66.67%)	4 (33.33%)
Career—5	101 (46.12%)	118 (53.88%)
Level—1	37 (60.66%)	24 (39.34%)	0.023
Level—2	59 (49.58%)	60 (50.42%)
Level—3	21 (35.59%)	38 (64.41%)
Warm-up—1	-	-	0.214
Warm-up—2	2 (100.00%)	0 (0.00%)
Warm-up—3	20 (58.82%)	14 (41.18%)
Warm-up—4	48 (44.04%)	61 (55.96%)
Warm-up—5	36 (53.73%)	31 (46.27%)
Warm-up—6	11 (40.74%)	16 (59.26%)
Training Time—1	1 (100.00%)	0 (0.00%)	0.046
Training Time—2	32 (65.31%)	17 (34.69%)
Training Time—3	23 (50.00%)	23 (50.00%)
Training Time—4	58 (43.94%)	74 (56.06%)
Training Time—5	3 (27.27%)	8 (72.73%)
Training Day—1	1 (100.00%)	0 (0.00%)	0.017
Training Day—2	1 (100.00%)	0 (0.00%)
Training Day—3	8 (100.00%)	0 (0.00%)
Training Day—4	1 (100.00%)	0 (0.00%)
Training Day—5	106 (46.49%)	122 (53.51%)

Continuous variables are presented as mean ± SD; categorical variables as n (%). Group differences were tested using t-tests and chi-square tests. Statistical significance was considered at p < 0.05. Group = 1: university, 2: semi-professional; Career = 1–5: <1 yr to >6 yrs; Level = 1: general, 2: candidate, 3: former national; Warm-up, Training Time, Training Day are coded in ascending order.
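
As a worked example of the chi-square test mentioned in the note, the sketch below recomputes the test for Level from the counts in Table 2 using only the Python standard library. It is an illustration, not the authors' analysis script; the closed-form p-value used here is valid only because this table has exactly 2 degrees of freedom.

```python
import math

# Level x injury counts from Table 2: rows are Level 1-3,
# columns are (not injured, injured).
observed = [
    [37, 24],
    [59, 60],
    [21, 38],
]


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    return stat


chi2 = chi_square_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)  # (3 - 1) * (2 - 1) = 2

# For exactly 2 degrees of freedom the chi-square survival function
# has the closed form exp(-x / 2), so no statistics library is needed.
p_value = math.exp(-chi2 / 2)
```

With these counts the statistic is about 7.58 and the p-value rounds to 0.023, which agrees with the value reported for Level.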

Table 3. Classification performance of machine learning models for injury prediction.

Class	Precision	Recall	F1-Score
Logistic Regression
Non-injured (0)	0.6048 ± 0.0528	0.4967 ± 0.0891	0.5395 ± 0.0498
Injured (1)	0.5863 ± 0.0360	0.6810 ± 0.0983	0.6260 ± 0.0499
Macro average	0.5956 ± 0.0391	0.5889 ± 0.0346	0.5828 ± 0.0358
Weighted average	0.5955 ± 0.0387	0.5902 ± 0.0366	0.5834 ± 0.0366
Accuracy	0.5902 ± 0.0366
AUC	0.6515 ± 0.0393
Random Forest
Non-injured (0)	0.5175 ± 0.0858	0.4964 ± 0.1133	0.5058 ± 0.0992
Injured (1)	0.5432 ± 0.0767	0.5650 ± 0.0600	0.5530 ± 0.0664
Macro average	0.5304 ± 0.0811	0.5307 ± 0.0795	0.5294 ± 0.0810
Weighted average	0.5307 ± 0.0812	0.5316 ± 0.0798	0.5300 ± 0.0812
Accuracy	0.5316 ± 0.0798
AUC	0.6144 ± 0.0627
XGBoost
Non-injured (0)	0.5749 ± 0.1022	0.5141 ± 0.1062	0.5425 ± 0.1042
Injured (1)	0.5797 ± 0.0880	0.6390 ± 0.0812	0.6077 ± 0.0846
Macro average	0.5773 ± 0.0946	0.5766 ± 0.0931	0.5751 ± 0.0944
Weighted average	0.5775 ± 0.0946	0.5777 ± 0.0934	0.5757 ± 0.0945
Accuracy	0.5777 ± 0.0934
AUC	0.5973 ± 0.0799

Values are reported as mean ± standard deviation across 5-fold cross-validation. Standard deviations reflect performance variability and serve as a practical alternative to formal confidence intervals in small-sample machine learning studies. Macro and weighted averages are calculated across classes. Accuracy and AUC are listed as overall model performance.
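
The per-class metrics and the two averaging schemes named in the note can be sketched from a single confusion matrix. The counts below are hypothetical: the paper does not report confusion matrices. They were chosen so that the injured-class values match Fold 1 in Table 4 (precision 0.5556, recall 0.6250, F1 0.5882, accuracy 0.5625), under the assumption of a 48-player test fold with 24 injured players.

```python
# Hypothetical confusion-matrix counts for one 48-player test fold
# (24 injured, 24 non-injured); these are NOT reported in the paper.
tp, fn = 15, 9    # injured players predicted injured / non-injured
tn, fp = 12, 12   # non-injured players predicted non-injured / injured


def prf(true_pos, false_pos, false_neg):
    """Precision, recall, and F1 for one class."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Class 1 (injured) treats "injured" as positive; class 0 swaps the roles.
injured = prf(tp, fp, fn)
non_injured = prf(tn, fn, fp)

support = {"injured": tp + fn, "non_injured": tn + fp}
n = sum(support.values())

# Macro average: unweighted mean over classes.
macro = [(a + b) / 2 for a, b in zip(injured, non_injured)]
# Weighted average: mean over classes weighted by class support.
weighted = [
    (a * support["injured"] + b * support["non_injured"]) / n
    for a, b in zip(injured, non_injured)
]
accuracy = (tp + tn) / n
```

Support-weighted recall always equals accuracy, which is why those two rows carry identical values for every model in Table 3.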

Table 4. Logistic Regression results from 5-fold cross-validation.

	Accuracy	Precision	Recall	F1	AUC
Fold 1	0.5625	0.5556	0.6250	0.5882	0.5990
Fold 2	0.5417	0.5312	0.7083	0.6071	0.6528
Fold 3	0.6250	0.6129	0.7600	0.6786	0.7130
Fold 4	0.5833	0.6190	0.5200	0.5652	0.6226
Fold 5	0.6383	0.6129	0.7917	0.6909	0.6703
Mean	0.5902	0.5863	0.6810	0.6260	0.6515
Std	0.0366	0.0360	0.0983	0.0499	0.0393

AUC: area under ROC curve. Mean: average value calculated across 5 validation folds. Std: standard deviation across 5 folds, representing variability in performance. Values are reported to enhance interpretability and reliability of evaluation metrics.
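
The Mean and Std rows can be checked directly from the fold values. The sketch below does so for three of the five columns with the Python standard library; it is a verification aid, not the authors' code.

```python
from statistics import mean, pstdev

# Per-fold Logistic Regression metrics copied from Table 4.
folds = {
    "accuracy": [0.5625, 0.5417, 0.6250, 0.5833, 0.6383],
    "recall": [0.6250, 0.7083, 0.7600, 0.5200, 0.7917],
    "auc": [0.5990, 0.6528, 0.7130, 0.6226, 0.6703],
}

# pstdev divides by n (population standard deviation); the sample
# version (statistics.stdev) divides by n - 1 and gives larger values.
summary = {
    name: (round(mean(values), 4), round(pstdev(values), 4))
    for name, values in folds.items()
}
```

The population form reproduces the tabulated Std values (0.0366, 0.0983, and 0.0393), which suggests, but does not confirm, that the reported standard deviations use the n divisor, as NumPy's default `std` does.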

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

