Are Countermovement Jump Variables Indicators of Injury Risk in Professional Soccer Players? A Machine Learning Approach

Pérez-Contreras, Jorge; Villaseca-Vicuña, Rodrigo; Loro-Ferrer, Juan Francisco; Inostroza-Ríos, Felipe; Brito, Ciro José; Cerda-Kohler, Hugo; Bustamante-Garrido, Alejandro; Muñoz-Hinrichsen, Fernando; Hermosilla-Palma, Felipe; Ulloa-Díaz, David; Merino-Muñoz, Pablo; Aedo-Muñoz, Esteban

doi:10.3390/app152312721


by Jorge Pérez-Contreras ¹,², Rodrigo Villaseca-Vicuña ³, Juan Francisco Loro-Ferrer ⁴, Felipe Inostroza-Ríos ⁵, Ciro José Brito ⁵, Hugo Cerda-Kohler ⁶, Alejandro Bustamante-Garrido ², Fernando Muñoz-Hinrichsen ⁷,⁸, Felipe Hermosilla-Palma ⁹, David Ulloa-Díaz ¹⁰, Pablo Merino-Muñoz ¹¹,¹²,* and Esteban Aedo-Muñoz ¹³,*

¹ Departamento de Ciencias Clínicas, Facultad de Ciencias de la Salud, Escuela de Doctorado de La Universidad de Las Palmas de Gran Canaria (EDULPGC), 35016 Las Palmas, Spain

² Escuela de Ciencias del Deporte y Actividad Física, Facultad de Salud, Universidad Santo Tomás, Santiago 8370003, Chile

³ School of Educational Sciences and Technology, Physical Education Pedagogy, Faculty of Education, Universidad Católica Silva Henríquez, Santiago 8280354, Chile

⁴ Departamento de Ciencias Clínicas, Universidad de Las Palmas de Gran Canaria, 35016 Las Palmas, Spain

⁵ Departamento de Educação Física, Instituto de Ciências da Vida, Universidade Federal de Juiz de Fora, Governador Valadares 35010-180, Brazil

⁶ Departamento de Educación Física, Deportes y Recreación, Facultad de Artes y Educación Física, Universidad Metropolitana de Ciencias de la Educación, Santiago 7760197, Chile

⁷ Laboratorio de Actividad Física, Salud y Rendimiento Humano, Departamento de Kinesiología, Universidad Metropolitana de Ciencias de la Educación, Santiago 7501173, Chile

⁸ Centro de Desarrollo de la Investigación (CEDI), Universidad Metropolitana de Ciencias de la Educación, Santiago 7501173, Chile

⁹ Pedagogía en Educación Física, Facultad de Educación, Universidad Autónoma de Chile, Talca 3460000, Chile

¹⁰ Department of Sports Sciences and Physical Conditioning, Universidad Católica de la Santísima Concepción, Concepción 4030000, Chile

¹¹ Núcleo de Investigación en Ciencias de la Motricidad Humana, Universidad Adventista de Chile, Chillán 3780000, Chile

¹² Programa de Engenharia Biomédica, Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia (COPPE), Universidade Federal do Rio de Janeiro, Rio de Janeiro 21941-853, Brazil

¹³ Escuela de Ciencias de la Actividad Física, El Deporte y la Salud, Facultad de Ciencias Médicas, Universidad de Santiago de Chile, Santiago 8370003, Chile


* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(23), 12721; https://doi.org/10.3390/app152312721

Submission received: 12 October 2025 / Revised: 21 November 2025 / Accepted: 26 November 2025 / Published: 1 December 2025


Abstract

Background: Muscle injuries are among the main problems in professional soccer, affecting player availability and team performance. Countermovement jump (CMJ) variables have been proposed as indicators of injury risk and for detecting strength imbalances, although their use is less explored than isokinetic assessments. Unlike previous studies based solely on linear statistics, this research integrates biomechanical data with machine learning approaches, providing a novel perspective for injury prediction in elite soccer. Objective: To examine the association between CMJ variables and muscle injury risk during a competitive season, considering injury incidence and effective playing minutes. It was hypothesized that specific CMJ asymmetries would be associated with a higher injury risk, and that machine learning algorithms could accurately classify players according to their injury status. Methods: Forty-one professional soccer players (18 women, 23 men) from national league teams (Chile) were assessed during preseason using force platforms. Non-contact muscle injuries and playing minutes were recorded over 10 months after the CMJ evaluations. Analyses included two-way ANOVA (sex × injury status) and machine learning algorithms (Logistic Regression, Decision Tree, K-Nearest Neighbors [KNN], Random Forest, Gradient Boosting [GB]). Results: Significant sex differences were observed in most variables (p < 0.05 and η_p² > 0.11), except peak force and peak power asymmetry. For injury status, only peak force asymmetry differed, while sex × injury interactions were found in peak power and left peak power. KNN (accuracy = 87%, 95% CI = 71% to 96%) and GB (accuracy = 84%, 95% CI = 68% to 94%) achieved the best classification performance between injured and non-injured players. Conclusions: CMJ did not show consistent statistical differences between injured and non-injured groups. However, machine learning models, particularly KNN and GB, demonstrated high predictive accuracy, suggesting that injuries are a complex phenomenon characterized by non-linear patterns. These findings highlight the potential of combining CMJ with machine learning approaches for functional monitoring and early detection of injury risk, though validation in larger cohorts is required before establishing clinical thresholds and preventive applications.

Keywords: machine learning; athletic injuries; vertical jump; muscle strength; football; force platform

1. Introduction

Sport is currently a phenomenon of high scientific and social interest, backed by funding from public institutions and support from private entities, particularly sports clubs. In this scenario, professional soccer has established itself as one of the disciplines with the highest investment, development, and participation [1]. This physically demanding team sport is characterized by its intermittent nature and by the combination of medium- and high-intensity actions that require explosive muscular efforts [1,2]. However, these demands carry a high risk of injury, due to external factors such as the number of matches, weather conditions, and the type/level of competition and opponents, as well as internal factors such as physical condition, health status, and physical load [3,4,5]. Injured players generate high financial costs for sports clubs through treatment and rehabilitation expenses and the loss of value in the transfer market. Therefore, identifying factors that influence the incidence and type of injury is crucial for optimizing training loads [6].

Despite specialized physical training, professional soccer has a high incidence of injuries, with muscle injuries being the most common [7,8]. Several studies have reported that approximately one-third of injuries in high-performance sports are muscular, of which 92% are non-contact injuries, mainly in the quadriceps, hamstrings, adductors, and calf muscle groups [9]. These types of injuries not only affect player availability but also have an impact on team performance, sporting results, and team planning [10,11]. The incidence of injuries is influenced by multiple factors such as playing position, competitive category, stage of the match, accumulated load, congestion of matches, and effective participation time [3,12]. In this regard, the literature has shown an increase in the frequency of muscle injuries during periods of high competitive density [3]. Related to this, several studies have reported differences based on gender, both in terms of injury severity, incidence, and affected area [13,14,15]. This may be due to differences in athletic performance between men and women, differences in athletic ability among women, or the presence of asymmetries in the lower extremities [16].

Recent research has examined kinetic and biomechanical variables as indicators of fatigue, intensity, and risk of injury [17,18,19]. Particular attention has been given to the presence of asymmetries between limbs in capacities such as strength and power, which have been associated with both decreased performance and an increased likelihood of injury [20,21]. Therefore, monitoring these parameters has become essential for optimizing training, reducing the risk of injury, and prolonging the soccer player’s athletic career. One of the tests used to determine kinetic variables is the countermovement jump (CMJ), owing to its reliability and validity [22]. This test provides relevant variables such as peak force, rate of force development, duration of the muscle contraction phases (concentric and eccentric), and maximum power, among others [23]. Evidence has shown that these variables can be used as potential predictors of injury risk in sports. Benavides Ormaza & Cuadrado Peñafiel [24] described the force-velocity profile using variables obtained from the CMJ and SJ (squat jump) tests performed with external resistance, and established the profile as a viable factor for determining the risk of injury to the lower limbs. Similarly, another study in soccer players [25] reported that the peak force of the eccentric phase recorded in the CMJ test was lower in players who suffered an injury that left them unable to train for more than three months. However, neither study analyzed whether asymmetries in kinetic variables represented an increased risk of injury among the participating athletes. Therefore, there is still controversy regarding the role of asymmetries in a CMJ as a risk factor for injury.

A recent systematic review by Pérez-Contreras et al. [26] indicates that kinetic variables are reliable indicators for estimating the risk of injury in professional soccer players, mainly due to their ability to reveal strength imbalances between limbs. Monitoring these variables not only allows for the identification of players with a higher susceptibility to injury but also guides more effective preventive strategies. This approach aligns with the current trend toward prevention based on objective data and load control. Despite this, the literature on injuries in professional soccer has focused predominantly on isokinetic assessments and muscle ratios [17,19,27,28,29], while dynamic analyses through vertical jumps, such as the CMJ, remain less explored, especially in comparisons between men and women. This gap limits the possibility of establishing practical parameters for monitoring in real-world settings, where isokinetic tests are costly and not easily accessible. Recent research has advanced the integration of biomechanical data with machine learning algorithms to predict injury risk more effectively in elite soccer. In particular, one study demonstrated that explainable machine learning models based on kinetic and kinematic variables can accurately predict muscle injuries in professional players [30]. Similarly, a systematic review found that CMJ parameters, together with neuromuscular and functional assessments, are promising indicators for injury-risk monitoring in sport [31]. Addressing this gap has a significant practical impact: clubs and coaching staff require simple, valid, and accessible tools to identify risk factors and reduce the incidence of muscle injuries, which are one of the main causes of inactivity in professional soccer [6,18].
This study aims to examine the association between countermovement jump (CMJ) variables and muscle injury risk throughout a competitive season, considering injury incidence and effective playing minutes. Beyond traditional linear approaches, this research integrates biomechanical data with machine learning models to improve the understanding and prediction of injury risk in professional soccer. These findings may contribute to the development of individualized monitoring strategies and data-driven preventive programs in elite soccer environments.

2. Materials and Methods

2.1. Design

This study is a non-experimental exploratory quantitative research project with a prospective cohort design, whose objective was to determine the association between CMJ variables and effective playing minutes, assessed in the preseason, and the occurrence of non-contact muscle injuries during the competitive season in male and female professional soccer players [32].

2.2. Sample

The sample consisted of 41 professional soccer players (20 who sustained an injury and 21 who did not), categorized in Tier 3–4 according to the classification framework [33] and selected by convenience sampling. They were divided into two groups: women (n = 18; age = 23.0 ± 5.3 years; weight = 58.9 ± 6.5 kg; height = 164 ± 6.2 cm) and men (n = 23; age = 21.7 ± 1.4 years; weight = 70.3 ± 6.6 kg; height = 176.1 ± 6.5 cm). The inclusion and exclusion criteria considered players belonging to the professional teams of each club and registered in the national competition (Chile). Only those players who did not present musculoskeletal injuries that limited their participation at the time of the evaluation and who had not suffered injuries that prevented them from training regularly during the two weeks prior to the tests were included.

2.3. Ethical Considerations

The study was conducted according to the guidelines of the Declaration of Helsinki [34] and approved by the Institutional Ethics Committee of University Hospitals Virgen Macarena and Virgen del Rocío from Seville, Spain (C.P. RENFEFUTCHILE C.I. 2355-N-20, 28 June 2021). All participants held valid federation licenses and had undergone medical evaluations at the start of the season, meeting the requirements set by the Chilean Soccer Federation. The evaluations were conducted in the absence of injuries or discomfort, without altering the usual practice or introducing additional risks beyond those inherent to the activity.

2.4. Procedures

The assessments were carried out during the preseason, on the second day of the weekly microcycle, in the morning, after a week of familiarization with the procedures. A standardized three-phase warm-up protocol was applied. The general phase included a joint mobility exercise, a ballistic stretching exercise, and two core activation exercises (front plank, glute bridge). The exercises were performed in circuit format, with 20 s of work, 20 s of rest, and 3 rounds of the circuit. The specific phase included a squat and lunge exercise, performing 3 sets of 10 repetitions of each exercise. Finally, the third phase was test familiarization, in which 2 sets of 3 CMJs were performed. All tests were performed using appropriate sports shoes.

2.4.1. Instruments

Two PASCO® portable force platforms (model PS-2141; PASCO Scientific, Roseville, CA, USA), which have been validated for the evaluation of vertical jumps [35,36], were used. Measurements were taken at a sampling frequency of 1000 Hz using PASCO Capstone software (version 2.3.1.1), and the data obtained were exported to a spreadsheet. The force platforms were tared (zeroed to 0 N with the tare button) before each assessment. The records were then processed and analyzed using scripts written by the authors in MATLAB (version R2024a, MathWorks Inc., Natick, MA, USA).

2.4.2. Countermovement Jump (CMJ)

To assess the CMJ, participants started in an upright position, with both feet flat on the force platforms and their hands on their hips. After a countdown by the evaluator (“3, 2, 1, go!”), they were instructed to perform a quick knee bend followed by a vertical jump, trying to jump as high and as fast as possible. The importance of landing with both feet completely on the force plates was emphasized in order to validate the attempt. Jumps involving arm movements, asymmetrical take-off or landing, or any deviation from the technical protocol were excluded from the statistical analysis. Three attempts were performed per subject with 20 s of rest between attempts. The test was administered by a professional with experience in physical assessments and the use of force plates, ensuring the correct execution of the procedure and the validity of the data collected. The following variables were calculated: jump height from flight time; bilateral peak force (sum of the forces from the two platforms); peak force for each leg; bilateral peak power (power was computed as velocity × force, with velocity derived by numerical integration of the force-time signal, and the power of the two platforms was summed); peak power for each leg; and peak rate of force development in the braking phase (first derivative of force with respect to time, computed instantaneously over 1 ms windows). The asymmetry index was calculated for peak force and peak power as follows: (left − right)/((left + right)/2) × 100.
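The variables listed above can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code: the function names, the value of g, the flight-time formula h = g·t²/8, the trapezoidal integration, and the assumption that the athlete is motionless at the first sample are all illustrative choices. The caller is assumed to supply the braking-phase samples to `peak_rfd`; phase detection is not shown.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def jump_height_from_flight_time(flight_time_s):
    """Jump height (m) from flight time, h = g * t^2 / 8."""
    return G * flight_time_s ** 2 / 8.0


def asymmetry_index(left, right):
    """Asymmetry (%) as given in the text: (L - R) / ((L + R) / 2) * 100."""
    return (left - right) / ((left + right) / 2.0) * 100.0


def com_velocity(total_force_n, body_mass_kg, fs_hz):
    """Centre-of-mass velocity (m/s) by trapezoidal integration of net force / mass.

    Assumes the athlete is motionless at the first sample.
    """
    dt = 1.0 / fs_hz
    weight_n = body_mass_kg * G
    velocity = [0.0]
    for f_prev, f_next in zip(total_force_n, total_force_n[1:]):
        acc_prev = (f_prev - weight_n) / body_mass_kg
        acc_next = (f_next - weight_n) / body_mass_kg
        velocity.append(velocity[-1] + 0.5 * (acc_prev + acc_next) * dt)
    return velocity


def peak_power(total_force_n, body_mass_kg, fs_hz):
    """Peak of instantaneous power (W), computed as force * velocity."""
    velocity = com_velocity(total_force_n, body_mass_kg, fs_hz)
    return max(f * v for f, v in zip(total_force_n, velocity))


def peak_rfd(force_n, fs_hz):
    """Largest sample-to-sample force slope (N/s) in the samples supplied."""
    return max((f_next - f_prev) * fs_hz
               for f_prev, f_next in zip(force_n, force_n[1:]))


# Hypothetical values, for illustration only.
print(round(asymmetry_index(1100.0, 900.0), 1))      # per-leg peak forces in N
print(round(jump_height_from_flight_time(0.5), 3))   # flight time in s
```

With the sign convention of the asymmetry formula, a positive index indicates a larger left-leg value and a negative index a larger right-leg value.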

2.4.3. Injury Monitoring

Throughout the competitive season, both contact and non-contact muscle injuries were recorded for each participant. Only non-contact injuries were included in the analysis. The classification proposed by Mueller-Wohlfahrt et al. [37] was used solely as a reference to define and categorize muscle injuries during data collection; the specific injury types were not analyzed, as the final data set was dichotomized as “injured” or “non-injured”. The follow-up was conducted over the course of one competitive season (10 months).

2.5. Statistical Analysis

The distribution of the variables was analyzed using the Shapiro–Wilk test and histogram visualization, and a normal distribution was assumed (p > 0.05). The Levene test was applied, and homogeneity of variances was assumed (p > 0.05). Descriptive statistics are reported as mean and standard deviation. A two-way ANOVA (sex × injury status) was applied to compare the groups, and a post hoc test with Tukey correction was applied if the interaction effect was significant. The effect size partial eta-squared (η_p²) was calculated and categorized following standard thresholds: 0 to 0.01 trivial; 0.01 to 0.06 small; 0.061 to 0.13 moderate; and >0.14 large [38]. The alpha level was set at 0.05. The analysis was performed in JASP (version 0.19.3; JASP Team, University of Amsterdam, Amsterdam, The Netherlands).
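As an illustration of the effect size used here, the following Python sketch computes partial eta-squared from sums of squares, or equivalently from an F statistic and its degrees of freedom, and assigns a category. The function names are hypothetical. The stated ranges leave small gaps (for example between 0.13 and 0.14), so the half-open cut points at 0.01, 0.06, and 0.14 are an assumption made only to keep the function well defined.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Same quantity recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def classify_effect(eta_p2):
    """Map an effect size to a label; cut points are an assumed reading of the text."""
    if eta_p2 < 0.01:
        return "trivial"
    if eta_p2 < 0.06:
        return "small"
    if eta_p2 < 0.14:
        return "moderate"
    return "large"


# Hypothetical sums of squares, for illustration only.
eta = partial_eta_squared(ss_effect=12.0, ss_error=88.0)
print(round(eta, 3), classify_effect(eta))
```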

2.6. Machine Learning Analysis

For the binary classification task (injured or non-injured), the following machine learning algorithms were used: Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbors (KNN), Random Forest (RF), Gradient Boosting (GB), and a feedforward neural network (NNF). LR was employed to set the baseline performance obtained by a linear model. DT was evaluated because it provides a more interpretable decision-making mechanism. KNN was selected due to its ability to address the overfitting problems that arise in high-dimensional spaces. RF, an ensemble learning algorithm, was used due to its fast execution speed and increased model performance. GB was included because it iteratively builds weak learners to minimize prediction errors, thereby improving accuracy and robustness, especially in non-linear relationships, and NNF was also tested because it can handle complex data. Table 1 shows the hyperparameters tested in the training process for each model.

First, the data set was divided into training (70%) and test (30%) sets using stratified sampling defined by injury and sex. The training data set was then scaled (mean 0 and standard deviation 1), and the test data set was scaled using the mean and standard deviation of the training data set to avoid data leakage.

The Boruta algorithm was used for feature selection on the training data set [39] (Figure 1). Boruta is a wrapper feature selection method built upon the Random Forest classifier, designed to detect all variables carrying information useful for predicting the target outcome. The algorithm operates by creating shadow features, which are shuffled copies of the original variables that serve as a baseline for comparison. A Random Forest model is iteratively trained using both the original and shadow features, and the importance of each real variable is statistically compared with the highest importance achieved among its shadow counterparts. Variables that consistently outperform the shadow features are classified as Confirmed, those that perform worse are Rejected, and those with ambiguous importance are labeled Tentative. Boruta was executed for 99 iterations, ensuring convergence and stable importance rankings across the evaluated predictors. After completion, the function TentativeRoughFix() was applied to re-evaluate the tentative variables and assign them a final status. This procedure ensured that the subsequent models were trained exclusively on the most informative subset of features, minimizing redundancy and potential noise.

Ten-fold cross-validation repeated five times was performed. Models were evaluated on the test data set for accuracy (with a 95% confidence interval), sensitivity, specificity, and area under the curve (AUC). For all classification models developed in this study, predictions were obtained in probabilistic form on the independent test set. To convert these probabilities into binary class labels, an optimized decision threshold was applied instead of the default value of 0.5. The optimal threshold was determined empirically by analyzing the Receiver Operating Characteristic (ROC) curve and selecting the cutoff point that jointly maximized sensitivity and specificity. Model performance was subsequently evaluated through the confusion matrix, providing accuracy, sensitivity, specificity, and predictive values, while the Area Under the ROC Curve (AUC) was calculated to quantify overall discriminative performance. The following packages were used for this task in RStudio (version 2024.12.1; Posit Software, PBC, Boston, MA, USA): caret, RSNNS, and NeuralNetTools [40,41,42].

Additionally, for the algorithm with higher accuracy, we used SHapley Additive exPlanations (SHAP), which are based on the Shapley values of game theory [43]. SHAP offers the ability to interpret machine learning algorithms, which are often treated as black boxes [44], and was applied to the training data set. The algorithm computes Shapley values that represent the average marginal contribution of each variable across all possible combinations of features, thus providing a consistent and theoretically grounded measure of variable importance. Specifically, 50 random permutations were performed to estimate the marginal contribution of each variable (only in the higher accuracy model). This procedure allowed for a detailed quantification of the direction and magnitude of each variable’s influence on the model’s predictions, providing an interpretable framework for understanding the relative contribution of the selected features to injury classification.
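Two of the preprocessing and evaluation steps in this section, scaling the test set with training-set statistics and choosing a decision threshold from the ROC curve, can be sketched in plain Python. This is an illustration and not the authors' R/caret pipeline. The function names are hypothetical, and the use of Youden's J (sensitivity + specificity − 1) as the criterion is an assumption, since the text states only that sensitivity and specificity were maximized.

```python
from statistics import mean, stdev


def fit_scaler(train_values):
    """Return the mean and sample standard deviation of the TRAINING values only."""
    return mean(train_values), stdev(train_values)


def apply_scaler(values, mu, sd):
    """Standardize values with previously fitted statistics (sd must be non-zero)."""
    return [(v - mu) / sd for v in values]


def best_threshold(y_true, y_prob):
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1 (assumed criterion).

    y_true holds 0/1 labels (both classes must be present); y_prob holds the
    predicted probability of class 1. A case is called positive if prob >= cutoff.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_j, best_cut = -1.0, 0.5
    for cut in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cut)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cut)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # ties keep the lowest cutoff
            best_j, best_cut = j, cut
    return best_cut, best_j


# Hypothetical feature values: statistics come from the training split only.
train, test = [1.0, 2.0, 3.0], [4.0]
mu, sd = fit_scaler(train)
print(apply_scaler(test, mu, sd))

# Hypothetical labels and predicted probabilities.
print(best_threshold([0, 0, 1, 1], [0.2, 0.4, 0.6, 0.9]))
```

Fitting the scaler on the training split and reusing its statistics on the test split is what prevents information from the test set from reaching the model.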

3. Results

Table 2 shows the descriptive statistics and comparisons between factors (sex and injury). All variables differed between sexes (p < 0.05), except the asymmetry of peak force and peak power (p > 0.05). For the injury factor, only peak force asymmetry differed (p < 0.05). For the interaction effect (sex × injury status), differences were found in peak power and left peak power (p < 0.05).

The Boruta algorithm (Figure 1) selected the following variables as the most relevant for predicting injury risk: right peak force, peak force, age, asymmetry peak force, asymmetry peak power, minutes, and left peak force. Table 3 shows the performance of the trained machine learning algorithms. KNN (accuracy = 87%, 95% CI = 71% to 96%) and GB (accuracy = 84%, 95% CI = 68% to 94%) had the best performance among the trained algorithms.
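To show how a confidence interval for classification accuracy can be obtained from a count of correct predictions, the sketch below computes an exact binomial (Clopper-Pearson) interval by bisection. The text does not state which interval method was used, so the choice of Clopper-Pearson is an assumption, and the counts in the usage line are hypothetical and not taken from the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect(is_below_root, iterations=60):
    """Find the point in [0, 1] where is_below_root switches from True to False."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if is_below_root(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(correct, total, alpha=0.05):
    """Exact two-sided binomial interval for accuracy = correct / total."""
    half = alpha / 2.0
    if correct == 0:
        lower = 0.0
    else:
        # P(X >= correct | p) rises with p; the lower bound is where it reaches alpha/2.
        lower = _bisect(lambda p: 1.0 - binom_cdf(correct - 1, total, p) < half)
    if correct == total:
        upper = 1.0
    else:
        # P(X <= correct | p) falls with p; the upper bound is where it reaches alpha/2.
        upper = _bisect(lambda p: binom_cdf(correct, total, p) > half)
    return lower, upper


# Hypothetical counts: 26 correct classifications out of 30 test cases.
low, high = clopper_pearson(26, 30)
print(f"accuracy = {26 / 30:.2f}, 95% CI = {low:.2f} to {high:.2f}")
```

With small test sets the interval is wide, which is why accuracy estimates of this kind are usually reported together with their bounds.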

Figure 2a shows the variable importance of the gradient boosting model obtained with SHAP, displayed as the mean absolute contribution of each predictor to the model’s output. This representation quantifies the overall influence of each feature on the predictions, irrespective of whether the effect increases or decreases the injury probability. Variables with higher mean absolute SHAP values exert greater global impact on the model’s decision process. Figure 2b shows an example of one subject analyzed; the variables are displayed in order of importance, and each one has a bar, which can be positive or negative. Bars to the right indicate that the feature increases the prediction (pulling it toward the positive class), whereas bars to the left indicate that the feature decreases it. The final prediction, f(x), is obtained by adding all feature contributions to the base value, and the graph shows this value for the observation. In this case, the variables that most influence the classification as injured are peak force of the right leg (PF_R) and minutes played (Min), with negative values; to change this subject’s classification to non-injured, we would have to increase the values of PF_R and Min.
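The additive structure described for Figure 2b, in which the base value plus the sum of the feature contributions equals f(x), can be illustrated with a small permutation-based Shapley estimate in Python. This is a simplified sketch and not the SHAP implementation used in the study: the toy model is hypothetical, and a single reference point (`baseline`) stands in for the background data that SHAP would normally average over. The default of 50 permutations mirrors the number reported in the Methods.

```python
import random


def shapley_permutation(model, x, baseline, n_perm=50, seed=0):
    """Estimate Shapley values for one observation by sampling feature orderings.

    Features are switched from their baseline value to their observed value one
    at a time, in a random order, and each feature is credited with the change
    in model output that its switch produces.
    """
    rng = random.Random(seed)
    n_features = len(x)
    phi = [0.0] * n_features
    for _ in range(n_perm):
        order = list(range(n_features))
        rng.shuffle(order)
        current = list(baseline)
        previous_output = model(current)
        for j in order:
            current[j] = x[j]
            new_output = model(current)
            phi[j] += new_output - previous_output
            previous_output = new_output
    return [value / n_perm for value in phi]


def toy_model(v):
    """Hypothetical model with one additive term and one interaction term."""
    return 2.0 * v[0] + v[1] * v[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_permutation(toy_model, x, baseline)
# Additivity: base value plus the contributions reproduces the prediction.
print(toy_model(baseline) + sum(phi), toy_model(x))
```

Because the per-feature changes telescope within every ordering, the contributions sum to f(x) minus the base value regardless of how many permutations are sampled; more permutations only stabilize how that total is divided among interacting features.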

4. Discussion

Sports injuries are a complex, multifactorial problem in which linear, single-metric screens often fail to capture the underlying risk. In this context, brief neuromuscular actions, such as the CMJ, likely embed distributed, non-linear information related to injury status; yet, much of this signal is lost when reduced to simple group contrasts. Rather than implying a universal “injured versus non-injured” signature, our results align with a systems perspective where interactions among neuromuscular performance, sex, and exposure history shape the phenotype. Accordingly, CMJ should serve within an integrated monitoring framework, with interpretable machine learning complementing statistics to model complexity and guide cautious, evidence-based translation and practice.

4.1. Asymmetry as a Sex-Invariant Marker and Its Link to Injury Status

Regarding differences between sexes, the analysis of variance identified large differences (η_p² > 0.14) between men and women in CMJ performance variables. This was to be expected given the anthropometric and physical differences between the sexes [45,46]. However, a crucial finding was that strength and power asymmetries showed no differences between sexes. This result suggests that limb imbalance is a risk factor independent of sex, positioning asymmetry as a universally relevant variable to monitor in both groups.

Peak force asymmetry was the only kinetic variable that differed statistically between the injured and uninjured groups, although with a small effect size. This contrasts with previous reports, which have shown substantial asymmetries in concentric and eccentric strength between soccer players with and without previous injuries, suggesting that a previous injury predisposes to such asymmetries during a CMJ [47]. Injured players had lower mean asymmetry values (9.0% in women; 6.6% in men) than uninjured players (11.7% and 10.2%, respectively), both remaining below the 15% risk threshold [48]. This seemingly contradictory result could be explained by the fact that athletes with pre-existing impairments may develop compensatory strategies that mask the true neuromuscular deficit during a CMJ, resulting in an artificially low measured asymmetry [25]. Thus, low asymmetry could be an indicator of an altered and inefficient neuromuscular strategy that predisposes to injury. The low sensitivity of the bilateral CMJ for detecting these deficits, compared to its unilateral version [25], may influence this finding. However, a kinematic analysis would be required to better understand this phenomenon. It is likely that movement strategies during the jump influence the ground reaction force of each limb.

4.2. Sex Effects and Sex × Injury Interactions in Power Metrics

A small, significant interaction effect was found between sex and injury status for the variables peak power and left peak power. Injured men showed higher power values than uninjured men, while the opposite trend was observed in women. This counterintuitive finding warrants further investigation. A possible hypothesis is that in men, higher power levels may be associated with exposure to greater loads or a more explosive playing style, whereas in women, lower power values could reflect suboptimal neuromuscular capacity that increases vulnerability [49,50]. A systematic review concluded that there is moderate evidence linking the risk of musculoskeletal injury to performance in horizontal and vertical non-countermovement jumps, but not to countermovement jumps (CMJ); however, these results are based on height and distance achieved, not on the power produced [48]. On the other hand, the relative power (W/kg) produced in a unilateral CMJ has been associated with a higher risk of non-contact ankle injury (OR = 9.2 [95% CI = 1.13–75.09]) in male amateur soccer players [50]. The results of this study reinforce the evidence and suggest that it could be extrapolated to the female population, although the risk mechanisms may differ between the sexes [51].

4.3. Machine Learning for Injury Risk: Feature Relevance, Predictive Performance, and Interpretability

Machine learning validated the relevance of kinetic variables, particularly asymmetries, for predicting the risk of non-contact injuries. The Boruta algorithm selected right peak force, peak force, asymmetry peak force, asymmetry peak power, and left peak force as the most relevant kinetic variables. This highlights the multifactorial nature of injury, integrating complex interactions that traditional statistical models overlook [51]. The identification of strength and power variables as key predictors is consistent with the findings of Bird et al. [52], who found that high-performance neuromuscular movement strategies (relative power) were associated with a lower risk of injury. This convergence suggests that the ability to generate force quickly and symmetrically is a critical and modifiable factor in preventing injury.

The superior predictive power of our KNN and GB models (AUC of 0.87–0.92) contrasts sharply with the results reported by Oliver et al. [53] and Merrigan et al. [54]. Oliver et al. [53] found that, although machine learning radically improved sensitivity compared to logistic regression (55.6% vs. 15.2%), overall discriminative performance (AUC ≈ 0.66) remained moderate. Merrigan et al. [54], for their part, reported that a logistic regression model based on CMJ strength-time metrics and injury history completely failed to predict injuries in the test set (sensitivity = 0%). The disparity in predictive performance can be attributed to several key methodological differences. Our study used a more diverse set of machine learning algorithms and applied rigorous variable selection with the Boruta algorithm, which identified a concise set of predictors. The findings reinforce that strength asymmetry between the lower limbs can predispose athletes to non-contact injuries. Likewise, the importance of asymmetry peak power aligns with Henry et al. [50], who found that low relative power increased the risk of injury by 9.2 times. This synergy highlights the importance of incorporating power and symmetry training into prevention programs.

The SHAP analysis applied to the Gradient Boosting model provided interpretability, confirming that strength asymmetry is the variable with the most significant predictive power.

4.4. Limitations and Future Directions

These findings should be considered in the context of the study’s limitations. The generalizability of the findings and the reliability of the machine learning models may have been impacted by sample limitations, specifically their small size and unequal gender distribution, which increases the likelihood of overfitting. Future studies should aim for larger, prospectively recruited cohorts with balanced gender representation to enhance model robustness and external validity.

The study’s methodology did not control for the participants’ history of earlier acute injuries, which is a key limitation given the well-documented impact of prior injury on neuromuscular asymmetry and subsequent injury risk [47]. Future studies should incorporate prospective injury history registration and comprehensive screening protocols to control for this critical confounding variable. Furthermore, the binary definition of injury status oversimplifies the underlying complexity of injury mechanisms and could benefit from a more granular analysis that considers the type, severity, and timing of the injury [54].

Finally, it would be pertinent for future studies to explore the accuracy, sensitivity, and specificity of other variables of the force-time curve, such as components of the eccentric phase, as well as to explore the potential of unilateral CMJ asymmetry variables [25].

4.5. Practical Applications

From an applied perspective, the results support the use of the CMJ for monitoring injury risk in the context of professional soccer. The routine assessment of maximum force asymmetries using CMJ with force platforms positions asymmetry peak force as a key safety indicator to be monitored, enabling timely decision-making and adjustments to training load. It should be emphasized that asymmetries below 15% should not be automatically interpreted as safe, as they could mask compensatory strategies for underlying neuromuscular deficits. Finally, although the CMJ in isolation showed discriminative capacity, its integration within a comprehensive assessment battery is recommended for a holistic evaluation. It is essential to remember that predicting injuries is inherently complex [53]. Therefore, these models should be viewed as tools to support decision-making by teams of professionals.
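For practitioners who wish to track this indicator, an inter-limb asymmetry percentage can be computed from the left and right peak values as sketched below. The formula is an assumption (absolute difference relative to the stronger limb); the definition used in this study is given in the methods and may differ, and the input values are hypothetical.

```python
def asymmetry_percent(right, left):
    """Inter-limb asymmetry as a percentage of the stronger limb.

    Assumption: |right - left| / max(right, left) * 100. Other definitions
    (for example, relative to the mean of both limbs) give different values.
    """
    stronger = max(right, left)
    if stronger <= 0:
        raise ValueError("peak values must be positive")
    return abs(right - left) / stronger * 100.0


# Hypothetical peak force values in N/kg for one player.
example = asymmetry_percent(right=25.0, left=20.0)
```

For these hypothetical values the result is 20%. Whatever definition is chosen, it should be kept constant across testing sessions so that changes reflect the athlete rather than the formula.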

5. Conclusions

This study did not identify consistent differences in CMJ kinetic variables between injured and non-injured soccer players, although significant sex-related differences were observed in most parameters. Regarding injury status, only peak force asymmetry showed differences between groups, while the sex × injury interaction revealed differences in peak power and left peak power. Moreover, machine learning algorithms, particularly KNN and Gradient Boosting, achieved high predictive accuracy, and Boruta and SHAP analyses highlighted the relevance of variables such as peak force and power asymmetry, peak RFD, and jump height. These findings suggest that injuries are a complex and non-linear phenomenon in which asymmetries play a significant role, and that combining CMJ with machine learning models may represent a valuable tool for functional monitoring and early detection of injury risk factors in professional soccer. However, these results should be interpreted with caution, as they are preliminary and require validation in larger and more diverse cohorts. From a practical standpoint, integrating CMJ-derived metrics into regular monitoring protocols could enhance practitioners’ ability to identify and manage injury risk more effectively. Ultimately, these insights may contribute to the development of evidence-based, data-driven strategies for injury prevention and performance optimization in elite soccer.

Author Contributions

Conceptualization, J.P.-C., F.I.-R., J.F.L.-F. and E.A.-M.; methodology, J.P.-C., P.M.-M., C.J.B., J.F.L.-F., E.A.-M. and F.I.-R.; validation, F.H.-P. and D.U.-D.; formal analysis, P.M.-M., E.A.-M., C.J.B. and F.M.-H.; investigation, J.P.-C., R.V.-V., A.B.-G. and F.I.-R.; data curation, R.V.-V., A.B.-G. and F.M.-H.; writing—original draft preparation, J.P.-C., F.I.-R., P.M.-M. and E.A.-M.; writing—review and editing, H.C.-K., J.F.L.-F., C.J.B., D.U.-D., F.M.-H., A.B.-G. and R.V.-V.; visualization, F.H.-P.; supervision, J.P.-C. and E.A.-M.; project administration, J.P.-C. and F.M.-H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Ethics Committee of University Hospitals Virgen Macarena and Virgen del Rocío from Seville, Spain (protocol code C.P. RENFEFUTCHILE—C.I. 2355-N-20, 28 June 2021). All participants held valid federation licenses and had undergone medical evaluations at the start of the season, meeting the requirements set by the Chilean Soccer Federation. The evaluations were conducted in the absence of injuries or discomfort, without altering the usual practice or introducing additional risks beyond those inherent to the activity.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study. Written informed consent was also obtained from all participants for the use of their anonymized data in scientific publications.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to express their sincere gratitude to the participating soccer clubs and their technical staff for their valuable collaboration and the facilities provided to conduct this study. The authors also thank the Minas Gerais State Research Support Foundation (Postgraduate Scholarship Program PAPG, Grant #13464; Agreement 5.12/2022).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CMJ	Countermovement Jump
RFD	Rate of Force Development
KNN	K-Nearest Neighbors
RF	Random Forest
DT	Decision Tree
LR	Logistic Regression
GB	Gradient Boosting
NNF	Neural Network Feedforward
SHAP	SHapley Additive exPlanations
AUC	Area Under the Curve
SD	Standard Deviation
ANOVA	Analysis of Variance
η_p²	Partial Eta-Squared
SP	Specificity
SE	Sensitivity
PASCO	Portable Force Platform Brand (PASCO® PS-2141)
MATLAB	Matrix Laboratory (MathWorks, Natick, MA, USA)
JASP	Jeffreys’s Amazing Statistics Program (v. 0.17.3, Amsterdam, The Netherlands)

References

Dolci, F.; Hart, N.; Kilding, A.; Chivers, P.; Piggott, B.; Spiteri, T. Physical and Energetic Demand of Soccer: A Brief Review. Strength Cond. J. 2020, 42, 1.
Bangsbo, J.; Mohr, M.; Krustrup, P. Physical and Metabolic Demands of Training and Match-Play in the Elite Football Player. J. Sports Sci. 2006, 24, 665–674.
Bengtsson, H.; Ekstrand, J.; Waldén, M.; Hägglund, M. Muscle Injury Rate in Professional Football Is Higher in Matches Played within 5 Days since the Previous Match: A 14-Year Prospective Study with More than 130 000 Match Observations. Br. J. Sports Med. 2018, 52, 1116–1122.
de Albuquerque Freire, L.; Brito, M.A.; Muñoz, P.M.; Pérez, D.I.V.; Kohler, H.C.; Aedo-Muñoz, E.A.; Slimani, M.; Brito, C.J.; Bragazzi, N.L.; Znazen, H.; et al. Match Running Performance of Brazilian Professional Soccer Players According to Tournament Types. Montenegrin J. Sports Sci. Med. 2022, 11, 53–58.
Jiang, Z.; Hao, Y.; Jin, N.; Li, Y. A Systematic Review of the Relationship between Workload and Injury Risk of Professional Male Soccer Players. Int. J. Environ. Res. Public Health 2022, 19, 13237.
Pérez-Contreras, J.; Loro-Ferrer, J.F.; Merino-Muñoz, P.; Hermosilla-Palma, F.; Miranda-Lorca, B.; Bustamante-Garrido, A.; Inostroza-Ríos, F.; Brito, C.J.; Aedo-Muñoz, E. Intra and Inter-Test Reliability of Isometric Hip Adduction Strength Test with Force Plates in Professional Soccer Players. J. Funct. Morphol. Kinesiol. 2024, 9, 270.
López-Valenciano, A.; Ruiz-Pérez, I.; Garcia-Gómez, A.; Vera-Garcia, F.J.; De Ste Croix, M.; Myer, G.D.; Ayala, F. Epidemiology of Injuries in Professional Football: A Systematic Review and Meta-Analysis. Br. J. Sports Med. 2020, 54, 711–718.
López-Valenciano, A.; Raya-González, J.; Garcia-Gómez, J.A.; Aparicio-Sarmiento, A.; Sainz de Baranda, P.; De Ste Croix, M.; Ayala, F. Injury Profile in Women’s Football: A Systematic Review and Meta-Analysis. Sports Med. 2021, 51, 423–442.
Ekstrand, J.; Hägglund, M.; Waldén, M. Epidemiology of Muscle Injuries in Professional Football (Soccer). Am. J. Sports Med. 2011, 39, 1226–1232.
Hägglund, M.; Waldén, M.; Magnusson, H.; Kristenson, K.; Bengtsson, H.; Ekstrand, J. Injuries Affect Team Performance Negatively in Professional Football: An 11-Year Follow-up of the UEFA Champions League Injury Study. Br. J. Sports Med. 2013, 47, 738–742.
LaPlaca, D.; Elliott, J. The Value of Having an Expert Sports Performance and Medicine Staff in the National Football League. Int. J. Strength Cond. 2021, 1, 1–15.
Aiello, F.; Impellizzeri, F.M.; Brown, S.J.; Serner, A.; McCall, A. Injury-Inciting Activities in Male and Female Football Players: A Systematic Review. Sports Med. 2023, 53, 151–176.
Larruskain, J.; Lekue, J.A.; Diaz, N.; Odriozola, A.; Gil, S.M. A Comparison of Injuries in Elite Male and Female Football Players: A Five-season Prospective Study. Scand. J. Med. Sci. Sports 2018, 28, 237–245.
Mandorino, M.; Figueiredo, A.J.; Gjaka, M.; Tessitore, A. Injury Incidence and Risk Factors in Youth Soccer Players: A Systematic Literature Review. Part I: Epidemiological Analysis. Biol. Sport 2023, 40, 3–25.
Robles-Palazón, F.J.; López-Valenciano, A.; De Ste Croix, M.; Oliver, J.L.; García-Gómez, A.; Sainz de Baranda, P.; Ayala, F. Epidemiology of Injuries in Male and Female Youth Football Players: A Systematic Review and Meta-Analysis. J. Sport Health Sci. 2022, 11, 681–695.
Barnes, C.; Archer, D.; Hogg, B.; Bush, M.; Bradley, P. The Evolution of Physical and Technical Performance Parameters in the English Premier League. Int. J. Sports Med. 2014, 35, 1095–1100.
Lee, J.W.Y.; Mok, K.-M.; Chan, H.C.K.; Yung, P.S.H.; Chan, K.-M. Eccentric Hamstring Strength Deficit and Poor Hamstring-to-Quadriceps Ratio Are Risk Factors for Hamstring Strain Injury in Football: A Prospective Study of 146 Professional Players. J. Sci. Med. Sport 2018, 21, 789–793.
Shalaj, I.; Gjaka, M.; Bachl, N.; Wessner, B.; Tschan, H.; Tishukaj, F. Potential Prognostic Factors for Hamstring Muscle Injury in Elite Male Soccer Players: A Prospective Study. PLoS ONE 2020, 15, e0241127.
van Dyk, N.; Bahr, R.; Burnett, A.F.; Whiteley, R.; Bakken, A.; Mosler, A.; Farooq, A.; Witvrouw, E. A Comprehensive Strength Testing Protocol Offers No Clinical Value in Predicting Risk of Hamstring Injury: A Prospective Cohort Study of 413 Professional Football Players. Br. J. Sports Med. 2017, 51, 1695–1702.
Bishop, C.; Turner, A.; Read, P. Effects of Inter-Limb Asymmetries on Physical and Sports Performance: A Systematic Review. J. Sports Sci. 2018, 36, 1135–1144.
Read, P.J.; Oliver, J.L.; Myer, G.D.; De Ste Croix, M.B.A.; Lloyd, R.S. The Effects of Maturation on Measures of Asymmetry During Neuromuscular Control Tests in Elite Male Youth Soccer Players. Pediatr. Exerc. Sci. 2018, 30, 168–175.
Collings, T.J.; Lima, Y.L.; Dutaillis, B.; Bourne, M.N. Concurrent Validity and Test–Retest Reliability of VALD ForceDecks’ Strength, Balance, and Movement Assessment Tests. J. Sci. Med. Sport 2024, 27, 572–580.
Harper, D.J.; Cohen, D.D.; Carling, C.; Kiely, J. Can Countermovement Jump Neuromuscular Performance Qualities Differentiate Maximal Horizontal Deceleration Ability in Team Sport Athletes? Sports 2020, 8, 76.
Ormaza, B.; Estefania, P.; Cuadrado Peñafiel, V. Determinación de Factores Predictivos Lesionales de Miembros Inferiores a Través de la Mecánica en Jugadores de Fútbol. Master’s Thesis, Universidad Politécnica Salesiana, Cuenca, Ecuador, 2024.
Mitchell, A.; Holding, C.; Greig, M. The Influence of Injury History on Countermovement Jump Performance and Movement Strategy in Professional Soccer Players: Implications for Profiling and Rehabilitation Foci. J. Sport Rehabil. 2021, 30, 768–773.
Pérez-Contreras, J.; Loro-Ferrer, J.F.; Inostroza-Ríos, F.; Merino-Muñoz, P.; Bustamante Garrido, A.; Hermosilla-Palma, F.; Brito, C.J.; Cortés-Roco, G.; Arriagada Tarifeño, D.; Muñoz-Hinrichsen, F.; et al. Kinetic Variables as Indicators of Lower Limb Indirect Injury Risk in Professional Soccer: A Systematic Review. J. Funct. Morphol. Kinesiol. 2025, 10, 228.
Fousekis, K.; Tsepis, E.; Poulmedis, P.; Athanasopoulos, S.; Vagenas, G. Intrinsic Risk Factors of Non-Contact Quadriceps and Hamstring Strains in Soccer: A Prospective Study of 100 Professional Players. Br. J. Sports Med. 2011, 45, 709–714.
Timmins, R.G.; Bourne, M.N.; Shield, A.J.; Williams, M.D.; Lorenzen, C.; Opar, D.A. Short Biceps Femoris Fascicles and Eccentric Knee Flexor Weakness Increase the Risk of Hamstring Injury in Elite Football (Soccer): A Prospective Cohort Study. Br. J. Sports Med. 2016, 50, 1524–1535.
Dauty, M.; Menu, P.; Fouasson-Chailloux, A. Cutoffs of Isokinetic Strength Ratio and Hamstring Strain Prediction in Professional Soccer Players. Scand. J. Med. Sci. Sports 2018, 28, 276–281.
Calderón-Díaz, M.; Silvestre Aguirre, R.; Vásconez, J.P.; Yáñez, R.; Roby, M.; Querales, M.; Salas, R. Explainable Machine Learning Techniques to Predict Muscle Injuries in Professional Soccer Players through Biomechanical Analysis. Sensors 2024, 24, 119.
Velarde-Sotres, Á.; Bores-Cerezal, A.; Alemany-Iturriaga, J.; Calleja-González, J. Tensiomyography, Functional Movement Screen and Counter Movement Jump for the Assessment of Injury Risk in Sport: A Systematic Review of Original Studies of Diagnostic Tests. Front. Sports Act. Living 2025, 7, 1565900.
Hernández Sampieri, R.; Fernández Collado, C.; Baptista Lucio, M.d.P. Metodología de La Investigación; McGraw-Hill Interamericana: Madrid, Spain, 2014; Volume 6, ISBN 9781456232306.
McKay, A.K.A.; Stellingwerff, T.; Smith, E.S.; Martin, D.T.; Mujika, I.; Goosey-Tolfrey, V.L.; Sheppard, J.; Burke, L.M. Defining Training and Performance Caliber: A Participant Classification Framework. Int. J. Sports Physiol. Perform. 2022, 17, 317–331.
World Medical Association. World Medical Association Declaration of Helsinki. Ethical Principles for Medical Research Involving Human Subjects. JAMA 2013, 310, 2191.
Sands, W.A.; Bogdanis, G.C.; Penitente, G.; Donti, O.; McNeal, J.R.; Butterfield, C.C.; Poehling, R.A.; Barker, L.A. Reliability and Validity of a Low-Cost Portable Force Platform. Isokinet. Exerc. Sci. 2020, 28, 247–253.
Lake, J.; Mundy, P.; Comfort, P.; McMahon, J.J.; Suchomel, T.J.; Carden, P. Concurrent Validity of a Portable Force Plate Using Vertical Jump Force–Time Characteristics. J. Appl. Biomech. 2018, 34, 410–413.
Mueller-Wohlfahrt, H.-W.; Haensel, L.; Mithoefer, K.; Ekstrand, J.; English, B.; McNally, S.; Orchard, J.; van Dijk, C.N.; Kerkhoffs, G.M.; Schamasch, P.; et al. Terminology and Classification of Muscle Injuries in Sport: The Munich Consensus Statement. Br. J. Sports Med. 2013, 47, 342–350.
Lakens, D. Calculating and Reporting Effect Sizes to Facilitate Cumulative Science: A Practical Primer for t-Tests and ANOVAs. Front. Psychol. 2013, 4, 863.
Kursa, M.B.; Rudnicki, W.R. Feature Selection with the Boruta Package. J. Stat. Softw. 2010, 36, 1–13.
Kuhn, M. Building Predictive Models in R Using the Caret Package. J. Stat. Softw. 2008, 28, 1–26.
Bergmeir, C.; Benítez, J.M. Neural Networks in R Using the Stuttgart Neural Network Simulator: RSNNS. J. Stat. Softw. 2012, 46, 1–26.
Beck, M.W. NeuralNetTools: Visualization and Analysis Tools for Neural Networks. J. Stat. Softw. 2018, 85, 1–20.
Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv 2017.
Ponce-Bobadilla, A.V.; Schmitt, V.; Maier, C.S.; Mensing, S.; Stodtmann, S. Practical Guide to SHAP Analysis: Explaining Supervised Machine Learning Model Predictions in Drug Development. Clin. Transl. Sci. 2024, 17, e70056.
Cardoso de Araújo, M.; Baumgart, C.; Jansen, C.T.; Freiwald, J.; Hoppe, M.W. Sex Differences in Physical Capacities of German Bundesliga Soccer Players. J. Strength Cond. Res. 2020, 34, 2329–2337.
Petri, C.; Campa, F.; Holway, F.; Pengue, L.; Arrones, L.S. ISAK-Based Anthropometric Standards for Elite Male and Female Soccer Players. Sports 2024, 12, 69.
Hart, L.M.; Cohen, D.D.; Patterson, S.D.; Springham, M.; Reynolds, J.; Read, P. Previous Injury Is Associated with Heightened Countermovement Jump Force-time Asymmetries in Professional Soccer Players. Transl. Sports Med. 2019, 2, 256–262.
Impellizzeri, F.M.; Rampinini, E.; Maffiuletti, N.; Marcora, S.M. A Vertical Jump Force Test for Assessing Bilateral Strength Asymmetry in Athletes. Med. Sci. Sports Exerc. 2007, 39, 2044–2050.
Henry, T.; Evans, K.; Snodgrass, S.J.; Miller, A.; Callister, R. Risk Factors for Noncontact Ankle Injuries in Amateur Male Soccer Players. Clin. J. Sport Med. 2016, 26, 251–258.
de la Motte, S.J.; Lisman, P.; Gribbin, T.C.; Murphy, K.; Deuster, P.A. Systematic Review of the Association Between Physical Fitness and Musculoskeletal Injury Risk: Part 3-Flexibility, Power, Speed, Balance, and Agility. J. Strength Cond. Res. 2019, 33, 1723–1735.
Bird, M.B.; Mi, Q.; Koltun, K.J.; Lovalekar, M.; Martin, B.J.; Fain, A.; Bannister, A.; Vera Cruz, A.; Doyle, T.L.A.; Nindl, B.C. Unsupervised Clustering Techniques Identify Movement Strategies in the Countermovement Jump Associated with Musculoskeletal Injury Risk During US Marine Corps Officer Candidates School. Front. Physiol. 2022, 13, 868002.
Oliver, J.L.; Ayala, F.; De Ste Croix, M.B.A.; Lloyd, R.S.; Myer, G.D.; Read, P.J. Using Machine Learning to Improve Our Understanding of Injury Risk and Prediction in Elite Male Youth Football Players. J. Sci. Med. Sport 2020, 23, 1044–1048.
Merrigan, J.J.; Stone, J.D.; Kraemer, W.J.; Vatne, E.A.; Onate, J.; Hagen, J.A. Female National Collegiate Athletic Association Division-I Athlete Injury Prediction by Vertical Countermovement Jump Force-Time Metrics. J. Strength Cond. Res. 2024, 38, 783–786.
Zech, A.; Hollander, K.; Junge, A.; Steib, S.; Groll, A.; Heiner, J.; Nowak, F.; Pfeiffer, D.; Rahlf, A.L. Sex Differences in Injury Rates in Team-Sport Athletes: A Systematic Review and Meta-Regression Analysis. J. Sport Health Sci. 2022, 11, 104–114.

Figure 1. Boruta feature selection, expressed using box plots. Red box variables are unimportant and deleted; green boxes are variables selected for the training data set. PP_R peak power right leg; JH jump height; PRFD peak rate of force development; PP_l peak power left leg; PP peak power bilateral; PF_l peak force left leg; Min minutes played in matches; PP_A peak power asymmetry; PF_A peak force asymmetry; PF peak force bilateral; PF_R peak force right leg.

Figure 2. (a) Importance of variables according to SHapley Additive exPlanations (SHAP) analysis for gradient boosting. Higher values indicate a more important variable. PP_R peak power right leg; Min minutes played in matches; PP_l peak power left leg; PF_l peak force left leg; PF peak force bilateral; PF_A peak force asymmetry; PP_A peak power asymmetry. (b) Contribution of each variable, according to SHAP, to the predicted class of one subject (injured).

Table 1. Summary of the machine learning models trained and their hyperparameter configurations.

Model	Main Tuned Hyperparameters	Grid Values Tested	Notes
K-Nearest Neighbors	kmax (number of neighbors), distance (metric), kernel (weighting function)	kmax = 3–15 (odd numbers); distance = {1, 2}; kernel = {“rectangular”, “gaussian”}	Euclidean (2) and Manhattan (1) distances compared.
Decision Tree	cp (complexity parameter, pruning depth)	10 automatically generated cp values (tuneLength = 10)	Standard CART minimizing impurity.
Random Forest	mtry (variables per split), ntree (number of trees)	mtry = 1–11; ntree = {50, 100, 200}	Custom RF wrapper used
Artificial Neural Network	size (hidden units), decay (L2 regularization), maxit (iterations)	size = 2–14; decay = {0, 0.1, 0.5}; maxit = 200	Single hidden-layer feedforward neural network.
Logistic Regression	family (binomial), link function, maxit (iterations)	family = binomial(link = “logit”); maxit = 50	Standard logistic regression model.
Gradient Boosting	interaction.depth, n.trees, shrinkage, n.minobsinnode	interaction.depth = 2–8; n.trees = {50, 100, 200}; shrinkage = {0.1, 0.01, 0.001}; n.minobsinnode = {2, 4, 6}	Uses stochastic gradient boosting with bag fraction = 0.7.
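To give a sense of the size of the search spaces in Table 1, the grids can be expanded into explicit candidate configurations. The sketch below is written in Python for illustration and is not the code used in the study; it assumes that ranges such as 2–8 and 1–11 denote consecutive integers, and the helper name `expand_grid` is ours.

```python
from itertools import product


def expand_grid(grid):
    """Return every combination of the listed hyperparameter values."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*grid.values())]


# Values copied from Table 1 (integer steps of 1 are an assumption).
knn_grid = {
    "kmax": list(range(3, 16, 2)),          # odd numbers from 3 to 15
    "distance": [1, 2],                     # Manhattan, Euclidean
    "kernel": ["rectangular", "gaussian"],
}
rf_grid = {
    "mtry": list(range(1, 12)),
    "ntree": [50, 100, 200],
}
gb_grid = {
    "interaction.depth": list(range(2, 9)),
    "n.trees": [50, 100, 200],
    "shrinkage": [0.1, 0.01, 0.001],
    "n.minobsinnode": [2, 4, 6],
}

n_knn = len(expand_grid(knn_grid))
n_rf = len(expand_grid(rf_grid))
n_gb = len(expand_grid(gb_grid))
```

Under these assumptions the grids contain 28 (KNN), 33 (Random Forest), and 189 (Gradient Boosting) candidate configurations, which is relevant when judging the risk of overfitting to a small sample during tuning.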

Table 2. Descriptive and comparative results.

	No Injury				Injury				ANOVA
	Female		Male		Female		Male		Sex		Injury		Interaction
Variables	M	±SD	M	±SD	M	±SD	M	±SD	p	η_p²	p	η_p²	p	η_p²
Age (years)	22.3	4.9	21.9	1.4	23.8	5.4	21.3	1.3	0.038	0.039	0.526	0.004	0.105	0.024
Body mass (kg)	59.3	6.6	72.3	5.5	58.6	6.2	68.6	7.1	<0.001	0.450	0.065	0.031	0.227	0.013
Minutes (min)	1391	505	902	706	1150	466	1100	801	0.032	0.041	0.898	<0.001	0.081	0.027
Jump height (cm)	28.5	3.4	39.2	5.2	27.2	3.1	39.6	4.0	<0.001	0.665	0.566	0.003	0.285	0.010
Peak Force (N/kg)	23.4	1.5	25.8	2.3	23.4	1.6	26.1	1.7	<0.001	0.320	0.749	<0.001	0.662	0.002
Right peak force (N/kg)	23.8	2.8	26.0	2.5	23.9	2.0	26.2	2.2	<0.001	0.187	0.720	0.001	0.992	<0.001
Left peak force (N/kg)	23.1	1.7	25.5	3.1	22.9	1.9	26.2	1.9	<0.001	0.280	0.583	0.003	0.306	0.010
Asymmetry peak force (%)	11.7	8.1	10.2	7.8	9.0	6.8	6.6	4.6	0.142	0.020	0.017	0.051	0.707	0.001
Peak power (W/kg)	41.8	3.6	56.0	6.7	39.5	4.7	58.4	5.2	<0.001	0.706	0.956	<0.001	0.023	0.046
Right peak power (W/kg)	43.2	9.9	58.1	8.8	42.6	8.7	59.1	7.5	<0.001	0.451	0.896	<0.001	0.647	0.002
Left peak power (W/kg)	41.6	9.3	54.6	13.3	37.3	7.6	59.2	11.3	<0.001	0.392	0.963	<0.001	0.033	0.041
Asymmetry Peak power (%)	32.7	21.8	25.7	20.9	27.3	22.7	20.9	18	0.089	0.026	0.197	0.015	0.927	<0.001
Peak RFD (N/s)	9355	3595	13,258	5382	10,195	3444	13,586	6424	<0.001	0.116	0.544	0.003	0.790	<0.001

Abbreviations: SD: standard deviation; p: p-value; RFD: rate of force development, η_p²: partial eta-squared. Significant results are highlighted in bold.
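For reference, the effect sizes in Table 2 follow the standard definition of partial eta-squared for each ANOVA effect (sex, injury, or their interaction). The notation below is ours: SS denotes a sum of squares, df degrees of freedom, and F the test statistic of the effect.

```latex
\eta_p^2
  = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
  = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}
```

The second form allows the value to be recovered from a reported F statistic and its degrees of freedom when the sums of squares are not available.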

Table 3. Performance metrics of machine learning algorithms.

Algorithms	Accuracy	LL 95%	UL 95%	SE	SP	AUC
Logistic Regression	60%	42%	77%	87%	35%	0.5
K-Nearest Neighbors	87%	71%	96%	81%	96%	0.87
Decision Tree	48%	30%	66%	43%	52%	0.48
Random Forest	75%	57%	88%	62%	88%	0.81
Gradient Boosting	84%	68%	94%	75%	94%	0.90
Neural Network	78%	61%	91%	93%	64%	0.84

Abbreviations: LL: lower limit; UL: upper limit; SE: sensitivity; SP: specificity; AUC: area under the curve. The algorithm with the best performance is highlighted in bold.
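The width of the accuracy limits in Table 3 reflects how few observations were available for testing. The section does not state how the limits were computed; the sketch below assumes the exact (Clopper–Pearson) binomial interval, one common choice, and solves for its bounds by bisection using only the standard library. The counts in the example (28 correct out of 32) are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(move_up, iterations=60):
    """Find the point in [0, 1] where move_up(p) switches from True to False."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if move_up(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a proportion k / n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


# Hypothetical example: 28 correct classifications out of 32 test cases.
lower, upper = clopper_pearson(28, 32)
```

With these hypothetical counts the interval spans roughly 71% to 96%, showing that even a high point estimate of accuracy remains compatible with a wide range of true values when the test set is small.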

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Pérez-Contreras, J.; Villaseca-Vicuña, R.; Loro-Ferrer, J.F.; Inostroza-Ríos, F.; Brito, C.J.; Cerda-Kohler, H.; Bustamante-Garrido, A.; Muñoz-Hinrichsen, F.; Hermosilla-Palma, F.; Ulloa-Díaz, D.; et al. Are Countermovement Jump Variables Indicators of Injury Risk in Professional Soccer Players? A Machine Learning Approach. Appl. Sci. 2025, 15, 12721. https://doi.org/10.3390/app152312721
