
Predictive Modeling of Injury Risk Based on Body Composition and Selected Physical Fitness Tests for Elite Football Players

Francisco Martins
Krzysztof Przednowek
Cíntia França
Helder Lopes
Marcelo de Maio Nascimento
Hugo Sarmento
Adilson Marques
Andreas Ihle
Ricardo Henriques
Élvio Rúbio Gouveia
Department of Physical Education and Sport, University of Madeira, 9020-105 Funchal, Portugal
Laboratory of Robotics and Engineering Systems, Interactive Technologies Institute, 9020-105 Funchal, Portugal
Institute of Physical Culture Sciences, Medical College, University of Rzeszów, 35-959 Rzeszów, Poland
Department of Physical Education, Federal University of Vale do São Francisco, Petrolina 56304-917, Brazil
University of Coimbra, Research Unit for Sport and Physical Activity (CIDAF), Faculty of Sport Sciences and Physical Education, 3004-504 Coimbra, Portugal
CIPER, Faculty of Human Kinetics, University of Lisbon, 1495-751 Lisbon, Portugal
ISAMB, Faculty of Medicine, University of Lisbon, 1649-020 Lisbon, Portugal
Department of Psychology, University of Geneva, 1205 Geneva, Switzerland
Center for the Interdisciplinary Study of Gerontology and Vulnerability, University of Geneva, 1205 Geneva, Switzerland
Swiss National Centre of Competence in Research LIVES—Overcoming Vulnerability: Life Course Perspectives, 1015 Lausanne, Switzerland
Marítimo da Madeira—Futebol, SAD, 9020-208 Funchal, Portugal
Author to whom correspondence should be addressed.
J. Clin. Med. 2022, 11(16), 4923
Submission received: 21 July 2022 / Revised: 17 August 2022 / Accepted: 18 August 2022 / Published: 22 August 2022
(This article belongs to the Special Issue Advancements in Sports Medicine)


Injuries are one of the most significant issues for elite football players. Consequently, elite football clubs have been consistently interested in having practical, interpretable, and usable models as decision-making support for technical staff. This study aimed to analyze predictive modeling of injury risk based on body composition variables and selected physical fitness tests for elite football players through a sports season. The sample comprised 36 male elite football players who competed in the First Portuguese Soccer League in the 2020/2021 season. The models were calculated based on 22 independent variables covering players’ information, body composition, and physical fitness, and one dependent variable, the number of injuries per season. In the elastic net analysis, the variables that best predicted injury risk were sectorial position (defensive and forward), body height, sit-and-reach performance, number of push-ups in 1 min, handgrip strength, and 35 m linear speed. This study considered multiple-input single-output regression-type models. The most accurate model presented in this work yields a prediction error of RMSE = 0.591. Our approach opens a novel perspective for injury prevention and training monitoring. Nevertheless, more studies are needed to identify risk factors associated with injury prediction in elite soccer players, as this is a rising topic that requires analyses performed in different contexts.

1. Introduction

Injuries are one of the most significant hampering issues for elite football players [1]. Football is known for its fast-paced and powerful actions [2,3], which might contribute to players’ increased risk of injuries [4]. Due to their effects on individuals’ mental states and overall teams’ performances, elite players’ injuries significantly impact the sports business [5,6]. Consequently, elite football clubs have been consistently interested in having practical, interpretable, and usable models as decision-making support for coaches and their technical staff members [7].
From the clinical standpoint, the literature describes the lower limbs as the body zone most affected by sports injuries [4,8,9,10,11,12,13,14], particularly muscle injuries in the thigh area, the quadriceps, and the groin [4,10,15,16]. Since injuries in professional soccer are an increasing problem, it is crucial that the work done in training sessions reflects the demands of competition, aiming at the development of athletes’ performance, which includes injury prevention [17,18,19].
Machine learning or statistical learning methods are currently tools that can significantly support decision-making in various aspects of the training process. For instance, it has been reported in the literature that some models can optimize training loads [20], which reinforces the applicability of machine learning in improving injury prediction [21,22].
Researchers, managers, and coaches are becoming increasingly involved in injury forecasting, using regular data collection that will allow them to act consciously and intervene on time on this global issue [23]. An investigation conducted over 18 years showed that the total injury rate in practice and competition has dropped during the past years [24]. Although the cause leading to this decrease is still unknown, one potential explanation for this decrease may be related to the effectiveness of injury prevention. If so, it is likely that the motivation of the medical staff at elite football teams is increasing, in terms of implementing and overseeing preventive injury programs [24].
Machine learning offers a modern statistical method that uses algorithms mainly created to deal with unbalanced data sets and enable the modeling of interactions between a large number of variables [25]. In the football context, machine learning has been used in injury prediction, physical performance prediction, training load and monitoring, players’ career trajectories, clubs’ performance, and match attendance [26].
Several studies have addressed injury prediction in elite football [23,25,27,28,29,30,31]. In 2019, 96 male elite football players participated in a study throughout a season, with hamstring-strain injuries as the primary outcome. In that study, the prediction model showed moderate to high accuracy for identifying players at risk of hamstring-strain injuries during pre-season testing [31]. Another example involved 26 elite football players participating in year-long research to forecast non-contact injuries. The authors reported that machine learning was far more accurate than baselines and state-of-the-art injury-risk-estimation approaches, detecting roughly 80% of injuries with about 50% precision [23]. In another study conducted with 132 male elite football and handball players, the prediction model accurately identified elite players at risk of developing muscular injuries [25].
Two types of variables are highlighted in the previous research on predictive modeling of injury risk [30]. The first type comprises modifiable variables, e.g., training loads or physiological and physical fitness tests. The second type comprises non-modifiable variables, including demographic variables, anthropometric parameters, and injury histories. Indeed, body composition and physical fitness are the characteristics most commonly assessed by sports staff, given their close relationship with game performance and players’ health. Moreover, evaluating and monitoring players’ characteristics during the season provides valuable information to better understand players’ behavioral changes and support coaches’ decision-making in the training and match process. In the sports injury literature, most investigations have assessed one specific variable at a time to predict injury risk. However, this approach makes it difficult to relate injury risk to a global interpretation of players’ performance in professional football [23]. Therefore, this study aimed to analyze predictive modeling of injury risk based on body composition variables and selected physical fitness tests for elite football players across a sports season.

2. Materials and Methods

2.1. Participants

Thirty-six players from a professional football team participated in this study. This team competed in the First Portuguese League during the 2020/2021 season.
A description of the variables, together with basic statistics (M: mean value; SD: standard deviation), is given in Table 1. The models were calculated based on 22 independent variables (x1–x22) and one dependent variable (y). Independent variables include players’ information (sectorial position, age, experience, and number of previous injuries), anthropometric parameters with body composition, and components of physical fitness (flexibility, general strength, explosive strength, speed, agility, and aerobic endurance). The dependent variable is the number of injuries per season. The predictive analysis used the data of 24 of the 36 players, because the remaining players had missing data after not taking certain physical fitness tests.
All procedures applied were approved by the Ethics Committee of the Faculty of Human Kinetics, CEIFMH No. 34/2021. The investigation was conducted following the Declaration of Helsinki, and informed consent was obtained from all participants.

2.2. Body-Composition Assessment

Body-composition variables were assessed using hand-to-foot bioelectrical impedance analysis (InBody 770, Cerritos, CA, USA). Height was measured to the nearest 0.1 cm using a stadiometer (SECA 213, Hamburg, Germany). The measurements occurred in the early morning, with participants fasting and wearing only their underwear. During the assessment, participants were barefoot, standing with both arms 45° apart from the trunk, with both feet bare on the spots of the platform. A total of 26 evaluations of body composition were considered during the season. Body mass, total body water (TBW), body fat mass (BFM), and fat-free mass (FFM) were retained for analysis.

2.3. Physical Fitness Assessment

The bilateral sit-and-reach test was used to assess flexibility. A box (32.4 cm high and 53.3 cm long) with a 23 cm heel line mark was used. The participants sat barefoot in front of the box, with both knees fully extended and heels against the box. The research team held one hand lightly against each participant’s knees to ensure complete leg extension. Then, participants placed their hands on top of each other, palms down, and slowly bent forward along the measuring scale. The forward-hold position was repeated twice. The third and final forward stretch was held for three seconds, and the score was recorded to the nearest 0.1 cm.
The push-ups test protocol consisted of performing the highest possible number of push-ups in one minute, respecting the success criteria judged by the evaluator. The participants started the test in the down position to get correct hand placement and then assumed the up position, from which they did the maximum number of push-ups possible. No cadence was imposed, although participants were encouraged to execute push-ups with good form but fast enough to obtain the best possible score in a minute. The evaluator independently counted the number of push-ups correctly executed.
The handgrip protocol consisted of three alternated data collection trials for each arm, performed using a hand dynamometer (Jamar Plus+, Chicago, IL, USA). Participants were instructed to hold a dynamometer in one hand, laterally to the trunk with the elbow at a 90° position [32]. From this position, participants were instructed to squeeze as hard as possible, progressively and continuously squeezing the hand dynamometer for about two seconds. The dynamometer could not contact the participant’s body; otherwise, the trial was repeated. The best score of the three trials was retained for analysis.
The countermovement jump (CMJ) and the squat jump (SJ) were used to assess lower-body explosive strength [33]. Both protocols included four data collection trials and were performed using the Optojump Next (Microgate, Bolzano, Italy) system of analysis and measurement. In both tests, participants were encouraged to jump to their maximum height. Before data collection, three experimental trials were performed by each participant to ensure correct execution. For the CMJ, participants began in a tall standing position, with feet placed hip-width to shoulder-width apart. Then, participants dropped into the countermovement position to a self-selected depth, followed by a maximal-effort vertical jump. Hands remained on the hips for the entire movement to eliminate any influence of arm swing. If the hands were removed from the hips at any point, or excessive knee flexion was exhibited during the countermovement, the trial was repeated. The participants reset to the starting position after each jump. The SJ protocol testing began with the participant in a squat position at a self-selected depth of approximately 90° of knee flexion, holding this position for the researchers’ count of three before jumping. If a dipping movement of the hips was evident, then the trial was repeated. The participants reset to the starting position after each jump.
Linear speed was assessed with maximal sprints at 5, 10, and 35 m, starting from a stationary position. Sprint time was recorded using Witty-Gate photocells (Microgate, Bolzano, Italy). Participants were allowed two trials for each sprinting distance, and the best time was used for analysis.
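The scoring rules for the handgrip and sprint tests reduce several trials to a single value. The following is a minimal sketch of that reduction; the trial values and variable names are made up for illustration and are not data from the study, and keeping one best handgrip score per arm is an assumption.

```python
# Hypothetical trial data for one player; all values are illustrative only.
handgrip_trials_kg = {"left": [46.2, 48.1, 47.5], "right": [50.3, 49.8, 51.0]}
sprint_trials_s = {5: [1.05, 1.02], 10: [1.78, 1.80], 35: [4.71, 4.65]}

# Handgrip: the best score is the highest force among the three trials
# (assumed here to be kept separately for each arm).
best_handgrip_kg = {arm: max(trials) for arm, trials in handgrip_trials_kg.items()}

# Sprint: the best score is the lowest time among the two trials per distance.
best_sprint_s = {dist: min(trials) for dist, trials in sprint_trials_s.items()}

print(best_handgrip_kg)
print(best_sprint_s)
```

Note the opposite direction of "best" in the two tests: a higher force is better for handgrip, whereas a lower time is better for sprints.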
A yoyo intermittent recovery test was applied to evaluate the athlete’s maximum oxygen uptake under repeated high-intensity aerobic exercise [34,35]. The test consists of a 2 × 20 m shuttle run at increasing speeds, interspersed with 10 s of active recovery, controlled by audio signals. The test terminated when the subject was no longer able to maintain the required speed. The total distance covered and the estimated VO2 maximum were used as results [36]. VO2 maximum was estimated from the athletes’ performance in the yoyo test, which is an indirect method of measuring this variable.
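The text does not state which test level or conversion was used to obtain VO2 maximum from the distance covered. As a hedged illustration only, the sketch below uses the widely cited linear conversions for the Yo-Yo intermittent recovery tests, level 1 (IR1) and level 2 (IR2); the function names are hypothetical, and whether the authors used either formula is an assumption.

```python
def estimate_vo2max_yoyo_ir1(distance_m: float) -> float:
    """Estimate VO2max (mL/kg/min) from Yo-Yo IR1 total distance in metres.

    Assumed conversion: VO2max = distance * 0.0084 + 36.4.
    """
    return distance_m * 0.0084 + 36.4


def estimate_vo2max_yoyo_ir2(distance_m: float) -> float:
    """Estimate VO2max (mL/kg/min) from Yo-Yo IR2 total distance in metres.

    Assumed conversion: VO2max = distance * 0.0136 + 45.3.
    """
    return distance_m * 0.0136 + 45.3


# Hypothetical distance, for illustration only.
print(round(estimate_vo2max_yoyo_ir1(2000), 1))
```

Because the estimate is a linear function of distance, it adds no information beyond the distance itself; this is worth keeping in mind when both are entered as predictors in the same model.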
All tests were performed on the same day within a 4 h period in the morning (8 a.m.–12 p.m.). They were conducted by trained staff from the research team, who were familiar with each protocol. All protocols were followed with the utmost rigor, and the organization of the sequence of physical tests was designed to reduce the fatigue factor throughout all tests.

2.4. Injury Report

This study followed the Union of European Football Association (UEFA)’s recommendations for epidemiological investigations. An injury was defined as an event during a scheduled training session or match, resulting in an absence from the next training session or match [37]. Regarding the variables under analysis, the type, zone, and specific location of the injury are complementary variables that identify the part of the body that suffered structural and/or functional changes. The mechanism of injury is intended to understand if the injury was traumatic or if it was contracted by overload. The severity of the injury considers the period, in days, from the athlete’s stoppage until resuming field work with the consent of the clinical department. Finally, an injury was marked as recurrent when a player was injured in the same place and type where they were previously affected by an injury. Injury records during the season, including in training and competitive moments, were made daily by the clinical department.
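The severity definition above maps days of absence to a category. The sketch below shows one way to encode that mapping. Only the moderate band (8 to 28 days) is stated in this article, in the Results; the other bands follow the commonly used consensus categories in football injury surveillance and are assumptions here, as is the function name.

```python
def classify_severity(days_absent: int) -> str:
    """Map days of absence to a severity category.

    Only the 'moderate' band (8-28 days) is taken from the article;
    the remaining bands are assumed from common consensus definitions.
    """
    if days_absent < 1:
        # By the injury definition used, a recorded injury implies absence.
        raise ValueError("a time-loss injury implies at least one day of absence")
    if days_absent <= 3:
        return "minimal"
    if days_absent <= 7:
        return "mild"
    if days_absent <= 28:
        return "moderate"
    return "severe"


# Hypothetical absences in days, for illustration only.
for days in (2, 6, 14, 40):
    print(days, classify_severity(days))
```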

2.5. Predictive Modeling

In this analysis, multiple-input single-output models for prediction were used. The output of the model is a continuous variable and represents the number of occurrences of potential injuries. Therefore, we consider regression-type models, not classifiers. Classic regression models (OLS), shrinkage regression, and stepwise regression were used in the models’ calculations. All predictive models were calculated using R Software version 4.2.0 (R Foundation for Statistical Computing, Vienna, Austria, 2022). The implemented methods included:
  • The ordinary least squares regression (OLS) used a popular least-squares method, in which weights are calculated by minimizing the sum of the squared errors.
  • The Ridge model was calculated using the criterion of performance, which includes a penalty for increased weights. Parameter λ decides the size of the penalty: the greater the value of λ is, the bigger the penalty. The value of lambda can vary from 0 to infinity [38].
  • LASSO regression [39] penalizes the absolute size of the weights, which shrinks some of them exactly to zero and thereby eliminates the corresponding variables from the equation. The penalty parameter s is tuned to optimize the model.
  • Elastic net (ENET) [40] combines the features of ridge and LASSO regressions. The performance criterion is the so-called naive elastic net. To minimize the criterion, the LARS-EN algorithm was suggested [40], which is based on the LARS algorithm for LASSO regression. In elastic net regression, we have two parameters, penalty s and λ.
  • Stepwise forward regression uses a forward selection (FS) procedure, which begins with an equation that contains only the free term (intercept). The first variable entered into the equation is the one that has the highest correlation with the output variable. If the regression coefficient of that variable differs significantly from zero, the variable remains in the equation and another variable is added. The second variable introduced into the equation is the one that has the highest correlation with the output after adjusting for the effect of the first variable. If its regression coefficient is statistically significant (using an F-test), the next variable is added in the same way [41,42].
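For reference, the criteria described in the list above can be written in a common notation. This is a sketch under assumptions: the symbols $w_0$ (intercept), $w_j$ (weights), $x_{ij}$ (value of predictor $j$ for pattern $i$), and $p$ (number of predictors) are introduced here for illustration, and the exact parameterization of $s$ used by the software (a bound on the sum of absolute weights versus a fraction of it) is not stated in the text.

```latex
\begin{align*}
\mathrm{RSS}(w) &= \sum_{i=1}^{n}\Bigl(y_i - w_0 - \sum_{j=1}^{p} w_j x_{ij}\Bigr)^2 \\[4pt]
\text{OLS:}\quad & \min_{w}\ \mathrm{RSS}(w) \\
\text{Ridge:}\quad & \min_{w}\ \mathrm{RSS}(w) + \lambda \sum_{j=1}^{p} w_j^2,
  \qquad \lambda \ge 0 \\
\text{LASSO:}\quad & \min_{w}\ \mathrm{RSS}(w)
  \quad \text{subject to} \quad \sum_{j=1}^{p} \lvert w_j \rvert \le s \\
\text{Naive elastic net:}\quad & \min_{w}\ \mathrm{RSS}(w) + \lambda \sum_{j=1}^{p} w_j^2
  \quad \text{subject to} \quad \sum_{j=1}^{p} \lvert w_j \rvert \le s
\end{align*}
```

With $\lambda = 0$ the Ridge criterion reduces to OLS, and the naive elastic net reduces to LASSO; with $s$ large enough that the constraint is inactive, the naive elastic net reduces to Ridge.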
The presented methods were used to calculate models from all variables (Table 1). Additionally, OLS, Ridge, LASSO, and elastic net models were recalculated for the best subset of input variables computed from stepwise regression. All models calculated in the study were tested by leave-one-out cross-validation (LOOCV). In this method, the data set is divided into two subsets: learning and testing (validation). In LOOCV, the test set consists of a single pair of data $(x_i, y_i)$, and the number of tests is equal to the number of data points $n$. During the cross-validation, the $RMSE_{CV}$ error was calculated:
$$RMSE_{CV} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
where $n$ is the number of patterns, $y_i$ is the observed output of the $i$-th testing pair $(x_i, y_i)$, $\hat{y}_i$ is the output value of the model built in the $i$-th step of cross-validation on a data set that does not contain that testing pair, and $RMSE_{CV}$ is the root mean square error of prediction.
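The LOOCV procedure and the cross-validation error can be sketched in a few lines. The study used R; the sketch below uses plain Python purely for illustration. It fits a Ridge model by solving the penalized normal equations on centered data, which is one standard formulation and may scale λ differently from the software used in the study. The toy data and all function names are assumptions.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(X, y, lam):
    """Return (intercept, weights) minimizing RSS + lam * sum(w_j^2); lam = 0 gives OLS."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Penalized normal equations: (Xc'Xc + lam * I) w = Xc'yc
    A = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    w = solve(A, b)
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return intercept, w


def predict(model, x):
    intercept, w = model
    return intercept + sum(wj * xj for wj, xj in zip(w, x))


def loocv_rmse(X, y, lam):
    """Leave-one-out RMSE: each pattern is predicted by a model fitted without it."""
    n = len(y)
    squared_error = 0.0
    for i in range(n):
        model = fit_ridge(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        squared_error += (y[i] - predict(model, X[i])) ** 2
    return math.sqrt(squared_error / n)


# Toy data (not from the study): y = 1 + 2*x1 - x2 exactly, with no noise.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y = [1.0, 4.0, 3.0, 6.0, 5.0]

print(loocv_rmse(X, y, 0.0))   # OLS on noise-free linear data: error close to zero
print(loocv_rmse(X, y, 10.0))  # shrinkage biases the weights, so the error is larger here
```

On this noise-free toy example shrinkage can only hurt; the situation in the study is the opposite, with noisy data and few patterns relative to the number of predictors, which is where a penalty typically lowers the cross-validation error.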

3. Results

Table 2 summarizes the characteristics of the participants and injuries at Club Sport Marítimo in the 2020/2021 season. Of the 36 players participating in the study, 23 sustained at least one injury over the 2020/2021 season. Injured players missed an average of 14.3 days per injury. Overall, there were 0.9 injuries per player (34 injuries/36 players) over the study period. Most injuries were classified as traumatic (52.9%). About 50% of the injuries were of moderate severity, meaning that the athletes missed between 8 and 28 days of training and/or competition. Finally, four of the injuries were classified as recurrent.
Figure 1, Figure 2 and Figure 3 summarize the type, area, and specific location of injuries. The lower limbs were the body area most affected by injuries (85.2%). Sprains (35.2%) and muscle injuries (35.2%) were the most frequent types of injury throughout the study period, particularly in the ankles (29.4%), quadriceps (11.7%), and hamstrings (11.7%).
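The summary figures in this paragraph follow from a few counts. The short sketch below recomputes them from the numbers reported in the text; the variable names are illustrative, and the traumatic count is back-calculated from the reported percentage rather than taken from Table 2.

```python
# Counts and rates as reported in the text.
players = 36
injured_players = 23
injuries = 34
recurrent_injuries = 4
traumatic_share = 0.529       # reported as 52.9%
mean_days_per_injury = 14.3

injuries_per_player = injuries / players
share_of_players_injured = injured_players / players
recurrence_rate = recurrent_injuries / injuries
# Back-calculated count, assuming the percentage refers to all 34 injuries.
traumatic_injuries = round(traumatic_share * injuries)

print(f"injuries per player:      {injuries_per_player:.2f}")
print(f"share of players injured: {share_of_players_injured:.2f}")
print(f"recurrence rate:          {recurrence_rate:.2f}")
print(f"traumatic injuries:       {traumatic_injuries}")
```

The recurrence rate of about 12% obtained this way lies inside the 8% to 22% range that the Discussion cites from the literature.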
Table 3 presents the errors for each model and the sets of predictors calculated by the variable selection methods. The classical OLS regression model has the worst predictive ability, with an error of RMSE = 18.57. Such a large error shows that the injury-prediction problem is complex and needs to be regularized, for example by using shrinkage regression. The use of shrinkage models (Ridge, LASSO, and elastic net) resulted in a sharp decrease in error and, thus, an improvement in the predictive ability of the model. Among the models that retain all predictors, the best is the Ridge model, for which the RMSE was 0.698. The optimal Ridge model was calculated for λ = 82.2. Optimizations of all shrinkage models are presented in Figure 4. The LASSO model for all predictors was not calculated because the algorithm does not work properly for such a configuration of the number of variables and patterns. Therefore, the next model used was the elastic net regression model. For elastic net regression, a very small prediction error was obtained (RMSE = 0.633), and the number of predictors was reduced due to the properties of this method. The elastic net analysis selected a set of seven input variables: x1 (sectorial position 1), x3 (sectorial position 3), x7 (body height), x12 (sit and reach), x13 (number of push-ups), x15 (handgrip, l), and x20 (V35 m).
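One plausible reading of the very large OLS error, offered here as an interpretation rather than a claim made by the authors, is that the unpenalized model is saturated. The sketch below counts the degrees of freedom left in each LOOCV fold, assuming that all 22 predictors plus an intercept are estimated and that none of them is dropped for collinearity.

```python
n_patterns = 24       # players with complete data
n_predictors = 22     # x1 to x22

n_train = n_patterns - 1            # LOOCV holds out one pattern per fold
n_coefficients = n_predictors + 1   # weights plus intercept (assumption)
residual_df = n_train - n_coefficients

print(f"training patterns per fold: {n_train}")
print(f"coefficients to estimate:   {n_coefficients}")
print(f"residual degrees of freedom: {residual_df}")
```

With no residual degrees of freedom, OLS can reproduce the training data exactly and its predictions for the held-out player become highly unstable. A penalty on the weights, or a reduction in the number of predictors, removes this instability, which is in line with the much smaller errors reported for the shrinkage and selection models.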
The forward regression showed that the significant predictors are x1 (sectorial position 1), x12 (sit and reach), x13 (number of push-ups), and x15 (handgrip, l). All the predictors determined by forward regression are contained in the set determined by elastic net regression. The model determined by forward regression generates an error of RMSE = 0.618. The predictors obtained using elastic net (E) and forward regression (F) were used in further predictive analysis. Both sets were used to recalculate the Ridge and LASSO models. The Ridge model with the set calculated by elastic net generates an error of RMSE = 0.592, and a very similar error was obtained for the Ridge model with the set calculated by forward regression (RMSE = 0.591). Both Ridge models with the new sets of predictors show the best predictive ability. LASSO models for these sets of predictors showed worse predictive abilities than Ridge models. The best model thus predicts the number of injury occurrences with an error of 0.59. This means that, for a player with three injuries, the model would typically predict a value in the range of 2.41 to 3.59. The equations for the best models are presented in Table 4.

4. Discussion

This study aimed to analyze predictive modeling of injury risk based on players’ sectorial position, body composition variables (i.e., weight, height, TBW, BFM, and FFM), and selected physical fitness tests, which include sit-and-reach, push-ups, handgrip, CMJ, SJ, 5 m, 10 m, 35 m, and yoyo tests.
This study considered multiple-input single-output regression-type models. It allowed us to select the best model to perform injury prediction tasks, considering all predictors. Previous work on predictive injury risk models is mostly based on classification learning models [31,43,44]. These models’ predictive accuracy ranged from 75% to 82.9% [30]. The present study did not use a categorical variable but rather a continuous variable. A similar solution was presented in another work, where a continuous variable was also placed in the output [45]. A direct comparison of the models’ predictive ability with those presented by other authors is complex because different quality criteria were used.
The value of cross-validation error is important, but a more critical element of the analysis presented was the identification of significant predictors of injury risk. An important part of the analysis was the variable-selection methods, resulting in a very clear and simplified model structure. The simple structure of the model and the linear nature of the methods made it possible to interpret the impact of individual variables on injury risk. Data-selection mechanisms were also used by other authors who have also used LASSO [44].
According to the data collected for this study, a professional football team can expect about 0.9 injuries per player over a season. This number is noticeably lower than that reported in a study covering three sports seasons, which averaged 1.5 injuries per player [4]. Training load and competitive load, both internal and external, are variables that are related to muscle injuries and that change depending on the situation and level of competition. In this study, sprains and muscle injuries were the most common types of injuries, and most injuries affected the lower limbs. The ankles were the most affected location, followed by the quadriceps and hamstrings. These results are consistent with the previous findings in the literature [10,12,13,14,16]. The lower limbs are under more stress in football because of the tactical–technical actions required, which explains their increased risk of injury. Traumatic injuries were slightly more common than overload injuries. A recent investigation also reported such a prevalence [4]. In contrast, a different article found that overload was the cause of two out of every three injuries [12]. Since there is a strong link between training load and the likelihood of injury, it is imperative to structure the training cycles appropriately according to the players’ attributes and physical condition. This process is more reliable and consistent when individual training loads are measured using the right tools. Coaches, players, and their technical-support personnel increasingly monitor and evaluate the sports load using a scientific method [46]. Monitoring the training process is essential for assessing athlete fatigue, which may help to lower the risk of injury. Soccer involves physical contact and high intensity.
Therefore, injury-prevention procedures should take both overload and traumatic injuries into account. Each athlete missed 14.3 days of practice or competition after suffering an injury, on average. This finding differs from that seen in the literature, with players missing an average of seven to eight days owing to injury [4,8,12]. On the other hand, we draw the conclusion that more serious injuries result in a longer period of player absence. This demonstrates the necessity of strengthening all preventative and rehabilitation efforts, while taking into consideration the predictive variables of injury as well as more frequent medical checkups and physical testing. Some authors claim that muscle injuries in soccer are the most common [9,10], converging with our findings. The injury-recurrence rate in our study is consistent with the rates reported in the literature, which range from 8% to 22% [9,47,48]. According to earlier research, these percentage discrepancies may result from the resources available in the individual clinical departments as well as a particular club’s infrastructure and material-resource capabilities to respond quickly, in order to maximize the injury prevention and healing process.
Regarding the impact of selected predictors included in the models, first of all, for sectorial position, the defensive and forward sectors were the ones that presented a higher risk of injury. A previous study conducted across three consecutive seasons with 123 Chilean elite male football players also reported that the defensive and forward sectors were the ones that contracted more injuries over the study period [4]. Among 71 Spanish elite male players, forwards were the ones who presented the highest rates in both incidence and severity of injury [14]. Indeed, the literature has described that certain positions, such as fullbacks and forwards, have more demanding tasks both in-game and during training sessions, such as covering greater distances and running with higher intensity than their peers. Overall, fullbacks and forwards perform a total of 29–35 sprints, which is higher than other positions (approx. 17–23 sprints) [49], which may justify their higher injury rates (i.e., hamstring injuries) [50,51]. Therefore, managing training loads appropriately following the physical demands of different sectors and playing positions might be a helpful method to lower the risk of injury in football [52]. Sports agents and coaches should consider load exposure according to players’ position, particularly when designing training sessions [52]. Moreover, our results consolidate the need to consider the players’ position as a variable to be included in the definition of injury-risk programs.
Another important predictor identified in our study was lower-limb flexibility. The sit-and-reach test is one of the physical fitness tests most often used to predict the injury risk of elite football players across a sports season. In the literature, several studies have concluded that reduced flexibility in the lower limbs is related to an increased risk of injuries in elite football players [53,54,55,56]. Some studies report that it is essential to develop and introduce a standard battery of flexibility assessments in preseason tests, contributing to the awareness of the players’ profile [56,57]. The newest Guidelines for Exercise Testing and Prescription from the American College of Sports Medicine reported that maintaining good flexibility in all joints depends on many specific variables, including distensibility of the joint capsule and muscle viscosity, which facilitates movement and may prevent injuries [58]. However, we must acknowledge some limitations on the topic. First, it is not entirely understood whether pre-activity stretching unequivocally reduces injuries associated with training load. Second, the most recent guidelines recommend direct measures of range of motion (e.g., goniometer and inclinometer) rather than indirect methods of assessing flexibility, such as sit-and-reach tests. This means that most of the indirect measures commonly used in various sports contexts are falling into disuse, and direct measures of range of motion should be used more regularly. Future studies should continue to investigate this topic so that more reliable and valid conclusions can be drawn regarding the relationship between flexibility and sports injuries.
According to our analyses, the push-up, handgrip, and 35 m linear sprint tests may be reliable predictors of injury risk among elite football players. Besides, height was also one of the variables significantly integrated into injury-prediction models in elite football players. Those variables can be related to each other, since they all end up influencing the players’ sports performance. In fact, the main value of this study is directed towards sports monitoring and injury prevention, as we analyzed the relationship between overall strength and height in elite soccer players as predictors of injury, and this is a topic on the rise. In the literature, we identified two studies conducted with youth footballers that have determined that injured players were significantly stronger, bigger, and more experienced than non-injured players [59,60]. This aspect becomes even more relevant when we talk about elite football players, since their demands are higher. The slightest physical differences can make all the difference in the outcome of individual action, dictating the outcome of crucial moments of games and seasons. We believe that these achievements can support future research on the topic to disentangle this complex net of variables that may affect the injury profile.
There are some limitations to this study that need to be acknowledged. The main limitations are the sample size and the fact that we evaluated the players for only 26 of the 42 weeks of the season. The sample size determines the number of patterns available for training the predictive models: the more recorded-injury information there is, the better the material for calculating predictive models. Continued collection of learning patterns will improve the predictive ability of the models. Moreover, this is a cross-sectional study, which does not allow cause–effect inferences to be drawn from the presented results. Nevertheless, these results have important and specific practical implications for those involved in elite football, mainly for injury prevention and training monitoring, since these issues are gaining significant attention in the sports business.

5. Conclusions

Addressing the need for further studies to identify risk factors for predicting injuries in elite football players, our approach opens a novel perspective on injury prevention and training monitoring, providing a methodology for evaluating and interpreting the complex relations between injury risk and players’ performance in elite football. Players’ sectorial position, body-composition variables, and physical fitness tests (sit-and-reach, push-up, handgrip, countermovement jump, squat jump, linear speed, and Yo-Yo tests) were all important predictors that may be considered in injury-risk prevention for elite football players. Future studies would add value by analyzing the influence of body-composition factors and physical fitness tests in elite football teams across different seasons.

Author Contributions

Conceptualization, F.M., É.R.G., K.P. and C.F.; methodology, F.M., É.R.G., K.P., C.F. and R.H.; validation, K.P., H.L. and R.H.; formal analysis, F.M., É.R.G., K.P. and C.F.; investigation, F.M., C.F. and R.H.; resources, É.R.G. and H.L.; writing—original draft preparation, F.M., É.R.G., K.P. and C.F.; writing—review and editing, M.d.M.N., H.S., A.M. and A.I.; visualization, M.d.M.N., H.S., A.M. and A.I.; project administration, É.R.G. and H.L.; funding acquisition, É.R.G. and H.L. All authors have read and agreed to the published version of the manuscript.


Funding

C.F., F.M. and É.R.G. acknowledge support from LARSyS through the Portuguese national funding agency for science, research, and technology (FCT), pluriannual funding 2020–2023 (Reference: UIDB/50009/2020). This study is framed in the Marítimo Training Lab Project. The project received funding under application no. M1420-01-0247-FEDER-000033 in the System of Incentives for the Production of Scientific and Technological Knowledge in the Autonomous Region of Madeira (PROCiência 2020).

Institutional Review Board Statement

This study was conducted according to the guidelines of the Declaration of Helsinki, was approved by the Ethics Committee of the Faculty of Human Kinetics (CEIFMH Nº34/2021), and followed the ethical standards of the Declaration of Helsinki for Medical Research in Humans (2013) and the Oviedo Convention (1997).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from all players to publish this paper.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.


Acknowledgments

The authors would like to thank all players and their respective legal guardians for participating in this study.

Conflicts of Interest

The authors declare no conflict of interest.


References

  1. Smpokos, E.; Mourikis, C.; Theos, C.; Manolarakis, G.; Linardakis, M. Injuries and risk factors in professional football players during four consecutive seasons. Sport Sci. Health 2021, 1–8.
  2. Konefał, M.; Chmura, P.; Kowalczuk, E.; Figueiredo, A.J.; Sarmento, H.; Rokita, A.; Chmura, J.; Andrzejewski, M. Modeling of relationships between physical and technical activities and match outcome in elite German soccer players. J. Sports Med. Phys. Fit. 2019, 59, 752–759.
  3. Rice, S.M.; Purcell, R.; De Silva, S.; Mawren, D.; McGorry, P.D.; Parker, A.G. The mental health of elite athletes: A narrative systematic review. Sports Med. 2016, 46, 1333–1353.
  4. Yáñez, S.; Yáñez, C.; Martínez, M.; Núñez, M.; De la Fuente, C. Lesiones deportivas del plantel profesional de fútbol Santiago Wanderers durante las temporadas 2017, 2018 y 2019. Arch. Soc. Chil. Med. Deporte 2021, 66, 92–103.
  5. Hägglund, M.; Waldén, M.; Magnusson, H.; Kristenson, K.; Bengtsson, H.; Ekstrand, J. Injuries affect team performance negatively in professional football: An 11-year follow-up of the UEFA Champions League injury study. Br. J. Sports Med. 2013, 47, 738–742.
  6. Hurley, O.A. Impact of player injuries on teams’ mental states, and subsequent performances, at the Rugby World Cup 2015. Front. Psychol. 2016, 7, 807.
  7. Kirkendall, D.T.; Dvorak, J. Effective injury prevention in soccer. Physician Sportsmed. 2010, 38, 147–157.
  8. Cohen, M.; Abdalla, R.J.; Ejnisman, B.; Amaro, J.T. Lesões ortopédicas no futebol. Rev. Bras. Ortop. 1997, 32, 940–944.
  9. Ekstrand, J.; Hägglund, M.; Waldén, M. Epidemiology of muscle injuries in professional football (soccer). Am. J. Sports Med. 2011, 39, 1226–1232.
  10. Hoffman, D.T.; Dwyer, D.B.; Tran, J.; Clifton, P.; Gastin, P.B. Australian Football League injury characteristics differ between matches and training: A longitudinal analysis of changes in the setting, site, and time span from 1997 to 2016. Orthop. J. Sports Med. 2019, 7, 2325967119837641.
  11. Lee, I.; Jeong, H.S.; Lee, S.Y. Injury profiles in Korean youth soccer. Int. J. Environ. Res. Public Health 2020, 17, 5125.
  12. Noya Salces, J.; Gómez-Carmona, P.M.; Gracia-Marco, L.; Moliner-Urdiales, D.; Sillero-Quintana, M. Epidemiology of injuries in First Division Spanish football. J. Sports Sci. 2014, 32, 1263–1270.
  13. Raya-González, J.; de Ste Croix, M.; Read, P.; Castillo, D. A Longitudinal Investigation of muscle injuries in an elite Spanish male academy soccer club: A hamstring injuries approach. Appl. Sci. 2020, 10, 1610.
  14. Torrontegui-Duarte, M.; Gijon-Nogueron, G.; Perez-Frias, J.C.; Morales-Asencio, J.M.; Luque-Suarez, A. Incidence of injuries among professional football players in Spain during three consecutive seasons: A longitudinal, retrospective study. Phys. Ther. Sport 2020, 41, 87–93.
  15. Jones, A.; Jones, G.; Greig, N.; Bower, P.; Brown, J.; Hind, K.; Francis, P. Epidemiology of injury in English Professional Football players: A cohort study. Phys. Ther. Sport 2019, 35, 18–22.
  16. Krutsch, W.; Memmel, C.; Alt, V.; Krutsch, V.; Tröß, T.; Meyer, T. Timing return-to-competition: A prospective registration of 45 different types of severe injuries in Germany’s highest football league. Arch. Orthop. Trauma Surg. 2022, 142, 455–463.
  17. Bradley, P.S.; Noakes, T.D. Match running performance fluctuations in elite soccer: Indicative of fatigue, pacing or situational influences? J. Sports Sci. 2013, 31, 1627–1638.
  18. Harper, D.J.; Carling, C.; Kiely, J. High-intensity acceleration and deceleration demands in elite team sports competitive match play: A systematic review and meta-analysis of observational studies. Sports Med. 2019, 49, 1923–1947.
  19. Oliva-Lozano, J.M.; Gómez-Carmona, C.D.; Pino-Ortega, J.; Moreno-Pérez, V.; Rodríguez-Pérez, M.A. Match and training high intensity activity-demands profile during a competitive mesocycle in youth elite soccer players. J. Hum. Kinet. 2020, 75, 195–205.
  20. Przednowek, K.; Wiktorowicz, K.; Krzeszowski, T.; Iskra, J. A web-oriented expert system for planning hurdles race training programmes. Neural Comput. Appl. 2019, 31, 7227–7243.
  21. Stern, B.D.; Hegedus, E.J.; Lai, Y.-C. Injury prediction as a non-linear system. Phys. Ther. Sport 2020, 41, 43–48.
  22. Huang, C.; Jiang, L. Data monitoring and sports injury prediction model based on embedded system and machine learning algorithm. Microprocess. Microsyst. 2021, 81, 103654.
  23. Rossi, A.; Pappalardo, L.; Cintia, P.; Iaia, F.M.; Fernández, J.; Medina, D. Effective injury forecasting in soccer with GPS training data and machine learning. PLoS ONE 2018, 13, e0201264.
  24. Ekstrand, J.; Spreco, A.; Bengtsson, H.; Bahr, R. Injury rates decreased in men’s professional football: An 18-year prospective cohort study of almost 12,000 injuries sustained during 1.8 million hours of play. Br. J. Sports Med. 2021, 55, 1084–1091.
  25. López-Valenciano, A.; Ayala, F.; Puerta, J.M.; Croix, M.D.S.; Vera-García, F.; Hernández-Sánchez, S.; Ruiz-Pérez, I.; Myer, G. A preventive model for muscle injuries: A novel approach based on learning algorithms. Med. Sci. Sports Exerc. 2018, 50, 915.
  26. Nassis, G.; Stylianides, G.; Verhagen, E.; Brito, J.; Figueiredo, P.; Krustrup, P. A review of machine learning applications in soccer with an emphasis on injury risk. Biol. Sport 2022, 40, 233–239.
  27. Brink, M.S.; Visscher, C.; Arends, S.; Zwerver, J.; Post, W.J.; Lemmink, K.A. Monitoring stress and recovery: New insights for the prevention of injuries and illnesses in elite youth soccer players. Br. J. Sports Med. 2010, 44, 809–815.
  28. Ehrmann, F.E.; Duncan, C.S.; Sindhusake, D.; Franzsen, W.N.; Greene, D.A. GPS and injury prevention in professional soccer. J. Strength Cond. Res. 2016, 30, 360–367.
  29. Venturelli, M.; Schena, F.; Zanolla, L.; Bishop, D. Injury risk factors in young soccer players detected by a multivariate survival model. J. Sci. Med. Sport 2011, 14, 293–298.
  30. Van Eetvelde, H.; Mendonça, L.D.; Ley, C.; Seil, R.; Tischer, T. Machine learning methods in sport injury prediction and prevention: A systematic review. J. Exp. Orthop. 2021, 8, 27.
  31. Ayala, F.; López-Valenciano, A.; Martín, J.A.G.; Croix, M.D.S.; Vera-Garcia, F.J.; García-Vaquero, M.D.P.; Ruiz-Pérez, I.; Myer, G.D. A preventive model for hamstring injuries in professional soccer: Learning algorithms. Int. J. Sports Med. 2019, 40, 344–353.
  32. Gerodimos, V. Reliability of handgrip strength test in basketball players. J. Hum. Kinet. 2012, 31, 25.
  33. Bosco, C.; Luhtanen, P.; Komi, P.V. A simple method for measurement of mechanical power in jumping. Eur. J. Appl. Physiol. Occup. Physiol. 1983, 50, 273–282.
  34. Schmitz, B.; Pfeifer, C.; Kreitz, K.; Borowski, M.; Faldum, A.; Brand, S.-M. The Yo-Yo intermittent tests: A systematic review and structured compendium of test results. Front. Physiol. 2018, 9, 870.
  35. Turner, A.; Walker, S.; Stembridge, M.; Coneyworth, P.; Reed, G.; Birdsey, L.; Barter, P.; Moody, J. A testing battery for the assessment of fitness in soccer players. Strength Cond. J. 2011, 33, 29–39.
  36. Bangsbo, J.; Iaia, F.M.; Krustrup, P. The Yo-Yo intermittent recovery test. Sports Med. 2008, 38, 37–51.
  37. Hägglund, M.; Waldén, M.; Bahr, R.; Ekstrand, J. Methods for epidemiological study of injuries to professional football players: Developing the UEFA model. Br. J. Sports Med. 2005, 39, 340–346.
  38. Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67.
  39. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 1996, 58, 267–288.
  40. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 2005, 67, 301–320.
  41. Hastie, T.; Tibshirani, R.; Friedman, J. Unsupervised learning. In The Elements of Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2009; pp. 485–585.
  42. Chatterjee, S.; Hadi, A.S. Regression Analysis by Example; John Wiley & Sons: Hoboken, NJ, USA, 2015.
  43. Carey, D.L.; Ong, K.-L.; Whiteley, R.; Crossley, K.M.; Crow, J.; Morris, M.E. Predictive modelling of training loads and injury in Australian football. arXiv 2017, arXiv:1706.04336.
  44. Rodas, G.; Osaba, L.; Arteta, D.; Pruna, R.; Fernández, D.; Lucia, A. Genomic prediction of tendinopathy risk in elite team sports. Int. J. Sports Physiol. Perform. 2019, 15, 489–495.
  45. Ruddy, J.D.; Cormack, S.J.; Whiteley, R.; Williams, M.D.; Timmins, R.G.; Opar, D.A. Modeling the risk of team sport injuries: A narrative review of different statistical approaches. Front. Physiol. 2019, 10, 829.
  46. Halson, S. Monitoring training load to understand fatigue in athletes. Sports Med. 2014, 44, 139–147.
  47. Stubbe, J.H.; van Beijsterveldt, A.-M.M.C.; van der Knaap, S.; Stege, J.; Verhagen, E.A.; van Mechelen, W.; Backx, F.J.G. Injuries in professional male soccer players in the Netherlands: A prospective cohort study. J. Athl. Train. 2015, 50, 211–216.
  48. Waldén, M.; Hägglund, M.; Ekstrand, J. Injuries in Swedish elite football—A prospective study on injury definitions, risk for injury and injury pattern during 2001. Scand. J. Med. Sci. Sports 2005, 15, 118–125.
  49. Di Salvo, V.; Baron, R.; González-Haro, C.; Gormasz, C.; Pigozzi, F.; Bachl, N. Sprinting analysis of elite soccer players during European Champions League and UEFA Cup matches. J. Sports Sci. 2010, 28, 1489–1494.
  50. Suarez-Arrones, L.; Torreño, N.; Requena, B.; De Villarreal, E.S.; Casamichana, D.; Barbero-Alvarez, J.C.; Munguía-Izquierdo, D. Match-play activity profile in professional soccer players during official games and the relationship between external and internal load. J. Sports Med. Phys. Fit. 2014, 55, 1417–1422.
  51. Martín-García, A.; Díaz, A.G.; Bradley, P.S.; Morera, F.; Casamichana, D. Quantification of a professional football team’s external load using a microcycle structure. J. Strength Cond. Res. 2018, 32, 3511–3518.
  52. Gabbett, T.J. The training—Injury prevention paradox: Should athletes be training smarter and harder? Br. J. Sports Med. 2016, 50, 273–280.
  53. Hrysomallis, C. Hip adductors’ strength, flexibility, and injury risk. J. Strength Cond. Res. 2009, 23, 1514–1517.
  54. Arnason, A.; Sigurdsson, S.B.; Gudmundsson, A.; Holme, I.; Engebretsen, L.; Bahr, R. Risk factors for injuries in football. Am. J. Sports Med. 2004, 32 (Suppl. S1), 5–16.
  55. Ekstrand, J.; Gillquist, J. Soccer injuries and their mechanisms: A prospective study. Med. Sci. Sports Exerc. 1983, 15, 267–270.
  56. Ibrahim, A.; Murrell, G.; Knapman, P. Adductor strain and hip range of movement in male professional soccer players. J. Orthop. Surg. 2007, 15, 46–49.
  57. Witvrouw, E.; Danneels, L.; Asselman, P.; D’Have, T.; Cambier, D. Muscle flexibility as a risk factor for developing muscle injuries in male professional soccer players: A prospective study. Am. J. Sports Med. 2003, 31, 41–46.
  58. Liguori, G.; American College of Sports Medicine (ACSM). ACSM’s Guidelines for Exercise Testing and Prescription; Lippincott Williams & Wilkins: Philadelphia, PA, USA, 2016.
  59. Turbeville, S.D.; Cowan, L.D.; Asal, N.R.; Owen, W.L.; Anderson, M.A. Risk factors for injury in middle school football players. Am. J. Sports Med. 2003, 31, 276–281.
  60. Turbeville, S.D.; Cowan, L.D.; Owen, W.L.; Asal, N.R.; Anderson, M.A. Risk factors for injury in high school football players. Am. J. Sports Med. 2003, 31, 974–980.
Figure 1. Injury frequency by zone (n).
Figure 2. Injury frequency by type (n).
Figure 3. Injury frequency by specific location (n).
Figure 4. Optimization of predictive models (the red line indicates the optimal model).
Table 1. Description of the variables used to construct the predictive model (N = 24).
Variable | Description | M | sd
x1–x3 | Sectorial position * | - | -
x4 | Age (y) | 25.45 | 3.34
x5 | Experience (y) | 7.29 | 3.38
x6 | Body mass (kg) | 80.09 | 7.07
x7 | Height (cm) | 182.52 | 6.01
x8 | TBW (L) | 51.93 | 4.66
x9 | BFM (kg) | 8.2 | 2.41
x10 | FFM (kg) | 71.2 | 6.50
x11 | Previous injury (n) | 1.29 | 1.63
x12 | Sit and reach (cm) | 34.52 | 6.79
x13 | Push-ups (n) | 43.63 | 8.68
x14 | Handgrip right (kg) | 50.87 | 9.62
x15 | Handgrip left (kg) | 48.92 | 8.67
x16 | CMJ height (cm) | 40.14 | 4.58
x17 | SJ height (cm) | 39.64 | 4.26
x18 | LS 5 m (s) | 1.16 | 0.13
x19 | LS 10 m (s) | 1.88 | 0.16
x20 | LS 35 m (s) | 4.85 | 0.27
x21 | Estimated VO2 max (mL/kg/min) | 50.82 | 3.98
x22 | Yoyo (m) | 1720 | 476
y | Injury frequency (n) | 0.79 | 0.72
*—qualitative variable, M (mean value), sd (standard deviation), Me (median), TBW (total body water), BFM (body fat mass), FFM (fat free mass), CMJ (countermovement jump), SJ (squat jump), LS (linear speed), y (years), kg (kilograms), cm (centimeters), L (liters), mL (milliliters), n (number), s (seconds), min (minutes), m (meters).
Table 2. Characterization of participants and injuries of CS Marítimo in the 2020/2021 season.
Characteristic | n (%)
No. of Players | 36
No. of Injured Players | 23
Total Injuries | 34
Average Days Missed Due to Injury | 14.3
Injury per Player | 0.9
Injury Mechanism |
  Traumatic | 18 (52.9%)
  Overload | 16 (47.1%)
Injury Severity * |
  Minimal (1–3 days) | 4 (11.7%)
  Mild (4–7 days) | 7 (20.5%)
  Moderate | 17 (50%)
  Severe (+28 days) | 6 (17.6%)
Injury Recurrence |
  Yes | 4 (11.8%)
  No | 30 (88.2%)
* Number of days missed by a player due to a sports injury contracted in training or match.
Table 3. Predictive errors for calculated models.
Model | Predictors | Predictive error | Parameters
OLS | x1, x2, x3, …, x23 | 18.57 | -
Ridge | x1, x2, x3, …, x23 | 0.698 | λ = 82.2
LASSO | x1, x2, x3, …, x23 | 0.737 | s = 0
Elastic net (EN) | x1, x3, x7, x12, x13, x15, x20 | 0.633 | λ = 0.1, s = 0.22
Forward (F) | x1, x12, x13, x15 | 0.618 | -
Ridge (EN) | x1, x3, x7, x12, x13, x15, x20 | 0.592 | λ = 17.5
Ridge (F) | x1, x12, x13, x15 | 0.591 | λ = 7
LASSO (EN) | x1, x3, x7, x12, x13, x15, x20 | 0.635 | s = 0.55
LASSO (F) | x1, x12, x13, x15 | 0.613 | s = 0.87
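Table 3 pairs each regularized model with a tuning parameter (λ or s) and a predictive error. A minimal sketch of how such a parameter could be chosen is given below, assuming leave-one-out cross-validation of the root-mean-square error, which this section does not confirm as the authors’ procedure. It uses a single predictor so that ridge regression has a closed form; the data, the λ grid, and all function names are hypothetical.

```python
import math


def ridge_fit(xs, ys, lam):
    """Fit y = a + b*x with an L2 penalty lam on the slope (intercept unpenalized)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / (sxx + lam)      # the penalty shrinks the slope toward zero
    a = mean_y - b * mean_x
    return a, b


def loocv_rmse(xs, ys, lam):
    """Leave-one-out cross-validated root-mean-square error for one value of lam."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = ridge_fit(train_x, train_y, lam)
        squared_errors.append((ys[i] - (a + b * xs[i])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data (not from the study): sit-and-reach scores in cm and
# injury counts for eight players.
sit_and_reach = [25.0, 28.5, 31.0, 33.5, 35.0, 38.0, 41.0, 44.5]
injuries = [2, 1, 1, 1, 0, 1, 0, 0]

# Hypothetical grid of penalties; the optimal model is the one with the lowest error.
lambdas = [0.0, 1.0, 10.0, 100.0, 1000.0]
errors = {lam: loocv_rmse(sit_and_reach, injuries, lam) for lam in lambdas}
best_lambda = min(errors, key=errors.get)

for lam in lambdas:
    print(f"lambda = {lam:7.1f}  LOOCV RMSE = {errors[lam]:.3f}")
print("selected lambda:", best_lambda)
```

With only a few dozen players, each held-out observation changes the fit noticeably, which is one way to see why a larger set of recorded injuries would stabilize the selected model.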
Table 4. Predictive errors for calculated models.
$y = 0.01 + 0.10\,x_{1} - 0.27\,x_{3} + 0.01\,x_{7} - 0.01\,x_{12} - 0.01\,x_{13} - 0.03\,x_{15} - 0.45\,x_{20}$
$y = -0.28 + 0.35\,x_{1} - 0.02\,x_{12} - 0.01\,x_{13} + 0.04\,x_{15}$
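As a worked illustration of how such a fitted equation would be used, the sketch below evaluates the four-predictor equation of Table 4 for a single player. It assumes that the garbled term in that row reads − 0.01·x13, that x1 is a 0/1 indicator for one sectorial position, and that the coefficients apply to raw (unstandardized) units; none of these is confirmed by the table as extracted. The function name and the input profile (the Table 1 means) are chosen here for illustration only, and the published coefficients are rounded, so the output is indicative rather than a reproduction of the authors’ predictions.

```python
def predict_injury_frequency(x1, sit_and_reach_cm, push_ups_n, handgrip_left_kg):
    """Evaluate y = -0.28 + 0.35*x1 - 0.02*x12 - 0.01*x13 + 0.04*x15.

    x1 is assumed to be a 0/1 sectorial-position indicator; x12, x13 and x15
    are sit and reach (cm), push-ups (n) and left handgrip (kg) as in Table 1.
    """
    return (-0.28
            + 0.35 * x1
            - 0.02 * sit_and_reach_cm
            - 0.01 * push_ups_n
            + 0.04 * handgrip_left_kg)


# Illustrative profile only: the Table 1 mean values for x12, x13 and x15.
profile = dict(sit_and_reach_cm=34.52, push_ups_n=43.63, handgrip_left_kg=48.92)

baseline = predict_injury_frequency(x1=0, **profile)
in_position = predict_injury_frequency(x1=1, **profile)

print(f"x1 = 0: {baseline:.2f} predicted injuries")
print(f"x1 = 1: {in_position:.2f} predicted injuries")
```

Under these assumptions, the mean profile gives roughly 0.55 predicted injuries with x1 = 0 and roughly 0.90 with x1 = 1, the same order of magnitude as the observed mean injury frequency of 0.79 in Table 1.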
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Martins, F.; Przednowek, K.; França, C.; Lopes, H.; de Maio Nascimento, M.; Sarmento, H.; Marques, A.; Ihle, A.; Henriques, R.; Gouveia, É.R. Predictive Modeling of Injury Risk Based on Body Composition and Selected Physical Fitness Tests for Elite Football Players. J. Clin. Med. 2022, 11, 4923.

