Machine Learning-Based Prediction of Muscle Injury Risk in Professional Football: A Four-Year Longitudinal Study
Abstract
1. Introduction
2. Materials and Methods
2.1. Participants
2.2. Sports Data
2.3. Predictive Modeling
2.4. Injury Data
2.5. Global Positioning System
2.6. Rate of Perceived Exertion
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Rice, S.M.; Purcell, R.; De Silva, S.; Mawren, D.; McGorry, P.D.; Parker, A.G. The mental health of elite athletes: A narrative systematic review. Sports Med. 2016, 46, 1333–1353.
2. MEDICA, E.M. Modeling of relationships between physical and technical activities and match outcome in elite German soccer players. J. Sports Med. Phys. Fit. 2019, 59, 752–759.
3. Ade, J.; Fitzpatrick, J.; Bradley, P.S. High-intensity efforts in elite soccer matches and associated movement patterns, technical skills and tactical actions. Information for position-specific training drills. J. Sports Sci. 2016, 34, 2205–2214.
4. Alexe, D.I.; Čaušević, D.; Čović, N.; Rani, B.; Tohănean, D.I.; Abazović, E.; Setiawan, E.; Alexe, C.I. The relationship between functional movement quality and speed, agility, and jump performance in elite female youth football players. Sports 2024, 12, 214.
5. Cross, M.R.; Lahti, J.; Brown, S.R.; Chedati, M.; Jimenez-Reyes, P.; Samozino, P.; Eriksrud, O.; Morin, J.-B. Training at maximal power in resisted sprinting: Optimal load determination methodology and pilot results in team sport athletes. PLoS ONE 2018, 13, e0195477.
6. Hammami, M.; Negra, Y.; Billaut, F.; Hermassi, S.; Shephard, R.J.; Chelly, M.S. Effects of lower-limb strength training on agility, repeated sprinting with changes of direction, leg peak power, and neuromuscular adaptations of soccer players. J. Strength Cond. Res. 2018, 32, 37–47.
7. Mor, H.; Mor, A.; Abdioğlu, M.; Tohănean, D.I.; Savu, C.V.; Acar, G.C.; Moraru, C.E.; Alexe, D.I. The Acute Effects of Caffeine Supplementation on Anaerobic Performance and Functional Strength in Female Soccer Players. Nutrients 2025, 17, 2156.
8. Hägglund, M.; Waldén, M.; Bahr, R.; Ekstrand, J. Methods for epidemiological study of injuries to professional football players: Developing the UEFA model. Br. J. Sports Med. 2005, 39, 340–346.
9. Raya-González, J.; de Ste Croix, M.; Read, P.; Castillo, D. A Longitudinal Investigation of muscle injuries in an elite Spanish male academy soccer club: A hamstring injuries approach. Appl. Sci. 2020, 10, 1610.
10. Yáñez, S.; Yáñez, C.; Martínez, M.; Núñez, M.; De la Fuente, C. Lesiones deportivas del plantel profesional de fútbol Santiago Wanderers durante las temporadas 2017, 2018 y 2019. Arch. Soc. Chil. Med. Del Deporte 2021, 66, 92–103.
11. Hägglund, M.; Waldén, M.; Magnusson, H.; Kristenson, K.; Bengtsson, H.; Ekstrand, J. Injuries affect team performance negatively in professional football: An 11-year follow-up of the UEFA Champions League injury study. Br. J. Sports Med. 2013, 47, 738–742.
12. Martins, F.; França, C.; Henriques, R.; Ihle, A.; Przednowek, K.; Marques, A.; Lopes, H.; Sarmento, H.; Gouveia, É.R. Body composition variations between injured and non-injured professional soccer players. Sci. Rep. 2022, 12, 20779.
13. Ekstrand, J.; Hägglund, M.; Waldén, M. Epidemiology of muscle injuries in professional football (soccer). Am. J. Sports Med. 2011, 39, 1226–1232.
14. Martins, F.; Marques, A.; França, C.; Sarmento, H.; Henriques, R.; Ihle, A.; de Maio Nascimento, M.; Saldanha, C.; Przednowek, K.; Gouveia, É.R. Weekly external load performance effects on sports injuries of male professional football players. Int. J. Environ. Res. Public Health 2023, 20, 1121.
15. Seshadri, D.R.; Li, R.T.; Voos, J.E.; Rowbottom, J.R.; Alfes, C.M.; Zorman, C.A.; Drummond, C.K. Wearable sensors for monitoring the internal and external workload of the athlete. NPJ Digit. Med. 2019, 2, 71.
16. Miguel, M.; Oliveira, R.; Loureiro, N.; García-Rubio, J.; Ibáñez, S.J. Load measures in training/match monitoring in soccer: A systematic review. Int. J. Environ. Res. Public Health 2021, 18, 2721.
17. Vallance, E.; Sutton-Charani, N.; Imoussaten, A.; Montmain, J.; Perrey, S. Combining internal-and external-training-loads to predict non-contact injuries in soccer. Appl. Sci. 2020, 10, 5261.
18. Rago, V.; Brito, J.; Figueiredo, P.; Costa, J.; Barreira, D.; Krustrup, P.; Rebelo, A. Methods to collect and interpret external training load using microtechnology incorporating GPS in professional football: A systematic review. Res. Sports Med. 2020, 28, 437–458.
19. Sobolewski, E.J. The relationships between internal and external load measures for division I college football practice. Sports 2020, 8, 165.
20. Enes, A.; Oneda, G.; Alves, D.L.; Palumbo, D.d.P.; Cruz, R.; Moiano Junior, J.V.; Novack, L.F.; Osiecki, R. Determinant factors of the match-based internal load in elite soccer players. Res. Q. Exerc. Sport 2021, 92, 63–70.
21. Askow, A.T.; Lobato, A.L.; Arndts, D.J.; Jennings, W.; Kreutzer, A.; Erickson, J.L.; Esposito, P.E.; Oliver, J.M.; Foster, C.; Jagim, A.R. Session rating of perceived exertion (sRPE) load and training impulse are strongly correlated to GPS-derived measures of external load in NCAA division I women’s soccer athletes. J. Funct. Morphol. Kinesiol. 2021, 6, 90.
22. Naidu, S.A.; Fanchini, M.; Cox, A.; Smeaton, J.; Hopkins, W.G.; Serpiello, F.R. Validity of session rating of perceived exertion assessed via the CR100 scale to track internal load in elite youth football players. Int. J. Sports Physiol. Perform. 2019, 14, 403–406.
23. de Dios-Álvarez, V.; Suárez-Iglesias, D.; Bouzas-Rico, S.; Alkain, P.; González-Conde, A.; Ayán-Pérez, C. Relationships between RPE-derived internal training load parameters and GPS-based external training load variables in elite young soccer players. Res. Sports Med. 2023, 31, 58–73.
24. Piłka, T.; Grzelak, B.; Sadurska, A.; Górecki, T.; Dyczkowski, K. Predicting injuries in football based on data collected from GPS-based wearable sensors. Sensors 2023, 23, 1227.
25. Saberisani, R.; Barati, A.H.; Zarei, M.; Santos, P.; Gorouhi, A.; Ardigò, L.P.; Nobari, H. Prediction of football injuries using GPS-based data in Iranian professional football players: A machine learning approach. Front. Sports Act. Living 2025, 7, 1425180.
26. Martins, F.; Przednowek, K.; Santos, F.; França, C.; Martinho, D.; Gouveia, É.R.; Marques, A.; Sarmento, H. Predictive models of injury risk in male professional football players: A systematic review. Inj. Prev. 2025, 31, 177–190.
27. Leckey, C.; Van Dyk, N.; Doherty, C.; Lawlor, A.; Delahunt, E. Machine learning approaches to injury risk prediction in sport: A scoping review with evidence synthesis. Br. J. Sports Med. 2025, 59, 491–500.
28. Majumdar, A.; Bakirov, R.; Hodges, D.; McCullagh, S.; Rees, T. A multi-season machine learning approach to examine the training load and injury relationship in professional soccer. J. Sports Anal. 2024, 10, 47–65.
29. Colby, M.J.; Dawson, B.; Peeling, P.; Heasman, J.; Rogalski, B.; Drew, M.K.; Stares, J.; Zouhal, H.; Lester, L. Multivariate modelling of subjective and objective monitoring data improve the detection of non-contact injury risk in elite Australian footballers. J. Sci. Med. Sport 2017, 20, 1068–1074.
30. Čaušević, D.; Rani, B.; Gasibat, Q.; Čović, N.; Alexe, C.I.; Pavel, S.I.; Burchel, L.O.; Alexe, D.I. Maturity-Related variations in morphology, body composition, and somatotype features among young male football players. Children 2023, 10, 721.
31. Van Eetvelde, H.; Mendonça, L.D.; Ley, C.; Seil, R.; Tischer, T. Machine learning methods in sport injury prediction and prevention: A systematic review. J. Exp. Orthop. 2021, 8, 27.
32. Majumdar, A.; Bakirov, R.; Hodges, D.; Scott, S.; Rees, T. Machine learning for understanding and predicting injuries in football. Sports Med.–Open 2022, 8, 73.
33. Jauhiainen, S.; Kauppi, J.-P.; Krosshaug, T.; Bahr, R.; Bartsch, J.; Äyrämö, S. Predicting ACL injury using machine learning on data from an extensive screening test battery of 880 female elite athletes. Am. J. Sports Med. 2022, 50, 2917–2924.
34. Oliver, J.L.; Ayala, F.; Croix, M.B.D.S.; Lloyd, R.S.; Myer, G.D.; Read, P.J. Using machine learning to improve our understanding of injury risk and prediction in elite male youth football players. J. Sci. Med. Sport 2020, 23, 1044–1048.
35. Cohen, J. Statistical Power Analysis for the Behavioral Sciences; Routledge: Oxfordshire, UK, 2013.
36. Frank, E.; Hall, M.A.; Witten, I.H. The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”; The University of Waikato: Hamilton, New Zealand, 2016.
37. Aha, D.W.; Kibler, D.; Albert, M.K. Instance-based learning algorithms. Mach. Learn. 1991, 6, 37–66.
38. Cleary, J.G.; Trigg, L.E. K*: An instance-based learner using an entropic distance measure. In Machine Learning Proceedings 1995; Elsevier: Amsterdam, The Netherlands, 1995; pp. 108–114.
39. Landwehr, N.; Hall, M.; Frank, E. Logistic model trees. Mach. Learn. 2005, 59, 161–205.
40. Cessie, S.L.; Houwelingen, J.V. Ridge estimators in logistic regression. J. R. Stat. Soc. Ser. C Appl. Stat. 1992, 41, 191–201.
41. Frank, E. Fully Supervised Training of Gaussian Radial Basis Function Networks in WEKA; The University of Waikato: Hamilton, New Zealand, 2014.
42. Platt, J. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines; Microsoft: Redmond, WA, USA, 1998.
43. Foster, C.; Florhaug, J.A.; Franklin, J.; Gottschall, L.; Hrovatin, L.A.; Parker, S.; Doleshal, P.; Dodge, C. A new approach to monitoring exercise training. J. Strength Cond. Res. 2001, 15, 109–115.
44. Impellizzeri, F.M.; Rampinini, E.; Coutts, A.J.; Sassi, A.; Marcora, S.M. Use of RPE-based training load in soccer. Med. Sci. Sports Exerc. 2004, 36, 1042–1047.
45. Bowen, L.; Gross, A.S.; Gimpel, M.; Bruce-Low, S.; Li, F.X. Spikes in acute:chronic workload ratio (ACWR) associated with a 5–7 times greater injury rate in English Premier League football players: A comprehensive 3-year study. Br. J. Sports Med. 2020, 54, 731–738.
46. Drew, M.K.; Finch, C.F. The relationship between training load and injury, illness and soreness: A systematic and literature review. Sports Med. 2016, 46, 861–883.
47. Ekstrand, J. Epidemiology of football injuries. Sci. Sports 2008, 23, 73–77.
48. Ekstrand, J.; Hägglund, M.; Waldén, M. Injury incidence and injury patterns in professional football: The UEFA injury study. Br. J. Sports Med. 2011, 45, 553–558.
49. Pfirrmann, D.; Herbst, M.; Ingelfinger, P.; Simon, P.; Tug, S. Analysis of injury incidences in male professional adult and elite youth soccer players: A systematic review. J. Athl. Train. 2016, 51, 410–424.
50. Enright, K.; Green, M.; Hay, G.; Malone, J.J. Workload and injury in professional soccer players: Role of injury tissue type and injury severity. Int. J. Sports Med. 2020, 41, 89–97.
51. Nobari, H.; Mainer-Pardos, E.; Denche Zamorano, A.; Bowman, T.G.; Clemente, F.M.; Pérez-Gómez, J. Sprint variables are associated with the odds ratios of non-contact injuries in professional soccer players. Int. J. Environ. Res. Public Health 2021, 18, 10417.
52. Suarez-Arrones, L.; De Alba, B.; Röll, M.; Torreno, I.; Strütt, S.; Freyler, K.; Ritzmann, R. Player monitoring in professional soccer: Spikes in acute:chronic workload are dissociated from injury occurrence. Front. Sports Act. Living 2020, 2, 75.
53. Malone, J.J.; Barrett, S.; Barnes, C.; Twist, C.; Drust, B. To infinity and beyond: The use of GPS devices within the football codes. Sci. Med. Footb. 2020, 4, 82–84.
54. Malone, J.J.; Lovell, R.; Varley, M.C.; Coutts, A.J. Unpacking the black box: Applications and considerations for using GPS devices in sport. Int. J. Sports Physiol. Perform. 2017, 12, S2-18–S2-26.
55. Georgiadis, G.; van Gaal Appelhof, R.; Stoop, R.; Peters, J.; Essers, J. Association between external load and injury incidence in professional and elite-youth football players. Phys. Educ. Sports Stud. Res. 2024, 3, 10–25.
56. Oliva-Lozano, J.M.; Gómez-Carmona, C.D.; Pino-Ortega, J.; Moreno-Pérez, V.; Rodríguez-Pérez, M.A. Match and training high intensity activity-demands profile during a competitive mesocycle in youth elite soccer players. J. Hum. Kinet. 2020, 75, 195.
57. Harper, D.J.; Carling, C.; Kiely, J. High-intensity acceleration and deceleration demands in elite team sports competitive match play: A systematic review and meta-analysis of observational studies. Sports Med. 2019, 49, 1923–1947.
58. Romero-Moraleda, B.; González-García, J.; Morencos, E.; Giráldez-Costas, V.; Moya, J.; Ramirez-Campillo, R. Internal workload in elite female football players during the whole in-season: Starters vs non-starters. Biol. Sport 2023, 40, 1107–1115.
59. Teixeira, J.E.; Forte, P.; Ferraz, R.; Leal, M.; Ribeiro, J.; Silva, A.J.; Barbosa, T.M.; Monteiro, A.M. Monitoring accumulated training and match load in football: A systematic review. Int. J. Environ. Res. Public Health 2021, 18, 3906.
60. Gaudino, P.; Iaia, F.M.; Strudwick, A.J.; Hawkins, R.D.; Alberti, G.; Atkinson, G.; Gregson, W. Factors influencing perception of effort (session rating of perceived exertion) during elite soccer training. Int. J. Sports Physiol. Perform. 2015, 10, 860–864.
61. Bartlett, J.D.; O’Connor, F.; Pitchford, N.; Torres-Ronda, L.; Robertson, S.J. Relationships between internal and external training load in team-sport athletes: Evidence for an individualized approach. Int. J. Sports Physiol. Perform. 2017, 12, 230–234.
62. Rossi, A.; Pappalardo, L.; Cintia, P.; Iaia, F.M.; Fernández, J.; Medina, D. Effective injury forecasting in soccer with GPS training data and machine learning. PLoS ONE 2018, 13, e0201264.
63. Ehrmann, F.E.; Duncan, C.S.; Sindhusake, D.; Franzsen, W.N.; Greene, D.A. GPS and injury prevention in professional soccer. J. Strength Cond. Res. 2016, 30, 360–367.
64. Jaspers, A.; Kuyvenhoven, J.P.; Staes, F.; Frencken, W.G.; Helsen, W.F.; Brink, M.S. Examination of the external and internal load indicators’ association with overuse injuries in professional soccer players. J. Sci. Med. Sport 2018, 21, 579–585.
65. Tao, H.; Deng, Y.; Xiang, Y.; Liu, L. Performance of long short-term memory networks in predicting athlete injury risk. J. Comput. Methods Sci. Eng. 2024, 24, 3155–3171.
66. Guan, L. Intelligent rehabilitation assistant: Application of deep learning methods in sports injury recovery. Mol. Cell. Biomech. 2024, 21, 384.
67. Ye, X.; Huang, Y.; Bai, Z.; Wang, Y. A novel approach for sports injury risk prediction: Based on time-series image encoding and deep learning. Front. Physiol. 2023, 14, 1174525.
68. Ali, S.; Akhlaq, F.; Imran, A.S.; Kastrati, Z.; Daudpota, S.M.; Moosa, M. The enlightening role of explainable artificial intelligence in medical & healthcare domains: A systematic literature review. Comput. Biol. Med. 2023, 166, 107555.
69. Hu, C.; Tan, Q.; Zhang, Q.; Li, Y.; Wang, F.; Zou, X.; Peng, Z. Application of interpretable machine learning for early prediction of prognosis in acute kidney injury. Comput. Struct. Biotechnol. J. 2022, 20, 2861–2870.
70. Pietsch, S.; Pizzari, T. Risk factors for quadriceps muscle strain injuries in sport: A systematic review. J. Orthop. Sports Phys. Ther. 2022, 52, 389–400.
71. Hägglund, M.; Waldén, M.; Ekstrand, J. Previous injury as a risk factor for injury in elite football: A prospective study over two consecutive seasons. Br. J. Sports Med. 2006, 40, 767–772.
| Variable | Description | Average Season (M ± SD) | 2 Weeks Before Injury (M ± SD) | 4 Weeks Before Injury (M ± SD) |
|---|---|---|---|---|
| x1 | Age (y) | 26.22 ± 4.19 | 26.22 ± 4.19 | 26.22 ± 4.19 |
| x2 | Experience (y) | 8.16 ± 4.18 | 8.16 ± 4.18 | 8.16 ± 4.18 |
| x3 | WT (min) | 73.17 ± 6.14 | 74.55 ± 9.18 | 74.71 ± 8.32 |
| x4 | TD (m) | 4933.09 ± 559.41 | 5222.95 ± 1083.50 | 5223.06 ± 1030.41 |
| x5 | Z4 (m) | 528.60 ± 94.70 | 569.69 ± 170.32 | 565.50 ± 149.36 |
| x6 | Z5 (m) | 311.68 ± 68.69 | 324.11 ± 93.35 | 321.31 ± 85.85 |
| x7 | Z6 (m) | 57.54 ± 21.52 | 57.10 ± 26.25 | 57.22 ± 24.63 |
| x8 | HSR (m) | 367.89 ± 84.78 | 380.39 ± 110.47 | 374.34 ± 99.16 |
| x9 | ACC (m) | 42.03 ± 12.92 | 43.57 ± 14.60 | 42.96 ± 13.43 |
| x10 | DEC (m) | 42.46 ± 11.72 | 43.72 ± 14.76 | 42.79 ± 13.06 |
| x11 | PL (AU) | 221.41 ± 173.41 | 222.48 ± 179.26 | 224.00 ± 181.42 |
| x12 | M/min (m/min) | 75.32 ± 9.01 | 76.06 ± 12.33 | 75.86 ± 11.83 |
| x13 | MaxVel (km/h) | 27.07 ± 1.35 | 27.20 ± 1.72 | 27.35 ± 2.07 |
| x14 | RPE | 4.91 ± 0.86 | 4.99 ± 0.96 | 4.97 ± 0.90 |
| y | Injury Occurrence * | NI (69), I (27) | NI (69), I (59) | NI (69), I (59) |
| Variable | Group | Season Average: M | Season Average: SD | Season Average: Mdif | Season Average: padj | Season Average: r | Season Average: rmag | 4 Weeks Before Injury Occurrence: M | 4 Weeks Before Injury Occurrence: SD | 4 Weeks Before Injury Occurrence: Mdif | 4 Weeks Before Injury Occurrence: padj | 4 Weeks Before Injury Occurrence: r | 4 Weeks Before Injury Occurrence: rmag | 2 Weeks Before Injury Occurrence: M | 2 Weeks Before Injury Occurrence: SD | 2 Weeks Before Injury Occurrence: Mdif | 2 Weeks Before Injury Occurrence: padj | 2 Weeks Before Injury Occurrence: r | 2 Weeks Before Injury Occurrence: rmag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age (y) | NI | 25.67 | 3.79 | 1.96 | >0.05 | 0.20 | Small | 25.67 | 3.79 | 1.96 | >0.05 | 0.20 | Small | 25.67 | 3.79 | 1.96 | >0.05 | 0.20 | Small |
| | I | 27.63 | 4.87 | | | | | 27.63 | 4.87 | | | | | 27.63 | 4.87 | | | | |
| Experience (y) | NI | 7.57 | 3.76 | 2.10 | >0.05 | 0.22 | Small | 7.57 | 3.76 | 2.10 | >0.05 | 0.22 | Small | 7.57 | 3.76 | 2.10 | >0.05 | 0.22 | Small |
| | I | 9.67 | 4.85 | | | | | 9.67 | 4.85 | | | | | 9.67 | 4.85 | | | | |
| WT (min) | NI | 73.50 | 5.82 | 1.19 | >0.05 | 0.16 | Small | 73.50 | 5.82 | 2.62 | >0.05 | 0.10 | Small | 73.50 | 5.82 | 2.27 | >0.05 | 0.10 | Small |
| | I | 72.31 | 6.91 | | | | | 76.16 | 10.40 | | | | | 75.77 | 11.91 | | | | |
| TD (m) | NI | 4981.17 | 549.08 | 170.98 | >0.05 | 0.15 | Small | 4981.17 | 549.08 | 524.77 | >0.05 | 0.23 | Small | 4981.17 | 549.08 | 524.53 | >0.05 | 0.18 | Small |
| | I | 4810.19 | 577.16 | | | | | 5505.94 | 1349.22 | | | | | 5505.70 | 1437.45 | | | | |
| Z4 (m) | NI | 530.93 | 89.19 | 8.28 | >0.05 | 0.04 | Negligible | 530.93 | 89.19 | 75.00 | >0.05 | 0.24 | Small | 530.93 | 89.19 | 84.08 | >0.05 | 0.23 | Small |
| | I | 522.64 | 109.14 | | | | | 605.93 | 190.89 | | | | | 615.01 | 224.31 | | | | |
| Z5 (m) | NI | 315.64 | 71.45 | 14.07 | >0.05 | 0.05 | Negligible | 315.64 | 71.45 | 12.30 | >0.05 | 0.09 | Negligible | 315.64 | 71.45 | 18.37 | >0.05 | 0.10 | Small |
| | I | 301.57 | 61.15 | | | | | 327.94 | 100.35 | | | | | 334.01 | 113.63 | | | | |
| Z6 (m) | NI | 58.79 | 22.27 | 4.43 | >0.05 | 0.09 | Negligible | 58.79 | 22.27 | 3.41 | >0.05 | 0.09 | Negligible | 58.79 | 22.27 | 3.66 | >0.05 | 0.11 | Small |
| | I | 54.36 | 19.50 | | | | | 55.38 | 27.21 | | | | | 55.13 | 30.33 | | | | |
| HSR (m) | NI | 373.90 | 88.17 | 21.38 | >0.05 | 0.09 | Negligible | 373.90 | 88.17 | 0.95 | >0.05 | 0.01 | Negligible | 373.90 | 88.17 | 14.08 | >0.05 | 0.07 | Negligible |
| | I | 352.52 | 74.92 | | | | | 374.85 | 111.48 | | | | | 387.98 | 132.33 | | | | |
| ACC (m/s²) | NI | 42.94 | 13.14 | 3.21 | >0.05 | 0.08 | Negligible | 42.94 | 13.14 | 0.05 | >0.05 | 0.01 | Negligible | 42.94 | 13.14 | 1.37 | >0.05 | 0.05 | Negligible |
| | I | 39.73 | 12.25 | | | | | 42.99 | 13.87 | | | | | 44.31 | 16.23 | | | | |
| DEC (m/s²) | NI | 43.93 | 11.45 | 5.23 | >0.05 | 0.20 | Small | 43.93 | 11.45 | 2.47 | >0.05 | 0.11 | Small | 43.93 | 11.45 | 0.46 | >0.05 | 0.06 | Negligible |
| | I | 38.70 | 11.78 | | | | | 41.46 | 14.71 | | | | | 43.47 | 17.98 | | | | |
| PL (AU) | NI | 224.33 | 172.26 | 10.38 | >0.05 | 0.07 | Negligible | 224.33 | 172.26 | 0.72 | >0.05 | 0.05 | Negligible | 224.33 | 172.26 | 4.02 | >0.05 | 0.05 | Negligible |
| | I | 213.95 | 179.41 | | | | | 223.61 | 193.07 | | | | | 220.31 | 188.58 | | | | |
| M/min (m/min) | NI | 75.24 | 8.75 | 0.28 | >0.05 | 0.03 | Negligible | 75.24 | 8.75 | 1.35 | >0.05 | 0.10 | Small | 75.24 | 8.75 | 1.78 | >0.05 | 0.08 | Negligible |
| | I | 75.52 | 9.82 | | | | | 76.59 | 14.69 | | | | | 77.02 | 15.53 | | | | |
| MaxVel (km/h) | NI | 27.17 | 1.36 | 0.35 | >0.05 | 0.12 | Small | 27.17 | 1.36 | 0.39 | >0.05 | 0.06 | Negligible | 27.17 | 1.36 | 0.05 | >0.05 | 0.08 | Negligible |
| | I | 26.82 | 1.31 | | | | | 27.56 | 2.68 | | | | | 27.22 | 2.08 | | | | |
| RPE | NI | 5.01 | 0.87 | 0.35 | >0.05 | 0.19 | Small | 5.01 | 0.87 | 0.09 | >0.05 | 0.05 | Negligible | 5.01 | 0.87 | 0.03 | >0.05 | 0.02 | Negligible |
| | I | 4.66 | 0.79 | | | | | 4.92 | 0.94 | | | | | 4.98 | 1.06 | | | | |
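The magnitude labels attached to the effect size r in the table above can be reproduced with a simple threshold rule. The sketch below is an illustration only: the 0.10 boundary between "Negligible" and "Small" is inferred from the table rows (0.09 is labelled Negligible, 0.10 Small), while the 0.30 and 0.50 boundaries are assumed from Cohen's conventional benchmarks and do not appear in the table. The function name `effect_size_magnitude` is hypothetical.

```python
def effect_size_magnitude(r: float) -> str:
    """Label an effect size r by magnitude.

    Thresholds: 0.10 is inferred from the table; 0.30 and 0.50 are
    assumed from Cohen's conventional benchmarks.
    """
    r = abs(r)
    if r < 0.10:
        return "Negligible"
    if r < 0.30:
        return "Small"
    if r < 0.50:
        return "Medium"
    return "Large"


# Values taken from the table: Age, Z5 (season average), Z5 (2 weeks before).
for name, r in [("Age", 0.20), ("Z5, season average", 0.05), ("Z5, 2 weeks", 0.10)]:
    print(f"{name}: r = {r:.2f} -> {effect_size_magnitude(r)}")
```

Under these assumed thresholds, every r value reported in the table falls in the Negligible or Small band.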
| Algorithm | Description |
|---|---|
| IBk | IBk is an implementation of the k-nearest neighbors (KNN) classifier that uses a distance function. By default, k = 1, meaning it considers only one nearest neighbor. The number of neighbors can be set manually (-K) or automatically through leave-one-out cross-validation (-X). If more neighbors are selected, their predictions can be weighted based on their distance to the test instance. There are two different formulas for calculating the weight from the distance (-D and -F). The time required to classify a test instance with the nearest neighbor classifier increases linearly with the number of training instances [37]. |
| K-Star | K-Star is a type of instance-based classifier where the class of a test instance is determined by the classes of similar training instances, identified through a similarity function. Unlike other instance-based classifiers, K-Star employs an entropy-based distance function [38]. |
| Simple Logistic | Simple Logistic is a classification algorithm used for predictive modeling when the target variable is categorical. It is particularly useful for binary classification problems, where the output is either one class or another (0 or 1). A threshold is set to determine the class assignment based on the predicted value [39]. |
| Logistic Classifier | Logistic Classifier is a modified logistic regression method. Unlike the classical method, this model handles instance weights. The class implementing this model uses a ridge estimator [40]. |
| MLP Classifier | MLP Classifier is an artificial neural network with one hidden layer. Training minimizes a loss function with a quadratic penalty using the BFGS method, with all attributes standardized. Key parameters include the ridge parameter for the weight penalty and the number of hidden units, which affects training time. The default activation function for the hidden layer is an approximate logistic function, while the output layer uses the sigmoid function for classification. Nominal attributes are converted to binary, and missing values are globally replaced [36]. |
| Random Tree | Random Tree is a technique for building tree-based classifiers, used in supervised learning for both classification and regression tasks. By employing multiple classification or regression trees and incorporating randomness, this method generates predictions that are highly resilient to new data [41]. |
| RBF Classifier | Radial basis function networks are a type of feedforward network. An RBF (Radial Basis Function) network is characterized by the hidden neuron implementing a function, called the basis function, which varies radially around a selected center [41]. |
| SMO | Sequential Minimal Optimization is a simple algorithm that can quickly solve the Support Vector Machines quadratic programming (QP) problem without any extra matrix storage. SMO decomposes the overall QP problem into QP sub-problems, using Osuna’s theorem to ensure convergence [42]. |
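To make the IBk description above concrete, the following is a minimal pure-Python sketch of a k-nearest-neighbor vote with optional inverse-distance weighting. It is not the WEKA implementation used in the study: the function `knn_predict`, the Euclidean distance, the 1/d weighting formula, and the toy feature values are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def knn_predict(train, query, k=1, weighted=False):
    """Predict a label for `query` from its k nearest training instances.

    train: list of (feature_tuple, label) pairs.
    weighted: if True, each neighbor votes with weight 1/distance
              (an assumed weighting formula, for illustration only).
    """
    # Every training instance is scanned, so cost grows linearly
    # with the size of the training set.
    neighbors = sorted(
        ((math.dist(features, query), label) for features, label in train),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Hypothetical standardized features; "I" = injured, "NI" = non-injured.
train = [
    ((0.0, 0.0), "NI"),
    ((0.2, 0.1), "NI"),
    ((0.4, 0.4), "NI"),
    ((1.0, 1.0), "I"),
    ((1.1, 0.9), "I"),
]
print(knn_predict(train, (0.9, 1.0), k=1))
print(knn_predict(train, (0.1, 0.0), k=3, weighted=True))
```

With k = 1 the prediction is simply the label of the closest instance; with larger k and weighting enabled, closer neighbors contribute more to the vote.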
| Metric | Unit | Zone |
|---|---|---|
| Work time | Minutes | |
| Total distance | Meters | |
| Accelerations | | Above 10.8 km/h |
| Decelerations | | Above 10.8 km/h |
| High-speed running | Meters | Above 18 km/h |
| High-speed zone 4 | Meters | 14–18 km/h |
| High-speed zone 5 | Meters | 18–24 km/h |
| High-speed zone 6 | Meters | 24–39 km/h |
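As an illustration of how the zone thresholds in the table above could translate into distance metrics, the sketch below sums the distance covered per zone from a series of speed samples. The sampling interval `dt_s`, the half-open zone boundaries, and the function name `zone_distances` are assumptions; the source does not describe how the GPS software performs this step.

```python
# Zone limits in km/h, taken from the table above.
ZONES_KMH = {"Z4": (14.0, 18.0), "Z5": (18.0, 24.0), "Z6": (24.0, 39.0)}
HSR_THRESHOLD_KMH = 18.0  # high-speed running: above 18 km/h


def zone_distances(speeds_kmh, dt_s=0.1):
    """Sum distance (m) per zone from speed samples taken every dt_s seconds.

    Assumption: a sample with lower <= speed < upper belongs to that zone.
    """
    totals = {name: 0.0 for name in ZONES_KMH}
    totals["HSR"] = 0.0
    totals["TD"] = 0.0
    for v in speeds_kmh:
        step_m = v / 3.6 * dt_s  # km/h -> m/s, times the sample duration
        totals["TD"] += step_m
        if v > HSR_THRESHOLD_KMH:
            totals["HSR"] += step_m
        for name, (lower, upper) in ZONES_KMH.items():
            if lower <= v < upper:
                totals[name] += step_m
    return totals


# Four hypothetical one-second samples (km/h).
print(zone_distances([10.8, 16.2, 21.6, 28.8], dt_s=1.0))
```

In this toy example the high-speed running distance equals the sum of zones 5 and 6, which is the relationship the thresholds in the table imply.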
| Sportive Season | 2020/2021 | 2021/2022 | 2022/2023 | 2023/2024 | Total |
|---|---|---|---|---|---|
| Club (No.) | 21 | 23 | 26 | 26 | 96 |
| Injured Players | 12 | 13 | 14 | 10 | 49 |
| Injury Frequency | 21 | 22 | 39 | 15 | 97 |
| Injuries per player (Av.) | 1 | 0.9 | 1.5 | 0.6 | 1 |
| Injury Occurrence (No.) | |||||
| Training | 13 | 15 | 25 | 11 | 64 |
| Match | 9 | 6 | 14 | 4 | 33 |
| Injury Exposure (h) | |||||
| Training | 4011 | 5245 | 5204 | 6880 | 21340 |
| Match | 440 | 619 | 970 | 650 | 2679 |
| Injury Incidence (per 1000 h) | |||||
| Training | 4.0 (1.72–5.54) | 2.9 (1.60–4.72) | 4.8 (3.11–7.18) | 1.6 (0.80–2.86) | 3.0 (2.33–3.88) |
| Match | 20.5 (9.35–38.83) | 9.7 (3.55–21.71) | 14.4 (7.89–24.22) | 6.2 (1.67–15.75) | 12.3 (8.57–17.30) |
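The incidence figures in the table above are expressed per 1000 hours of exposure, that is, the number of injuries divided by exposure hours and multiplied by 1000. The short sketch below recomputes the four-season totals from the counts and exposure hours in the table. The function name `incidence_per_1000h` is hypothetical, and the confidence intervals are not reproduced because their calculation method is not stated here.

```python
def incidence_per_1000h(injuries: int, exposure_h: float) -> float:
    """Injury incidence expressed per 1000 hours of exposure."""
    return injuries / exposure_h * 1000.0


# Four-season totals from the table: (injury count, exposure hours).
totals = {"Training": (64, 21340), "Match": (33, 2679)}
for context, (n_injuries, hours) in totals.items():
    rate = incidence_per_1000h(n_injuries, hours)
    print(f"{context}: {rate:.1f} injuries per 1000 h")
# Training: 3.0 injuries per 1000 h
# Match: 12.3 injuries per 1000 h
```

Both values agree with the Total column of the table, and the match incidence is roughly four times the training incidence.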
| Method | Dataset | Accuracy | Balanced Accuracy | Sensitivity | Specificity | Precision | F1-Score | AUC-ROC | AUC-PR |
|---|---|---|---|---|---|---|---|---|---|
| Logistic | 2 Weeks Before Injury | 55% | 57% | 76% | 38% | 51% | 61% | 67% | 73% |
| | 4 Weeks Before Injury | 61% | 62% | 77% | 46% | 55% | 65% | 71% | 75% |
| | Average Season | 62% | 62% | 61% | 62% | 39% | 47% | 63% | 56% |
| MLP | 2 Weeks Before Injury | 72% | 72% | 75% | 69% | 67% | 71% | 79% | 80% |
| | 4 Weeks Before Injury | 68% | 68% | 73% | 63% | 63% | 68% | 75% | 78% |
| | Average Season | 60% | 57% | 51% | 63% | 35% | 42% | 59% | 53% |
| RBF | 2 Weeks Before Injury | 64% | 66% | 88% | 43% | 57% | 69% | 77% | 78% |
| | 4 Weeks Before Injury | 56% | 58% | 84% | 33% | 51% | 64% | 68% | 72% |
| | Average Season | 58% | 56% | 51% | 61% | 34% | 41% | 59% | 53% |
| SMO | 2 Weeks Before Injury | 46% | 49% | 98% | 1% | 46% | 62% | 49% | 46% |
| | 4 Weeks Before Injury | 46% | 49% | 96% | 2% | 46% | 62% | 49% | 46% |
| | Average Season | 62% | 64% | 67% | 60% | 40% | 50% | 63% | 40% |
| Simple Logistic | 2 Weeks Before Injury | 53% | 55% | 81% | 29% | 49% | 61% | 65% | 70% |
| | 4 Weeks Before Injury | 59% | 60% | 80% | 40% | 53% | 64% | 71% | 76% |
| | Average Season | 61% | 62% | 63% | 60% | 38% | 48% | 64% | 57% |
| IBk | 2 Weeks Before Injury | 70% | 70% | 69% | 71% | 67% | 68% | 70% | 64% |
| | 4 Weeks Before Injury | 67% | 67% | 66% | 68% | 64% | 65% | 67% | 62% |
| | Average Season | 46% | 42% | 33% | 51% | 21% | 25% | 42% | 29% |
| K-Star | 2 Weeks Before Injury | 70% | 70% | 67% | 73% | 68% | 67% | 79% | 81% |
| | 4 Weeks Before Injury | 73% | 72% | 69% | 76% | 71% | 70% | 81% | 83% |
| | Average Season | 52% | 50% | 46% | 54% | 28% | 35% | 48% | 39% |
| Random Tree | 2 Weeks Before Injury | 67% | 68% | 74% | 61% | 62% | 68% | 67% | 60% |
| | 4 Weeks Before Injury | 68% | 69% | 77% | 61% | 63% | 69% | 69% | 61% |
| | Average Season | 53% | 46% | 33% | 60% | 24% | 28% | 46% | 31% |
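The threshold-based columns of the table above (accuracy, balanced accuracy, sensitivity, specificity, precision, F1-score) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions, treating "injured" as the positive class. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study's confusion matrices.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall on the injured class
    specificity = tn / (tn + fp)  # recall on the non-injured class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts for 59 injured and 69 non-injured instances.
metrics = classification_metrics(tp=44, fn=15, tn=48, fp=21)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Balanced accuracy is the mean of sensitivity and specificity, which is why a classifier that labels almost every instance as injured scores near chance on it: the SMO rows with 98% sensitivity and 1% specificity give (98% + 1%) / 2, close to the reported 49%.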
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Martins, F.; Sarmento, H.; Gouveia, É.R.; Saveca, P.; Przednowek, K. Machine Learning-Based Prediction of Muscle Injury Risk in Professional Football: A Four-Year Longitudinal Study. J. Clin. Med. 2025, 14, 8039. https://doi.org/10.3390/jcm14228039

