Football Match Line-Up Prediction Based on Physiological Variables: A Machine Learning Approach
Abstract
1. Introduction
2. Related Work
3. Methodology
3.1. Business Understanding
3.2. Data Understanding
3.3. Data Preparation
3.4. Modelling
3.4.1. Selection of the Best Set of Variables
3.4.2. Predicting the Starting Line-Up and Choosing the Better-Prepared Players
3.5. Evaluation
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Yang, Y. Evaluation Model of Soccer Training Technology Based on Artificial Intelligence. J. Phys. Conf. Ser. 2020, 1648.
- Maanijou, R.; Mirroshandel, S.A. Introducing an Expert System for Prediction of Soccer Player Ranking Using Ensemble Learning. Neural Comput. Appl. 2019, 31, 9157–9174.
- Kusmakar, S.; Shelyag, S.; Zhu, Y.; Dwyer, D.; Gastin, P.; Angelova, M. Machine Learning Enabled Team Performance Analysis in the Dynamical Environment of Soccer. IEEE Access 2020, 8, 90266–90279.
- Baboota, R.; Kaur, H. Predictive Analysis and Modelling Football Results Using Machine Learning Approach for English Premier League. Int. J. Forecast. 2019, 35, 741–755.
- Oliver, J.L.; Ayala, F.; de Ste Croix, M.B.A.; Lloyd, R.S.; Myer, G.D.; Read, P.J. Using Machine Learning to Improve Our Understanding of Injury Risk and Prediction in Elite Male Youth Football Players. J. Sci. Med. Sport 2020, 23, 1044–1048.
- Knauf, K.; Memmert, D.; Brefeld, U. Spatio-Temporal Convolution Kernels. Mach. Learn. 2016, 102, 247–273.
- Catapult Innovations Playertek. Available online: https://www.playertek.com/gb (accessed on 9 January 2022).
- Cortez, A.; Trigo, A.; Loureiro, N. Predicting Physiological Variables of Players That Make a Winning Football Team: A Machine Learning Approach. In Proceedings of the Computational Science and Its Applications—ICCSA 2021: 21st International Conference, Cagliari, Italy, 13–16 September 2021; pp. 3–15.
- López-Valenciano, A.; Ayala, F.; Puerta, J.M.; de Ste Croix, M.B.A.; Vera-Garcia, F.J.; Hernández-Sánchez, S.; Ruiz-Pérez, I.; Myer, G.D. A Preventive Model for Muscle Injuries: A Novel Approach Based on Learning Algorithms. Med. Sci. Sports Exerc. 2018, 50, 915–927.
- Vallance, E.; Sutton-Charani, N.; Imoussaten, A.; Montmain, J.; Perrey, S. Combining Internal- and External-Training-Loads to Predict Non-Contact Injuries in Soccer. Appl. Sci. 2020, 10, 5261.
- García-Aliaga, A.; Marquina, M.; Coterón, J.; Rodríguez-González, A.; Luengo-Sánchez, S. In-Game Behaviour Analysis of Football Players Using Machine Learning Techniques Based on Player Statistics. Int. J. Sports Sci. Coach. 2020, 16, 148–157.
- Behravan, I.; Razavi, S.M. A Novel Machine Learning Method for Estimating Football Players’ Value in the Transfer Market. Soft Comput. 2021, 25, 2499–2511.
- Ćwiklinski, B.; Giełczyk, A.; Choraś, M. Who Will Score? A Machine Learning Approach to Supporting Football Team Building and Transfers. Entropy 2021, 23, 90.
- Constantinou, A.C. Dolores: A Model That Predicts Football Match Outcomes from All over the World. Mach. Learn. 2019, 108, 49–75.
- Stübinger, J.; Mangold, B.; Knoll, J. Machine Learning in Football Betting: Prediction of Match Results Based on Player Characteristics. Appl. Sci. 2019, 10, 46.
- Herold, M.; Goes, F.; Nopp, S.; Bauer, P.; Thompson, C.; Meyer, T. Machine Learning in Men’s Professional Football: Current Applications and Future Directions for Improving Attacking Play. Int. J. Sports Sci. Coach. 2019, 14, 798–817.
- Jaggia, S.; Kelly, A.; Lertwachara, K.; Chen, L. Applying the CRISP-DM Framework for Teaching Business Analytics. Decis. Sci. J. Innov. Educ. 2020, 18, 612–634.
- Bunker, R.P.; Thabtah, F. A Machine Learning Framework for Sport Result Prediction. Appl. Comput. Inf. 2019, 15, 27–33.
- Elyakim, E.; Morgulev, E.; Lidor, R.; Meckel, Y.; Arnon, M.; Ben-Sira, D. Comparative Analysis of Game Parameters between Italian League and Israeli League Football Matches. Int. J. Perform. Anal. Sport 2020, 20, 165–179.
- Tanizaka Filho, M.O.; Cordeiro Marujo, E.; Calasans Dos Santos, T. Identification of Features for Profit Forecasting of Soccer Matches. In Proceedings of the 2019 8th Brazilian Conference on Intelligent Systems (BRACIS), Salvador, Brazil, 15–18 October 2019; pp. 18–23.
- Miguel, M.; Oliveira, R.; Loureiro, N.; García-Rubio, J.; Ibáñez, S.J. Load Measures in Training/Match Monitoring in Soccer: A Systematic Review. Int. J. Environ. Res. Public Health 2021, 18, 2721.
- Altavilla, G.; Riela, L.; di Tore, A.P.; Raiola, G. The Physical Effort Required from Professional Football Players in Different Playing Positions. J. Phys. Educ. Sport 2017, 17, 2007–2012.
- Almulla, J.; Alam, T. Machine Learning Models Reveal Key Performance Metrics of Football Players to Win Matches in Qatar Stars League. IEEE Access 2020, 8, 213695–213705.
- Baptista, I.; Johansen, D.; Seabra, A.; Pettersen, S.A. Position Specific Player Load during Matchplay in a Professional Football Club. PLoS ONE 2018, 13, e0198115.
- Borghi, S.; Colombo, D.; la Torre, A.; Banfi, G.; Bonato, M.; Vitale, J.A. Differences in GPS Variables According to Playing Formations and Playing Positions in U19 Male Soccer Players. Res. Sports Med. 2020, 29, 225–239.
- Stein, M.; Seebacher, D.; Marcelino, R.; Schreck, T.; Grossniklaus, M.; Keim, D.A.; Janetzko, H. Where to Go: Computational and Visual What-If Analyses in Soccer. J. Sports Sci. 2019, 37, 2774–2782.
- Keshav, R. Applications of Artificial Intelligence in the Game of Football: The Global Perspective. Int. Refereed Soc. Sci. J. 2020, 11, 18–29.
Games Dataset | Training Sessions Dataset |
---|---|
‘Athlete’, ‘Game’, ‘Position’, ‘Home or Away’, ‘Pitch’, ‘Final Score’, ‘Minutes’, ‘Game Condition’, ‘RPE_J’, ‘sRPE_J’, ‘HR’, ‘%HR’, ‘<60%HR’, ‘60–74.9%HR’, ‘75–89.9%HR’, ‘>90%HR’, ‘Player Load’, ‘Player Load.UA/min’, ‘Distance_m’, ‘Distance.m/min’, ‘Distance.0–3’, ‘Distance.3.4’, ‘Distance.4–5.5’, ‘Distance.5.5–7’, ‘Distance.>7’, ‘WRRatio’, ‘Accel.0–2’, ‘Accel.2–4’, ‘Accel.>4’, ‘Deacc.0–2’, ‘Deacc.2–4’, ‘Deacc.>4’ | ‘Athlete’, ‘Game’, ‘Position’, ‘Home or Away’, ‘Pitch’, ‘Final Score’, ‘Minutes’, ‘RPE_J’, ‘Player Load’, ‘Player Load.UA/min’, ‘Distance Total’, ‘Distance.m/min’, ‘Distance.0–3’, ‘Distance.3.4’, ‘Distance.4–5.5’, ‘Distance.5.5–7’, ‘Distance.>7’, ‘WRRatio’, ‘Accel.0–2’, ‘Accel.2–4’, ‘Accel.>4’, ‘Deacc.0–2’, ‘Deacc.2–4’, ‘Deacc.>4’ |
Variables | Description |
---|---|
Game | Number of the game in the championship sequence of games. |
Final Score | Points the team earned in the game: 0 for a loss, 1 for a draw and 3 for a win. |
Minutes | Number of minutes the player was actively in the game. |
RPE | Rate of perceived exertion: a numeric self-estimate of exercise intensity that measures how hard a person is exercising, ranging from 1 (no exertion) to 10 (extremely hard). |
Heart Rate (HR) | Heart rate, reported in absolute values and as a percentage of the maximum heart rate, which is estimated as HRmax = 220 − age. |
Player Load | Calculated from the acceleration data registered by the triaxial accelerometers. This variable, treated as a vector magnitude, represents the sum of the accelerations recorded in the anteroposterior, medio-lateral and vertical planes. Reported as a total and in arbitrary units (U.A.) per minute. |
Distance (Total and m/s) | The total distance gives a good global representation of the volume of exercise (walking, running) and is also a simple way to assess an individual’s contribution relative to the team effort. It is divided into five speed zones: walking/jogging distance, 0.0 to 3.0 m/s; running speed distance, 3.0 to 4.0 m/s; high-speed running distance, 4.0 to 5.5 m/s; very high-speed running distance, 5.5 to 7.0 m/s; and sprint distance, a speed greater than 7.0 m/s [21]. |
Work Ratio (WRRatio) | Describes a footballer’s activity profile, which is divided into two categories: (1) pause, if the distance is travelled at a speed <3.0 m/s; and (2) work, if the distance is travelled at a speed >3.0 m/s. |
Acceleration (Accel.) | Categorized by the acceleration of the movement, which is thought to represent the “intensity” of the action. It is divided into “low intensity”, 0.0 to 2.0 m/s²; “moderate intensity”, 2.0 to 4.0 m/s²; and “high intensity”, greater than 4.0 m/s² [21]. |
Deceleration (Deacc.) | Categorized by the deceleration of the movement, which is thought to represent the “intensity” of the action. It is divided into “low intensity”, 0.0 to −2.0 m/s²; “moderate intensity”, −2.0 to −4.0 m/s²; and “high intensity”, beyond −4.0 m/s² [21]. |
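The thresholds in the table above can be expressed as a few small helper functions. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the handling of values that fall exactly on a zone boundary is an assumption, since the table does not specify it.

```python
from bisect import bisect_left

# Upper bounds (m/s) of the first four speed zones in the table; a speed above
# 7.0 m/s counts as a sprint. Treating each upper bound as inclusive is an
# assumption: the table does not say which zone a boundary value belongs to.
ZONE_UPPER_BOUNDS = [3.0, 4.0, 5.5, 7.0]
ZONE_NAMES = [
    "walking/jogging",
    "running",
    "high-speed running",
    "very high-speed running",
    "sprint",
]


def speed_zone(speed_ms: float) -> str:
    """Return the speed-zone label for a speed given in m/s."""
    return ZONE_NAMES[bisect_left(ZONE_UPPER_BOUNDS, speed_ms)]


def activity_category(speed_ms: float) -> str:
    """'work' above 3.0 m/s, otherwise 'pause' (exactly 3.0 m/s is assumed 'pause')."""
    return "work" if speed_ms > 3.0 else "pause"


def hr_max(age_years: int) -> int:
    """Age-predicted maximum heart rate, HRmax = 220 - age."""
    return 220 - age_years


def percent_hr(hr_bpm: float, age_years: int) -> float:
    """Heart rate as a percentage of the age-predicted maximum."""
    return 100.0 * hr_bpm / hr_max(age_years)
```

For example, a 25-year-old player has an estimated HRmax of 195 bpm, so a reading of 156 bpm falls in the 75–89.9%HR band.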
Training Sessions Dataset with Win and Lineup Identification |
---|
‘Athlete’, ‘Week’, ‘Position’, ‘Home or Away’, ‘Pitch’, ‘Final-Score’, ‘Minutes’, ‘Player_Load’, ‘Player Load_UA/min’, ‘Distance_m’, ‘Distance.m/min’, ‘Distance_0_3’, ‘Distance_3_4’, ‘Distance_4_5.5’, ‘Distance_5.5_7’, ‘Distance_>7’, ‘WRRatio’, ‘Aceler_0_2’, ‘Aceler_2_4’, ‘Aceler_>4’, ‘Desac_0_2’, ‘Desac_2_4’, ‘Desac_>4’, ‘Win’, ‘Line-up’ |
Index | Athlete | Code | Position | Home or Away | Pitch | Final-Score | Minutes | Player_Load | Player Load_UA/min | Distance_m | Distance.m/min | Distance_0_3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 9 | 13 | 7 | 1 | 0 | 3 | 96 | 363.8 | 3.789583 | 5728.3 | 59.669792 | 3959.3 |
2 | 9 | 13 | 2 | 1 | 0 | 3 | 76 | 302.0 | 3.973684 | 5025.7 | 66.127632 | 3362.4 |
3 | 9 | 13 | 7 | 1 | 0 | 3 | 86 | 336.9 | 3.917442 | 4554.2 | 52.955814 | 3981.6 |
5 | 2 | 13 | 3 | 1 | 0 | 3 | 96 | 230.9 | 2.397917 | 4800.7 | 50.007292 | 3917.0 |
6 | 2 | 13 | 3 | 1 | 0 | 3 | 76 | 261.9 | 3.446053 | 5300.5 | 69.743421 | 4163.1 |
Distance_3_4 | Distance_4_5.5 | Distance_5.5_7 | Distance_>7 | WRRatio | Aceler_0_2 | Aceler_2_4 | Aceler_>4 | Desac_0_2 | Desac_2_4 | Desac_>4 | Win | Line-up |
---|---|---|---|---|---|---|---|---|---|---|---|---|
684.1 | 502.8 | 447.7 | 134.4 | 9.1 | 101 | 95 | 39 | 89 | 110 | 30 | 1 | 1.0 |
665.1 | 607.7 | 320.8 | 69.7 | 11.1 | 69 | 109 | 26 | 108 | 93 | 16 | 1 | 1.0 |
335.9 | 221.6 | 15.1 | 0.0 | 7.3 | 188 | 113 | 24 | 143 | 142 | 21 | 1 | 1.0 |
502.5 | 287.6 | 72.5 | 21.0 | 8.5 | 78 | 81 | 10 | 104 | 63 | 9 | 1 | 1.0 |
667.7 | 417.7 | 52.0 | 0.0 | 12.8 | 122 | 83 | 6 | 121 | 83 | 13 | 1 | 1.0 |
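The sample rows above suggest how the derived columns relate to the raw ones: the per-minute columns appear to be the totals divided by ‘Minutes’, and the five speed-zone distances appear to sum to ‘Distance_m’. The sketch below checks this on the first two rows; it is an illustrative consistency check inferred from the sample values, not a procedure described by the authors, and the `zones` key and function name are hypothetical.

```python
import math

# First two sample rows from the table above. "zones" (a hypothetical key)
# holds Distance_0_3, Distance_3_4, Distance_4_5.5, Distance_5.5_7, Distance_>7.
rows = [
    {"Minutes": 96, "Player_Load": 363.8, "Player Load_UA/min": 3.789583,
     "Distance_m": 5728.3, "Distance.m/min": 59.669792,
     "zones": [3959.3, 684.1, 502.8, 447.7, 134.4]},
    {"Minutes": 76, "Player_Load": 302.0, "Player Load_UA/min": 3.973684,
     "Distance_m": 5025.7, "Distance.m/min": 66.127632,
     "zones": [3362.4, 665.1, 607.7, 320.8, 69.7]},
]


def is_consistent(row: dict) -> bool:
    """Check the per-minute columns and the zone sum against the totals."""
    load_ok = math.isclose(
        row["Player_Load"] / row["Minutes"], row["Player Load_UA/min"], abs_tol=1e-5
    )
    dist_ok = math.isclose(
        row["Distance_m"] / row["Minutes"], row["Distance.m/min"], abs_tol=1e-5
    )
    # Zone distances are reported to one decimal, so allow a small tolerance.
    zones_ok = math.isclose(sum(row["zones"]), row["Distance_m"], abs_tol=0.05)
    return load_ok and dist_ok and zones_ok


all_rows_consistent = all(is_consistent(row) for row in rows)
```

A check of this kind is a cheap way to catch column misalignment after exporting or merging the tracking data.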
Position | Features/Variables |
---|---|
Central Defender (CD) | ‘Distance.m’, ‘Distance.0–3’, ‘Distance.3–4’, ‘Distance.>7’ |
Full Back (FB) | ‘<60.0%HR’, ‘60–74.9%HR’, ‘Player Load.UA/min’, ‘WRRatio’ |
Central Midfielder (CM) | ‘Player Load.UA/min’, ‘Distance.m/min’, ‘WRRatio’, ‘Aceler.>4’ |
Offensive Midfielder (OM) | ‘<60.0%HR’, ‘60–74.9%HR’, ‘Player Load.UA/min’, ‘WRRatio’ |
Winger (W) | ‘<60.0%HR’, ‘60–74.9%HR’, ‘75–89.9%HR’, ‘Player Load.UA/min’ |
Forward (F) | ‘Distance.m’, ‘Distance.4–5.5’, ‘Distance.>7’ |
Positions | Features/Variables |
---|---|
CD | ‘Distance_m’, ‘Distance_0_3’, ‘Distance_3_4’, ‘Distance_>7’ |
FB | ‘Player Load_UA/min’, ‘WRRatio’ |
CM | ‘Player Load_UA/min’, ‘Distance_m/min’, ‘WRRatio’, ‘Aceler_>4’ |
W | ‘Player Load_UA/min’, ‘Distance_m/min’ |
F | ‘Distance_m’, ‘Distance_4_5.5’, ‘Distance_>7’ |
ML Algorithm | Model | CD | FB | CM | W | F |
---|---|---|---|---|---|---|
DT Accuracy with CV | Model 1 (all variables) | 83 | 69 | 83 | 63 | 70 |
DT Accuracy with CV | Model 2 (using RFE) | 83 | 71 | 70 | 73 | 70 |
NB Accuracy with CV | Model 1 (all variables) | 59 | 77 | 59 | 68 | 46 |
NB Accuracy with CV | Model 2 (using RFE) | 73 | 72 | 65 | 68 | 57 |
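A comparison of this shape can be sketched with scikit-learn, reading DT and NB as a decision tree and a Gaussian naive Bayes classifier. Everything specific here is an assumption for illustration only: the data are synthetic, and the number of folds, the number of retained features, and the use of a decision tree as the RFE ranking estimator are not taken from the source.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in data: 60 player-sessions with 8 hypothetical load variables.
X = rng.normal(size=(60, 8))
# Hypothetical binary label (1 = in the line-up), driven by two of the variables.
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=60) > 0).astype(int)


def make_selector() -> RFE:
    """RFE ranked by a decision tree, keeping 4 features (both are assumptions)."""
    return RFE(DecisionTreeClassifier(random_state=0), n_features_to_select=4)


models = {
    "DT, all variables": DecisionTreeClassifier(random_state=0),
    # Selection sits inside the pipeline so it is refitted within each CV fold.
    "DT, RFE": make_pipeline(make_selector(), DecisionTreeClassifier(random_state=0)),
    "NB, all variables": GaussianNB(),
    "NB, RFE": make_pipeline(make_selector(), GaussianNB()),
}

# Mean accuracy over 5-fold cross-validation (the fold count is an assumption).
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
```

Placing the selector inside the pipeline keeps the held-out fold out of the feature-selection step, which avoids an optimistic bias in the reported accuracy.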
Variable | coef | Std err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | 2.0983 | 1.577 | 1.331 | 0.183 | −0.992 | 5.189 |
Player Load_UA/min | −1.0209 | 0.582 | −1.753 | 0.080 | −2.162 | 0.121 |
WRRatio | 0.1600 | 0.094 | 1.702 | 0.089 | −0.024 | 0.344 |
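Assuming the table above is a standard regression summary, the z, P > |z| and interval columns follow from the coefficient and its standard error through the usual Wald formulas: z = coef / std err, a two-sided normal p-value, and coef ± 1.96 · std err for the 95% interval. The sketch below recomputes them with the Python standard library; the function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()          # mean 0, standard deviation 1
Z_975 = STD_NORMAL.inv_cdf(0.975)  # about 1.96


def wald_summary(coef: float, std_err: float) -> dict:
    """Recompute z, the two-sided p-value and the 95% interval for one coefficient."""
    z = coef / std_err
    p_value = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    half_width = Z_975 * std_err
    return {
        "z": z,
        "p": p_value,
        "lower": coef - half_width,
        "upper": coef + half_width,
    }


# Rows of the table above: (coef, std err).
const_row = wald_summary(2.0983, 1.577)
load_row = wald_summary(-1.0209, 0.582)
wrratio_row = wald_summary(0.1600, 0.094)
```

The recomputed values agree with the tabulated ones up to the rounding of the published standard errors. All three intervals contain zero, which is consistent with every p-value in this table being above 0.05.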
Variable | coef | Std err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | −3.0404 | 1.799 | −1.690 | 0.091 | −6.567 | 0.486 |
Player Load_UA/min | 2.3050 | 0.784 | 2.939 | 0.003 | 0.768 | 3.842 |
Distance.m/min | −0.0994 | 0.064 | −1.550 | 0.121 | −0.225 | 0.026 |
WRRatio | 0.1218 | 0.121 | 1.006 | 0.314 | −0.116 | 0.359 |
Aceler_>4 | 0.0198 | 0.024 | 0.813 | 0.416 | −0.028 | 0.068 |
Variable | coef | Std err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | −3.2661 | 1.978 | −1.651 | 0.099 | −7.143 | 0.611 |
Player Load_UA/min | 3.1864 | 1.074 | 2.967 | 0.003 | 1.082 | 5.291 |
Distance.m/min | −0.0938 | 0.046 | −2.040 | 0.041 | −0.184 | −0.004 |
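If the coefficients above come from a logistic regression, they can be turned into a predicted probability with the logistic function, and each coefficient can be read as an odds ratio via exp(coef). The sketch below uses the last coefficient table; that the model is logistic and that its response is a binary outcome such as the ‘Line-up’ indicator are assumptions, and the function and variable names are hypothetical.

```python
import math

# Coefficients from the last table above.
INTERCEPT = -3.2661
COEFS = {
    "Player Load_UA/min": 3.1864,
    "Distance.m/min": -0.0938,
}


def predict_probability(features: dict) -> float:
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef * x))))."""
    logit = INTERCEPT + sum(COEFS[name] * features[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-logit))


# Multiplicative change in the odds for a one-unit increase in each variable.
odds_ratios = {name: math.exp(coef) for name, coef in COEFS.items()}

# Hypothetical input, borrowing the per-minute values of one sample training row.
example = {"Player Load_UA/min": 3.789583, "Distance.m/min": 59.669792}
p_example = predict_probability(example)
```

Under these assumptions, a higher player load per minute raises the predicted probability (odds ratio above 1), while a higher distance per minute lowers it slightly (odds ratio just below 1).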
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cortez, A.; Trigo, A.; Loureiro, N. Football Match Line-Up Prediction Based on Physiological Variables: A Machine Learning Approach. Computers 2022, 11, 40. https://doi.org/10.3390/computers11030040