1. Introduction
Elite soccer teams can currently play more than 70 games per season, a number of matches that qualifies the season as congested [1]. At present, for first-division teams, a congested season is characterized by matches every 4 days (average per player), and player rotation is implemented to ensure more recovery time between games [2]. The literature reports that residual fatigue accompanied by muscle morphological changes (e.g., muscle fiber misalignment and muscle swelling) can last more than 72 h after a match [3,4]. In this sense, the biggest challenge for coaches and physical trainers is knowing how to implement a training load in a congested season that avoids a drop in performance and does not induce poor adaptation (and, consequently, injuries).
The pitch training load (PTL) is vital for inducing adaptations in athletes’ performance [5]. Little information [1,2] exists about the manipulation of the PTL in congested seasons. Previous studies have focused on analyzing the external load profile in busy seasons [1]; however, the evaluation of the pitch training load and its relationship with performance remains open. Also, the most recent studies are still investigating the impact of the external load on the internal load of young players, even in non-congested seasons [6,7]. Therefore, determining the optimal PTL and the number of training sessions between matches over the congested season is an open question in soccer. It would be of great value to provide information about the level of the different components of the optimal PTL between matches (e.g., intensity, volume, and frequency of training between games). For example, in a congested season, the frequency of weekly physical conditioning training between games has been limited to ~2 sessions (ranging from 1 to 5 sessions) depending on training opportunities [2]. However, this training block appears to be insufficient, as the number of training sessions and the pitch training load between matches positively correlate with several running performance variables [2]. Therefore, practical information regarding the pitch training load in adult soccer players is missing in the literature. This information is of great practical importance for coaches and still needs to be established.
In Brazilian soccer, teams with a busy calendar train their technical and tactical skills on the soccer field as part of the athletes’ physical conditioning (known as pitch training). This PTL coaching tactic characterizes the team’s playing style and running behavior. Thus, we hypothesize that PTL is reflected in match physical performance (MPP). However, to our knowledge, studies have yet to investigate the impact of the PTL practiced by coaches and staff on MPP. In this sense, examining to what extent the specificity of the PTL applied by different training philosophies affects MPP can provide important information about ideal overloads to induce adaptations in matches [8].
The literature on MPP analysis in congested seasons is neither extensive nor consistent. While one review reported no changes [1], a more recent review [9] identified a worsening or no change in running performance in elite players in periods considered congested. However, most studies on this topic have been limited to evaluating a few games (three to eight matches) [1,10]. A long-term study evaluated 52 matches over two seasons, but the team trained 5 to 6 times a week, and only home games were analyzed [11]. In another study, Dupont et al. [11] compared periods of congestion (two matches per week) with periods of one match per week and did not identify changes in running performance but identified an increase in injury rate. On the other hand, Penedo-Jamardo et al. [12] followed 4496 professional players for a season and identified a performance decrease in congested fixture periods (<4 days between matches) during the mid-season, mainly in the fullback and midfield player positions. However, these studies do not report an analysis of contextual factors, such as the training load (a vital component to induce performance adaptation). In this sense, studies assessing congested seasons (where training opportunities between matches are scarce) and considering both home and away games are still necessary.
Considering a congested scenario, our study is guided by three hypotheses. The first is that variations in MPP over time are directly linked to PTL. The second is that PTL, as determined by coaches, significantly influences athletes’ MPP. The third follows from these: if PTL is associated with MPP and coach philosophy, then a machine learning approach should make it possible to identify the optimal PTL to induce an increase in MPP.
With these hypotheses in mind, our study aims to achieve three main objectives:
Identify whether the changes in MPP and PTL throughout a congested season in elite soccer players are similar;
Identify whether MPP adaptation is specific to the coach’s training load philosophy;
Identify an optimal weekly PTL for MPP during a congested season.
2. Materials and Methods
2.1. Participants and Sample
Match physical performance (N = 3068 cases) and pitch training load session data (N = 11,658 cases) were collected from 54 male professional soccer players (age, 24.3 ± 4.9 years; body weight, 75.2 ± 5.7 kg; height, 178.8 ± 4.5 cm) belonging to a Brazilian First Division team during the 2022 and 2023 seasons. Data corresponding to 148 official matches from the two seasons were analyzed (season 2022 = 77 matches; season 2023 = 71 matches). The 2022 season (starting with the first official match) started in January and ended in November, and there were no breaks during this entire period (that is, the team had one or two matches every week). The 2023 season (starting with the first official match) started in January and ended in December. Only data from players who completed full matches were included in the analysis. In this sense, the sample was limited to a data set of 721 match physical performance (MPP) cases and their prior pitch training load (PTL) data (see Figure 1). Thus, as presented in Figure 1, the match physical performance and pitch training load resulted in 721 paired performance cases (i.e., these data were separate counts reported jointly).
Players were classified into four positions: striker (197 cases of match performance and respective pitch training load), fullback (153 cases), winger (152 cases), and midfield (219 cases). Goalkeepers were excluded from this analysis due to the different nature of their movement patterns. This study was approved by the São Judas Tadeu University Ethics Committee (number 6.507.950; date 11 November 2023). This is a retrospective study that assessed the data from two seasons of an elite team in the first division of Brazilian football, and it complied with the ethical standards of the university and followed the guidelines of the Declaration of Helsinki.
2.2. Data Collection
A Catapult system (Vector 7, Catapult Sports, Melbourne, Australia) with global and local positioning system devices (GPS, GLONASS & SBAS 18 Hz; LPS, Catapult ClearSky 10 Hz) combined with inertial sensors such as an accelerometer (3D ± 16 G; sampled at 1 kHz, provided at 100 Hz), gyroscope (3D 2000 degrees/second @ 100 Hz), and magnetometer (3D ± 4900 µT @ 100 Hz) was used to collect data from all matches and training sessions. All three inertial sensors collected data on acceleration, force, rotation, and body orientation. The Inertial Movement Analysis (IMA) method was used to assess explosive efforts such as jumps (>40 cm), acceleration (−45 to 0, 0 to 45 degrees), deceleration (135 to 180, −180 to −135 degrees), and change of direction (COD) to the left (−135 to −45 degrees) or to the right (45 to 135 degrees). In addition, the total explosive effort (the sum of the jump, acceleration, deceleration, and COD) was recorded as the IMA explosive effort. The intensity threshold of each IMA event was set when the action occurred at >2 m/s², thus characterizing an explosive action. The IMA acceleration and deceleration were set at >3 m/s². The running distance producing metabolic power (W·kg⁻¹) was also collected at different intensities (>20 and >55 W·kg⁻¹). The player load was collected as the sum of the accelerations (>m/s²) of the tri-axial accelerometer. GPS methods were used to collect the total distance (m), relative distance (m/min), and running distance >20 km/h (m), >25 km/h (m), and >30 km/h (m). In addition, the number of sprints (running >25 km/h) and the maximum speed (km/h) achieved during the match or training session were recorded.
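The IMA direction bands and intensity thresholds described above can be read as a simple classification rule. The sketch below is only an illustration of that rule and not Catapult's implementation: the function name `classify_ima_event`, the treatment of angles that fall exactly on a band boundary, and the representation of an event as a single angle plus a single magnitude are all assumptions. Jumps are left out because they are defined by height rather than direction.

```python
from typing import Optional

# Thresholds taken from the text. How values exactly on a boundary are
# handled is not stated in the source, so those choices are assumptions.
EXPLOSIVE_THRESHOLD = 2.0  # m/s^2, any IMA explosive event
ACC_DEC_THRESHOLD = 3.0    # m/s^2, IMA acceleration and deceleration


def classify_ima_event(angle_deg: float, magnitude: float) -> Optional[str]:
    """Label one inertial event by direction (degrees in [-180, 180]).

    Returns None when the magnitude does not exceed the threshold
    that applies to the event's direction band.
    """
    if not -180.0 <= angle_deg <= 180.0:
        raise ValueError("angle must lie in [-180, 180] degrees")
    if -45.0 <= angle_deg <= 45.0:
        label, threshold = "acceleration", ACC_DEC_THRESHOLD
    elif angle_deg >= 135.0 or angle_deg <= -135.0:
        label, threshold = "deceleration", ACC_DEC_THRESHOLD
    elif angle_deg < -45.0:
        label, threshold = "cod_left", EXPLOSIVE_THRESHOLD
    else:
        label, threshold = "cod_right", EXPLOSIVE_THRESHOLD
    return label if magnitude > threshold else None
```

Under these assumptions, a forward effort of 2.5 m/s² is not counted as an IMA acceleration (it is below 3 m/s²), whereas a lateral effort of the same magnitude is counted as a change of direction.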
The Catapult system shows suitable validity and reliability for measuring speed, acceleration, deceleration [13], jump [14], and change of direction [15]. As a previous study [13] identified a small coefficient of variation (0.9 to 1.1%) in intra- and inter-unit reliability, the players used the same device over the season.
2.3. Pitch Training Load Data Collection
Data on the conditioning sessions between games were collected. Pre-season training was excluded from the analysis, and a total of 11,658 training sessions were observed. Only the training loads corresponding to complete matches were included. The average number of training sessions per week before the match was calculated (reported here as a weekly training block). Players who did not complete a conditioning session but had played just one match were excluded from the analysis (see detailed description in Figure 1). Strength training and recovery sessions were not included in the analysis.
2.4. Contextual Factors
The team had two coaches throughout the two seasons (coach 1 = 452 cases; coach 2 = 269 cases) and used the following player positions: striker (coach 1 = 100 cases; coach 2 = 97 cases), fullback (coach 1 = 123 cases; coach 2 = 30 cases), winger (coach 1 = 80 cases; coach 2 = 72 cases), and midfield (coach 1 = 149 cases; coach 2 = 70 cases).
Coach 1 used a daily training sequence that consisted of the following activities: a first part of the training (10%) for warm-up (coordination, agility, speed, or strength), followed by exercises using small-sized games without the use of goalkeepers (15%), a third part comprising the main exercise (50%) through medium-sized games with goalkeepers, and in the final part (15%), activities for general and specific technical improvement (passes/crosses/shots). The coach’s guiding principles were the individual and collective development of the football players in technical and tactical aspects. Coach 2 followed a routine with the following characteristics: a first part (15%) for warm-up (coordination, agility, or speed), followed by exercises using small- and medium-sized games without the use of goalkeepers (25%) and the main activity (60%) with medium-/large-sized games with goalkeepers. The coach’s guiding philosophy was based on collective tactical and strategic aspects.
The home–away matches for coach 1 were home = 187, away = 229; for coach 2, they were home = 71, away = 83.
The Mann–Whitney test was used to verify whether training or match frequency was similar between coaches. The match frequency differed between coaches (coach 1 = 4.59 ± 2.2 days; coach 2 = 5.3 ± 2.5 days; p < 0.001), but the number of training sessions between matches did not (coach 1 = 2.9 ± 3.2 sessions; coach 2 = 2.9 ± 3.8 sessions; p = 0.67), nor did the number of weekly training blocks (coach 1 = 2.3 ± 1.1 sessions; coach 2 = 2.2 ± 1.0 sessions; p = 0.62). We quantified the number of training sessions between matches because, in some cases, players were held out of matches to preserve them for more important matches and thus only accumulated training sessions to recover. We also reported the number of weekly training blocks as the weekly training sessions before each match.
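For readers unfamiliar with the Mann–Whitney test, the statistic behind the comparison above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study (which was run in SPSS): the function name `mann_whitney_u` and the sample values are hypothetical, and the sketch computes only the U statistics, not the p-value.

```python
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U_x, U_y) by direct pair counting; a tie counts as half a win."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    # Every pair contributes exactly one point in total, so U_x + U_y = n_x * n_y.
    return u_x, len(x) * len(y) - u_x


# Made-up days-between-matches values, for illustration only.
coach_1_days = [3, 4, 4, 5]
coach_2_days = [4, 6, 7]
u_1, u_2 = mann_whitney_u(coach_1_days, coach_2_days)
```

A U value close to zero for one group means that its values are almost always smaller than those of the other group; a value near half of the number of pairs indicates heavily overlapping distributions.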
To compare performance across the seasons, we divided the two seasons into quartiles, with three months for each quartile. Each quartile encompassed data from both seasons. For instance, the first quartile contained data from January, February, and March of both seasons. The data set was from 148 official matches that produced 721 player performance observations, which were divided as follows: 1st (180 of both match physical performance and session training case observations), 2nd (221 cases), 3rd (203 cases), and 4th (117 cases). Note that the numbers of cases per quartile are different because there is naturally a greater number of matches in quartiles 2 and 3 (mid-season): 1st (31 matches), 2nd (46 matches), 3rd (47 matches), and 4th (24 matches). This imbalance in the distribution of the data over time was taken into account in the statistical analysis using the mixed linear model (see details in the statistical analysis) [16].
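The quartile assignment and the counts reported above can be checked with a short sketch. The function name `season_quartile` and the dictionary names are illustrative assumptions; the counts are those given in the text.

```python
def season_quartile(month: int) -> int:
    """Map a calendar month (1-12) to a three-month season quartile (1-4)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return (month - 1) // 3 + 1


# Counts reported in the text, keyed by quartile.
CASES = {1: 180, 2: 221, 3: 203, 4: 117}
MATCHES = {1: 31, 2: 46, 3: 47, 4: 24}

total_cases = sum(CASES.values())      # 721
total_matches = sum(MATCHES.values())  # 148

# Average number of paired cases contributed by each match, per quartile.
cases_per_match = {q: CASES[q] / MATCHES[q] for q in CASES}
```

The totals reproduce the 721 cases and 148 matches stated in the text, and the per-quartile ratio makes the imbalance handled by the mixed linear model explicit.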
2.5. Statistical Analysis
The data are presented as mean and standard deviation (±) or, when indicated, with a 95% confidence interval (CI). The mixed linear model was used for dependent variables to determine the difference in training load and match physical performance between the coaches and across the season. Active coach and season quartile were used as fixed effects to compare match physical performance and training load. Four contextual factors were used as covariables to compare the match physical performance between coaches: player age, player position, match venue, and match frequency. Five contextual factors were used as covariables to compare performance across the seasons: season (2022 and 2023), player position, match venue, player age, and coach. Because data from the same player were used multiple times, players were used as random effects (intercept model) in the mixed linear model.
To examine the association between player running performance and training session load, all match data and training load data were standardized to the z-score within each player’s position. First, we performed a hierarchical cluster analysis to generate a dendrogram. Then, based on the visual inspection of the dendrogram, we identified that three clusters represented the best solution to the question at hand (is training load associated with its respective MPP?). The silhouette scores were 0.4, 0.4, and 0.3 for 2, 3, and 4 clusters, respectively, which is consistent with the 3-cluster separation chosen from the dendrogram. Finally, we clustered the training load based on player position using a k-means approach. The k-means algorithm used Euclidean distance and 100 iterations to compute the cluster centroids. The three clusters generated showed three distinct external loads from the pitch training load. The reproducibility of the clusters created in this study was tested in the database itself. To do this, clustering was repeated several times from different classification orders of the database (i.e., changing the initial k centers). As a result, the k-means converged to the same cluster profile regardless of the initial k center, thus demonstrating that our database is sufficiently large and representative of elite soccer players.
To validate the clusters of training load (i.e., to verify whether the clusters differed from each other), we used one-way ANOVA (using clusters as factors and the z-score of training load variables as dependent variables) followed by the Duncan post hoc test. The difference in match physical performance as a function of training load was verified with the mixed linear model using coach and training load cluster as fixed factors; player position, coach, season, match venue, and match frequency as covariables; and player as a random effect. The η² effect size was interpreted as small (≥0.01), medium (≥0.06), or large (≥0.14). All analyses were performed using the statistical package IBM SPSS Statistics v.26.0.
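The standardization and reproducibility steps of the clustering procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis performed in SPSS: the function names (`zscore_within_group`, `kmeans`, `same_partition`), the use of the population standard deviation, and the toy data are all hypothetical. The `kmeans` function takes the initial centers as an argument so that the effect of changing them can be checked directly.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def zscore_within_group(values: Sequence[float], groups: Sequence[str]) -> List[float]:
    """Standardize each value against the mean and SD of its own group.

    Assumption: population SD is used; the source does not say which SD
    was applied.
    """
    by_group: Dict[str, List[float]] = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    stats = {g: (mean(vs), pstdev(vs)) for g, vs in by_group.items()}
    return [(v - stats[g][0]) / stats[g][1] if stats[g][1] > 0 else 0.0
            for v, g in zip(values, groups)]


def kmeans(points: Sequence[Point], init_centers: Sequence[Point],
           max_iter: int = 100) -> List[int]:
    """Lloyd's k-means with Euclidean distance; returns one label per point."""
    centers = [tuple(c) for c in init_centers]
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda k: math.dist(p, centers[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = tuple(mean(dim) for dim in zip(*members))
    return labels


def same_partition(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if two labelings group the points identically, ignoring label names."""
    a_to_b: Dict[int, int] = {}
    b_to_a: Dict[int, int] = {}
    for x, y in zip(a, b):
        if a_to_b.setdefault(x, y) != y or b_to_a.setdefault(y, x) != x:
            return False
    return True


# Toy, made-up z-scored loads (two variables per training block).
blocks = [(-1.2, -1.0), (-1.0, -1.1), (-0.9, -1.3),  # low load
          (0.0, 0.1), (0.1, -0.1), (-0.1, 0.0),      # moderate load
          (1.1, 1.2), (1.3, 1.0), (1.0, 1.1)]        # high load
run_a = kmeans(blocks, [blocks[0], blocks[3], blocks[6]])
run_b = kmeans(blocks, [blocks[8], blocks[1], blocks[4]])
stable = same_partition(run_a, run_b)
```

In this toy example both runs recover the same three groups, although the cluster labels differ between runs, which is why the comparison ignores label names. Note that k-means is not guaranteed to reach the same solution from every starting point, so a check of this kind is informative only when it is repeated over many initializations.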
4. Discussion
The main finding of this study was that match physical performance reflects the pitch training load in a specific way. Specifically, owing to the large number of weekly training blocks (721 blocks) collected over two seasons, we could identify three distinct training loads (i.e., low, moderate, and high training load blocks). Interestingly, the high training load identified in this study was positively associated with the athletes’ performance. Furthermore, the training load and match physical performance data allowed us to identify a specific pattern for each of the two coaches: under both coaches, high-load pitch training throughout the seasons was associated with better running performance on the field in a coach-specific way. To our knowledge, in addition to the data presented in a previous study by our group [2], there are no other data in the literature that provide a guide to the pitch training load in teams that are involved in congested seasons (over 70 games per season). Therefore, based on our data, coaches and physical trainers can use the methodology described in this study to identify and prescribe the optimal training load throughout the season for soccer players.
Another important finding in our study was the identification of changes in performance throughout the season. Specifically, we identified an improvement in match physical performance during the 2nd and 3rd quartiles, followed by a return to baseline (i.e., similar to the 1st quartile) during the 4th quartile. We hypothesize that this decrease in performance is a consequence of a lack of player readiness, mainly due to progressive fatigue over the season [6,7,8]. Thus, a strategy to monitor fatigue on a daily basis is necessary to detect negative changes in performance over the season [17]. For this, machine learning analysis is a promising approach because it is non-invasive and not time-demanding (for athletes and coaches) [8].
We also identified a decrease in training load throughout the season in several variables. This decrease occurred mainly in variables related to explosive effort and peak velocity, suggesting that very high-intensity variables are affected at the end of the season. However, such a decrease in training load did not negatively impact match physical performance. The average reduction in training load over the season, in general, was from the high-load to mid-load blocks. Thus, none of the variables decreased to a low load. As mentioned for match physical performance, a hypothesis for this decrease could also be related to a lack of player readiness [6,7,8]. Another hypothesis is that a lack of time imposes this decrease in training load (see Table 2). For instance, the lower training load occurs concomitantly over the season with reduced days between matches and training blocks. Thus, it is suggested that a decrease in training load is necessary to avoid overtraining.
This study also showed that high-load pitch training was significantly associated with match physical performance. Low- and mid-load pitch training was associated with low and medium running performance in matches, respectively (see Figure 5). Thus, a change in match physical performance throughout the season may be related to a change in pitch training load over the season. However, caution is needed when interpreting these data; as presented in Table 2, the training load was also associated with the length of training block sessions and with match frequency. For example, low-load pitch training was associated with a high match frequency (every three days) and training blocks of ~1.6 sessions. In comparison, high-load pitch training was associated with a lower match frequency (every 4.5 days) and training blocks of ~2.6 sessions. Thus, we cannot rule out that low match performance could be a consequence of fatigue between matches, and that low load could be a consequence of the downward modulation of training load to avoid overtraining the soccer players during over-congested periods (i.e., matches every three days).
The differences in match physical performance correspond to the differences in pitch training load across the three clusters (low, medium, and high load). In addition to physical conditioning, pitch training is used for technical and tactical purposes, which can also impact the field movement profile because tactical/technical training aims to create movement behavior (i.e., playing style). However, previous evidence indicates that pitch training load can impact performance negatively or positively [5]. For instance, in a study by Guerrero-Calderón et al. [5], a negative training load effect was identified for the total distance in training sessions (negatively affecting total distance and high-intensity runs in the matches); on the other hand, a positive training load effect was identified for high-intensity running activity (>25 km/h). Our data did not identify a negative effect but a positive one for both variables. In our previous study [2], where we used data from one season (77 matches and just a single coach), we identified that a high-impact activity training load (such as explosive effort, COD, jump, and deceleration) or running in a straight line had cross-interference on match physical performance. For example, when blocks of impact training (such as COD, accelerations, decelerations, and jumps) are performed, they have a negative impact on straight-line running variables (such as running at speeds >20, >25, or >30 km/h and total running distance), and vice versa. Our results suggest that this cross-interference may occur because differences in field performance reflect the training load in the three clusters (low, medium, and high load). For instance, COD, explosive efforts, and acceleration/deceleration activities are emphasized in small-sized games during pitch training (which generates a greater volume and intensity for these activities), while straight-line running activities are emphasized in all-field pitch training. This different approach will be reflected in match physical running performance during the official games. In this sense, trainers should periodize these loads to avoid cross-interference and to induce desirable adaptation [18].
Our description of training loads provides information on optimal pitch training load prescription. For instance, a previous study by our group [2] and our current data demonstrate that training volume (training between matches) has a positive impact on straight-line running performance (described in Figure 2). Also, accumulating pitch training between matches (i.e., volume) positively affects players’ performance in a congested season. In addition, our data suggest that a 4.5-day match frequency may be sufficient to induce better performance than a 3-day match frequency. A previous study assessing only home matches indicated that a short period of match congestion (2–4-day match frequency) did not negatively impact running performance [1]. Our data assessed both home and away matches, thus adding greater ecological validity to the analysis. Therefore, the sum of our data suggests that one or two matches per week (i.e., every 4.5 days) does not negatively impact match physical performance [2,11]. However, some players have a match frequency of >7 days; thus, the ideal match frequency for match physical performance remains an open question. Although a high match frequency (i.e., two matches per week) did not negatively affect match physical performance, it increased injury risk [1]; thus, coaches should be cautious during congested periods, and adopting rotation between players is advised. This caution is crucial for maintaining player health and performance. A previous study [19] identified that chronic high-load training could reduce the risk of injury incidence when compared to chronic low-load training. In this sense, future studies should investigate the relationship between training load and injury incidence in soccer during congested seasons.
It is important to note that this was a retrospective observational study, and thus the significant association between pitch training load and match physical performance may not reflect a cause–effect relationship. In this sense, future studies with experimental designs (randomized controlled approach, observing athletes with different pitch training loads) aiming to identify a dose–response relationship are still necessary.