Key Performance Indicators Predictive of Success in Soccer: A Comprehensive Analysis of the Greek Soccer League

Previous research emphasizes the significance of key performance metrics in determining match outcomes. The purpose of this study is to enhance the understanding of success in professional soccer by analyzing the relationship between match outcomes (win, lose, draw) and various performance indicators extracted from the Greek soccer league, as well as to develop a regression model of success in soccer. The sample consisted of all 91 matches from the first round of the 2020–2021 season of the Greek Football League. Kruskal–Wallis tests revealed significant differences in goals scored, shots, shots on target, ball possession, passing metrics, touches in the penalty area, and average shot distance (p < 0.05), with winning teams demonstrating superior performance metrics. Moreover, winning teams engaged more in positional attacks and counterattacks with shots (p < 0.05). The binary logistic regression model applied to predict match outcomes identified shots on target, counterattacks, passing metrics, offensive duels, and set pieces (penalties, free kicks) as key factors influencing the likelihood of winning (p < 0.05). These findings collectively highlight the importance of effective offensive play, including goal scoring, shooting accuracy, and ball possession, in determining the outcomes of soccer matches, with the regression model offering a nuanced understanding of these relationships.


Introduction
Over the years, several studies have sought to decipher the relationship between playing styles and statistical indicators in soccer in various leagues and competitions [1–14]. Specifically, success in soccer has been strongly related to the ability of the coaching staff to observe, interpret, and improve key performance indicators and the team's tactical behavior through interventions during the game [15,16]. In general, it is emphasized that performance analysis methods are valuable in enhancing the understanding of athletic performance by providing detailed and objective insights into players' strengths and areas for development, enabling targeted interventions for the coaching staff [17]. Moreover, recent research highlights the significance of artificial intelligence and factor analysis in understanding playing styles in soccer [18].
The ability to score goals and prevent goal-scoring opportunities from the opponent has long been the focus of tactical discussions. While goals per se might be the ultimate aim, the road to victory is paved with a more nuanced statistic: scoring efficiency [19]. Broich et al. [19] elucidated that the quality of shots, as indicated by goal efficiency (ratio of goals scored/shots taken), holds greater significance and takes precedence over shot quantity when determining victory in a soccer match. Regarding shots on target and goals during the 2012 European Championship, it was indicated that 89% of the goals were scored from within the penalty area, while 65% of the shots from outside were saved by the goalkeepers [20]. Similar studies of the Premier League and Bundesliga during the 2012-2013 season reported that both distance and angle of the shots taken have a significant impact on the calculation of xG (Expected Goals) [21].
The application of regression analyses in soccer research has been pivotal in enhancing the understanding of the game's strategic and tactical aspects, its effectiveness, and factors influencing match outcomes. Previous research [47] employed linear regression and factor analysis to identify key playing styles in soccer, such as possession play, set pieces, and counterattacks. Similarly, a multilevel logistic regression was applied [29] to analyze team possessions in the Premier League, highlighting the effectiveness of counterattacks and home advantage. Logistic regression analyses focused on the effectiveness of counterattacks and passing strategies [48–50] revealed that counterattacks were more effective against imbalanced defenses and that successful teams performed fewer passes and dribbles but completed more successful passes and shots. This suggests that a direct style of play is more successful. Other studies [51,52] explored the prediction of home team win probabilities in European soccer leagues using a regression model, emphasizing the significance of defensive performance in match outcomes. A previous study [1] investigated factors influencing ball recovery in elite soccer, identifying key variables such as match location, status, and quality of opposition, which underscored the proactive defensive strategies of higher-ranked teams. When the performance of a professional soccer team over three seasons was compared [53], it was found that successful performances were associated with fewer attempted and completed passes and more effective shooting. Kite and Nevill [53] also concluded that a more direct style of play, with fewer passes and more shots on target, benefited the team.
The importance of regression analysis in soccer research is evident. Although numerous studies have explored the differences between match outcomes (win, draw, lose), comparatively few have constructed regression models that relate performance indicators to the prediction of match outcomes. In this context, this study aimed to (a) investigate the relationship between match outcomes (win, lose, draw) and performance indicators of offensive play, defensive play, and set pieces and (b) apply a binary regression model (win-no win) to determine if there are statistically significant key factors for teams' success in the Greek soccer league.

Sample
The sample consisted of all matches of the first round (N = 91) of the first division of the Greek Football League (Super League Interwetten) during the 2020-2021 season, in which 14 teams participated.
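As a quick consistency check on the sample size, the short sketch below counts the fixtures of a single round-robin among 14 teams. It assumes that the first round pairs every team with every other team exactly once, which the text above does not state explicitly.

```python
from itertools import combinations
from math import comb

N_TEAMS = 14  # teams in the 2020-2021 first division, as stated above

# Assumption: the first round is a single round-robin (each pair meets once).
fixtures = list(combinations(range(N_TEAMS), 2))
n_matches = len(fixtures)

# The enumeration agrees with the closed form C(14, 2) = 14 * 13 / 2.
assert n_matches == comb(N_TEAMS, 2)
print(n_matches)  # 91
```

Under that assumption, the count matches the reported N = 91.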

Data Collection and Analysis Procedures-Analysis of Matches
All matches and statistical performance indicators presented and analyzed in this study were collected from the Wyscout platform website (https://wyscout.com, accessed on 1 June 2020) through Hudl, a platform for collecting match data by expert video analysts. Two UEFA A licensed coaches and one UEFA Pro coach were involved in the data collection and assessment process from August 2020 until August 2021. All variables used in this study follow the definitions in the platform's glossary (Wyscout Glossary, https://dataglossary.wyscout.com, accessed on 1 June 2020). For example, pressing intensity (PPDA) quantifies high press intensity by calculating the ratio of opponent passes to defensive actions within the final 60% of the field; smart passes are creative and penetrative passes that break the opposition's defensive lines; deep completed crosses are targeted to the zone within 20 m of the opponent's goal; passes into the final third originate outside the final third and the next ball touch occurs within it; progressive passes are forward passes that advance the team significantly closer to the opponent's goal; average pass length measures the average length of passes made by a team or player during a match; and average shot distance is the mean distance (m) from a team's shots to the opponent's goal, calculated from all shots taken by the team during a match (Wyscout Glossary, https://dataglossary.wyscout.com, accessed on 1 June 2020).
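To make two of these definitions concrete, the following minimal sketch computes PPDA and average shot distance from per-match counts. The function names and the example numbers are hypothetical, and the inputs are assumed to be already restricted to the relevant zone of the field; the platform's exact event filtering is not reproduced here.

```python
from statistics import mean

def ppda(opponent_passes: int, defensive_actions: int) -> float:
    """Opponent passes allowed per defensive action in the pressing zone.

    Lower values mean the opponent completes fewer passes before being
    challenged, i.e. a more intense press.
    """
    if defensive_actions <= 0:
        raise ValueError("at least one defensive action is required")
    return opponent_passes / defensive_actions

def average_shot_distance(shot_distances_m: list[float]) -> float:
    """Mean distance (m) from a team's shots to the opponent's goal."""
    if not shot_distances_m:
        raise ValueError("team took no shots in this match")
    return mean(shot_distances_m)

# Hypothetical single-match values, for illustration only.
example_ppda = ppda(opponent_passes=240, defensive_actions=24)
example_distance = average_shot_distance([8.0, 12.0, 22.0])
print(example_ppda, example_distance)  # 10.0 14.0
```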
According to researchers [54], the reliability index of the Wyscout platform was 0.70, an index considered satisfactory for the analysis and evaluation of the performance of soccer players through the machine learning used by the platform. Researchers have used this platform in the past in similar research procedures [11,12,25,27,54–60].

Statistical Analysis
The data of the present study were analyzed using IBM SPSS Statistics for Windows, Version 25.0, Armonk, NY, USA, IBM Corp. [61]. Effect size (ES) was calculated according to Cohen's criteria [62,63], and the calculation of statistical power and ES was performed with the software G*Power: Statistical Power Analyses for Windows, Version 3.1.9.7 [64,65]. Regarding the ES, the magnitude of the coefficient η² was evaluated in the following ranges: η² = 0.01-0.06 (small effect), η² = 0.06-0.14 (moderate effect), and η² > 0.14 (large effect). At the level of descriptive statistics, the following were calculated: mean (M), standard deviation (SD), and frequencies for all performance indicators. The non-parametric Kruskal-Wallis test was applied to compare performance indicators across match outcomes (win/lose/draw), and in the event of a significant difference, pairwise Mann-Whitney U tests with the Bonferroni correction were employed.
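The comparison procedure can be sketched in plain Python as follows. This is an illustration rather than the SPSS and G*Power workflow used in the study: the helper names and the example data are hypothetical, the p-value uses the chi-square approximation (which has a closed form for three groups, df = 2), and the rank-based estimator η² = (H − k + 1)/(n − k) is one common choice that is assumed here, not taken from the text.

```python
from math import exp

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kruskal_wallis_3(win, draw, lose):
    """Tie-corrected H, p-value (df = 2) and rank-based eta squared."""
    groups = [win, draw, lose]
    pooled = [v for g in groups for v in g]
    n, k = len(pooled), len(groups)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p = exp(-h / 2)                  # chi-square survival function for df = 2
    eta_sq = (h - k + 1) / (n - k)   # assumed estimator, see lead-in
    return h, p, eta_sq

def effect_label(eta_sq):
    """Ranges from the text; 'negligible' below 0.01 is our own label."""
    if eta_sq > 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "moderate"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"

# Hypothetical shots-on-target values for three small groups.
h, p, eta_sq = kruskal_wallis_3([7, 8, 9], [4, 5, 6], [1, 2, 3])
# Three pairwise follow-up tests (win-draw, win-lose, draw-lose).
bonferroni_alpha = 0.05 / 3
```

In practice a statistics library (for example `scipy.stats.kruskal` and `scipy.stats.mannwhitneyu`) would normally be used instead of hand-written ranking code; the sketch only shows how the quantities relate. With three outcome groups there are three pairwise comparisons, so each Bonferroni-corrected Mann-Whitney test is evaluated at roughly 0.0167.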
The second objective of the statistical analysis was to determine the factors that significantly predict the outcome of the match. To achieve this, binary logistic regression, a generalized linear model for predicting binary outcomes, was used. The logistic regression analysis used the binary match outcome (Win versus Draw/Lose) as the dependent variable, as in previous studies [50,53]. In order to evaluate the model's predictive capacity, we used two pseudo-R-squared values: Cox and Snell R² and Nagelkerke R². The −2 Log likelihood was employed as a goodness-of-fit measure to assess how well the model fits the data. These measures allowed us to examine both the explained variance and the overall fit of the binary logistic regression model. The Omnibus Test of Model Coefficients was used to assess the model fit. Several predictor variables were tested for their effect on the match outcome. These predictors were evaluated based on the Wald chi-square test. Stepwise logistic regression was conducted to achieve a model that adequately predicts the match outcome (Win versus Draw/Lose), adding variables based on their ability to improve the model's fit. The level of statistical significance was set at p ≤ 0.05.
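The fit measures named above can be related to one another with a short sketch. It assumes that fitted win probabilities are already available from some logistic model; the function names and example values are hypothetical and do not reproduce the study's SPSS output. The null model is taken to be the intercept-only model, which predicts the observed win rate for every case.

```python
from math import exp, log

def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y (0/1) under probabilities p."""
    return sum(log(pi) if yi == 1 else log(1 - pi) for yi, pi in zip(y, p))

def fit_measures(y, p_hat):
    """-2LL, Cox and Snell R^2 and Nagelkerke R^2 for a binary model.

    Requires both outcomes to be present in y and 0 < p_hat < 1.
    """
    n = len(y)
    base_rate = sum(y) / n
    ll_null = log_likelihood(y, [base_rate] * n)   # intercept-only model
    ll_model = log_likelihood(y, p_hat)
    cox_snell = 1 - exp(2 * (ll_null - ll_model) / n)
    max_cox_snell = 1 - exp(2 * ll_null / n)       # upper bound of Cox-Snell
    return {
        "minus_2ll": -2 * ll_model,
        "cox_snell": cox_snell,
        "nagelkerke": cox_snell / max_cox_snell,
    }

# Hypothetical outcomes (1 = win, 0 = draw/lose) and fitted probabilities.
y_example = [1, 1, 0, 0]
informative = fit_measures(y_example, [0.9, 0.8, 0.2, 0.1])
uninformative = fit_measures(y_example, [0.5, 0.5, 0.5, 0.5])
```

Nagelkerke R² rescales Cox and Snell R² by its maximum attainable value, which is why it is always at least as large and can reach 1.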

Discussion
This study presents a comprehensive analysis of the relationship between various performance indicators and match outcomes in the Greek soccer league, offering a deeper understanding of the factors influencing soccer success. Specifically, the analysis revealed that winning teams typically scored an average of 2.09 goals per game, significantly higher than drawing or losing ones. This aligns with previous studies [19], as well as with other researchers [11,12], who emphasized scoring efficiency over shot quantity. As indicated by goal efficiency, the quality of shots is paramount in determining victory, supporting our observation that effective shooting is crucial for success in soccer matches.
Regarding shots on target, winning teams also had a higher average (5.27 per game). This finding is consistent with similar research [20], where it was found that most goals during the 2012 European Championship were scored from within the penalty area, suggesting that strategic positioning and accuracy are more critical than the sheer volume of shots attempted.
Furthermore, this study partially highlights the effectiveness of short passing sequences, with winning teams averaging 4.01 passes per possession, aligned with other researchers [22], who mentioned that shorter passing sequences are common and effective in creating dynamic possession games. However, our findings suggest that longer sequences could also be effective in certain contexts, echoing similar findings [23] on the strategic use of teams' possessions. A preference for combinative attacks among high-ranked teams is also indicated in the present study, which is in line with similar analyses of other European championships [26,29]. This preference underscores the adaptability of teams based on their competitive standing [11,12,24,34]. These findings collectively underscore the importance of effective offensive play, including goal scoring, shooting accuracy, and ball possession, as explored by other researchers [25,27,31,32,42,60,66].
Our findings highlight that set plays, particularly those leading to shots, penalties and corners, could play a significant role in soccer success since winning teams generally performed better in these aspects. The observed relationship between the frequency of set pieces with shots and winning outcomes resonates with the findings of several other researchers [3,9,15,40,67], who have highlighted the critical role of performance indicators, including set pieces, in influencing match results. Furthermore, the notable differences in penalties and their conversion rates among winning teams in this study align with the arguments presented above [19]. The higher percentage of penalties converted by winning teams underscores the critical role of efficient scoring opportunities in soccer success, suggesting that the ability to capitalize on these chances can be a decisive factor in match outcomes.
Regarding defensive metrics, the significant difference in goals conceded among winning, drawing, and losing teams is a finding of considerable interest. This observation is in line with the tactical discussions emphasized by the researchers [15,23,28,68], who have explored the impact of defensive strategies on game outcomes. In this study, other defensive metrics did not show significant differences across match outcomes, echoing the general importance of a well-structured defensive strategy [30] for all teams, regardless of their ranking. While the effectiveness in preventing goals is a distinguishing factor in winning matches, other aspects of defensive play are consistently executed across different match outcomes.
The application of logistic regression analysis in this study to predict soccer match outcomes as either a win or no-win (draw/lose) provides a compelling insight into the multifaceted nature of soccer tactics and strategies. The model's ability to correctly classify 83.5% of the cases underscores its efficacy in capturing the complexities inherent in soccer match outcomes. A key finding from this study is the significant positive effect of 'Shots on target' on the likelihood of winning. This aligns with the broader narrative in soccer analytics regarding the importance of creating and capitalizing on scoring opportunities, i.e., the critical role of effective offensive strategies in determining match outcomes [26,29,47]. This study also reveals the tactical significance of 'Counterattacks' in terms of the likelihood of winning. This finding echoes previous insights [48,49,69,70] that emphasized the effectiveness of counterattacks, particularly against imbalanced defenses. It suggests that teams exploiting quick transition opportunities are more likely to succeed. Furthermore, the substantial positive impact of 'Penalties converted' on winning outcomes underscores the importance of efficiency in scoring, particularly in high-value opportunities like penalties. This aspect of the game is often highlighted in studies that focus on set pieces. Conversely, the negative association of 'Free kicks' and 'Back passes' with winning outcomes suggests a potential strategic drawback in over-relying on this style of play. This could indicate that certain defensive or conservative strategies might not be as conducive to winning, adding a layer of complexity to the ongoing discourse on the balance between offensive and defensive play in soccer. Additionally, the positive association of 'Smart passes' with winning outcomes highlights the importance of intelligent ball distribution and strategic playmaking. Identifying these variables as key factors influencing the likelihood of winning aligns with previous findings [29,47].
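For readers less familiar with these quantities, the sketch below shows how a classification rate and the direction of a predictor's effect are typically read off a binary logistic model. It assumes a 0.5 probability cutoff for classifying a case as a win; the function names, data, and coefficient values are hypothetical and are not the values estimated in this study.

```python
from math import exp

def classification_accuracy(y_true, p_hat, cutoff=0.5):
    """Share of cases whose predicted class (p >= cutoff -> win) is correct."""
    predicted = [1 if p >= cutoff else 0 for p in p_hat]
    correct = sum(1 for yt, yp in zip(y_true, predicted) if yt == yp)
    return correct / len(y_true)

def odds_ratio(b):
    """Multiplicative change in the odds of winning per unit of a predictor."""
    return exp(b)

# Hypothetical outcomes (1 = win, 0 = draw/lose) and fitted probabilities.
y_true = [1, 1, 0, 0, 1, 0]
p_hat = [0.9, 0.4, 0.2, 0.6, 0.7, 0.1]
accuracy = classification_accuracy(y_true, p_hat)  # 4 of 6 cases correct

# Hypothetical coefficients: a positive B raises the odds, a negative B lowers them.
positive_effect = odds_ratio(0.5)
negative_effect = odds_ratio(-0.5)
```

A positive coefficient, as reported for shots on target, corresponds to an odds ratio above 1, whereas a negative coefficient, as reported for free kicks and back passes, corresponds to an odds ratio below 1.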
While this study provides valuable insights into the analyzed matches, it is important to acknowledge its limitations. Firstly, the study's scope is limited to the first round and not the whole season of a championship, and secondly, factors such as home advantage and in-game tactical behavior were not investigated.

Conclusions
This analysis revealed that winning teams recorded higher values in metrics such as goals scored, shots on target, and passing accuracy, emphasizing the importance of effective shooting and ball possession in achieving successful outcomes. This analysis also highlighted the significance of counterattacks and successful offensive duels, suggesting that strategies focusing on quick transitions and offensive one-on-one situations can be beneficial. Additionally, winning teams tend to excel in set plays (e.g., corners, penalties, free kicks), suggesting that a focused approach to enhancing set-piece strategies could also be crucial. The logistic regression model further enriches these insights by identifying key factors influencing the likelihood of winning. An increase in shots on target is related to a higher probability of winning, emphasizing the need for strategies that create more shooting opportunities and improve shooting accuracy. Effective counterattacks also increased the chances of winning, suggesting that teams should train in quick transition play and exploit opportunities during counterattacks. Conversely, the logistic regression model indicated that an increase in free kicks and back passes is associated with a lower likelihood of winning, possibly pointing towards a need for more direct play and reducing unnecessary free kicks.
These findings provide valuable guidance for coaches, players, and performance analysts in their strategic planning and decision-making, contributing to the development of effective soccer strategies. Specifically, they indicate that training should focus on enhancing shooting accuracy and pass completion and on developing tactics that leverage counterattacks to increase the likelihood of winning matches.

Table 1. Offensive play metrics and their impact on match outcomes (win, draw, lose).

Table 2. Set play metrics and their impact on match outcomes (win, draw, lose).

Table 3. Defensive play metrics and their impact on match outcomes (win, draw, lose).

Table 4. Binary Logistic Regression analysis results for the key predictors of match outcome (win/lose).