1. Introduction
Tactical performance in football is significantly influenced by player positioning [1]. Understanding how players occupy and move within space allows teams to control the flow of the game, limit opponents’ options, and create opportunities. However, historically, reliable data on player and ball positioning was scarce, complicating the assessment of action threats. Recent advancements in video data processing and analysis technologies have improved this situation, enabling the collection of granular spatiotemporal and event data [2]. In response, a variety of models have been developed to quantify the value and threat associated with football actions.
One model designed to evaluate action threat is the Expected Threat (xT) model, which quantifies the potential danger of different pitch locations by assigning a numerical xT value to each grid cell on the pitch. The concept of Expected Threat (xT) originated from Rudd’s research [3], which, while not explicitly naming “Expected Threat”, established a foundational framework by segmenting the game into distinct states and using transition probabilities. This framework enabled a more equitable distribution of credit among players beyond traditional metrics. Karun Singh [4] later popularized the term “Expected Threat” (xT), introducing an equation that computes xT components. Central to Singh’s approach was the location of the ball, which allowed for a detailed grid of the pitch, assigning xT values to each grid cell. Despite its usefulness, the xT framework presents notable limitations. First, it is restricted to actions characterized by a well-defined start and end location, such as passes and carries. Second, it does not account for the threat conceded by a team, limiting its ability to evaluate how actions expose a team to danger. Finally, it overlooks the spatial configuration of off-ball players, whose positioning plays a crucial role in both creating and mitigating threats during a game.
A notable advancement in football action evaluation is the VAEP model (Valuing Actions by Estimating Probabilities), introduced by Decroos et al. [5]. Unlike xT, VAEP overcomes key limitations by evaluating all types of actions, regardless of whether they have clearly defined start and end locations. Furthermore, it assesses not only the potential to create scoring opportunities but also the extent to which an action increases a team’s vulnerability to conceding a goal. For instance, a pass may positively influence the probability of scoring in subsequent actions, yet at the same time, it may also expose the team to counterattacks if unsuccessful or poorly positioned. VAEP captures this dual nature by estimating the change in scoring and conceding probabilities before and after each action. The model incorporates contextual features such as action location, distance and angle to the goal, game tempo, time remaining, and current score differential, providing a comprehensive view of an action’s impact within the match context.
In addition to VAEP, other machine learning approaches, such as Deep Reinforcement Learning (DRL), have been explored for football action evaluation. Liu et al. [6] employed DRL to train an action-value Q-function for evaluating actions, developing a neural network framework with a stacked two-tower LSTM architecture, one for each competing team. In a similar vein, Routley and Schulte [7] employed a Markov model in hockey to compute Q-values for actions, which estimate the probability of scoring the next goal based on the current game context. Pulis and Bajada [8] also leveraged DRL to analyze decision-making, introducing a Decision Value (DV) metric that optimizes decision policies based on teammates’ and opponents’ positions.
While these models have significantly improved action evaluation by integrating machine learning techniques and probabilistic frameworks, they often fail to account for the dynamic spatial context of player positioning despite its crucial influence on decision-making and game dynamics [9,10,11,12,13,14].
Among the existing action evaluation models, we have chosen to build upon the xT framework due to its high interpretability [15]. To address a limitation that, to our knowledge, has not yet been explored—namely, the lack of consideration for the spatial configuration of off-ball players—we propose a novel methodological approach that refines xT by dynamically adjusting threat values based on real-time game context. This enhancement allows for the generation of action-specific threat maps, offering a more granular and context-sensitive evaluation of football actions. By doing so, our model bridges the gap between theoretical metrics and practical usability, aligning with the needs of analysts who often struggle with the limited credibility of traditional xT models in tactical analysis.
In this study, we begin with a theoretical analysis of existing xT models, highlighting their mathematical limitations in modeling key parameters such as move and shot probabilities. Building on these insights, we develop a new theoretical framework that forms the basis of the Dynamic Expected Threat (DxT) model. We then implement the model and analyze its iterative convergence properties. To assess its practical relevance, we conduct two case studies based on real match sequences: one involving Albania’s opening goal against Croatia in the Euro 2024 and the other focusing on Portugal’s final offensive sequence in the Euro 2024 quarter-final against France. Finally, we evaluate the performance of both models to validate their ability to provide a realistic assessment of football actions.
2. Existing Expected Threat (xT) Models Theory
Football performance evaluation relies on statistical analysis to assess both player and team effectiveness. Traditional statistics, such as goals scored, completed passes, and shots taken, offer quantitative insights into performance but do not fully capture the qualitative aspects of play. Expected Statistics (xStats), including Expected Threat (xT), address this limitation by quantifying various aspects of in-game actions, such as the effectiveness of passes, carries, and shot opportunities.
2.1. Theoretical Foundations of Expected Threat (xT) Models
This section outlines the theoretical foundations of the xT model as proposed by Karun Singh [4].
The xT model involves segmenting the football pitch into $L$ longitudinal and $W$ lateral sections, creating a grid where each action is located within a grid cell $(x, y)$. Actions can either be a shot from this location or a move to any of the possible grid cells $(z, w)$.
The expected outcome of shooting the ball from grid cell $(x, y)$ is equal to the probability of scoring from that situation. This probability is denoted as $g_{x,y}$, representing the Expected Goals from that shot. The expected outcome of moving the ball from grid cell $(x, y)$ to grid cell $(z, w)$ is given by $T_{(x,y)\to(z,w)} \cdot xT_{z,w}$, where $T_{(x,y)\to(z,w)}$ is the probability of transitioning from grid cell $(x, y)$ to grid cell $(z, w)$ and $xT_{z,w}$ is the expected outcome associated with that transition.
Consequently, the overall expected payoff of a football action occurring in grid cell $(x, y)$, denoted as $xT_{x,y}$, is as follows:

$$xT_{x,y} = s_{x,y} \, g_{x,y} + m_{x,y} \sum_{z=1}^{L} \sum_{w=1}^{W} T_{(x,y)\to(z,w)} \, xT_{z,w} \quad (1)$$

where $s_{x,y}$ represents the probability that the player in possession of the ball opts to take a shot on goal from cell $(x, y)$ and $m_{x,y}$ represents the probability that the player opts to move the ball.
The terms $s_{x,y}$, $m_{x,y}$, and $T_{(x,y)\to(z,w)}$ in (1) refer to the actions taken by players in possession of the ball to transition the game to a state where they have an increased likelihood of scoring. These game states align directly with the states of a Markov model [16]: players shift the game from one state to another through passes or carries until reaching an absorbing state, such as a goal or a possession turnover.
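Because Equation (1) defines xT recursively, it is typically evaluated by fixed-point iteration over the grid. The following is a minimal sketch of that computation, assuming the probability arrays have already been estimated; all array names and shapes are our own conventions, not from the original model code:

```python
import numpy as np

def compute_xt(shot_prob, move_prob, goal_prob, transition, n_iter=5):
    """Iteratively evaluate the xT equation on an L x W grid.

    shot_prob, move_prob, goal_prob: arrays of shape (L, W).
    transition: array of shape (L, W, L, W), where
    transition[x, y, z, w] = P(move from cell (x, y) to cell (z, w)).
    """
    xt = np.zeros_like(goal_prob)
    for _ in range(n_iter):
        # Expected payoff of moving: sum over all destination cells,
        # weighted by the transition probabilities.
        move_payoff = np.einsum("xyzw,zw->xy", transition, xt)
        xt = shot_prob * goal_prob + move_prob * move_payoff
    return xt
```

After the first pass, `xt` only reflects direct shots; each further iteration credits moves that reach higher-value cells, mirroring the Markov-chain interpretation above.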
2.2. Empirical Calculation of Expected Threat (xT)
In practice, the existing xT models calculate $s_{x,y}$ and $m_{x,y}$ via historical data, relying on the grid location $(x, y)$ of actions (shots and ball moves):

$$s_{x,y} = \frac{N^{\text{shots}}_{x,y}}{N^{\text{shots}}_{x,y} + N^{\text{moves}}_{x,y}}, \qquad m_{x,y} = \frac{N^{\text{moves}}_{x,y}}{N^{\text{shots}}_{x,y} + N^{\text{moves}}_{x,y}}$$

where $N^{\text{shots}}_{x,y}$ represents the number of historical shots performed inside the grid cell $(x, y)$ and $N^{\text{moves}}_{x,y}$ represents the number of historical ball moves performed inside the grid cell $(x, y)$.
And:

$$T_{(x,y)\to(z,w)} = \frac{N_{(x,y)\to(z,w)}}{N^{\text{moves}}_{x,y}}$$

where $N_{(x,y)\to(z,w)}$ represents the number of historical moves performed from the grid cell $(x, y)$ to the grid cell $(z, w)$ and $N^{\text{moves}}_{x,y}$ represents the number of historical moves performed from the grid cell $(x, y)$.
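These frequency estimates can be computed in vectorized form once the historical counts have been aggregated per cell. A sketch under that assumption (array names and shapes are ours):

```python
import numpy as np

def empirical_probabilities(shot_counts, move_counts, move_from_to):
    """Estimate s, m and T from historical action counts.

    shot_counts, move_counts: (L, W) arrays of historical shots/moves
    started in each cell.
    move_from_to: (L, W, L, W) array counting moves from (x, y) to (z, w).
    Cells with no historical actions get probability zero.
    """
    total = shot_counts + move_counts
    s = np.divide(shot_counts, total,
                  out=np.zeros_like(shot_counts, dtype=float),
                  where=total > 0)
    m = np.divide(move_counts, total,
                  out=np.zeros_like(move_counts, dtype=float),
                  where=total > 0)
    moves = move_from_to.sum(axis=(2, 3))
    T = np.divide(move_from_to, moves[:, :, None, None],
                  out=np.zeros_like(move_from_to, dtype=float),
                  where=moves[:, :, None, None] > 0)
    return s, m, T
```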
2.3. Lack of Realism in Expected Threat (xT) Models
Existing xT models suffer from several limitations that undermine their realism, including the following:
2.3.1. Oversimplified Decision Modeling
In traditional xT models, the probability of shooting is calculated by dividing the number of shots taken from the grid cell by the total number of shots and moves executed within that grid cell. This frequentist approach, based solely on the ball’s location, offers a simplified representation of the decision-making process, reducing the decision to shoot to the ball’s position alone. However, real-world scenarios deviate from this simplification.
A player’s decision to shoot from a given position is heavily influenced by their perceived likelihood of scoring. If they estimate a high probability of success, they are more inclined to take the shot; otherwise, they may opt to pass to a better-positioned teammate. For example, a player might attempt a shot at an unguarded goal, even if such attempts have been historically rare from that distance.
2.3.2. Static xG Estimates
One major limitation of traditional xT models is their reliance on static xG estimations, which fail to account for dynamic aspects of goal probability evaluation. In these models, goal probabilities are often estimated based on the historical frequency of goals scored from each grid cell on the pitch. However, this frequency-based approach fails to capture the true nature of scoring opportunities, leading to an oversimplified and often misleading representation of goal likelihoods. A player does not have a 0% chance of scoring from midfield merely because no goals have been historically recorded from that location in the datasets used to estimate goal probabilities. Exceptional circumstances, such as a goalkeeper being out of position during a counterattack, can lead to successful long-range attempts, which static xT models fail to account for.
More advanced xG models are required to build realistic xT models, particularly those integrating dynamic player positioning and game context. Several xG models have been proposed [17,18,19,20,21,22,23,24,25,26,27,28,29], yet most fail to fully incorporate off-the-ball player positioning. Some models [30,31] attempt to consider defensive positioning by quantifying the number of defenders ahead of the ball, but this approach remains insufficient. A scenario where a shooter faces four defenders directly blocking the shooting angle differs significantly from one where those defenders are present but do not obstruct the shooting angle.
A promising advancement in xG modeling is the work of Hassani and Lotfi [14], who introduced the KOS angle to refine shooting angle evaluations. This metric subtracts the angles formed by the extremities of players positioned ahead of the ball, effectively incorporating defensive width, orientation, and positioning into the xG calculation. Subsequently, Cefis et al. [32] developed an xG model that builds upon the KOS angle formulation. Despite these advancements, xT models still lack the integration of such dynamic xG methodologies, which limits their ability to provide a realistic assessment of football actions.
2.3.3. Simplistic Ball Transition Probabilities
The calculation of $T_{(x,y)\to(z,w)}$ (representing the probability of moving the ball from grid cell $(x, y)$ to grid cell $(z, w)$) in the existing xT models is grounded on the premise that the probability of a player relocating the ball from one zone to another is exclusively determined by the historical frequency of actions transitioning between those zones. Yet, in practical scenarios, numerous other variables come into play, prominently including the spatial locations and orientation of players on the pitch, with the objective of maximizing the threat. To elucidate further, based on a player’s specific grid location, their intent often gravitates towards directing the ball to zones exhibiting elevated xT values or zones expected to yield augmented xT values in subsequent actions.
3. Dynamic Expected Threat (DxT) Model Theory
3.1. General DxT Equation
This section outlines the theoretical framework for constructing the Dynamic Expected Threat model (DxT). This model is designed to generate an xT map tailored to each specific game scenario. Unlike static models, DxT considers not only the ball’s position but also a range of other features that influence the action being evaluated. These features, denoted as “$F$”, encompass the elements that contribute to the potential threat of the action. Rather than defining a fixed set of features at this stage, we present a generalized equation, which will be further refined based on the features available in our dataset. Equation (1), reformulated to incorporate these considerations, is presented as follows:

$$DxT_{x,y}(F) = s_{x,y}(F) \, g_{x,y}(F) + m_{x,y}(F) \sum_{z=1}^{L} \sum_{w=1}^{W} T_{(x,y)\to(z,w)}(F \to F') \, DxT_{z,w}(F')$$

where:
$F$ represents all features influencing the potential threat posed by the action located in grid cell $(x, y)$;
$F'$ represents all features influencing the potential threat posed by the action located in grid cell $(z, w)$;
$s_{x,y}(F)$ represents the probability of a player opting to take a shot on goal from grid cell $(x, y)$, while considering the features labeled as “$F$”;
$m_{x,y}(F)$ represents the probability of a player opting to move the ball from grid cell $(x, y)$ to another grid cell, while considering the features labeled as “$F$”;
$g_{x,y}(F)$ represents the probability of scoring a goal from grid cell $(x, y)$, while considering the features labeled as “$F$”;
$T_{(x,y)\to(z,w)}(F \to F')$ represents the probability of moving the ball from grid cell $(x, y)$ to grid cell $(z, w)$, given the alteration in features from the state “$F$” to the state “$F'$”;
$DxT_{x,y}(F)$ represents the dynamic threat of the action that takes place in grid cell $(x, y)$, while considering the features labeled as “$F$”;
$DxT_{z,w}(F')$ represents the dynamic threat of the action that takes place in grid cell $(z, w)$, while considering the features labeled as “$F'$”.
3.2. Methodology for Calculating Probability of Shooting/Moving Decision
Within the DxT model, the computation of $s_{x,y}(F)$ is based on the assumption that a player’s decision to shoot, rather than pass or carry, is driven by their likelihood of scoring, commonly quantified as Expected Goals (xG). Therefore, $s_{x,y}(F)$ is a function of $xG_{x,y}(F)$, where $xG_{x,y}(F)$ represents the Expected Goal value for an action occurring in grid cell $(x, y)$, taking into account the features denoted as “$F$”. This relationship is represented as follows:

$$s_{x,y}(F) = f\left(xG_{x,y}(F)\right)$$

where $f$ is the probability function to shoot, which depends on $xG_{x,y}(F)$ and represents the probability that a player chooses to shoot instead of moving the ball.
The crux of this section lies in defining a methodology for calculating $xG_{x,y}(F)$ and deriving an expression for the function $f$.
3.2.1. Proposed Methodology for Calculating Expected Goals
To compute $s_{x,y}(F)$, representing the probability of a player deciding to shoot rather than pass from a grid cell $(x, y)$, it is essential to measure $xG_{x,y}(F)$.
Building upon the limitations highlighted in Section 2.3.2, where traditional xT models fail to incorporate dynamic xG estimations, we adopt an xG model that incorporates the KOS angle feature. Throughout this study, we refer to the KOS xG model as any xG model that incorporates the KOS angle feature, regardless of the machine learning algorithms used or the datasets employed for model training.
Beyond the KOS angle, additional features may be considered, though we refrain from predefining them at this stage to maintain a general formulation for our DxT model. To simplify the notation in the subsequent equations, we denote this set of features as $F$, emphasizing that multiple features can be used as long as the KOS angle is included in the calculation of the Expected Goals value, which can thus be written as $xG_{x,y}(F)$.
3.2.2. Proposed Process for Building the Probability Function to Shoot
The construction of the probability function $f$, which represents the likelihood that a player opts to shoot rather than pass the ball, is fundamentally based on the Expected Goals (xG) value for the given situation, denoted as $xG_{x,y}(F)$. This probability reflects the player’s decision-making process, which is contingent upon the likelihood of scoring from that specific context. Classical xT models calculate the frequency of shots and moves performed within each grid cell of the pitch to determine these probabilities. In contrast, our method evaluates shooting or passing probabilities by segmenting the range [0,1] of xG values and analyzing the frequency of shots and passes within each xG sub-segment. This methodology follows a five-step process, as outlined below:
Step 1: Partitioning the [0,1] Interval into Multiple Sub-Intervals
Given that $xG_{x,y}(F)$ values fall within the range of 0 to 1, we partition the interval [0,1] with a step size Δ. Each sub-interval is denoted as $I_n = [n\Delta, (n+1)\Delta)$ for $n = 0, 1, \ldots, \frac{1}{\Delta} - 1$.
For this study, we use a constant step size Δ, resulting in the sub-intervals $I_0, I_1, \ldots, I_{\frac{1}{\Delta}-1}$.
Step 2: Determining the Sub-Interval to Which xG Values of Historical Actions Belong
We analyze historical actions (both shots and moves) and compute their xG values using the KOS xG model. Each xG value is then assigned to the corresponding sub-interval $I_n$, as shown in Figure 1.
Step 3: Measuring the Frequency of Shots in Each Sub-Interval
For each sub-interval $I_n$, we calculate the frequency of shots within $I_n$. This is achieved by determining the ratio of the number of historical shots with xG values in $I_n$ to the total number of historical actions (shots and moves) with xG values in $I_n$. We denote this ratio as $f_n$, computed as

$$f_n = \frac{N^{\text{shots}}_{I_n}}{N^{\text{shots}}_{I_n} + N^{\text{moves}}_{I_n}}$$

where $N^{\text{shots}}_{I_n}$ represents the number of shots with xG values in $I_n$ and $N^{\text{moves}}_{I_n}$ represents the number of moves with xG values in $I_n$.
Step 4: Calculating $xG_{x,y}(F)$ and Determining Its Sub-Interval
Using the KOS xG model, we calculate the value of $xG_{x,y}(F)$ for the action we aim to compute the DxT for. We then identify the sub-interval to which $xG_{x,y}(F)$ belongs and denote this interval as $I_{n^*}$.
Step 5: Deducing a Formulation for the Probability Function to Shoot f
For similar scoring probabilities, as modeled by the sub-intervals $I_n$ in this study, the likelihood that players decide to take a shot is relatively consistent within each sub-interval. This holds true especially when Δ is very small. Furthermore, as the size of the training dataset used in Step 2 increases, the probability of players deciding to shoot can be approximated by the frequency of shots falling within the sub-interval $I_{n^*}$. Therefore:

$$s_{x,y}(F) = f\left(xG_{x,y}(F)\right) \approx \frac{N^{\text{shots}}_{I_{n^*}}}{N^{\text{shots}}_{I_{n^*}} + N^{\text{moves}}_{I_{n^*}}}$$

where $N^{\text{shots}}_{I_{n^*}}$ represents the number of historical shots having xG values included in the interval $I_{n^*}$ and $N^{\text{moves}}_{I_{n^*}}$ the number of historical moves having xG values included in the interval $I_{n^*}$.
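The five steps above condense into a small lookup routine. The sketch below assumes historical xG values and shot/move labels are available as arrays; the default Δ and all names are illustrative choices of ours:

```python
import numpy as np

def shot_decision_probability(xg_hist, is_shot, xg_new, delta=0.1):
    """Approximate f(xG): the probability that a player shoots rather than
    moves, estimated as the shot frequency within the xG sub-interval of
    the evaluated action (Steps 1-5). delta is the sub-interval width.
    """
    n_bins = int(round(1.0 / delta))
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Step 2: assign each historical xG value to its sub-interval I_n.
    bins = np.clip(np.digitize(xg_hist, edges) - 1, 0, n_bins - 1)
    # Step 4: sub-interval of the action being evaluated.
    n_star = int(np.clip(np.digitize(xg_new, edges) - 1, 0, n_bins - 1))
    in_bin = bins == n_star
    if not in_bin.any():
        return 0.0  # no historical evidence in this sub-interval
    # Steps 3 and 5: shot frequency within the sub-interval.
    return float(np.asarray(is_shot)[in_bin].mean())
```

For instance, if most historical actions with xG above 0.9 were shots, an evaluated action in that sub-interval receives a shooting probability close to 1, regardless of how rarely shots occur from its particular grid cell.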
3.3. Proposed Methodology for Calculating the Ball Transition Probability
Given that players’ decisions to pass the ball to a specific zone are based on directing the ball to zones exhibiting elevated xG values or zones expected to yield augmented xG values in subsequent actions, we model $T_{(x,y)\to(z,w)}(F \to F')$ by a function $h$ that depends on $xG_{x,y}(F)$ and $xG_{z,w}(F')$:

$$T_{(x,y)\to(z,w)}(F \to F') = h\left(xG_{x,y}(F), xG_{z,w}(F')\right)$$

To determine an expression for the function $h$, we posited that the likelihood of transitioning from one scenario to another hinges on the xG differential between these situations: the larger the differential, the higher the probability that a player’s decision to move the ball from the initial to the subsequent situation is intensified.
To gauge the likelihood of moving the ball from one zone to another, we computed the xG differential between each zone on the pitch. Subsequently, we standardized these values to ensure that the cumulative probabilities sum up to 1 for every starting zone on the pitch. Consequently, we derived the following formula for the function $h$:

$$T_{(x,y)\to(z,w)}(F \to F') = \frac{xG_{z,w}(F')}{\sum_{z'=1}^{L} \sum_{w'=1}^{W} xG_{z',w'}(F')}$$

where $\sum_{z'=1}^{L} \sum_{w'=1}^{W} xG_{z',w'}(F')$ represents the sum of the possible xG values in the different grid cells of the pitch.
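Under this reading of the normalization — each destination cell's hypothetical xG divided by the sum over all cells, so that probabilities from any starting zone sum to 1 — the computation is straightforward. A hedged sketch, not the authors' exact implementation:

```python
import numpy as np

def transition_probabilities(xg_grid):
    """Normalize the hypothetical xG values of all destination cells so
    that, from any starting cell, transition probabilities sum to 1.

    xg_grid: (L, W) array of hypothetical xG values at each cell's
    centre given the current player configuration.
    """
    total = xg_grid.sum()
    if total == 0:
        # Degenerate case: fall back to a uniform distribution.
        return np.full_like(xg_grid, 1.0 / xg_grid.size, dtype=float)
    return xg_grid / total
```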
The final expression of the DxT equation is:

$$DxT_{x,y}(F) = s_{x,y}(F) \, xG_{x,y}(F) + m_{x,y}(F) \sum_{z=1}^{L} \sum_{w=1}^{W} \frac{xG_{z,w}(F')}{\sum_{z'=1}^{L} \sum_{w'=1}^{W} xG_{z',w'}(F')} \, DxT_{z,w}(F')$$
4. Experimentation
In this section, we implement the DxT model using real match data from the StatsBomb 360 Open Data dataset. For our experiment, we selected 210 matches, while an additional 85 matches are used in Section 5 for model validation.
The dataset used for training consists of 335,002 events, including passes, carries, and shots. It contains various features such as ball positioning, action type, and the positions of off-ball players during each action. The pitch dimensions in this dataset are 120 m in length and 80 m in width.
To discretize the field, we selected a grid resolution of L = 12 and W = 8, resulting in 10 × 10 m grid cells on the 120 × 80 m pitch. This resolution offers a balance between spatial granularity and computational efficiency. Finer grids (e.g., 20 × 16) may provide more detail but often lead to data sparsity issues, especially in less active zones, while coarser grids (e.g., 6 × 4) may fail to capture important spatial dynamics. Additionally, the dataset provides player locations with decimal precision; however, we observed minor inaccuracies in the decimal values (for example, a player recorded at x = 110.7 may, in fact, be closer to 110.6 upon visual inspection). As such, we opted for a grid division whose cell boundaries fall on integer-meter coordinates (dividing the pitch length by 12 and the width by 8), since the most reliable aspect of the positional data lies in the integer part of the coordinates. Each cell thus corresponds to a 10 m square, aligned with the practical resolution of the data.
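Mapping raw coordinates to grid cells then reduces to integer division by the cell size (120/12 = 10 m along the length, 80/8 = 10 m across the width). A small helper, with names of our choosing:

```python
def grid_cell(x, y, pitch_len=120.0, pitch_wid=80.0, L=12, W=8):
    """Map a pitch coordinate (x, y) to its grid cell index.

    With L = 12 and W = 8 on a 120 x 80 m pitch, each cell spans
    120/12 m by 80/8 m; points on the far boundary fold into the
    last cell. Returns (col, row), col in [0, L-1], row in [0, W-1].
    """
    col = min(int(x // (pitch_len / L)), L - 1)
    row = min(int(y // (pitch_wid / W)), W - 1)
    return col, row
```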
We will refer to this dataset as the historical dataset, which serves as the foundation for implementing the protocol described in
Section 3.2.1.
In addition to this historical dataset, a separate dataset is used for building the xG model, which is a crucial component in the DxT model implementation.
4.1. xG Model Building
To compute xG values, we employed a variant of the KOS xG model developed by Hassani and Lotfi [
14]. Their study demonstrated that, regardless of the machine learning models used, the most influential feature (according to Shapley values) is consistently the KOS Angle, followed by shot distance and shot angle. Based on these results, we chose to build a model using only these three features.
To enhance the predictive capabilities of our model, we expanded the training dataset from its initial 68,382 shots to 91,362 shots by incorporating additional data from the 2022 and 2026 World Cup qualifiers, as well as the Asian Cup 2023. Additionally, we replaced its original Gradient Boosting structure with an XGBoost framework, leading to a notable improvement in performance.
This performance enhancement allows our xG model to achieve competitive results compared to existing models in the literature, as shown in Table 1.
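A three-feature model of this kind can be prototyped in a few lines. The sketch below uses scikit-learn's `GradientBoostingClassifier` as a stand-in for the XGBoost model described above (the XGBoost API is analogous via `XGBClassifier`); the synthetic data, feature scales, and hyperparameters are illustrative assumptions only, not the authors' configuration:

```python
# Illustrative sketch: synthetic data and hyperparameters are assumptions;
# the actual model is an XGBoost classifier trained on real shot data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 2000
kos_angle = rng.uniform(0.0, 1.5, n)   # KOS angle (rad), hypothetical scale
distance = rng.uniform(5.0, 40.0, n)   # shot distance (m)
shot_angle = rng.uniform(0.0, 1.5, n)  # shot angle (rad)
X = np.column_stack([kos_angle, distance, shot_angle])

# Synthetic labels: a wider open angle and shorter distance score more often.
logit = 2.0 * kos_angle - 0.15 * distance + 0.5 * shot_angle
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

model = GradientBoostingClassifier(n_estimators=100, max_depth=3)
model.fit(X, y)

# xG is the predicted scoring probability for a shot's feature vector.
xg = model.predict_proba([[1.2, 11.0, 0.8]])[0, 1]
```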
4.2. DxT Probability Matrices Calculation
For each action in our historical dataset, we first compute its xG value using the previously trained xG model. Additionally, we estimate hypothetical xG values for all grid cells on the pitch at the moment the action occurred. In other words, we compute:
The actual xG of the action, assuming it was a shot, by calculating the KOS Angle, shot distance, and shot angle.
The xG of each grid cell’s center, considering the distribution of players on the pitch at that moment.
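The geometric inputs for a cell can be derived from its centre. A minimal sketch of the distance and opening-angle computation, assuming the dataset's 120 × 80 convention with the goal centred at (120, 40) and a regulation 7.32 m goal mouth; the KOS angle would additionally subtract the portions of this angle occluded by defenders, which requires player positions and is omitted here:

```python
import math

def shot_features(cx, cy, goal_x=120.0, goal_y=40.0, half_goal=3.66):
    """Shot distance and opening shot angle from a cell centre (cx, cy).

    half_goal is half the 7.32 m goal-mouth width. The KOS angle would
    further subtract defender-occluded portions of the opening angle.
    """
    dist = math.hypot(goal_x - cx, goal_y - cy)
    to_left_post = math.atan2(goal_y + half_goal - cy, goal_x - cx)
    to_right_post = math.atan2(goal_y - half_goal - cy, goal_x - cx)
    return dist, abs(to_left_post - to_right_post)
```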
The xG values of our historical actions allow us to compute the shot and move probability matrices $s_{x,y}(F)$ and $m_{x,y}(F)$ using the methodology outlined in Section 3.2.2.
Additionally, the actual xG of an action, combined with the hypothetical xG values across all pitch grids, allows us to compute the transition probability matrices that we defined in Section 3.3: $T_{(x,y)\to(z,w)}(F \to F')$.
4.3. Iterative Computation of DxT
The model requires a stepwise update of the $DxT_{x,y}(F)$ values for each grid cell $(x, y)$, starting from an initial value of zero.
The computation begins by initializing DxT values across all pitch grids to zero. In the first iteration, DxT values are computed solely based on the immediate probability of a shot. The computed DxT values in this first step represent the Expected Threat associated with direct shots without accounting for intermediate ball movements.
As the process iterates, ball movement is progressively incorporated into the computation. Each subsequent iteration refines the DxT values by considering not only direct shots but also sequences of ball movements that can lead to a shot. At each step, the DxT value of a given grid is updated as a combination of its immediate shot probability and the expected DxT values of surrounding grids, weighted by their respective transition probabilities.
The iteration process continues until the DxT values converge, meaning that further updates no longer produce significant changes.
4.4. DxT Convergence Analysis
To determine the optimal number of iterations at which the DxT values converge, we employed a convergence-based solution approach. Specifically, we introduced a convergence criterion to measure the relative evolution of the normalized DxT values between two consecutive iterations ($k$ and $k+1$):

$$C_k = \sum_{x=1}^{L} \sum_{y=1}^{W} \left| \frac{DxT^{k+1}_{x,y}}{\max\limits_{x,y} DxT^{k+1}_{x,y}} - \frac{DxT^{k}_{x,y}}{\max\limits_{x,y} DxT^{k}_{x,y}} \right|$$

where:
$C_k$ is the chosen convergence criterion from iteration $k$ to $k+1$;
$DxT^{k}_{x,y}$ is the Dynamic Expected Threat in grid cell $(x, y)$ after $k$ iterations;
$\max_{x,y} DxT^{k}_{x,y}$ is the maximum Dynamic Expected Threat on the pitch after $k$ iterations.
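The iterative update and this stopping rule can be combined in a short routine. A sketch with our own array names, comparing normalized maps between consecutive iterations as described in this section:

```python
import numpy as np

def dxt_iterate(s, g, m, T, eps=1e-4, max_iter=50):
    """Iterate the DxT update until the convergence criterion drops
    below eps. s, g, m: (L, W) probability arrays for one game state;
    T: (L, W, L, W) transition probabilities. Returns the converged
    map and the number of iterations performed.
    """
    dxt = np.zeros_like(g)
    k = 0
    for k in range(max_iter):
        new = s * g + m * np.einsum("xyzw,zw->xy", T, dxt)
        # Normalize both maps by their maxima, then sum absolute changes.
        a = new / new.max() if new.max() > 0 else new
        b = dxt / dxt.max() if dxt.max() > 0 else dxt
        crit = np.abs(a - b).sum()
        dxt = new
        if crit < eps:
            break
    return dxt, k + 1
```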
In addition to evaluating the convergence criterion as a function of the number of iterations, we also analyzed its standard deviation for different actions across iterations. Given the dynamic nature of DxT, we aimed to quantify the stability of threat values for each action.
Specifically, we computed the standard deviation of the convergence criterion across actions at each iteration to assess the degree of fluctuation before stabilization.
The high standard deviation in early iterations observed in
Figure 2 indicates significant updates in DxT values, whereas the gradual reduction in standard deviation suggests that the model is approaching convergence. This analysis led us to determine that 10 iterations were sufficient to achieve a stable and consistent DxT representation across all actions. Consequently, for the remainder of this study, we will adopt a fixed number of 10 iterations for the generation of DxT grids.
4.5. Case Study: DxT to Evaluate the Sequence of Actions Leading to Albania’s First Goal Against Croatia, Euro 2024
To illustrate the inner workings of the DxT model—its dynamic nature and its ability to evaluate tactical situations more effectively than the xT framework—we applied it to Albania’s first goal against Croatia during Euro 2024.
Albania scored after a sequence of passes and carries, culminating in a shot that resulted in a goal. This sequence is depicted in
Figure 3.
The first action in this sequence was a backward pass from Kristjan Asllani to Ylber Ramadani, moving from grid cell (3,9) to grid cell (4,8), as shown in Figure 4. In this figure, opposing players are represented as red dots, while teammates appear as blue dots. The ball transitioned from a grid cell with a DxT starting value of 0.0906 to a cell with an end value of 0.0942.
The DxT value of this first action is computed as the difference between the DxT values of these two grid cells, yielding a DxT of 0.0035 for the pass (the displayed grid values are rounded).
Ylber Ramadani then carried the ball laterally (Figure 5), generating its own DxT value, computed in the same manner.
It is worth noting that while the overall appearance of the DxT maps remains similar from one action to the next, a closer examination of the values reveals significant variations in threat estimation. This is further illustrated in
Table 2, where we observe that the threat value at the end of an action is not equal to the threat value at the start of the following action. This discrepancy arises from the constant adjustments in player positioning between successive actions, affecting the threat landscape even before the next movement occurs.
For comparative purposes, we implemented an xT model using the same historical dataset used to build the DxT model.
Figure 6 illustrates the overlay of all actions in the analyzed sequence with the generated static xT map, and
Table 3 details the
values for each action in the analyzed sequence.
The first action in the sequence—a backward pass—received a negative xT value, which may seem counterintuitive given its tactical function. This outcome stems from the underlying logic of the xT model, which primarily rewards actions that reduce the distance to the goal: the closer the ball moves toward the goal, the higher the estimated threat. However, this approach disregards the contextual dynamics of the game, such as opponent positioning or space manipulation. As a result, backward passes—even when strategically valuable—tend to be undervalued.
In contrast, the DxT model explicitly integrates the spatial and tactical context in which each action occurs. It evaluates threats dynamically based on the real-time configuration of players. The same backward pass, when assessed with DxT, received a positive value of 0.0035, reflecting its role in destabilizing the opponent’s shape and contributing meaningfully to the team’s buildup play.
A clear example of this difference emerges from Albania’s offensive strategy during the analyzed sequence. Rather than forcing a direct progression through the center, the team opted to shift play to the right flank, effectively bypassing Croatia’s compact defensive block. This tactical decision is visible in
Figure 3,
Figure 4 and
Figure 5, as well as in
Figure A1,
Figure A2,
Figure A3 and
Figure A4, where we observe that ball circulation occurred outside zones densely populated by defenders (shown as red dots). Although these actions may not appear immediately threatening in terms of proximity to the goal, they were essential in creating space and preparing for the decisive final movement.
These tactical nuances are captured more effectively by DxT than by xT. While the total xT value over the entire sequence amounts to 0.8656, the corresponding DxT total reaches 0.9078. This discrepancy underscores DxT’s greater sensitivity to context-aware strategies, such as backward progressive passes or flank switches, which aim to manipulate space rather than simply move the ball forward.
As such, DxT proves particularly valuable for analyzing indirect but tactically meaningful actions, providing a more nuanced and realistic measure of threat that aligns closely with actual in-game decision-making.
4.6. Case Study: DxT to Evaluate Portugal’s Last Offensive Sequence in Euro 2024 Quarter-Final
In the previous case study, we demonstrated how DxT generates a threat map for each individual action, enabling the analysis of subtle tactical nuances such as flank switches. In this section, we shift our focus to another important aspect of DxT: its ability to support player decision-making analysis.
To illustrate this, we examine the actions taken by Portugal during the final offensive sequence of their Euro 2024 quarter-final match against France—a match that ended in a 0–0 draw and was ultimately decided by a penalty shootout, which Portugal lost.
We analyze the final attacking sequence available to Portugal (see
Figure 7) to assess whether alternative decisions by the players could have led to a more threatening outcome.
The details of this sequence are provided in
Table 4. In this section, we focus specifically on the final pass from Bernardo Mota to Nuno Mendes, as well as the resulting shot by Nuno Mendes. This critical moment is illustrated in
Figure 8 and
Figure 9, allowing us to assess whether a different decision at this stage could have resulted in a more favorable outcome for Portugal.
Figure A5,
Figure A6,
Figure A7,
Figure A8 and
Figure A9 provide, for reference, the full breakdown of the preceding actions in the sequence.
In
Figure 8, Bernardo Mota was positioned in a grid cell with a DxT value of 0.0702 and was, therefore, expected to transition the ball toward a teammate located in a grid with a higher threat value. While several passing options were available, any chosen target had to lie in a zone that was not obstructed or at risk of interception by opposing players. In this context, the grid cell to which he ultimately passed the ball maximized the threat while minimizing risk, thus illustrating a sound decision-making process by Bernardo Mota.
Upon receiving the ball, Nuno Mendes decided to shoot toward the left side of the goal (Figure 9). However, this choice yielded a DxT gain of only 0.0518 (calculated as 0.1147 − 0.0629). An alternative option, aiming for the right side of the goal, would have resulted in a DxT gain of 0.1446 (0.2075 − 0.0629), representing almost three times the scoring potential.
This considerable difference in threat values, as suggested by the DxT model, indicates that Nuno Mendes’s decision to shoot toward the left side of the goal may have significantly reduced Portugal’s likelihood of scoring in this critical moment. Based on the DxT evaluation, the suboptimal threat gain associated with his chosen shot direction could have contributed to the missed opportunity and, ultimately, negatively impacted Portugal’s chances of advancing in the competition.
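The gain comparison above reduces to simple differences of DxT values; a minimal check using the figures quoted in the text (the helper function is illustrative, not part of the DxT implementation):

```python
# DxT values from the case study (Figure 9): the shooter's cell and the
# two candidate shot targets inside the goal.
origin_dxt = 0.0629   # Nuno Mendes's grid cell
left_dxt = 0.1147     # left side of the goal
right_dxt = 0.2075    # right side of the goal

def dxt_gain(target: float, origin: float) -> float:
    """Threat gained by transitioning the ball from origin to target."""
    return target - origin

left_gain = dxt_gain(left_dxt, origin_dxt)     # 0.0518
right_gain = dxt_gain(right_dxt, origin_dxt)   # 0.1446
print(round(left_gain, 4), round(right_gain, 4))
print(round(right_gain / left_gain, 2))        # almost three times the gain
```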
5. Model Validation and Results
While the previous case studies demonstrate the ability of DxT to provide detailed insights into both tactical dynamics and individual decision-making, they remain illustrative examples. As such, they do not allow for broader conclusions regarding the model’s overall performance.
To rigorously validate DxT, we used a distinct subset of the StatsBomb 360 Open Data that was not included in the 210 matches used for the empirical derivation of xT and DxT. This new dataset consists of 85 matches, containing 151,827 actions of type pass, carry, and shot. This ensures that our validation is conducted on a separate set of matches, allowing us to assess how well DxT generalizes to unseen game situations.
The main challenge in validating a complex model like DxT lies in the computational resources required to process a sufficiently large number of cases for validation. Given the structure of DxT, the validation process involves handling large-scale probability matrices, making the calculations computationally expensive.
Across 10 iterations, we had to compute and manipulate three key probability matrices:
Shot probability matrix of size 8 × 12;
Move probability matrix of size 8 × 12;
Transition probability matrix of size (8 × 12) × (8 × 12).
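These three matrices combine in the xT fixed-point update that DxT builds on; a minimal sketch with placeholder inputs (the random values and the 8 × 12 grid are assumptions for illustration, not our fitted matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 8 * 12  # pitch grid assumed in this paper

shot_prob = rng.random(n_cells) * 0.1                # P(shoot | cell), placeholder
move_prob = 1.0 - shot_prob                          # P(move | cell)
transition = rng.random((n_cells, n_cells))
transition /= transition.sum(axis=1, keepdims=True)  # row-stochastic: rows sum to 1
xg = rng.random(n_cells) * 0.3                       # xG when shooting from each cell

# xT-style fixed-point update, run for the 10 iterations used in validation
xt = np.zeros(n_cells)
for _ in range(10):
    xt = shot_prob * xg + move_prob * (transition @ xt)
```

Each iteration lets threat propagate one move deeper: cell values account first for immediate shots, then for moves into cells that themselves lead to shots, and so on.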
The transition probability matrix computation was even more demanding. For each of the 151,827 actions, it was necessary to compute:
The hypothetical xG for the action itself.
The hypothetical xG for every grid cell on the pitch—considering the positioning of players and requiring an additional 8 × 12 × 151,827 xG computations.
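The scale of this workload follows directly from the counts above; a quick back-of-the-envelope check:

```python
# Count of hypothetical xG evaluations required for the validation runs
n_actions = 151_827
grid_cells = 8 * 12            # cells on the pitch grid
per_action = grid_cells + 1    # every grid cell plus the action itself
total_xg_calls = n_actions * per_action
print(total_xg_calls)          # over 14.7 million xG evaluations
```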
Despite vectorizing our code to optimize performance, we encountered significant computational bottlenecks when attempting to run these calculations.
To overcome these limitations, we performed our experiments on MARWAN’s high-performance computing (HPC) infrastructure, which provided the necessary computational power to process large-scale probability matrices. The hardware configuration was as follows:
CPU: 2× Intel Xeon Gold 6148 (2.4 GHz, 20 cores each)
RAM: 1 TB
GPU: 2× NVIDIA Tesla P100 (12 GB) with CUDA v10.1
After computing the DxT grids for each of the 151,827 evaluated actions (implemented in Python 3 [33]), we assessed the model’s predictive performance using the Brier Score and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC). These metrics were selected for their specific advantages: the AUC-ROC quantifies the model’s ability to discriminate between positive and negative outcomes, which is particularly relevant given the class imbalance in our dataset, where only 1.03% of sequences lead to goals; meanwhile, the Brier Score evaluates both the calibration and accuracy of probabilistic predictions by measuring the mean squared error between predicted probabilities and actual outcomes.
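Both metrics can be computed directly from predicted probabilities and binary outcomes; a self-contained sketch on toy data (not our match dataset), written without external dependencies beyond NumPy:

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and actual outcomes."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

def auc_roc(y_true, y_prob):
    """AUC via the rank-sum (Mann-Whitney) formulation, averaging tied ranks."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    order = np.argsort(y_prob)
    ranks = np.empty(len(y_prob), dtype=float)
    ranks[order] = np.arange(1, len(y_prob) + 1)
    for v in np.unique(y_prob):            # average ranks over tied scores
        tied = y_prob == v
        ranks[tied] = ranks[tied].mean()
    n_pos = int(np.sum(y_true == 1))
    n_neg = len(y_true) - n_pos
    rank_sum = ranks[y_true == 1].sum()
    return float((rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))

# Toy data mimicking the class imbalance (positive outcomes are rare)
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
p = np.array([0.05, 0.10, 0.02, 0.20, 0.03, 0.07, 0.01, 0.15, 0.04, 0.60])
print(brier_score(y, p), auc_roc(y, p))
```

In practice, the equivalent scikit-learn functions `brier_score_loss` and `roc_auc_score` compute the same quantities.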
Our model achieved a Brier Score of 0.0811 and an AUC-ROC of 0.7283, significantly outperforming xT models, which yielded a Brier Score of 0.4260 and an AUC-ROC of 0.6038 (see Table 5).
6. Discussion
The DxT model demonstrates competitive performance based on established validation metrics. A Brier Score close to 0 indicates high calibration quality, while a score approaching 1 signals poor predictive performance. A value below 0.1 is generally considered reliable. Our DxT model achieves a Brier Score of 0.0811, suggesting that its predictions are well-calibrated with actual outcomes. In addition, the model obtains an AUC-ROC of 0.7283, exceeding the typical acceptability threshold of 0.7. This indicates a reasonable capacity to discriminate between high- and low-threat actions.
In contrast, the original xT model yields a significantly higher Brier Score of 0.4260, indicating poor calibration and limited reliability. Its AUC-ROC value of 0.6038 is only marginally above the baseline for random prediction, reflecting its limited ability to accurately distinguish between impactful and non-impactful actions. These shortcomings are particularly evident in situations where tactically valuable actions—such as backward or lateral passes—are systematically underestimated.
To further contextualize DxT’s performance, we conducted a direct comparison with VAEP, one of the few existing models that employs the same evaluation metrics. VAEP outperforms both xT and DxT in numerical terms, achieving a Brier Score of 0.01376 and an AUC-ROC of 0.7693. This result is not unexpected, given VAEP’s design for comprehensive action valuation using large-scale datasets. However, the performance gap between VAEP and DxT can be narrowed by addressing some current limitations in the DxT implementation.
Several avenues for improvement are identified. First, the transition probability matrices could be refined through more expressive mathematical transformations—for instance, using logarithmic scaling to better capture subtle but meaningful spatial transitions. Second, the shot and move probability matrices could benefit from calibration on a larger and more diverse dataset to improve generalization. Third, integrating synchronized tracking data would enable the model to consider the positions of all 22 players, significantly enriching its tactical awareness. At present, DxT relies on freeze-frame data, which limits the scope of spatial interpretation.
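As one illustration of the logarithmic scaling idea, the transition matrix could be passed through log1p and renormalised row-wise; this particular recipe is an assumption for illustration, not a finalized design choice:

```python
import numpy as np

def log_rescale(transition):
    """Compress a row-stochastic matrix with log1p, then renormalise rows."""
    scaled = np.log1p(transition)   # concave map: small entries gain relative weight
    return scaled / scaled.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
t = rng.random((96, 96))
t /= t.sum(axis=1, keepdims=True)   # rows sum to 1

t_log = log_rescale(t)
assert np.allclose(t_log.sum(axis=1), 1.0)  # still a valid probability matrix
```

Because log1p is concave and fixes zero, the transformation narrows the gap between dominant and rare transitions while preserving row-stochasticity, which is the intended effect of capturing subtle but meaningful spatial transitions.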
Perhaps the most critical enhancement would involve coupling DxT with a more robust xG model. Since DxT’s threat computation heavily depends on Expected Goal estimates, using an xG model trained on industrial-scale data would improve its reliability. Academic xG models often require class-balancing techniques to improve performance; however, these adjustments tend to distort the distribution of xG values by pushing them toward 0.5, thereby creating an artificial representation of scoring likelihood.
Beyond performance metrics, DxT offers significant advantages in terms of interpretability and tactical insight. Unlike VAEP, which produces abstract probability estimates based on predefined action windows, DxT generates spatially explicit threat maps that reflect the real-time configuration of players on the field. These maps allow coaches and analysts to intuitively assess which zones are more or less dangerous based on opponent positioning and available space. For example, a zone may appear threatening not because of proximity to the goal but due to limited defensive coverage—something that DxT captures and visually conveys. By contrast, models like VAEP or those based on deep learning architectures tend to function as black boxes, offering little transparency regarding how action values are derived in specific tactical contexts. Their reliance on latent feature representations makes them difficult to interpret, especially when concrete spatial decisions are required on the pitch.
One important conceptual feature of VAEP is its use of a fixed action horizon to compute the value of an event. Specifically, the value of a given action is defined as the change in the team’s probability of scoring (or conceding) within the next N actions, where N is commonly set to 10. However, this number of actions (N) is typically selected empirically, without a formal sensitivity analysis or mathematical justification. As a result, VAEP imposes a form of temporal rigidity that may not accurately reflect the natural variability in attacking sequences, which can vary widely in length depending on the tactical context. In contrast, the DxT model uses an iterative computation process grounded in convergence analysis. Rather than fixing a predetermined number of steps, DxT updates threat values over multiple iterations until the values stabilize. This mathematically driven approach ensures that threat propagation is captured organically, without relying on arbitrary parameters, and contributes to the model’s internal coherence.
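The convergence-driven iteration can be sketched as follows; the update rule is the xT fixed point that DxT builds on, while the tolerance, grid size, and random inputs are illustrative assumptions:

```python
import numpy as np

def iterate_to_convergence(shot_prob, xg, move_prob, transition,
                           tol=1e-6, max_iter=500):
    """Update threat values until the largest per-cell change falls below tol."""
    xt = np.zeros(len(shot_prob))
    for i in range(1, max_iter + 1):
        new = shot_prob * xg + move_prob * (transition @ xt)
        if np.max(np.abs(new - xt)) < tol:
            return new, i          # horizon emerges from convergence
        xt = new
    return xt, max_iter

rng = np.random.default_rng(42)
n = 8 * 12
shot = 0.05 + 0.1 * rng.random(n)              # keeps move_prob strictly below 1
move = 1.0 - shot
trans = rng.random((n, n))
trans /= trans.sum(axis=1, keepdims=True)      # row-stochastic
xg = 0.3 * rng.random(n)

xt, iters = iterate_to_convergence(shot, xg, move, trans)
print(iters)  # number of steps chosen by the data, not a preset N
```

Because each iteration contracts the update by at least the largest move probability, the loop is guaranteed to stabilize, which is what removes the need for VAEP’s fixed N.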
Taken together, these elements position DxT as a practical, transparent, and tactically informed framework, particularly well-suited for applications that require fine-grained spatial and contextual analysis—such as space occupation, buildup evaluation, or action-by-action interpretation. In contrast, VAEP—along with other models relying on deep learning architectures—is more appropriate for global performance assessment, such as player ranking, scouting, or recruitment, where the objective is to quantify overall impact rather than interpret tactical behavior. Ultimately, DxT offers a complementary perspective that aligns more closely with the real-time spatial and tactical logic of football, making it especially valuable in decision-support environments focused on coaching and game analysis.
7. Conclusions
This study introduced the Dynamic Expected Threat (DxT) model, a novel framework designed to overcome the limitations of traditional Expected Threat (xT) models. Unlike conventional xT approaches, which assign fixed threat values to pitch zones, DxT dynamically adjusts threat estimations by integrating real-time spatial configurations of players. This innovation allows for a more context-aware assessment of football actions.
Our model outperforms the traditional xT model, as demonstrated by its superior Brier Score (0.0811 vs. 0.4260) and AUC-ROC (0.7283 vs. 0.6038). Furthermore, two case studies based on real match sequences—Albania’s opening goal against Croatia in the Euro 2024 group stage and Portugal’s final offensive sequence in the quarter-final against France—illustrate DxT’s ability to capture both tactical nuances, such as flank switches and spatial manipulation, and to support the analysis of individual decision-making in high-stakes scenarios.
Despite its advancements, the DxT model still has several limitations that future research should address. One key area for improvement is the expansion of training datasets to enhance the performance of the xG model underlying the DxT framework. Additionally, exploring alternative formulations for transition probability matrices—such as logarithmic scaling—could better capture subtle variations in xG values. Finally, integrating continuous tracking data alongside event data, rather than relying solely on freeze frames, would be a crucial step toward increasing the model’s realism and robustness.