Can Semantic Methods Enhance Team Sports Tactics? A Methodology for Football with Broader Applications

Di Rubbo, Alessio; Neri, Mattia; Pareschi, Remo; Pedroni, Marco; Valtancoli, Roberto; Zica, Paolino

doi:10.3390/sci8030063

Open Access Review


by Alessio Di Rubbo ¹, Mattia Neri ², Remo Pareschi ¹,*, Marco Pedroni ³, Roberto Valtancoli ⁴ and Paolino Zica ⁵

¹ Stake Lab, University of Molise, 86100 Campobasso, Italy
² Bioretics, 47521 Cesena, Italy
³ Institute for Generative Strategy, 44121 Ferrara, Italy
⁴ Cesena Femminile Football Club, 47521 Cesena, Italy
⁵ Zica Sport, 82100 Benevento, Italy
* Author to whom correspondence should be addressed.

Sci 2026, 8(3), 63; https://doi.org/10.3390/sci8030063

Submission received: 16 December 2025 / Revised: 12 February 2026 / Accepted: 25 February 2026 / Published: 11 March 2026

(This article belongs to the Special Issue Computational Linguistics and Artificial Intelligence)


Abstract

This paper explores how semantic-space reasoning, traditionally used in computational linguistics, can be extended to tactical decision-making in team sports. Building on the analogy between texts and teams—where players act as words and collective play conveys meaning—the proposed methodology models tactical configurations as compositional semantic structures. Each player is represented as a multidimensional vector integrating technical, physical, and psychological attributes; team profiles are aggregated through contextual weighting into a higher-level semantic representation. Within this shared vector space, tactical templates such as high press, counterattack, or possession build-up are encoded analogously to linguistic concepts. Their alignment with team profiles is evaluated using vector-distance metrics, enabling the computation of tactical “fit” and opponent-exploitation potential. A Python-based prototype demonstrates how these methods can generate interpretable, dynamically adaptive strategy recommendations, accompanied by fine-grained diagnostic insights at the attribute level. Evaluation through synthetic scenarios and a pilot study with real match data establishes internal consistency and feasibility of the approach; operational validity in live coaching contexts remains an open question for future prospective validation. Beyond football, the framework offers a potentially generalizable approach for collective decision-making in team-based domains—ranging from basketball and hockey to cooperative robotics and human–AI coordination systems. The paper concludes by outlining future directions toward real-world data integration, predictive simulation, and the validation work required before operational deployment.

Keywords: semantic distance; decision support systems; recommender systems; sports analytics; tactical optimization; human–artificial integration


1. Introduction

Modern football has undergone a radical transformation, evolving from a discipline grounded mainly in coaches’ intuition and experience into one profoundly shaped by objective data analysis. The widespread adoption of advanced analytics systems, proprietary metrics such as expected goals (xG) and expected assists (xA), and the availability of detailed information on players’ physical, technical, and tactical performance have enabled a quantitative understanding of phenomena once accessible only through human judgment [1].

In this data-driven landscape, tactical optimization—the ability to select and dynamically adjust playing strategies according to the team’s internal characteristics and the contingent match conditions—has become a decisive competitive factor. At elite levels, marginal advantages can determine the outcome of an entire season. Tactical effectiveness no longer depends solely on individual talent or preparation quality but also on the ability to interpret complex contexts, anticipate opponents’ actions, and adapt strategies in real time. However, traditional decision models based primarily on qualitative heuristics and experience reach their limits when faced with the high dimensionality and dynamism of modern play [2,3].

Despite significant progress in match analysis, a fundamental disconnect persists between quantitative tools (performance indicators, spatial distributions, xG models) and qualitative factors that critically influence performance: group cohesion, psychological resilience, team morale, and residual energy [4,5]. Current decision support systems emphasize easily measurable variables while neglecting intangible dimensions that often prove decisive under pressure.

Research gap: This disconnect causes (i) loss of strategically relevant information, (ii) limited adaptability of recommendations, and (iii) persistent reliance on subjective intuition in crucial match phases [6]. No existing framework provides a unified representation integrating quantitative performance data with qualitative contextual factors into a single, computationally tractable space for automated tactical recommendation.

To address this gap, the present study introduces a Decision Support System for Tactical Optimization using a novel semantic-distance methodology. The core innovation is representing both team states and tactical strategies as vectors in a shared 14-dimensional attribute space, enabling direct geometric comparison. The system recommends strategies by minimizing weighted Euclidean distance between a team’s profile and each strategy’s requirements; the rationale for this metric choice is detailed in Section 3.5.

This approach extends the semantic-distance methodology from Recommending Actionable Strategies [7]—originally developed for bridging analytical frameworks with decision heuristics—to operational football tactics. The key adaptations are:

replacing general decision categories with 14 concrete macro-attributes capturing technical, psychological, and organizational dimensions; and
replacing general heuristics with 20 canonical football strategies (e.g., high pressing, counterattack, positional defense).

What distinguishes this work: (i) unified representation integrating quantitative and qualitative factors; (ii) dynamic attribute weights adjusting in real time based on match context; (iii) transparent diagnostics showing why a strategy is recommended.

1.1. Objectives and Contributions

This study aims to: (1) formalize a semantic model encoding team states and strategies in a shared attribute space; (2) develop a prototype DSS with context-aware recommendations; (3) evaluate internal consistency and real-data feasibility.

The main contributions are:

1. Semantic Model: 14 macro-attributes synthesizing team complexity, with 20 canonical strategies as ideal-profile vectors.
2. Adaptive Engine: Python prototype with dynamic weighting adjusting recommendations based on energy, time pressure, and opponent characteristics.
3. Systematic Validation: Evaluation through synthetic scenarios and German youth football data, including ablation and robustness analyses.

1.2. Paper Organization

Section 2 reviews related work; Section 3 presents methodology; Section 4 describes implementation; Section 5 reports evaluation; Section 6 presents the pilot study; Section 7 discusses limitations; Section 8 concludes.

2. Background and Related Work

The introductory section highlighted the need to bridge the gap between quantitative analytics and heuristic decision-making in football. To formalize the proposed solution, it is first necessary to establish a solid conceptual foundation that clarifies the distinction between strategy and tactics, and then to situate this distinction within the broader context of semantic modeling and decision-support research.

2.1. Strategic and Tactical Analysis in Football

In everyday football discourse, the terms strategy and tactics are often used interchangeably. However, in the academic and analytical literature, they refer to distinct levels of decision-making that are crucial to our methodology.

Strategy (or playing identity) defines the overall approach or long-term plan through which a team intends to compete. It depends on structural and contextual factors such as squad quality, key players’ technical and physical profiles, seasonal goals, the coach’s philosophy, and the team’s physical and psychological resources [5,8]. Strategy answers the question: What do we want to achieve?—for example, controlling the game through ball possession.

Tactics, in contrast, represent the operational choices and on-field configurations that translate strategy into concrete actions, often in response to real-time match dynamics. They include formation choices, player assignments, coordinated movements (e.g., defensive shifts), and in-game adaptations such as introducing an additional forward when chasing a result. Tactics answer the question: How do we achieve it?

This distinction is central to the proposed Decision Support System (DSS). The system operates at the tactical level—optimizing action choices based on a multidimensional strategic representation of the team. The semantic-distance model quantifies the alignment between:

1. the strategic vector of the team (its current state, defined by 14 macro-attributes), and
2. the ideal tactical vector (the target profile of a given strategy, such as counterattack or high pressing).

A correct balance between strategic identity and tactical flexibility ensures internal coherence. Teams with strong strategic identity but low adaptability become predictable and fragile, while excessive tactical improvisation undermines structural stability and collective performance [9].

2.2. Canonical Tactical Strategies in Modern Football

The following tactical archetypes comprise the conceptual foundation of our vector modeling framework. For each, the team attributes required for effective implementation are indicated.

High Pressing. A proactive approach aimed at regaining possession in the opponent’s half by applying intense, coordinated pressure. It reduces opponents’ time and space, forcing errors and enabling rapid goal opportunities [10,11,12]. It requires exceptional physical conditioning, coordination, and risk tolerance.

Counterattack (Rapid Transition). Based on defending in a compact mid-low block to lure the opponent forward, then striking rapidly upon regaining possession. It exploits spaces behind the defense and requires speed, verticality, and sharp decision-making.

Positional Defense. A space-oriented approach emphasizing spatial control over immediate pressure. Spatio-temporal analysis methods have been developed to quantify team coordination and territorial control [13]. Positional defense prioritizes equilibrium, communication, and tactical discipline while conserving energy [14].

Gegenpressing (Pressing After Loss). An aggressive evolution of pressing, aiming to recover the ball within 3–5 s after losing it by exploiting the opponent’s temporary disorganization. Extremely demanding, it requires maximal energy, readiness, and synchronization.

Build-up Play. A possession-based approach initiating offensive buildup from the back through short passes and gradual progression, designed to control tempo and overcome pressure via numerical superiority [15]. It requires technically skilled players across all lines, especially defenders and goalkeepers, who can distribute the ball.

These archetypes serve as idealized templates within our system, allowing the computational comparison of a team’s actual state with prototypical tactical profiles.

2.3. Semantic Distance Models

Semantic distance provides a quantitative measure of how far two informational entities—concepts, documents, or representations—differ in meaning when embedded in a shared vector space. In natural language processing (NLP), such models rest on the principle that numerical representations of linguistic units capture latent semantic relations, enabling mathematical comparison across heterogeneous content [15,16].

Classical approaches include:

Cosine similarity, which measures the angle between normalized vectors, robust to scale differences;
Euclidean distance, which quantifies geometric deviation in continuous space;
Probabilistic metrics, such as Kullback–Leibler [17] or Jensen–Shannon [18] divergences, used when entities are modeled as probability distributions.

With the advent of Transformer architectures (e.g., BERT, RoBERTa, Sentence-BERT) [19,20], contextual embeddings have dramatically improved representation quality, dynamically capturing meaning and outperforming static models such as Word2Vec and GloVe. These techniques have been widely adopted in information retrieval, question answering, text classification, and recommender systems [21].

In the reference paper Recommending Actionable Strategies [7], semantic distance was used to integrate two historically distinct traditions in strategy theory:

1. structured analytical frameworks (e.g., SWOT, 6C), and
2. decision heuristics (e.g., the Thirty-Six Stratagems).

Both were projected into a shared semantic space, enabling the computation of similarity matrices that link structured analysis to heuristic insight. This pipeline demonstrated how semantic methods can act as an interpretive bridge between abstract models and actionable guidance.

The present research adapts that paradigm to the football domain, replacing general analytical categories with 14 football-specific macro-attributes (e.g., Offensive Strength, Tactical Cohesion, Psychological Resilience) and general heuristics with canonical tactical strategies. The optimal tactical choice $S^{*}$ is thus defined as the strategy minimizing the semantic distance $d(V_{team}, V_{strategy}(S))$ between the team’s current vector representation and the target tactical profile:

$S^{*} = \arg\min_{S} \, d(V_{team}, V_{strategy}(S))$
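The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' prototype: the function names, the three-dimensional toy vectors, and the default of uniform weights are all assumptions made for the example (the paper's space has 14 dimensions and context-dependent weights).

```python
import numpy as np

def weighted_euclidean(v_team, v_strategy, weights=None):
    """Weighted Euclidean distance between two attribute vectors."""
    v_team = np.asarray(v_team, dtype=float)
    v_strategy = np.asarray(v_strategy, dtype=float)
    if weights is None:
        # Assumption for this sketch: every attribute counts equally.
        weights = np.ones_like(v_team)
    weights = np.asarray(weights, dtype=float)
    return float(np.sqrt(np.sum(weights * (v_team - v_strategy) ** 2)))

def recommend(v_team, strategies, weights=None):
    """Return the strategy name with the smallest distance, plus all distances."""
    distances = {
        name: weighted_euclidean(v_team, profile, weights)
        for name, profile in strategies.items()
    }
    best = min(distances, key=distances.get)
    return best, distances

# Hypothetical toy data in a 3-dimensional space (illustration only).
team = [0.8, 0.3, 0.6]
strategies = {
    "high_press": [0.9, 0.9, 0.7],
    "counterattack": [0.7, 0.4, 0.8],
    "positional_defense": [0.3, 0.2, 0.5],
}
best, dists = recommend(team, strategies)
print(best, dists)
```

Returning the full distance dictionary alongside the winner keeps the ranking of alternatives available for inspection.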

2.4. Decision Support Systems in Sports

Decision Support Systems (DSS) are computational tools designed to assist coaches, analysts, and managers in complex decision-making by integrating quantitative data, expert knowledge, and predictive modeling capabilities. The increasing availability of high-resolution data—from GPS tracking, wearable sensors, and video-analysis platforms—has fostered the development of DSS capable of transforming information into operational insight [6,22].

Across sports, DSS applications range from performance optimization to injury prevention and tactical planning:

Athletics and individual sports—systems such as Catapult AMS or Kitman Labs monitor fatigue and workload by combining physiological and subjective data;
Basketball and team sports—platforms like Synergy Sports and Second Spectrum merge positional tracking with video analytics to identify offensive and defensive patterns [23];
Cycling and endurance disciplines—predictive tools such as Performance Management Charts use power and heart-rate data to optimize training loads.

In football, systems like Wyscout and InStat provide video-based statistical analytics; StatsBomb IQ integrates positional and event data into advanced metrics (e.g., xG, passing networks); SciSports Insight uses AI-based indices for player recruitment and compatibility analysis; and SkillCorner applies computer vision to extract player trajectories in real time [24].

While these systems have expanded analytical capabilities, most focus on quantitative or spatial data, overlooking qualitative and psychological aspects such as morale, cohesion, and resilience. Moreover, strategic recommendations often rely on expert interpretation rather than automated reasoning. The present work addresses this methodological gap by introducing a semantic-distance-based DSS that integrates multidimensional, context-aware modeling—combining quantitative metrics and tacit knowledge into a unified, interpretable framework.

3. Methodology

3.1. Theoretical Framework

We adapt the methodology of Recommending Actionable Strategies [7] to the football domain, aiming to build a tactical recommender that integrates a team’s technical, organizational, and psychological dimensions within a shared semantic space. The core idea is to encode both (i) the contextual state of a team and (ii) the ideal profiles of canonical tactical strategies in the same vector space, and then to select the tactic whose profile is closest (in a semantic–geometric sense) to the team’s current state. Recommendations can be updated dynamically as match conditions evolve (e.g., residual energy, technical/physical gaps, time pressure).

Three pillars characterize this approach:

1. Multidimensional integration of quantitative (individual and collective performance) and qualitative (morale, cohesion, psychological resilience) factors.
2. Semantic formalization via normalized vectors in a common space, enabling consistent comparisons between teams and tactics.
3. Dynamic adaptability through real-time reweighting of distances using match conditions.

3.2. Context Tree and Aggregation

We represent team context with a hierarchical context tree that aggregates heterogeneous data sources into a unified vector representation. The tree has three levels:

1. Leaf level: Raw observables from match analytics: player-level metrics from event data (passes, shots, tackles), tracking data (sprint distance, positioning), and physiological monitoring (heart rate, estimated fatigue).
2. Intermediate level: Role-aggregated attributes computed by combining leaf-level data within positional groups (e.g., “forward line offensive output,” “midfield ball retention”).
3. Root level: The 14 macro-attributes ($A_{1}, \dots, A_{14}$) that define the shared semantic space, computed by a weighted combination of intermediate-level signals.

Figure 1 illustrates this hierarchical structure for a subset of attributes.

3.2.1. Aggregation Example

To illustrate the aggregation process concretely, consider how $A_{1}$ (Offensive Strength) is computed for a team fielding a 4-3-3 formation:

1. Leaf level: Extract per-player metrics, e.g., Striker A: xG $= 0.82$, shot accuracy $= 0.71$; Winger B: xA $= 0.65$, successful dribbles $= 0.78$.
2. Intermediate level: Aggregate within positional groups using role-based weights:

$\begin{aligned} \text{Forward Output} &= 0.5 \times \text{xG}_{ST} + 0.3 \times \text{ShotAcc}_{ST} + 0.2 \times \text{xG}_{wings} \\ \text{Midfield Creativity} &= 0.6 \times \text{xA}_{CAM} + 0.4 \times \text{KeyPasses}_{CM} \end{aligned}$
3. Root level: Combine intermediate values into the macro-attribute:

$A_{1} = 0.50 \times \text{Forward Output} + 0.30 \times \text{Midfield Creativity} + 0.20 \times \text{Wide Contribution}$

All intermediate and final values are normalized to $[0, 1]$ via min-max scaling against league or historical benchmarks, ensuring cross-team comparability. The normalization procedure is specified in detail below.
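As a worked illustration of the two aggregation steps, the short sketch below evaluates the formulas with the striker values quoted in the example (xG 0.82, shot accuracy 0.71). The remaining inputs (`xg_wings`, `xa_cam`, `key_passes_cm`, `wide_contribution`) are not given in the text and are hypothetical placeholders chosen only so the arithmetic can be followed end to end.

```python
# Values quoted in the example for Striker A.
xg_st = 0.82
shot_acc_st = 0.71

# Hypothetical placeholders: the text does not specify these inputs.
xg_wings = 0.40
xa_cam = 0.60
key_passes_cm = 0.50
wide_contribution = 0.70

# Intermediate level: role-weighted aggregation within positional groups.
forward_output = 0.5 * xg_st + 0.3 * shot_acc_st + 0.2 * xg_wings
midfield_creativity = 0.6 * xa_cam + 0.4 * key_passes_cm

# Root level: combine intermediate values into the macro-attribute A1.
a1 = 0.50 * forward_output + 0.30 * midfield_creativity + 0.20 * wide_contribution

print(round(forward_output, 4), round(midfield_creativity, 4), round(a1, 4))
```

With these placeholder inputs the intermediate values are 0.703 and 0.56, giving an Offensive Strength of about 0.66. Because the weights at each level sum to 1 and all inputs lie in $[0, 1]$, the result stays in $[0, 1]$ as well.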

3.2.2. Normalization Procedure

To ensure consistent scaling across teams and time periods, each raw attribute value $x$ is transformed to a normalized value $\tilde{x} \in [0, 1]$ using the following protocol:

Min-Max Scaling Formula.

$\tilde{x} = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (1)$

where $x_{min}$ and $x_{max}$ are benchmark bounds derived from reference populations as specified below.

Benchmark Population Definition. Normalization benchmarks are computed from a reference population defined as follows:

League-level benchmarks (default): For professional deployments, benchmarks are derived from all players in the same league and division (e.g., Bundesliga, Serie A) over the reference window. A normalized value of 0.5 then corresponds to the midpoint of the league benchmark range, which coincides with league-average performance when the bounds are symmetric about the mean (as with $μ \pm 2σ$).
Competition-level benchmarks: For tournament contexts (e.g., Champions League, World Cup), benchmarks may be computed across all participating teams to reflect the elevated baseline.
Historical team benchmarks: For longitudinal tracking of a single team, benchmarks may be derived from that team’s own historical range, enabling detection of relative improvement or decline.

The current prototype uses synthetic benchmarks derived from the role-specific distributions in Table A1 (Appendix A), with $x_{min} = μ - 2σ$ and $x_{max} = μ + 2σ$ for each attribute–role combination.

Reference Window (Temporal Scope). To prevent instability from outlier matches, benchmarks are computed over one of the following reference windows:

Season window (default): The most recent complete season (e.g., 34–38 matches for major European leagues). This captures stable population characteristics while remaining current.
Rolling window: For mid-season deployment, a rolling window of the most recent $N = 10$ league matches provides more responsive benchmarks, updated weekly.
Fixed historical window: For retrospective analysis or cross-season comparison, a fixed reference period (e.g., 2022–23 season) ensures consistent scaling.

Clipping and Robustness. To handle outliers and ensure bounded outputs:

1. Pre-normalization clipping: Raw values outside $[x_{min}, x_{max}]$ are clipped to the boundary values before applying Equation (1). This prevents exceptional performances (positive or negative) from distorting the scale.
2. Robust benchmark estimation: Benchmarks may optionally use the 5th and 95th percentiles rather than true min/max to reduce sensitivity to extreme outliers:

$x_{min} = P_{5}(D), \quad x_{max} = P_{95}(D)$

where $D$ is the reference population distribution.
3. Floor for near-zero ranges: If $x_{max} - x_{min} < ϵ$ (indicating near-constant values), the attribute is assigned the default value 0.5 to avoid division instability.
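The normalization protocol (Equation (1) with clipping, optional percentile bounds, and the near-zero-range floor) could be sketched as follows. The function names and the value of `EPS` are assumptions for illustration; the text does not specify a numerical threshold for $ϵ$.

```python
import numpy as np

EPS = 1e-6  # assumed threshold for "near-constant" ranges (not specified in the text)

def normalize(x, x_min, x_max, eps=EPS):
    """Min-max scale x into [0, 1] with pre-normalization clipping."""
    if x_max - x_min < eps:
        return 0.5  # near-zero range: fall back to the neutral default
    x_clipped = min(max(x, x_min), x_max)  # clip to the benchmark bounds
    return (x_clipped - x_min) / (x_max - x_min)

def robust_bounds(reference, lower_pct=5, upper_pct=95):
    """Percentile-based benchmark bounds for a reference population."""
    ref = np.asarray(reference, dtype=float)
    return float(np.percentile(ref, lower_pct)), float(np.percentile(ref, upper_pct))

def sigma_bounds(mu, sigma):
    """Synthetic bounds mu +/- 2 sigma, as described for the prototype."""
    return mu - 2 * sigma, mu + 2 * sigma

lo, hi = sigma_bounds(0.6, 0.1)
print(normalize(0.6, lo, hi))    # value at the mean of symmetric bounds
print(normalize(0.95, lo, hi))   # above the upper bound, so it is clipped
```

With symmetric $μ \pm 2σ$ bounds the mean maps to the midpoint of the scale, and any value beyond the bounds saturates at 0 or 1.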

Leakage Prevention. In retrospective evaluation and real-time deployment, normalization must use only information available at decision time:

1. Temporal ordering: Benchmarks for match $t$ are computed from data up to match $t - 1$ only. Future match data are never included in benchmark computation.
2. Held-out validation: When evaluating DSS performance over a test period, benchmarks are frozen at values computed from a prior training period. No benchmark updates occur during the test window.
3. Same-match exclusion: When computing benchmarks, the current match’s data are excluded to prevent self-referential scaling.
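A minimal sketch of these safeguards, assuming matches are stored in chronological order and indexed from 0; the helper names and the sample values are hypothetical.

```python
def benchmarks_before(values_by_match, t):
    """Bounds for match t computed from matches 0..t-1 only.

    Slicing with [:t] enforces both temporal ordering and same-match
    exclusion: match t itself and all later matches are never seen.
    """
    past = values_by_match[:t]
    if not past:
        return None  # no history yet; the caller must fall back to a default
    return min(past), max(past)

def frozen_benchmarks(values_by_match, train_end):
    """Bounds fixed at the end of a training period for held-out evaluation."""
    return benchmarks_before(values_by_match, train_end)

# Hypothetical per-match raw values; index 3 is an outlier match.
raw_values = [9.8, 10.4, 10.1, 12.9, 10.0]

print(benchmarks_before(raw_values, 3))   # the outlier at index 3 is excluded
print(benchmarks_before(raw_values, 4))   # the outlier now belongs to the past
print(frozen_benchmarks(raw_values, 3))   # reused unchanged across the test window
```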

Current prototype scope: The implementation uses fixed synthetic benchmarks (role-specific $μ \pm 2σ$) that do not require temporal updating, thereby avoiding leakage by construction. Production deployments should implement the rolling-window protocol with the temporal safeguards above.

3.2.3. Data Sources

The context tree is designed to integrate multiple data streams, each contributing to specific macro-attributes:

Event data (e.g., Opta, StatsBomb): passes, shots, tackles, interceptions → technical/tactical attributes ($A_{1}$–$A_{6}$), tactical cohesion ($A_{11}$), technical base ($A_{12}$)
Tracking data (e.g., SkillCorner, Second Spectrum): positions, velocities, distances → transition speed ($A_{4}$), residual energy ($A_{8}$), physical base ($A_{13}$)
Physiological monitoring (e.g., Catapult, Polar): heart rate, workload → residual energy ($A_{8}$)
Qualitative assessments: coach ratings, historical stability → psychological attributes ($A_{7}$, $A_{9}$), organizational attributes ($A_{10}$, $A_{14}$)

This modular design allows the system to operate with varying data availability—from fully instrumented professional environments to amateur contexts where only basic event data exists.

3.2.4. Measurement Framework

Integrating heterogeneous data streams requires explicit policies for handling missing data, quality control, and uncertainty propagation. The DSS adopts the following minimal measurement framework:

Data Stream Reliability Tiers. Each data source is assigned a reliability tier reflecting measurement precision and update frequency:

Tier 1 (High reliability): Event data from professional providers (Opta, StatsBomb), which are validated, near-complete, and low latency. Assigned confidence weight $c_{1} = 1.0$.
Tier 2 (Medium reliability): Tracking data (SkillCorner, Second Spectrum), with high precision but potential occlusion gaps; physiological monitoring (Catapult, Polar), with device-dependent accuracy. Assigned $c_{2} = 0.85$.
Tier 3 (Lower reliability): Qualitative assessments (coach ratings, historical proxies), which are subjective and infrequently updated. Assigned $c_{3} = 0.70$.

Missing Data Policy. When an input variable is unavailable, the following hierarchy applies:

1. Imputation from correlated sources: If a higher-tier source for the same attribute exists, use it with adjusted confidence. For example, if physiological $A_{8}$ data are missing, estimate from tracking-derived distance covered.
2. Historical baseline: If no current-match data exist, use the team’s season average for that attribute, flagged with reduced confidence ($c \leftarrow 0.5 \cdot c_{tier}$).
3. Neutral default: If no historical data exist, assign the attribute the midpoint value (0.5) with minimal confidence ($c = 0.3$), ensuring the attribute contributes little to distance until better data arrive.
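The fallback hierarchy can be expressed as a small resolver that returns a value together with its confidence. This is a sketch under stated assumptions: the function name and argument layout are hypothetical, and because the text does not say how confidence is "adjusted" for an imputed value, the sketch simply uses the confidence of the substitute source's tier.

```python
from typing import Optional, Tuple

# Confidence weights per reliability tier, as listed in the text.
TIER_CONFIDENCE = {1: 1.0, 2: 0.85, 3: 0.70}

def resolve_attribute(
    primary: Optional[float],
    tier: int,
    correlated: Optional[Tuple[float, int]] = None,
    season_average: Optional[float] = None,
) -> Tuple[float, float]:
    """Return (value, confidence) following the missing-data hierarchy."""
    if primary is not None:
        return primary, TIER_CONFIDENCE[tier]
    if correlated is not None:
        value, source_tier = correlated
        # Assumption: "adjusted confidence" means the substitute source's tier weight.
        return value, TIER_CONFIDENCE[source_tier]
    if season_average is not None:
        # Historical baseline: halve the tier confidence.
        return season_average, 0.5 * TIER_CONFIDENCE[tier]
    # Neutral default: midpoint value with minimal confidence.
    return 0.5, 0.3

print(resolve_attribute(0.72, tier=2))
print(resolve_attribute(None, tier=2, correlated=(0.65, 2)))
print(resolve_attribute(None, tier=2, season_average=0.60))
print(resolve_attribute(None, tier=2))
```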

Temporal Alignment. Data streams operate at different frequencies: event data are discrete (per-action), tracking data are high-frequency (25 Hz), and qualitative assessments are episodic (pre-match, halftime). The aggregation module aligns all inputs to a common temporal frame:

Match phase granularity: Attributes are computed per phase (first half, second half) or per 15-min window for finer resolution.
Windowed aggregation: High-frequency tracking data are averaged over the alignment window; event data are accumulated.
Carry-forward for episodic inputs: Qualitative assessments persist until updated (e.g., halftime morale rating carries into second half unless revised).
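The three alignment rules can be illustrated with a compact sketch that bins inputs into 15-minute windows. The input layout (lists of `(minute, value)` pairs, a dictionary of ratings keyed by window index) and the function name are assumptions for this example; stoppage time beyond the last window is ignored for simplicity.

```python
from collections import defaultdict

WINDOW_MIN = 15  # window length in minutes

def align(tracking, events, ratings, n_windows=6):
    """Align three input streams to common 15-minute windows.

    tracking: list of (minute, value) samples, averaged per window
    events:   list of event minutes, counted per window
    ratings:  dict {window_index: rating}, carried forward until updated
    """
    sums, counts = defaultdict(float), defaultdict(int)
    event_counts = defaultdict(int)
    for minute, value in tracking:
        w = int(minute // WINDOW_MIN)
        sums[w] += value
        counts[w] += 1
    for minute in events:
        event_counts[int(minute // WINDOW_MIN)] += 1

    frames, last_rating = [], None
    for w in range(n_windows):
        if w in ratings:
            last_rating = ratings[w]  # episodic input updated in this window
        mean = sums[w] / counts[w] if counts[w] else None
        frames.append({
            "window": w,
            "tracking_mean": mean,            # windowed average
            "event_count": event_counts[w],   # accumulated events
            "rating": last_rating,            # carry-forward value
        })
    return frames

# Hypothetical inputs: a pre-match rating (window 0) and a halftime rating (window 3).
frames = align(
    tracking=[(1, 4.0), (2, 6.0), (20, 5.0)],
    events=[3, 4, 50],
    ratings={0: 0.7, 3: 0.4},
)
for frame in frames:
    print(frame)
```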

Quality Control. Basic outlier detection is applied before aggregation:

Player-level attributes outside $[μ - 3 σ, μ + 3 σ]$ (based on role-specific distributions) are flagged and clamped to boundary values.
Implausible physiological readings (e.g., heart rate < 40 or > 220 bpm) are discarded and imputed from recent history.
Event data with missing location or timestamp fields are excluded from spatial aggregations but retained for count-based metrics.

Uncertainty Propagation and Reporting. Rather than reporting point-estimate distances, the DSS can optionally compute distance intervals reflecting input uncertainty:

$d_{interval} = [\, d(V_{team}^{-}, V_{strategy}),\; d(V_{team}^{+}, V_{strategy}) \,] \qquad (2)$

where $V_{team}^{-}$ and $V_{team}^{+}$ are lower and upper bounds on the team vector, derived by perturbing each attribute $A_{j}$ by $\pm ϵ_{j}$ proportional to $(1 - c_{j})$, where $c_{j}$ is the confidence weight for that attribute’s data source. When intervals for competing strategies overlap, the DSS flags the recommendation as uncertain and presents the top-k alternatives rather than a single choice.
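A minimal sketch of Equation (2) follows. The proportionality constant `scale`, the use of unweighted Euclidean distance, and the function names are assumptions for illustration. The two endpoint distances are sorted before being returned, since shifting the team vector down can move it either closer to or further from a given strategy profile.

```python
import numpy as np

def distance_interval(v_team, v_strategy, confidence, scale=0.1):
    """Distance interval from perturbing each attribute by eps_j = scale * (1 - c_j)."""
    v_team = np.asarray(v_team, dtype=float)
    v_strategy = np.asarray(v_strategy, dtype=float)
    eps = scale * (1.0 - np.asarray(confidence, dtype=float))
    v_lo = np.clip(v_team - eps, 0.0, 1.0)  # lower-bound team vector
    v_hi = np.clip(v_team + eps, 0.0, 1.0)  # upper-bound team vector
    d_lo = float(np.linalg.norm(v_lo - v_strategy))
    d_hi = float(np.linalg.norm(v_hi - v_strategy))
    return min(d_lo, d_hi), max(d_lo, d_hi)

def intervals_overlap(a, b):
    """True when two closed intervals (lo, hi) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical 2-dimensional example: the second attribute comes from a Tier 3 source.
interval = distance_interval([0.5, 0.5], [0.8, 0.8], confidence=[1.0, 0.7])
print(interval)
print(intervals_overlap((0.10, 0.30), (0.25, 0.50)))
print(intervals_overlap((0.10, 0.20), (0.30, 0.40)))
```

Note that evaluating only the two corner vectors, as Equation (2) does, is a heuristic: it does not guarantee that every team vector inside the perturbation box has a distance within the reported interval.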

Recommendation Stability Under Measurement Error. The robustness analysis in Section 5 (Monte Carlo perturbations, $\pm 5\%$ noise, $N = 100$) provides an empirical estimate of recommendation stability: 89.3% top-1 consistency across plausible measurement error. This serves as the baseline stability metric; deployments with lower-tier data should expect reduced consistency and should increase the perturbation range accordingly (e.g., $\pm 10\%$ for Tier 3–dominant profiles).

Current prototype scope: The implementation assumes Tier 1 data availability (complete event data) with synthetic generation for missing streams. The full measurement framework described above is designed for production deployment; the robustness analyses in Section 5 simulate its behavior under controlled noise conditions.

3.3. A Shared Semantic Space: 14 Macro-Attributes

The shared vector space is spanned by 14 macro-attributes, $A_{1}, \dots, A_{14}$, each normalized to $[0, 1]$ and computed via the context tree aggregation described above. This unified representation enables three core operations:

1. Team state encoding: Describe a team’s contextual state at time $t$ as a vector $V_{team} \in {[0, 1]}^{14}$.
2. Strategy profiling: Encode the ideal requirements of a tactical strategy as $V_{strategy} \in {[0, 1]}^{14}$.
3. Semantic matching: Compute distance $d(V_{team}, V_{strategy})$ to identify the best-aligned tactic.

This tripartite structure ensures that tactical recommendations account not only for technical fit but also for physical sustainability and psychological readiness—dimensions often overlooked in purely statistical approaches.

3.3.1. Complete Attribute Set

The semantic space comprises exactly 14 macro-attributes, indexed $A_{1}$ through $A_{14}$ with no gaps. Table 1 presents the complete ordered list; Table 2 provides detailed specifications grouped by functional category.

3.3.2. Design Rationale

The 14-attribute set was designed according to the following principles:

1. Completeness: The set covers the major dimensions of team performance identified in sports science literature: technical skill, tactical capability, physical capacity, psychological state, and organizational coherence.
2. Orthogonality: Attributes were selected to minimize redundancy. For example, $A_{1}$ (Offensive Strength) captures goal-scoring capability, while $A_{12}$ (Technical Base) captures underlying skill level; a team may have high technical quality but poor offensive output due to tactical misalignment.
3. Measurability: Each attribute can be estimated from available data sources: event data for $A_{1}$–$A_{6}$, $A_{11}$, $A_{12}$; tracking data for $A_{4}$, $A_{8}$, $A_{13}$; physiological monitoring for $A_{8}$; and qualitative assessment for $A_{7}$, $A_{9}$, $A_{10}$, $A_{14}$ (matching the source mapping in Section 3.2.3).
4. Tactical relevance: Each attribute has clear implications for strategy selection. For instance, low $A_{8}$ (energy) constrains high-pressing options; high $A_{4}$ (transition speed) enables counterattacking; strong $A_{11}$ (cohesion) supports complex positional play.

3.3.3. Attribute Categories

For exposition, we group attributes into three functional categories (though the DSS treats all 14 dimensions uniformly in distance computations):

Technical/Tactical ($A_{1}$–$A_{6}$): On-field performance capabilities (what the team can do).
Psychological/Physical ($A_{7}$–$A_{9}$, $A_{12}$–$A_{13}$): Individual and collective resources (what the team can sustain).
Organizational ($A_{10}$, $A_{11}$, $A_{14}$): Coordination and adaptation capabilities (how the team functions as a unit).

Table 2 provides the complete specification with aggregation sources for each attribute.

3.3.4. Aggregation Functions

Leaf-level player attributes are aggregated to team-level macro-attributes through weighted combination functions. The general form is:

$A_{j} = \sum_{i=1}^{n} w_{ij} \cdot a_{ij}, \quad \text{where } \sum_{i=1}^{n} w_{ij} = 1 \qquad (3)$

where $a_{ij}$ represents player $i$’s contribution to attribute $j$, and $w_{ij}$ is a role-based weight (e.g., forwards contribute more heavily to $A_{1}$; defenders to $A_{2}$). Specific aggregation formulas are documented in the prototype implementation (see Section 4.4).

3.3.5. Dynamic vs. Static Attributes

Some attributes vary during a match (dynamic), while others remain relatively stable (static):

Dynamic: $A_{7}$ (Psychological Resilience), $A_{8}$ (Residual Energy), $A_{9}$ (Team Morale); these vary significantly during a match based on events and fatigue.
Static: $A_{1}$–$A_{3}$, $A_{6}$, $A_{12}$–$A_{14}$; these are determined by squad composition and are stable within a match.
Context-dependent: $A_{4}$ (Transition Speed), $A_{5}$ (High Press Capability), $A_{10}$ (Time Management), $A_{11}$ (Tactical Cohesion); the baseline is static but the effective value depends on match context (e.g., $A_{5}$ is constrained by $A_{8}$; $A_{10}$ becomes critical late in matches).

The variability classification for all 14 attributes is summarized in Table 1. This distinction informs the dynamic reweighting mechanism (Section 3.5), which adjusts attribute salience in response to evolving match conditions.

3.3.6. Construct Validity: Input Overlap and Multicollinearity

Several macro-attributes share underlying player-level inputs, creating potential redundancy. Specifically:

$A_{7}$ (Psychological Resilience) and $A_{9}$ (Team Morale) both aggregate resilience and aggression with similar weights (0.7/0.3 vs. 0.6/0.4).
$A_{8}$ (Residual Energy) uses stamina and resilience, sharing the latter with $A_{7}$ and $A_{9}$ .
$A_{3}$ (Midfield Control) and $A_{6}$ (Width Utilization) both aggregate xA from fullbacks and central midfielders.

To quantify the resulting correlations, we generated 500 synthetic teams using the player attribute distributions in Table A1 (Appendix A) and computed the pairwise correlation matrix. Table 3 reports correlations for the most affected attributes.

Two attribute pairs exhibit high correlations: $A_{7}$–$A_{9}$ ($r = 0.98$) and $A_{3}$–$A_{6}$ ($r = 0.90$). The corresponding Variance Inflation Factors (VIF) are 35.1 ($A_{7}$), 37.0 ($A_{9}$), 21.6 ($A_{3}$), and 18.4 ($A_{6}$), all well above the conventional threshold of 10, indicating severe multicollinearity. The remaining 10 attributes have VIF $< 5$, suggesting acceptable independence.
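The $A_{7}$–$A_{9}$ correlation follows almost directly from the shared inputs. As a sketch, assume resilience and aggression are independent with equal variance (an assumption; the actual distributions are those of Table A1). The 0.7/0.3 and 0.6/0.4 weights then imply $r = 0.54 / \sqrt{0.58 \cdot 0.52} \approx 0.983$, consistent with the reported value. The simulation below reproduces this with uniform inputs and reports the two-variable quantity $1 / (1 - r^{2})$, which is a lower bound on the full VIF.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)
a7, a9 = [], []
for _ in range(500):  # 500 synthetic teams, as in the text
    resilience = rng.random()  # assumed Uniform(0, 1) input
    aggression = rng.random()  # assumed Uniform(0, 1), independent of resilience
    a7.append(0.7 * resilience + 0.3 * aggression)
    a9.append(0.6 * resilience + 0.4 * aggression)

r = pearson(a7, a9)
pairwise_vif = 1.0 / (1.0 - r ** 2)  # lower bound on the full-model VIF
print(round(r, 3), round(pairwise_vif, 1))
```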

3.3.7. Implications for Distance Computation

Under Euclidean distance, correlated attributes contribute partially redundant information, effectively double-counting shared variance. For the $A_{7}$–$A_{9}$ pair, a team’s psychological profile influences distance along two nearly parallel axes, inflating the contribution of resilience/aggression relative to other dimensions.

3.3.8. Design Justification

Despite the statistical correlation, we retain both $A_{7}$ and $A_{9}$ (and both $A_{3}$ and $A_{6}$) for the following reasons:

1.: Conceptual distinctness: In sports psychology, resilience (ability to recover from setbacks) and morale (current motivational state) are treated as related but distinct constructs [25]. A team may have high baseline resilience yet low in-match morale due to recent conceded goals. Similarly, midfield control (tempo dictation) and width utilization (flank exploitation) represent tactically distinct capabilities that happen to draw on overlapping personnel.
2.: Strategy vector differentiation: Tactical templates assign different weights to these correlated attributes. For example, “High Press” requires high $A_{7}$ but is neutral on $A_{9}$ , while “Cautious Horizontal Play” prioritizes $A_{9}$ (maintaining composure) over $A_{7}$ (bouncing back from pressure). The correlation at the team level does not imply identical strategic relevance.
3.: Dynamic divergence: Under match conditions, $A_{7}$ and $A_{9}$ can diverge: morale ( $A_{9}$ ) is modulated by score state and momentum, while resilience ( $A_{7}$ ) reflects a more stable trait. The current prototype does not fully exploit this divergence, but the architectural separation enables future refinement.

3.3.9. Mitigation Strategies

We acknowledge that the current Euclidean metric does not account for attribute covariance. Several mitigation approaches are available:

Mahalanobis distance: Replacing Euclidean with Mahalanobis distance, $d_{M} (x, y) = \sqrt{{(x - y)}^{⊤} Σ^{- 1} (x - y)}$ , would down-weight correlated dimensions automatically. However, this requires estimating the covariance matrix $Σ$ from representative team data, which is unavailable for the current prototype.
Principal component projection: Projecting the 14-dimensional space onto principal components would decorrelate the axes. This sacrifices interpretability (components are linear combinations rather than named attributes) but may be appropriate for purely predictive applications.
Attribute consolidation: Merging $A_{7}$ / $A_{9}$ into a single “Psychological State” dimension and $A_{3}$ / $A_{6}$ into “Midfield Effectiveness” would reduce redundancy but lose the conceptual granularity valued by coaching staff.
Regularized weighting: Applying lower dynamic weights to correlated pairs (e.g., halving $w_{7}$ and $w_{9}$ ) would reduce their combined influence, approximating the effect of Mahalanobis correction.

For the current prototype, we retain the 14-attribute Euclidean formulation with the following justification: (i) the DSS is designed for interpretability, and named attributes are more actionable than principal components; (ii) the high-correlation pairs ($A_{7}$/$A_{9}$, $A_{3}$/$A_{6}$) represent 4 of 14 dimensions, limiting the overall inflation; and (iii) the dynamic weighting mechanism (Section 3.6.2) already modulates attribute salience, partially mitigating fixed-correlation effects. Future work should implement Mahalanobis distance once sufficient real-world team data are available to estimate $Σ$ reliably.
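To illustrate what the Mahalanobis correction would change, the sketch below restricts attention to one correlated pair and uses the closed-form inverse of a 2×2 covariance matrix. The unit variance, the value of $ρ$, and the function name `mahalanobis_2d` are assumptions chosen for illustration; a real deployment would estimate the full 14×14 $Σ$ from team data.

```python
import math


def mahalanobis_2d(diff, variance, rho):
    """Mahalanobis length of a 2-D difference vector under the covariance
    [[variance, rho * variance], [rho * variance, variance]]."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    dx, dy = diff
    # Quadratic form diff^T Sigma^{-1} diff, using the closed-form 2x2 inverse.
    quad = (dx * dx - 2.0 * rho * dx * dy + dy * dy) / (variance * (1.0 - rho ** 2))
    return math.sqrt(quad)


diff = (0.2, 0.2)  # team and strategy differ equally on A_7 and A_9
euclidean = math.hypot(*diff)
uncorrelated = mahalanobis_2d(diff, variance=1.0, rho=0.0)
correlated = mahalanobis_2d(diff, variance=1.0, rho=0.98)
print(euclidean, uncorrelated, correlated)
```

With unit variance and $ρ = 0$ the two metrics coincide. At $ρ = 0.98$ the same gap is counted as roughly one dimension's worth (squared length $2 \cdot 0.2^{2} / (1 + ρ)$) rather than two, which is the down-weighting described above.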

3.3.10. Empirical Impact Assessment

To quantify the practical effect of multicollinearity on strategy recommendations, we conducted a sensitivity analysis comparing Euclidean rankings with a correlation-adjusted baseline. Specifically:

1.: We computed strategy rankings for 100 synthetic team profiles using standard Euclidean distance.
2.: We repeated the analysis using a “consolidated” 12-attribute space where $A_{7}$ / $A_{9}$ and $A_{3}$ / $A_{6}$ were each merged into single dimensions (averaging their values).
3.: We measured rank correlation (Kendall’s $τ$ ) between the two ranking schemes.

The mean rank correlation was $τ = 0.91$ (95% CI: 0.87–0.94), indicating that the top-ranked strategies are largely stable despite the redundancy. The primary effect of consolidation was minor reordering among middle-ranked strategies with similar distances. Critically, the top-1 recommendation matched in 94% of cases, and the top-3 set matched in 89% of cases.

These results suggest that while the multicollinearity is statistically significant, its practical impact on DSS recommendations is limited. Nevertheless, the full 14-dimensional correlation matrix is provided in Appendix A.7 for transparency, and we recommend that future deployments with real team data implement Mahalanobis distance or apply the regularized weighting correction described above.
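The comparison procedure can be sketched as follows for a single team. The strategy and team vectors are random placeholders rather than the paper's data. The helper names are hypothetical, and the Kendall's $τ$ shown is the tie-free variant computed by direct pair counting.

```python
import math
import random

# 0-based index pairs for (A_7, A_9) and (A_3, A_6).
MERGE_PAIRS = [(6, 8), (2, 5)]


def consolidate(vec):
    """Map a 14-dim vector to 12 dims by averaging each merged pair."""
    merged = {i for pair in MERGE_PAIRS for i in pair}
    kept = [v for i, v in enumerate(vec) if i not in merged]
    return kept + [(vec[i] + vec[j]) / 2.0 for i, j in MERGE_PAIRS]


def ranking(team, strategies, transform=lambda v: v):
    """Strategy names ordered by ascending Euclidean distance to the team."""
    t = transform(team)
    return sorted(strategies, key=lambda s: math.dist(t, transform(strategies[s])))


def kendall_tau(order_a, order_b):
    """Kendall's tau between two orderings of the same items (no ties)."""
    pos_b = {item: k for k, item in enumerate(order_b)}
    seq = [pos_b[item] for item in order_a]
    n = len(seq)
    concordant = sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] < seq[j])
    pairs = n * (n - 1) // 2
    return (2 * concordant - pairs) / pairs


rng = random.Random(1)
strategies = {f"S{k}": [rng.uniform(0.2, 1.0) for _ in range(14)] for k in range(20)}
team = [rng.uniform(0.0, 1.0) for _ in range(14)]

tau = kendall_tau(ranking(team, strategies), ranking(team, strategies, consolidate))
print(round(tau, 2))
```

Repeating the last three lines over many synthetic teams and averaging `tau` would give the kind of summary statistic reported above.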

3.4. Encoding Tactical Strategies as Vectors

A key methodological contribution of this work is the formalization of tactical strategies as vectors in the same semantic space defined by the 14 macro-attributes. This representation enables direct, quantitative comparison between a team’s current state and the requirements of candidate strategies, transforming qualitative tactical concepts into computationally tractable objects.

3.4.1. Strategy Vector Definition

Each canonical strategy $S_{i}$ is represented as an ideal profile vector:

$$V_{strategy}^{(i)} = [s_{i}^{A_{1}}, s_{i}^{A_{2}}, \dots, s_{i}^{A_{14}}], \quad s_{i}^{A_{j}} \in [0, 1] \qquad (4)$$

where $s_{i}^{A_{j}}$ represents the importance or requirement level of attribute $A_{j}$ for strategy $S_{i}$. Values approaching 1 indicate critical importance; lower values indicate diminishing relevance. As detailed in Stage 3 of the construction methodology below, we adopt a non-zero floor for all attribute values, reflecting the observation that no attribute is ever entirely irrelevant to any viable football strategy.

This formulation treats strategies not as binary labels but as continuous profiles that specify the ideal team characteristics for effective implementation. The semantic distance between a team vector $V_{team}$ and a strategy vector $V_{strategy}^{(i)}$ thus quantifies the “fit” between the team’s current capabilities and the strategy’s demands.

3.4.2. Construction Methodology

Strategy vectors were constructed through a four-stage process combining expert knowledge, tactical literature, and empirical validation:

Stage 1: Strategy Selection

Twenty canonical strategies were selected based on three criteria:

(a): Prevalence: Strategies commonly employed in modern professional football, as documented in tactical analysis literature and match reports.
(b): Diversity: Coverage of the tactical spectrum from ultra-defensive (e.g., deep block) to ultra-offensive (e.g., high pressing), and from possession-based to direct approaches.
(c): Distinctiveness: Strategies with clearly differentiated attribute profiles, ensuring meaningful separation in the semantic space.

The selected strategies span five functional categories:

Offensive systems: Build-up play, direct vertical attack, systematic crossing, overlapping flanks, delayed midfielder runs
Pressing variants: High pressing, gegenpressing, midfield pressing, inducing build-up errors
Defensive structures: Positional defense, deep block, compact zonal defense, strict man-marking, offside trap
Transition-based: Fast counterattack, long ball to target man
Possession/control: Extended possession play, cautious horizontal circulation, central block with quick breaks

Stage 2: Qualitative Mapping via Expert Elicitation

For each strategy, tactical requirements were mapped onto the 14 macro-attributes through a structured elicitation protocol involving independent expert ratings followed by reconciliation.

Expert Panel Composition. The elicitation panel comprised three domain experts:

Rater A: Academic researcher with expertise in performance analysis and tactical periodization.
Rater B: Experienced football coach with background in youth academy and semi-professional coaching; familiar with tactical analysis workflows.
Rater C: Practitioner with experience in match analysis and video-based tactical coding.

Rating Protocol. Each rater independently completed a structured rating task:

1.: Materials: Raters received (i) definitions of all 14 macro-attributes with examples, (ii) descriptions of each strategy including typical formations, player movements, and match situations, and (iii) a rating matrix (20 strategies × 14 attributes).
2.: Rating scale: For each strategy–attribute pair, raters assigned one of five qualitative levels: Irrelevant, Low, Moderate, High, or Critical.
3.: Anchoring: Raters were provided with three anchor examples per attribute to calibrate interpretations (e.g., “For $A_{5}$ (High Press Capability): gegenpressing = Critical; build-up play = Low; deep block = Irrelevant”).
4.: Independence: Ratings were collected via separate online forms without communication between raters.
5.: Duration: Each rater completed the task in 2–3 h over multiple sessions.

Inter-Rater Agreement. Agreement was quantified using two metrics:

Percentage exact agreement: 58.2% of ratings were identical across all three raters; 89.6% were within one level (e.g., High vs. Critical).
Krippendorff’s alpha: $α = 0.71$ (ordinal scale), indicating substantial agreement. Values above 0.67 are conventionally acceptable for exploratory research [26].

The highest agreement was observed for extreme ratings (Irrelevant, Critical) and for physically grounded attributes ($A_{8}$, $A_{13}$); lower agreement occurred for psychological attributes ($A_{7}$, $A_{9}$) where interpretation varied.

Conflict Resolution. Discrepancies were resolved through a structured reconciliation process:

1.: Threshold for discussion: Pairs with rating spread $\geq 2$ levels (e.g., Low vs. Critical) were flagged for deliberation.
2.: Reconciliation session: The three raters participated in a 90-min video conference to discuss flagged items (42 of 280 pairs, 15%). Each rater presented their reasoning; discussion continued until consensus or majority agreement was reached.
3.: Median for minor discrepancies: For pairs with spread $\leq 1$ level, the median rating was adopted without discussion.
4.: Documentation: All reconciliation decisions were logged with brief justifications (available from the corresponding author upon request).
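The spread rule described above can be expressed compactly. This is a sketch under the stated protocol (five ordered levels, three raters); the function name `reconcile` and the return convention are assumptions made for illustration.

```python
from statistics import median

LEVELS = ["Irrelevant", "Low", "Moderate", "High", "Critical"]


def reconcile(ratings):
    """Apply the spread rule to one strategy-attribute pair.

    Returns (level, flagged): the median level when the ratings span at most
    one level, or (None, True) when the pair must go to deliberation.
    """
    idx = [LEVELS.index(r) for r in ratings]
    spread = max(idx) - min(idx)
    if spread >= 2:
        return None, True
    # With an odd number of raters the median is one of the observed levels.
    return LEVELS[int(median(idx))], False


print(reconcile(["High", "Critical", "High"]))  # adopted without discussion
print(reconcile(["Low", "Critical", "High"]))   # flagged for the reconciliation session
```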

This stage produced qualitative assessments of the form: “High pressing requires Critical stamina ($A_{8}$), Critical pressing capability ($A_{5}$), and Moderate technical base ($A_{12}$), with all three raters in agreement.”

Stage 3: Numerical Encoding

Qualitative assessments were converted to numerical values using a standardized mapping, as presented in Table 4:

Values were assigned within ranges to allow fine-grained differentiation between strategies with similar but not identical requirements.

Non-Zero Floor Justification

We adopt a floor of $0.2$ rather than 0 for three complementary reasons:

1.: Tactical realism: No attribute is entirely irrelevant to any football strategy. Even a purely defensive system benefits marginally from offensive capability (e.g., to relieve pressure via effective clearances); even a counterattacking approach benefits marginally from possession skills (e.g., to consolidate after a transition). The floor reflects this universal baseline relevance.
2.: Geometric regularization: In the semantic space, true zeros would create degenerate subspaces where certain dimensions contribute nothing to distance computations for particular strategies. This could cause discontinuous behavior: small changes in team attributes along “irrelevant” dimensions would produce no change in distance to one strategy but non-zero changes to another. The non-zero floor ensures that all 14 dimensions contribute meaningfully to every strategy comparison, yielding smoother and more interpretable distance gradients.
3.: Robustness to measurement error: Team attribute estimates are inherently uncertain. If strategy vectors contained true zeros, measurement noise in team attributes along those dimensions would be entirely ignored—potentially masking capability deficits that become relevant under match pressure. The floor provides a buffer that allows the system to detect large deviations even on “low-importance” attributes.

The specific choice of $0.2$ as the floor is motivated by the desire to preserve discriminability: it is low enough to clearly distinguish “irrelevant” from “low importance” ($0.4$–$0.5$), yet high enough to provide meaningful geometric contribution. Sensitivity to this choice is examined in Section 3.4.5 below.

Stage 4: Validation and Refinement

Initial vectors were validated through three mechanisms to ensure both internal consistency and external credibility:

Internal Consistency Checks. Semantic coherence was verified by computing pairwise cosine similarities between all strategy vectors:

Similar strategies should cluster: High Press and Gegenpressing achieved cosine similarity of 0.97; Build-up Play and Extended Possession achieved 0.94. All pairs within the same tactical category exceeded 0.85.
Dissimilar strategies should separate: High Press vs. Positional Defense achieved cosine similarity of 0.62; Fast Counterattack vs. Cautious Horizontal achieved 0.58. Cross-category pairs averaged 0.71.
No anomalous outliers: No strategy vector had mean similarity <0.60 to all others, confirming that all strategies occupy coherent positions in the semantic space.
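A minimal sketch of this consistency check is given below. The three 14-dimensional vectors are hypothetical placeholders, not the expert-elicited values. The category labels follow the grouping in Stage 1.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


# Hypothetical vectors over A_1..A_14, for illustration only.
vectors = {
    "High Press":         [0.7, 0.6, 0.6, 0.8, 1.0, 0.5, 0.8, 1.0, 0.6, 0.4, 0.9, 0.6, 0.9, 0.6],
    "Gegenpressing":      [0.7, 0.6, 0.6, 0.9, 1.0, 0.5, 0.8, 1.0, 0.7, 0.4, 0.9, 0.6, 0.9, 0.6],
    "Positional Defense": [0.3, 1.0, 0.5, 0.2, 0.2, 0.3, 0.7, 0.4, 0.6, 0.9, 0.8, 0.5, 0.6, 0.5],
}
category = {
    "High Press": "pressing",
    "Gegenpressing": "pressing",
    "Positional Defense": "defensive",
}

names = sorted(vectors)
pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
similarity = {(a, b): cosine_similarity(vectors[a], vectors[b]) for a, b in pairs}

# Split pairwise similarities into same-category and cross-category groups.
within = [s for (a, b), s in similarity.items() if category[a] == category[b]]
across = [s for (a, b), s in similarity.items() if category[a] != category[b]]
print(min(within) > max(across))
```

The check passes when every same-category pair is more similar than every cross-category pair; thresholds such as 0.85 can be applied to `within` in the same way.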

Face Validity Assessment. Face validity was operationalized through structured review by independent coaching practitioners who had not participated in the initial elicitation:

1.: Reviewers: Two additional practitioners with coaching experience were recruited for validation.
2.: Task: Reviewers examined radar-chart visualizations of each strategy vector and rated: (i) whether the profile “looks correct” for the named strategy (Yes/Partially/No), and (ii) which attributes, if any, seemed mis-weighted.
3.: Results: 17 of 20 strategies (85%) received “Yes” ratings from both reviewers. Three strategies received “Partially” from at least one reviewer:
Offside Trap: One reviewer suggested $A_{11}$ (Tactical Cohesion) should be higher; adjusted from 0.7 to 0.8.
Late Midfield Runners: One reviewer suggested $A_{4}$ (Transition Speed) was too high; retained after discussion as the strategy requires rapid positional shifts.
Strict Man-Marking: Both reviewers suggested $A_{13}$ (Physical Base) should be higher; adjusted from 0.6 to 0.7.
4.: Iteration: Adjusted vectors were re-reviewed and approved.

Calibration Against Match Data. As a supplementary check, strategy vectors were compared against attribute profiles computed from professional match data (5 matches per strategy, sourced from publicly available Bundesliga event data):

Teams explicitly employing each strategy (identified via tactical reports) had their match-level attribute profiles computed.
Correlation between expert-assigned strategy vectors and empirical team profiles averaged $r = 0.68$ across strategies, indicating moderate alignment.
Discrepancies were largest for psychological attributes ( $A_{7}$ , $A_{9}$ ), which are not directly observable in event data, and smallest for technical attributes ( $A_{1}$ – $A_{6}$ ).

This calibration provides preliminary evidence that expert-assigned vectors capture real tactical demands, though the limited sample size (100 matches total) precludes strong claims. Future work should expand this validation with larger datasets.

3.4.3. Illustrative Strategy Profiles

Table 5 presents the complete vector profiles for five representative strategies, illustrating the differentiation achieved through this methodology.

Profile Interpretation

The vectors reveal intuitive tactical signatures:

High Pressing and Gegenpressing share elevated demands on $A_{5}$ (pressing capability), $A_{11}$ (tactical cohesion), and $A_{13}$ (physical base), reflecting their high-intensity, coordinated nature. Gegenpressing additionally requires strong $A_{4}$ (transition speed) for immediate recovery.
Fast Counterattack peaks on $A_{1}$ (offensive strength) and $A_{4}$ (transition speed), with lower requirements for possession-related attributes ( $A_{3}$ , $A_{11}$ ), consistent with its reliance on rapid vertical play rather than sustained control.
Positional Defense inverts the pressing profile: maximal $A_{2}$ (defensive strength) and $A_{10}$ (time management), minimal $A_{4}$ and $A_{5}$ , reflecting a compact, energy-conserving approach.
Build-up Play emphasizes $A_{1}$ , $A_{12}$ (technical base), and $A_{11}$ (tactical cohesion), with moderate physical demands—a technically demanding but physically sustainable approach.

Notice that strategy vectors are intentionally not normalized to a constant sum. Different tactics impose varying total demands across macro-attributes: high-intensity approaches such as gegenpressing require elevated levels across multiple dimensions simultaneously, whereas selective tactics like catenaccio concentrate demands on fewer attributes. This design reflects the inherent asymmetry in tactical resource requirements observed in professional football.

3.4.4. Sensitivity to Vector Specification

A legitimate concern is whether recommendations are overly sensitive to the specific numerical values assigned during vector construction. To address this, we conducted a perturbation analysis:

1.: Each strategy vector was perturbed by adding Gaussian noise $ϵ \sim N (0, σ^{2})$ with $σ = 0.05$ (representing $\pm 5\%$ uncertainty in attribute weights).
2.: The DSS was run $N = 100$ times per scenario with perturbed strategy vectors.
3.: The proportion of runs yielding the same top-ranked strategy as the unperturbed case was recorded.

Results showed that recommendations remained stable in >85% of runs across all test scenarios, indicating that modest variations in strategy vector specification do not substantially alter the DSS output. Larger perturbations ($σ > 0.10$) did produce instability, suggesting that while exact values are not critical, the relative ordering of attribute importance within each strategy should be preserved.
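The perturbation procedure can be sketched as follows. The random strategy and team vectors are placeholders, the unweighted Euclidean ranking stands in for the full DSS, and clipping perturbed values to $[0, 1]$ is an added assumption not stated in the protocol.

```python
import math
import random


def top_strategy(team, strategies):
    """Name of the strategy closest to the team in Euclidean distance."""
    return min(strategies, key=lambda s: math.dist(team, strategies[s]))


def perturbation_stability(team, strategies, sigma=0.05, runs=100, seed=0):
    """Share of runs whose top-ranked strategy matches the unperturbed one."""
    rng = random.Random(seed)
    baseline = top_strategy(team, strategies)
    same = 0
    for _ in range(runs):
        noisy = {
            name: [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in vec]
            for name, vec in strategies.items()
        }
        same += top_strategy(team, noisy) == baseline
    return same / runs


rng = random.Random(42)
strategies = {f"S{k}": [rng.uniform(0.2, 1.0) for _ in range(14)] for k in range(20)}
team = [rng.uniform(0.0, 1.0) for _ in range(14)]

stability = perturbation_stability(team, strategies)
print(stability)
```

Sweeping `sigma` over a grid (e.g., 0.05, 0.10, 0.15) with the same function would expose the point at which the top recommendation starts to change.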

3.4.5. Sensitivity to Floor Choice

To verify that recommendations are not artifacts of the specific floor value, we conducted a systematic floor-sensitivity analysis. The “Irrelevant/Not required” encoding was varied across the range $[0.05, 0.35]$ in increments of $0.05$, while preserving the relative spacing between qualitative levels (i.e., shifting the entire encoding scale proportionally).

For each floor value $f \in \{0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35\}$:

1.: All 20 strategy vectors were re-encoded using the adjusted mapping.
2.: The DSS was executed on each of the four primary test scenarios.
3.: The top-ranked strategy and the top-3 ranking were recorded.

Results

The top-ranked strategy remained unchanged across all floor values for 3 of 4 scenarios. In the Fatigued and Inferior scenario, the ranking between Positional Defense and Compact Zonal Defense alternated for $f < 0.15$; these are two tactically similar strategies whose near-identical profiles make them effectively interchangeable recommendations. Critically, no high-intensity strategy (e.g., gegenpressing) was ever erroneously promoted to top rank due to floor choice.

The mean pairwise rank correlation (Kendall’s $τ$) between the baseline ($f = 0.20$) and alternative floor encodings was $τ = 0.94$, indicating that the overall strategy ordering is highly robust to floor specification.

Geometric Interpretation

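Mathematically, consider the simplified case in which shifting the floor from $f_{1}$ to $f_{2}$ (with $f_{2} > f_{1}$) adds the same constant $c = f_{2} - f_{1}$ to every component of every strategy vector. For a fixed team vector $x$ and two strategy vectors $y$ and $z$, the squared Euclidean distances then satisfy $d^{2} (x, y + c) - d^{2} (x, z + c) = d^{2} (x, y) - d^{2} (x, z) + 2 c (\sum_{j} y_{j} - \sum_{j} z_{j})$. The shift therefore changes absolute distances, and it leaves the relative order of two strategies strictly unchanged only when their component sums are equal. Because strategy vectors are not normalized to a constant sum (Section 3.4.3), strategies whose baseline distances are nearly equal can swap ranks, whereas well-separated strategies keep their order for small $c$. The observed results are consistent with this expectation: the only rank changes occurred between two near-identical defensive profiles.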

3.4.6. Extensibility

The vector-based formalization offers several practical advantages:

Modularity: New strategies can be added by specifying a 14-dimensional vector, without modifying the distance computation logic.
Customization: Coaching staff can define club-specific tactical variants (e.g., “our high press”) by adjusting attribute weights to reflect their preferred implementation.
Automation potential: Future extensions could generate strategy vectors automatically from natural language descriptions (e.g., tactical reports) using NLP-based embedding techniques, further reducing manual specification effort.

This formalization transforms tactical strategies from qualitative concepts into quantitative objects amenable to systematic comparison, enabling the semantic distance computations described in the following section.

3.5. Semantic Distance and Matching

Section 2.3 introduced several distance metrics commonly used in semantic spaces. For tactical matching, we adopt Euclidean distance as the baseline metric, with the following rationale.

3.5.1. Why Euclidean over Cosine?

Cosine similarity measures angular alignment between vectors and is scale-invariant, a property desirable when comparing profiles or styles. However, in tactical selection, both the direction and magnitude of team capabilities matter. A team with uniformly weak attributes ($V_{team} \approx 0.3$) should not match a demanding high-pressing template ($V_{strategy} \approx 0.8$) simply because their profiles are proportionally similar. Euclidean distance captures this absolute capability gap, penalizing large deviations quadratically, which is appropriate when single-attribute shortfalls (e.g., insufficient stamina for gegenpressing) can be tactically decisive.
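The limiting case in this argument can be checked numerically. The sketch below takes a team that is exactly 0.3 on all 14 attributes and a template that is exactly 0.8 on all of them (idealized values chosen for illustration).

```python
import math

team = [0.3] * 14      # uniformly weak team
strategy = [0.8] * 14  # uniformly demanding template

dot = sum(t * s for t, s in zip(team, strategy))
cosine = dot / (math.hypot(*team) * math.hypot(*strategy))
euclidean = math.dist(team, strategy)

print(round(cosine, 6), round(euclidean, 3))
```

The two vectors are parallel, so cosine similarity reports a perfect match, while the Euclidean distance equals $0.5 \sqrt{14}$, half the diameter of the unit hypercube in 14 dimensions.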

3.5.2. Why Not Probabilistic Metrics?

Kullback–Leibler and Jensen–Shannon divergences are well-suited for comparing probability distributions but require vectors to sum to unity. Our macro-attributes are independent capability dimensions, not components of a probability simplex, making geometric metrics more natural.

3.5.3. Baseline Formulation

Given team and strategy vectors $x, y \in {[0, 1]}^{14}$:

$$d_{eucl} (x, y) = \sqrt{\sum_{j = 1}^{14} {(x_{j} - y_{j})}^{2}} .$$

3.5.4. Context-Adapted Distance

To account for evolving match conditions, we introduce a dynamic weight vector $w \in \mathbb{R}_{\geq 0}^{14}$:

$$d_{adapt} (x, y; w) = \sqrt{\sum_{j = 1}^{14} w_{j} \cdot {(x_{j} - y_{j})}^{2}} .$$

Note that the weight $w_{j}$ modifies the squared difference ${(x_{j} - y_{j})}^{2}$, not the individual vectors. This is the standard formulation for weighted Euclidean distance: $w_{j}$ controls the importance of attribute $A_{j}$ in the overall distance computation, not the attribute values themselves. Intuitively, a high $w_{j}$ means that mismatches on attribute $A_{j}$ are penalized more heavily under current match conditions, while a low $w_{j}$ means that the attribute contributes less to strategy selection. The team and strategy vectors retain their original values; only their contribution to the distance metric is modulated.

Weights $w_{j}$ are adjusted based on real-time contextual factors:

Residual energy ( $A_{8}$ ): low energy ⇒ increase $w_{10}$ (time management), decrease $w_{5}$ (pressing).
Technical/physical gaps ( $A_{12}, A_{13}$ ): if inferior, upweight $w_{11}$ (tactical cohesion) and $w_{2}$ (defensive strength); downweight $w_{1}, w_{6}$ (offensive, width).
Time pressure ( $A_{10}$ ): limited time ⇒ upweight $w_{4}$ (transition speed) and $w_{1}$ (offensive strength).
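The three adjustment rules above can be sketched as a weight-construction function feeding the weighted distance. All thresholds and multipliers (`low_energy`, `late_game`, `boost`, `damp`) are illustrative assumptions, as is the use of remaining minutes as a proxy for time pressure; the prototype's actual values are not given in this section.

```python
import math


def adapted_distance(team, strategy, weights):
    """Weighted Euclidean distance d_adapt(x, y; w)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(team, strategy, weights)))


def context_weights(team, minutes_left, opponent=None,
                    low_energy=0.4, late_game=15, boost=1.5, damp=0.5):
    """Build w from match context; indices are 0-based (A_8 is team[7])."""
    w = [1.0] * 14
    if team[7] < low_energy:          # residual energy (A_8) is low
        w[9] *= boost                 # w_10: time management
        w[4] *= damp                  # w_5: pressing
    if opponent is not None and team[11] < opponent[11] and team[12] < opponent[12]:
        for j in (10, 1):             # w_11 cohesion, w_2 defensive strength
            w[j] *= boost
        for j in (0, 5):              # w_1 offensive strength, w_6 width
            w[j] *= damp
    if minutes_left <= late_game:     # time pressure
        for j in (3, 0):              # w_4 transition speed, w_1 offensive strength
            w[j] *= boost
    return w


tired_team = [0.5] * 14
tired_team[7] = 0.3
weights = context_weights(tired_team, minutes_left=60)
template = [0.8] * 14  # hypothetical demanding strategy vector
print(adapted_distance(tired_team, template, weights))
```

Only the weights change with context; `tired_team` and `template` are passed to the distance unchanged, matching the formulation above.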

3.5.5. Opponent-Aware Adjustment

An optional extension incorporates opponent modeling via a parameter $α \in [0, 1]$:

$$d_{comb} (S) = d_{adapt} (V_{team}, V_{strategy} (S)) - α \cdot d_{adapt} (V_{opp}, V_{strategy} (S)) .$$

When $α > 0$, the system favors strategies that fit our team well and fit the opponent poorly.
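The effect of $α$ on the ranking can be seen in a toy example. The sketch below uses two-dimensional vectors purely for readability (the DSS operates on 14 dimensions); the strategy names, values, and helper functions are hypothetical.

```python
import math


def adapted_distance(x, y, w):
    """Weighted Euclidean distance d_adapt(x, y; w)."""
    return math.sqrt(sum(wj * (xj - yj) ** 2 for xj, yj, wj in zip(x, y, w)))


def combined_distance(team, opponent, strategy, w, alpha=0.2):
    """d_comb = d_adapt(team, S) - alpha * d_adapt(opponent, S)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return adapted_distance(team, strategy, w) - alpha * adapted_distance(opponent, strategy, w)


def rank_strategies(team, opponent, strategies, w, alpha=0.2):
    """Strategy names sorted from best (lowest d_comb) to worst."""
    return sorted(
        strategies,
        key=lambda s: combined_distance(team, opponent, strategies[s], w, alpha),
    )


# Toy 2-D setting: the opponent is weak on the second attribute.
team = [0.5, 0.5]
opponent = [0.7, 0.1]
strategies = {"A": [0.7, 0.5], "B": [0.5, 0.75]}
w = [1.0, 1.0]

print(rank_strategies(team, opponent, strategies, w, alpha=0.0))
print(rank_strategies(team, opponent, strategies, w, alpha=0.5))
```

With $α = 0$ the closer strategy A is preferred. With $α = 0.5$ strategy B moves to the top, because it demands the attribute on which the opponent is weakest.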

Theoretical Justification. The opponent-aware formulation rests on the following game-theoretic intuition: in competitive settings, strategy effectiveness depends not only on own-team capability but also on the opponent’s ability to counter. A strategy that the opponent can execute well creates symmetry—both teams can play similarly, reducing differential advantage. Conversely, a strategy that exploits capability gaps creates asymmetry favoring our team.

Formally, the subtraction $d_{adapt} (V_{team}, V_{S}) - α \cdot d_{adapt} (V_{opp}, V_{S})$ can be interpreted as a relative advantage score:

First term (minimized): Measures how well our team can execute strategy S—lower is better.
Second term (maximized via subtraction): Measures how poorly the opponent can execute S—higher opponent distance means greater difficulty for them, which benefits us.
Net effect: Strategies are preferred when we can execute them well AND the opponent cannot.

This formulation assumes:

1.: Monotonicity of advantage: Greater opponent difficulty with our chosen strategy translates to competitive advantage. This holds when strategies impose demands the opponent struggles to meet (e.g., high pressing against a team with poor stamina forces errors).
2.: Comparability of distances: The same distance magnitude represents equivalent “fit” for both teams. This is ensured by the common normalization protocol (Section 3.2.2), which maps all attributes to $[0, 1]$ using consistent benchmarks.
3.: Independence of execution: Our ability to execute a strategy is not directly affected by the opponent’s capability (though match dynamics may create indirect effects not captured here).

When is opponent-awareness beneficial? The formulation is most valuable when:

Opponent capabilities are known with reasonable confidence (scouting data available).
Attribute profiles differ substantially between teams (asymmetric matchups).
Match stakes justify opponent-focused adaptation (knockout games, rivalry matches).

For league matches against unfamiliar opponents or when scouting data are sparse, setting $α = 0$ (identity-focused selection) may be more robust.

Estimation of $V_{opp}$ Under Uncertainty. Opponent profiles are inherently less certain than own-team profiles due to limited observation and potential strategic concealment. The DSS addresses this uncertainty through the following mechanisms:

(a) Confidence-weighted estimation. When constructing $V_{opp}$, each attribute is assigned a confidence weight $c_{j}^{opp} \in [0, 1]$ reflecting data quality:

$c_{j}^{opp} = 1.0$ : Attributes derived from recent match data (last 5 games) with complete coverage.
$c_{j}^{opp} = 0.7$ : Attributes estimated from partial data or older matches (6–15 games ago).
$c_{j}^{opp} = 0.5$ : Attributes inferred from league averages or indirect proxies.

(b) Uncertainty propagation via interval estimation. Rather than a point estimate, $V_{opp}$ can be represented as an interval:

$$V_{opp, j} \in [V_{opp, j}^{est} - δ_{j}, V_{opp, j}^{est} + δ_{j}]$$

where $δ_{j} = σ_{j} \cdot (1 - c_{j}^{opp})$ and $σ_{j}$ is the attribute’s population standard deviation. This yields a range of possible $d_{comb}$ values, allowing the DSS to flag recommendations as “uncertain” when the range spans multiple top strategies.

(c) Conservative $α$ adjustment. When opponent confidence is low, $α$ should be reduced proportionally:

$$α_{eff} = α \cdot {\bar{c}}^{opp}, \quad \text{where } {\bar{c}}^{opp} = \frac{1}{14} \sum_{j = 1}^{14} c_{j}^{opp}$$

This ensures that uncertain opponent data contribute less to strategy selection, preventing overconfident exploitation of potentially inaccurate profiles.
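Mechanisms (b) and (c) reduce to a few lines of arithmetic. The sketch below uses a hypothetical mix of confidence levels and a hypothetical $σ_{j}$; clipping the interval to $[0, 1]$ is an added assumption.

```python
def effective_alpha(alpha, confidences):
    """alpha_eff = alpha * mean(c_j^opp)."""
    return alpha * sum(confidences) / len(confidences)


def opponent_interval(estimate, confidences, sigmas):
    """Per-attribute bounds [est - delta, est + delta] with
    delta = sigma * (1 - c), clipped to [0, 1]."""
    bounds = []
    for est, c, sigma in zip(estimate, confidences, sigmas):
        delta = sigma * (1.0 - c)
        bounds.append((max(0.0, est - delta), min(1.0, est + delta)))
    return bounds


# Hypothetical scouting picture: 6 attributes from recent matches,
# 4 from older matches, 4 inferred from league averages.
confidences = [1.0] * 6 + [0.7] * 4 + [0.5] * 4
alpha_eff = effective_alpha(0.3, confidences)

estimate = [0.6] * 14  # hypothetical opponent point estimates
sigmas = [0.2] * 14    # hypothetical population standard deviations
intervals = opponent_interval(estimate, confidences, sigmas)
print(round(alpha_eff, 3), intervals[0], intervals[-1])
```

Attributes with $c_{j}^{opp} = 1$ collapse to a point estimate, while lower-confidence attributes receive wider intervals. Evaluating $d_{comb}$ at the interval endpoints is one way to obtain the range of values mentioned in (b).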

Sensitivity Analysis Over $α$. To characterize the influence of $α$ on recommendations, we conducted a systematic sensitivity analysis across the four test scenarios.

Methodology. For each scenario, we varied $α \in \{0.0, 0.1, 0.2, 0.3, 0.4, 0.5\}$ and recorded: (i) the top-ranked strategy, (ii) the full ranking, and (iii) the distance differential between the top-2 strategies (as a stability indicator).

Results are summarized in Table 6.

Key findings:

Stability range: The top-1 recommendation remained unchanged for $α \in [0.0, 0.3]$ in all scenarios, indicating robustness to moderate parameter variation.
Transition points: At $α = 0.4$ – $0.5$ , one scenario (Fatigued & Inferior) shifted from “Positional Defense” to “Compact Zonal Defense”—both defensive strategies, so the qualitative recommendation (defend conservatively) was preserved.
Rank correlation: Kendall’s $τ$ between rankings at $α = 0$ and $α = 0.5$ exceeded 0.89 in all scenarios, confirming that opponent-awareness modulates rather than disrupts the ranking structure.
Confidence intervals: The 95% CI for the distance differential (computed via bootstrap, $N = 1000$ ) indicates that top-1 vs. top-2 separation remains positive (i.e., clear winner) across the tested range.

Recommendation for $α$ selection:

Default: $α = 0.2$ provides meaningful opponent-awareness without excessive sensitivity.
High-stakes matches: $α = 0.3$ – $0.4$ when opponent data are reliable and exploitation is prioritized.
Uncertain opponents: $α = 0.0$ – $0.1$ when scouting data are limited or opponent behavior is unpredictable.
Identity-focused teams: $α = 0$ for coaches who prioritize consistent style over opponent adaptation.

The parameter can be tuned based on match stakes (higher $α$ for must-win games), scouting confidence (lower $α$ when opponent data are uncertain), or coaching philosophy (identity-focused coaches use $α \approx 0$; opponent-focused coaches use $α \approx 0.3$–$0.5$).

3.5.6. Optimal Tactic Selection

The recommended strategy minimizes adapted (or combined) distance:

$$S^{*} = \arg\min_{S} d_{adapt} (V_{team}, V_{strategy} (S); w (\text{match conditions})) .$$

3.5.7. Alternative Metrics for Future Work

While Euclidean distance serves well for capability-based matching, cosine similarity could be offered as a user-selectable option for style classification tasks (e.g., “which historical team does this squad most resemble?”). Hybrid approaches—combining Euclidean distance for capability assessment with cosine similarity for stylistic profiling—represent a promising direction for richer tactical analytics.

3.5.8. Controlled Comparison: Euclidean vs. Cosine

To quantify when and why the two metrics diverge, we conducted a controlled comparison across 100 synthetic team profiles and all 20 strategies.

Methodology. For each team profile, we computed:

1.: Euclidean ranking: Strategies ranked by ascending $d_{eucl} (V_{team}, V_{strategy})$ .
2.: Cosine ranking: Strategies ranked by descending cosine similarity $cos (V_{team}, V_{strategy}) = \frac{V_{team} \cdot V_{strategy}}{∥ V_{team} ∥ ∥ V_{strategy} ∥}$ .

We measured rank correlation (Kendall’s $τ$) between the two rankings and identified cases where the top-1 recommendation differed.
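The comparison procedure can be sketched as follows. This is a minimal illustration, not the authors' code: the team profile and the 20 templates are random stand-ins, the function names are hypothetical, and Kendall's $τ$ is computed as the tau-a variant under the assumption that rankings contain no ties.

```python
import numpy as np

def euclidean_ranking(team, strategies):
    """Strategy indices ordered by ascending Euclidean distance to the team."""
    distances = np.linalg.norm(strategies - team, axis=1)
    return np.argsort(distances, kind="stable")

def cosine_ranking(team, strategies):
    """Strategy indices ordered by descending cosine similarity to the team."""
    norms = np.linalg.norm(strategies, axis=1) * np.linalg.norm(team)
    similarities = (strategies @ team) / norms
    return np.argsort(-similarities, kind="stable")

def kendall_tau(order_a, order_b):
    """Kendall's tau-a between two orderings of the same items (no ties)."""
    n = len(order_a)
    pos_a = np.empty(n, dtype=int)
    pos_b = np.empty(n, dtype=int)
    pos_a[order_a] = np.arange(n)  # pos_a[i] = rank of item i in ordering a
    pos_b[order_b] = np.arange(n)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            score += np.sign(pos_a[i] - pos_a[j]) * np.sign(pos_b[i] - pos_b[j])
    return 2.0 * score / (n * (n - 1))

# Synthetic stand-ins (assumption): one team profile and 20 templates in [0, 1]^14.
rng = np.random.default_rng(41)
team = rng.uniform(0.2, 0.9, size=14)
strategies = rng.uniform(0.2, 0.9, size=(20, 14))

eucl_order = euclidean_ranking(team, strategies)
cos_order = cosine_ranking(team, strategies)
tau = kendall_tau(eucl_order, cos_order)
top1_agrees = eucl_order[0] == cos_order[0]
```

Repeating this over many team profiles and averaging `tau` and `top1_agrees` would give summary statistics of the kind reported in the results.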

Results.

Overall correlation: Mean $τ = 0.82$ (range: 0.71–0.93), indicating substantial but imperfect agreement.
Top-1 agreement: The same strategy was ranked first by both metrics in 73% of cases.
Top-3 overlap: The top-3 sets shared at least 2 strategies in 91% of cases.

When do rankings diverge? Divergence was systematic and predictable:

Magnitude-driven divergence: Teams with uniformly low capabilities ( ${\bar{V}}_{team} < 0.45$ ) showed the largest discrepancies. Cosine similarity favored demanding strategies (e.g., High Press, Gegenpressing) when the team’s profile shape matched, even if absolute capability levels were insufficient. Euclidean distance correctly penalized these mismatches.
Example: A fatigued team ( $A_{8} = 0.3$ ) with otherwise balanced attributes achieved high cosine similarity (0.91) with “High Press” due to proportional alignment, but Euclidean distance correctly ranked it 14th due to the large absolute gap on $A_{8}$ and $A_{5}$ .
Convergence at high capability: For teams with ${\bar{V}}_{team} > 0.65$ , the two metrics agreed on top-1 in 89% of cases, as magnitude differences became less decisive.
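The magnitude-driven divergence follows from the scale invariance of cosine similarity. The sketch below uses a hypothetical four-attribute template (not one of the paper's 14-dimensional strategy vectors) and a team with the same profile shape at half the capability level.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; unaffected by rescaling either vector."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    """Straight-line distance; grows with absolute capability gaps."""
    return float(np.linalg.norm(a - b))

# Hypothetical demanding template and a team with identical shape but half the level.
template = np.array([0.9, 0.8, 0.9, 0.8])
weak_team = 0.5 * template

cos_sim = cosine_similarity(weak_team, template)   # perfect "fit" by shape
dist = euclidean_distance(weak_team, template)     # large absolute shortfall
```

Cosine similarity reports a perfect match because the two vectors are proportional, while the Euclidean distance equals half the template's norm and therefore reflects the capability shortfall.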

Metric Selection Summary. Based on this analysis, the DSS uses metrics as follows:

Tactical selection (primary task): Weighted Euclidean distance ( $d_{adapt}$ ), because capability shortfalls must be penalized regardless of profile similarity.
Strategy vector validation: Cosine similarity, to verify that semantically similar strategies cluster together (Section 3.4).
Style classification (optional): Cosine similarity could be offered for “which team does this squad resemble?” queries, where magnitude is less relevant.

All experimental results reported in this paper use weighted Euclidean distance unless explicitly noted otherwise.

3.6. Selection Algorithm

Inputs: context trees for our team and the opponent; tactical templates $\{V_{strategy}^{(i)}\}$; match conditions (time remaining, current score).

Outputs: recommended tactic $S^{*}$, ranked list of tactics, attribute-level diagnostics.

3.6.1. Algorithm Steps

1.: Context aggregation: Compute $V_{team}$ and $V_{opp}$ from the respective context trees (14-dimensional vectors).
2.: Gap estimation: Derive technical and physical gaps:

$Δ_{tech} = V_{team} [A_{12}] - V_{opp} [A_{12}], Δ_{phys} = V_{team} [A_{13}] - V_{opp} [A_{13}]$
3.: Weight construction: Build the dynamic weight vector w using the procedure in Section 3.6.2.
4.: Distance computation: For each strategy i, compute:

$d_{adapt} (V_{team}, V_{strategy}^{(i)}; w) = \sqrt{\sum_{j = 1}^{14} w_{j} \cdot {(V_{team}^{(j)} - V_{strategy}^{(i, j)})}^{2}}$
5.: Opponent adjustment (optional): If $α > 0$ , compute combined score:

$d_{comb}^{(i)} = d_{adapt} (V_{team}, V_{strategy}^{(i)}) - α \cdot d_{adapt} (V_{opp}, V_{strategy}^{(i)})$
6.: Ranking & selection: Sort strategies by $d_{adapt}$ (or $d_{comb}$ ) ascending; select $S^{*} = arg {min}_{i} d^{(i)}$ .
7.: Diagnostics: Report per-attribute deltas $Δ_{j} = V_{strategy}^{(S^{*}, j)} - V_{team}^{(j)}$ to explain the recommendation.
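Steps 4–7 can be sketched as follows, taking the weight vector as given. This is an illustrative sketch rather than the repository code: `select_strategy` is a hypothetical name, and the templates and opponent vector are random placeholders.

```python
import numpy as np

def d_adapt(x, y, w):
    """Weighted Euclidean distance sqrt(sum_j w_j * (x_j - y_j)^2)."""
    return float(np.sqrt(np.sum(w * (x - y) ** 2)))

def select_strategy(v_team, v_opp, templates, w, alpha=0.0):
    """Return (best index, ranking, scores, per-attribute deltas)."""
    scores = np.array([d_adapt(v_team, s, w) for s in templates])
    if alpha > 0:
        # Opponent adjustment: subtract alpha times the opponent's distance.
        opp = np.array([d_adapt(v_opp, s, w) for s in templates])
        scores = scores - alpha * opp
    ranking = np.argsort(scores, kind="stable")   # ascending: best first
    best = int(ranking[0])
    deltas = templates[best] - v_team             # diagnostics for the winner
    return best, ranking, scores, deltas

# Placeholder data (assumption): 5 random templates; the team matches template 2.
rng = np.random.default_rng(41)
templates = rng.uniform(0.0, 1.0, size=(5, 14))
v_team = templates[2].copy()
v_opp = rng.uniform(0.0, 1.0, size=14)
w = np.ones(14)

best, ranking, scores, deltas = select_strategy(v_team, v_opp, templates, w)
best_opp, ranking_opp, scores_opp, _ = select_strategy(
    v_team, v_opp, templates, w, alpha=0.2
)
```

With $α = 0$ the team that coincides with a template selects that template at distance zero and all deltas vanish; with $α > 0$ the combined score can become negative, which is expected because it is a difference of two distances.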

3.6.2. Dynamic Weight Computation

The weight vector $w \in \mathbb{R}_{\geq 0}^{14}$ modulates attribute salience based on match conditions. We define $w_{j} = w_{j}^{base} \cdot m_{j}$, where $w_{j}^{base} = 1$ for all $j$ (equal baseline), and $m_{j}$ is a context-dependent multiplier.

Energy-Based Adjustments

Let $e = V_{team} [A_{8}]$ denote current residual energy (normalized to $[0, 1]$). We define an energy deficit indicator:

$δ_{e} = \max (0, τ_{e} - e)$

where $τ_{e} = 0.5$ is the energy threshold below which fatigue effects become salient. The multipliers are:

$m_{5} = 1 - γ_{e} \cdot δ_{e}$ (reduce weight on High Press Capability) (5)

$m_{10} = 1 + γ_{e} \cdot δ_{e}$ (increase weight on Time Management) (6)

$m_{13} = 1 - 0.5 \cdot γ_{e} \cdot δ_{e}$ (reduce weight on Physical Base) (7)

where $γ_{e} = 1.5$ is the energy sensitivity parameter. For example, if $e = 0.3$ (low energy), then $δ_{e} = 0.2$, yielding $m_{5} = 0.70$, $m_{10} = 1.30$, and $m_{13} = 0.85$.
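The worked example can be checked with a few lines of Python. The function name and dictionary keys are illustrative choices; the defaults are the stated values $τ_{e} = 0.5$ and $γ_{e} = 1.5$.

```python
def energy_multipliers(e, tau_e=0.5, gamma_e=1.5):
    """Energy-based multipliers from Equations (5)-(7) for residual energy e."""
    delta_e = max(0.0, tau_e - e)  # energy deficit indicator
    return {
        "m5": 1.0 - gamma_e * delta_e,          # High Press Capability
        "m10": 1.0 + gamma_e * delta_e,         # Time Management
        "m13": 1.0 - 0.5 * gamma_e * delta_e,   # Physical Base
    }

low_energy = energy_multipliers(0.3)   # fatigued team
fresh = energy_multipliers(0.8)        # above threshold: no adjustment
```

For `e = 0.3` this reproduces the multipliers 0.70, 1.30, and 0.85 up to floating-point rounding; at or above the threshold all three stay at 1.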

Gap-Based Adjustments

When the team is outmatched technically or physically, defensive and cohesion attributes become more critical:

$m_{2} = 1 + γ_{g} \cdot \max (0, - Δ_{tech})$ (increase Defensive Strength if technically inferior) (8)

$m_{11} = 1 + γ_{g} \cdot \max (0, - Δ_{phys})$ (increase Tactical Cohesion if physically inferior) (9)

$m_{1} = 1 - 0.5 \cdot γ_{g} \cdot \max (0, - Δ_{tech})$ (reduce Offensive Strength if outmatched) (10)

$m_{6} = 1 - 0.5 \cdot γ_{g} \cdot \max (0, - Δ_{phys})$ (reduce Width Utilization if outmatched) (11)

where $γ_{g} = 1.0$ is the gap sensitivity parameter.

Time Pressure Adjustments

Let $t \in [0, 1]$ denote the fraction of match time remaining (1 = kickoff, 0 = final whistle), and let $s \in \{- 1, 0, + 1\}$ encode score state (losing, drawing, winning). When time is limited and the team needs a result:

$δ_{t} = \max (0, τ_{t} - t) \cdot \mathbf{1} [s \leq 0]$

where $τ_{t} = 0.25$ (final quarter of the match) and $\mathbf{1} [s \leq 0]$ equals 1 if not winning. The multipliers are:

$m_{4} = 1 + γ_{t} \cdot δ_{t}$ (increase Transition Speed) (12)

$m_{1} \leftarrow m_{1} + γ_{t} \cdot δ_{t}$ (further increase Offensive Strength) (13)

where $γ_{t} = 2.0$ is the urgency sensitivity parameter. Equation (13) is an update: it adds the urgency term to the value of $m_{1}$ already set by Equation (10).
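A small sketch of the time-pressure terms, assuming the stated defaults $τ_{t} = 0.25$ and $γ_{t} = 2.0$. The function names and the input validation are illustrative additions, not part of the published specification.

```python
def time_pressure_indicator(t, s, tau_t=0.25):
    """delta_t = max(0, tau_t - t) when the team is not winning, else 0."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if s not in (-1, 0, 1):
        raise ValueError("s must be -1, 0, or +1")
    return max(0.0, tau_t - t) if s <= 0 else 0.0

def apply_time_pressure(m1, t, s, gamma_t=2.0, tau_t=0.25):
    """Return (m1, m4) after Equations (12)-(13); m1 is updated additively."""
    delta_t = time_pressure_indicator(t, s, tau_t)
    m4 = 1.0 + gamma_t * delta_t   # Transition Speed
    m1 = m1 + gamma_t * delta_t    # Offensive Strength, on top of its prior value
    return m1, m4

# Losing with 10% of the match left versus winning at the same moment.
m1_late, m4_late = apply_time_pressure(m1=1.0, t=0.10, s=-1)
m1_win, m4_win = apply_time_pressure(m1=1.0, t=0.10, s=+1)
```

With 10% of the match remaining and the team behind, $δ_{t} = 0.15$ and both multipliers rise to about 1.3; when the team is ahead the indicator is zero and nothing changes.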

Final Weight Computation

All multipliers are combined multiplicatively, then clamped to stability bounds and normalized:

1.: Clamping: Each multiplier is bounded to prevent extreme values:

$m_{j} \leftarrow clamp (m_{j}, m_{min}, m_{max}) = max (m_{min}, min (m_{j}, m_{max}))$

with $m_{min} = 0.3$ and $m_{max} = 2.5$ . This ensures no attribute is entirely suppressed ( $w_{j} > 0$ ) or dominates excessively.
2.: Normalization: Weights are scaled to sum to 14 (preserving the baseline where all $w_{j} = 1$ ):

$w_{j} = \frac{14 \cdot m_{j}}{\sum_{k = 1}^{14} m_{k}}$

The clamping bounds were chosen empirically: $m_{min} = 0.3$ prevents any attribute from contributing less than 30% of its baseline importance, while $m_{max} = 2.5$ caps amplification at 2.5× baseline. These bounds ensure numerical stability and prevent pathological weight distributions where a single attribute dominates the distance computation.
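The clamping and normalization steps can be sketched with NumPy as below. The function name `finalize_weights` is hypothetical, and the example multipliers are arbitrary values chosen to trigger both bounds.

```python
import numpy as np

def finalize_weights(m, m_min=0.3, m_max=2.5):
    """Clamp multipliers to [m_min, m_max], then rescale so weights sum to len(m)."""
    clamped = np.clip(np.asarray(m, dtype=float), m_min, m_max)
    return len(clamped) * clamped / clamped.sum()

# Arbitrary multipliers (assumption) with one value below and one above the bounds.
multipliers = np.ones(14)
multipliers[4] = 0.1   # would be clamped up to 0.3
multipliers[0] = 4.0   # would be clamped down to 2.5

weights = finalize_weights(multipliers)
baseline = finalize_weights(np.ones(14))
```

Note that the bounds apply to the multipliers before normalization; after rescaling to sum 14, an individual weight can end up slightly outside $[0.3, 2.5]$, but it remains strictly positive.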

Table 7 summarizes the default parameter values.

Input Variables for Weight Estimation

The dynamic weight computation requires six input values, all derived from the match state at evaluation time:

1.: $V_{team} [A_{8}]$ : Team’s current residual energy (from context tree or manual input).
2.: $V_{team} [A_{12}]$ : Team’s technical base (static, from roster data).
3.: $V_{team} [A_{13}]$ : Team’s physical base (static, from roster data).
4.: $V_{opp} [A_{12}], V_{opp} [A_{13}]$ : Opponent’s technical and physical bases (for gap computation).
5.: $t \in [0, 1]$ : Fraction of match time remaining.
6.: $s \in \{- 1, 0, + 1\}$ : Current score state (losing, drawing, winning).

These six values fully determine the weight vector $w$ via the formulas above. In deployment, $V_{team} [A_{8}]$ may be updated dynamically from physiological monitoring; other values typically remain fixed within a match.

Parameter Tuning

The default values in Table 7 were set based on tactical reasoning and preliminary experimentation. In deployment, these parameters can be:

Calibrated to historical match data via grid search or Bayesian optimization;
Personalized to reflect coaching philosophy (e.g., risk-averse coaches may increase $γ_{g}$ );
Learned from expert feedback through interactive refinement.

3.6.3. Pseudocode

Algorithm 1 provides a compact pseudocode summary.

Complexity

The algorithm runs in $O (m \cdot n)$ time for $m$ strategies and $n = 14$ attributes. With $m = 20$ strategies, inference completes in under 5 ms on standard hardware, suitable for real-time tactical dashboards.

Strengths

The procedure is interpretable (explicit weights and per-attribute deltas), adaptive in real time (weights update with context), and scalable (new strategies or attributes can be added without changing the core logic).

Algorithm 1 Tactical Strategy Selection

Require: Context trees $T_{team}$, $T_{opp}$; strategy templates $\{V_{strategy}^{(i)}\}_{i = 1}^{m}$; match state $(t, s)$
Ensure: Recommended strategy $S^{*}$, diagnostics $Δ$
1: $V_{team} \leftarrow \textsc{Aggregate} (T_{team})$
2: $V_{opp} \leftarrow \textsc{Aggregate} (T_{opp})$
3: $Δ_{tech} \leftarrow V_{team} [A_{12}] - V_{opp} [A_{12}]$
4: $Δ_{phys} \leftarrow V_{team} [A_{13}] - V_{opp} [A_{13}]$
5: $w \leftarrow \textsc{ComputeWeights} (V_{team} [A_{8}], Δ_{tech}, Δ_{phys}, t, s)$
6: for each strategy $i = 1, \dots, m$ do
7:     $d^{(i)} \leftarrow \sqrt{\sum_{j = 1}^{14} w_{j} {(V_{team}^{(j)} - V_{strategy}^{(i, j)})}^{2}}$
8:     if $α > 0$ then
9:         $d_{opp}^{(i)} \leftarrow \sqrt{\sum_{j = 1}^{14} w_{j} {(V_{opp}^{(j)} - V_{strategy}^{(i, j)})}^{2}}$
10:         $d^{(i)} \leftarrow d^{(i)} - α \cdot d_{opp}^{(i)}$
11:     end if
12: end for
13: $S^{*} \leftarrow \arg \min_{i} d^{(i)}$
14: $Δ \leftarrow V_{strategy}^{(S^{*})} - V_{team}$
15: return $S^{*}$, $Δ$

3.7. Evaluation Protocol

To assess the reliability, interpretability, and robustness of the prototype, we designed an evaluation protocol combining both qualitative coherence tests and quantitative stability checks. Since the model aims to support tactical reasoning rather than predict match outcomes, evaluation focuses on the logical and behavioral consistency of recommendations.

1.: Consistency Across Scenarios

Each simulated scenario (Section 5.1) is tested for:

Contextual coherence—the recommended strategy must align with intuitive tactical reasoning under the given conditions (e.g., low energy → positional defense).
Ranking monotonicity—when adjusting a single attribute (e.g., increasing $A_{8}$ ), the ranking of high-intensity strategies should improve predictably.

2.: Robustness to Perturbations

To verify numerical stability, random Gaussian noise $ϵ \sim N (0, σ^{2})$ is injected into team attributes ($σ \leq 0.05$). The system is expected to preserve the same top-ranked strategy in at least 90% of runs. Formally, let ${\hat{S}}_{k}$ denote the recommended strategy in run $k$; the robustness index is:

$R = \frac{1}{K} \sum_{k = 1}^{K} \mathbf{1} \{{\hat{S}}_{k} = S^{*}\}, \quad R \in [0, 1] .$

A value $R > 0.9$ indicates satisfactory resilience to measurement uncertainty.
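The robustness index can be estimated with a short Monte Carlo loop. The sketch below is illustrative only: the templates and team vector are random placeholders, weights are uniform, and clipping perturbed attributes back to $[0, 1]$ is an assumption not stated in the protocol.

```python
import numpy as np

def top_strategy(v_team, templates, w):
    """Index of the template with the smallest weighted Euclidean distance."""
    d = np.sqrt(((templates - v_team) ** 2 * w).sum(axis=1))
    return int(np.argmin(d))

def robustness_index(v_team, templates, w, sigma=0.05, n_runs=1000, seed=41):
    """Fraction of noisy runs whose top-ranked strategy equals the noise-free one."""
    rng = np.random.default_rng(seed)
    baseline = top_strategy(v_team, templates, w)
    hits = 0
    for _ in range(n_runs):
        noise = rng.normal(0.0, sigma, size=v_team.shape)
        noisy = np.clip(v_team + noise, 0.0, 1.0)  # assumption: keep within [0, 1]
        hits += top_strategy(noisy, templates, w) == baseline
    return hits / n_runs

# Placeholder data (assumption): 20 random templates and one random team profile.
rng = np.random.default_rng(0)
templates = rng.uniform(0.0, 1.0, size=(20, 14))
v_team = rng.uniform(0.0, 1.0, size=14)
w = np.ones(14)

R = robustness_index(v_team, templates, w, sigma=0.05)
R_no_noise = robustness_index(v_team, templates, w, sigma=0.0)
```

With `sigma=0.0` the index is exactly 1 by construction, which is a useful sanity check before interpreting values obtained under noise.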

3.: Sensitivity and Explainability

The diagnostic module computes attribute-level deltas

$Δ_{j} = {(V_{strategy}^{(S^{*})} - V_{team})}_{j},$

highlighting the most influential gaps driving the recommendation. Manual inspection across scenarios ensures that these explanations remain coherent with domain knowledge (e.g., “low $A_{8}$ and $A_{13}$ reduce feasibility of gegenpressing”).

4.: Computational Efficiency

All experiments run on a standard laptop (Intel i7, 16 GB RAM). Given the small dimensionality ($n = 14$) and the linear complexity $O (m n)$ for $m$ strategies, inference latency remains below 5 ms per evaluation, which is suitable for real-time tactical dashboards.

Summary

The combination of interpretability, robustness, and low computational cost validates the architecture as a viable foundation for more advanced AI-assisted tactical systems.

3.8. System Architecture Diagram

Figure 2 summarizes the end-to-end processing pipeline described in the preceding sections: context tree inputs are aggregated and normalized into a 14-dimensional team vector, which is then matched against strategy templates via the adapted semantic distance module to produce ranked recommendations with diagnostic output.

4. Prototype Implementation

The prototype of the tactical Decision Support System (DSS) was implemented in Python 3.10 using standard scientific libraries (NumPy, pandas, and matplotlib). The code follows a modular structure that mirrors the conceptual architecture described in Figure 2, ensuring both interpretability and extensibility. The complete source code is publicly available at https://github.com/Aribertus/football-dss-semantic-distance (accessed on 24 February 2026).

4.1. Module Organization

The implementation comprises three main modules:

Attribute aggregation module: Computes the 14 macro-attributes from player-level data using the weighted aggregation functions specified in Section 3.2. Each macro-attribute has a dedicated function (e.g., compute_offensive_strength(), compute_residual_energy()) that applies role-based weights to relevant player metrics.
Distance computation module: Implements the semantic distance calculations described in Section 3.5, including base Euclidean distance and the context-adapted variant with dynamic weight adjustments.
Analysis and visualization module: Provides sensitivity analysis, robustness testing, and ablation studies as specified in the evaluation protocol (Section 3.7), with automatic generation of diagnostic plots via matplotlib.

4.2. Dynamic Adjustment Mechanism

The core selection function implements the adapted distance framework from Section 3.5, including attribute-wise dynamic weighting exactly as specified in Section 3.6.2. The opponent-aware objective uses linear subtraction:

d_{comb} (S) = d_{adapt} (V_{team}, V_{strategy} (S)) - α \cdot d_{adapt} (V_{opp}, V_{strategy} (S))

where $d_{adapt}$ computes weighted Euclidean distance with the 14-dimensional weight vector $w$:

d_{adapt} (x, y; w) = \sqrt{\sum_{j = 1}^{14} w_{j} \cdot {(x_{j} - y_{j})}^{2}}

Implementation of Attribute-Wise Weighting

The weight vector w is computed fresh for each strategy evaluation based on current match conditions. The implementation directly follows Algorithm 1 and the multiplier formulas in Section 3.6.2:

1.: Initialize all multipliers $m_{j} = 1$ for $j = 1, \dots, 14$.
2.: Compute context indicators from match state:

Energy deficit: $δ_{e} = \max (0, τ_{e} - V_{team} [A_{8}])$
Technical gap: $Δ_{tech} = V_{team} [A_{12}] - V_{opp} [A_{12}]$
Physical gap: $Δ_{phys} = V_{team} [A_{13}] - V_{opp} [A_{13}]$
Time pressure: $δ_{t} = \max (0, τ_{t} - t) \cdot \mathbf{1} [s \leq 0]$

3.: Update specific multipliers using Equations (5)–(7) and the gap/time formulas.
4.: Clamp multipliers to stability bounds: $m_{j} \leftarrow clamp (m_{j}, 0.3, 2.5)$.
5.: Normalize weights to preserve scale: $w_{j} = 14 \cdot m_{j} / \sum_{k} m_{k}$.

This procedure ensures that each attribute $A_{j}$ receives an individually calibrated weight reflecting its current tactical salience. Section 5.4 demonstrates via ablation that this fine-grained reweighting outperforms uniform (global) weighting.

4.3. Execution Workflow

The main analytical pipeline executes the following steps:

1.: Profile generation: Compute $V_{team}$ and $V_{opp}$ from player-level data or scenario specifications.
2.: Scenario instantiation: Parse match conditions (time, score, fatigue, morale) from input or generate via scenario templates.
3.: Strategy evaluation: Compute adjusted distances for all 20 strategy templates; rank by ascending distance.
4.: Diagnostic extraction: For the top-ranked strategy, compute per-attribute deltas ( $Δ_{j} = V_{strategy}^{(j)} - V_{team}^{(j)}$ ) to identify capability gaps.
5.: Output generation: Produce tabular rankings, radar charts comparing team profile to recommended strategies, and diagnostic reports.

Steps 3–5 execute in under 5 ms on standard hardware (Intel i7, 16 GB RAM), confirming suitability for real-time tactical dashboards.

4.4. Reproducibility

All experiments use seeded random number generation (SEED = 41) to ensure reproducibility. The repository includes:

football_strategy_generation_1_3_1.py: Core DSS implementation with all 20 strategy templates and macro-attribute aggregation functions.
make_figures.py: Reproducible figure generation for experimental evaluation.
compute_pilot_distances.py: Pilot validation computations (Section 6).

Running each script regenerates all results and figures reported in this paper.

4.5. Extensibility

The modular design supports several extension paths:

New strategies: Adding a strategy requires only specifying a new 14-dimensional vector in the strategy_templates list.
External data integration: The aggregation functions can be connected to live data feeds (e.g., Wyscout, StatsBomb APIs) by replacing the player data input layer.
Custom weight profiles: Coaching staff can modify the dynamic adjustment logic to reflect club-specific tactical philosophies without altering the core distance computation.

5. Experimental Evaluation

5.1. Setup and Scenarios

The experimental phase aimed to validate the prototype’s behavior under realistic match conditions, verifying the consistency and interpretability of its tactical recommendations. Because no proprietary club data were available, the experiments employed simulated yet realistic data based on published match analysis statistics (e.g., Wyscout, Opta, StatsBomb).

Each team and opponent were represented as 14-dimensional normalized vectors ($V_{team}, V_{opp} \in {[0, 1]}^{14}$) derived from the context tree described in Section 4. Scenario parameters included technical and physical gaps, residual energy, psychological resilience, and time pressure. Table 8 summarizes the four principal experimental configurations.

Each scenario was executed using identical team baselines with parameter variations confined to the variables above, enabling controlled analysis of the DSS response. All experiments employed the linear opponent-aware objective $d_{comb}$ as specified in Section 3.5, with $α = 0.2$ unless otherwise noted.

5.2. Results by Scenario

For each simulated condition, the DSS produced a ranked list of strategies ordered by the adapted semantic distance $d_{adapt}$. Figure 3 displays an example of a radar plot comparing the actual team profile with the ideal profile of the strategy selected as optimal.

In the Energetic and Balanced scenario, the DSS consistently recommended High Pressing or Gegenpressing, with low semantic distance ($d_{adapt} < 0.15$). In the Fatigued and Inferior condition, the system automatically penalized energy-intensive attributes ($A_{5}$, $A_{8}$) and shifted toward Positional Defense, confirming adaptive coherence. Under High Temporal Pressure, the model prioritized Fast Counterattack, whereas under Technical and Physical Superiority it selected Build-up Play, highlighting strategic alignment with context.

Overall, the DSS exhibited behavior consistent with expert tactical intuition while maintaining quantitative transparency through vector distances.

5.3. Stability and Explainability Analyses

To evaluate stability and interpretability, three complementary analyses were performed across all scenarios.

Sensitivity to $λ$

The $λ$ parameter regulates the influence of contextual penalties (e.g., opponent predictability). Figure 4 shows that the recommended strategy remains stable for $0.1 < λ < 0.6$, with monotonic increases in distance values, indicating robustness of the semantic matching process.

5.3.1. Robustness to Input Noise

To assess recommendation stability under measurement uncertainty (cf. the measurement framework in Section 3.2.4), we conducted Monte Carlo perturbation analysis. Each attribute $A_{j}$ in the team vector was independently perturbed by $ϵ_{j} \sim U (- 0.05, + 0.05)$, simulating $\pm 5 \%$ measurement error. Over $N = 100$ trials per scenario:

Top-1 consistency: The same strategy was ranked first in 89.3% of trials (mean across scenarios; range: 82–96%).
Top-3 stability: The top-3 strategy set was identical in 94.1% of trials.
Distance interval width: Mean interval width (Equation (2)) was 0.08 units, or approximately 12% of typical inter-strategy distance.

The “Fatigued and Inferior” scenario exhibited the lowest stability (82%), consistent with its position near decision boundaries where small perturbations shift the ranking between tactically similar defensive options. The “Energetic and Balanced” scenario was most stable (96%), reflecting clear separation between high-pressing strategies and alternatives.

These results indicate that the DSS recommendations are reasonably robust to plausible measurement error, though deployments relying on lower-reliability data streams (Tier 2–3) should expect reduced consistency and should present top-k alternatives when distance intervals overlap.

5.3.2. Extended Robustness Analysis

The independent perturbation analysis above represents a baseline. Real deployments face additional challenges: correlated measurement errors, systematic missing data, and distribution shifts across competitions. We conducted extended analyses to characterize DSS behavior under these conditions.

(A) Correlated Perturbations Across Attributes. In practice, measurement errors are often correlated: fatigue ($A_{8}$) affects pressing capability ($A_{5}$); psychological attributes ($A_{7}$, $A_{9}$) co-vary with match events. We tested robustness under structured correlation patterns.

Methodology. Rather than independent noise $ϵ_{j} \sim U (- 0.05, + 0.05)$, we drew perturbations from a multivariate distribution:

$ϵ \sim N (0, Σ_{err})$

where $Σ_{err}$ encodes three correlation structures:

Physical cluster: $ρ = 0.6$ among $\{A_{5}, A_{8}, A_{13}\}$ (pressing, energy, physical base).
Psychological cluster: $ρ = 0.7$ among $\{A_{7}, A_{9}\}$ (resilience, morale).
Technical cluster: $ρ = 0.5$ among $\{A_{1}, A_{3}, A_{12}\}$ (offense, midfield, technical base).

All other pairs had $ρ = 0$. Marginal variance was set to ${(0.05)}^{2}$ to match the independent case.
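One way to construct $Σ_{err}$ and draw correlated perturbations is sketched below. The helper name `error_covariance` and the 0-based index mapping (attribute $A_{k}$ stored at index $k - 1$) are assumptions made for illustration; the cluster memberships and correlations are those listed above.

```python
import numpy as np

def error_covariance(n=14, sd=0.05, clusters=None):
    """Covariance with marginal variance sd^2 and within-cluster correlation rho."""
    if clusters is None:
        clusters = [
            ((4, 7, 12), 0.6),   # physical: A5, A8, A13
            ((6, 8), 0.7),       # psychological: A7, A9
            ((0, 2, 11), 0.5),   # technical: A1, A3, A12
        ]
    corr = np.eye(n)
    for members, rho in clusters:
        for i in members:
            for j in members:
                if i != j:
                    corr[i, j] = rho
    return sd ** 2 * corr

sigma_err = error_covariance()

# Draw 1000 correlated perturbation vectors (one per Monte Carlo trial).
rng = np.random.default_rng(41)
eps = rng.multivariate_normal(np.zeros(14), sigma_err, size=1000)
```

Because the clusters are disjoint and each uses a single positive correlation, the matrix is block-structured and positive definite, so sampling is well defined. Each row of `eps` would be added to the team vector before re-running the selection.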

Results are summarized in Table 9.

Interpretation. Correlated perturbations reduce stability by 5–8 percentage points compared to independent noise, as errors compound within attribute clusters. The physical cluster has the largest impact because energy-related attributes ($A_{5}$, $A_{8}$) are heavily weighted in fatigue-sensitive scenarios. Nevertheless, top-1 consistency remains above 80% and rank correlation exceeds 0.90, indicating that the DSS degrades gracefully under realistic error structures.

(B) Missing Data Patterns. Real deployments frequently encounter incomplete data: tracking systems fail, physiological monitors disconnect, qualitative assessments are unavailable. We tested three practically realistic missingness patterns.

Methodology. For each pattern, missing attributes were imputed using the protocol in Section 3.2.4 (correlated source → historical baseline → neutral default). We measured recommendation stability relative to complete-data baselines.

Pattern M1 (Tracking failure): $A_{4}$ (transition speed), $A_{8}$ (energy), $A_{13}$ (physical base) missing—simulates GPS/tracking outage.
Pattern M2 (Psychological unavailable): $A_{7}$ (resilience), $A_{9}$ (morale), $A_{14}$ (relational cohesion) missing—simulates absence of qualitative input.
Pattern M3 (Sparse data): 6 randomly selected attributes missing per trial—simulates amateur/youth contexts with limited instrumentation.

Results are summarized in Table 10.

Interpretation.

Pattern M1 causes moderate degradation (78.5% top-1 match) because physical attributes are decision-critical in fatigue scenarios; imputation from historical baselines underestimates within-match variation.
Pattern M2 shows surprising resilience (85.2%) because psychological attributes, while conceptually important, have high inter-attribute correlation ( $A_{7}$ – $A_{9}$ : $r = 0.98$ ), so partial information suffices.
Pattern M3 exhibits the largest drop (67.3%) but maintains 84% qualitative agreement—defined as recommending a strategy from the same tactical category (e.g., any pressing variant when full data would recommend High Press).

Practical implication: The DSS should report confidence levels based on data completeness. When >4 attributes are imputed, recommendations should be flagged as “low confidence” and presented as top-k alternatives rather than single choices.
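The fallback chain and the confidence flag could be sketched as follows. This is a loose illustration under several assumptions: the protocol of Section 3.2.4 is not reproduced here, so copying the value of a correlated proxy attribute, the `proxies` mapping, and the neutral default of 0.5 are all hypothetical choices; only the ordering of fallbacks and the ">4 imputed" rule come from the text.

```python
import numpy as np

def impute(v, proxies, history, neutral=0.5, max_imputed=4):
    """Fill NaNs via proxy attribute -> historical baseline -> neutral default.

    Returns (filled vector, number imputed, low_confidence flag).
    """
    filled = np.array(v, dtype=float)
    missing = np.isnan(filled)
    for j in np.flatnonzero(missing):
        proxy = proxies.get(int(j))
        if proxy is not None and not missing[proxy]:
            filled[j] = filled[proxy]      # assumption: reuse the proxy's value
        elif not np.isnan(history[j]):
            filled[j] = history[j]         # historical baseline
        else:
            filled[j] = neutral            # assumption: mid-scale default
    n_imputed = int(missing.sum())
    return filled, n_imputed, n_imputed > max_imputed

# Toy four-attribute example: three values missing.
observed = np.array([0.6, np.nan, np.nan, np.nan])
proxies = {1: 0, 2: 3}                       # attribute 2's proxy is itself missing
history = np.array([np.nan, np.nan, 0.4, np.nan])

filled, n_imputed, low_confidence = impute(observed, proxies, history)
```

In the toy example each missing attribute is filled by a different stage of the chain, and the result is not flagged because only three attributes were imputed.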

(C) Distribution Shift Across Competitions/Levels. Strategy vectors and attribute benchmarks were calibrated for professional European football. Performance may degrade when applied to different contexts: youth football, lower divisions, or non-European leagues where tactical norms differ.

Methodology. We simulated distribution shift by systematically adjusting team vector distributions:

Youth shift: Reduced mean capabilities by 15% ( $μ \to 0.85 μ$ ); increased variance by 30% ( $σ \to 1.3 σ$ )—reflecting lower skill floors and higher execution variability.
Lower-division shift: Reduced technical attributes ( $A_{1}$ – $A_{6}$ , $A_{12}$ ) by 20%; physical attributes unchanged—reflecting skill gap but comparable athleticism.
Style shift: Rotated attribute profiles to emphasize physicality over technique ( $A_{13} \to 1.2 A_{13}$ ; $A_{12} \to 0.8 A_{12}$ )—simulating leagues with different tactical cultures.

For each shift, we generated 50 team profiles, ran the DSS, and evaluated whether recommendations aligned with expert intuition (assessed by two independent raters).

Results are summarized in Table 11.

Interpretation.

Youth shift reduces agreement to 82%, primarily because the DSS over-recommends high-intensity strategies (pressing, gegenpressing) that youth teams lack the discipline to execute. Recalibrating $A_{5}$ and $A_{11}$ thresholds would address this.
Lower-division shift shows modest degradation (88%), suggesting that the attribute framework transfers reasonably well when physical baselines are similar.
Style shift produces the largest drop (78%), with 11 problematic recommendations—typically suggesting possession-based strategies to physically dominant teams that would benefit more from direct play. This confirms that strategy vectors encode European tactical norms and may require re-elicitation for stylistically distinct leagues.

Practical implication: Deployments outside professional European football should: (i) adjust normalization benchmarks to local populations, (ii) validate strategy vectors against local expert intuition, and (iii) consider re-weighting attributes to reflect context-specific tactical priorities.

Table 12 summarizes the extended robustness findings.

The DSS maintains reasonable performance (top-1 $> 75 \%$ or qualitative agreement $> 80 \%$) across most realistic perturbation scenarios. The primary vulnerabilities are: (i) sparse data contexts requiring >4 imputed attributes, and (ii) deployment in tactically distinct football cultures without recalibration. These findings inform the deployment guidelines in Section 7.

Ablation Study

Each macro-attribute was systematically suppressed ($A_{j} = 0$) to estimate its contribution. Attributes most affecting the chosen strategy were: Offensive Strength ($A_{1}$), Tactical Cohesion ($A_{11}$), Residual Energy ($A_{8}$), and Psychological Resilience ($A_{7}$).

5.4. Ablation: Attribute-Wise vs. Uniform Weighting

A central claim of the proposed methodology is that attribute-wise dynamic weighting (where each $w_{j}$ is individually adjusted based on match context) provides finer-grained adaptation than a uniform (global) weighting scheme. To validate this claim, we conducted a controlled ablation study comparing three weighting configurations:

1.: Attribute-wise (proposed): Weights computed per attribute using Equations (5)–(7) and the gap/time formulas, as specified in Section 3.6.2.
2.: Uniform baseline: All weights fixed at $w_{j} = 1$ regardless of context (equivalent to unweighted Euclidean distance).
3.: Global scaling: A single scalar multiplier $μ \in [0.5, 1.5]$ applied uniformly to all attribute weights based on aggregate context severity (e.g., $μ = 0.7$ when energy is low), simulating a “global penalty” approach.

5.4.1. Evaluation Metrics

For each scenario, we measured:

Tactical coherence: Whether the top-ranked strategy aligns with expert intuition (e.g., avoiding high-pressing when fatigued).
Ranking sensitivity: The rank change of contextually inappropriate strategies (e.g., gegenpressing in low-energy scenarios).
Diagnostic precision: Whether per-attribute deltas correctly identify the binding constraints.

5.4.2. Results

Table 13 summarizes the comparison across the four test scenarios.

Analysis

The uniform weighting scheme failed in two of four scenarios: it recommended gegenpressing (rank 4/20) in the Fatigued & Inferior scenario because it could not down-weight energy-intensive attributes, and it recommended build-up play rather than fast counterattack under time pressure because it could not up-weight transition speed.

Global scaling partially addressed energy concerns but failed under time pressure: because the scalar multiplier affects all attributes equally, it could not simultaneously penalize energy-intensive strategies and promote transition-speed-dependent strategies. Only attribute-wise weighting achieved correct recommendations across all scenarios.

The diagnostic analysis further confirms the advantage: attribute-wise weighting produces per-attribute deltas that correctly identify $A_{8}$ (energy) as the binding constraint in fatigue scenarios and $A_{4}$ (transition speed) as critical under time pressure. Global scaling cannot provide this granularity because it treats all attributes identically.

Conclusions

These results demonstrate that the claimed benefits of the DSS—context-sensitive recommendations with fine-grained diagnostics—depend specifically on attribute-wise weighting and cannot be replicated by simpler global penalty schemes.

5.5. Attribute Contribution Analysis

Aggregating results across all scenarios, Figure 5 ranks the top five macro-attributes by overall impact on the DSS decision process. The predominance of psychological and energy-related variables highlights the importance of integrating intangible dimensions—typically underrepresented in data-driven sports analytics.

5.6. Critical Discussion

The experiments provide preliminary evidence that a vector-based semantic model can reproduce coherent tactical reasoning without hard-coded rules, contingent on the specific parameter settings and scenario configurations documented in Appendix A. Within these controlled conditions, the DSS adapts to variations in physical, psychological, and temporal parameters, producing recommendations that align with expert intuition. However, these results should be interpreted as demonstrating internal consistency rather than operational validity: the system behaves as designed, but real-world applicability remains to be established through prospective deployment.

Key limitations constrain the strength of conclusions: (i) the evaluation data are simulated from controlled distributions rather than observed from actual matches; (ii) the distance metric assumes linear, additive attribute contributions; (iii) opponent modeling is static rather than adaptive; and (iv) the ablation and robustness analyses, while systematic, operate within the same synthetic framework used for development. Future work should prioritize external validation with independent datasets before claims of operational readiness can be substantiated.

5.7. Reproducibility and Open Materials

To ensure full transparency and reproducibility, all code used to implement the semantic-distance DSS—including the context-tree aggregation functions, strategy templates, scenario generators, and evaluation pipeline—is publicly available in the accompanying repository. The repository also contains the complete set of figures (radar charts, sensitivity curves, robustness analyses, and ablation studies) together with scripts to regenerate them from scratch. Appendix A provides the complete formal specification, enabling independent verification of all reported results.

6. From Simulation to Practice: A Pilot Case Study

The experimental evaluation in Section 5 validated the DSS under controlled, simulated conditions—demonstrating internal coherence, robustness, and interpretability. However, the ultimate value of a decision support system lies in its applicability to real-world contexts. This section bridges that gap by applying the framework to observational data from an actual competitive match.

The transition from simulation to practice introduces challenges absent in controlled experiments: categorical rather than continuous measurements, partial attribute coverage, missing opponent data, and the inherent noise of live football. By confronting these challenges directly, we provide initial evidence that the semantic-distance methodology can accommodate real-world constraints while preserving its core analytical properties.

6.1. Evaluation Specification

To ensure transparency and reproducibility, we first specify the evaluation framework, including endpoints, baselines, and experimental configuration.

6.1.1. Evaluation Objectives and Scope

This pilot study is designed as a feasibility demonstration rather than a definitive effectiveness evaluation. The objectives are:

1. Primary: Assess whether the DSS can process real observational data and produce coherent recommendations (feasibility endpoint).
2. Secondary: Compare DSS recommendations against expert tactical judgment (agreement endpoint).
3. Exploratory: Examine alignment between DSS recommendations and observed tactical outcomes (descriptive analysis, not causal inference).

Explicit non-goals: This study does not aim to establish (i) causal effectiveness of DSS recommendations on match outcomes, (ii) superiority over alternative decision-support methods, or (iii) statistical generalizability to other matches, leagues, or contexts. Such claims require prospective, multi-match studies with appropriate controls.

6.1.2. Evaluation Endpoints

Endpoint 1: Processing Feasibility (Primary)

Definition: The DSS successfully ingests categorical observational data, converts them to the 14-dimensional attribute space (with partial coverage), and produces a ranked list of strategy recommendations without errors or degenerate outputs.

Success criterion: Complete pipeline execution with interpretable outputs for both match phases (first half, halftime projection).

Endpoint 2: Expert Agreement (Secondary)

Definition: Agreement between the DSS recommendation and independent expert judgment on tactical appropriateness given the observed team state.

Operationalization: Two reviewers with football coaching experience, neither involved in the match nor in DSS development, independently reviewed:

The team’s observed attribute profile (categorical ratings converted to numerical values).
The match context (score, time, observable fatigue indicators).
The DSS’s top-3 recommended strategies with diagnostic explanations.

Each expert rated: (a) whether the top-1 recommendation was “Appropriate,” “Partially Appropriate,” or “Inappropriate” for the given context; (b) whether the top-3 set contained at least one strategy they would endorse.

Success criterion: Both experts rate the top-1 recommendation as “Appropriate” or “Partially Appropriate”; both endorse at least one strategy in the top-3 set.
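
The success criterion above can be expressed as a simple check. The following is a minimal sketch only: the function name `endpoint2_passed` and the example ratings are hypothetical placeholders and are not taken from the repository or from Table 17.

```python
# Ratings that count toward success under the criterion stated above.
ACCEPTABLE = {"Appropriate", "Partially Appropriate"}


def endpoint2_passed(top1_ratings, top3_endorsed):
    """Return True when every expert rates top-1 acceptably and endorses a top-3 strategy.

    top1_ratings  : list of rating strings, one per expert
    top3_endorsed : list of booleans, one per expert
    """
    if not top1_ratings or len(top1_ratings) != len(top3_endorsed):
        raise ValueError("one rating and one endorsement flag per expert are required")
    return all(r in ACCEPTABLE for r in top1_ratings) and all(top3_endorsed)


# Hypothetical inputs for two reviewers (placeholders, not the Table 17 values).
print(endpoint2_passed(["Appropriate", "Partially Appropriate"], [True, True]))
print(endpoint2_passed(["Appropriate", "Inappropriate"], [True, True]))
```

Under this reading, a single "Inappropriate" rating or a single reviewer endorsing none of the top-3 strategies is enough to fail the endpoint.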

Endpoint 3: Tactical Alignment Analysis (Exploratory)

Definition: Descriptive comparison of DSS recommendation against observed second-half tactics, with post-hoc interpretation of divergence.

Operationalization: Attribute-by-attribute comparison between the recommended strategy’s ideal profile and the team’s actual second-half profile, with qualitative assessment of whether divergence was associated with positive or negative match dynamics.

Note: This endpoint is purely descriptive. Observed alignment (or misalignment) cannot establish causal relationships due to the single-match sample and absence of counterfactual conditions.

6.1.3. Baseline Comparators

To contextualize DSS performance, we compare against three baselines:

1. Random baseline: Strategy selected uniformly at random from the 20-strategy library. Expected expert agreement rate: ∼15–20% (assuming 3–4 strategies are contextually appropriate at any time).
2. Default strategy baseline: Always recommend “Build-up Play” (the most versatile, moderate-demand strategy). This represents a “safe default” approach that avoids context-specific adaptation.
3. Energy-only heuristic: Select strategy based solely on residual energy ( $A_{8}$ ): if $A_{8} \geq 0.6$ , recommend “High Press”; if $A_{8} \in [0.4, 0.6)$ , recommend “Build-up Play”; if $A_{8} < 0.4$ , recommend “Positional Defense.” This represents a simple rule-based comparator using the single most dynamic attribute.
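
The energy-only heuristic is fully specified by its three thresholds, so it can be sketched directly. The function name `energy_only_heuristic` and the range check are illustrative assumptions; the thresholds and strategy labels are those stated above.

```python
def energy_only_heuristic(a8: float) -> str:
    """Pick a strategy from residual energy A8 alone, using the thresholds stated in the text."""
    if not 0.0 <= a8 <= 1.0:
        # Assumption: attributes are normalized to [0, 1], as elsewhere in the paper.
        raise ValueError("A8 is expected to lie in [0, 1]")
    if a8 >= 0.6:
        return "High Press"
    if a8 >= 0.4:
        return "Build-up Play"
    return "Positional Defense"


for energy in (0.70, 0.50, 0.35):
    print(energy, "->", energy_only_heuristic(energy))
```

Because the rule reads a single attribute, two teams with identical energy but very different offensive or defensive profiles receive the same recommendation, which is the behavior the baseline is meant to expose.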

6.1.4. Data Sampling and Selection

Dataset Identity

The data derive from a single match in the German youth football system:

Competition: C-Junioren Saarlandliga (U14/U15 regional championship)
Season: 2023–24
Match: SSV Pachten (home) vs. JSG Stausee-Losheim (away)
Date: [Anonymized for player protection]
Selection rationale: Convenience sample—match was attended by a co-author who collected observational data using a standardized protocol.

Sampling Limitations

This is a single-match convenience sample with no claim to representativeness. The match was selected based on data availability, not match characteristics. Results cannot be generalized to other matches, teams, or competitions without replication.

6.1.5. Train/Test Separation and Leakage Control

Temporal Separation

The DSS parameters (strategy vectors, weight coefficients, normalization benchmarks) were fixed prior to accessing the pilot match data. No parameter tuning was performed using the pilot data.

Information Available at Decision Time

The halftime recommendation used only:

First-half observational data (6 attributes × 1 team)
Pre-match contextual information (match duration, competition level)
Fatigue projection based on standard youth-match depletion curves (not match-specific data)

Second-half observations were used only for retrospective comparison, not for generating recommendations.

Leakage Safeguards

1. No outcome-based tuning: The final score (4:3) was not used in any DSS computation or parameter selection.
2. No iterative refinement: The recommendation was generated in a single pass; no adjustments were made after observing the result.
3. Blind expert evaluation: Expert reviewers assessed the recommendation without knowledge of the match outcome.

6.1.6. Reproducibility Configuration

All computations for this pilot study are reproducible using the public repository:

python compute_pilot_distances.py

Key configuration parameters:

Random seed: SEED = 41 (same as synthetic experiments)
Categorical mapping: Hoch $\to 0.85$ , Mittel $\to 0.50$ , Niedrig $\to 0.20$
Fatigue projection: $A_{8}^{proj} = A_{8}^{HT} - 0.15$
Missing attributes: Excluded from distance computation (reduced 5-dimensional space)
Opponent modeling: Disabled ( $\alpha = 0$ ) due to absence of opponent data
Weight configuration: Default dynamic weights as specified in Section 3.6.2.
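
To illustrate what excluding missing attributes from the distance computation can look like, the sketch below computes a weighted Euclidean distance over only the attributes observed for the team. It is written under stated assumptions: the function name `reduced_distance`, the uniform default weights, and the example vectors are hypothetical, and the actual dynamic weighting of Section 3.6.2 and the opponent term (disabled here, $\alpha = 0$) are not reproduced.

```python
import math


def reduced_distance(team, strategy, weights=None):
    """Weighted Euclidean distance restricted to attributes observed for the team.

    team     : dict attribute -> value in [0, 1], or None when not observed
    strategy : dict attribute -> ideal value in [0, 1]
    weights  : optional dict attribute -> non-negative weight (assumed uniform if omitted)
    Returns (distance, list of attributes actually used).
    """
    observed = [k for k, value in team.items() if value is not None]
    if not observed:
        raise ValueError("at least one observed attribute is required")
    if weights is None:
        weights = {k: 1.0 for k in observed}
    total = sum(weights[k] * (team[k] - strategy[k]) ** 2 for k in observed)
    return math.sqrt(total), observed


# Hypothetical three-attribute example; A3 is unobserved and therefore skipped.
team = {"A1": 0.8, "A2": 0.5, "A3": None}
template = {"A1": 0.5, "A2": 0.9, "A3": 0.7}
distance, used = reduced_distance(team, template)
print(used, round(distance, 3))
```

Dropping unobserved dimensions, rather than imputing them, keeps the comparison confined to measured evidence, at the cost of making distances computed on different attribute subsets not directly comparable.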

6.2. Data Source and Match Context

The validation data were collected from a C-Junioren (U14/U15) match in the German youth football championship system:

Match: SSV Pachten vs. JSG Stausee-Losheim
Final score: 4:3 (home victory)
Match duration: 2 × 35 min
Observation protocol: Six tactical attributes recorded per half using a three-level categorical scale (Hoch/Mittel/Niedrig, corresponding to High/Medium/Low)

Youth football presents particular challenges for tactical analysis: teams exhibit greater execution variability, tactical discipline is less consolidated than at the professional level, and physical and psychological fluctuations are more pronounced. These characteristics make the dataset a useful stress test for the DSS’s robustness and adaptability.

6.3. Observed Attributes and Mapping Protocol

Match observers recorded six team attributes at the conclusion of each half using German terminology consistent with the source data collection protocol. To ensure full traceability between raw observations and DSS computations, we establish a systematic localization policy and provide a complete mapping specification.

6.3.1. Localization Policy

All DSS outputs, diagnostic reports, and experimental results presented in this paper use the canonical English attribute identifiers ($A_{1}$–$A_{14}$) as defined in Table 2. When working with non-English source data:

1. Input normalization: Source attributes are mapped to the corresponding $A_{j}$ identifier using the translation table below. This mapping is applied at data ingestion, before any computation.
2. Internal representation: All internal computations use the English identifiers exclusively. The weight vector w, distance calculations, and diagnostic deltas reference $A_{1}$ – $A_{14}$ .
3. Output standardization: All figures, tables, and textual outputs use English identifiers with German source terms provided parenthetically where relevant for auditability.
4. Categorical value translation: The German three-level scale (Hoch/Mittel/Niedrig) is converted to numerical values as specified in Equation (14).

6.3.2. German–English Attribute Mapping

Table 14 provides the complete one-to-one mapping between German source terms, English DSS identifiers, definitions, and computation methods for the pilot study data.

Aggregation Rule for $A_{4}$

Two German terms (Direkte vertikale Angriffe and Gegenangriff) both capture aspects of transition play. For DSS computation, these are aggregated to a single $A_{4}$ value:

$$A_{4} = \max\bigl(v(\text{Direkte vertikale Angriffe}),\; v(\text{Gegenangriff})\bigr)$$

where $v(\cdot)$ denotes the categorical-to-continuous conversion. This max aggregation reflects the tactical intuition that transition capability is demonstrated by either vertical directness or counterattacking effectiveness.

Unmapped Attributes

The pilot observation protocol captured 6 German attributes that map to 5 unique DSS dimensions ($A_{1}$, $A_{2}$, $A_{4}$, $A_{5}$, $A_{8}$). The remaining 9 attributes ($A_{3}$, $A_{6}$, $A_{7}$, $A_{9}$, $A_{10}$, $A_{11}$, $A_{12}$, $A_{13}$, $A_{14}$) were not directly observed and are therefore excluded from the reduced-dimension analysis in Section 6.5. Future validation studies should employ expanded observation protocols to achieve full attribute coverage.

6.3.3. Categorical-to-Continuous Conversion

The three-level categorical scale was converted to continuous values in $[0, 1]$ using the following protocol:

$$\text{Niveau} \mapsto v = \begin{cases} 0.85 & \text{if Hoch (High)} \\ 0.50 & \text{if Mittel (Medium)} \\ 0.20 & \text{if Niedrig (Low)} \end{cases} \tag{14}$$

These anchor points were chosen to preserve discriminability while avoiding boundary effects. Sensitivity analyses (reported below) confirmed that moderate variations in these mappings ($\pm 0.10$) did not alter the primary findings.
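
The conversion in Equation (14) and the max aggregation for $A_{4}$ defined in Section 6.3.2 can be sketched together as follows. The names `CATEGORY_TO_VALUE`, `v`, and `aggregate_a4` are illustrative and are not claimed to match the repository; the example category pairs are hypothetical inputs rather than the Table 15 observations.

```python
# Anchor points from Equation (14).
CATEGORY_TO_VALUE = {"Hoch": 0.85, "Mittel": 0.50, "Niedrig": 0.20}


def v(level: str) -> float:
    """Categorical-to-continuous conversion; rejects labels outside the three-level scale."""
    try:
        return CATEGORY_TO_VALUE[level]
    except KeyError:
        raise ValueError(f"unknown level: {level!r}") from None


def aggregate_a4(direkte_vertikale_angriffe: str, gegenangriff: str) -> float:
    """A4 is the larger of the two converted transition-related ratings."""
    return max(v(direkte_vertikale_angriffe), v(gegenangriff))


print(aggregate_a4("Hoch", "Hoch"))       # 0.85
print(aggregate_a4("Mittel", "Hoch"))     # 0.85
print(aggregate_a4("Niedrig", "Mittel"))  # 0.5
```

One consequence of the max rule is that a drop in one of the two source ratings leaves $A_{4}$ unchanged as long as the other remains high.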

6.4. Match Observations

Table 15 presents the raw observational data for both halves of the match, along with the corresponding normalized vector representations.

Tactical Narrative

The observational data reveal a clear temporal pattern:

1. First half: The team displayed high offensive capability with strong vertical and counterattacking tendencies. Defensive organization and energy reserves were at medium levels, suggesting a balanced but attack-oriented approach.
2. Second half: While offensive intent remained high, execution quality declined (vertical attacks dropped to medium). Critically, both defensive compactness and residual energy fell to low levels, indicating fatigue-induced tactical degradation.

The final scoreline (4:3) is consistent with this profile: a high-scoring, open match where both teams prioritized attacking play at the expense of defensive solidity, particularly in the later stages.

6.5. DSS Application: Halftime Recommendation

We applied the DSS retrospectively to the halftime state to generate a tactical recommendation for the second half, using the first-half observations as the current team state and projecting likely energy depletion.

6.5.1. Input Configuration

The reduced team vector (6 observable dimensions mapped to 5 unique DSS attributes) was constructed as:

$$V_{team}^{HT} = \begin{bmatrix} A_{1} \\ A_{2} \\ A_{4} \\ A_{5} \\ A_{8} \end{bmatrix} = \begin{bmatrix} 0.85 \\ 0.50 \\ 0.85 \\ 0.50 \\ 0.50 \end{bmatrix} \tag{15}$$

For the second-half projection, we applied a fatigue discount of $-0.15$ to $A_{8}$ (anticipating energy depletion in a youth match with limited substitution depth), yielding a projected $A_{8} = 0.35$.
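
The projection step amounts to a single subtraction on $A_{8}$, leaving the other observed attributes unchanged. A minimal sketch follows; the function name `project_second_half`, the clamping at zero, and the rounding to two decimals are assumptions added for illustration.

```python
FATIGUE_DISCOUNT = 0.15  # value stated in the reproducibility configuration


def project_second_half(team_ht: dict, discount: float = FATIGUE_DISCOUNT) -> dict:
    """Return a copy of the halftime vector with residual energy A8 reduced by the discount."""
    projected = dict(team_ht)
    # Assumption: clamp at 0 so the attribute stays in [0, 1]; round to avoid float noise.
    projected["A8"] = round(max(0.0, projected["A8"] - discount), 2)
    return projected


# Halftime vector from Equation (15).
team_ht = {"A1": 0.85, "A2": 0.50, "A4": 0.85, "A5": 0.50, "A8": 0.50}
team_proj = project_second_half(team_ht)
print(team_proj["A8"])  # 0.35
```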

6.5.2. Strategy Comparison

Table 16 presents the adapted semantic distances between the projected team vector and the subset of strategy templates relevant to the observable attribute space. Figure 6 visually compares the adaptability of these strategies to the team’s current condition.

6.5.3. DSS Recommendation

Based on the computed distances, the DSS recommended:

Build-up Play—a possession-based approach emphasizing controlled progression and tempo management over high-intensity pressing or rapid vertical transitions.

The diagnostic module identified the following key factors driving the recommendation:

Strengths: High offensive capability ( $A_{1} = 0.85$ ) aligns well with Build-up Play requirements ( $0.80$ ). Defensive organization ( $A_{2} = 0.50$ ) and pressing capability ( $A_{5} = 0.50$ ) match the strategy’s moderate demands.
Constraint: Projected residual energy ( $A_{8} = 0.35$ ) falls short of the strategy’s ideal ( $0.60$ ), with a gap of $+ 0.25$ . This is the primary limitation.
Surplus: The team’s transition speed ( $A_{4} = 0.85$ ) substantially exceeds Build-up Play’s requirements ( $0.50$ ), representing untapped vertical capability.
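
The attribute-level diagnostics above follow from signed gaps between the strategy's ideal profile and the projected team vector. The sketch below reproduces that arithmetic under stated assumptions: the function name `diagnose` and the 0.10 tolerance band are hypothetical, and the Build-up Play values for $A_{2}$ and $A_{5}$ are assumed to equal 0.50 because the text states only that they match the strategy's moderate demands.

```python
def diagnose(team: dict, strategy: dict, tolerance: float = 0.10) -> dict:
    """Label each attribute by the signed gap (strategy ideal minus team value).

    Positive gap above the tolerance -> "constraint" (team falls short)
    Negative gap below -tolerance    -> "surplus" (team exceeds the requirement)
    Otherwise                        -> "match"
    """
    report = {}
    for attr, required in strategy.items():
        gap = round(required - team[attr], 2)
        if gap > tolerance:
            label = "constraint"
        elif gap < -tolerance:
            label = "surplus"
        else:
            label = "match"
        report[attr] = (gap, label)
    return report


team_proj = {"A1": 0.85, "A2": 0.50, "A4": 0.85, "A5": 0.50, "A8": 0.35}
# A1, A4, A8 ideals are quoted in the text; A2 and A5 are assumed values.
build_up_play = {"A1": 0.80, "A2": 0.50, "A4": 0.50, "A5": 0.50, "A8": 0.60}

for attr, (gap, label) in diagnose(team_proj, build_up_play).items():
    print(attr, gap, label)
```

With these inputs, $A_{8}$ is flagged as the only constraint (gap $+0.25$) and $A_{4}$ as the only surplus (gap $-0.35$), in line with the diagnostic summary.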

6.6. Retrospective Analysis

6.6.1. Observed vs. Recommended Tactics

The DSS recommended Build-up Play, a possession-oriented strategy emphasizing tempo control and energy conservation. However, the second-half observations suggest that the team continued with an aggressive, transition-heavy approach despite declining energy reserves and defensive organization. This divergence can be characterized as a high-risk, high-reward tactical choice, which in this instance yielded a positive outcome (the team held on to win 4:3), albeit by a narrow margin.

The comparison reveals that the team diverged from the DSS recommendation on three key dimensions: they maintained high transition speed rather than moderating tempo, allowed defensive compactness to collapse, and depleted energy reserves beyond sustainable levels. This pattern is consistent with a “high-risk continuation” approach rather than the energy-conserving Build-up Play the DSS recommended.

6.6.2. Counterfactual Consideration

Had the team followed the DSS recommendation of Build-up Play—reducing transition speed, conserving energy through possession, and maintaining defensive shape—the expected outcome might have been:

Lower probability of conceding the third goal (defensive compactness preserved)
Reduced offensive output (potentially fewer goals scored, but also fewer high-risk transitions)
Better preservation of energy for critical late-game moments
More controlled match tempo, reducing the chaotic “open game” dynamic

This counterfactual analysis highlights the DSS’s potential as a risk-aware decision-support tool. The team’s actual approach succeeded in this instance, but the DSS correctly identified energy depletion as a critical constraint. In matches where the margin is less forgiving, ignoring such constraints could prove costly.

6.7. Evaluation Results

This section reports outcomes against the evaluation endpoints and baselines specified in Section 6.1.

6.7.1. Endpoint 1: Processing Feasibility

Result: PASSED.

The DSS successfully processed the categorical observational data through the complete pipeline:

Categorical-to-numerical conversion executed without errors for all 6 observed attributes.
Reduced 5-dimensional team vector constructed (after aggregating $A_{4}$ from two sources).
Semantic distances computed for all 20 strategies in the library.
Ranked recommendation list generated with diagnostic output for top-3 strategies.
Halftime projection with fatigue adjustment produced valid results.

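No degenerate outputs (e.g., tied rankings, infinite distances, missing values) were observed. Pipeline execution time was <50 ms, so computational latency is unlikely to be a barrier to halftime or in-match use; suitability for live deployment otherwise remains untested in this retrospective pilot.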

6.7.2. Endpoint 2: Expert Agreement

Result: PASSED (with partial agreement on top-1).

Two independent expert reviewers with football coaching experience evaluated the DSS recommendation, with outcomes summarized in Table 17:

Qualitative feedback:

Expert 1: “Build-up Play is a sensible choice given the energy constraints. The diagnostic correctly identifies stamina as the limiting factor. I would have recommended the same.”
Expert 2: “Build-up Play is reasonable but perhaps too conservative for a team leading 2:1 at halftime. I might prefer Cautious Horizontal Play [ranked 3rd by DSS] to maintain some attacking threat. However, the DSS’s reasoning is sound and the top-3 set is useful.”

The success criterion (both experts rate top-1 as Appropriate or Partially Appropriate; both endorse at least one top-3 strategy) was met.

6.7.3. Endpoint 3: Tactical Alignment Analysis

Result: Divergence observed; descriptive analysis provided.

As reported in Table 18, the team’s second-half tactics diverged from the DSS recommendation on 3 of 5 observable dimensions. The team maintained high-intensity, transition-focused play despite declining energy reserves—a higher-risk approach than the DSS recommended.

Outcome association (descriptive only):

The team won the match (4:3), suggesting the high-risk approach succeeded in this instance.
However, the team conceded 2 second-half goals (vs. 1 in the first half), consistent with the DSS’s warning about defensive vulnerability under energy depletion.
Causal attribution is not possible from a single match.

6.7.4. Baseline Comparisons

Table 19 provides a baseline comparison of recommendation methods.

Random baseline: Selected “Offside Trap” (a high-risk defensive tactic requiring precise coordination)—deemed inappropriate by both experts for a fatigued youth team.
Default strategy baseline: Coincidentally matched the DSS recommendation (Build-up Play), achieving the same expert ratings. This highlights that Build-up Play is indeed a “safe” choice but also that the DSS does not always outperform simple defaults. The DSS’s value lies in (i) providing diagnostic reasoning and (ii) adapting to contexts where the default would be inappropriate.
Energy-only heuristic: Recommended “Positional Defense” based solely on low projected energy ( $A_{8} = 0.35 < 0.4$ ). Expert 1 rated this as “Partially Appropriate” (energy conservation is valid), but Expert 2 rated it “Inappropriate” (too passive for a team with strong offensive capability and a lead to protect through controlled possession rather than pure defense).

Key insight: The DSS matched the default baseline in this scenario but provided richer diagnostic output. The energy-only heuristic, while capturing one important factor, missed the interaction between offensive capability and energy management that the full DSS model captures.

6.7.5. Summary of Evaluation Outcomes

The pilot study achieved its primary and secondary endpoints, as summarized in Table 20. The DSS successfully processed real observational data and produced recommendations that independent experts judged appropriate or partially appropriate. Its top-1 recommendation coincided with the default baseline and was rated more favorably than the random and energy-only baselines, with the added benefit of diagnostic interpretability.

6.8. Limitations of the Pilot Study

This preliminary validation has several limitations that constrain the strength of conclusions:

1. Single-match sample: One match cannot establish statistical generalizability. The analysis should be viewed as a proof-of-concept demonstration.
2. Partial attribute coverage: Only 5 of the 14 DSS attributes (derived from 6 observed German terms) were directly observable, limiting the semantic space to a lower-dimensional subspace.
3. Absence of opponent data: The observational protocol captured only the home team (SSV Pachten), precluding the opponent-aware distance adjustments described in Section 3.
4. Retrospective rather than prospective: The DSS was applied after the match rather than in real time, preventing assessment of whether recommendations would have influenced actual coaching decisions.
5. Youth football context: Tactical patterns and physical dynamics in C-Junioren football may differ from senior professional contexts where the DSS is ultimately intended to operate.

6.9. Implications for Framework Validation

Despite its limitations, this pilot study provides preliminary evidence for several framework capabilities, though these observations should be interpreted cautiously given the single-match sample and partial attribute coverage:

Real-data compatibility: The DSS can ingest observational data from actual matches using a straightforward categorical-to-continuous mapping protocol (see Section 6.3), suggesting that the framework is not inherently limited to synthetic inputs.
Temporal dynamics: The framework captures intra-match evolution (first half → second half), enabling phase-specific recommendations. Whether this capability generalizes across match contexts remains to be established.
Diagnostic interpretability: The attribute-level analysis provides insights (e.g., “energy reserves constrain high-intensity options”) that appear actionable, though coach acceptance testing has not been conducted.
Graceful degradation: Even with partial attribute coverage (5 of 14 dimensions), the DSS produces coherent recommendations. This robustness to incomplete information is encouraging but requires systematic evaluation across varying degrees of data availability.

Important caveat: These observations demonstrate feasibility rather than validity. The pilot shows that the DSS can process real match data and produce interpretable outputs; it does not establish that these outputs would improve coaching decisions or match outcomes. Operational validity claims require prospective studies with systematic outcome tracking.

The path from this pilot toward systematic validation involves:

1. Multi-match datasets: Systematic observation across a full season (15–20 matches) to enable statistical validation.
2. Expanded attribute protocols: Development of standardized observation instruments covering all 14 DSS attributes, potentially including post-match coach interviews for psychological dimensions.
3. Opponent observation: Parallel data collection for opposing teams to enable full exploitation of the semantic distance framework.
4. Prospective deployment: Real-time DSS use during matches (e.g., at halftime) with systematic tracking of recommendation adherence and outcome correlations.

This pilot case study represents one step in the research trajectory: from theoretical formalization (Section 3) through prototype implementation (Section 4) and controlled experimentation (Section 5) toward real-world application. The results establish that the semantic-distance approach can accommodate observational data from actual matches while preserving interpretability and adaptability. However, a single retrospective application cannot establish operational validity or predictive value. The following Discussion (Section 7) synthesizes insights from both the simulated experiments and this pilot study, explicitly delimiting the scope of current claims and charting the validation work required before stronger conclusions can be drawn.

7. Discussion

The experimental evaluation and pilot study provide evidence that the proposed semantic-distance Decision Support System achieves internal coherence within its design parameters and produces recommendations that align with tactical intuition across the tested scenarios. Following revisions to ensure method–implementation consistency (Section 3.5, Section 3.6, Section 3.7, Section 3.8, Section 4.1 and Section 4.2), establishment of a systematic localization policy (Section 6.3), and provision of complete formal specifications (Appendix A), the analytical pipeline from input data to tactical recommendations is now fully auditable and reproducible.

However, auditability does not imply operational validity. The system exhibits stability in balanced or high-energy contexts and interpretability through its diagnostic visualizations, but these properties have been demonstrated primarily within synthetic evaluation frameworks. Beyond the pilot-specific constraints noted in Section 6.8, the DSS architecture itself presents broader limitations that constrain claims regarding real-world applicability and operational readiness.

7.1. Methodological Limitations

7.1.1. Data Quality and Representativeness

The DSS relies on a compact set of inputs: 14 macro-attributes and 20 predefined tactical strategies encoded as idealized vectors. This controlled design facilitates methodological validation but constrains generalizability. High-impact attributes such as team morale, tactical cohesion, and psychological resilience are estimated through heuristic approximations rather than direct measurement, which may explain episodes of moderate robustness (stability dropping to ∼60–70% under high-pressure or low-energy conditions) where the system becomes sensitive to noise.

7.1.2. Static Opponent Modelling

Although the DSS incorporates opponent information, this is primarily in aggregated form. The system does not yet track real-time variations such as formation changes, substitutions, shifts in pressing intensity, or fluctuations in physical condition. In realistic settings, even subtle adjustments—lowering the defensive line, introducing a fast winger—may substantially modify the suitability of a recommended strategy.

7.1.3. Linear Distance Assumptions

The system uses Euclidean distance with linear contextual weighting, assuming additive and independent attribute interactions. Football dynamics, however, involve non-linear synergies: small reductions in stamina can disproportionately undermine high pressing; morale and technical quality interact non-linearly in high-pressure phases. Linear metrics may therefore smooth over transitions that are tactically sharp in practice.

7.1.4. Absence of Operational Constraints

Strategies are encoded as abstract semantic profiles, independent of players actually available. A strategy may appear semantically optimal yet be operationally infeasible—for example, high-width play without fast wide players, or vertical transitions requiring decision-making attributes absent from the current lineup.

7.1.5. User-Facing Interpretability

Despite diagnostic tools (radar charts, sensitivity curves, ablation tests), the prototype remains oriented toward analytically trained users. Real-time decision-makers may require more compact, narrative-style explanations or simplified dashboards suited to the pace of live matches.

These limitations define the development priorities addressed in the following section.

8. Conclusions and Future Work

This work introduced a Decision Support System for context-aware football strategy selection, grounded in a semantic model that represents both teams and strategies as vectors in a shared 14-dimensional attribute space. The adjusted semantic-distance metric combines static team–strategy compatibility with dynamic contextual factors—match time, score state, residual energy, and opponent characteristics—controlled by explicit weighting functions documented in full (Appendix A).

Evaluation through synthetic scenarios demonstrated internal consistency: the DSS produces recommendations that align with tactical intuition, responds appropriately to contextual variations, and provides interpretable diagnostics. A pilot study with real match data established feasibility of processing observational inputs, though the single-match sample and partial attribute coverage preclude claims of generalizability. Critically, these results demonstrate that the system behaves as designed; whether DSS recommendations would improve actual coaching decisions or match outcomes remains an open empirical question requiring prospective validation.

8.1. Summary of Contributions

The principal contributions of this work are:

1. A semantic formalization of football tactics, encoding both team states and strategy templates as vectors in a shared attribute space amenable to geometric comparison.
2. An adaptive distance metric that dynamically reweights attributes based on match context (energy, time pressure, opponent gaps), with explicit, reproducible formulas (Section 3.6.2, Appendix A.2).
3. Diagnostic interpretability tools (radar charts, sensitivity analysis, robustness testing, ablation studies) that expose the reasoning behind recommendations and enable systematic evaluation.
4. A fully auditable pipeline with complete formal specifications, code availability, and localization protocols that support independent replication and verification.
5. Preliminary real-data application via a pilot study, demonstrating feasibility (though not yet validity) of processing observational match data.

8.2. Future Directions

The limitations identified in Section 7 motivate several development trajectories, organized from near-term engineering enhancements to longer-term conceptual extensions.

8.2.1. Advanced Data Integration and Modeling

Two complementary directions would evolve the DSS from a prototype into a robust tool:

Real-time data integration and automation: Connecting the DSS to live data streams from commercial tracking providers (Wyscout, StatsBomb, Opta) and GPS systems would automate team profiling and dynamically update opponent behaviour (e.g., line height, possession structure), directly addressing the static-opponent limitation. Supplementing this with NLP modules to parse tactical reports would allow the strategy library to be expanded via natural-language queries (e.g., “compact defence with fast diagonal transitions”). Furthermore, the current prototype operates in batch mode; a natural extension would implement an event-driven architecture with a continuous listening loop, ingesting match data from structured files (JSON, CSV) or live feeds (wearable sensors, video tagging systems, coaching dashboards) and producing updated recommendations as play unfolds.
Stable profiling via historical priors and Bayesian updating: To complement real-time data and prevent overreaction to transient match fluctuations, the attribute model should incorporate historical priors. Baseline distributions for macro-attributes (e.g., a team’s average pressing intensity or defensive solidity) would be derived from historical season data. These priors would then be updated in a Bayesian framework as in-match events accumulate, yielding more stable and reliable profiles early in a game while remaining adaptable to genuine tactical shifts. Public datasets such as StatsBomb Open Data [27] provide an ideal foundation for calibrating these priors and validating the system.
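
One way the proposed Bayesian updating could work is a conjugate Beta-Binomial model for an attribute expressed as a success rate (for example, the share of pressing actions that recover the ball). This is a sketch of one possible design, not part of the current prototype: the function name `beta_update`, the choice of a Beta prior, and all numbers are hypothetical.

```python
def beta_update(prior_mean: float, prior_strength: float,
                successes: int, failures: int) -> float:
    """Posterior mean of a Beta prior after observing binary in-match events.

    prior_mean     : historical season average of the attribute, in (0, 1)
    prior_strength : pseudo-observation count; larger values make the prior harder to move
    successes      : in-match events that count in favor of the attribute
    failures       : in-match events that count against it
    """
    if not 0.0 < prior_mean < 1.0 or prior_strength <= 0:
        raise ValueError("prior_mean must be in (0, 1) and prior_strength positive")
    alpha = prior_mean * prior_strength + successes
    beta = (1.0 - prior_mean) * prior_strength + failures
    return alpha / (alpha + beta)


# Hypothetical case: season average 0.60, then 2 successful presses out of 8 early in a match.
posterior = beta_update(prior_mean=0.60, prior_strength=20, successes=2, failures=6)
print(round(posterior, 3))
```

In this example the raw in-match rate is 0.25, but the posterior moves only to about 0.50, which illustrates how a prior damps early-match noise while still shifting as evidence accumulates.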

8.2.2. Non-Linear and Hybrid Metrics

Exploring alternatives to Euclidean distance—Mahalanobis distance, kernel-based metrics, or learned embeddings—could capture the non-linear attribute interactions observed in football. A hybrid approach might combine Euclidean distance for capability matching with cosine similarity for stylistic profiling, offering coaches multiple analytical lenses. Additionally, strategy-specific weighting of team–opponent ratios could capture the intuition that attribute differentials matter unequally across tactics: midfield control gaps are critical for possession-based systems but less relevant for direct counterattacking, whereas transition speed differentials show the reverse pattern.
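
The difference between the two lenses in the proposed hybrid can be seen in a small sketch. The blend below (a convex combination controlled by `lam`) is only one hypothetical way to combine them; the function names and example vectors are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Capability gap: sensitive to differences in attribute magnitude."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Stylistic similarity: compares the shape of two profiles, ignoring overall scale."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def hybrid_score(team, strategy, lam=0.5):
    """Lower is better: lam weights the capability gap, (1 - lam) the stylistic mismatch."""
    return lam * euclidean(team, strategy) + (1 - lam) * (1 - cosine_similarity(team, strategy))


# Hypothetical profiles with the same shape but different magnitude.
team, template = [0.4, 0.2], [0.8, 0.4]
print(round(euclidean(team, template), 3))
print(round(cosine_similarity(team, template), 3))
print(round(hybrid_score(team, template), 3))
```

Here the team has the same stylistic shape as the template (cosine similarity close to 1) but roughly half its capability level, so the Euclidean term alone carries the mismatch.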

8.2.3. Multi-Objective Optimization

Extending the model beyond semantic fit to incorporate physical risk indicators (fatigue accumulation, injury probability), expected-threat contributions, and coach-preference profiles (aggressive vs. conservative) would yield a richer decision landscape. Pareto-optimal strategy sets could be presented, allowing coaches to navigate trade-offs explicitly.
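
A Pareto-optimal set can be extracted with a simple dominance check. The sketch below assumes two objectives that are both minimized (semantic distance and a fatigue-risk score); the scores are invented for illustration and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical (semantic distance, fatigue risk) pairs; lower is better for both.
candidates = {
    "High Press": (0.30, 0.80),
    "Build-up Play": (0.35, 0.40),
    "Classic Catenaccio": (0.60, 0.20),
    "Long Ball to Target": (0.65, 0.45),
}
front = pareto_front(candidates)
```

In this toy example "Long Ball to Target" is dropped because "Build-up Play" is better on both objectives; the three remaining strategies represent genuine trade-offs that a coach would have to weigh.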

8.2.4. Predictive Simulation

Incorporating Bayesian networks, Markov processes, or Monte Carlo simulations would enable what-if testing—evaluating alternative strategies and substitutions before committing. This would transform the DSS from a diagnostic tool into a predictive one, supporting pre-match preparation as well as in-game decisions.

8.2.5. Interactive Coaching Interface

A dashboard integrating radar charts, sensitivity curves, and robustness metrics—with sliders for coach-defined preferences (risk level, pressing intensity, possession–transition balance)—would support real-time, minute-by-minute strategy updates. Natural-language explanations (“why this strategy is recommended now”) and counterfactual exploration (“what if we substitute player X?”) would bridge the gap between analytical depth and operational usability.

8.2.6. Validation with Professional Data

Transitioning from simulated tests to real competitions using professional datasets would provide rigorous external validation. Concrete KPIs—expected goals conceded, shot quality, pressing recoveries—could benchmark DSS recommendations against actual coaching decisions, quantifying added value and identifying failure modes.

8.2.7. Extension to Other Team Sports

The semantic-distance paradigm is not football-specific. Any domain where heterogeneous agents pursue collective objectives against an adaptive opponent admits the same formalization: a shared attribute space, a library of strategy templates, and a distance metric modulated by contextual pressure. Candidate sports include basketball, rugby, American football, ice hockey, and water polo. Of particular interest are mixed human–robotic teams, such as those competing in RoboCup leagues, where artificial players exhibit well-defined, quantifiable capability profiles that map naturally onto macro-attribute vectors.

8.2.8. From Strategy Selection to Strategy Synthesis

The current DSS recommends a single best-matching strategy, but real tactical situations often call for hybrid approaches blending elements from multiple templates. Recent work on entangled heuristics for agent-augmented strategic reasoning [28] offers a natural extension: when several strategies achieve similar semantic distances, the system could compose them via interference-weighted fusion rather than selecting one. That framework models heuristics not as mutually exclusive options but as semantically interrelated potentials synthesized into novel formulations. Transposing this logic to football, a team whose profile activates both “Build-up Play” and “Fast Counterattack” might receive a composed recommendation: controlled possession in midfield with rapid vertical transitions when space opens—a hybrid that neither template captures alone.
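
To make the idea of composing templates concrete, the sketch below blends two strategy vectors from Table A2 with softmax weights derived from their distances. This is a deliberately simple stand-in and not the interference-weighted fusion of [28]; the `blend` function, the temperature parameter, and the example distances are assumptions.

```python
import numpy as np

# Strategy vectors copied from Table A2 (A1..A14).
build_up = np.array([0.8, 0.5, 0.7, 0.5, 0.4, 0.6, 0.7, 0.6, 0.8, 0.7, 0.8, 0.8, 0.6, 0.8])
counter = np.array([0.9, 0.6, 0.5, 0.9, 0.5, 0.6, 0.7, 0.8, 0.7, 0.8, 0.6, 0.7, 0.8, 0.6])


def blend(templates, distances, temperature=0.05):
    """Softmax-weighted average of templates; smaller distance means larger weight."""
    d = np.asarray(distances, dtype=float)
    logits = -(d - d.min()) / temperature
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights, weights @ np.vstack(templates)


# Hypothetical case: both templates are equally close to the team profile.
weights, hybrid = blend([build_up, counter], [0.30, 0.30])
```

When the two distances are equal the hybrid is the plain average of the templates; as one template becomes clearly closer, the weights concentrate on it and the result converges to single-strategy selection.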

8.2.9. Adversarial and Security Domains

Beyond cooperative sports, the methodology extends to explicitly adversarial scenarios. Recent work on multi-drone urban defence [29] models the problem as a Sequential Stackelberg Security Game sharing structural parallels with ours: spatial decomposition, capability-based profiling, utility-driven strategy selection, and a probabilistic presence parameter analogous to our context weights. Our semantic-matching approach could complement such game-theoretic methods by guiding within-zone resource deployment when defender assets are heterogeneous. This synergy suggests a broader research programme applying explainable, profile-based decision support to hybrid human–AI security systems.

By combining semantic distance computation with diagnostic interpretability, the DSS offers a framework for supporting complex tactical decisions without replacing coaching expertise. The current work establishes internal consistency, auditability, and feasibility of real-data integration. Operational validity—whether DSS recommendations actually improve coaching decisions and match outcomes—remains the critical open question. Prospective validation studies, expanded datasets, and systematic outcome tracking are required before claims of real-world applicability can be substantiated. If validated, systems of this kind could eventually serve not only professional sports but also defence, robotics, and other settings where heterogeneous teams must coordinate adaptively against strategic adversaries.

Author Contributions

Conceptualization, R.P., M.P. and P.Z.; methodology, R.P. and M.P.; software, A.D.R. and R.P.; validation, A.D.R., M.N. and R.P.; formal analysis, A.D.R. and R.P.; investigation, A.D.R. and R.P.; resources, R.P., R.V. and P.Z.; data curation, A.D.R., M.N., R.P., R.V. and P.Z.; writing—original draft preparation, A.D.R. and R.P.; writing—review and editing, M.N. and R.P.; visualization, A.D.R., R.P., R.V. and P.Z.; supervision, R.P.; project administration, R.P.; funding acquisition, R.P. All authors have read and agreed to the published version of the manuscript.

Funding

Remo Pareschi has been funded by the European Union—NextGenerationEU under the Italian Ministry of University and Research (MUR) National Innovation Ecosystem grant ECS00000041-VITALITY—CUP E13C22001060006.

Data Availability Statement

All relevant data are included in the article. The complete implementation details, including the source code, API documentation, and usage examples, as discussed in Section 5 and Section 6, are available in the public repository at https://github.com/Aribertus/football-dss-semantic-distance (accessed on 24 February 2026).

Acknowledgments

During the preparation of this manuscript, the authors used Claude Opus 4.6 and ChatGPT 5.2 to support the correct use of LaTeX commands and to proofread the article. The authors have reviewed and edited the output and take full responsibility for the content of this publication. The authors are grateful to the anonymous reviewers for their insightful comments, which substantially improved the final version of the article.

Conflicts of Interest

Mattia Neri was employed by Bioretics, Bologna, Italy. Roberto Valtancoli was employed by the Cesena Femminile Football Club, Cesena, Italy. Paolino Zica was employed by Zica Sport, Benevento, Italy. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A. Complete Formal Specification

This appendix provides the complete formal specification required to reproduce all experiments reported in this paper. All formulas correspond exactly to the implementation in the public repository.

Appendix A.1. Player Attribute Generation

Player-level attributes are generated from role-specific Gaussian distributions. For each role $r \in \{GK, CB, FB, CM, FW\}$ and attribute $a$, values are sampled as:

p_{r, a} = \mathrm{clip}(\mathcal{N}(μ_{r, a}, σ_{r, a}^{2}), 0, 1)

Table A1 specifies the distribution parameters $(μ, σ)$ for all role–attribute combinations.

Table A1. Player attribute distribution parameters by role. Each cell shows $(μ, σ)$.

Attribute	GK	CB	FB	CM	FW
reflexes	(0.85, 0.05)	(0.30, 0.10)	(0.40, 0.10)	(0.40, 0.10)	(0.40, 0.10)
aerial_duels	(0.80, 0.05)	(0.90, 0.05)	(0.70, 0.10)	(0.70, 0.10)	(0.65, 0.10)
passing	(0.65, 0.10)	(0.70, 0.10)	(0.75, 0.05)	(0.80, 0.05)	(0.70, 0.10)
speed	(0.40, 0.10)	(0.50, 0.10)	(0.75, 0.10)	(0.70, 0.10)	(0.80, 0.05)
stamina	(0.70, 0.05)	(0.75, 0.05)	(0.80, 0.05)	(0.80, 0.05)	(0.80, 0.05)
resilience	(0.80, 0.05)	(0.80, 0.05)	(0.75, 0.05)	(0.80, 0.05)	(0.70, 0.05)
dribbling	(0.30, 0.10)	(0.50, 0.10)	(0.70, 0.05)	(0.75, 0.05)	(0.85, 0.05)
tackling	(0.20, 0.10)	(0.85, 0.05)	(0.70, 0.10)	(0.70, 0.10)	(0.40, 0.10)
interceptions	(0.30, 0.10)	(0.80, 0.05)	(0.70, 0.10)	(0.75, 0.10)	(0.40, 0.10)
xG	(0.00, 0.00)	(0.10, 0.05)	(0.20, 0.10)	(0.40, 0.10)	(0.85, 0.05)
xA	(0.20, 0.10)	(0.20, 0.10)	(0.50, 0.10)	(0.70, 0.10)	(0.60, 0.10)
aggression	(0.60, 0.10)	(0.80, 0.05)	(0.70, 0.10)	(0.75, 0.10)	(0.75, 0.10)
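
The sampling rule can be sketched as follows, using a subset of the $(μ, σ)$ pairs from Table A1. This is an illustrative sketch rather than the repository code: it uses NumPy's `Generator` API, so it will not reproduce the exact draws of an implementation seeded through `np.random.seed()`, and the `sample_player` helper is hypothetical.

```python
import numpy as np

# (mu, sigma) pairs copied from Table A1; only three attributes and two roles for brevity.
PARAMS = {
    "GK": {"reflexes": (0.85, 0.05), "speed": (0.40, 0.10), "xG": (0.00, 0.00)},
    "FW": {"reflexes": (0.40, 0.10), "speed": (0.80, 0.05), "xG": (0.85, 0.05)},
}


def sample_player(role, rng):
    """Draw each attribute from N(mu, sigma^2) and clip the result to [0, 1]."""
    return {
        attribute: float(np.clip(rng.normal(mu, sigma), 0.0, 1.0))
        for attribute, (mu, sigma) in PARAMS[role].items()
    }


rng = np.random.default_rng(41)  # seed value chosen for illustration
goalkeeper = sample_player("GK", rng)
forward = sample_player("FW", rng)
```

Clipping keeps every attribute inside the unit interval, and a zero standard deviation (goalkeeper xG) makes the draw deterministic.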

Appendix A.2. Macro-Attribute Aggregation Formulas

Each macro-attribute $A_{j}$ is computed from player-level attributes using the following formulas. Let $P_{r} = \{p : p.\mathrm{role} = r\}$ denote the set of players with role $r$.

A₁: Offensive Strength.

Computed from forwards and central midfielders:

A_{1} = 0.7 \cdot {\bar{xG}}_{P_{FW} \cup P_{CM}} + 0.3 \cdot {\bar{dribbling}}_{P_{FW} \cup P_{CM}}

A₂: Defensive Strength.

Computed from goalkeeper and center backs:

A_{2} = 0.7 \cdot {\bar{reflexes}}_{P_{GK} \cup P_{CB}} + 0.3 \cdot {\bar{tackling}}_{P_{GK} \cup P_{CB}}

A₃: Midfield Control.

Computed from fullbacks and central midfielders:

A_{3} = 0.7 \cdot {\bar{xA}}_{P_{FB} \cup P_{CM}} + 0.3 \cdot {\bar{speed}}_{P_{FB} \cup P_{CM}}

A₄: Transition Speed.

Computed from central midfielders, forwards, and fullbacks:

A_{4} = 0.7 \cdot {\bar{speed}}_{P_{CM} \cup P_{FW} \cup P_{FB}} + 0.3 \cdot {\bar{stamina}}_{P_{CM} \cup P_{FW} \cup P_{FB}}

A₅: High Press Capability.

Computed from all outfield players:

A_{5} = 0.7 \cdot {\bar{tackling}}_{P_{FW} \cup P_{CM} \cup P_{FB} \cup P_{CB}} + 0.3 \cdot {\bar{interceptions}}_{P_{FW} \cup P_{CM} \cup P_{FB} \cup P_{CB}}

A₆: Width Utilization.

Computed from central midfielders and fullbacks:

A_{6} = 0.7 \cdot {\bar{xA}}_{P_{CM} \cup P_{FB}} + 0.3 \cdot {\bar{stamina}}_{P_{CM} \cup P_{FB}}

A₇: Psychological Resilience.

Computed from all players:

A_{7} = 0.7 \cdot {\bar{resilience}}_{all} + 0.3 \cdot {\bar{aggression}}_{all}

A₈: Residual Energy.

Computed from all players:

A_{8} = 0.7 \cdot {\bar{stamina}}_{all} + 0.3 \cdot {\bar{resilience}}_{all}

A₉: Team Morale.

Computed from all players:

A_{9} = 0.6 \cdot {\bar{resilience}}_{all} + 0.4 \cdot {\bar{aggression}}_{all}

A₁₀: Time Management.

Computed from experienced positions (GK, CM, FB):

A_{10} = 0.5 \cdot {\bar{interceptions}}_{P_{GK} \cup P_{CM} \cup P_{FB}} + 0.5 \cdot {\bar{passing}}_{P_{GK} \cup P_{CM} \cup P_{FB}}

A₁₁: Tactical Cohesion.

Computed from all players:

A_{11} = 0.6 \cdot {\bar{passing}}_{all} + 0.4 \cdot {\bar{xA}}_{all}

A₁₂: Technical Base.

Mean of all technical attributes across all players:

A_{12} = \frac{1}{| P | \cdot 7} \sum_{p \in P} \sum_{a \in T} p_{a}, \quad T = \{reflexes, passing, dribbling, tackling, interceptions, xG, xA\}

A₁₃: Physical Base.

Mean of all physical attributes $F$ across all players $P$:

A_{13} = \frac{1}{| P | \cdot 4} \sum_{p \in P} \sum_{a \in F} p_{a}, \quad F = \{aerial\_duels, speed, stamina, aggression\}

A₁₄: Relational Cohesion.

Estimated via uniform random draw (qualitative proxy):

A_{14} \sim U (0.5, 0.9)
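
Two of the aggregation formulas above ($A_{1}$ and $A_{7}$) can be sketched in a few lines. The three-player squad is a hypothetical toy input whose values are the role means from Table A1, and the helper names are illustrative rather than taken from the repository. Role-restricted means are computed over the pooled union of the listed roles.

```python
from statistics import mean


def role_mean(players, roles, attribute):
    """Mean of one attribute over all players whose role is in `roles`."""
    return mean(p[attribute] for p in players if p["role"] in roles)


def offensive_strength(players):
    """A1 = 0.7 * mean xG + 0.3 * mean dribbling over forwards and central midfielders."""
    roles = {"FW", "CM"}
    return 0.7 * role_mean(players, roles, "xG") + 0.3 * role_mean(players, roles, "dribbling")


def psychological_resilience(players):
    """A7 = 0.7 * mean resilience + 0.3 * mean aggression over all players."""
    return (0.7 * mean(p["resilience"] for p in players)
            + 0.3 * mean(p["aggression"] for p in players))


# Hypothetical toy squad (one player per role, attribute values set to Table A1 means).
players = [
    {"role": "FW", "xG": 0.85, "dribbling": 0.85, "resilience": 0.70, "aggression": 0.75},
    {"role": "CM", "xG": 0.40, "dribbling": 0.75, "resilience": 0.80, "aggression": 0.75},
    {"role": "CB", "xG": 0.10, "dribbling": 0.50, "resilience": 0.80, "aggression": 0.80},
]
a1 = offensive_strength(players)
a7 = psychological_resilience(players)
```

The centre back contributes to $A_{7}$ but not to $A_{1}$, which shows how the role filter shapes each macro-attribute.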

Appendix A.3. Complete Strategy Vector Specifications

Table A2 presents the complete 14-dimensional vector specifications for all 20 tactical strategies.

Table A2. Complete strategy vector specifications. Values represent attribute importance on a $[0.2, 0.9]$ scale.

Strategy	$A_{1}$	$A_{2}$	$A_{3}$	$A_{4}$	$A_{5}$	$A_{6}$	$A_{7}$	$A_{8}$	$A_{9}$	$A_{10}$	$A_{11}$	$A_{12}$	$A_{13}$	$A_{14}$
Offensive Systems
Build-up Play	0.8	0.5	0.7	0.5	0.4	0.6	0.7	0.6	0.8	0.7	0.8	0.8	0.6	0.8
Fast Counterattack	0.9	0.6	0.5	0.9	0.5	0.6	0.7	0.8	0.7	0.8	0.6	0.7	0.8	0.6
Long Ball to Target	0.8	0.6	0.5	0.6	0.4	0.4	0.6	0.7	0.6	0.7	0.5	0.5	0.8	0.5
Late Midfield Runners	0.8	0.5	0.6	0.7	0.5	0.5	0.6	0.7	0.7	0.6	0.7	0.7	0.7	0.6
Systematic Crossing	0.7	0.5	0.6	0.6	0.5	0.9	0.7	0.7	0.7	0.6	0.7	0.7	0.7	0.6
Overlapping Flanks	0.7	0.5	0.7	0.7	0.5	0.9	0.7	0.8	0.8	0.7	0.8	0.7	0.8	0.7
Quick Rotations	0.8	0.5	0.7	0.8	0.6	0.7	0.8	0.7	0.8	0.7	0.9	0.7	0.8	0.7
Direct Vertical Attack	0.9	0.5	0.5	0.8	0.5	0.6	0.7	0.7	0.7	0.7	0.6	0.7	0.8	0.6
Defensive Structures
Classic Catenaccio	0.4	0.9	0.7	0.3	0.2	0.3	0.8	0.7	0.7	0.9	0.8	0.6	0.6	0.7
Positional Defense	0.4	0.9	0.8	0.3	0.2	0.3	0.7	0.6	0.6	0.9	0.8	0.6	0.5	0.7
Compact Zonal Defense	0.5	0.9	0.8	0.4	0.4	0.4	0.7	0.6	0.7	0.8	0.9	0.7	0.6	0.7
Strict Man-Marking	0.5	0.9	0.7	0.5	0.5	0.3	0.7	0.7	0.6	0.8	0.8	0.7	0.7	0.7
Offside Trap	0.5	0.8	0.7	0.5	0.6	0.4	0.7	0.7	0.7	0.8	0.8	0.7	0.7	0.7
Pressing Variants
High Press	0.7	0.8	0.6	0.9	0.9	0.5	0.8	0.7	0.8	0.6	0.9	0.7	0.8	0.8
Gegenpressing	0.7	0.8	0.6	0.8	0.9	0.5	0.8	0.7	0.8	0.6	0.9	0.7	0.8	0.8
Midfield Pressing	0.6	0.7	0.7	0.7	0.7	0.4	0.7	0.7	0.7	0.7	0.8	0.7	0.7	0.7
Inducing Build-up Errors	0.7	0.8	0.6	0.8	0.9	0.4	0.7	0.7	0.8	0.6	0.8	0.7	0.7	0.8
Possession/Control
Extended Possession	0.7	0.7	0.9	0.5	0.5	0.6	0.8	0.7	0.8	0.7	0.9	0.8	0.6	0.8
Cautious Horizontal	0.5	0.7	0.8	0.4	0.3	0.5	0.7	0.7	0.8	0.7	0.8	0.7	0.5	0.7
Central Block + Breaks	0.7	0.8	0.7	0.7	0.7	0.5	0.7	0.7	0.7	0.7	0.8	0.7	0.7	0.7
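
As a rough illustration of how these vectors are used, the sketch below ranks three strategies from Table A2 by plain Euclidean distance to a team profile. It deliberately omits the context weights, multipliers, and opponent penalty of the adapted distance described in the main text; the team vector is hypothetical and `rank_strategies` is an illustrative helper.

```python
import numpy as np

# Three rows copied from Table A2 (A1..A14).
STRATEGIES = {
    "Build-up Play":      [0.8, 0.5, 0.7, 0.5, 0.4, 0.6, 0.7, 0.6, 0.8, 0.7, 0.8, 0.8, 0.6, 0.8],
    "High Press":         [0.7, 0.8, 0.6, 0.9, 0.9, 0.5, 0.8, 0.7, 0.8, 0.6, 0.9, 0.7, 0.8, 0.8],
    "Positional Defense": [0.4, 0.9, 0.8, 0.3, 0.2, 0.3, 0.7, 0.6, 0.6, 0.9, 0.8, 0.6, 0.5, 0.7],
}


def rank_strategies(team, strategies):
    """Return (name, distance) pairs sorted from closest to farthest."""
    team = np.asarray(team, dtype=float)
    distances = {
        name: float(np.linalg.norm(team - np.asarray(vector, dtype=float)))
        for name, vector in strategies.items()
    }
    return sorted(distances.items(), key=lambda item: item[1])


# Hypothetical team profile: strong defensively, weak in transition and pressing.
team = [0.45, 0.85, 0.8, 0.3, 0.25, 0.3, 0.7, 0.6, 0.6, 0.9, 0.8, 0.6, 0.5, 0.7]
ranking = rank_strategies(team, STRATEGIES)
```

The closest template is returned first, and the per-attribute differences behind each distance are what the diagnostic layer would surface to the coach.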

Strategy Vector Construction Protocol

Strategy vectors were constructed through a formal four-stage expert elicitation process designed to maximize reliability and transparency. Full details are provided below; the main text (Section 3.4) summarizes this protocol.

Stage 1: Expert Panel and Training. Three domain experts participated in the elicitation:

Rater A: Academic researcher with expertise in performance analysis and tactical periodization.
Rater B: Experienced football coach with background in youth academy and semi-professional coaching; familiar with tactical analysis.
Rater C: Practitioner with experience in match analysis and video-based tactical coding.

Prior to rating, all experts completed a 45-min calibration session covering: (i) definitions of all 14 macro-attributes with positive and negative examples, (ii) the five-level rating scale with anchor examples, and (iii) practice ratings on three “calibration strategies” not included in the final set.

Stage 2: Independent Rating. Each expert independently rated all 280 strategy–attribute pairs (20 strategies × 14 attributes) using a secure online form. The rating scale was:

Table A3. Rating scale for expert elicitation of strategy vectors.

Level	Definition	Numerical Range
Irrelevant	Attribute has no bearing on strategy success	$0.20$ – $0.30$
Low	Attribute provides minor benefit	$0.40$ – $0.50$
Moderate	Attribute contributes meaningfully	$0.50$ – $0.60$
High	Attribute is important for effectiveness	$0.70$ – $0.80$
Critical	Attribute is essential; deficit causes failure	$0.80$ – $0.90$

Rating took 2–3 h per expert, completed over multiple sessions within one week.

Stage 3: Inter-Rater Reliability. Agreement was assessed using three metrics:

Exact agreement: 163/280 pairs (58.2%) had identical ratings from all three experts.
Within-one-level agreement: 251/280 pairs (89.6%) had all ratings within one ordinal level.
Krippendorff’s alpha: $α = 0.71$ (ordinal), indicating substantial reliability.

Disagreement was concentrated in psychological attributes ($A_{7}$, $A_{9}$: $α = 0.58$) and organizational attributes ($A_{10}$, $A_{14}$: $α = 0.62$). Technical/tactical attributes showed higher agreement ($A_{1}$–$A_{6}$: $α = 0.79$).
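
The two simpler agreement measures (exact and within-one-level) can be computed directly from the rating matrix, as in the sketch below. The ratings are invented for illustration, with levels coded 0 (Irrelevant) to 4 (Critical); Krippendorff's alpha is not computed here.

```python
import numpy as np

# Hypothetical ordinal ratings: rows are the three raters, columns are strategy-attribute pairs.
ratings = np.array([
    [4, 3, 2, 0, 3],
    [4, 3, 1, 2, 3],
    [4, 2, 2, 0, 3],
])

# Spread = highest minus lowest level assigned to each pair.
spread = ratings.max(axis=0) - ratings.min(axis=0)

exact_agreement = float(np.mean(spread == 0))       # all three raters identical
within_one_agreement = float(np.mean(spread <= 1))  # all ratings within one level
```

Pairs with a spread of two or more levels are the ones that would be flagged for discussion in the reconciliation stage.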

Stage 4: Reconciliation and Final Assignment. Discrepancies were resolved as follows:

1. Minor discrepancies (spread $\leq 1$ level): Median rating adopted; numerical value set to midpoint of corresponding range.
2. Major discrepancies (spread $\geq 2$ levels): 42 pairs (15%) were flagged for discussion. In a 90-min reconciliation session, experts presented reasoning and reached consensus (38 pairs) or majority decision (4 pairs).
3. Final numerical assignment: Within-range values were assigned to maximize differentiation between strategies with the same qualitative level (e.g., two “High” ratings might yield 0.75 vs. 0.80 based on discussion nuance).
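
The first two reconciliation rules can be sketched as a small function, assuming three raters and levels coded 0 (Irrelevant) to 4 (Critical) with the numerical ranges of Table A3. The within-range fine-tuning of step 3 relies on expert discussion and is not modelled; the `reconcile` helper is hypothetical.

```python
from statistics import median

# Numerical range for each ordinal level, as listed in Table A3.
RANGES = {
    0: (0.20, 0.30),  # Irrelevant
    1: (0.40, 0.50),  # Low
    2: (0.50, 0.60),  # Moderate
    3: (0.70, 0.80),  # High
    4: (0.80, 0.90),  # Critical
}


def reconcile(levels):
    """Return (value, needs_discussion) for one strategy-attribute pair rated by three experts."""
    if max(levels) - min(levels) >= 2:
        # Major discrepancy: left to the reconciliation session, no automatic value.
        return None, True
    low, high = RANGES[int(median(levels))]
    return round((low + high) / 2, 3), False


minor = reconcile([3, 3, 4])   # spread of one level: median "High", midpoint of its range
major = reconcile([0, 2, 1])   # spread of two levels: flagged for discussion
```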

Provenance and Audit Trail. The following materials are available in the supplementary repository:

Raw rating matrices from all three experts (anonymized as Rater A/B/C)
Reconciliation log with justifications for all 42 discussed pairs
Calibration materials (attribute definitions, anchor examples, practice strategies)
Face-validity review forms from the two independent validators

Limitations and Bias Acknowledgment. Despite the structured protocol, strategy vectors remain partly subjective:

Cultural bias: All experts had European football backgrounds; strategy interpretations may differ in other football cultures (e.g., South American, Asian).
Era effects: Vectors reflect tactical understanding circa 2023–2024; the evolving nature of football tactics may require periodic re-elicitation.
Granularity limits: The five-level scale may not capture fine distinctions; future work could use continuous scales with more extensive calibration.

The sensitivity analyses in Section 5.3.2 demonstrate that recommendations are stable under $\pm 5\%$ perturbations to strategy vectors, suggesting that modest rating uncertainty does not substantially affect DSS output.

Appendix A.4. Scenario Specifications

Table A4 provides the exact parameter values for each test scenario.

Table A4. Complete scenario parameter specifications.

Scenario	$A_{8}$	$Δ_{tech}$	$Δ_{phys}$	t	s	Morale
1. Energetic & Balanced	0.80	0.00	0.00	0.70	0	0.75
2. Fatigued & Inferior	0.30	$- 0.15$	$- 0.10$	0.50	0	0.50
3. High Temporal Pressure	0.55	$- 0.05$	0.00	0.15	$- 1$	0.65
4. Tech. & Phys. Superiority	0.65	$+ 0.20$	$+ 0.15$	0.60	0	0.70

Appendix A.5. Implementation Configuration

Random seed: SEED = 41 (set via np.random.seed() and random.seed())
Python version: 3.10+
Dependencies: numpy, pandas, matplotlib (see requirements.txt)
Default formation: Team 1: 4-3-3 (1 GK, 2 CB, 2 FB, 3 CM, 3 FW); Team 2: 5-3-2 (1 GK, 5 CB, 2 FB, 2 CM, 1 FW)
Opponent penalty $α$ : 0.20 (default); sensitivity tested over $[0.0, 0.5]$
Multiplier bounds: $m_{min} = 0.3$ , $m_{max} = 2.5$ (clamping applied per Section 3.6.2)
Robustness trials: $N = 100$ Monte Carlo simulations per scenario
Noise level: $σ = 0.05$ (5% perturbation)
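
A robustness trial of this kind can be sketched as below, using the listed trial count and noise level as defaults. The noise model (additive Gaussian noise on the team vector, clipped to $[0, 1]$), the plain Euclidean matching, and the two three-dimensional templates are assumptions made for illustration; the repository may perturb and match differently.

```python
import numpy as np


def top_strategy(team, strategies):
    """Name of the template with the smallest Euclidean distance to the team vector."""
    return min(strategies, key=lambda name: np.linalg.norm(team - strategies[name]))


def stability(team, strategies, n_trials=100, sigma=0.05, seed=41):
    """Share of noisy trials in which the top-ranked strategy is unchanged."""
    rng = np.random.default_rng(seed)
    baseline = top_strategy(team, strategies)
    hits = 0
    for _ in range(n_trials):
        noisy = np.clip(team + rng.normal(0.0, sigma, size=team.shape), 0.0, 1.0)
        hits += top_strategy(noisy, strategies) == baseline
    return baseline, hits / n_trials


# Hypothetical low-dimensional templates and team profile.
strategies = {
    "Template A": np.array([0.9, 0.2, 0.5]),
    "Template B": np.array([0.2, 0.9, 0.5]),
}
team = np.array([0.85, 0.25, 0.50])
baseline, share = stability(team, strategies)
```

A share close to 1 indicates that the recommendation survives the assumed measurement noise; lower values flag scenarios where two templates are nearly tied.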

Appendix A.6. Code Availability

The complete implementation is available at: https://github.com/Aribertus/football-dss-semantic-distance (accessed on 24 February 2026).

The repository contains:

football_strategy_generation_1_3_1.py: Core DSS implementation (1002 lines)
make_figures.py: Reproducible figure generation (283 lines)
compute_pilot_distances.py: Pilot validation computations (350 lines)
requirements.txt: Dependency specifications
README.md: Usage instructions and quick-start guide

Running python football_strategy_generation_1_3_1.py regenerates all experimental results and figures reported in Section 5.

Appendix A.7. Full Correlation Matrix and Multicollinearity Analysis

Table A5 presents the complete $14 \times 14$ pairwise correlation matrix for macro-attributes computed from 500 synthetic team profiles. Correlations $| r | > 0.5$ are highlighted in bold.

Table A5. Full pairwise correlation matrix for macro-attributes ($n = 500$ synthetic teams). Bold indicates $| r | > 0.5$.

	$A_{1}$	$A_{2}$	$A_{3}$	$A_{4}$	$A_{5}$	$A_{6}$	$A_{7}$	$A_{8}$	$A_{9}$	$A_{10}$	$A_{11}$	$A_{12}$	$A_{13}$	$A_{14}$
$A_{1}$	1.00	0.08	0.21	0.18	0.12	0.15	0.04	0.09	0.05	0.11	0.19	0.22	0.14	0.02
$A_{2}$	0.08	1.00	0.06	0.11	0.31	0.04	0.07	0.10	0.06	0.18	0.09	0.15	0.21	0.01
$A_{3}$	0.21	0.06	1.00	0.35	0.12	0.90	0.05	0.00	0.05	0.42	0.38	0.29	0.16	0.03
$A_{4}$	0.18	0.11	0.35	1.00	0.28	0.41	0.03	0.34	0.02	0.22	0.18	0.21	0.48	0.01
$A_{5}$	0.12	0.31	0.12	0.28	1.00	0.09	0.08	0.15	0.07	0.31	0.14	0.19	0.36	0.02
$A_{6}$	0.15	0.04	0.90	0.41	0.09	1.00	0.04	0.15	0.04	0.35	0.32	0.24	0.22	0.02
$A_{7}$	0.04	0.07	0.05	0.03	0.08	0.04	1.00	0.32	0.98	0.06	0.11	0.08	0.28	0.01
$A_{8}$	0.09	0.10	0.00	0.34	0.15	0.15	0.32	1.00	0.26	0.12	0.14	0.11	0.52	0.02
$A_{9}$	0.05	0.06	0.05	0.02	0.07	0.04	0.98	0.26	1.00	0.05	0.09	0.07	0.25	0.01
$A_{10}$	0.11	0.18	0.42	0.22	0.31	0.35	0.06	0.12	0.05	1.00	0.41	0.38	0.18	0.03
$A_{11}$	0.19	0.09	0.38	0.18	0.14	0.32	0.11	0.14	0.09	0.41	1.00	0.49	0.15	0.02
$A_{12}$	0.22	0.15	0.29	0.21	0.19	0.24	0.08	0.11	0.07	0.38	0.49	1.00	0.18	0.01
$A_{13}$	0.14	0.21	0.16	0.48	0.36	0.22	0.28	0.52	0.25	0.18	0.15	0.18	1.00	0.02
$A_{14}$	0.02	0.01	0.03	0.01	0.02	0.02	0.01	0.02	0.01	0.03	0.02	0.01	0.02	1.00

Appendix A.7.1. Variance Inflation Factors

Table A6 reports the VIF for each attribute. Values above 10 indicate problematic multicollinearity; values above 5 warrant attention.

Table A6. Variance Inflation Factors for all 14 macro-attributes.

Attribute	VIF	Attribute	VIF
$A_{1}$ (Offensive Strength)	1.21	$A_{8}$ (Residual Energy)	1.89
$A_{2}$ (Defensive Strength)	1.34	$A_{9}$ (Team Morale)	37.0
$A_{3}$ (Midfield Control)	21.6	$A_{10}$ (Time Management)	2.14
$A_{4}$ (Transition Speed)	1.72	$A_{11}$ (Tactical Cohesion)	1.68
$A_{5}$ (High Press Capability)	1.41	$A_{12}$ (Technical Base)	1.52
$A_{6}$ (Width Utilization)	18.4	$A_{13}$ (Physical Base)	1.94
$A_{7}$ (Psych. Resilience)	35.1	$A_{14}$ (Relational Cohesion)	1.01
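
For standardized variables, the VIF of attribute $j$ equals the $j$-th diagonal element of the inverse correlation matrix. The sketch below applies this identity to a small hypothetical matrix with one highly correlated pair; it is not a recomputation of Table A6, and `vif_from_correlation` is an illustrative helper.

```python
import numpy as np


def vif_from_correlation(corr):
    """VIF for each variable: the diagonal of the inverse correlation matrix."""
    return np.diag(np.linalg.inv(corr))


# Hypothetical 3-variable example: two variables correlated at r, one independent.
r = 0.98
corr = np.array([
    [1.0, r,   0.0],
    [r,   1.0, 0.0],
    [0.0, 0.0, 1.0],
])
vif = vif_from_correlation(corr)
```

For an isolated pair the VIF reduces to $1 / (1 - r^{2})$, so a correlation of 0.98 alone already pushes both variables well above the threshold of 10, while the independent variable stays at 1.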

Appendix A.7.2. Summary

Four attributes ($A_{3}$, $A_{6}$, $A_{7}$, $A_{9}$) exhibit problematic multicollinearity (VIF $> 10$), forming two correlated pairs: $A_{3}$–$A_{6}$ (both use xA from midfielders/fullbacks) and $A_{7}$–$A_{9}$ (both use resilience and aggression). The remaining 10 attributes have VIF $< 5$, indicating acceptable independence. The correlation between $A_{8}$ and $A_{13}$ ($r = 0.52$) is moderate and reflects the shared stamina input but does not reach problematic levels (VIF $< 2$ for both).

As discussed in Section 3.3.6, we retain all 14 attributes for conceptual completeness and interpretability, acknowledging that future implementations should consider Mahalanobis distance or attribute consolidation when real-world covariance data become available.

References

1. Pollard, R.; Reep, C. Measuring the Effectiveness of Playing Strategies at Soccer. J. R. Stat. Soc. Ser. D Stat. 1997, 46, 541–550.
2. Mackenzie, R.; Cushion, C. Performance analysis in football: A critical review and implications for future research. J. Sports Sci. 2013, 31, 639–676.
3. Sarmento, H.; Marcelino, R.; Anguera, M.T.; Campaniço, J.; Matos, N.; Leitão, J.C. Match analysis in football: A systematic review. J. Sports Sci. 2014, 32, 1831–1843.
4. Weinberg, R.S.; Gould, D. Foundations of Sport and Exercise Psychology, 8th ed.; Human Kinetics: Champaign, IL, USA, 2023.
5. McLean, S.; Salmon, P.M.; Gorman, A.D.; Read, G.J.M.; Solomon, C. What’s in a game? A systems approach to enhancing performance analysis in football. PLoS ONE 2017, 12, e0172565.
6. Rein, R.; Memmert, D. Big data and tactical analysis in elite soccer: Future challenges and opportunities for sports science. J. Sports Sci. 2016, 34, 639–650.
7. Ghisellini, R.; Pareschi, R.; Pedroni, M.; Raggi, G.B. Recommending actionable strategies: A semantic approach to integrating analytical frameworks with decision heuristics. Information 2025, 16, 192.
8. Grehaigne, J.F.; Godbout, P. Tactical Knowledge in Team Sports from a Constructivist and Cognitivist Perspective. Quest 1995, 47, 490–505.
9. He, Q.; Kee, Y.H.; Komar, J. Flexibility, Stability, and Adaptability of Team Playing Style as Key Determinants of Within-Season Performance in Football. In Proceedings of the 9th International Performance Analysis Workshop and Conference & 5th IACSS Conference (PACSS 2021); Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2022; Volume 1426, pp. 69–73.
10. Andrienko, G.; Andrienko, N.; Budziak, G.; Dykes, J.; Fuchs, G.; von Landesberger, T.; Weber, H. Visual analysis of pressure in football. Data Min. Knowl. Discov. 2017, 31, 1793–1839.
11. Bauer, P.; Anzer, G. Data-driven detection of counterpressing in professional football: A supervised machine learning task based on synchronized positional and event data with expert-based feature extraction. Data Min. Knowl. Discov. 2021, 35, 2009–2049.
12. Low, B.; Rein, R.; Schwab, S.; Memmert, D. Defending in 4-4-2 or 5-3-2 formation? Small differences in footballers’ collective tactical behaviours. J. Sports Sci. 2022, 40, 793–805.
13. Gudmundsson, J.; Horton, M. Spatio-temporal analysis of team sports—A survey. ACM Comput. Surv. 2017, 50, 22.
14. Forcher, L.; Forcher, L.; Altmann, S.; Jekauc, D.; Kempe, M. Is a compact organization important for defensive success in elite soccer? Analysis based on player tracking data. Int. J. Sports Sci. Coach. 2024, 19, 757–768.
15. Turney, P.D.; Pantel, P. From Frequency to Meaning: Vector Space Models of Semantics. J. Artif. Intell. Res. 2010, 37, 141–188.
16. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. In Proceedings of the ICLR, Scottsdale, AZ, USA, 2–4 May 2013.
17. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
18. Lin, J. Divergence Measures Based on the Shannon Entropy. IEEE Trans. Inf. Theory 1991, 37, 145–151.
19. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT); Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 4171–4186.
20. Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 3982–3992.
21. Turban, E.; Sharda, R.; Delen, D. Decision Support and Business Intelligence Systems; Pearson: London, UK, 2011.
22. Sanders, D.; Heijboer, M.; Hesselink, M.K.C.; Myers, T.; Akubat, I. Analysing a cycling grand tour: Can we monitor fatigue with intensity or load ratios? J. Sports Sci. 2018, 36, 1385–1391.
23. Goldsberry, K. CourtVision: New Visual and Spatial Analytics for the NBA; MIT Sloan Sports Analytics Conference: Boston, MA, USA, 2012.
24. Pappalardo, L.; Cintia, P.; Rossi, A.; Massucco, E.; Ferragina, P.; Pedreschi, D.; Giannotti, F. A public data set of spatio-temporal match events in soccer competitions. Sci. Data 2019, 6, 236.
25. Fletcher, D.; Sarkar, M. A grounded theory of psychological resilience in Olympic champions. Psychol. Sport Exerc. 2012, 13, 669–678.
26. Krippendorff, K. Content Analysis: An Introduction to Its Methodology, 2nd ed.; Sage Publications: Thousand Oaks, CA, USA, 2004.
27. StatsBomb. StatsBomb Open Data. 2024. Available online: https://github.com/statsbomb/open-data (accessed on 24 February 2026).
28. Ghisellini, R.; Pareschi, R.; Pedroni, M.; Raggi, G.B. From Extraction to Synthesis: Entangled Heuristics for Agent-Augmented Strategic Reasoning. arXiv 2025, arXiv:2507.13768.
29. Mutzari, D.; Deb, T.; Molinaro, C.; Pugliese, A.; Subrahmanian, V.S.; Kraus, S. Defending a City from Multi-Drone Attacks: A Sequential Stackelberg Security Games Approach. arXiv 2025, arXiv:2508.11380.

Figure 1. Context tree structure for two representative macro-attributes. Leaf nodes contain raw observables from match data; intermediate nodes aggregate by functional role; root nodes are the macro-attributes used in semantic distance computation. Edges represent weighted aggregation functions.

Figure 2. System architecture of the tactical decision support prototype. Context signals are aggregated into 14 macro-attributes (team vector), matched to strategy templates via adapted semantic distance, and produce interpretable recommendations and diagnostics.

Figure 3. Example of radar plot for the “Energetic and Balanced” scenario. The shaded blue area represents the team profile, while the orange outline indicates the ideal strategy vector.

Figure 4. Sensitivity of adapted distance $d_{adapt}$ with respect to contextual weight $λ$ across the four scenarios. Smooth trends indicate stability in the optimal strategy selection.

Figure 5. Relative importance of the five most influential macro-attributes across all simulations.

Figure 6. Radar plot comparing the projected halftime team profile (solid blue) with the top three recommended strategies. Build-up Play shows the closest overall alignment, while the team’s high transition speed represents surplus capability relative to this strategy’s demands.

Table 1. Sequential enumeration of all 14 macro-attributes ($A_{1}$–$A_{14}$).

ID	Attribute Name	Category	Variability
$A_{1}$	Offensive Strength	Technical/Tactical	Static
$A_{2}$	Defensive Strength	Technical/Tactical	Static
$A_{3}$	Midfield Control	Technical/Tactical	Static
$A_{4}$	Transition Speed	Technical/Tactical	Semi-dynamic
$A_{5}$	High Press Capability	Technical/Tactical	Context-dependent
$A_{6}$	Width Utilization	Technical/Tactical	Static
$A_{7}$	Psychological Resilience	Psychological	Dynamic
$A_{8}$	Residual Energy	Physical	Dynamic
$A_{9}$	Team Morale	Psychological	Dynamic
$A_{10}$	Time Management	Organizational	Context-dependent
$A_{11}$	Tactical Cohesion	Organizational	Semi-dynamic
$A_{12}$	Technical Base	Physical	Static
$A_{13}$	Physical Base	Physical	Static
$A_{14}$	Relational Cohesion	Organizational	Static

Table 2. Detailed specification of macro-attributes with definitions and aggregation sources.

ID	Attribute Name	Definition & Aggregation Source
Technical/Tactical Dimensions ( $A_{1}$ – $A_{6}$ )
$A_{1}$	Offensive Strength	Capacity to create and convert goal-scoring opportunities. Aggregated from forwards’ and midfielders’ xG, dribbling success, and shot accuracy.
$A_{2}$	Defensive Strength	Ability to prevent opponent attacks and protect the goal. Derived from defenders’ tackling, interceptions, aerial duels, and goalkeeper reflexes.
$A_{3}$	Midfield Control	Dominance in central zones and ability to dictate tempo. Based on central midfielders’ passing accuracy, interceptions, and ball retention.
$A_{4}$	Transition Speed	Capability for rapid phase changes between defense and attack. Computed from speed attributes of forwards, fullbacks, and midfielders, combined with xA.
$A_{5}$	High Press Capability	Aptitude for coordinated pressing in advanced zones. Aggregated from stamina, aggression, and interception rates across all outfield players.
$A_{6}$	Width Utilization	Effectiveness in exploiting wide areas of the pitch. Derived from fullbacks’ and wingers’ crossing, dribbling, and speed attributes.
Psychological/Physical Dimensions ( $A_{7}$ – $A_{9}$ , $A_{12}$ – $A_{13}$ )
$A_{7}$	Psychological Resilience	Mental toughness and ability to perform under pressure. Weighted combination of individual resilience and aggression attributes.
$A_{8}$	Residual Energy	Current stamina reserves across the squad. Computed from stamina values weighted by playing time, with resilience as a moderating factor.
$A_{9}$	Team Morale	Collective motivation and positive emotional state. Derived from resilience and aggression, modulated by match context (score, momentum).
$A_{12}$	Technical Base	Overall technical quality of the squad. Mean of technical attributes (passing, dribbling, first touch, xG, xA) across all players.
$A_{13}$	Physical Base	Overall athletic capacity of the squad. Mean of physical attributes (speed, stamina, aerial ability, aggression) across all players.
Organizational Dimensions ( $A_{10}$ , $A_{11}$ , $A_{14}$ )
$A_{10}$	Time Management	Ability to adapt tactics to match clock pressure. Based on experienced players’ (GK, CM, FB) interception and passing attributes.
$A_{11}$	Tactical Cohesion	Synchronization and coordination between team units. Computed from passing networks, xA distribution, and positional discipline.
$A_{14}$	Relational Cohesion	Stability of internal relationships and group dynamics. Estimated via qualitative assessment or historical team stability indicators.

Table 3. Pairwise correlations among attributes with overlapping inputs ($n = 500$ synthetic teams).

	$A_{3}$	$A_{6}$	$A_{7}$	$A_{8}$	$A_{9}$
$A_{3}$ (Midfield Control)	1.00	0.90	0.05	0.00	0.05
$A_{6}$ (Width Utilization)	0.90	1.00	0.04	0.15	0.04
$A_{7}$ (Psych. Resilience)	0.05	0.04	1.00	0.32	0.98
$A_{8}$ (Residual Energy)	0.00	0.15	0.32	1.00	0.26
$A_{9}$ (Team Morale)	0.05	0.04	0.98	0.26	1.00
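
As a reading aid for Table 3, the following minimal sketch flags attribute pairs whose correlation suggests redundancy. The coefficients are transcribed from the table; the function name `redundant_pairs` and the 0.80 cut-off are illustrative assumptions, not values taken from the paper.

```python
# Upper triangle of Table 3 (n = 500 synthetic teams), transcribed by hand.
CORR = {
    ("A3", "A6"): 0.90, ("A3", "A7"): 0.05, ("A3", "A8"): 0.00, ("A3", "A9"): 0.05,
    ("A6", "A7"): 0.04, ("A6", "A8"): 0.15, ("A6", "A9"): 0.04,
    ("A7", "A8"): 0.32, ("A7", "A9"): 0.98,
    ("A8", "A9"): 0.26,
}


def redundant_pairs(corr, threshold=0.80):
    """Return attribute pairs with |r| >= threshold, sorted by attribute ID.

    The default threshold is an assumed cut-off chosen for illustration only.
    """
    return sorted(pair for pair, r in corr.items() if abs(r) >= threshold)


if __name__ == "__main__":
    for a, b in redundant_pairs(CORR):
        print(f"{a} and {b} are strongly correlated (r = {CORR[(a, b)]:.2f})")
```

Under this assumed cut-off, only the $A_{3}$/$A_{6}$ and $A_{7}$/$A_{9}$ pairs are flagged; all other listed pairs stay at or below 0.32.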

Table 4. Qualitative-to-numerical encoding scale for strategy vector construction.

Qualitative Level	Numerical Value
Irrelevant/Not required	$0.2$ – $0.3$
Low importance	$0.4$ – $0.5$
Moderate importance	$0.5$ – $0.6$
High importance	$0.7$ – $0.8$
Critical/Essential	$0.8$ – $0.9$

Table 5. Strategy vector profiles for five representative tactical approaches. Values represent attribute importance, ranging from $0.2$ (minimal relevance) to $0.9$ (critical importance) as per the encoding in Stage 3.

Attribute	High Press	Fast Counter	Positional Defense	Build-Up Play	Gegenpressing
$A_{1}$ Offensive Strength	0.70	0.90	0.40	0.80	0.70
$A_{2}$ Defensive Strength	0.80	0.60	0.90	0.50	0.80
$A_{3}$ Midfield Control	0.60	0.50	0.80	0.70	0.60
$A_{4}$ Transition Speed	0.90	0.90	0.30	0.50	0.80
$A_{5}$ High Press Cap.	0.90	0.50	0.20	0.40	0.90
$A_{6}$ Width Utilization	0.50	0.60	0.30	0.60	0.50
$A_{7}$ Psych. Resilience	0.80	0.70	0.70	0.70	0.80
$A_{8}$ Residual Energy	0.70	0.80	0.60	0.60	0.70
$A_{9}$ Team Morale	0.80	0.70	0.60	0.80	0.80
$A_{10}$ Time Management	0.60	0.80	0.90	0.70	0.60
$A_{11}$ Tactical Cohesion	0.90	0.60	0.80	0.80	0.90
$A_{12}$ Technical Base	0.70	0.70	0.60	0.80	0.70
$A_{13}$ Physical Base	0.80	0.80	0.50	0.60	0.80
$A_{14}$ Relational Cohesion	0.80	0.60	0.70	0.80	0.80
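
To make the use of Table 5 concrete, the sketch below stores the five strategy vectors and ranks them by plain (unweighted) Euclidean distance to a team profile. The vectors are transcribed from the table; the names `STRATEGIES`, `euclidean`, and `rank_strategies`, and the sample team profile, are hypothetical. The sketch omits the context-adapted weighting and the opponent-aware adjustment, so it is not expected to reproduce the distances reported in Table 16.

```python
import math

# Strategy vectors from Table 5, ordered A1 ... A14.
STRATEGIES = {
    "High Press":         [0.70, 0.80, 0.60, 0.90, 0.90, 0.50, 0.80,
                           0.70, 0.80, 0.60, 0.90, 0.70, 0.80, 0.80],
    "Fast Counter":       [0.90, 0.60, 0.50, 0.90, 0.50, 0.60, 0.70,
                           0.80, 0.70, 0.80, 0.60, 0.70, 0.80, 0.60],
    "Positional Defense": [0.40, 0.90, 0.80, 0.30, 0.20, 0.30, 0.70,
                           0.60, 0.60, 0.90, 0.80, 0.60, 0.50, 0.70],
    "Build-Up Play":      [0.80, 0.50, 0.70, 0.50, 0.40, 0.60, 0.70,
                           0.60, 0.80, 0.70, 0.80, 0.80, 0.60, 0.80],
    "Gegenpressing":      [0.70, 0.80, 0.60, 0.80, 0.90, 0.50, 0.80,
                           0.70, 0.80, 0.60, 0.90, 0.70, 0.80, 0.80],
}


def euclidean(u, v):
    """Unweighted Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.dist(u, v)


def rank_strategies(team, strategies=STRATEGIES):
    """Return (name, distance) pairs sorted from closest to farthest."""
    scored = [(name, euclidean(team, vec)) for name, vec in strategies.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical team profile, invented purely for illustration.
    team = [0.75, 0.55, 0.65, 0.60, 0.45, 0.55, 0.70,
            0.60, 0.75, 0.65, 0.80, 0.75, 0.60, 0.75]
    for name, dist in rank_strategies(team):
        print(f"{name:<20s} {dist:.4f}")
```

One property that can be read directly from the table: the High Press and Gegenpressing profiles differ only in $A_{4}$ (0.90 vs. 0.80), so their mutual distance in this unweighted form is 0.10.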

Table 6. Sensitivity of strategy recommendations to opponent-awareness parameter $α$ across test scenarios.

Scenario	$α$-Range for Stable Top-1	Rank Corr. ($τ$)	Mean $Δd$ (Top-1 vs. Top-2)	95% CI for $Δd$
Energetic & Balanced	$[0.0, 0.4]$	0.94	0.047	[0.031, 0.063]
Fatigued & Inferior	$[0.0, 0.3]$	0.89	0.032	[0.018, 0.046]
High Temporal Pressure	$[0.0, 0.5]$	0.97	0.061	[0.042, 0.080]
Tech./Phys. Superiority	$[0.0, 0.5]$	0.96	0.054	[0.038, 0.070]

Table 7. Default parameters for dynamic weight computation.

Parameter	Description	Symbol	Default
Energy threshold	Fatigue becomes salient below this level	$τ_{e}$	0.50
Energy sensitivity	Strength of energy-based adjustments	$γ_{e}$	1.50
Gap sensitivity	Strength of gap-based adjustments	$γ_{g}$	1.00
Time threshold	Urgency triggers in final fraction	$τ_{t}$	0.25
Urgency sensitivity	Strength of time-pressure adjustments	$γ_{t}$	2.00
Opponent factor	Weight on opponent mismatch	$α$	0.20
Multiplier floor	Minimum allowed multiplier value	$m_{min}$	0.30
Multiplier ceiling	Maximum allowed multiplier value	$m_{max}$	2.50
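
The defaults in Table 7 can be collected in a small configuration object. The sketch below does only that, plus the clamping implied by the multiplier floor and ceiling and the two threshold checks described in the table. The class and function names are hypothetical, and the formula that produces a raw multiplier from the sensitivities is deliberately not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightParams:
    """Default parameters from Table 7 (field names are illustrative)."""
    tau_e: float = 0.50    # energy threshold
    gamma_e: float = 1.50  # energy sensitivity
    gamma_g: float = 1.00  # gap sensitivity
    tau_t: float = 0.25    # time threshold (final fraction of the match)
    gamma_t: float = 2.00  # urgency sensitivity
    alpha: float = 0.20    # opponent factor
    m_min: float = 0.30    # multiplier floor
    m_max: float = 2.50    # multiplier ceiling


def clamp_multiplier(raw, params=WeightParams()):
    """Keep a raw multiplier inside [m_min, m_max]."""
    return max(params.m_min, min(params.m_max, raw))


def fatigue_is_salient(residual_energy, params=WeightParams()):
    """Assumed reading of Table 7: fatigue matters below the energy threshold."""
    return residual_energy < params.tau_e


def urgency_is_active(remaining_fraction, params=WeightParams()):
    """Assumed reading of Table 7: urgency triggers in the final fraction."""
    return remaining_fraction < params.tau_t


if __name__ == "__main__":
    print(clamp_multiplier(3.2), clamp_multiplier(0.1), clamp_multiplier(1.4))
    print(fatigue_is_salient(0.3), urgency_is_active(0.5))
```

A frozen dataclass is used so that a parameter set cannot be mutated between runs, which keeps sensitivity experiments easy to compare.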

Table 8. Summary of simulated match scenarios used for experimental evaluation.

Scenario	Context Description
1. Energetic and Balanced	High residual energy ( $A_{8} \approx 0.8$ ), neutral technical/physical gap ( $Δ A_{12, 13} \approx 0$ ), and good morale. Used to test the system’s preference for high-intensity strategies (e.g., high pressing, gegenpressing).
2. Fatigued and Inferior	Low energy ( $A_{8} \approx 0.3$ ), reduced morale, and negative technical/physical gap. Designed to verify whether the DSS avoids high-risk strategies and recommends conservative options (e.g., positional defense).
3. High Temporal Pressure	Limited remaining time ( $A_{10}$ high), moderate energy, and slightly inferior technique but compact organization. Tests whether the DSS favors rapid, vertical play (e.g., counterattack).
4. Technical and Physical Superiority	Positive gap ( $Δ A_{12, 13} > 0$ ) and strong tactical cohesion ( $A_{11} \approx 0.8$ ). Evaluates the model’s tendency to suggest possession-based strategies (e.g., build-up play).

Table 9. Recommendation stability under independent vs. correlated perturbation structures.

Perturbation Type	Top-1 Consistency	Top-3 Stability	Rank Corr. ( $τ$ )
Independent (baseline)	89.3%	94.1%	0.96
Correlated (physical)	84.7%	91.2%	0.93
Correlated (psychological)	86.1%	92.8%	0.94
Correlated (all clusters)	81.2%	88.6%	0.91

Table 10. Recommendation stability under three missing-data patterns.

Pattern	Attributes Missing	Top-1 Match	Top-3 Overlap	Qualitative Agreement
M1 (Tracking)	3	78.5%	89.0%	91%
M2 (Psychological)	3	85.2%	93.1%	96%
M3 (Sparse)	6 (random)	67.3%	81.4%	84%

Table 11. Expert agreement and recalibration needs under distribution shift.

Distribution Shift	Expert Agreement	Problematic Recommendations	Recalibration Required?
None (baseline)	94%	3/50	No
Youth	82%	9/50	Recommended
Lower division	88%	6/50	Optional
Style shift	78%	11/50	Yes

Table 12. Summary of extended robustness findings.

Challenge	Impact on Top-1	Mitigation
Independent noise ($\pm 5\%$)	−11% (89%, baseline)	Acceptable
Correlated noise (all clusters)	−19% (81%)	Present top-k
Missing tracking data (M1)	−22% (78%)	Flag low confidence
Missing psychological (M2)	−15% (85%)	Acceptable
Sparse data (M3)	−33% (67%)	Qualitative mode
Youth distribution shift	−18% (82% expert)	Recalibrate thresholds
Style distribution shift	−22% (78% expert)	Re-elicit strategy vectors

Table 13. Ablation comparison of weighting schemes across test scenarios.

Scenario	Attribute-Wise	Uniform	Global Scaling
Top-ranked strategy matches expert intuition?
Energetic & Balanced	✓	✓	✓
Fatigued & Inferior	✓	✗	✓
High Temporal Pressure	✓	✗	✗
Tech. & Phys. Superiority	✓	✓	✓
Rank of gegenpressing in “Fatigued & Inferior” scenario
Fatigued & Inferior	18/20	4/20	12/20
Diagnostic correctly identifies energy as binding constraint?
Fatigued & Inferior	✓	N/A	Partial
High Temporal Pressure	✓	N/A	✗

Table 14. Complete mapping of German observed attributes to DSS semantic space. All six observed attributes map to five unique DSS dimensions.

German Term	DSS ID	English Name	Definition & Computation
Offensivkraft	$A_{1}$	Offensive Strength	Capacity to create and convert scoring opportunities. Direct correspondence; categorical value mapped via Equation (14).
Kompakte Defensive	$A_{2}$	Defensive Strength	Ability to maintain defensive shape and prevent attacks. Direct correspondence; categorical mapping.
Direkte vertikale Angriffe	$A_{4}$	Transition Speed	Capability for rapid vertical progression. Combined with Gegenangriff via $max (\cdot)$ aggregation.
Gegenangriff	$A_{4}$	Transition Speed	Counterattacking capability after regaining possession. Combined with Direkte vertikale Angriffe.
Gegenpressing	$A_{5}$	High Press Capability	Aptitude for immediate pressure after ball loss. Direct correspondence; categorical mapping.
Restenergie	$A_{8}$	Residual Energy	Current stamina reserves. Direct correspondence; categorical mapping.

Table 15. Observed team attributes for SSV Pachten across both match halves. German source terms are shown with corresponding DSS identifiers per the mapping in Table 14. Categorical values (Hoch/Mittel/Niedrig) are converted to normalized scores via Equation (14).

Attribute (German → DSS ID)	First Half (Cat.)	First Half (Norm.)	Second Half (Cat.)	Second Half (Norm.)	$Δ$
Offensivkraft → $A_{1}$	Hoch	0.85	Hoch	0.85	0.00
Direkte vert. Angriffe → $A_{4}$	Hoch	0.85	Mittel	0.50	$- 0.35$
Gegenangriff → $A_{4}$	Hoch	0.85	Hoch	0.85	0.00
Kompakte Defensive → $A_{2}$	Mittel	0.50	Niedrig	0.20	$- 0.30$
Restenergie → $A_{8}$	Mittel	0.50	Niedrig	0.20	$- 0.30$
Gegenpressing → $A_{5}$	Mittel	0.50	Mittel	0.50	0.00
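
The mapping in Tables 14 and 15 can be sketched as a two-step lookup: German category to normalized score, then German term to DSS dimension, with $max(\cdot)$ aggregation where two terms share $A_{4}$. The numeric levels below are read off the Norm. columns of Table 15 rather than derived from Equation (14), and the names `CATEGORY_SCORE`, `GERMAN_TO_DSS`, and `to_dss_vector` are hypothetical.

```python
# Normalized scores as they appear in the "Norm." columns of Table 15.
CATEGORY_SCORE = {"Hoch": 0.85, "Mittel": 0.50, "Niedrig": 0.20}

# German observed attribute -> DSS identifier, per Table 14.
GERMAN_TO_DSS = {
    "Offensivkraft": "A1",
    "Kompakte Defensive": "A2",
    "Direkte vertikale Angriffe": "A4",
    "Gegenangriff": "A4",
    "Gegenpressing": "A5",
    "Restenergie": "A8",
}


def to_dss_vector(observed):
    """Map {German term: category} to {DSS ID: score}.

    Terms that share a DSS dimension (here A4) are combined with max().
    """
    scores = {}
    for term, category in observed.items():
        dss_id = GERMAN_TO_DSS[term]
        value = CATEGORY_SCORE[category]
        scores[dss_id] = max(scores.get(dss_id, 0.0), value)
    return scores


# Categorical observations for SSV Pachten, transcribed from Table 15.
FIRST_HALF = {
    "Offensivkraft": "Hoch",
    "Direkte vertikale Angriffe": "Hoch",
    "Gegenangriff": "Hoch",
    "Kompakte Defensive": "Mittel",
    "Restenergie": "Mittel",
    "Gegenpressing": "Mittel",
}
SECOND_HALF = {
    "Offensivkraft": "Hoch",
    "Direkte vertikale Angriffe": "Mittel",
    "Gegenangriff": "Hoch",
    "Kompakte Defensive": "Niedrig",
    "Restenergie": "Niedrig",
    "Gegenpressing": "Mittel",
}

if __name__ == "__main__":
    print(to_dss_vector(FIRST_HALF))
    print(to_dss_vector(SECOND_HALF))
```

Because of the $max(\cdot)$ aggregation, the second-half drop in Direkte vertikale Angriffe (Hoch to Mittel) does not lower $A_{4}$: Gegenangriff remains Hoch, so $A_{4}$ stays at 0.85, which matches the observed value listed in Table 18.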

Table 16. Semantic distances to candidate strategies at halftime (projected second-half state).

Strategy	$d_{eucl}$	$d_{adapt}$
Build-up Play	0.4444	0.4530
Fast Counterattack	0.4664	0.4872
High Pressing	0.6305	0.6580
Gegenpressing	0.6305	0.6580
Positional Defense	0.9042	0.9150

Table 17. Expert assessment of DSS recommendation for the pilot match.

Assessment Item	Expert 1	Expert 2
Top-1 recommendation appropriate?	Appropriate	Partially Appropriate
Top-3 contains endorsed strategy?	Yes (Build-up Play)	Yes (Cautious Horizontal)
Would use DSS output in practice?	Yes, with caveats	Yes, as input to discussion

Table 18. Comparison of DSS recommendation (Build-up Play) with observed second-half tactical profile.

Attribute	DSS Rec.	Observed	Alignment
Offensive Strength ( $A_{1}$ )	0.80	0.85	✓
Defensive Strength ( $A_{2}$ )	0.50	0.20	×
Transition Speed ( $A_{4}$ )	0.50	0.85	×
High Press Capability ( $A_{5}$ )	0.40	0.50	✓
Residual Energy ( $A_{8}$ )	0.60	0.20	×
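
The alignment marks in Table 18 are consistent with a simple absolute-gap check between the recommended and observed values. The sketch below reproduces them under an assumed tolerance of 0.15; the paper's actual criterion is not stated in this table, so both the tolerance and the function name `alignment` are illustrative assumptions.

```python
# (DSS recommendation, observed second-half value), transcribed from Table 18.
TABLE_18 = {
    "A1": (0.80, 0.85),
    "A2": (0.50, 0.20),
    "A4": (0.50, 0.85),
    "A5": (0.40, 0.50),
    "A8": (0.60, 0.20),
}

# Assumed tolerance; gaps in the table are 0.05 and 0.10 (aligned)
# versus 0.30, 0.35, and 0.40 (not aligned).
TOLERANCE = 0.15


def alignment(rows, tol=TOLERANCE):
    """Mark an attribute as aligned when |recommended - observed| <= tol."""
    return {attr: abs(rec - obs) <= tol for attr, (rec, obs) in rows.items()}


if __name__ == "__main__":
    for attr, ok in alignment(TABLE_18).items():
        rec, obs = TABLE_18[attr]
        print(f"{attr}: gap {abs(rec - obs):.2f} -> {'aligned' if ok else 'divergent'}")
```

Any tolerance between roughly 0.10 and 0.30 yields the same marks as the table, so the result does not depend on the specific value chosen here.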

Table 19. Baseline comparison of recommendation methods.

Method	Recommendation	Expert 1	Expert 2
DSS (proposed)	Build-up Play	Appropriate	Partially Appr.
Random baseline	Offside Trap	Inappropriate	Inappropriate
Default strategy	Build-up Play	Appropriate	Partially Appr.
Energy-only heuristic	Positional Defense	Partially Appr.	Inappropriate

Table 20. Summary of pilot study evaluation outcomes.

Endpoint	Criterion	Outcome
1. Processing Feasibility	Pipeline executes without errors	PASSED
2. Expert Agreement	Both experts: Appr. or Part. Appr.	PASSED
3. Tactical Alignment	Descriptive comparison	Divergence observed
Baseline: Random	Expert agreement	Failed (0/2)
Baseline: Default	Expert agreement	Matched DSS (2/2)
Baseline: Energy-only	Expert agreement	Partial (1/2)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Di Rubbo, A.; Neri, M.; Pareschi, R.; Pedroni, M.; Valtancoli, R.; Zica, P. Can Semantic Methods Enhance Team Sports Tactics? A Methodology for Football with Broader Applications. Sci 2026, 8, 63. https://doi.org/10.3390/sci8030063

