This section introduces the problem statement and a list of competency questions.
3.1. Problem Statement
The key notations are listed in Table 1 and described below.
Teams and Players. First, we define $T$ as the set of teams, and $P_t$ as the set of all players of a team $t$, i.e., $P_t = \{p_1, \ldots, p_n\}$, where $t \in T$.
Power Set of Players of a Team. We define $\mathcal{P}(P_t)$ as the power set of the players of a team, i.e., $\mathcal{P}(P_t) = \{B \mid B \subseteq P_t\}$, where $P_t$ is the set of players of team $t$. This power set contains every combination of players, e.g., pairs, triads, quads, quintets, hexads, and so forth.
Restricted Power Set of Players of a Team. The goal is to calculate the statistics for any K-Player combination when its players are together on the court. However, since a team cannot have more than five players on the court, we restrict the power set to only those subsets that fit within a legal lineup (e.g., we exclude hexads and so on). Therefore, we define the restricted power set of the players $P_t$ of a team, containing combinations of up to five players, as $\mathcal{P}_{\le 5}(P_t) = \{B \subseteq P_t \mid |B| \le 5\}$.
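As a minimal sketch of the restriction described above, the following Python snippet enumerates every subset of a roster with at most five members. The function name `restricted_power_set`, the six-player roster, and the placeholder name "Player6" are assumptions made for illustration; the empty set is kept here on the assumption that intersections of disjoint combinations should stay inside the set.

```python
from itertools import combinations

def restricted_power_set(players, max_size=5):
    """Return all subsets of `players` with at most `max_size` members."""
    roster = sorted(players)
    return [
        frozenset(combo)
        for k in range(0, max_size + 1)  # k = 0 yields the empty set
        for combo in combinations(roster, k)
    ]

# Hypothetical six-player roster, used only for illustration.
roster = {"Grant", "Nunn", "Sloukas", "Lessort", "Yurtseven", "Player6"}
subsets = restricted_power_set(roster)

# Of the 2**6 = 64 subsets of the full power set, only the hexad is dropped.
print(len(subsets))
```

With a six-player roster, the full power set and the restricted one differ only by the single six-player subset, which makes the effect of the restriction easy to inspect.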
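Meet-semilattice. A meet-semilattice is a partially ordered set $(P, \le)$ in which every pair of elements $a, b \in P$ has a greatest lower bound (also called the meet, denoted $a \wedge b$), which is also in $P$. The restricted power set of the players of a team, together with the subset relation ⊆, forms a meet-semilattice, since the intersection of two combinations of at most five players again has at most five players. The meet of any two elements $A$ and $B$ is given by their intersection, i.e., $A \wedge B = A \cap B$.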
Hasse Diagram. A semilattice can be visualized as a Hasse diagram, which is a graphical representation of a finite partially ordered set $(P, \le)$. Specifically, each element of $P$ is represented by a node; an edge is drawn between two elements $a$ and $b$ if $a < b$ and there is no $c \in P$ such that $a < c < b$ (i.e., $b$ covers $a$); and the diagram is drawn so that if $a < b$, then $b$ appears higher than $a$ in the layout. Edges are therefore implicitly directed upward and are not drawn with arrows. An example is given in Figure 1.
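The cover relation that determines the edges of a Hasse diagram can be sketched in a few lines of Python. This is an illustrative brute-force check over a three-player example, not a reproduction of Figure 1; the function name `hasse_edges` is an assumption. It relies on the fact that `<` on Python `frozenset` objects tests for a proper subset.

```python
from itertools import combinations

def hasse_edges(elements):
    """Return pairs (a, b) where b covers a under proper-subset ordering."""
    elements = list(elements)
    edges = []
    for a in elements:
        for b in elements:
            # a < b is "proper subset"; b covers a if nothing lies strictly between.
            if a < b and not any(a < c < b for c in elements):
                edges.append((a, b))
    return edges

players = ["Grant", "Nunn", "Sloukas"]
nodes = [frozenset(c) for k in range(0, 4) for c in combinations(players, k)]
edges = hasse_edges(nodes)

# The meet of two combinations is their intersection.
meet = frozenset({"Grant", "Nunn"}) & frozenset({"Nunn", "Sloukas"})
print(len(edges), sorted(meet))
```

Every edge found this way connects a combination to one that has exactly one more player, which is what the upward layout of the diagram expresses.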
Number of Maximum K-Player Combinations. For a roster of $n$ players, the total number of non-empty subsets in the restricted power set is given by summing the binomial coefficients: $\sum_{k=1}^{5} \binom{n}{k}$.
For instance, if there are 15 players in a team roster, it results in 3003 5-player combinations, 1365 4-player combinations, 455 3-player combinations, 105 2-player combinations, and 15 1-player combinations, i.e., a total of 4943 combinations.
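The arithmetic for a 15-player roster can be checked with a short Python sketch based on `math.comb` (available from Python 3.8). The function name `count_combinations` is an assumption made for illustration.

```python
from math import comb

def count_combinations(roster_size, max_k=5):
    """Number of k-player combinations for k = 1..max_k."""
    return {k: comb(roster_size, k) for k in range(1, max_k + 1)}

counts = count_combinations(15)
total = sum(counts.values())

print(counts)  # {1: 15, 2: 105, 3: 455, 4: 1365, 5: 3003}
print(total)   # 4943
```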
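The 5-player Lineups. Since a team always has five players on the court, we start by defining the lineups that have played together for at least 1 s as follows: $L_t = \{\ell \subseteq P_t \mid |\ell| = 5,\ \mathit{seconds}(\ell) \ge 1\}$, where $P_t$ is the set of players of team $t$ and $\mathit{seconds}(\ell)$ is the time that the five players of $\ell$ have spent on the court together.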
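All the lineups including a K-Player combination. For a given combination of players, say $B$, we need to find all the 5-player lineups in which its players have participated together, as follows: $L_t(B) = \{\ell \in L_t \mid B \subseteq \ell\}$, where $L_t$ is the set of 5-player lineups of the team.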
How to measure the statistics. We want to compute the team statistics when the combination $B$ is on the court. First, we define $\mathit{stat}_{cum}(B)$ as the cumulative number for a given statistic (points, assists, three-pointers made, minutes, etc.) and a combination $B$. It is computed by summing the value of the desired statistic over every 5-player lineup that includes all the players of $B$, i.e., $\mathit{stat}_{cum}(B) = \sum_{\ell \supseteq B} \mathit{stat}_{cum}(\ell)$, where $\ell$ ranges over the 5-player lineups. For example, to find all the points of the lineups that include both “Grant” and “Nunn”, we sum the points of all the 5-player lineups that include these two players. Apart from cumulative statistics, there are statistics expressed as average values, i.e., $\mathit{stat}_{avg}(B)$. In such cases, we combine cumulative statistics to find the average, e.g., to find the points per minute, we divide the number of points, $\mathit{points}_{cum}(B)$, by the number of minutes, $\mathit{minutes}_{cum}(B)$.
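The aggregation described above can be sketched in Python as follows. The per-lineup numbers are made up for illustration, "Player6" and "Player7" are placeholder names, and the helpers `cumulative` and `per_minute` are hypothetical names rather than part of any published implementation.

```python
# Hypothetical per-lineup totals; the numbers are invented for illustration.
lineup_stats = {
    frozenset({"Grant", "Nunn", "Sloukas", "Lessort", "Player6"}):
        {"points": 30, "minutes": 12.0},
    frozenset({"Grant", "Nunn", "Sloukas", "Yurtseven", "Player6"}):
        {"points": 18, "minutes": 9.0},
    frozenset({"Grant", "Sloukas", "Lessort", "Player6", "Player7"}):
        {"points": 10, "minutes": 5.0},
}

def cumulative(combo, stat):
    """Sum `stat` over every 5-player lineup that contains all players of `combo`."""
    return sum(s[stat] for lineup, s in lineup_stats.items() if combo <= lineup)

def per_minute(combo, stat):
    """Average statistic: cumulative `stat` divided by cumulative minutes."""
    minutes = cumulative(combo, "minutes")
    return cumulative(combo, stat) / minutes if minutes else None

pair = frozenset({"Grant", "Nunn"})
print(cumulative(pair, "points"), cumulative(pair, "minutes"))  # 48 21.0
print(per_minute(pair, "points"))
```

A combination that never shared the court, such as "Lessort" and "Yurtseven" in this toy data, has zero cumulative minutes, so `per_minute` returns `None` rather than dividing by zero.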
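Set theory Properties. Given two combinations of players $B$ and $B'$ such that $B \subseteq B'$, we know from set theory that $L_t(B') \subseteq L_t(B)$ (see [28]), where $L_t(B)$ denotes the set of 5-player lineups that include all the players of $B$. For instance, if $B = \{\text{Grant}, \text{Nunn}\}$ and $B' = \{\text{Grant}, \text{Nunn}, \text{Sloukas}\}$, the lineups that include “Grant and Nunn” are a superset of the lineups that include “Grant, Nunn, and Sloukas”. Also, if we know that “Lessort and Yurtseven” have never played together, then no 3-, 4-, or 5-player combination that includes these two players has ever played together in any game. Finally, for non-negative cumulative statistics (e.g., points or minutes), we know that if $B \subseteq B'$, then $\mathit{stat}_{cum}(B) \ge \mathit{stat}_{cum}(B')$.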
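One way to exploit these properties is a level-wise enumeration that only extends combinations that have actually shared the court, in the spirit of Apriori-style candidate pruning. The sketch below is an assumption about how such pruning could look, not the authors' algorithm; the minutes are invented, "Player6" and "Player7" are placeholders, and `levelwise_minutes` is a hypothetical name.

```python
# Hypothetical minutes per 5-player lineup; values are invented for illustration.
lineup_minutes = {
    frozenset({"Grant", "Nunn", "Sloukas", "Lessort", "Player6"}): 12.0,
    frozenset({"Grant", "Nunn", "Sloukas", "Yurtseven", "Player6"}): 9.0,
    frozenset({"Grant", "Sloukas", "Lessort", "Player6", "Player7"}): 5.0,
}

def minutes_of(combo):
    """Cumulative minutes of all lineups that contain every player of `combo`."""
    return sum(m for lineup, m in lineup_minutes.items() if combo <= lineup)

def levelwise_minutes(max_k=5):
    """Compute minutes for 1..max_k player combinations, skipping pruned supersets."""
    players = sorted(set().union(*lineup_minutes))
    level = {frozenset({p}): minutes_of(frozenset({p})) for p in players}
    level = {c: m for c, m in level.items() if m > 0}
    result = dict(level)
    for _ in range(2, max_k + 1):
        candidates = {c | {p} for c in level for p in players if p not in c}
        next_level = {}
        for cand in candidates:
            # If any sub-combination never played together, neither did `cand`.
            if all(cand - {p} in level for p in cand):
                m = minutes_of(cand)
                if m > 0:
                    next_level[cand] = m
        result.update(next_level)
        level = next_level
    return result

minutes_by_combo = levelwise_minutes()
print(minutes_by_combo[frozenset({"Grant", "Sloukas"})])          # 26.0
print(minutes_by_combo[frozenset({"Grant", "Nunn", "Sloukas"})])  # 21.0
```

In this toy data, "Lessort" and "Yurtseven" never share a lineup, so the pair is dropped at the second level and no larger combination containing both is ever evaluated.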
Objective. The objective is to compute the statistics of each K-Player combination incrementally, by exploiting the set theory properties described above.
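Maximization Questions. One target is to answer maximization questions, i.e., “give me the combination of players $B$ that maximizes one of the following criteria”: (i) a cumulative statistic, $\arg\max_{B} \mathit{stat}_{cum}(B)$, or (ii) an average statistic, $\arg\max_{B} \mathit{stat}_{avg}(B)$.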
For the first one, an example is to find which combination of players has the highest plus/minus (when they are on the court); for the second, an example is to find with which combination of players the team has the highest three-point percentage. Correspondingly, the same formulation can be used for minimization problems, e.g., for statistics such as the number of turnovers, i.e., “give me the K-Player lineups that minimize the number of turnovers”.
Filtering Questions. Another case is to answer filtering questions for a given value $F$, such as $\mathit{stat}_{cum}(B) \ge F$, e.g., find all the combinations that have played at least 40 min, and $\mathit{stat}_{avg}(B) > F$, e.g., find all the combinations of players for which, when they are on the court, the team scores over 2 points per minute. In any case, one question can include one or more filtering conditions and one or more maximization/minimization conditions.
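Combining a filtering condition with a maximization condition can be sketched as a two-step query over precomputed statistics. The data below are invented, and `best_combination`, `filter_combinations`, and `points_per_minute` are hypothetical helper names used only for illustration.

```python
# Hypothetical precomputed cumulative statistics per 2-player combination.
combo_stats = {
    frozenset({"Grant", "Nunn"}): {"points": 48, "minutes": 21.0},
    frozenset({"Grant", "Sloukas"}): {"points": 58, "minutes": 26.0},
    frozenset({"Nunn", "Sloukas"}): {"points": 40, "minutes": 20.0},
    frozenset({"Grant", "Lessort"}): {"points": 40, "minutes": 17.0},
}

def points_per_minute(stats):
    """Average statistic derived from two cumulative ones."""
    return stats["points"] / stats["minutes"]

def filter_combinations(stats_by_combo, predicate):
    """Keep the combinations whose statistics satisfy `predicate`."""
    return {c for c, s in stats_by_combo.items() if predicate(s)}

def best_combination(stats_by_combo, value, min_minutes=0.0):
    """Filter by minimum minutes, then return the combination maximizing `value`."""
    eligible = {c: s for c, s in stats_by_combo.items() if s["minutes"] >= min_minutes}
    if not eligible:
        return None
    return max(eligible, key=lambda c: value(eligible[c]))

# Maximization only, then maximization with a filter of at least 20 minutes.
print(sorted(best_combination(combo_stats, points_per_minute)))
print(sorted(best_combination(combo_stats, points_per_minute, min_minutes=20.0)))
```

Applying the filter first changes the answer in this toy data: the pair with the highest points per minute overall has played fewer than 20 minutes, so a different pair is returned once the minutes threshold is imposed.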
3.2. Competency Questions
We provide different types of questions, grouped by the kind of computation they require:
[Q1. General Question]: For each K-Player combination of players, give me all the team statistics, when they are on the court.
[Q2. Maximization Question]: With which 3-player combination does the team have the highest number of points per minute?
[Q3. Filtering and Maximization Question]: Give me the best K-Player combination of Panathinaikos that includes Kendrick Nunn and maximizes the 3-point percentage (extra filtering condition: at least 50 min on the court).
[Q4. Comparative and Filtering Question]: Compare the best K-Player combinations for all the statistics of the Panathinaikos team between the 2023–2024 and 2024–2025 seasons (considering only combinations that have played together for at least 80 min on the court).
[Q5. Minimization Question]: Give me the top-5 3-player lineups in EuroLeague for the 2023–2024 season for which, when they are on the court, the team has the lowest number of opponent points per minute.
[Q6. Distribution Question]: Give me the minutes of each K-Player lineup (for a given K) to check their distribution (e.g., power-law distribution).