Abstract
We study competitions structured as hierarchically shaped single-elimination tournaments. We define optimal tournaments by maximizing attractiveness such that the topmost players will have the chance to meet in higher stages of the tournament. We propose a dynamic programming algorithm for computing optimal tournaments and we provide its sound complexity analysis. Based on the idea of the dynamic programming approach, we also develop more efficient deterministic and stochastic sub-optimal algorithms. We present experimental results obtained with the Python implementation of all the proposed algorithms regarding the optimality of solutions and the efficiency of the running time.
1. Introduction
Tournament design is a combinatorial problem with many theoretical implications and practical applications. Many types of tournaments have been analyzed theoretically and used in practice in various contexts. There are two main principles used in tournament design: the “round-robin” principle and the “knockout” principle. They can be used in isolation or combined to obtain different tournament designs, depending on factors such as the number of players, the time available to carry out the tournament, and the application domain.
In this paper we propose a formal definition of competitions that have the shape of single-elimination tournaments, also known as knockout tournaments. We introduce methods to quantitatively evaluate the attractiveness and competitiveness of a given tournament. We consider that a tournament is more attractive if competition is encouraged in higher stages, i.e., higher-ranked players will have the chance to meet in higher stages of the tournament, thus increasing the stake of their matches.
In knockout tournaments, the result of each match is always a win of one of the two players, i.e., draws are not possible. A knockout tournament is hierarchically structured as a binary tree such that each leaf represents one player or team that is enrolled in the tournament, while each internal node represents a game of the tournament.
The tournament is carried out in a series of rounds. If the number N of players is a power of 2, then the tournament tree is a complete binary tree with all the players entering the tournament in the first round. In the general case, however, the number of players might not be a power of 2. In this case some of the players receive waivers and enter the tournament directly in the second round, while the rest of the players enter the tournament in the first round.
In this paper we significantly extend our preliminary results reported in [1] for fully balanced tournaments (the number of players is a power of 2) to general tournaments where the number of players can be an arbitrary natural number, not necessarily a power of 2. Our new results are summarized as follows:
- An exact formula for counting the total number of knockout tournaments in the general case, showing that the number of tournaments grows very large with the number of players.
- A tournament cost function based on players’ quota that assigns a higher cost to those tournaments where highly ranked players tend to meet in higher stages, thus making the tournament more attractive and competitive.
- An exact dynamic programming algorithm for computing optimal tournaments in the general case.
- A more efficient generic sub-optimal algorithm derived from the idea of the dynamic programming approach.
- Deterministic and stochastic versions of the generic sub-optimal algorithm.
- The complexity analysis of all the proposed algorithms.
- The implementation issues of the proposed algorithms using Python, as well as the experimental results obtained with our implementation.
2. Related Works
Tournament design has attracted research in operations research, combinatorics, and statistics. The problem is also related to intelligent planning and activity scheduling, which are broadly covered by artificial intelligence.
There are two main principles used in tournament design, namely the “round-robin” principle and the “knockout” principle, and they can be used in isolation or combined to obtain different tournament designs. The “round-robin” principle states that in a tournament every two players should meet at least once, sometimes exactly once. The “knockout” principle, also known as the “elimination” principle, states that players are eliminated after a certain number of games, sometimes exactly after one game.
For example, in a round-robin tournament in which each two players should meet exactly once, an important aspect is the scheduling of the tournament, a problem also known as league scheduling [2]. This is an important component of the tournament design. Note that the “round robin” principle can also be applied with restrictions. Consider for example a two-team tournament, in which each team has the same number of players. Each game involves two players from different teams and any two players from different teams must play exactly once. In this case, we can still apply the round-robin principle, but players of the same team are not allowed to play. This type of tournament is called a bipartite tournament. A good coverage of the combinatorial aspects of round-robin tournaments can be found in monograph [2].
On the other hand, in a knockout tournament where two players will play at most one game and after each game exactly one player advances in the tournament, while the other is kicked off, the tournament schedule results directly from the tournament design, i.e., no separate scheduling stage is needed.
Note also that the round-robin and knock-out principles can be combined into a single tournament design. Consider for example the UEFA Champions League football tournament. In the groups’ phase, round-robin is used inside each group, while after the group phase, knockout is used to determine the tournament winner.
In this paper we consider knockout tournaments in which two players meet at most once. These are specialized tournaments with possible applications in sports (e.g., football and tennis tournaments), online games (e.g., online poker), and election processes. While sports tournaments involve a relatively low number of players and thus do not raise special computational challenges, the number of players in massively multiplayer online games can grow so large that computing the optimal tournament becomes a more difficult problem.
A comprehensive analysis of knockout tournaments is proposed in [3,4]. Traditionally, the method for designing a tournament involves two stages: (i) tournament structure design and (ii) seeding. In the first stage, the structure of the tournament tree is proposed. In the second stage, players are assigned to the leaves of the tournament tree. We argue that this method of tournament design has some limitations, so we propose a different “integrated” approach. One limitation is, for example, the fact that once fixed, the tournament structure cannot be changed. On the one hand, this results in a smaller search space during the seeding process; on the other hand, it limits the total number of tournament designs. Our model of tournament trees therefore includes both the tree structure and the players’ seeding, so a separate seeding process is not necessary.
Research in combinatorics of knockout tournaments also produced interesting mathematical results. The structure of a knockout tournament can be modeled as a special kind of binary tree, called an Otter tree [5]. The number of knockout tournament structures (i.e., prior to seeding) for N players is given by the Wedderburn—Etherington number [6] of order N that is known to have an exponential growth approximately equal to .
An important aspect concerns the factors that can be used to evaluate a tournament design. This problem has been also considered in previous works [3,4]. An interesting discussion of economic aspects of tournament attractiveness, such as spectator interest, is provided by [7].
A method for augmenting a tournament with probabilistic information based on tournament results was proposed in [8]. The effectiveness of tournament plans based on dominance graphs is studied in [6]. The tournament problem was also a source of inspiration for programming competitions [9]. A more recent work addressing competitiveness development and ranking precision of tournaments is [10].
For example, the more recent work [3,4] proposes a probabilistic approach to define the tournament cost, by including in their model the win–loss probabilities of each game between two players i and j. On one hand, we can question the robustness of such values. On the other hand, we recognize that some approximations of such values might be empirically obtained based on several factors, e.g., global player rankings (when available) or on the history of games between the players (if a nonempty history exists). In our work we do not use this information, i.e., we assume by default that there are equal winning chances for the players of each game. While this simplification clearly has drawbacks, it has the advantage of enabling a clean algorithm design based on dynamic programming principles. Our approach can be extended by adding probabilistic information to the cost function, but then it will require further analysis of algorithmic solutions within our “integrated” approach.
A theoretical investigation of knockout tournaments is provided by [11]. Their analysis is focused only on tournaments in which the number of players is a power of 2, i.e., similarly to [1], but it is less general than the current work, where an arbitrary number of players is considered. Interesting results of this work concern the discussion of new seeding approaches named “equal gap” and “increasing competitive”, as well as the investigation of their theoretical properties.
There has also been theoretical interest in analyzing the possible outcomes of knockout tournaments. Upper and lower bounds of the winning probabilities of players in a random knockout tournament are provided in [12]. Note that the analysis is focused on the random knockout tournament, where the matches to be carried out in each round are defined randomly. Moreover, this work assumes, as in [3,4], that the win–loss probabilities of each match between two players are known.
There is also interest in the literature in designing new formats of knockout tournaments. For example, a new format based on actively involving the teams in defining the tournament format, was recently proposed in [13] for the specific competition of UEFA Champions League. The proposed format was coined “Choose your opponent” with the claimed benefit to make group stages more exciting. The authors also show how this model can be used for the objective of maximizing the number of home games during the knockout stage.
Knockout tournament structures are sometimes called tournament brackets. According to [14], two types of tournament brackets are possible: fixed and adaptive. In fixed brackets, the tournament structure is fixed, while in adaptive brackets the pairings in stage $i+1$ are defined based on the winners of stage i. Our approach is clearly fixed, with the difference that we use an integrated approach to define both the structure and the seeding. What is different in [14] is that the authors optimize a fixed bracket by using utility functions and Bayesian optimal design. They propose a simulated annealing algorithm to optimize the expected value of a given utility function on a fixed tournament bracket. While interesting, this endeavor is clearly different from our approach. We plan, however, to investigate in the future the suitability of extending our integrated approach and the proposed algorithms by incorporating probabilistic information.
Clearly tournaments have a lot of practical applications, for example in the sports’ domain. In this context, the recent work [15] provides an interesting discussion on the economics of sports from operations research, as well as practical applicability perspectives. The discussion is centered around several paradoxes of tournament rankings, with clear examples from the practice of tournament design.
3. Knockout Tournaments
We consider hierarchically structured knockout tournaments such that the result of each match is always a win of one of the two players, i.e., draws are not possible. A tournament is modeled as a binary tree such that each leaf node represents a unique player and each internal node represents a game between two players and its winner.
Definition 1
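(Tournaments). Let Σ be a finite nonempty set of players. We define the set $\mathcal{T}(\Sigma)$ of trees with leaves Σ as follows:
1. If $\Sigma = \{i\}$ is a singleton set then $\mathcal{T}(\Sigma) = \{i\}$, i.e., there is a single tree containing a single node i.
2. If $\Sigma'$ and $\Sigma''$ are two disjoint sets of players then let $\Sigma = \Sigma' \cup \Sigma''$. Then:
$$\{t', t''\} \in \mathcal{T}(\Sigma) \quad \text{for all } t' \in \mathcal{T}(\Sigma') \text{ and } t'' \in \mathcal{T}(\Sigma''). \tag{1}$$
Note that the set notation in Equation (1) implies that the trees are not ordered, i.e., the order of the left and right branches does not matter.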
Example 1.
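We consider examples of tournament sets $\mathcal{T}(\Sigma)$ for sets of players Σ with 1, 2, 3, and 4 elements:
1. If $\Sigma = \{1\}$ then $\mathcal{T}(\Sigma) = \{1\}$.
2. If $\Sigma = \{1, 2\}$ then $\mathcal{T}(\Sigma) = \{\{1, 2\}\}$.
3. If $\Sigma = \{1, 2, 3\}$ then $\mathcal{T}(\Sigma) = \{\{\{1, 2\}, 3\}, \{\{1, 3\}, 2\}, \{\{2, 3\}, 1\}\}$.
4. If $\Sigma = \{1, 2, 3, 4\}$ then $\mathcal{T}(\Sigma) = \{\{\{1, 2\}, \{3, 4\}\}, \{\{\{1, 2\}, 3\}, 4\}, \ldots\}$. It is not difficult to see that in this case there are 15 trees.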
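The counts in Example 1 can be checked by brute force. The following is a minimal sketch, not the implementation used in the paper: it assumes that a leaf is represented by the player itself and an internal node by a `frozenset` of its two sub-trees, so that unordered trees compare equal. The function name `trees` is hypothetical.

```python
from itertools import combinations


def trees(players):
    """Return the set of all unordered binary trees with the given leaves."""
    players = frozenset(players)
    if len(players) == 1:
        (leaf,) = players
        return {leaf}
    # Keep the smallest player on one side so each split is visited once.
    first = min(players)
    rest = sorted(players - {first})
    result = set()
    for k in range(len(rest)):  # k other players join `first`
        for extra in combinations(rest, k):
            left = frozenset((first,) + extra)
            right = players - left
            for t1 in trees(left):
                for t2 in trees(right):
                    result.add(frozenset({t1, t2}))
    return result


for n in range(1, 6):
    print(n, len(trees(range(1, n + 1))))
```

The enumeration grows very quickly, so it is only useful for sanity checks on small player sets.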
Some of the tournaments introduced in Example 1 are depicted graphically in Figure 1. Observe that the tournaments on the first row (labeled “a” and “b”) involve a number of players that is a power of two ($2 = 2^1$ and $4 = 2^2$, respectively) and are fully balanced. The tournaments on the second row are not fully balanced, although the lower rightmost tournament also involves $4 = 2^2$ players. Intuitively, the tournament with three players (labeled “c”) should be accepted, as player 3 enters the tournament only one round after players 1 and 2, i.e., it has a sense of “balancing”. In contrast, the lower rightmost tournament with four players (labeled “d”) is not acceptable, as player 4 is exempted from playing in the first two rounds, and this is considered unfair.
Figure 1.
Tournaments of two and three players (first column) and four players (second column).
Proposition 1
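(Counting tournaments). The set $\mathcal{T}(\Sigma)$ with $N = |\Sigma|$ players contains:
$$\frac{(2N-2)!}{2^{N-1}\,(N-1)!} \tag{2}$$
elements.
Proof.
The number of full binary tree structures with N leaves is equal to $C_{N-1}$, where $C_n$ is Catalan’s number [16] defined by:
$$C_n = \frac{1}{n+1}\binom{2n}{n}. \tag{3}$$
Now, each permutation of the N players can be attached to the leaves of a binary tree, thus obtaining $N! \cdot C_{N-1}$ trees. However, the branches of each internal node can be exchanged, resulting in the same tree. There are $N-1$ internal nodes and therefore a total number of $2^{N-1}$ exchanges, resulting in a number of trees given by:
$$\frac{N! \cdot C_{N-1}}{2^{N-1}} = \frac{N!}{2^{N-1}} \cdot \frac{(2N-2)!}{N!\,(N-1)!} = \frac{(2N-2)!}{2^{N-1}\,(N-1)!} \tag{4}$$
q.e.d. □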
Example 2.
For example, if $N = 3$ we obtain 3 trees, while if $N = 4$ we obtain 15 trees. These results are consistent with Example 1.
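The counting argument of Proposition 1 (permutations of the leaves times the Catalan number of tree shapes, divided by the number of branch exchanges) can be evaluated directly. This is a small sketch with hypothetical function names; the closed form $(2N-2)!/(2^{N-1}(N-1)!)$ used for comparison is what the argument simplifies to.

```python
from math import comb, factorial


def catalan(n):
    """n-th Catalan number, C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def count_tournaments(n_players):
    """Number of unordered knockout trees over n_players labeled leaves."""
    # N! labelings of C_{N-1} shapes, each counted 2^(N-1) times.
    return factorial(n_players) * catalan(n_players - 1) // 2 ** (n_players - 1)


def count_closed_form(n_players):
    """Same count written as (2N-2)! / (2^(N-1) (N-1)!)."""
    return factorial(2 * n_players - 2) // (
        2 ** (n_players - 1) * factorial(n_players - 1)
    )


for n in (3, 4, 5, 8):
    print(n, count_tournaments(n), count_closed_form(n))
```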
A valid tournament should be balanced, i.e., each player should play (almost) the same number of games to win the tournament.
Analyzing the tournaments from Example 1 and Figure 1 we can observe that if $N \leq 3$ then each element of the tree set $\mathcal{T}(\Sigma)$ represents a valid tournament. However, if $N = 4$ then only 3 trees of $\mathcal{T}(\Sigma)$ represent valid tournaments. For example, $\{\{1, 2\}, \{3, 4\}\}$ is a valid tournament, as each player should play exactly two games to win the tournament. In this case we have a fully balanced tournament consisting of $4 = 2^2$ players. Moreover, $\{\{1, 2\}, 3\}$ is also considered a valid tournament, as players 1 and 2 must play two games to win, while player 3 must play one game to win, i.e., has an exemption for the first round (the difference between the number of games played by each player is at most 1). However, $\{\{\{1, 2\}, 3\}, 4\}$ is not a valid tournament, as players 1 and 2 must play three games to win the tournament, while player 4 must play a single match to win the tournament (the difference between the number of games played by each player is above 1, i.e., more than one exemption for a player is considered unfair).
Observe that a tree representing a valid tournament has the property that all its leaves are of height n or $n-1$ for a suitable value of n. Actually, the value of n can be determined from the given number of players N of the tournament and it represents the number of rounds of the tournament.
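The validity condition above (all leaves on one of two consecutive levels) is easy to test mechanically. A minimal sketch, assuming trees are encoded as nested `frozenset` pairs with players as leaves; `leaf_depths` and `is_valid` are hypothetical helper names.

```python
def leaf_depths(tree, depth=0):
    """Return the depth of every leaf; the root has depth 0."""
    if not isinstance(tree, frozenset):
        return [depth]
    depths = []
    for child in tree:
        depths.extend(leaf_depths(child, depth + 1))
    return depths


def is_valid(tree):
    """A tournament is valid if its leaf depths differ by at most one."""
    depths = leaf_depths(tree)
    return max(depths) - min(depths) <= 1


balanced = frozenset({frozenset({1, 2}), frozenset({3, 4})})
three = frozenset({frozenset({1, 2}), 3})
chain = frozenset({frozenset({frozenset({1, 2}), 3}), 4})
print(is_valid(balanced), is_valid(three), is_valid(chain))
```

The depth of a leaf equals the number of games its player must win, so the check matches the "at most one exemption" rule.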
Let us consider a tournament with n rounds. It is not difficult to see that the maximum number of players is $2^n$ and it is obtained when in the first round we have a maximum number of games; therefore, for a tournament with n rounds we have:
$$2^{n-1} < N \leq 2^n. \tag{5}$$
Observe that from Equation (5) it follows that:
$$n = \lceil \log_2 N \rceil. \tag{6}$$
- 1.
- If then so we have a singleton set . In this case .
- 2.
- If , , , and then .
- 3.
- If , , is a fully balanced tree (i.e., ), and then .
If then there are players. Then a tree can be obtained either (i) by joining two balanced trees with layers or (ii) by joining one balanced tree with layers and one fully balanced tree with layers (all its leaves are on layer ), so in both cases the balancing condition of t is properly preserved.
Proposition 2
(Structure of a balanced tournament). Let t be a balanced tournament of N players such that n is defined by Equation (6). Then the number of players starting in the first round is $N_1 = 2N - 2^n$ and the number of players starting in the second round (waivers) is $N_2 = 2^n - N$. Moreover, if $n \geq 1$ then the number of internal nodes of level 2 in the tree is equal to $M = N - 2^{n-1}$, i.e., $N_1 = 2M$ and $M + N_2 = 2^{n-1}$.
Proof.
First observe that if the number of players is a power of 2, i.e., , then , , and . This is trivially true, as in this case the tournament is fully balanced and all the players start in the first round (there are no exemptions).
The proof for the general case can be shown by induction on .
For the tournament has players. In this case there is a single balanced tournament with and , so the property trivially holds.
For the tournament has players. In this case there is a single balanced tournament with , and , so the property trivially holds.
Let us now assume that the property holds for and let us prove it for . There are two cases.
Case 1. If , , , and , let us consider such that the second condition of Definition 2 is fulfilled. According to the induction hypothesis we have , , for and . Then . Similarly and q.e.d.
Case 2. If , , is a fully balanced tree (i.e., ), and , let us consider such that the third condition of Definition 2 is fulfilled. According to the induction hypothesis we have , , , , and , , and . Then . Similarly and similarly for q.e.d.
The relations and can be now easily checked. □
Example 3
(Tournament structure design). Let us consider a tournament with $N = 5$ players. In this case there are $n = 3$ rounds, 2 players entering in the first round, 3 players entering in the second round, and 1 internal node on the second level. A tree representing a tournament with five players will have three layers such that the first layer consists of 2 leaves (players) and the second layer consists of 4 nodes, among which there is 1 internal node and 3 leaves (players). One such balanced tournament is depicted in Figure 2.
Figure 2.
A balanced tournament of five players.
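The quantities of Example 3 follow from N alone. The sketch below computes them under the formulas implied by Proposition 2 as we read it (first-round players $2N - 2^n$, waivers $2^n - N$, second-level games $N - 2^{n-1}$); the function name and the returned tuple layout are assumptions made for illustration.

```python
def structure(n_players):
    """Return (rounds, first_round_players, waivers, second_level_games)."""
    rounds = (n_players - 1).bit_length()  # ceil(log2 N) for N >= 1
    first_round = 2 * n_players - 2 ** rounds
    waivers = 2 ** rounds - n_players
    # A single player plays no game, so there are no second-level games.
    second_level_games = n_players - 2 ** (rounds - 1) if rounds >= 1 else 0
    return rounds, first_round, waivers, second_level_games


print(structure(5))  # five players: three rounds, two first-round players
print(structure(8))  # fully balanced: no waivers
```

Using `bit_length` avoids floating-point rounding issues that `math.log2` could introduce for large N.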
Proposition 3
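(Counting balanced tournaments). The set of balanced tournaments with N players contains:
$$\binom{2^{n-1}}{2^n - N} \cdot \frac{N!}{2^{N-1}} \tag{7}$$
elements.
Proof.
Let $N_1 = 2N - 2^n$ and $N_2 = 2^n - N$ denote the numbers of players entering in the first and in the second round, respectively. There are $\binom{2^{n-1}}{N_2}$ ways of choosing how the players will enter the second round. Their ordering matters, so we multiply by $N_2!$. Moreover, those players are arbitrarily chosen from the set of N players, so we multiply by $\binom{N}{N_2}$. Finally, the ordering of the remaining $N_1$ players that enter the first round matters, so we also multiply by $N_1!$. For each internal node of the tree, exchanging its left and right sub-tree is a tournament invariant. There are $N-1$ independent ways of exchanging the left and right sub-trees of the tree, so we must divide by $2^{N-1}$. We obtain:
$$\binom{2^{n-1}}{N_2} \cdot N_2! \cdot \binom{N}{N_2} \cdot N_1! \cdot \frac{1}{2^{N-1}} = \binom{2^{n-1}}{N_2} \cdot \frac{N!}{2^{N-1}}.$$
A simpler proof is obtained by thinking about structures (i.e., “shapes”) of tournament trees. The selection of the “locations” of the players entering the second round can be achieved in $\binom{2^{n-1}}{N_2}$ ways. For each tree structure defined in this way there are $N!$ permutations of the leaves (players), thus defining a total number of $\binom{2^{n-1}}{N_2} \cdot N!$ balanced tournament trees. Finally we divide by $2^{N-1}$ and we obtain Equation (7). □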
Example 4.
Let us check the number of balanced tournaments for several cases.
1. The number of balanced tournaments with $N = 5$ players can be obtained as follows:
$$\binom{4}{3} \cdot \frac{5!}{2^4} = 4 \cdot \frac{120}{16} = 30.$$
It is not difficult to verify that this result is correct. In this case two players enter in the first round. There are $\binom{5}{2} = 10$ ways of selecting the two players that will enter the first round. We obtain one separate tournament by letting the winner of the game between these two players play against each of the remaining three players in the second round. So in total there are $10 \cdot 3 = 30$ balanced tournaments with five players.
2. For $N = 6$ players we obtain:
$$\binom{4}{2} \cdot \frac{6!}{2^5} = 6 \cdot \frac{720}{32} = 135.$$
3. If $N = 2^n$ then no player receives a waiver, thus we obtain our result for fully balanced tournaments from [1] stating that the total number of fully balanced tournaments is given by:
$$\frac{N!}{2^{N-1}} = \frac{(2^n)!}{2^{2^n - 1}}.$$
The number of fully balanced tournaments with two rounds is $4!/2^3 = 3$. Observe that applying the formula, we obtain $8!/2^7 = 315$ fully balanced tournaments with three rounds. Let us obtain this result using a different reasoning. Let us count the number of partitions of a set with eight elements into two subsets of four elements each. There are $\binom{8}{4}/2 = 35$ possibilities, as we consider the 4-combinations of eight elements and we divide by two because the order of the subsets of a partition does not matter. For each of the two subsets of a partition there are three fully balanced tournaments of two rounds, so multiplying we obtain a total of nine possibilities. So the number of three-stage tournaments is $35 \cdot 9 = 315$.
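The cases of Example 4 can be reproduced with a few lines of code. This is a sketch under the assumption that the count of Proposition 3 is $\binom{2^{n-1}}{2^n - N} \cdot N!/2^{N-1}$ with $n = \lceil \log_2 N \rceil$, which agrees with the values 30 and 315 stated in the text; the function name is hypothetical.

```python
from math import comb, factorial


def count_balanced(n_players):
    """Number of balanced knockout tournaments over n_players players."""
    if n_players == 1:
        return 1  # a single player, no games
    rounds = (n_players - 1).bit_length()  # ceil(log2 N)
    waivers = 2 ** rounds - n_players
    # Choose the second-round slots taken by waivers, label the leaves,
    # then identify trees that differ only by swapping sub-trees.
    return comb(2 ** (rounds - 1), waivers) * factorial(n_players) // 2 ** (
        n_players - 1
    )


for n in (3, 4, 5, 6, 8):
    print(n, count_balanced(n))
```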
4. Optimal Tournaments
Each round of a tournament with n rounds defines possible games between players. Note that in a given tournament any two players can play in a game at one and only one of its rounds. This follows from the fact that for any two leaves of a binary tree there is a unique closest common ancestor. It follows that the tournament round where players can meet is a unique value in and it is well defined. For example, referring to the tournament shown in Figure 2, , , and .
Intuitively, the higher the quotations of players i and j, the better it is to let them meet in a higher stage of the tournament in order to increase the stakes of their games.
We assume in what follows that a quotation is available for each player . Quotations can be obtained from the players’ current ranking (as for example in international tennis tournaments ATP and WTA) or by other means.
Definition 3
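(Tournament cost). Let t be a tournament with n rounds and let $s_t(i, j)$ be the stage of t where players $i, j$ can meet. Let $q_i$ be the quotations of players i for all $i \in \Sigma$. The cost of t is defined as:
$$cost(t) = \sum_{i < j} q_i \cdot q_j \cdot s_t(i, j). \tag{12}$$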
Definition 4
(Optimal tournament). A tournament such that its cost computed with Equation (12) is maximal is called an optimal tournament and it is defined by:
Obviously, better ranked players have a higher quotation. We assume that if player i has rank then its quota is such that whenever we have . For example, if there are players then we can choose for all .
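To make the cost of a tournament concrete, the following sketch evaluates it on small trees. It rests on two assumptions that are not confirmed by the surviving text: the cost is taken as the sum over all pairs of players of the product of their quotations times the stage where they can meet, and the stage of a game is the number of rounds minus the depth of its node (valid only for balanced trees). Trees are nested `frozenset` pairs and all names are hypothetical.

```python
def height(tree):
    """Number of rounds of the tournament (depth of the deepest leaf)."""
    if not isinstance(tree, frozenset):
        return 0
    return 1 + max(height(child) for child in tree)


def cost(tree, quota):
    """Sum of q_i * q_j * stage(i, j) over all pairs (assumed cost form)."""
    rounds = height(tree)
    total = 0

    def visit(node, depth):
        nonlocal total
        if not isinstance(node, frozenset):
            return [node]
        left, right = node  # the two sub-trees, in arbitrary order
        left_players = visit(left, depth + 1)
        right_players = visit(right, depth + 1)
        stage = rounds - depth  # players from both sides meet here
        total += (
            stage
            * sum(quota[i] for i in left_players)
            * sum(quota[j] for j in right_players)
        )
        return left_players + right_players

    visit(tree, 0)
    return total


quota = {1: 4, 2: 3, 3: 2, 4: 1}  # illustrative quotations, rank 1 is best
for pairing in (((1, 2), (3, 4)), ((1, 3), (2, 4)), ((1, 4), (2, 3))):
    t = frozenset(frozenset(p) for p in pairing)
    print(pairing, cost(t, quota))
```

With these illustrative quotations, the pairing that keeps the two best players apart until the final has the largest cost, which is the behavior the definition is meant to reward.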
Example 5.
Let us consider the tournaments , , and with three players shown in Figure 3. Let us introduce:
Figure 3.
Balanced tournaments with three players.
We obtain:
The ordering of the costs depends on the ordering of the values of the function for . This function is monotonically increasing on and monotonically decreasing on . Observe that if , i.e., if neither player gets more than a half of the total quotation stake, then the ordering of the costs is given by the ordering of the quotations .
Example 6.
Let us consider four players (see Table 1). We assume that each player has a unique rank from 1 to 4. Now, if we choose then, using this approach for defining players’ quota, player 2 with rank 4 is assigned quotation . We consider the three tournaments from Figure 4. According to Equation (12), the cost of a tournament is:
Table 1.
Players’ ranking and quotation for .
Figure 4.
Balanced tournaments with four players.
Substituting stage values for each tournament from Table 2 into Equation (16) we obtain the tournaments’ cost values from Table 2. We observe that in this case the best tournament is . Actually it can be easily checked that the best tournament is for whatever values of the quota that are decreasingly ordered according to the ranks.
Table 2.
Games playing stages for each tournament of Figure 4 and their costs.
5. Dynamic Programming Algorithm for Computing Optimal Tournaments
For any set of players we denote by the sum of quotations of the players in .
Proposition 4
(Recurrence for tournament cost). Let be an n-stage tournament such that Σ is a finite nonempty set with elements. Then:
Proof.
If then has a single player so the result is obvious, as no games are played to determine the winner of the tournament.
If then has two players i and j so the result is obvious, as a single game is played to determine the winner of the tournament, between player i and player j.
If then . If and then . So Equation (12) gives:
The conditions from the Equation (18) follow directly from the recursive definition of balanced tournaments (Definition 2). □
Proposition 5
(Recurrences for optimal tournaments).
- 1.
- The optimal tournament cost introduced by Equation (13) can be defined recursively as follows:
- 2.
- The optimal tournament can be determined by recording the pairs of subsets that maximize in Equation (20) for , as follows:
Proof.
The proof follows by applying the maximization operation in Equation (18) and observing that the term does not depend on . The condition ensures that a pair is uniquely considered (otherwise each pair will be considered twice as and ).
Moreover, the sets can be used to construct an optimal tournament. Let . We define: , …, . Then the optimal tournament can be defined recursively as follows:
□
Proposition 5 (Equation (20) in particular) can be used to design a dynamic programming algorithm for computing the optimal tournament and its cost. The dynamic programming algorithm can be implemented either with a bottom-up approach or using a top-down approach with memoization [17]. We will explore these possibilities in what follows by deriving a bottom-up dynamic programming algorithm for fully balanced tournaments as well as a top-down dynamic programming algorithm with memoization for the general case.
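The top-down idea with memoization can be sketched compactly. This is not Algorithm 1 from the paper: it is an illustrative variant that assumes the pairwise-product cost (quotation of i times quotation of j times meeting stage), represents trees as nested `frozenset` pairs, and tracks the number of rounds left for each sub-tree instead of using the split bounds of Proposition 6. The names `optimal_tournament`, `fits`, and `best` are hypothetical, and players must be mutually comparable (e.g., integers).

```python
from functools import lru_cache
from itertools import combinations


def optimal_tournament(quota):
    """Return (cost, tree) of a maximum-cost balanced tournament."""
    players = frozenset(quota)
    n_rounds = (len(players) - 1).bit_length()  # ceil(log2 N)

    def weight(subset):
        return sum(quota[i] for i in subset)

    def fits(size, rounds):
        # A sub-tree with `rounds` rounds left holds 2^(rounds-1)..2^rounds
        # players (a single player fits 0 or 1 rounds, i.e., a waiver).
        return size <= (1 << rounds) <= 2 * size

    @lru_cache(maxsize=None)
    def best(subset, rounds):
        if len(subset) == 1:
            (leaf,) = subset
            return 0, leaf
        first = min(subset)  # pin one player to visit each split once
        rest = sorted(subset - {first})
        result = None
        for k in range(len(rest)):
            size_a, size_b = k + 1, len(rest) - k
            if not (fits(size_a, rounds - 1) and fits(size_b, rounds - 1)):
                continue
            for extra in combinations(rest, k):
                part_a = frozenset((first,) + extra)
                part_b = subset - part_a
                cost_a, tree_a = best(part_a, rounds - 1)
                cost_b, tree_b = best(part_b, rounds - 1)
                # Players from different parts meet at stage `rounds`.
                total = cost_a + cost_b + rounds * weight(part_a) * weight(part_b)
                if result is None or total > result[0]:
                    result = (total, frozenset({tree_a, tree_b}))
        return result

    return best(players, n_rounds)


print(optimal_tournament({1: 4, 2: 3, 3: 2, 4: 1}))
```

The memo table is keyed by subsets of players, so the sketch is exponential in N and only suitable for small inputs, in line with the complexity discussion later in this section.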
5.1. Top-Down Dynamic Programming Algorithm with Memoization
Proposition 6
Proof.
As and we obtain:
As and we obtain:
It is not difficult to see that and so the value of right-hand side of Equation (26) is , q.e.d. □
Observe that for fully balanced tournaments $N = 2^n$, so this top-down recursive process will generate subsets of sizes $N, N/2, N/4, \ldots, 1$.
Example 7.
Let us illustrate the application Equation (20) for players. The results are summarized in Table 3. It follows that solving the problem for a set of 25 players requires the solving of all subproblems corresponding to its subsets of players; however, solving the problem for a set of 15 players requires the solving of all subproblems corresponding to its subsets of players. Moreover, solving the problem for a set of 14 players requires the solving of all subproblems corresponding to its subsets of players, while solving the problem for a set of 12 or 13 players requires the solving of all subproblems corresponding to its subsets of players.
Table 3.
Games playing stages for each tournament and tournament costs.
Proposition 6 sets the iteration bounds for exploring the subsets of players in the top-down approach. Combining the results of Propositions 5 and 6, we obtain the top-down approach for computing optimal balanced tournaments; see Algorithm 1.
| Algorithm 1 top-down dynamic programming algorithm with memoization for computing the cost of the optimal tournament. |
| Global:, initially ∅, maps subsets of players to costs of optimal tournaments. |
| , initially ∅, maps subsets to sub-subsets for building optimal tournaments. |
| Input: represents the number of players. |
| q. Vector of size N representing the players’ quota. |
| such that . represents the set of players. |
| Output: . Cost of the optimal tournament for set of players. |
| if then |
| else if (i.e., ) then |
| else |
| if then |
| else |
| end if |
| for do |
| for s.t. do |
| if then |
| else |
| end if |
| if then |
| else |
| end if |
| for do |
| end for |
| for do |
| end for |
| if then |
| end if |
| end for |
| end for |
| end if |
5.2. Bottom-Up Dynamic Programming Algorithm for Fully Balanced Tournaments
The cost of the optimal tournament is computed with the help of vector that is indexed by all the subsets of generated recursively by Equation (20), starting from the topmost set . Note that for a fully balanced tournament we have and the process will generate exactly all the subsets of of cardinal: , , …. Note that in this case the size of can be determined as:
Additionally we must save in vector of size the subsets determined using Equation (21), such that we can reuse them to construct the optimal tournament using Equation (22). Our proposed algorithm is presented as Algorithm 2.
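The size of the cost table for fully balanced tournaments can be tabulated to see how quickly it grows. The sketch below assumes that the table holds one entry for every subset whose cardinality is a power of two between 1 and N, which is our reading of the description above; whether singletons are stored is an assumption, and the function name is hypothetical.

```python
from math import comb


def table_size(rounds):
    """Number of subsets of sizes 2^rounds, 2^(rounds-1), ..., 1."""
    n_players = 2 ** rounds
    return sum(comb(n_players, 2 ** k) for k in range(rounds + 1))


for rounds in range(1, 6):
    print(2 ** rounds, table_size(rounds))
```

The term for subsets of size N/2 dominates the sum, which is why the central binomial coefficient drives the space bound.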
| Algorithm 2 bottom up dynamic programming algorithm for computing the cost of optimal fully balanced tournaments. |
| Input: . N represents the number of players. |
| q. Vector of size N representing the players’ quota. |
| such that . represents the set of players. |
| Output: . Vector of costs of the optimal sub-tournaments. |
| . Vector of sets to construct the optimal tournament. |
| 1: for do |
| 2: |
| 3: end for |
| 4: for do |
| 5: for s.t. do |
| 6: |
| 7: for s.t. do |
| 8: |
| 9: |
| 10: for do |
| 11: |
| 12: end for |
| 13: |
| 14: for do |
| 15: |
| 16: end for |
| 17: |
| 18: if then |
| 19: |
| 20: |
| 21: end if |
| 22: end for |
| 23: |
| 24: |
| 25: end for |
| 26: end for |
5.3. Computing an Optimal Tournament
Note that both Algorithms 1 and 2 determine the structure that records the split points for each subset of players according to Equation (20). The structure can be used to actually build an optimal tournament according to Algorithm 3 using Equation (22).
| Algorithm 3: algorithm for computing the optimal tournament. |
| Input: representing the set of players. |
| representing the number of players. |
| . Structure determined either by Algorithm 1 or by Algorithm 2. |
| Output: Returns the optimal tournament. |
| 1: if (i.e., ) then |
| 2: return j |
| 3: end if |
| 4: |
| 5: |
| 6: |
| 7: |
| 8: |
| 9: return |
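The reconstruction step can be illustrated independently of how the split table was computed. In this sketch the table is assumed to be a dictionary mapping each set of players to the pair of subsets recorded as its best split; this layout and the name `build_tournament` are assumptions, not the paper's data structure.

```python
def build_tournament(players, split):
    """Rebuild a tournament tree from recorded splits.

    players: frozenset of players.
    split: dict mapping a frozenset to its recorded pair of sub-sets.
    """
    if len(players) == 1:
        (leaf,) = players
        return leaf
    part_a, part_b = split[players]
    return frozenset(
        {build_tournament(part_a, split), build_tournament(part_b, split)}
    )


# Hand-written split table for four players (illustrative values only).
split = {
    frozenset({1, 2, 3, 4}): (frozenset({1, 4}), frozenset({2, 3})),
    frozenset({1, 4}): (frozenset({1}), frozenset({4})),
    frozenset({2, 3}): (frozenset({2}), frozenset({3})),
}
print(build_tournament(frozenset({1, 2, 3, 4}), split))
```

Each set of players is visited once, so the number of recursive calls equals the number of nodes of the resulting tree, 2N − 1.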
5.4. Correctness and Complexity Results
Proposition 7
(Correctness of Algorithms 1–3).
- a.
- The value computed by Algorithms 1 and 2 represents the cost of the optimal tournament in both cases.
- b.
- The tournament determined by Algorithm 3 is the optimal tournament.
Proof.
Proof of a. Algorithms 1 and 2 compute the values of and either in top-down or bottom-up fashion for all the subsets that are required to determine the optimal tournament for the set of players. The computation follows Equations (20) and (21); therefore the correctness of this point follows from Propositions 5 and 6.
Proof of b. Algorithm 3 computes the optimal tournament using Equations (22). As values of are correctly determined according to point “a”, it follows that the tournament computed by Algorithm 3 is the optimal tournament. □
Proposition 8
(Complexity of Algorithms 1–3). Let us consider tournaments with N players.
- a.
- Space complexity of Algorithm 2 is .
- b.
- Time complexity of Algorithm 2 is .
- c.
- Space complexity of Algorithm 1 is for fully balanced tournaments and in the general case.
- d.
- Time complexity of Algorithm 1 is .
- e.
- Time complexity of Algorithm 3 is .
Proof.
The proof is using the Stirling approximation of the factorial, written in inequality form ([18]), in fact showing that :
Using this observation it is not difficult to prove that:
Proof of a. The space complexity of Algorithms 2 and 3 is given by the size of structures and (see Equation (27)); however, the asymptotically dominant term of this summation is . Then the result follows using Equation (29) for .
Proof of b. Algorithm 2 contains one “for” loop (lines 4–26) including other three nested “for” loops. The first inner “for” loop (lines 5–25) is executed times. The second inner “for” loop (lines 7–22) is executed times. The third inner “for” loop (lines 10–12 and 14–16) is executed times. The total number of steps of Algorithm 2 is given by:
Observe that the asymptotically dominant term of this summation is obtained for and it can be transformed using Equation (29), thus concluding the proof:
Proof of c. If N is a power of two, i.e., we have a fully balanced tournament, the memory consumption is exactly as in case a, so the first result follows trivially. Otherwise, the space consumption of tables and has an upper bound given by the size of the power set of , and the result follows immediately.
Proof of d. Let us first observe that in order to determine the cost of an optimal tournament with N players we need to know the costs of optimal tournaments for and players such that and condition of Equation (23) holds. It is not difficult to observe that:
First note that the upper bound of follows from the lower bound of , so it is enough to show the lower bound of . Let us assume by contradiction that:
It is easier to analyze the complexity of Algorithm 1 by thinking “bottom-up” rather than “top-down”. The complexity will be the same, as the role of the memoization technique is just to evaluate exactly once the cost of a tournament for each subset of players. So we must determine the cost for a subset of players ( can be omitted without loss of generality); therefore the total running time has the following upper bound:
Although , so grouping terms with complementary binomial coefficients of inner sum of (36), noticing that and using (29), the inner sum has an upper bound of:
As this is in fact only an upper bound of our running time, the result of point d follows (i.e., with O rather than Θ).
Proof of e. Observe that the time complexity of Algorithm 3 satisfies the recursive equation . Unfolding this equation with the substitution method yields an asymptotic execution time . □
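The Stirling-based step of the proof above can be sanity-checked numerically. The following minimal sketch assumes that the quantity being bounded is a central binomial coefficient; the surrounding argument suggests this, but the bound itself is not reproduced here. It only illustrates the standard fact that C(2m, m) grows like 4^m / sqrt(pi·m) and is not part of the proof. The helper name `central_ratio` is hypothetical.

```python
from math import comb, pi, sqrt


def central_ratio(m: int) -> float:
    """Return C(2m, m) * sqrt(pi * m) / 4**m, which tends to 1 as m grows."""
    # Dividing two Python integers yields a correctly rounded float,
    # so the huge intermediate values never overflow.
    return comb(2 * m, m) / 4 ** m * sqrt(pi * m)


# The ratio approaches 1 from below as m increases.
for m in (1, 10, 100, 1000):
    print(m, central_ratio(m))
```

A ratio that stays bounded away from 0 and below 1 is what makes a central binomial coefficient of order 2^N / sqrt(N) when N = 2m.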
5.5. Sub-Optimal Algorithms
The dynamic programming approach for constructing optimal tournaments has the disadvantage that the full exploration of the search space becomes prohibitive for larger tournaments. Our experiments (see Section 6) clearly show that this approach is infeasible for tournaments of more than players; however, the dynamic programming algorithms can be easily adapted to explore a smaller part of the search space, leading to sub-optimal solutions. The exploration strategy can be used to tune the trade-off between the complexity of the computation and the “gap” between the provided sub-optimal solution and the actual optimal solution.
The resulting sub-optimal algorithms follow a strictly top-down approach that can best be described as divide-and-conquer. At each decision point, rather than exploring all pairs of subsets satisfying the conditions of Equation (20), only a few such pairs (ideally only one) are selected for exploration. This selection strategy can be deterministic, based on heuristic principles, or stochastic, based on random sampling of subsets of satisfying the conditions of Equation (20). The strictly top-down approach has the advantage that it avoids the use of temporary structures and and of the additional algorithm to build the solution. Instead, the solution is built directly by the recursive divide-and-conquer procedure.
The general form of a sub-optimal algorithm following the top-down divide-and-conquer approach is presented as Algorithm 4. Note that this algorithm uses a specific strategy to explore only a few subsets of defined by and satisfying the conditions of Equation (20).
Proposition 9
(Complexity of Algorithm 4). Let us consider tournaments for N players and let n be the number of stages of a balanced tournament. Then the time complexity of Algorithm 4 is where s is the average number of subsets of Σ explored by the strategy of the algorithm.
Proof.
It is not difficult to observe that the time complexity of Algorithm 4 satisfies the following recurrence:
Applying the substitution method for Equation (39) we obtain:
For each , from Equation (40) denotes the height of leaf i in the tournament tree, so . So:
q.e.d. □
| Algorithm 4 top-down divide-and-conquer algorithm for computing a sub-optimal tournament. |
| Input: represents the number of players. |
| q. Vector of size N representing the players’ quota. |
| such that . represents the set of players. |
| . Sum of players’ quotations. |
| Output:. Cost of the sub-optimal tournament for set of players. |
| . Sub-optimal tournament tree. |
| 1: |
| 2: if (i.e., ) then |
| 3: |
| 4: |
| 5: else if (i.e., ) then |
| 6: |
| 7: |
| 8: else |
| 9: |
| 10: |
| 11: |
| 12: if then |
| 13: |
| 14: else |
| 15: |
| 16: end if |
| 17: |
| 18: for do |
| 19: |
| 20: for do |
| 21: |
| 22: end for |
| 23: |
| 24: |
| 25: |
| 26: |
| 27: if then |
| 28: |
| 29: |
| 30: end if |
| 31: end for |
| 32: end if |
| 33: return |
Observe that if then the time complexity of Algorithm 4 is linear in the number N of players. Moreover, if then the time complexity of the algorithm is polynomial in N and the degree of the polynomial grows logarithmically with s.
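The control flow of Algorithm 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The match value used here (stage number times the product of the two sub-sums of quotations) is an assumed placeholder for the cost equation, which is not reproduced in this section. The names `solve` and `middle_split` are hypothetical, and the sketch uses a single base case (one player) instead of the two base cases of Algorithm 4.

```python
from typing import Callable, List, Sequence, Tuple, Union

Tree = Union[int, Tuple["Tree", "Tree"]]
Strategy = Callable[[List[int], Sequence[float]], List[List[int]]]


def middle_split(players: List[int], q: Sequence[float]) -> List[List[int]]:
    """Hypothetical strategy with s = 1: propose only the first half of the list."""
    return [players[: len(players) // 2]]


def solve(players: List[int], q: Sequence[float],
          strategy: Strategy) -> Tuple[float, Tree, int]:
    """Return (cost, tree, height) using only the splits proposed by `strategy`."""
    if len(players) == 1:
        return 0.0, players[0], 0  # a single player needs no match
    best = None
    for left in strategy(players, q):
        chosen = set(left)
        right = [p for p in players if p not in chosen]
        cost_l, tree_l, height_l = solve(left, q, strategy)
        cost_r, tree_r, height_r = solve(right, q, strategy)
        height = 1 + max(height_l, height_r)
        # ASSUMPTION: a match is valued as its stage number times the
        # product of the total quotations of the two sides.
        sum_l = sum(q[p] for p in left)
        sum_r = sum(q[p] for p in right)
        cost = cost_l + cost_r + height * sum_l * sum_r
        # Keep the best split seen so far (the sketch maximizes the cost).
        if best is None or cost > best[0]:
            best = (cost, (tree_l, tree_r), height)
    return best


quotas = [1, 2, 3, 4]
cost, tree, height = solve(list(range(len(quotas))), quotas, middle_split)
print(cost, tree, height)
```

Passing a strategy that returns several candidate subsets makes the loop explore each of them recursively, which is where the dependence of the running time on s comes from.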
5.5.1. Deterministic Sub-Optimal Algorithms
We define a deterministic sub-optimal algorithm by letting consist of the smallest set of players such that and .
The rationale of this choice is to try to make the product from Equation (20) as high as possible. As is constant, we try to make the values and as close as possible, while maintaining the constraints on the size of subset .
We can define three variants of the deterministic sub-optimal algorithm by considering the sequence of players’ quotations to be: (i) unchanged, i.e., as it was provided as input; (ii) sorted in increasing order; (iii) sorted in decreasing order.
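One possible reading of this deterministic strategy is sketched below. It rests on two assumptions that are not confirmed by the text above: the chosen subset is a prefix of the quotation sequence, and the size constraint requires both sides of the split to fit into a balanced tournament with one fewer stage. The names `prefix_split_size` and `variants` are hypothetical.

```python
from typing import Dict, List, Sequence


def prefix_split_size(quotas: Sequence[float]) -> int:
    """Return the size k of the shortest admissible prefix whose total
    quotation reaches at least half of the overall total."""
    n_players = len(quotas)
    if n_players < 2:
        raise ValueError("at least two players are needed to split")
    # ASSUMPTION: each side may hold at most 2**(n - 1) players, where
    # n = ceil(log2(N)) is the number of stages of a balanced tournament.
    half_capacity = 1 << ((n_players - 1).bit_length() - 1)
    k = n_players - half_capacity          # smallest admissible prefix size
    running = sum(quotas[:k])
    total = sum(quotas)
    # Grow the prefix until it holds half of the total or hits the size cap.
    while k < half_capacity and 2 * running < total:
        running += quotas[k]
        k += 1
    return k


def variants(quotas: Sequence[float]) -> Dict[str, List[float]]:
    """Return the three input orderings considered by the deterministic variants."""
    return {
        "unchanged": list(quotas),
        "increasing": sorted(quotas),
        "decreasing": sorted(quotas, reverse=True),
    }


for name, sequence in variants([4, 1, 3, 2, 5]).items():
    print(name, sequence, prefix_split_size(sequence))
```

Under these assumptions the split size is forced to N/2 whenever N is an exact power of 2, so only the ordering of the quotations influences the result in that case.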
5.5.2. Stochastic Sub-Optimal Algorithms
We define a stochastic sub-optimal algorithm by letting consist of a family of randomly chosen subsets of players of such that . This is easily achieved by randomly choosing the number of players k uniformly distributed in and then randomly choosing a subset of k elements, uniformly distributed in .
The number of chosen subsets explored by the algorithm is a parameter denoted by and it is usually kept small, as it directly influences the complexity of the algorithm according to Proposition 9, . For example, if the complexity of the algorithm is , if the complexity of the algorithm is and if the complexity of the algorithm is .
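The effect of the number of samples on the polynomial degree can be made explicit with a short derivation. It is a sketch that assumes the bound of Proposition 9 has the form N·s^n, with n the number of stages of a balanced tournament and s treated as a constant; this form is inferred from the proof outline and is not quoted from the proposition.

```latex
T(N) \le c \, N \, s^{n}, \qquad n = \lceil \log_2 N \rceil
\quad\Longrightarrow\quad
s^{n} \le s^{\,1 + \log_2 N} = s \, N^{\log_2 s}
\quad\Longrightarrow\quad
T(N) = O\!\left(N^{\,1 + \log_2 s}\right)
```

Under this assumption, s = 1 gives O(N), s = 2 gives O(N^2), and s = 3 gives O(N^{1 + log_2 3}), i.e., roughly O(N^{2.585}).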
6. Implementation and Experiments
6.1. Implementation Issues
There were several issues that we had to address in our experimental implementation of Algorithms 1–3.
Firstly, we chose to represent sets as arrays of bits, and also as the integer value whose binary representation is that array of bits.
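The two set representations can be converted into each other as in the following minimal sketch. The function names are hypothetical, and the bit order (player i corresponds to bit i of the integer) is an assumption made for illustration.

```python
from typing import List


def bits_to_mask(bits: List[int]) -> int:
    """Pack a characteristic vector into an integer (bits[i] -> bit i)."""
    mask = 0
    for i, bit in enumerate(bits):
        if bit:
            mask |= 1 << i
    return mask


def mask_to_bits(mask: int, n: int) -> List[int]:
    """Unpack an integer into a characteristic vector of length n."""
    return [(mask >> i) & 1 for i in range(n)]


def members(mask: int) -> List[int]:
    """List the indices of the players contained in the subset."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


subset = bits_to_mask([1, 0, 1, 1])      # players 0, 2 and 3
full = (1 << 4) - 1                      # all four players
complement = full ^ subset               # set difference via XOR with the full set
print(subset, members(subset), members(complement))
```

With the integer form, the complement of a subset within the full set of players is a single XOR, which is convenient when a split is described by one subset only.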
Secondly, for generating subsets of a given size (i.e., combinations) we used Algorithm 7.2.1.2L from [19] for generating permutations with repetitions of binary arrays. Basically, the subsets representing combinations of k elements of a set with n elements are all the permutations with repetitions of a binary vector of n elements containing exactly k elements equal to 1.
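The idea can be sketched with the classical lexicographic next-permutation scheme applied to a binary vector. The code below is an independent illustration of that scheme, not the authors' code; the generator name `binary_arrangements` is hypothetical.

```python
from math import comb
from typing import Iterator, Tuple


def binary_arrangements(n: int, k: int) -> Iterator[Tuple[int, ...]]:
    """Yield every 0/1 vector of length n with exactly k ones, in lexicographic order."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    a = [0] * (n - k) + [1] * k            # lexicographically smallest arrangement
    while True:
        yield tuple(a)
        # Find the rightmost position j with a[j] < a[j + 1].
        j = n - 2
        while j >= 0 and a[j] >= a[j + 1]:
            j -= 1
        if j < 0:
            return                         # the last arrangement has been visited
        # Find the rightmost element larger than a[j] and swap it in.
        l = n - 1
        while a[j] >= a[l]:
            l -= 1
        a[j], a[l] = a[l], a[j]
        # Reverse the tail to obtain the smallest continuation.
        a[j + 1:] = reversed(a[j + 1:])


subsets = list(binary_arrangements(4, 2))
print(len(subsets), comb(4, 2))
```

Each yielded vector is the characteristic vector of one k-element subset, so the number of vectors equals the binomial coefficient C(n, k).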
Thirdly, we had to choose an efficient representation of the and structures. Their operation is crucial for the efficient implementation of some of our algorithms. As we chose the Python platform for our implementation, we decided to implement and using subset-indexed dictionaries that map subsets of to costs and to subsets necessary for building optimal tournaments, respectively. The subsets representing the dictionary keys are defined as integer values of their characteristic vector in binary format. As Python dictionaries are efficiently implemented using hash tables, an average O(1) time complexity is expected for lookup operations.
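The subset-indexed dictionary pattern can be illustrated on a simpler quantity than the actual cost tables. The sketch below memoizes the total quotation of a subset in a dictionary keyed by the subset's integer value. The names `make_subset_sum` and `memo` are hypothetical, and the bit order (player i corresponds to bit i) is an assumption.

```python
from typing import Callable, Dict, Sequence, Tuple


def make_subset_sum(q: Sequence[float]) -> Tuple[Callable[[int], float], Dict[int, float]]:
    """Return a memoized function giving the total quotation of a subset,
    together with the dictionary that stores the already computed values."""
    memo: Dict[int, float] = {0: 0}        # the empty subset has total 0

    def subset_sum(mask: int) -> float:
        if mask not in memo:               # hash-table lookup on an integer key
            low = mask & -mask             # isolate the lowest set bit
            player = low.bit_length() - 1  # index of the corresponding player
            memo[mask] = q[player] + subset_sum(mask ^ low)
        return memo[mask]

    return subset_sum, memo


subset_sum, memo = make_subset_sum([3, 5, 7])
print(subset_sum(0b101), sorted(memo))
```

The same keying scheme applies to any table indexed by subsets: each subset is evaluated at most once, and later requests are answered by a dictionary lookup.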
Finally, for the implementation of the random selection of subsets we used the array-of-bits representation of sets and applied the random.permutation function from the NumPy package to return a randomly permuted array representing a random subset.
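The random selection of a subset can be sketched as follows. To keep the example dependency-free, it uses `random.shuffle` from the Python standard library as a stand-in for the NumPy permutation call. The function name `random_subset_bits` and the explicit size bounds `k_min` and `k_max` are hypothetical parameters introduced for illustration.

```python
import random
from typing import List


def random_subset_bits(n: int, k_min: int, k_max: int, rng=random) -> List[int]:
    """Return a random characteristic vector of length n whose number of ones
    is drawn uniformly from k_min..k_max (inclusive)."""
    k = rng.randint(k_min, k_max)          # uniformly chosen subset size
    bits = [1] * k + [0] * (n - k)         # any vector with exactly k ones
    rng.shuffle(bits)                      # uniform random arrangement of the ones
    return bits


# A seeded generator makes the sampling reproducible.
print(random_subset_bits(8, 2, 5, random.Random(2021)))
```

Shuffling a vector with exactly k ones makes every k-element subset equally likely, which matches the two-step sampling described for the stochastic strategy.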
6.2. Experimental Results
Our experiments were developed in Python 3.7.3 using Jupyter Notebook on an x64-based PC with a 2-core/4-thread Intel® Core™ i7-5500U CPU at 2.40 GHz running Windows 10. The experimental code is available at http://software.ucv.ro/~cbadica/tour.zip (accessed on 2 September 2021).
To the best of our knowledge, there are no algorithms directly available to be compared with our own proposals. There are two main reasons for this. Firstly, we consider an integrated approach to tournament design, rather than a process involving two separate stages for structure design and seeding. Secondly, we do not use probabilistic information in our cost function, thus hindering the direct comparison of tournament cost values.
We therefore took a different path for evaluating our proposals. We implemented the optimal algorithms, as well as several versions of the sub-optimal algorithms, and then compared their outcomes in terms of running time and optimality. In total, we implemented and experimentally evaluated eight algorithms, as presented in Table 4.
Table 4.
Table presenting the list of implemented optimal and sub-optimal algorithms for balanced tournaments.
Note that for the optimal algorithms there are at least two restrictions that hinder a complete experimental comparison with the rest of the algorithms. Firstly, their high computational complexity limits their applicability to a small number of players. Secondly, the dynamic programming bottom-up algorithm works only for a number of players that is an exact power of 2. We have only checked it for .
Our data set includes multiple sequences of players’ quotations. For each we generated a sequence of quotations with integer values randomly chosen with a uniform distribution in the interval . This data set was used to experimentally evaluate the algorithms from Table 4, as follows:
- All the sequences of the data set were used for testing algorithms and for .
- The optimal algorithm was evaluated only for sequences corresponding to players. The reason is that the algorithm has too high a computational complexity and we limited the running time of each problem instance to 5 min.
- The optimal algorithm was evaluated only for sequences corresponding to players. The reason is both the high computational complexity of the algorithm and the fact that this algorithm was designed to work only with a number of players that is an exact power of 2.
For each algorithm, we recorded the (sub-)optimal cost of the output tournament, as well as the running time. Stochastic algorithms for were evaluated by repeating their execution 10 times for each input sequence of quotations from the data set and recording the minimum, maximum, and average costs, as well as the average running times.
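The measurement protocol described above can be sketched as a small harness. It is a minimal illustration, assuming that an algorithm is exposed as a callable that takes the sequence of quotations and returns a cost. The function name `evaluate` and the keys of the returned dictionary are hypothetical.

```python
import time
from statistics import mean
from typing import Callable, Dict, Sequence


def evaluate(algorithm: Callable[[Sequence[float]], float],
             quotas: Sequence[float],
             repetitions: int = 10) -> Dict[str, float]:
    """Run `algorithm` repeatedly on the same input and summarize costs and times."""
    costs, times_ms = [], []
    for _ in range(repetitions):
        start = time.perf_counter()
        cost = algorithm(quotas)
        times_ms.append((time.perf_counter() - start) * 1000.0)  # milliseconds
        costs.append(cost)
    return {
        "min_cost": min(costs),
        "max_cost": max(costs),
        "avg_cost": mean(costs),
        "avg_time_ms": mean(times_ms),
    }


# `sum` is used here only as a stand-in for a real tournament algorithm.
print(evaluate(sum, [1, 2, 3]))
```

For a deterministic algorithm the minimum, maximum, and average costs coincide, so the repetitions only serve to stabilize the timing.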
Figure 5 presents the sub-optimal costs produced by and algorithms for . The figure plots costs produced by algorithms for , as well as average costs produced by algorithms for and the maximum cost produced by the 10-times repeated execution of algorithm . Note that we included the maximum cost only for this case, as the stochastic algorithm is expected to produce the best results among for because it uses the highest number of samplings .
Figure 5.
Sub-optimal tournament costs determined by sub-optimal algorithms on different quotations’ sequences for various number of players.
Analyzing Figure 5, we first observe that the relative difference of the costs produced by the various algorithms on the same input sequence is rather low. This is expected, as quotations were generated as integer values from a small interval while the cost tends to reach significantly higher values. For example, analyzing in detail the results obtained for the sample with players we observe that the relative difference between the smallest and the highest cost obtained (147,555 and 156,203) is only . We can also notice that the best results among sub-optimal algorithms were obtained by algorithm , while the worst results were obtained by algorithms and . It might seem surprising that algorithm appears to be superior to and ; however, taking into account how the data set was generated, this could be explained by the fact that algorithm actually uses a random permutation of the quotations’ sequence that provides a better balance of the total quotation distribution between the two subsets and (see Algorithm 4) than algorithms and . One final remark, also observed experimentally, is that algorithms and produce the same sub-optimal costs if the number of players is an exact power of 2 (i.e., 16 and 32 in Figure 5). This is an immediate consequence of the logic behind their strategy definition.
Figure 6 presents the running times and of algorithms and for . The running times are given in milliseconds, plotted on a logarithmic scale, and computed as the average over 10 executions of the algorithm on the same input data. First observe that the deterministic versions are the fastest and have virtually the same running times. This can be easily explained by the low computational complexity of the implementation of their underlying strategies. Basically, their strategies use the same mechanism, while the additional sorting of the sequence of quotations adds a negligible cost as it is performed before the actual core processing of the algorithms. Second, the highest execution time is achieved, as expected, by . This algorithm has the highest computational complexity among sub-optimal algorithms, as it uses three subset samples during the top-down search. From Figure 6 it also follows that the highest average execution time was obtained for the sequence of quotations with players and its value was s.
Figure 6.
Running times on logarithmic scale of sub-optimal algorithms on different quotations’ sequences for a variable number of players.
Figure 7 presents results obtained with optimal algorithms and , as well as their comparison with results obtained by sub-optimal algorithm for players, on our input data set.
Figure 7.
Comparing costs and running times of our implemented algorithms.
In Figure 7a we show the comparison of relative maximum and average costs obtained by algorithm ( and ) with the actual optimal cost obtained by algorithm . The relative sub-optimal cost is a measure computed with Equation (42) using the absolute values of sub-optimal cost and optimal cost . Observe that if and only if , i.e., if the algorithm providing sub-optimal cost is in fact optimal. Note that the computation of the relative sub-optimal cost assumes that the exact value of the optimal cost is known. In our case, this value is known, as it was determined using the optimal algorithm for players.
In Figure 7b we show the comparison of running times , , and of algorithms , , and for players. The running times were evaluated by repeating the algorithm execution 10 times for the same input data. They are plotted on a logarithmic scale. Observe that by far the most efficient among them is algorithm . The linear increasing trend of and on the logarithmic scale is consistent with our findings that the complexity of algorithms and is exponential in the number of tournament players. Note that this tendency is also observed on the plot of , for which the values were recorded only for an exact power of two of the number of players, i.e., . Moreover, the sub-linear increasing trend of on the logarithmic scale is consistent with the fact that algorithm has a polynomial time complexity.
7. Conclusions
In this paper we defined optimal competitions structured as hierarchically shaped single-elimination tournaments. The optimality criterion aimed to maximize tournament attractiveness by letting the topmost players meet in higher stages of the tournament. We proposed a dynamic programming algorithm for computing optimal tournaments and we provided a thorough analysis of its correctness and computational complexity. Based on the idea of the dynamic programming approach, we also developed deterministic and stochastic sub-optimal algorithms. We carried out an experimental evaluation of the proposed algorithms using their Python implementation. The results addressed the optimality of solutions and the efficiency of the running time.
Author Contributions
Conceptualization, A.B. and C.B.; methodology, A.B. and C.B.; software, C.B.; formal analysis, C.B. and A.B.; investigation, I.B., L.I.C. and D.L.; writing, C.B. and A.B. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Bădică, A.; Bădică, C.; Buligiu, I.; Ciora, L.I.; Logofătu, D. Optimal Knockout Tournaments: Definition and Computation. In Proceedings of the Large Scale Scientific Computing—LSSC’2021, LNCS, Sozopol, Bulgaria, 7–11 June 2021; in press.
- Anderson, I. Combinatorial Designs and Tournaments; Oxford University Press Inc.: New York, NY, USA, 1997.
- Vu, T.; Shoham, Y. Fair Seeding in Knockout Tournaments. ACM Trans. Intell. Syst. Technol. 2011, 3, 9:1–9:17.
- Vu, T.D. Knockout Tournament Design: A Computational Approach. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 2010.
- Bóna, M.; Flajolet, P. Isomorphism and symmetries in random phylogenetic trees. J. Appl. Probab. 2009, 46, 1005–1019.
- Maurer, W. On Most Effective Tournament Plans with Fewer Games than Competitors. Ann. Statist. 1975, 3, 717–727.
- Dagaev, D.; Suzdaltsev, A. Tournament design allows for spectator interest increase. Front. Econ. Res. 2015. Available online: https://voxeu.org/article/tournament-design-allows-spectator-interest-increase (accessed on 23 August 2021).
- Hartigan, J.A. Probabilistic Completion of a Knockout Tournament. Ann. Math. Statist. 1966, 37, 495–503.
- CodeChef. Tennis Tournament. Available online: https://www.codechef.com/COOK27/problems/TOURNAM (accessed on 9 August 2021).
- Bao, N.P.H.; Xiong, S.; Iida, H. Reaper Tournament System. In Intelligent Technologies for Interactive Entertainment. INTETAIN 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Chisik, Y., Holopainen, J., Khaled, R., Luis Silva, J., Alexandra Silva, P., Eds.; Springer: Cham, Switzerland, 2018; Volume 215, pp. 16–33.
- Karpov, A. A Theory of Knockout Tournament Seedings; Discussion Paper Series; University of Heidelberg, Department of Economics: Heidelberg, Germany, 2015; Volume 600.
- Adler, I.; Cao, Y.; Karp, R.; Peköz, E.A.; Ross, S.M. Random Knockout Tournaments. Oper. Res. 2017, 65, 1589–1596.
- Guyon, J. “Choose Your Opponent”: A New Knockout Design for Hybrid Tournaments. J. Sport. Anal. 2021, 1–21, pre-press.
- Hennessy, J.; Glickman, M. Bayesian optimal design of fixed knockout tournament brackets. J. Quant. Anal. Sports 2016, 12, 1–15.
- Csató, L. Tournament Design. How Operations Research Can Improve Sports Rules; Palgrave Macmillan: London, UK, 2021.
- Stojadinović, T. On Catalan numbers. Teach. Math. 2015, XVIII, 16–24.
- Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms, 3rd ed.; The MIT Press: Cambridge, MA, USA; London, UK, 2009; pp. 365–367.
- Dutka, J. The early history of the factorial function. Arch. Hist. Exact Sci. 1991, 43, 225–249.
- Knuth, D.E. The Art of Computer Programming, Volume 4A: Combinatorial Algorithms, Part 1; Addison-Wesley Professional: Boston, MA, USA, 2011; pp. 319–320.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).