Minimax Under Pressure: The Case of Tennis

Depoorter, Ben; Jantschgi, Simon; Lendl, Ivan; Mlakar, Miha; Nax, Heinrich H.

doi:10.3390/g16060060

by Ben Depoorter ^1,2, Simon Jantschgi ^3, Ivan Lendl ^4, Miha Mlakar ^5 and Heinrich H. Nax ^3,6,*

^1 Center for Internet & Society and Stanford Law School, Stanford, CA 94305, USA

^2 University of California Law SF, San Francisco, CA 94102, USA

^3 Zurich Center for Market Design and SUZ, University of Zurich, 8050 Zurich, Switzerland

^4 Independent Researcher, San Diego, CA 92130, USA

^5 Pareto Lab, 1000 Ljubljana, Slovenia

^6 Behavioral Game Theory, ETH Zurich, 8092 Zurich, Switzerland

^* Author to whom correspondence should be addressed.

Games 2025, 16(6), 60; https://doi.org/10.3390/g16060060

Submission received: 24 September 2025 / Revised: 30 October 2025 / Accepted: 5 November 2025 / Published: 18 November 2025

(This article belongs to the Special Issue Game Theory, Sports and Athletes’ Behavior Under Pressure)

Abstract

A series of articles has tested von Neumann’s minimax theory against behavioral evidence based on field data from professional sports. The evidence has been viewed and collectively cited as positive evidence that elite athletes in their familiar sports contexts mix well and behave in line with minimax. In this paper, based on open state-of-the-art tennis data and analytics, we shall uncover new and significant evidence against minimax at the very top of the game, where previously, such results had not been obtained. The kinds of behavioral deviations from minimax that we find become apparent, because we enrich the test strategy to take into account whether or not players face ‘pressure’ situations like break points and other decisive points. Our paper highlights that the prior literature’s failure to reject minimax does not constitute positive behavioral evidence, as some of that literature argued, because it is not robust to data aggregations and separations that are psychologically natural given the relevant real-world context. In this case, this means separating serves into the serve types that players actually consider and separating situations by pressure levels, which leads to clear and sound rejection of minimax.

Keywords: game theory; minimax; sports; tennis

1. Introduction

von Neumann (1928)’s minimax theorem is a cornerstone of game theory that guides our theoretical understanding of many strategic interactions. For competitive situations, minimax theory provides the benchmark for strategically optimal unpredictability. Behaviorally, however, the empirical foundations of minimax are weak. In controlled laboratory experiments, there is rich evidence that humans generally have difficulties mixing properly and often fail to minimize maximum losses.1 These results have been used to motivate alternative behavioral theories.

Defenders of minimax as a behavioral theory argue that laboratory experiments are not well suited to drawing general negative conclusions, owing to their lack of external validity. While experiments can produce methodologically clean test conditions, they should not be expected to produce minimax play, because they force Everyman and Everywoman subjects into abstract and unfamiliar problems where mastery of the strategic subtleties of the situation cannot be expected, and perhaps the stakes are not high enough to encourage sophisticated behavior. True experts, by contrast, could still play minimax in familiar high-stakes situations in the real world. Indeed, many economists have embraced this view and, in particular, cite papers analyzing professional sports as positive minimax evidence.

The ‘Wimbledon paper’ on tennis by Walker and Wooders (2001) was the first paper testing minimax against sports data, followed by Chiappori et al. (2002) and Palacios-Huerta (2003) on football, Hsu et al. (2007) on tennis, and several other papers since. While there is some evidence of over-switching in repeated play, in tennis in particular (Walker & Wooders, 2001), a positive finding in this literature as concerns the one-shot view of the game has been that players mix optimally. That is, players mix between their available strategic options in such a way that the win percentages resulting from all options are not significantly different from one another. This failure to reject minimax has not been a mere power issue, as was recently shown in a highly powered large-scale analysis conducted by Gauriot et al. (2023), where minimax patterns prevail at the top of the (men’s) tennis game in light of evidence from half a million serves (compared with 3216 serves in Walker and Wooders’ 2001 analysis, where failure to reject might have been a power issue for some of the tests). Gauriot et al. refine the tests into randomized tests with more power, which leads to rejections in parts of their sample even with a binary classification of serves and without accounting for situational pressure. At the top of the game, however, they still do not reject.

One overall conclusion based on these findings continues to be that minimax play is a matter of experience and focus, and that the most experienced, best and most professional human athletes (in particular in tennis and in football penalties) do play minimax (Palacios-Huerta, 2023; Palacios-Huerta & Volij, 2008).2 This is the view we challenge in this article by splitting up serves into more realistic serve options and by accounting for situational pressure.

For this purpose, in the present paper, we take a fresh look at minimax testing à la Walker and Wooders in the context of tennis with a focus on optimal mixing in the one-shot game. Following the literature, we conduct tests for equality of win probabilities as a function of serve direction, which is a key prediction of minimax and the one not rejected in prior work. We rely on the best publicly available data, which is structured according to the traditional classification of serve directions into three categories (called T, body and wide) and what has become a standard ATP classification of match situations into two levels of score importance (called pressure and non-pressure).3 The key innovation of our test is that we test for serve patterns and serve successes while accounting for situational pressure depending on whether or not the server faces a critical point inside a service game (like a break point).4 With this natural split of situations according to pressure levels, our tests result in a sound rejection of minimax, even for the very best. Win rates are not equal. Some serve directions are more successful than others. Moreover, players’ strategic behavior is predictable, especially when it matters the most, which is when they switch up their patterns compared with their regular serve patterns. Players use their best serves more often in big moments of a match than in less important moments, when less successful serves are used more often.

2. Why Tennis

2.1. The Serve as a Simultaneous-Move Game

At first sight, singles tennis appears to be an ideal testing ground for minimax predictions, because a high number of points begin in the same way, with the server deciding where to direct their first serve. Serves of the same player are therefore much more frequently observed than, for example, penalty kicks by the same player in football. Moreover, given first-serve speeds of up to 155 mph, the server–returner game involving the fastest servers in tennis comes comparatively close to a simultaneous-move zero-sum game. This is the kind of game where minimax predictions imply that serves should be mixed up in such a way that win rates between the different options equate.5
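As a minimal illustration of this equal-win-rate prediction, the sketch below solves a 2×2 point game. The win probabilities, the function name and the left/right labels are illustrative assumptions, not estimates from any dataset, and the formulas assume an interior (fully mixed) equilibrium.

```python
def solve_2x2_point_game(win):
    """Mixed-strategy solution of a 2x2 zero-sum point game.

    win[s][r] is the server's probability of winning the point when the
    server picks direction s and the returner anticipates direction r
    (0 = left, 1 = right). Hypothetical setup; assumes a fully mixed
    equilibrium exists.
    """
    a, b = win[0]  # server goes left; returner anticipates left / right
    c, d = win[1]  # server goes right; returner anticipates left / right
    denom = a - b - c + d
    # Server's probability of serving left that makes the returner indifferent.
    p_left = (d - c) / denom
    # Returner's probability of anticipating left that makes the server indifferent.
    q_left = (d - b) / denom
    # Server's win rate for each direction against the returner's mix.
    win_left = a * q_left + b * (1 - q_left)
    win_right = c * q_left + d * (1 - q_left)
    return p_left, q_left, win_left, win_right


# Illustrative numbers only.
p_left, q_left, win_left, win_right = solve_2x2_point_game(
    [[0.58, 0.79], [0.73, 0.49]]
)
print(p_left, q_left, win_left, win_right)
```

With these numbers the server mixes unevenly between the two directions, yet both directions win at the same rate. That equality of win rates is what the tests in this literature examine.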

When Walker and Wooders (2001) collected their data, they watched video recordings of ten long and particularly famous tennis matches that were played between 1974 and 1997, and manually classified serve directions and point outcomes point by point. All of this was prior to ball-tracking technologies being available. Walker and Wooders coded whether serves were aimed left, center or right, separately for serving from the Deuce and from the Ad sides of the court. This is challenging to do without error, especially for serves that go into the net, because the trajectories are sometimes hard to extrapolate. In their analysis, they focus on left and right. Their main finding is that, in these ten matches, win probabilities do not differ significantly between the two serve directions for any server from either side, with one exception. Hsu et al. (2007), who analyze similar numbers of juniors’ and women’s matches as well, corroborate this finding, which is interpreted as supportive of minimax behavior.6 Gauriot et al. (2023) find evidence of gender differences, with more highly ranked male players’ serve behavior and minimax theory conforming ‘remarkably closely.’7

Data size and data quality are a problem in these early studies, which predate better tracking technologies. As a result, they are unable to reject hypotheses of equal win rates even though the observed differences in win rates are quite consistent and sizeable: Table 1 of Walker and Wooders, in particular, features twenty-nine out of forty reported win percentages that differ by 5% or more, but only two of these differences are significant at any noteworthy level of significance (i.e., the two that differ by at least twenty percent).
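The power problem can be seen with a back-of-the-envelope calculation. The sketch below uses hypothetical serve counts (not taken from any of the cited datasets) and an uncorrected Pearson test on a 2×2 table of serve direction by outcome. A ten-point gap in win rates is far from significant with 80 serves per direction, but clearly significant when the same rates are observed on ten times as many serves.

```python
import math


def two_direction_test(wins_a, n_a, wins_b, n_b):
    """Pearson chi-square test (1 df, no continuity correction) of equal win rates."""
    pooled = (wins_a + wins_b) / (n_a + n_b)
    stat = 0.0
    for wins, n in ((wins_a, n_a), (wins_b, n_b)):
        # Compare observed wins and losses with what the pooled rate predicts.
        for observed, rate in ((wins, pooled), (n - wins, 1 - pooled)):
            expected = n * rate
            stat += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 70% vs 60% win rates.
small = two_direction_test(56, 80, 48, 80)      # 80 serves per direction
large = two_direction_test(560, 800, 480, 800)  # same rates, ten times the data
print(small)
print(large)
```

The test statistic scales linearly with the sample size when the rates are held fixed, which is why pooling comparable matches matters for detecting differences of this size.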

Furthermore, the existing test procedures do not account for situational pressure, which is the innovation we propose in this study. Back in the days of the matches covered in Walker and Wooders, players, when facing pressure points (which are points when the server risks getting broken, like Deuce, break points and points during tie-breaks), predominantly served to their opponent’s backhands, much more than they do on regular points. This made a lot of sense at the time, because backhands were, until recently, the weaker shots for almost all tennis players compared with their forehands. In the 1980 Wimbledon final between Bjorn Borg and John McEnroe, for example, considered among the greatest matches of all time, McEnroe overall serves T from the Ad side almost exactly 50% of the time. This means that McEnroe serves half the time overall to the forehand of Borg even though he has a lower win percentage to that side, confirming that Borg’s backhand is weaker than his forehand. However, on big pressure points, McEnroe very rarely serves to Borg’s forehand; e.g., only 20% of serves are T from Ad at 30–40. This pattern of saving his wicked left-handed wide serve from Ad for the big moments becomes most apparent during the epic fourth-set tie-break, when McEnroe goes to Borg’s backhand almost all the time. By the end of that tie-break, Borg moves far over to his backhand before McEnroe even serves, and tries to run around the serve to return with his forehand, speculating—correctly—that McEnroe will serve that way. This predictive edge arguably should have won Borg that tie-break (see the 16–16 runaround forehand return that just misses), and eventually does win him the match in the fifth with a series of class forehand runaround returns down the line.8

Pressure points have special status in tennis, as matches are won by whoever wins the match point having won enough of the pressure points along the way. The winner may actually be the player who won fewer points overall, as has happened in important matches.9 Aggregating over points independent of pressure status, as carried out in prior studies, might create the false impression that players are unpredictable. One could conclude in the case elaborated above that McEnroe mixed up his serves well or even overplayed the serve to the forehand, when in reality what cost him the match was that he was highly predictable in the critical phases of the match with his serves to the backhand.10

2.2. Publicly Available State-of-the-Art Data

One issue with testing for pressure is that the low number of pressure and break points in any individual match in general—and in particular for the best servers—makes it impossible to test corresponding hypotheses on the basis of small datasets such as the ten individual matches included in Walker and Wooders. Single matches, even if very long, do not generate sufficient pressure and break points for the best servers to test alternative ‘big points’ hypotheses, which would cut up the dataset in even smaller parts. In order to accommodate richer testing with sufficient power, we rely on state-of-the-art tennis analytics data that combines multiple comparable matches played by each of the best-serving players on the men’s professional tennis tour in the years 2010–2015.11 Our data comes from two sources. First, we use the Association of Tennis Professionals (ATP) Stats Center by Infosys, which is publicly accessible at www.atptour.com, for aggregate player statistics to identify who are actually the best servers on the ATP Tour overall. ATP Infosys data is aggregate outcome data concerning win rates, aces, etc. Second, we conduct our tests on detailed multi-match data covering matches played by these servers from “The Match Charting Project” on www.tennisabstract.com, which is a unique crowd-sourced effort to gather detailed shot-by-shot data for pro tennis matches, collected with modern ball-tracking technologies as used at the professional tournament (Hawk-Eye). This data includes detailed strategic choice data such as where players moved and what shots they selected. The use of this data ensures open access to our data, as well as to related data for further investigations, as we use only a small fraction of what is available based on our focus on only the very best servers. All of our data and analyses are described in more detail in the aforementioned Open Science Framework repository, where data and analysis are freely available.

We focus on the ten best servers on the men’s professional tennis tour during 2010–2015. Player quality is ensured by focusing exclusively on players ranked in the top twenty at some point during these years. Within this sample, we focus on the ten best servers as determined by number of aces per match, because aces result from particularly fast, precise and unreadable serves, which helps justify the simultaneous-move game approximation underlying minimax predictions. We focus on the serve patterns of these players during all the matches available in The Match Charting Project that they played during this time period, filtering by matches on hard courts and against similarly ranked right-handed players with two-handed backhands. This gives us an interesting sample of matches of players playing against ‘comparable’ opponents.12 If minimax were to be a behaviorally descriptive theory, it would have to be so for these players, as all other players serve less fast, less accurately and less successfully.

2.3. Our Contribution

We develop a novel test procedure in which we account for pressure of the point, motivated by the conjecture that players will save up their most trusted serves for high-pressure points. In order to have sufficient numbers of pressure points, we combine multiple matches per server. Our focus is on the types of players for whom minimax is most suitable due to their serves’ speed and accuracy. Using state-of-the-art Hawk-Eye data, we test and soundly reject the standard view of tennis play as point-by-point minimax with confidence. Even the best male servers have their trusted serve directions which they save for pressure moments, and they do so with mixed success. Without separating points by pressure level, we actually reproduce the kinds of insignificant results as in Walker and Wooders, despite our multi-match perspective, illustrating that prior tests indeed lacked relevant contextual realism to identify the patterns that help refute minimax interpretations.

3. Testing Procedure

3.1. Deriving Hypotheses

Game-theoretically, a tennis match can be viewed as a recursive game (Everett, 1957), that is, a stochastic game consisting of repeated ‘point games’ until an absorbing state is reached with a non-zero payoff determining winner and loser. Transition probabilities between states of the match (i.e., points determining the match score) are jointly determined by current strategies and the histories of strategies of the two players.

We shall simplify a server’s set of available strategies by following the standard ATP classification into ‘T’, ‘body’ and ‘wide’ serves, as is coded by Hawk-Eye and other match-tracking technologies underlying The Match Charting Project data.13 Note that, according to the recursive game view, the point-by-point game matrix might change over time if the server, for example, always uses the same serve, which will get less effective over time due to within-match learning. Formally, given a full history of actions and outcomes $h_{t}$ up to time t, we therefore denote by $f_{j, h_{t}}^{i}$ the choice frequency of server i to choose strategy $j \in \{T, B, W\} =: J$ (for T, body, wide), and by $p_{j, h_{t}}^{i, X}$ (with $X \in \{S, F\}$ for ‘success’/‘failure’), we denote the corresponding win/loss probability for that point.

It is useful to begin with this recursive game view, because it highlights the fact that generally, at each state, individual point games and their associated win probabilities are, strictly speaking, ‘history-dependent’ (because of fatigue, within-match learning, ball and racket wear and tear, concentration issues or tranquilization of nerves, etc.). This view makes clear that the optimal strategy at time t therefore depends on the score and on the history of the game in a way that is not testable without further knowledge/assumptions on the exact transition probabilities (Blackwell & Ferguson, 1968; Mertens & Neyman, 1981; Vieille, 2000a, 2000b).14 To formulate testable hypotheses, Walker and Wooders make relatively strong assumptions within each match by assuming that each point game is identical and independent of history; that is, the win probabilities $p_{j, h_{t}}^{i, S}$ are constantly $p_{j}^{i, S}$ for Deuce and Ad sides of the court separately. Under this assumption, tennis matches become “repeated play of point games” (p. 1523) modeled as a repeated zero-sum game, the kind of game, a la Walker et al. (2011), that is memoryless with respect to who did what in the past and what happened. Minimax at every point is consequently (quite naturally) the unique equilibrium of these Markov games.15 In terms of testing, this simplification allows Walker and Wooders to pool all Deuce and Ad points from a match, and to run single tests of equality on the different serve choices, which they classify without the body serve.

So what may non-rejection of minimax in such a Markov version of tennis mean? If the assumption matches reality, then depending on data size and quality, as well as on the power of the tests, non-rejection might be taken as corroboration of the hypothesis that tennis players play minimax. However, if the assumption contradicts the real game of tennis in any way that might create biases in the underlying test statistics, then failure to reject minimax under the assumptions does not inform us much as to whether agents behave optimally or not. In particular, if it is the case that serves ‘wear out’ differentially with increased relative use, because opponents’ returns against certain serves improve as they experience them, it is plausibly optimal to not play history-independent minimax at every point, but rather save the most effective serves for the points that have the highest marginal impact on the final match outcome.

The point of our analysis is not to formulate the right general model of tennis. Instead, we shall test minimax under the assumption of Walker and Wooders (and others since), that is, the point-by-point minimax view. In particular, we shall test whether minimax and history independence withstand joint testing when distinguishing points according to pressure levels $l \in \{NP, P\} =: L$, defining ‘non-pressure’ ($NP$) points as points from regular service games (not tie-breaks) that cannot lead to a break point, and all other points as ‘pressure’ ($P$) points.16 This classification follows industry standards. If players are indeed playing history-independent minimax, we should observe, for both pressure levels, no violations of equalities of win percentages and no differences in serve frequencies. In fact, by subsetting the total number of observations into separate pressure categories, minimax, a la Walker and Wooders, if the point-by-point assumption of history independence is justified, should be a test with less power. That is, our test should have less bite than the original testing strategy that does not distinguish pressure levels. However, if by separating pressure levels we find that win percentages differ significantly and/or serve frequencies change as a function of pressure, then the alternative hypothesis must be accepted, which implies either that behavior is optimally strategic in the sense of a more complex history-dependent model (that is untestable without further knowledge of transition probabilities), or that players follow some behavioral heuristics like reserving big plays for big points that are not captured by point-by-point minimax.

This is very important in practice, because one view would make players appear unpredictable, while in reality they are predictable. Clearly, while economists might find elegance in the former, the involved tennis players want to know about the latter.

Formally, we formulate the history-independent minimax null hypothesis with two constituent hypotheses, which we aim to falsify, as follows:

H1 is the standard ‘equal win probabilities’, which we test separately for Ad and Deuce courts and for the two pressure levels: $p_{j, l}^{i, k} = p_{l}^{i, k}$ for all i, j, k, and l.17
H2 is ‘frequency independence’ with respect to pressure, which we test separately for each serve direction: $f_{j, l}^{i} = f_{j}^{i}$ for all i, j, and l.
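To make the pressure split behind H1 and H2 concrete, the following sketch classifies a point from the game score. It encodes one possible reading of the definition given earlier, stated here as an assumption: a point counts as a pressure point if it is played in a tie-break, is itself a break point, or would hand the returner a break point if the server lost it. The function name and the score encoding are hypothetical, and the classification actually used for the tests may differ in detail.

```python
def is_pressure_point(server_pts, returner_pts, tiebreak=False):
    """Classify a point as pressure (True) or non-pressure (False).

    ASSUMED reading of the definition: pressure means a tie-break point,
    a break point, or a point whose loss would give the returner a break
    point. Points are counted 0, 1, 2, 3, ... (0, 15, 30, 40, ...), so
    deuce is 3-3 and Ad-out is, for example, 3-4.
    """
    if tiebreak:
        return True
    # "Break point now" or "break point after one more lost point" reduces
    # to: the returner has at least 30 and is not behind in the game.
    return returner_pts >= 2 and returner_pts >= server_pts


# A few scores, written as (server, returner).
for score in [(0, 0), (3, 2), (2, 2), (2, 3), (3, 3)]:
    print(score, is_pressure_point(*score))
```

Under this reading, 30–30, deuce and all break points are pressure points, while 40–30 and Ad-in are not, since losing them only leads back to deuce.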

3.2. Test Statistics

Our tests are implemented via Pearson’s $\chi^{2}$ tests of homogeneity with Yates’s correction for continuity, as is standard in this literature for testing the equality of win percentages and serve frequencies.18 By $n_{j, l}^{i}$ we denote the total number of times that player i uses strategy j in situation l, and by $N_{j, l}^{i, S}$ and $N_{j, l}^{i, F}$ we denote the resulting numbers of point wins and losses. Hence, $\sum_{j \in J} N_{j, l}^{i, k} / \sum_{j \in J} n_{j, l}^{i}$ is the maximum likelihood estimator for $p_{l}^{i, k}$. Similarly, $\sum_{l \in L} N_{j, l}^{i} / \sum_{l \in L} n_{l}^{i}$ is the maximum likelihood estimator for $f_{j}^{i}$, where $n_{l}^{i}$ corresponds to the total number of serves of player i in situation l, and $N_{j, l}^{i}$ denotes how often strategy j is chosen in that situation. The Pearson test statistics for hypotheses H1 and H2 become

$$
P_{H1}^{i, l} = \sum_{j \in J} \sum_{k \in K} \frac{\left(N_{j, l}^{i, k} - n_{j, l}^{i} p_{l}^{i, k}\right)^{2}}{n_{j, l}^{i} p_{l}^{i, k}}
\quad \text{and} \quad
P_{H2}^{i, j} = \sum_{l \in L} \left( \frac{\left(N_{j, l}^{i} - n_{l}^{i} f_{j}^{i}\right)^{2}}{n_{l}^{i} f_{j}^{i}} + \frac{\left(\left(n_{l}^{i} - N_{j, l}^{i}\right) - n_{l}^{i} \left(1 - f_{j}^{i}\right)\right)^{2}}{n_{l}^{i} \left(1 - f_{j}^{i}\right)} \right), \tag{1}
$$

which are asymptotically distributed as $\chi^{2}$ with 2 and 1 degrees of freedom, respectively. For every player, we compute the relevant test statistics, and, using Fisher’s method, combine them into a test covering all ten players jointly.
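The statistics in Equation (1) and the combination step can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the code used for the reported results: the function names are hypothetical, the continuity correction is omitted for brevity, the pooled rates are plugged in as estimates of $p_{l}^{i, k}$ and $f_{j}^{i}$, and the tail probabilities use closed forms that only cover 1 degree of freedom and even degrees of freedom.

```python
import math


def pearson_h1(wins, losses):
    """P_H1 for one player and one pressure level (no continuity correction).

    wins[j], losses[j]: points won and lost with serve direction j
    (e.g., T, body, wide). Each direction is compared with the pooled win rate.
    """
    n_total = sum(wins) + sum(losses)
    p_win = sum(wins) / n_total
    stat = 0.0
    for w, l in zip(wins, losses):
        n_j = w + l
        stat += (w - n_j * p_win) ** 2 / (n_j * p_win)
        stat += (l - n_j * (1 - p_win)) ** 2 / (n_j * (1 - p_win))
    return stat  # 2 degrees of freedom with three directions


def pearson_h2(uses, totals):
    """P_H2 for one player and one serve direction (no continuity correction).

    uses[l]: serves in this direction at pressure level l;
    totals[l]: all serves at pressure level l.
    """
    f = sum(uses) / sum(totals)
    stat = 0.0
    for used, n_l in zip(uses, totals):
        stat += (used - n_l * f) ** 2 / (n_l * f)
        stat += ((n_l - used) - n_l * (1 - f)) ** 2 / (n_l * (1 - f))
    return stat  # 1 degree of freedom with two pressure levels


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and even df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df % 2 == 0:
        half = x / 2
        return math.exp(-half) * sum(
            half ** m / math.factorial(m) for m in range(df // 2)
        )
    raise ValueError("only df = 1 or even df are supported in this sketch")


def fisher_combine(p_values):
    """Fisher's method: -2 * sum(log p) is chi-square with 2k df under the null."""
    stat = -2 * sum(math.log(p) for p in p_values)
    return chi2_sf(stat, 2 * len(p_values))


# Hypothetical counts for one player: direction used 50 of 100 times on
# non-pressure points and 10 of 100 times on pressure points.
stat_h2 = pearson_h2([50, 10], [100, 100])
print(stat_h2, chi2_sf(stat_h2, 1))
```

A per-player p-value follows from `chi2_sf` with the appropriate degrees of freedom, and `fisher_combine` then pools the per-player p-values into a joint test. Fisher's method assumes the individual tests are independent.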

3.3. Comparability Assumptions

Like any test of minimax in general, owing to the recursive nature of the true game of tennis, our testing also has to make certain assumptions ensuring point game comparability that permit pooling of observations. With general history dependence, because servers constantly adjust to the returner and vice versa, no two points are directly comparable, and therefore nothing would be testable unless some structure is placed on the exact transition probabilities between point games. In other words, without comparability assumptions between points that allow pooling of multiple points, minimax is well-nigh unfalsifiable in tennis, because the relevant transition probabilities are not known.

To understand different candidates for comparability assumptions, it helps to think of each service situation as being played by a different server–returner pair. As is elucidated by Chiappori et al. (2002) in the context of penalty kicks in soccer, different server–returner pairs (in their case, kicker–goalkeeper pairs) might be too heterogeneous to permit aggregation. If the number of observations at the level of a comparable pair is too low, there will not be sufficient testing power, not even for joint tests. An extreme view could be to think of every point as being unique, which would obviously make minimax entirely untestable. To obtain some power, Walker and Wooders assume history independence within a match, and classify serves as only T and wide. In our tests of history-independent minimax, we distinguish serves by T, body and wide and pressure levels by score. Hence, if history independence is a valid assumption, then our power should be lower than Walker and Wooders’, as our choices result in a finer grid of situations.

An issue is that the single-match analysis results in levels of power that are already too low prior to this split according to pressure level, which is why win percentage differentials of ten percent or more are often not significant in Walker and Wooders. Hence, in order to obtain the necessary power, we turn to Chiappori et al.’s concept of the ‘identical goalkeeper,’ which permits aggregation over goalkeepers in their analysis of football penalties in different matches facing different but identical goalkeepers. Walker and Wooders basically make an ‘identical game’ assumption for all points within a single match. We separate by pressure level, but pool multiple matches of the same server playing against ‘comparable returners.’ Concretely, we selected matches played by the same server on the same kind of (hard-court) surface during our time window against returners with similar rankings, the same playing hand and the same type of backhand (one- versus two-handed), so that the win percentages associated with the different serves against the different returners are plausibly invariant under affine transformations.19

Assuming that these returners are comparable in this way ensures that minimax strategies are the same against all opponents. It also implies that the returners have the same ordinal preferences over the server’s serves (i.e., depending on server, all returners prefer either body > wide > T or body > T > wide, as is evidenced by conditional—on serve going in—win percentages), but the returners may differ from one another in terms of overall return quality as long as percentages are affine-comparable.20

3.4. Results

Table 1 summarizes our test results. For the ten players combined, our minimax hypothesis H1 ∪ H2 is rejected with 99 percent confidence. Individually, we reject with at least 99 percent confidence for four out of ten players, with 95 percent for six and with 90 percent for all but one.21 This shows that serve patterns change significantly under pressure. Furthermore, this change occurs in a natural way: the frequency orderings of serve directions shift so that more successful serves become relatively more frequent under pressure. In particular, under pressure, players almost entirely abandon the generally less successful body serves. Importantly, our results challenge the positive interpretations of the failure to reject minimax of the prior literature in a deeper way than simply by having more data. Indeed, when we run our tests without differentiating by pressure levels, we qualitatively replicate the prior literature’s predominant failure to obtain significant rejections despite more observations now belonging to each serve direction. In fact, nothing is significant anymore with 99 percent confidence, with 95 percent confidence we find only two individual rejections, and with 90 percent confidence we find only rejections for the combined test and for three individuals.22 This means that our rejection of history-independent minimax is not driven by a larger sample size, but indeed by the differentiation of points into pressure levels. In fact, contemporaneous studies by Gauriot et al. (2016, 2023), who analyze a much larger Hawk-Eye dataset than ours (which is not publicly available), reject minimax only for female tennis players of all ranking levels and for low-ranking male tennis players, but not for high-ranking male players.

4. Conclusions

John Isner was one of the greatest servers of tennis and is a good illustration of our finding (see Table 1). His wide serve from the Ad court is his strongest serve in terms of win percentages. Overall, and when playing non-pressure points from Ad, he actually serves down the T more often. On pressure points, however, he does go wide more often. This pattern violates point-by-point minimax, but it is not necessarily irrational or suboptimal. Indeed, John Isner might rationally preserve the effectiveness of his Ad wide serve for critical moments, which might require under-using it in less important moments of a match so that the opponent does not get used to it or begin to reposition himself. Tests that do not separate by pressure and non-pressure points will not identify this pattern.

This pattern of mixing differently depending on pressure levels is what we qualitatively obtain for eight out of ten servers in our sample, and the joint test is also highly significant. Eight out of the ten players, who represent the best in the game, tend to reserve their better serves for when it matters the most. We view this as a tip-of-the-iceberg falsification, and in future work, we plan to repeat this analysis on more matches and for other players whose serves are less in line with the minimax framework to begin with in order to identify more patterns. But our point here was to show that the non-rejection of the prior literature was a testing problem, and not a gender issue or an expertise difference.

The two exceptions are Milos Raonic and Ivo Karlovic. Ivo Karlovic, in particular, who was the man with the most aces in the history of the sport until John Isner overtook him in 2022, is exceptional in many ways. In our dataset, Ivo Karlovic has the highest win percentage overall, faces the fewest pressure points and is the one player who hits hardly any body serves at all and who does not significantly change his serve pattern whether under pressure or not. In fact, he himself has spoken of his serve as only being broken when he breaks himself. As it stands, Ivo Karlovic and Milos Raonic may be minimax players, but they are the exception. In light of the growth of the market for strategic consultants and data analytics in professional sports like tennis, the fact that minimax is not descriptively accurate should not be surprising, not even for equilibrium-thinking economists. There would not be a demand if players were already strategically sophisticated to the extreme of unpredictability. Today, the majority of elite tennis professionals use advanced tennis analytics to predict opponents’ play and reduce their own predictability. Perhaps there is a future where most athletes truly become minimax players.

The main point of this paper is simply that reading the old evidence from tennis, as it has been presented in the economics literature, as positive evidence of minimax behavior is not quite appropriate. That literature’s failure to reject minimax was largely due to historical data limitations and to a focus on test procedures that originate in the controlled lab experiments run by O’Neill, resulting in a partition of choices and an aggregation across pressure situations that wash out the clear deviations from minimax. Our analysis has shown that there are clear and predictable patterns that can give one player an edge over the other, depending on whether they are identified or not. These patterns come out quite clearly in high-pressure situations.

In terms of modeling tennis, a balance needs to be struck between simplicity that permits testing with power and realism that captures the most relevant behavioral aspects of the game. One empirical agenda that could constructively grow out of the ongoing ‘minimax or not’ debate is the construction of measures of ‘distance from minimax,’ perhaps via indices à la Afriat (1972). This might also require revisiting the basic game-theoretic model of tennis, perhaps to formulate and to test more sophisticated extensive-form and recursive games and to go beyond analyses of first serves.²³ Our sense is that a full formal model of tennis is out of scope given the complexity and physical limits of the sport, which is what makes it so interesting for us.

Author Contributions

Conceptualization, B.D., S.J., I.L., M.M. and H.H.N.; Methodology, B.D., S.J., I.L., M.M. and H.H.N.; Software, M.M.; Validation, M.M. and H.H.N.; Formal analysis, B.D., S.J., M.M. and H.H.N.; Investigation, B.D., S.J., I.L., M.M. and H.H.N.; Resources, M.M. and H.H.N.; Data curation, B.D. and M.M.; Writing—original draft, B.D., S.J., M.M. and H.H.N.; Writing—review & editing, B.D., S.J., M.M. and H.H.N.; Visualization, M.M.; Supervision, I.L. and H.H.N.; Project administration, B.D. and H.H.N. All authors have read and agreed to the published version of the manuscript.

Funding

This paper was funded by an SNSF Eccellenza Grant (H.H.N. and S.J.).

Data Availability Statement

The data presented in this study are available at the Center for Open Science under https://osf.io/9c2d8/overview?view_only=41023c09c8d847068723787c8f7685ab (last accessed as a public repo on 7 November 2025), project reference number 9c2d8.

Acknowledgments

We thank John List and Ted Sichelman, who crucially shaped this project from the economics and tennis perspectives, as well as Isaiah Andrews and the reviewers, who provided helpful comments on earlier drafts.

Conflicts of Interest

The authors declare no conflicts of interest.

Notes

1	Inability to reject minimax in light of experimental data (Binmore et al., 2001; McCabe et al., 2000) is less common than significant negative evidence (Levitt et al., 2010; McCabe et al., 2000; Mookherjee & Sopher, 1994; Ochs, 1995; Rapoport & Boebel, 1992), with some earlier positive interpretations (O’Neill, 1987) challenged on grounds of test procedures and interpretations (Brown & Rosenthal, 1990).
2	Chiappori et al. (2002) stick to a terminology of non-rejection without any explicitly positive interpretation of their results, also discussing several fundamental testing hurdles owing to data availability, situational comparability, heterogeneity, etc. We shall get back to this subtlety in our concluding discussion. Nevertheless, their analysis is often cited as positive evidence alongside other studies of football.
3	Note that serve direction classification is conducted using professional ball-tracking technologies in tennis such as Hawk-Eye Innovations. The ATP Stats Center, for example, contains analyses of serves and pressure performance.
4	For previous research showing the relevance of pressure by surface- and game-level importance, see Bailey and McGarrity (2012) and Cohen-Zada et al. (2017), respectively.
5	With respect to the first serves by John Isner, for example, Roger Federer said “you can basically not read. It’s that simple.” See the article ‘Roger Federer shares how he plans to deal with John Isner’s serve’ of 31 March 2019 at https://www.tennisworldusa.org.
6	As mentioned in the Introduction, further tests reveal over-switching in the repeated game in the form of evidence of negative serial autocorrelation in Walker and Wooders, but not in Hsu et al.
7	Paserman (2023), not focusing on minimax and serves, finds evidence of an overall drop in performance under pressure in women’s tennis. This finding is generalized in González-Díaz et al. (2012)’s measure of ‘critical ability,’ which correlates with the ranking of players.
8	Note that Borg at that time had a strategy pioneer as his coach (Lennart Bergelin) who prepared Borg meticulously on game situations against every individual opponent ahead of each match, while McEnroe never had a coach in those years and played more intuitively (due to his status as being—by his own account—“uncoachable”).
9	It happened, for example, in the 2025 French Open men’s final between Sinner and Alcaraz. Another famous instance is the 2019 Wimbledon men’s final between Federer and Djokovic.
10	Walker and Wooders allude to an alternative ‘big point’ hypothesis, but dismiss this view by citing Martina Navratilova from a post-match press conference where she stated that she tried to play the match ‘point-by-point.’ They also subsequently developed a ‘point-by-point’ theory of tennis supporting this view (Walker et al., 2011).
11	All our data is documented and available for replication in the Open Science Framework—see https://osf.io/9c2d8/?view_only=41023c09c8d847068723787c8f7685ab (public repo accessed on 7 November 2025).
12	Our selection filters are motivated by the game-theoretic perspective on minimax testing from Chiappori et al. (2002) concerning ‘comparability’ of situations involving different athletes. Our test procedure rests on assumptions inspired by Chiappori et al., with some adjustments, as shall be discussed explicitly in Part D of Section II.
13	Wide means serving to the sides of the court, body means serving at the opponent, and T means serving to the middle of the court. Hence, when serving from the Deuce (i.e., the right) side of the court, the three serve directions T, body and wide can be thought of as right, center and left, while from the Ad side (i.e., the left side), the three directions T, body and wide can be thought of as left, center and right.
14	This point is related to the issues discussed in Chiappori et al. (2002) but separate, as their focus is on penalty kicks in soccer and on the issues of comparability across matches, goalkeepers and kickers. We shall get back to this issue later.
15	Indeed, for ‘binary Markov games,’ repeated minimax play was, some years after Walker and Wooders, proven to be the unique equilibrium (Walker et al., 2011). Note that this result has not been proven for a general model of tennis that allows history dependence.
16	Concretely, pressure points are 0–30, 15–30, 30–30, deuces, break points and tie-break points.
17	Recall that in Walker and Wooders (and Hsu et al.), serves are ‘left’ or ‘right’ (without ‘center’), and pressure levels are not differentiated. Another difference is that their data is hand-coded, which comes with some degree of freedom concerning their own classifications as becomes apparent by the following remark in Hsu et al. (p. 517): “The number of serves and the number of choices of R and L that we record, however, are slightly different from those in Walker and Wooders (2001). We suspect that this occurs because we use a slightly different standard in defining L and R.” We would like to remark that having degrees of freedom in the definition of the strategy space (see also Norman (1985) for some early work on classifying fast and slow serves) is a problematic feature that distinguishes sports analyses from experimental tests of minimax that are often quoted for comparison. Having an exogenous definition, like ours based on Hawk-Eye to define T, body and wide, mitigates this issue.
18	O’Neill (1987) proposed such a procedure tailored to his experiment.
19	An affine transformation here means replacing each win percentage $p$ by $p' = \alpha \cdot p + \beta$ with the same $\alpha > 0$ and $\beta$ in all cells, so that all win percentages are scaled by the same factor and shifted by the same constant; the minimax strategies in a zero-sum game are then unchanged (only the game value is affected).
20	Note that instead of combining matches in the same period on the same surface against different ‘comparable’ returners, another strategy would have been to combine different matches of the same server–returner pair, as carried out in Gauriot et al. (2023). This is also a promising strategy, especially for some of the great rivalries that feature many matches between the same opponents. We explicitly decided against this strategy, because we know that players often change their game tactically, and quite drastically, over the course of these rivalries, creating cycles of dominance. As a result, matches between the same two players over longer horizons are sometimes less comparable than matches against other similar players during a smaller window of time.
21	The one player whose serve patterns survive all minimax tests is Milos Raonic, who, as of 17 June 2024, almost a decade after our data window ended, holds the record for the most aces in a match with 47 against Cameron Norrie at Queen’s.
22	For Ad/Deuce, we obtain, for the same order of players as in Table 1, and for the joint test, p-values of 0.127/0.601, 0.658/0.128, 0.805/0.013 **, 0.286/0.864, 0.012 **/0.314, 0.348/0.302, 0.156/0.513, 0.698/0.552, 0.542/0.128, 0.092 */0.124 and 0.082 */0.076 *.
23	See, for example, Klaassen and Magnus (2009) for some extensions.
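
The invariance claim in Note 19 can be checked numerically. The following is a minimal sketch, assuming a 2×2 zero-sum game without a pure-strategy saddle point and using the standard closed form for its mixed equilibrium. The win probabilities, the values of `alpha` and `beta`, and the function names are hypothetical and not taken from our data.

```python
def row_mix_and_value(a, b, c, d):
    """Minimax mix and value of the 2x2 zero-sum game [[a, b], [c, d]].

    Entries are the row player's win probabilities. The closed form assumes
    the game has no pure-strategy saddle point (fully mixed equilibrium).
    """
    denom = a - b - c + d
    prob_first_row = (d - c) / denom
    value = (a * d - b * c) / denom
    return prob_first_row, value

def affine(p, alpha, beta):
    """Apply the same affine map to one win probability."""
    return alpha * p + beta

base = (0.58, 0.79, 0.73, 0.49)  # hypothetical win probabilities
alpha, beta = 0.5, 0.2           # hypothetical scale (> 0) and shift
shifted = tuple(affine(p, alpha, beta) for p in base)

mix_base, value_base = row_mix_and_value(*base)
mix_shifted, value_shifted = row_mix_and_value(*shifted)
print(mix_base, mix_shifted)      # same mixing probability
print(value_base, value_shifted)  # value maps to alpha * value + beta
```

The factor `alpha` cancels in the mixing probability and `beta` drops out of every difference, so only the game value moves.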

References

Afriat, S. N. (1972). Efficiency estimation of production functions. International Economic Review, 13(3), 568–598.
Bailey, B. J., & McGarrity, J. P. (2012). The effect of pressure on mixed-strategy play in tennis: The effect of court surface on service decisions. International Journal of Business and Social Science, 3(20), 11–18.
Binmore, K., Swierzbinski, J., & Proulx, C. (2001). Does minimax work? An experimental study. Economic Journal, 111(473), 445–464.
Blackwell, D., & Ferguson, T. S. (1968). The big match. The Annals of Mathematical Statistics, 39(1), 159–163.
Brown, J. N., & Rosenthal, R. W. (1990). Testing the minimax hypothesis: A re-examination of O’Neill’s game experiment. Econometrica, 58(5), 1065–1081.
Chiappori, P.-A., Levitt, S., & Groseclose, T. (2002). Testing mixed-strategy equilibria when players are heterogeneous: The case of penalty kicks in soccer. American Economic Review, 92(4), 1138–1151.
Cohen-Zada, D., Krumer, A., Rosenboim, M., & Shapir, O. M. (2017). Choking under pressure and gender: Evidence from professional tennis. Journal of Economic Psychology, 61, 176–190.
Everett, H. (1957). Recursive games. Contributions to the Theory of Games, 39, 47.
Gauriot, R., Page, L., & Wooders, J. (2016). Nash at Wimbledon: Evidence from half a million serves. SSRN Electronic Journal.
Gauriot, R., Page, L., & Wooders, J. (2023). Expertise, gender, and equilibrium play. Quantitative Economics, 14(3), 981–1020.
González-Díaz, J., Gossner, O., & Rogers, B. W. (2012). Performing best when it matters most: Evidence from professional tennis. Journal of Economic Behavior & Organization, 84(3), 767–781.
Hsu, S.-H., Huang, C.-Y., & Tang, C.-T. (2007). Minimax play at Wimbledon: Comment. American Economic Review, 97(1), 517–523.
Klaassen, F. J., & Magnus, J. R. (2009). The efficiency of top agents: An analysis through service strategy in tennis. Journal of Econometrics, 148(1), 72–85.
Levitt, S. D., List, J. A., & Reiley, D. H. (2010). What happens in the field stays in the field: Exploring whether professionals play minimax in laboratory experiments. Econometrica, 78(4), 1413–1434.
McCabe, K. A., Mukherji, A., & Runkle, D. E. (2000). An experimental study of information and mixed-strategy play in the three-person matching-pennies game. Economic Theory, 15(2), 421–462.
Mertens, J.-F., & Neyman, A. (1981). Stochastic games. International Journal of Game Theory, 10(2), 53–66.
Mookherjee, D., & Sopher, B. (1994). Learning behavior in an experimental matching pennies game. Games and Economic Behavior, 7(1), 62–91.
Norman, J. M. (1985). Dynamic programming in tennis—When to use a fast serve. Journal of the Operational Research Society, 36(1), 75–77.
Ochs, J. (1995). Games with unique, mixed strategy equilibria: An experimental study. Games and Economic Behavior, 10(1), 202–217.
O’Neill, B. (1987). Nonmetric test of the minimax theory of two-person zerosum games. Proceedings of the National Academy of Sciences of the United States of America, 84(7), 2106–2109.
Palacios-Huerta, I. (2003). Professionals play minimax. The Review of Economic Studies, 70(2), 395–415.
Palacios-Huerta, I. (2023). Maradona plays minimax. Sports Economics Review, 1, 100001.
Palacios-Huerta, I., & Volij, O. (2008). Experientia docet: Professionals play minimax in laboratory experiments. Econometrica, 76(1), 71–115.
Paserman, M. D. (2023). Gender differences in performance in competitive environments? Evidence from professional tennis players. Journal of Economic Behavior & Organization, 212, 590–609.
Rapoport, A., & Boebel, R. B. (1992). Mixed strategies in strictly competitive games: A further test of the minimax hypothesis. Games and Economic Behavior, 4(2), 261–283.
Vieille, N. (2000a). Two-player stochastic games I: A reduction. Israel Journal of Mathematics, 119(1), 55–91.
Vieille, N. (2000b). Two-player stochastic games II: The case of recursive games. Israel Journal of Mathematics, 119(1), 93–126.
von Neumann, J. (1928). Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100(1), 295–320.
Walker, M., & Wooders, J. (2001). Minimax play at Wimbledon. American Economic Review, 91(5), 1521–1538.
Walker, M., Wooders, J., & Amir, R. (2011). Equilibrium play in matches: Binary Markov games. Games and Economic Behavior, 71(2), 487.

Table 1. Testing $H_{1}$ and $H_{2}$ for the ATP Tour Top 10 servers during the years 2010–2015.

			Anderson	Cilic	del Potro	Gulbis	Isner	Karlovic	Kyrgios	Lopez	Raonic	Tsonga
Non-Pressure	wide	$p_{W,NP}^{S,AD}, p_{W,NP}^{S,DE}$	0.65, 0.67	0.65, 0.55	0.69, 0.65	0.69, 0.57	0.72, 0.69	0.74, 0.75	0.68, 0.72	0.61, 0.68	0.68, 0.71	0.63, 0.56
		$f_{W,NP}^{AD}, f_{W,NP}^{DE}$	0.38, 0.44	0.44, 0.43	0.37, 0.29	0.35, 0.39	0.38, 0.50	0.59, 0.38	0.38, 0.43	0.48, 0.41	0.56, 0.48	0.47, 0.37
		($n_{W,NP}^{AD}, n_{W,NP}^{DE}$)	(103, 116)	(168, 161)	(199, 156)	(64, 74)	(170, 213)	(144, 100)	(82, 91)	(217, 179)	(322, 289)	(295, 220)
	body	$p_{B,NP}^{S,AD}, p_{B,NP}^{S,DE}$	0.58, 0.74	0.56, 0.62	0.67, 0.52	0.57, 0.61	0.48, 0.60	0.56, 0.83	0.66, 0.77	0.62, 0.58	0.72, 0.59	0.47, 0.59
		$f_{B,NP}^{AD}, f_{B,NP}^{DE}$	0.11, 0.09	0.15, 0.07	0.14, 0.17	0.20, 0.28	0.06, 0.16	0.08, 0.07	0.12, 0.10	0.11, 0.17	0.05, 0.07	0.09, 0.09
		($n_{B,NP}^{AD}, n_{B,NP}^{DE}$)	(29, 23)	(57, 26)	(76, 91)	(37, 52)	(29, 68)	(18, 18)	(26, 22)	(50, 74)	(29, 39)	(55, 56)
	T	$p_{T,NP}^{S,AD}, p_{T,NP}^{S,DE}$	0.72, 0.70	0.62, 0.68	0.70, 0.65	0.67, 0.59	0.65, 0.65	0.72, 0.79	0.61, 0.72	0.63, 0.62	0.71, 0.73	0.66, 0.69
		$f_{T,NP}^{AD}, f_{T,NP}^{DE}$	0.51, 0.47	0.41, 0.50	0.49, 0.54	0.45, 0.33	0.56, 0.34	0.33, 0.55	0.50, 0.47	0.41, 0.42	0.39, 0.45	0.44, 0.54
		($n_{T,NP}^{AD}, n_{T,NP}^{DE}$)	(136, 126)	(158, 188)	(268, 287)	(81, 62)	(249, 146)	(81, 147)	(107, 100)	(189, 181)	(220, 273)	(275, 320)
Pressure	wide	$p_{W,P}^{S,AD}, p_{W,P}^{S,DE}$	0.72, 0.61	0.61, 0.59	0.61, 0.63	0.72, 0.68	0.75, 0.72	0.79, 0.76	0.84, 0.66	0.80, 0.59	0.59, 0.70	0.58, 0.67
		$f_{W,P}^{AD}, f_{W,P}^{DE}$	0.45, 0.54	0.37, 0.46	0.46, 0.25	0.46, 0.44	0.51, 0.46	0.63, 0.48	0.49, 0.36	0.51, 0.46	0.54, 0.43	0.47, 0.46
		($n_{W,P}^{AD}, n_{W,P}^{DE}$)	(25, 52)	(53, 94)	(61, 53)	(25, 34)	(59, 90)	(24, 29)	(19, 29)	(75, 104)	(63, 76)	(94, 140)
	body	$p_{B,P}^{S,AD}, p_{B,P}^{S,DE}$	0.17, 0.38	0.70, 0.50	0.67, 0.54	0.57, 0.44	0.60, 0.59	0.75, 1.00	0.50, 0.87	0.54, 0.65	0.40, 0.56	0.63, 0.56
		$f_{B,P}^{AD}, f_{B,P}^{DE}$	0.11, 0.08	0.07, 0.10	0.13, 0.18	0.13, 0.11	0.04, 0.09	0.11, 0.07	0.21, 0.10	0.10, 0.14	0.08, 0.09	0.09, 0.08
		($n_{B,P}^{AD}, n_{B,P}^{DE}$)	(6, 8)	(10, 20)	(18, 39)	(7, 9)	(5, 17)	(4, 4)	(8, 8)	(15, 32)	(10, 16)	(19, 25)
	T	$p_{T,P}^{S,AD}, p_{T,P}^{S,DE}$	0.71, 0.75	0.64, 0.58	0.69, 0.68	0.50, 0.63	0.75, 0.71	0.90, 0.52	0.59, 0.59	0.55, 0.62	0.66, 0.69	0.62, 0.63
		$f_{T,P}^{AD}, f_{T,P}^{DE}$	0.44, 0.38	0.56, 0.44	0.41, 0.57	0.41, 0.45	0.45, 0.45	0.26, 0.47	0.31, 0.54	0.39, 0.40	0.38, 0.48	0.44, 0.46
		($n_{T,P}^{AD}, n_{T,P}^{DE}$)	(24, 36)	(80, 89)	(55, 122)	(22, 35)	(52, 87)	(10, 29)	(12, 43)	(58, 91)	(44, 85)	(87, 138)
$H_1$: Equal win probabilities	NP	$H1_{NP}^{AD}, H1_{NP}^{DE}$	0.271, 0.791	0.442, 0.047 **	0.845, 0.056 *	0.450, 0.859	0.036 **, 0.357	0.322, 0.709	0.558, 0.856	0.900, 0.202	0.666, 0.199	0.034 **, 0.011 **
	P	$H1_{P}^{AD}, H1_{P}^{DE}$	0.077 *, 0.149	0.877, 0.731	0.628, 0.262	0.298, 0.543	0.965, 0.632	0.877, 0.094 *	0.197, 0.366	0.006 ***, 0.766	0.384, 0.550	0.860, 0.512
$H_2$: Frequency independence	wide	$H2_{W}^{AD}, H2_{W}^{DE}$	0.332, 0.080 *	0.160, 0.436	0.059 *, 0.221	0.138, 0.523	0.011 **, 0.420	0.649, 0.190	0.214, 0.315	0.514, 0.259	0.613, 0.228	0.961, 0.007 ***
	body	$H2_{B}^{AD}, H2_{B}^{DE}$	0.985, 0.917	0.016 **, 0.216	0.866, 0.700	0.222, 0.004 ***	0.383, 0.016 **	0.669, 0.867	0.155, 0.934	0.777, 0.326	0.139, 0.245	0.763, 0.571
	T	$H2_{T}^{AD}, H2_{T}^{DE}$	0.337, 0.090 *	0.003 ***, 0.148	0.085 *, 0.418	0.624, 0.067 *	0.039 **, 0.011 **	0.390, 0.216	0.029 **, 0.299	0.627, 0.688	0.852, 0.542	0.901, 0.021 **
Does minimax hold?		individually	✗	✗✗✗	✗	✗✗✗	✗✗	✗	✗✗	✗✗✗	✔	✗✗✗
		jointly	✗✗✗ (across all ten players)

*/**/*** indicate rejection of the respective hypothesis at the 10/5/1-percent levels of significance. ✗/✗✗/✗✗✗ indicate that at least one hypothesis is rejected at the corresponding level of significance. ✔ indicates that no hypothesis is rejected at any of these levels of significance.
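
As a reading aid for the $H_2$ rows, the following minimal Python sketch recomputes one serve frequency pair from the serve counts in Table 1 and compares the two with a pooled two-proportion z-test. The choice of test is an assumption made for illustration and may differ from the statistic defined in the Test Statistics section; the function name is hypothetical. The example uses Tsonga’s Deuce-side counts.

```python
import math

def two_proportion_pvalue(k1, n1, k2, n2):
    """Two-sided pooled z-test for equality of two proportions."""
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (k2 / n2 - k1 / n1) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail

# Tsonga, Deuce side, serve counts in the order wide, body, T (Table 1).
counts_np = [220, 56, 320]  # non-pressure points
counts_p = [140, 25, 138]   # pressure points

total_np, total_p = sum(counts_np), sum(counts_p)
freq_wide_np = counts_np[0] / total_np  # share of wide serves, non-pressure
freq_wide_p = counts_p[0] / total_p     # share of wide serves, pressure
p_value = two_proportion_pvalue(counts_np[0], total_np, counts_p[0], total_p)
print(f"{freq_wide_np:.2f} {freq_wide_p:.2f} {p_value:.4f}")
```

The recomputed frequencies round to the tabulated 0.37 and 0.46, and under this assumed test the difference is significant at the 1% level, in line with the corresponding $H_2$ entry for wide serves.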

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

