Sequential Rationality in Continuous No-Limit Poker

Thomas W. L. Norman

doi:10.3390/g5020092

Magdalen College, University of Oxford, High Street, Oxford OX1 4AU, UK

Games 2014, 5(2), 92–96; https://doi.org/10.3390/g5020092

Abstract

Newman’s (1959, Operations Research, 7, 557–560) solution for a variant of poker with continuous hand spaces and an unlimited bet size is modified to incorporate sequential rationality.

Keywords: poker; mixed strategies; sequential rationality

Newman [1] analyzes a model of ‘real’ poker, where a player may bet any amount without limit. In particular, each of two risk-neutral individuals, A and B, pays an ante of 1 unit into the pot in return for a private hand (x for A, y for B) dealt uniformly at random from the interval $(0, 1)$. A then chooses a bet $β \geq 0$ to put into the pot. B finally decides either to call (matching the bet) or fold. If B calls, both players’ hands are revealed and the higher hand receives the pot of $2 (β + 1)$; if they have the same hand, the pot is split evenly between them. If B folds, the hands remain private and A wins the pot of $β + 2$.

Newman provides a solution where A bets 0 if his hand lies in the interval $(1/7, 4/7)$, and bets β with either hand $x_{β}^{-} = (1/7) (1 - 3 ξ^{2} + 2 ξ^{3})$ or $x_{β}^{+} = 1 - (3/7) ξ^{2}$, where $ξ = 2 / (β + 2)$; B then calls the bet β if and only if his hand exceeds $y_{β}^{*} = 1 - (6/7) ξ$.^1 An increasing amount $β (x)$ is thus bet by A hands in the interval $(4/7, 1)$, with an asymptote at 1, and these bets are mimicked by hands in the interval $(0, 1/7)$, as illustrated in Figure 1.
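The formulas of Newman's solution quoted above can be evaluated directly. The following is a minimal sketch in exact rational arithmetic; the function name `newman_solution` is an illustrative assumption, not something taken from Newman's paper.

```python
from fractions import Fraction


def newman_solution(beta):
    """Return (x_minus, x_plus, y_star) for a bet beta > 0 (hypothetical helper)."""
    xi = Fraction(2) / (Fraction(beta) + 2)                 # xi = 2 / (beta + 2)
    x_minus = Fraction(1, 7) * (1 - 3 * xi**2 + 2 * xi**3)  # low hand that bets beta
    x_plus = 1 - Fraction(3, 7) * xi**2                     # high hand that bets beta
    y_star = 1 - Fraction(6, 7) * xi                        # B calls iff his hand exceeds this
    return x_minus, x_plus, y_star


x_minus, x_plus, y_star = newman_solution(2)
print(x_minus, x_plus, y_star)  # 1/14 25/28 4/7
```

For a bet of 2, the cutoff 4/7 lies strictly between the two betting hands 1/14 and 25/28, which is the ordering the text relies on.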

Hence, B knows that the bet β was made by one of two possible hands, $x_{β}^{-}$ or $x_{β}^{+}$, but does not know which one; he is then indifferent at a cutoff $y_{β}^{*}$ somewhere in between the two possible hands. As Newman notes, his solution is a pure strategy for both A and B; hence, the solution exhibits the pure-strategy “bluffing” encountered, for instance, in the simple von Neumann and Morgenstern [2] (pp. 211–219) limited-bet model (and in the continuous case of the bounded bet space model of Bellman and Blackwell [3]).

Figure 1. Newman’s solution.

Since Newman’s original paper, however, a great deal of progress has of course been made in refining Nash equilibria to capture sequential rationality in finite extensive-form games with perfect recall [4,5]. Here I adapt sequential rationality for the infinite-game setting of Newman. A sequentially rational player must optimise not just in the game overall, but at each of his information sets. In Newman’s game, for a sequentially rational B, the cutoff hand $y_{β}^{*}$ must be indifferent between calling β and folding:

$$(2 β + 2) \Pr (x < y_{β}^{*} \mid β) + 0 \cdot \Pr (x > y_{β}^{*} \mid β) - β = 0 \qquad (1)$$

and for (weak) consistency of beliefs, it must be the case that

$$\frac{\Pr (x < y_{β}^{*}) \Pr (β \mid x < y_{β}^{*})}{\Pr (β)} = \frac{β}{2 β + 2} \qquad (2)$$

by Bayes’ rule. These are the requirements of a weak perfect Bayesian equilibrium [6,7]: strategies are sequentially rational and beliefs are derived from Bayes’ rule along the equilibrium path. If each information set were reached with strictly positive probability, such a weak perfect Bayesian equilibrium would automatically be a sequential equilibrium [5]; indeed, this would make Bayesian Nash equilibrium sufficient for sequential equilibrium.

With the infinite strategy set of A, however, whilst each of B’s information sets is along the equilibrium path under Newman’s solution, they are not all reached with strictly positive probability. Indeed, $\Pr (β) = \Pr (x = x_{β}^{-}) + \Pr (x = x_{β}^{+}) = 0$ for all $β > 0$, and as a result, the left-hand side of Equation (2) is undefined. In response to this problem, I adopt the approach of Aumann [8], Jung [9] and González Díaz and Meléndez-Jiménez [10], and work directly with conditional probabilities. This is equivalent to the use of Bayes’ rule in finite games, but is also well-defined in infinite games; any conditionally updated beliefs must agree at information sets along the equilibrium path (even if they occur with probability zero). Now, a bet of β is equally likely to have been made by hands $x_{β}^{-}$ and $x_{β}^{+}$ (and no others) under Newman’s solution, so that Equation (1) implies

$$\Pr (x < y_{β}^{*} \mid β) = \frac{β}{2 β + 2} = \frac{1}{2}$$

for all $β > 0$, a contradiction, since $β / (2 β + 2) < 1/2$ for every finite β. Newman’s solution thus violates sequential rationality under conditional updating. Since Bayesian Nash equilibrium requires such sequential rationality for almost every β in Newman’s game (i.e., except on a set of Lebesgue measure zero, given the uniform hand distribution and $β (x)|_{(0, 1/7)} = β (x)|_{(4/7, 1)}$ a homeomorphism), his solution is not a Bayesian Nash equilibrium.
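To make the contradiction concrete, the sketch below compares the belief that Equation (1) requires with the belief of 1/2 obtained when the two betting hands are treated as equally likely. The helper name `required_low_hand_belief` is an assumption introduced here for illustration only.

```python
from fractions import Fraction


def required_low_hand_belief(beta):
    """Pr(x < y* | beta) that leaves B's cutoff hand indifferent, per Equation (1)."""
    beta = Fraction(beta)
    return beta / (2 * beta + 2)


for beta in (1, 2, 10, 1000):
    required = required_low_hand_belief(beta)
    gap = Fraction(1, 2) - required  # algebraically 1 / (2 * beta + 2), never zero
    print(beta, required, gap)
```

The gap shrinks as the bet grows but stays strictly positive, so the equal-likelihood belief cannot satisfy Equation (1) at any finite positive bet.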

However, a modification of Newman’s solution does satisfy sequential rationality. To see this, note that the lowest that $y_{β}^{*}$ can be for positive β is 1/7, the lower limit of the interval of hands with which A checks. But then, given any bet $β > 0$, A’s expected payoff from betting β is the same with any hand below 1/7; these hands win against the same folded B hands and lose against the same calling hands. In particular, all hands $x \in (0, 1/7)$ win 2 when B folds and lose β when he calls; hence, they receive the expected payoff

$$2 \left(1 - \frac{6}{7} ξ\right) - \frac{6}{7} ξ β = \frac{2}{7} \qquad (3)$$

from any bet $β > 0$, which exceeds the $2 x$ they get from betting 0. We are thus free to have each $x \in (0, 1/7)$ mix between positive bets with the probabilities required for conditional updating of B’s beliefs. For instance, we might have each $x \in (0, 1/7)$ play a mixed strategy putting probability $p_{β (x)} = β (x) / (β (x) + 2)$ on Newman’s bet $β (x)$, with the complementary probability spread uniformly over a countably infinite subset $C (x)$ of $\mathbb{R}^{+} \setminus \{β (x)\}$, with $\bigcap_{x \in (0, 1/7)} C (x) = \emptyset$.^2 Thus, hand $x_{2}^{-} = 1/14$ would bet 2 with probability $1/2$, and could spread the remaining $1/2$ probability uniformly over the set $\{\frac{1}{14}, 1 \frac{1}{14}, 2 \frac{1}{14}, \dots\}$. Any $x \in [1/7, 1)$, meanwhile, still bets according to Newman’s solution; the bet 2 would thus be made with probability 1 by hand $x_{2}^{+} = 25/28$. With this strategy profile, a bet of $β > 0$ is made with positive probability only by hands $x_{β}^{-}$ and $x_{β}^{+}$, with probabilities $p_{β}$ and 1 respectively (and with probability 0 by all other hands), so that B’s indifference condition

$$\Pr (x < y_{β}^{*} \mid β) = \frac{p_{β}}{p_{β} + 1} = \frac{β}{2 β + 2}$$

is satisfied for each $β > 0$, giving sequential rationality under conditional updating.^3
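Two identities carry this argument: the bluffing payoff in Equation (3) does not depend on the bet, and the mixing probability $p_{β}$ reproduces the belief B needs for indifference. The sketch below checks both in exact arithmetic; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def bluff_probability(beta):
    """p_beta = beta / (beta + 2): weight the low hand puts on Newman's bet."""
    beta = Fraction(beta)
    return beta / (beta + 2)


def posterior_low_hand(beta):
    """B's belief that the bettor holds the low hand: p_beta / (p_beta + 1)."""
    p = bluff_probability(beta)
    return p / (p + 1)


def bluff_payoff(beta):
    """Left-hand side of Equation (3): win 2 if B folds, lose beta if B calls."""
    beta = Fraction(beta)
    xi = 2 / (beta + 2)
    fold_prob = 1 - Fraction(6, 7) * xi  # B folds when his hand is below y*
    return 2 * fold_prob - (1 - fold_prob) * beta


for beta in (Fraction(1, 2), 2, 7, 100):
    b = Fraction(beta)
    assert posterior_low_hand(b) == b / (2 * b + 2)  # B's indifference condition
    assert bluff_payoff(b) == Fraction(2, 7)         # Equation (3)
print(bluff_probability(2), posterior_low_hand(2), bluff_payoff(2))  # 1/2 1/3 2/7
```

Because the payoff is 2/7 for every positive bet, hands below 1/7 are indifferent across all positive bets, which is what leaves them free to mix.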

There are obviously infinitely many such equilibria, and infinitely many more where hands $x \in (0, 1/7)$ play different mixed strategies that “average out” to the same conditionally updated beliefs for B, but these are all equivalent in terms of their outcome. This outcome admits the intuitive interpretation of low hands bluffing as a mixed (rather than a pure) strategy, consistent with the findings of Borel [11] and Chen and Ankenman [12] (pp. 154–157). Whilst Borel restricts the first player to a single possible bet (or fold), Chen and Ankenman study a game that differs from the present one only in the drawing of hands from the closed (rather than open) unit interval $[0, 1]$. This means that a worst possible and best possible hand exist in their model, and a bet of infinity must be allowed by the player with the best possible hand, effectively imposing an upper limit on the bet. Their solution has similarities to the one presented here, but also an important difference: hands $x \in (0, 1/7)$ can bet 0 with positive probability. This requires B hands $y \in (0, 1/7)$ to fold to a bet of 0; otherwise $x \in (0, 1/7)$ would prefer positive bets to 0 (see Equation (3)). This is irrational for $y \in (0, 1/7)$, since they have a positive probability of winning the pot at zero cost by calling. In my solution, by contrast, all B hands call a bet of 0, and no $x \in (0, 1/7)$ bets 0 with positive probability as a result.^4

Finally, although Newman’s game has infinitely many equilibria, I claim that there are no equilibria giving a different outcome. To see this, note that each player has an infinite set of (measurable) pure Bayesian strategies, $σ_{A} : (0, 1) \to [0, \infty)$ and $σ_{B} : [0, \infty) \times (0, 1) \to \{\text{call}, \text{fold}\}$ respectively. Each player’s set of mixed Bayesian strategies, P and Q respectively, is then a convex subset of a topological vector space. Moreover, B’s pure strategy space is an (uncountable) product of compact spaces, and hence compact by Tychonoff’s Theorem; endowing his mixed strategy space Q with the weak* topology, that too is compact. B’s expected gain $K : P \times Q \to \mathbb{R}$ from the game (net of his ante) is then continuous in each of its arguments, quasi-convex in p and quasi-concave in q (by linearity of expectation and of B’s gain in the players’ actions). It follows by Sion’s [13] general minimax theorem (see Raghavan [14], pp. 751–752) that the game has a value. The value of the game to A, net of the ante, is

$$\int_{0}^{\frac{1}{7}} \frac{2}{7} \, d x + \int_{\frac{1}{7}}^{\frac{4}{7}} 2 x \, d x + \int_{\frac{4}{7}}^{1} \left( 2 (β + 1) \left(x - 1 + \frac{6}{7} ξ\right) + (β + 2) \left(1 - \frac{6}{7} ξ\right) - β \right) d x - 1 = \frac{1}{7}$$

which is unchanged from Newman’s solution, because the mixed strategies of hands $x \in (0, 1/7)$ have the same expected payoff as the corresponding pure strategies of Newman. The existence of a value here is less obvious than in Chen and Ankenman [12] (pp. 154–157), where both players’ strategy spaces are compact (the possibility of betting infinity providing the Alexandroff extension); nonetheless, the compactness of B’s strategy space in Newman’s game is enough to avoid the arms race in bet size that would make for nonexistence of the value.
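The stated value of 1/7 can be checked numerically. The sketch below is only a sanity check under stated assumptions: it inverts $x = 1 - (3/7) ξ^{2}$ to recover the bet made by a hand in $(4/7, 1)$, and uses a plain midpoint rule in place of an exact integration. All function names are hypothetical.

```python
import math


def bet_and_xi(x):
    """For x in (4/7, 1), invert x = 1 - (3/7) xi^2 and xi = 2 / (beta + 2)."""
    xi = math.sqrt(7.0 * (1.0 - x) / 3.0)
    return 2.0 / xi - 2.0, xi


def value_bet_payoff(x):
    """Integrand of the third integral: A's gross payoff from betting beta(x)."""
    beta, xi = bet_and_xi(x)
    y_star = 1.0 - 6.0 / 7.0 * xi  # B folds below y_star, calls above it
    return 2.0 * (beta + 1.0) * (x - y_star) + (beta + 2.0) * y_star - beta


def midpoint(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def value_to_A():
    low = (1.0 / 7.0) * (2.0 / 7.0)                    # hands in (0, 1/7) earn 2/7
    mid = midpoint(lambda x: 2.0 * x, 1.0 / 7.0, 4.0 / 7.0)  # checking hands earn 2x
    high = midpoint(value_bet_payoff, 4.0 / 7.0, 1.0)  # value-betting hands
    return low + mid + high - 1.0                      # subtract A's ante


print(value_to_A(), 1.0 / 7.0)
```

The midpoint rule never evaluates the integrand at $x = 1$, where the bet is unbounded, so no special handling of the asymptote is needed in this sketch.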

Acknowledgments

I thank the editor and referees, as well as Joe Perkins and seminar participants at the University of Oxford, for their comments and suggestions. I am also grateful for the hospitality of the Center for Gaming Research, University of Nevada Las Vegas.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Newman, D.J. A model for ‘Real’ poker. Oper. Res. 1959, 7, 557–560.
2. Von Neumann, J.; Morgenstern, O. Theory of Games and Economic Behavior; Princeton University Press: Princeton, NJ, USA, 1944.
3. Bellman, R.; Blackwell, D. Some two-person games involving bluffing. Proc. Nat. Acad. Sci. USA 1949, 35, 600–605.
4. Selten, R. Reexamination of the perfectness concept for equilibrium points in extensive games. Int. J. Game Theory 1975, 4, 25–55.
5. Kreps, D.; Wilson, R. Sequential equilibria. Econometrica 1982, 50, 863–894.
6. Mas-Colell, A.; Whinston, M.D.; Green, J.R. Microeconomic Theory; Oxford University Press: New York, NY, USA, 1995.
7. Myerson, R.B. Game Theory: Analysis of Conflict; Harvard University Press: Cambridge, MA, USA, 1991.
8. Aumann, R.J. Mixed and Behavior Strategies in Infinite Extensive Games. In Annals of Mathematics Studies 52; Dresher, M., Shapley, L.S., Tucker, A.W., Eds.; Princeton University Press: Princeton, NJ, USA, 1964; pp. 627–650.
9. Jung, H.M. Perfect Regular Equilibrium; MPRA Paper No. 26534; 2010; Available online: www.econ.sinica.edu.tw/english/webtools/thumbnail/download/?fd=2013092416160643896_-PDoc&Pname=Perfect_Regular_Equilibrium.pdf (accessed on 14 April 2014).
10. González Díaz, J.; Meléndez-Jiménez, M.A. On the notion of perfect Bayesian equilibrium. TOP 2014, 22, 128–143.
11. Borel, E. Traité du Calcul des Probabilités et de ses Applications, Volume IV, Fascicule 2, Applications aux Jeux de Hasard; Gauthier-Villars: Paris, France, 1938.
12. Chen, B.; Ankenman, J. The Mathematics of Poker; ConJelCo: Pittsburgh, PA, USA, 2006.
13. Sion, M. On general minimax theorems. Pac. J. Math. 1958, 8, 171–176.
14. Raghavan, T.E.S. Zero-Sum Two-Person Games. In Handbook of Game Theory Volume 2; Aumann, R.J., Hart, S., Eds.; Elsevier: Amsterdam, The Netherlands, 1994; pp. 735–759.

^1. All B hands call a bet of 0.
^2. This mixed strategy is well-defined, since $\lim_{n \to \infty} \sum_{i = 1}^{n} \frac{2}{n (β (x) + 2)} = \frac{2}{β (x) + 2}$. If positive probability were placed uniformly on an uncountable set U of bets, the strategy would not be a probability measure as $\int_{U} \frac{2}{z (β (x) + 2)} \, d z$ would then be divergent. To see that $\bigcap_{x \in (0, 1/7)} C (x) = \emptyset$ is possible, note that a countable union of countable sets is countable, hence $\mathbb{R}^{+} \setminus \{β (x)\}$ is the union of uncountably many disjoint countable sets.
^3. $\bigcap_{x \in (0, 1/7)} C (x) = \emptyset$ prevents uncountably many hands mixing (with probability 0) over a given bet β; otherwise, countable additivity would be insufficient to make the sum of these probabilities 0.
^4. It can be verified that Newman’s ξ equals Chen and Ankenman’s [12] (p. 113) $1 / (1 + s)$, since their s equals the ratio $β / 2$ of the bet to the antes (pot); hence, their $x (s)$ and $y (s)$ coincide with Newman’s solution for $s > 0$ (noting their reversed hand order on the unit interval). Note also that my $p_{β} = 1 - ξ$, which then equals Chen and Ankenman’s ratio $α = s / (1 + s)$ of bluffs to value bets. The departure from my solution arises from their $x (0) = 6/7$, whereas all B hands calling a bet of 0 (as here) would imply that their $x (0)$ would equal 1.

© 2014 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
