On the Nash Equilibria of a Duel with Terminal Payoffs

Kehagias, Athanasios

doi:10.3390/g14050062


by

Athanasios Kehagias

Department of Electrical and Computer Engineering, Aristotle University, 54124 Thessaloniki, Greece

Games 2023, 14(5), 62; https://doi.org/10.3390/g14050062

Submission received: 28 August 2023 / Revised: 17 September 2023 / Accepted: 19 September 2023 / Published: 21 September 2023

(This article belongs to the Special Issue Learning and Evolution in Games, 1st Edition)


Abstract

We formulate and study a two-player duel game as a terminal payoffs stochastic game. Players $P_{1}, P_{2}$ are standing in place and, in every turn, each may shoot at the other (in other words, abstention is allowed). If $P_{n}$ shoots $P_{m}$ ($m \neq n$), either they hit and kill them (with probability $p_{n}$) or they miss and $P_{m}$ is unaffected (with probability $1 - p_{n}$). The process continues until at least one player dies; if no player ever dies, the game lasts an infinite number of turns. Each player receives a positive payoff upon killing their opponent and a negative payoff upon being killed. We show that the unique stationary equilibrium is for both players to always shoot at each other. In addition, we show that the game also possesses “cooperative” (i.e., non-shooting) non-stationary equilibria. We also discuss a certain similarity that the duel has to the iterated Prisoner’s Dilemma.

Keywords:

duel; Nash equilibrium; stochastic games

1. Introduction

In this paper, we study a two-player duel game played in turns. Players $P_{1}, P_{2}$ are standing in place and, in each turn, each player may shoot at the other; in other words, abstention is allowed. If $P_{n}$ shoots at $P_{m}$ ($m \neq n$), either they hit and kill them or they miss and $P_{m}$ is unaffected; the respective probabilities are $p_{n}$ ($P_{n}$’s marksmanship) and $1 - p_{n}$. The process continues until at least one player dies; if no player ever dies then the game lasts an infinite number of turns. We formulate the above as a stochastic game with terminal payoffs. The precise game rules and players’ payoffs will be presented in Section 2.

Little work has been done on the duel. In fact, to the best of our knowledge, it has only been studied as a preliminary step in the study of the “truel”, in which three stationary players shoot at each other. In early works on the truel [1,2,3,4], the postulated game rules guarantee the existence of exactly one survivor (“winner”). In an important early paper [5], the somewhat paradoxical result of “survival of the weakest” is established; namely for certain marksmanship combinations, the player with lowest marksmanship has the highest probability of survival. A more general analysis appears in a further study [6], which considers the possibility of “cooperation” between the players, in the sense that each player has the option of abstaining, i.e., not shooting at their opponent in one or more turns of the game. This idea is further studied by Kilgour (for the simultaneous truel) [7] and (the sequential truel) [8,9]. These papers are, to the best of our knowledge, the first to address the truel problem using a rigorous game theoretic analysis. Kilgour formulates both the simultaneous and sequential truel as stochastic games with terminal payoffs (i.e., the players receive a single payoff at the end of the game) and obtains Nash equilibria, under appropriate conditions. A similar analysis appears in a further study [10], where, however, the truel is formulated as a discounted stochastic game. Recent papers on the truel include: Refs. [11,12,13,14] where, among other innovations, the truel is formulated as an extensive form game; Refs. [15,16,17,18], where a Markov chain formulation of several truel variants is presented; and Refs. [19,20,21], in which truels among N players are studied, with each player being represented by a node in a scale-free network.1

Several applications of the duel and, more frequently, of the truel have been proposed in the above literature. The truel has been used to model behavior in confrontation situations [25] and in political conflicts [26]. A truel variant has been used as a model of opinion dissemination [17]. Business applications have been presented in a further study [27], in which it is shown that, under certain conditions, weaker companies can grow stronger and stronger companies can grow weaker with all the parties eventually converging. In legal studies, the truel has been used to explore equality issues [28]. Last but not least, the nuel (an N-person generalization of the duel and truel) has been used in biology to explain the maintenance of variation in natural populations [29] and study marriage and reproduction mechanisms [30]. Furthermore, the truel is relevant to the existence of “suicidal strategies” employed by cells and bacteria [31,32].

A common characteristic of all the above-mentioned works is that they limit themselves to the study of stationary strategies. As we will show in the current paper, the duel also possesses Nash equilibria in non-stationary strategies and it is safe to assume that the same is true of the truel and the nuel (the N-player generalization of the duel and truel).

While the above papers focus on various forms of the truel, we believe that the duel is interesting in its own right and has not received the attention it deserves. In particular we will show that, under our formulation, the duel has a certain similarity to the iterated Prisoner’s Dilemma (IPD) and possesses “cooperative” Nash equilibria in non-stationary strategies.

In this paper, we study two versions of the duel with terminal payoffs. The rest of the paper is structured as follows. In Section 2, we define the game rigorously. In Section 3 we establish the existence of equilibria in stationary strategies. In Section 4, we discuss some similarities between the game and the IPD. In Section 5, we prove that the duel also has equilibria in non-stationary strategies (namely grim cooperation and Tit-for-Tat). In Section 6, we summarize our results and propose some future research directions.

2. Game Description

Our duel game involves players $P_{1}, P_{2}$ and evolves in discrete time steps (turns) $t \in \{1, 2, \dots\}$. The state at time $t$ is

s (t) = s_{1} (t) s_{2} (t) \in S = \{11, 10, 01, 00\} .

For $n \in \{1, 2\}$, $s_{n} (t)$ is $P_{n}$’s state at $t \in \{0, 1, 2, \dots\}$ and can be

\begin{matrix} s_{n} (t) = 1 : & \text{when } P_{n} \text{ is alive at the } t\text{-th turn}; \\ s_{n} (t) = 0 : & \text{when } P_{n} \text{ is dead at the } t\text{-th turn} . \end{matrix}

$P_{n}$’s action at $t \in \{1, 2, \dots\}$ is $f_{n} (t)$, which can be $F$ ($P_{n}$ is shooting) or $A$ ($P_{n}$ is not shooting). If $f_{n} (t) = F$ then: (a) we have $s_{- n} (t) = 0$ (i.e., $P_{- n}$ dies; see Note 2) with probability $p_{n} \in (0, 1)$ and (b) $s_{- n} (t) = 1$ with probability $1 - p_{n}$. We set $f (t) = f_{1} (t) f_{2} (t)$ and $p = (p_{1}, p_{2})$. We assume throughout the paper that for $n \in \{1, 2\}$, $p_{n} \in (0, 1)$, i.e., it is strictly between zero and one.

The game starts at an initial state $s (0)$; obviously, the case of interest is $s (0) = 11$. At times $t \in \{1, 2, \dots\}$, the players simultaneously choose their actions $f_{1} (t)$, $f_{2} (t)$ and the game moves to state $s (t)$ according to the conditional state transition probability $Pr (s (t) | s (t - 1), f (t))$. If we number the states as follows

00 \to 1, 01 \to 2, 10 \to 3, 11 \to 4,

then we get a “controlled” transition probability matrix $Π (ϕ)$ where

Π_{i j} (ϕ) = Pr (s (t) = j | s (t - 1) = i, f = ϕ) .

For every action vector, a terminal state $s \in \{1, 2, 3\}$ (or, equivalently, $s \in \{00, 01, 10\}$) transits to itself with probability one; i.e., for all $i \in \{1, 2, 3\}$:

\begin{matrix} Π_{i i} (AA) & = Pr (s (t) = i | s (t - 1) = i, f = AA) = 1; \\ Π_{i i} (AF) & = Pr (s (t) = i | s (t - 1) = i, f = AF) = 1; \\ Π_{i i} (FA) & = Pr (s (t) = i | s (t - 1) = i, f = FA) = 1; \\ Π_{i i} (FF) & = Pr (s (t) = i | s (t - 1) = i, f = FF) = 1 . \end{matrix}

Transitions from the state $s = 4$ (or $s = 11$) are a little more complicated. Consider, for example, the case $f = FF$, i.e., when both players fire. Then, letting ${\bar{p}}_{n} = 1 - p_{n}$ for $n \in \{1, 2\}$, we have:

\begin{matrix} Π_{41} (FF) & = Pr (s (t) = 00 | s (t - 1) = 11, f = FF) = Pr (“ P_{1} hits P_{2}, P_{2} hits P_{1} ”) = p_{1} p_{2}; \\ Π_{42} (FF) & = Pr (s (t) = 01 | s (t - 1) = 11, f = FF) = Pr (“ P_{1} misses P_{2}, P_{2} hits P_{1} ”) = {\bar{p}}_{1} p_{2}; \\ Π_{43} (FF) & = Pr (s (t) = 10 | s (t - 1) = 11, f = FF) = Pr (“ P_{1} hits P_{2}, P_{2} misses P_{1} ”) = p_{1} {\bar{p}}_{2}; \\ Π_{44} (FF) & = Pr (s (t) = 11 | s (t - 1) = 11, f = FF) = Pr (“ P_{1} misses P_{2}, P_{2} misses P_{1} ”) = {\bar{p}}_{1} {\bar{p}}_{2} . \end{matrix}

The elements of the other matrices are computed similarly, yielding

\begin{matrix} Π (AA) = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}] & Π (AF) = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & p_{2} & 0 & {\bar{p}}_{2} \end{matrix}] \\ Π (FA) = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & p_{1} & {\bar{p}}_{1} \end{matrix}] & Π (FF) = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ p_{1} p_{2} & {\bar{p}}_{1} p_{2} & p_{1} {\bar{p}}_{2} & {\bar{p}}_{1} {\bar{p}}_{2} \end{matrix}] \end{matrix}
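The four matrices can be generated mechanically from the shooting rules. The following Python sketch is purely illustrative: the function names and the encoding of an action pair as a two-character string such as "FA" are assumptions made here, not part of the original formulation. States are ordered 00, 01, 10, 11, as in the numbering above.

```python
def transition_row_from_11(action, p1, p2):
    """Probabilities of moving from state 11 to (00, 01, 10, 11).

    `action` is a two-character string: first character is P1's action,
    second is P2's; "F" means shoot and "A" means abstain (assumed encoding).
    """
    hit1 = p1 if action[0] == "F" else 0.0  # chance that P1 kills P2 this turn
    hit2 = p2 if action[1] == "F" else 0.0  # chance that P2 kills P1 this turn
    return (
        hit1 * hit2,              # 00: both players die
        (1 - hit1) * hit2,        # 01: P1 dies, P2 survives
        hit1 * (1 - hit2),        # 10: P1 survives, P2 dies
        (1 - hit1) * (1 - hit2),  # 11: both players survive
    )


def transition_matrix(action, p1, p2):
    """4x4 controlled transition matrix; the three terminal states are absorbing."""
    absorbing = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(3)]
    return absorbing + [list(transition_row_from_11(action, p1, p2))]


if __name__ == "__main__":
    for action in ("AA", "AF", "FA", "FF"):
        print(action, transition_matrix(action, 0.3, 0.6)[3])
```

Only the last row depends on the action pair, which is why a single helper for the row out of state 11 is enough.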

From the above matrices (or from game rules), we see that there exist two possibilities.

The game stays in state 11 ad infinitum (no player is ever killed);
At some $t^{'}$ the game moves to a state $s (t^{'}) \in \{10, 01, 00\}$ (one or both players are killed). These are terminal states, i.e., as soon as they are reached, the game terminates.

When the game reaches a terminal state $s$, $P_{n}$ ($n \in \{1, 2\}$) receives payoff $q_{n} (s)$ as follows:

\begin{matrix} q_{1} (10) = a_{1} & q_{2} (10) = - b_{2} \\ q_{1} (01) = - b_{1} & q_{2} (01) = a_{2} \\ q_{1} (00) = a_{1} - b_{1} & q_{2} (00) = a_{2} - b_{2} \end{matrix}

where we assume that for $n \in \{1, 2\}$, $a_{n} > 0$ and $b_{n} > 0$. We set $a = (a_{1}, a_{2})$ and $b = (b_{1}, b_{2})$.

A finite history is a sequence $h = s (0) f (1) s (1) \dots f (T) s (T)$, a non-terminal finite history is an $h = s (0) f (1) s (1) \dots f (T) s (T)$ where $s (T) = 11$ and an infinite history is an $h = s (0) f (1) s (1) \dots$. An admissible history is one which conforms to the game rules; the set of all admissible finite (resp. infinite) histories is denoted by $H^{*}$ (resp. $H^{\infty}$); ${\bar{H}}^{*}$ denotes the set of all non-terminal finite histories. The set of all histories is $H = H^{*} \cup H^{\infty}$. It will be useful to define payoff as a function $Q_{n} : H \to R$ as follows

Q_{n} (h) = \begin{cases} q_{n} (s (T)) & \text{if } h = s (0) f (1) s (1) \dots f (T) s (T) \in H^{*}, s (T) \text{ is terminal}, \\ 0 & \text{if } h = s (0) f (1) s (1) \dots f (T) s (T) \in H^{*}, s (T) \text{ is non-terminal}, \\ 0 & \text{if } h \in H^{\infty} . \end{cases}

Note that if the game never terminates, both players receive zero payoff.

A strategy for $P_{n}$ is a function $σ_{n} : {\bar{H}}^{*} \to [0, 1]$; for every non-terminal finite history $h$, $σ_{n} (h)$ is the probability that $P_{n}$ will shoot $P_{- n}$, given that the current history is $h$:

σ_{n} (h) = Pr (\text{“} P_{n} \text{ shoots } P_{- n} \text{”}) .

A stationary strategy is a $σ_{n}$ depending only on the current state $s$, hence we simply write $σ_{n} (s)$. Since a stationary strategy $σ_{n}$ depends only on the current state, it is fully determined by the values $σ_{n} (s)$ for $s \in \{00, 01, 10, 11\}$, i.e., by

σ_{n} (00), σ_{n} (01), σ_{n} (10), σ_{n} (11) .

But any admissible strategy (i.e., one compatible with the game rules) must assign

σ_{n} (00) = σ_{n} (01) = σ_{n} (10) = 0 .

Consequently, a stationary strategy is determined by a single number $x_{n} = σ_{n} (11)$.

A strategy profile is a vector $σ = (σ_{1}, σ_{2})$. We denote the set of all admissible strategies by $Σ$ and the set of all admissible stationary strategies by $\bar{Σ}$.

An initial state $s (0)$ and two strategies $σ_{1}$ and $σ_{2}$ (used, respectively, by $P_{1}$ and $P_{2}$) determine a probability measure on the set of all histories; hence we can define the expected payoffs

\forall n \in \{1, 2\} : Q_{n} (s (0), σ_{1}, σ_{2}) = E_{s (0), σ_{1}, σ_{2}} (Q_{n} (h)) .

We have, thus, formulated the terminal payoffs duel as a game. We are interested in the game that starts at $s (0) = 11$, which we will denote by $Γ (p, a, b)$. We assume that $P_{1}$ and $P_{2}$ are looking for a Nash equilibrium (NE), i.e., a strategy profile $({\hat{σ}}_{1}, {\hat{σ}}_{2})$ such that

\forall n \in \{1, 2\}, \forall σ_{n} \in Σ : Q_{n} (11, {\hat{σ}}_{n}, {\hat{σ}}_{- n}) \geq Q_{n} (11, σ_{n}, {\hat{σ}}_{- n}) .

3. Stationary Equilibria

As already noted, an admissible stationary strategy $σ_{1}$ for $P_{1}$ is fully determined by $x_{1} = σ_{1} (11) = Pr (P_{1} \text{ shoots } P_{2})$; i.e., $σ_{1}$ is determined by a single variable $x_{1} \in [0, 1]$. Similarly, every admissible stationary strategy $σ_{2}$ for $P_{2}$ is fully determined by a single variable $x_{2} \in [0, 1]$. Hence, we will often speak of the strategy $x_{n}$ (rather than $σ_{n}$) and the strategy profile $(x_{1}, x_{2})$ (rather than $(σ_{1}, σ_{2})$). When $P_{1}$ and $P_{2}$ use strategies $x_{1}$ and $x_{2}$, the state sequence is a Markov chain; using the previous numbering of states and writing ${\bar{x}}_{n} = 1 - x_{n}$, we have the transition probability matrix

Π (x_{1}, x_{2}) = [\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ x_{1} p_{1} x_{2} p_{2} & (x_{1} {\bar{p}}_{1} + {\bar{x}}_{1}) x_{2} p_{2} & x_{1} p_{1} (x_{2} {\bar{p}}_{2} + {\bar{x}}_{2}) & (x_{1} {\bar{p}}_{1} + {\bar{x}}_{1}) (x_{2} {\bar{p}}_{2} + {\bar{x}}_{2}) \end{matrix}]

Also, we can define

V_{n} (x_{1}, x_{2}) = Q_{n} (11, (x_{1}, x_{2})) .

If $(x_{1}, x_{2}) = (0, 0)$ we obviously get $V_{1} (0, 0) = 0$. Otherwise, if $(x_{1}, x_{2}) \neq (0, 0)$, then we have the following equation for $V_{1}$ (temporarily omitting the $x_{1}, x_{2}$ arguments for brevity of notation):

\begin{matrix} V_{1} = x_{1} p_{1} (x_{2} {\bar{p}}_{2} + {\bar{x}}_{2}) a_{1} - (x_{1} {\bar{p}}_{1} + {\bar{x}}_{1}) x_{2} p_{2} b_{1} \\ + x_{1} p_{1} x_{2} p_{2} (a_{1} - b_{1}) + ({\bar{x}}_{1} + x_{1} {\bar{p}}_{1}) (x_{2} {\bar{p}}_{2} + {\bar{x}}_{2}) V_{1} . \end{matrix}

The equation is obtained as follows: the expected payoff from state 11 is the sum of four terms:

The transition to state 10 gives payoff $a_{1}$ and takes place with probability $x_{1} p_{1}$ ( $P_{1}$ shot and hit $P_{2}$ ) multiplied by $(x_{2} {\bar{p}}_{2} + {\bar{x}}_{2})$ ( $P_{2}$ either shot and missed or did not shoot);
The transition to state 01 gives payoff $- b_{1}$ and takes place with probability $x_{2} p_{2}$ ( $P_{2}$ shot and hit $P_{1}$ ) multiplied by $(x_{1} {\bar{p}}_{1} + {\bar{x}}_{1})$ ( $P_{1}$ either shot and missed or did not shoot);
The transition to state 00 gives payoff $a_{1} - b_{1}$ and takes place with probability $x_{1} p_{1}$ ( $P_{1}$ shot and hit $P_{2}$ ) multiplied by $x_{2} p_{2}$ ( $P_{2}$ shot and hit $P_{1}$ );
The transition to state 11 gives payoff $V_{1}$ (it is as if the game starts from the beginning) and takes place with probability $({\bar{x}}_{1} + x_{1} {\bar{p}}_{1})$ ( $P_{1}$ either shot and missed or did not shoot) multiplied by $(x_{2} {\bar{p}}_{2} + {\bar{x}}_{2})$ ( $P_{2}$ either shot and missed or did not shoot).

After some algebra, the $V_{1}$ equation is simplified to

V_{1} = x_{1} p_{1} a_{1} - x_{2} p_{2} b_{1} + V_{1} x_{1} p_{1} x_{2} p_{2} - V_{1} x_{1} p_{1} - V_{1} x_{2} p_{2} + V_{1}

and has the following solution (see Note 3):

V_{1} (x_{1}, x_{2}) = \frac{x_{1} p_{1} a_{1} - x_{2} p_{2} b_{1}}{1 - (1 - x_{1} p_{1}) (1 - x_{2} p_{2})} .

By the same analysis for $V_{2} (x_{1}, x_{2})$, we finally get the expressions

\begin{matrix} V_{1} (x_{1}, x_{2}) & = \{\begin{matrix} 0 & if x_{1} = x_{2} = 0 \\ \frac{x_{1} p_{1} a_{1} - x_{2} p_{2} b_{1}}{1 - (1 - x_{1} p_{1}) (1 - x_{2} p_{2})} & otherwise \end{matrix} \end{matrix}

(1)

\begin{matrix} V_{2} (x_{1}, x_{2}) & = \{\begin{matrix} 0 & if x_{1} = x_{2} = 0 \\ \frac{x_{2} p_{2} a_{2} - x_{1} p_{1} b_{2}}{1 - (1 - x_{1} p_{1}) (1 - x_{2} p_{2})} & otherwise \end{matrix} \end{matrix}

(2)
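Formulas (1) and (2) are easy to check numerically. The sketch below is an illustration under our own naming (the functions `stationary_payoffs` and `v1_by_iteration` and the sample parameter values are assumptions, not taken from the paper): it evaluates the closed forms and compares $V_{1}$ with the value obtained by repeatedly applying the one-step equation for $V_{1}$, starting from zero.

```python
def stationary_payoffs(x1, x2, p, a, b):
    """Closed-form payoffs (V1, V2) of Formulas (1) and (2).

    x1, x2: shooting probabilities in state 11; p, a, b: pairs (p1, p2), etc.
    """
    u, w = x1 * p[0], x2 * p[1]  # per-turn kill probabilities of P1 and P2
    if u == 0 and w == 0:
        return 0.0, 0.0          # nobody ever shoots: the game never ends
    stop = 1 - (1 - u) * (1 - w)  # probability that a given turn is the last
    v1 = (u * a[0] - w * b[0]) / stop
    v2 = (w * a[1] - u * b[1]) / stop
    return v1, v2


def v1_by_iteration(x1, x2, p, a, b, turns=2000):
    """Approximate V1 by iterating the four-term one-step equation."""
    u, w = x1 * p[0], x2 * p[1]
    v = 0.0
    for _ in range(turns):
        v = (u * (1 - w) * a[0]            # transition to 10
             - (1 - u) * w * b[0]          # transition to 01
             + u * w * (a[0] - b[0])       # transition to 00
             + (1 - u) * (1 - w) * v)      # game stays in 11
    return v


if __name__ == "__main__":
    p, a, b = (0.3, 0.6), (1.0, 1.0), (2.0, 2.0)
    print(stationary_payoffs(0.5, 0.8, p, a, b))
    print(v1_by_iteration(0.5, 0.8, p, a, b))
```

The iteration converges geometrically because the probability of remaining in state 11 is strictly below one whenever at least one player shoots with positive probability.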

Proposition 1.

The only stationary NE of $Γ (p, a, b)$ is $(x_{1}, x_{2}) = (1, 1)$.

Proof.

Suppose that $P_{1}$ and $P_{2}$ use the profile $(x_{1}, x_{2})$. To determine whether this is an NE, from $P_{1}$’s point of view we have to check whether they have anything to gain by unilaterally deviating to some other strategy $σ_{1}$. A crucial fact is that we only have to check whether $P_{1}$ gains by switching to another stationary strategy. This is true because, if $P_{2}$ uses the stationary strategy $x_{2}$, then $P_{1}$ must solve a Markov Decision Process problem; it is well known that in this case they gain nothing by using non-stationary strategies [33].

Let us first check whether $(0, 0)$ is a Nash equilibrium. If $P_{1}$ deviates to another stationary strategy $x_{1} > 0$, we will have

V_{1} (0, 0) - V_{1} (x_{1}, 0) = 0 - \frac{x_{1} p_{1} a_{1} - 0 \cdot p_{2} b_{1}}{1 - (1 - x_{1} p_{1}) (1 - 0 \cdot p_{2})} = - a_{1} < 0 .

Hence, $(0, 0)$ cannot be an NE. Next, take any $(x_{1}, x_{2}) \neq (0, 0)$ and suppose $P_{1}$ deviates to $y_{1}$. Then

\begin{matrix} V_{1} (x_{1}, x_{2}) - V_{1} (y_{1}, x_{2}) \\ = \frac{x_{1} p_{1} a_{1} - x_{2} p_{2} b_{1}}{1 - (1 - x_{2} p_{2}) (1 - x_{1} p_{1})} - \frac{y_{1} p_{1} a_{1} - x_{2} p_{2} b_{1}}{1 - (1 - x_{2} p_{2}) (1 - y_{1} p_{1})} \\ = \frac{p_{1} x_{2} p_{2} (a_{1} + b_{1} (1 - x_{2} p_{2})) (x_{1} - y_{1})}{(1 - (1 - x_{1} p_{1}) (1 - x_{2} p_{2})) (1 - (1 - y_{1} p_{1}) (1 - x_{2} p_{2}))} \end{matrix}

The denominator is positive and, for $x_{2} > 0$, the numerator has the sign of $x_{1} - y_{1}$. Hence, the sign of $V_{1} (x_{1}, x_{2}) - V_{1} (y_{1}, x_{2})$ is the same as that of $x_{1} - y_{1}$ and consequently, $P_{1}$ never (resp. always) has an incentive to deviate from $x_{1}$ to a smaller (resp. greater) $y_{1}$. The same arguments can be applied to $P_{2}$ and their strategy $x_{2}$ (in particular, a profile with $x_{2} = 0 < x_{1}$ is ruled out because $P_{2}$ then gains by increasing $x_{2}$). It follows that the only stationary NE is $(x_{1}, x_{2}) = (1, 1)$ and this completes the proof. □
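The monotonicity used in the proof can also be observed numerically. The following sketch is only a sanity check, not a substitute for the argument: it searches a grid of stationary strategies for $P_{1}$'s best reply to a fixed $x_{2} > 0$. The function names, the grid resolution and the sample parameters are assumptions made for this illustration.

```python
def v1(x1, x2, p1, p2, a1, b1):
    """P1's payoff under the stationary profile (x1, x2), as in Formula (1)."""
    u, w = x1 * p1, x2 * p2
    if u == 0 and w == 0:
        return 0.0
    return (u * a1 - w * b1) / (1 - (1 - u) * (1 - w))


def best_reply_of_p1(x2, p1, p2, a1, b1, steps=100):
    """Grid point x1 in {0, 1/steps, ..., 1} that maximizes P1's payoff."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x1: v1(x1, x2, p1, p2, a1, b1))


if __name__ == "__main__":
    # Even with a large penalty b1 for dying, always shooting is the best reply.
    for x2 in (0.1, 0.5, 1.0):
        print(x2, best_reply_of_p1(x2, p1=0.3, p2=0.6, a1=1.0, b1=5.0))
```

For every tested $x_{2} > 0$ the maximizer on the grid is $x_{1} = 1$, in line with Proposition 1.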

4. Connection to the Iterated Prisoner’s Dilemma

Applying Formulas (1) and (2) to $(x_{1}, x_{2}) \in \{(0, 0), (0, 1), (1, 0), (1, 1)\}$, we get

\begin{matrix} V_{1} (0, 0) = 0 & V_{2} (0, 0) = 0 \\ V_{1} (0, 1) = - b_{1} & V_{2} (0, 1) = a_{2} \\ V_{1} (1, 0) = a_{1} & V_{2} (1, 0) = - b_{2} \\ V_{1} (1, 1) = \frac{p_{1} a_{1} - p_{2} b_{1}}{1 - (1 - p_{2}) (1 - p_{1})} & V_{2} (1, 1) = \frac{p_{2} a_{2} - p_{1} b_{2}}{1 - (1 - p_{2}) (1 - p_{1})} \end{matrix}

It can immediately be seen that

- b_{1} = V_{1} (0, 1) < 0 = V_{1} (0, 0) < a_{1} = V_{1} (1, 0)

and if we identify the strategy $x_{n} = 0$ (never shooting at the opponent) with “cooperation” and the strategy $x_{n} = 1$ (always shooting at the opponent) with “defection”, the above inequalities remind us of the Prisoner’s Dilemma (PD). The similarity would be complete if the additional inequalities

V_{1} (0, 1) < V_{1} (1, 1) < V_{1} (0, 0)

also held; because in this case we would have

V_{1} (0, 1) < V_{1} (1, 1) < V_{1} (0, 0) < V_{1} (1, 0)

(3)

which corresponds exactly to the well known sequence of PD inequalities [22]:

S < P < R < T .

Now, (3) is equivalent to

- b_{1} < \frac{p_{1} a_{1} - p_{2} b_{1}}{1 - (1 - p_{2}) (1 - p_{1})} < 0 < a_{1} .

The first inequality is equivalent to

0 < \frac{p_{1} a_{1} - p_{2} b_{1}}{1 - (1 - p_{2}) (1 - p_{1})} + b_{1} = p_{1} \frac{a_{1} + b_{1} (1 - p_{2})}{p_{1} + p_{2} (1 - p_{1})}

which is always satisfied. The second inequality is

\frac{p_{1} a_{1} - p_{2} b_{1}}{1 - (1 - p_{2}) (1 - p_{1})} < 0,

which will be satisfied iff

p_{1} a_{1} < p_{2} b_{1} .

The third inequality is always satisfied. Similarly, the inequalities

V_{2} (0, 1) < V_{2} (1, 1) < V_{2} (0, 0) < V_{2} (1, 0)

(4)

will be satisfied iff

p_{2} a_{2} < p_{1} b_{2} .

Combining the above, we get the following “PD-like condition”

\frac{a_{2}}{b_{2}} < \frac{p_{1}}{p_{2}} < \frac{b_{1}}{a_{1}}

(5)

which is necessary and sufficient to have the following ordering of the payoffs

\begin{matrix} V_{1} (0, 1) & < V_{1} (1, 1) < V_{1} (0, 0) < V_{1} (1, 0) \end{matrix}

(6)

\begin{matrix} V_{2} (0, 1) & < V_{2} (1, 1) < V_{2} (0, 0) < V_{2} (1, 0) \end{matrix}

(7)
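As a quick illustration of condition (5), the sketch below tests it for given parameters and compares the result with a direct check of the ordering (6) for $P_{1}$. The function names and the two parameter sets are assumptions chosen for this example only.

```python
def pd_like(p, a, b):
    """Condition (5): a2/b2 < p1/p2 < b1/a1."""
    return a[1] / b[1] < p[0] / p[1] < b[0] / a[0]


def mutual_defection_payoffs(p, a, b):
    """(V1(1,1), V2(1,1)): payoffs when both players always shoot."""
    stop = 1 - (1 - p[0]) * (1 - p[1])
    return ((p[0] * a[0] - p[1] * b[0]) / stop,
            (p[1] * a[1] - p[0] * b[1]) / stop)


def ordering_holds_for_p1(p, a, b):
    """Ordering (6): V1(0,1) < V1(1,1) < V1(0,0) < V1(1,0)."""
    v11 = mutual_defection_payoffs(p, a, b)[0]
    return -b[0] < v11 < 0.0 < a[0]


if __name__ == "__main__":
    a, b = (1.0, 1.0), (2.0, 2.0)
    for p in ((0.5, 0.5), (0.9, 0.1)):
        print(p, pd_like(p, a, b), ordering_holds_for_p1(p, a, b))
```

With equal marksmanship the condition holds and mutual shooting gives $P_{1}$ a negative payoff; with $p = (0.9, 0.1)$ the much better shot $P_{1}$ obtains a positive payoff from mutual shooting, so the PD-like ordering fails.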

In light of (6) and (7), we will call the never-shooting strategy $x_{n} = 0$ (which henceforth will also be denoted by $σ^{C}$) the cooperating strategy, and the always-shooting strategy $x_{n} = 1$ (which henceforth will also be denoted by $σ^{D}$) the defecting strategy. The terminology is inspired by the analogy to the PD. Namely, in both the PD and the duel, both players would have a higher payoff if they adhered to $(σ^{C}, σ^{C})$; but this is not an NE and each player has an incentive to switch to $σ^{D}$. Consequently, rational players will follow the strategy profile $(σ^{D}, σ^{D})$, which, while being an NE, yields a lower payoff to both players (see Note 4).

As is well known, cooperative NE do exist for the iterated PD, and these involve the use of non-stationary strategies, such as grim-cooperation and Tit-for-Tat (TfT). Hence, in the next section, we will show that there exist corresponding non-stationary cooperative strategies which are NE of $Γ (p, a, b)$.

Before concluding this section, it is worth discussing in what ways our duel game $Γ (p, a, b)$ differs from the IPD. Three obvious differences are:

The IPD is a deterministic game, while $Γ (p, a, b)$ involves randomness;
In the IPD, each player receives a payoff in every turn and the total payoff is the discounted (by a discount factor $γ$ ) sum of turn payoffs, while in $Γ (p, a, b)$ , payoff is obtained only at the final turn and is undiscounted;
The IPD will last an infinite number of turns, while $Γ (p, a, b)$ may (depending on the p values and the strategy used) terminate in a finite number of turns (in fact, it may be the case that it will terminate in a finite number of turns with probability one).

However, there is a formulation of the IPD in which the payoffs are not discounted but the game may terminate in every turn with a positive probability $p = 1 - γ > 0$. In this formulation, the IPD is also a random game and will terminate in a finite number of turns with probability one; the total expected payoff of each player equals the discounted payoff of the deterministic IPD version.
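The equality of the two payoffs can be verified in one line. As a sketch, assume (the notation is introduced here only for illustration) that $r_{t}$ is a player's bounded payoff in turn $t$ along a fixed play path, and that after every turn the game independently continues with probability $γ$ and stops with probability $1 - γ$, so that turn $t$ is reached with probability $γ^{t - 1}$. Then

```latex
E\Big[ \sum_{t = 1}^{\infty} r_{t} \, \mathbf{1}\{\text{turn } t \text{ is played}\} \Big]
  = \sum_{t = 1}^{\infty} \Pr(\text{turn } t \text{ is played}) \, r_{t}
  = \sum_{t = 1}^{\infty} γ^{t - 1} r_{t} ,
```

which is the discounted sum of the turn payoffs with discount factor $γ$.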

5. Non-Stationary Equilibria

Drawing upon similar results for the IPD, we will now show that the duel has cooperative NE in non-stationary strategies. The first such strategy we introduce is the grim cooperation strategy $σ^{G}$, which is defined as follows for $P_{n}$ ($n \in \{1, 2\}$):

σ^{G} : \begin{matrix} \text{As long as } P_{- n} \text{ does not shoot } P_{n}, P_{n} \text{ never shoots } P_{- n}; \\ \text{if } P_{- n} \text{ shoots } P_{n} \text{ at round } t, \text{ then } P_{n} \text{ shoots } P_{- n} \text{ at all rounds } t^{'} > t . \end{matrix}

This strategy was originally used in the analysis of the IPD.

Proposition 2.

$(σ^{G}, σ^{G})$ is an NE of $Γ (p, a, b)$ iff

\frac{b_{1}}{a_{1}} > \frac{1 + p_{2} (1 - p_{1})}{1 - p_{1}} \cdot \frac{p_{1}}{p_{2}} and \frac{b_{2}}{a_{2}} > \frac{1 + p_{1} (1 - p_{2})}{1 - p_{2}} \cdot \frac{p_{2}}{p_{1}} .

(8)

Proof.

We have

V_{1}^{G} = Q_{1} (11, (σ^{G}, σ^{G})) = 0

since, if both players adhere to $σ^{G}$, nobody will ever get killed. Next, let us consider possible $P_{1}$ strategies $σ_{1}$ deviating from $σ^{G}$. It is easy to see that it suffices to consider the strategy $σ^{D}$, because, as soon as $P_{1}$ deviates from $σ^{G}$, $P_{2}$ will shoot at $P_{1}$ on every turn and hence, $P_{1}$ has no incentive to not shoot; furthermore, if $P_{1}$ deviates from $σ^{G}$, they might as well deviate on the first turn. Now, let us compute

V_{1}^{R} = Q_{1} (11, (σ^{D}, σ^{G})) .

If $P_{1}$ uses $σ^{D}$ at $t = 1$, then $P_{2}$ will also switch to $σ^{D}$ at times $t \in \{2, 3, \dots\}$. Hence, writing $V_{1}^{D} = V_{1} (1, 1)$ for $P_{1}$’s payoff when both players always shoot, $P_{1}$’s expected payoff will be

\begin{matrix} V_{1}^{R} & = p_{1} a_{1} + (1 - p_{1}) (0 + V_{1}^{D}) \\ = p_{1} a_{1} + (1 - p_{1}) \frac{p_{1} a_{1} - p_{2} b_{1}}{1 - (1 - p_{2}) (1 - p_{1})} \\ = \frac{p_{2} (1 - p_{1}) (p_{1} a_{1} - b_{1}) + p_{1} a_{1}}{1 - (1 - p_{1}) (1 - p_{2})} . \end{matrix}

For $(σ^{G}, σ^{G})$ to be an NE, we must have $V_{1}^{R} < V_{1}^{G}$, which is equivalent to

p_{2} (1 - p_{1}) (p_{1} a_{1} - b_{1}) + p_{1} a_{1} < 0 .

(9)

By assumption

\begin{matrix} \frac{1 + p_{2} (1 - p_{1})}{p_{2} (1 - p_{1})} p_{1} a_{1} - b_{1} & < 0 \Leftrightarrow \\ \frac{- b_{1} p_{2} + b_{1} p_{2} p_{1} + p_{1} a_{1} + p_{1} a_{1} p_{2} - p_{1}^{2} a_{1} p_{2}}{p_{2} (1 - p_{1})} & < 0 \Leftrightarrow \\ - b_{1} p_{2} + b_{1} p_{2} p_{1} + p_{1} a_{1} + p_{1} a_{1} p_{2} - p_{1}^{2} a_{1} p_{2} & < 0 \Leftrightarrow \\ p_{2} (1 - p_{1}) (p_{1} a_{1} - b_{1}) + p_{1} a_{1} & < 0 . \end{matrix}

Hence, (9) holds and $P_{1}$ has no incentive to deviate from $σ^{G}$. By a similar analysis, we can also show that $P_{2}$ has no incentive to deviate from $σ^{G}$. This completes the proof. □

Remark 1.

The duel NE conditions (8) imply

\begin{matrix} \frac{b_{1}}{a_{1}} & > \frac{1 + p_{2} (1 - p_{1})}{1 - p_{1}} \cdot \frac{p_{1}}{p_{2}} \Rightarrow \frac{p_{1}}{p_{2}} < \frac{1 - p_{1}}{1 + p_{2} (1 - p_{1})} \cdot \frac{b_{1}}{a_{1}} < \frac{b_{1}}{a_{1}}, \\ \frac{b_{2}}{a_{2}} & > \frac{1 + p_{1} (1 - p_{2})}{1 - p_{2}} \cdot \frac{p_{2}}{p_{1}} \Rightarrow \frac{p_{2}}{p_{1}} < \frac{1 - p_{2}}{1 + p_{1} (1 - p_{2})} \cdot \frac{b_{2}}{a_{2}} < \frac{b_{2}}{a_{2}} . \end{matrix}

Hence, the conditions (8) are stronger than the originally postulated condition (5) for the existence of a “PD-like” ordering in the duel.
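The gap between conditions (5) and (8) can be seen on a concrete example. The following sketch uses hypothetical function names and arbitrary sample values; it evaluates $P_{1}$'s payoff $V_{1}^{R}$ from always shooting against grim cooperation and compares its sign with $P_{1}$'s half of each condition.

```python
def defect_vs_grim_payoff(p1, p2, a1, b1):
    """V_1^R: P1 shoots every turn; P2 abstains once, then shoots every turn."""
    stop = 1 - (1 - p1) * (1 - p2)
    v_both_shoot = (p1 * a1 - p2 * b1) / stop  # V_1(1, 1)
    return p1 * a1 + (1 - p1) * v_both_shoot


def grim_condition_p1(p1, p2, a1, b1):
    """P1's half of condition (8)."""
    return b1 / a1 > (1 + p2 * (1 - p1)) / (1 - p1) * p1 / p2


def pd_condition_p1(p1, p2, a1, b1):
    """P1's half of condition (5): p1/p2 < b1/a1."""
    return p1 / p2 < b1 / a1


if __name__ == "__main__":
    for b1 in (2.0, 3.0):
        print(b1,
              pd_condition_p1(0.5, 0.5, 1.0, b1),
              grim_condition_p1(0.5, 0.5, 1.0, b1),
              defect_vs_grim_payoff(0.5, 0.5, 1.0, b1))
```

With $p_{1} = p_{2} = 0.5$ and $a_{1} = 1$, the value $b_{1} = 2$ satisfies the PD-like inequality but not the grim condition, and deviating pays (about $0.167 > 0$); with $b_{1} = 3$ both hold and deviating yields about $-0.167 < 0$.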

Now, we will define another non-stationary cooperative strategy, which will turn out to be an NE of the duel. This is the Tit-for-Tat strategy $σ^{T f T}$, defined for $P_{n}$ ($n \in \{1, 2\}$) as follows:

σ^{T f T} : \begin{matrix} \text{In the first turn } P_{n} \text{ does not shoot } P_{- n}; \\ \text{at every other turn } P_{n} \text{ performs the same action (shooting or not shooting)} \\ \text{that } P_{- n} \text{ performed in the previous round} . \end{matrix}

This strategy was also originally used in the analysis of the iterated PD.
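To make the two history-dependent strategies concrete, here is a minimal simulation sketch. It is restricted to pure strategies and represents a history simply by the lists of past shooting decisions; this representation, the function names and the `max_turns` cut-off are all assumptions of the illustration rather than part of the formal model.

```python
import random


def grim(own_shots, other_shots):
    """Grim cooperation: shoot iff the opponent has shot in some earlier turn."""
    return any(other_shots)


def tit_for_tat(own_shots, other_shots):
    """Tit-for-Tat: abstain first, then copy the opponent's previous action."""
    return bool(other_shots) and other_shots[-1]


def always_shoot(own_shots, other_shots):
    """The defecting strategy: shoot in every turn."""
    return True


def play(strategy1, strategy2, p1, p2, rng, max_turns=1000):
    """Play from state 11; return (final state, list of (shot1, shot2) pairs).

    The game is cut off after max_turns turns, standing in for infinite play.
    """
    shots1, shots2 = [], []
    for _ in range(max_turns):
        f1 = strategy1(shots1, shots2)
        f2 = strategy2(shots2, shots1)
        shots1.append(f1)
        shots2.append(f2)
        p2_dies = f1 and rng.random() < p1  # P1 shoots and hits
        p1_dies = f2 and rng.random() < p2  # P2 shoots and hits
        if p1_dies or p2_dies:
            state = f"{int(not p1_dies)}{int(not p2_dies)}"
            return state, list(zip(shots1, shots2))
    return "11", list(zip(shots1, shots2))


if __name__ == "__main__":
    rng = random.Random(0)
    print(play(grim, grim, 0.3, 0.6, rng, max_turns=20)[0])
    print(play(always_shoot, tit_for_tat, 0.3, 0.6, rng))
```

Two cooperating players never shoot, so the simulated game stays in state 11 until the cut-off; against a player who always shoots, both grim cooperation and Tit-for-Tat abstain in the first turn only and shoot afterwards.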

Proposition 3.

$(σ^{T f T}, σ^{T f T})$ is an NE of $Γ (p, a, b)$ iff

\frac{b_{1}}{a_{1}} > \frac{1 + p_{2} (1 - p_{1})}{1 - p_{1}} \cdot \frac{p_{1}}{p_{2}} and \frac{b_{2}}{a_{2}} > \frac{1 + p_{1} (1 - p_{2})}{1 - p_{2}} \cdot \frac{p_{2}}{p_{1}} .

(10)

Proof.

If both players play the strategy $σ^{T f T}$, then they never shoot at each other and their payoffs are

\forall n \in \{1, 2\} : V_{n}^{T f T} = Q_{n} (11, σ^{T f T}, σ^{T f T}) = Q_{n} (11, σ^{C}, σ^{C}) = 0 .

Now, suppose that $P_{2}$ adheres to $σ^{T f T}$ but $P_{1}$ deviates. If $P_{1}$ gains by deviating from $σ^{T f T}$ at some turn, then they must also gain by shooting at $P_{2}$ in the first turn. If they do so, then $P_{2}$ shoots at $P_{1}$ for all subsequent turns, until $P_{1}$ reverts to not firing. Thus, $P_{1}$ has two options after their first deviation.

They can continue shooting in all subsequent turns, in which case, so will $P_{2}$ ;
They can revert to not shooting, in which case, in the next turn, they are in the same situation as at the start of the game.

Consequently, if $P_{1}$ can increase their payoff by deviating, then they can do so, either (a) by shooting in every turn, or (b) by alternating between shooting and not shooting. If we find conditions under which $P_{1}$ cannot increase their payoff by either of the above strategies, then, under the same conditions, $P_{1}$ cannot increase their payoff by deviating, which implies that $(σ^{T f T}, σ^{T f T})$ is an NE.

1. Consider first the case in which $P_{1}$ adopts the strategy $σ^{D}$ of shooting in each turn. Then we have

$Q_{1} (11, σ^{D}, σ^{T f T}) = Q_{1} (11, (σ^{D}, σ^{G})) = V_{1}^{R}$

and, by the same analysis as in the proof of Proposition 2, we know that $V_{1}^{T f T} - V_{1}^{R} > 0$ iff

$p_{2} (1 - p_{1}) (p_{1} a_{1} - b_{1}) + p_{1} a_{1} < 0$

which is equivalent to our assumption

$\frac{b_{1}}{a_{1}} > \frac{1 + p_{2} (1 - p_{1})}{1 - p_{1}} \cdot \frac{p_{1}}{p_{2}} .$

2. Next consider the case in which $P_{1}$ alternates between shooting and not shooting. Then their payoff will be

$V_{1}^{S} = p_{1} a_{1} + (1 - p_{1}) (0 + p_{2} (- b_{1}) + (1 - p_{2}) V_{1}^{S}) .$

The above equation holds because the expected payoff $V_{1}^{S}$ is computed by summing the following possibilities. $P_{1}$ will certainly shoot and then:
(a) With probability $p_{1}$ , $P_{1}$ will kill $P_{2}$ and hence, receive payoff $a_{1}$ ;
(b) With probability $1 - p_{1}$ , $P_{1}$ will miss (and receive zero payoff) and in the next turn $P_{2}$ will shoot and kill $P_{1}$ ; this combination has probability $(1 - p_{1}) p_{2}$ and gives to $P_{1}$ payoff $- b_{1}$ ;
(c) With probability $1 - p_{1}$ , $P_{1}$ will miss and in the next turn $P_{2}$ will shoot and miss $P_{1}$ ; this combination has probability $(1 - p_{1}) (1 - p_{2})$ and returns the game to the original state, in which $P_{1}$ receives payoff $V_{1}^{S}$ .

Simplifying the above equation and solving we obtain

$V_{1}^{S} = \frac{p_{1} a_{1} - p_{2} b_{1} + p_{2} b_{1} p_{1}}{p_{1} + p_{2} - p_{2} p_{1}} .$

For an NE we must have $V_{1}^{T f T} - V_{1}^{S} > 0$ and this will hold when

$0 > p_{1} a_{1} - p_{2} b_{1} + p_{2} b_{1} p_{1} \Leftrightarrow p_{2} b_{1} (1 - p_{1}) > p_{1} a_{1} \Leftrightarrow \frac{b_{1}}{a_{1}} > \frac{1}{1 - p_{1}} \cdot \frac{p_{1}}{p_{2}} .$

From our assumption (10), we have

$\frac{b_{1}}{a_{1}} > \frac{1 + p_{2} (1 - p_{1})}{1 - p_{1}} \cdot \frac{p_{1}}{p_{2}} > \frac{1}{1 - p_{1}} \cdot \frac{p_{1}}{p_{2}} .$

Hence $V_{1}^{T f T} - V_{1}^{S} > 0$ .

Combining 1 and 2, we see that $P_{1}$ has no advantage in deviating from $σ^{T f T}$; by a similar analysis, the same holds for $P_{2}$ and hence, the proof is completed. □
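The closed form for the alternating deviation can be cross-checked against its defining recursion. The sketch below is an illustration with assumed function names and arbitrary sample parameters: it computes $V_{1}^{S}$ both ways and shows a case in which alternating does not pay even though always shooting would.

```python
def alternating_payoff_closed_form(p1, p2, a1, b1):
    """V_1^S: P1 alternates shoot/abstain against Tit-for-Tat (closed form)."""
    return (p1 * a1 - p2 * b1 + p2 * b1 * p1) / (p1 + p2 - p2 * p1)


def alternating_payoff_by_iteration(p1, p2, a1, b1, cycles=500):
    """Iterate V = p1*a1 + (1 - p1) * (p2*(-b1) + (1 - p2)*V) from V = 0."""
    v = 0.0
    for _ in range(cycles):
        v = p1 * a1 + (1 - p1) * (p2 * (-b1) + (1 - p2) * v)
    return v


def always_shoot_payoff(p1, p2, a1, b1):
    """V_1^R: P1 shoots in every turn against Tit-for-Tat."""
    stop = 1 - (1 - p1) * (1 - p2)
    return p1 * a1 + (1 - p1) * (p1 * a1 - p2 * b1) / stop


if __name__ == "__main__":
    print(alternating_payoff_closed_form(0.3, 0.6, 1.0, 4.0))
    print(alternating_payoff_by_iteration(0.3, 0.6, 1.0, 4.0))
    # b1/a1 = 2.2 lies between the two thresholds 2 and 2.5 when p1 = p2 = 0.5.
    print(alternating_payoff_closed_form(0.5, 0.5, 1.0, 2.2),
          always_shoot_payoff(0.5, 0.5, 1.0, 2.2))
```

In the last example the alternating deviation has a negative payoff while always shooting has a positive one, which is consistent with the proof's observation that the always-shooting case imposes the stronger of the two requirements.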

Corollary 1.

The duel NE conditions (8) and (10) are the same. In other words, $(σ^{G}, σ^{G})$ is an NE of $Γ (p, a, b)$ iff $(σ^{T f T}, σ^{T f T})$ is an NE of $Γ (p, a, b)$.

Let us compare the stationary and non-stationary NE. Initially, we made no assumption regarding the relative size of $a_{n}$ and $b_{n}$ (although we did assume they are both positive). In other words, $a_{n} - b_{n}$ may be negative ($P_{n}$ places more value on surviving), positive ($P_{n}$ hates their opponent so much that they value killing them more than surviving) or zero. However, even when $b_{n} > a_{n}$ for both $n$, if the players limit themselves to using stationary strategies, then the only Nash equilibrium consists of both players shooting at each other with probability one (by Proposition 1); for the more desirable outcome of both players surviving to be (another) Nash equilibrium, they must use non-stationary strategies.

6. Conclusions

We have defined a turn-based duel game with terminal payoffs and shown that it has both stationary and non-stationary Nash equilibria. The non-stationary equilibria that we have established are the grim cooperation and Tit-for-Tat pairs. These are of the same form as the strategies of the same name used in the iterated Prisoner’s Dilemma; we were motivated to use these in the duel by the previously explained similarity between the payoff structure of our duel game and that of the IPD.

In addition to their independent interest, the above results have potential application to the truel and nuel problems. As we have pointed out, to the best of our knowledge, the literature on truel and nuel is limited to the study of stationary strategies. In the case of the duel, in addition to stationary NE, we also have non-stationary NE. We reported here two such non-stationary NE ($(σ^{G}, σ^{G})$ and $(σ^{T f T}, σ^{T f T})$), and it is not hard to construct additional ones, using an approach similar to that used in the study of repeated games [35]. We conjecture that, using the methods of the current paper, it is also possible to establish a plethora of non-stationary NE for the general, N-player nuel; we intend to pursue this research direction in the future.

Several variants of the duel can be formulated and are worth exploring. In addition to the variant described in this paper, we have explored a variant in which each player receives some discounted payoff for every turn in which they stay alive. Including those results (and the techniques required for their proof) would increase the size of the current paper inordinately; hence, they will be reported in a separate publication. Further variants to be explored in the future include:

sequential play, in which a single player is allowed to shoot in each turn;
random play, in which the player allowed to shoot in each turn is chosen at random, with equal probability.

In addition, in the future we intend to study the use of non-stationary strategies in truels and nuels.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

Notes

1	It should also be noted that there is an extensive literature on a quite different type of duel game, which is essentially a game of timing [22,23,24]. However, this literature is not relevant to the game studied in this paper.
2	In the sequel we use the standard game-theoretic notation by which $s_{-1} = s_{2}$, $s_{-2} = s_{1}$. The same notation is used for players, actions, etc.
3	Several parts of this paper require rather involved algebraic calculations. We have always performed these using the computer algebra system Maple and afterwards verified the results by hand.
4	We should clarify at this point that, despite the use of the terms “cooperation” and “cooperative”, the duel is not a cooperative game in Shapley’s sense [34]. In other words, it does not involve external enforcement of cooperative behavior. Instead, the duel is a non-cooperative game and “cooperation” is used in the same sense as in the Prisoner’s Dilemma literature; i.e., “cooperation” is understood as a spontaneous emergence of coordinated moves due to the players’ selfish behavior, rather than due to an explicit alliance mechanism.

References

1	Gardner, M. New Mathematical Puzzles and Diversions; Simon and Schuster: New York, NY, USA, 1966; pp. 42–49.
2	Kinnaird, C. Encyclopedia of Puzzles and Pastimes; Grosset & Dunlap: Secaucus, NJ, USA, 1946.
3	Larsen, H.D. A Dart Game. Am. Math. Mon. 1948, 3, 640–641.
4	Mosteller, F. Fifty Challenging Problems in Probability with Solutions; Courier Corporation: Chelmsford, MA, USA, 1987.
5	Shubik, M. Does the Fittest Necessarily Survive? In Readings in Game Theory and Political Behavior; Doubleday: New York, NY, USA, 1954.
6	Knuth, D.E. The Triel: A New Solution. J. Recreat. Math. 1972, 6, 1–7.
7	Kilgour, D.M. The simultaneous truel. Int. J. Game Theory 1971, 1, 229–242.
8	Kilgour, D.M. The sequential truel. Int. J. Game Theory 1975, 4, 151–174.
9	Kilgour, D.M. Equilibrium points of infinite sequential truels. Int. J. Game Theory 1977, 6, 167–180.
10	Zeephongsekul, P. Nash Equilibrium Points of Stochastic N-Uels. Recent Developments in Mathematical Programming; CRC Press: Boca Raton, FL, USA, 1991; pp. 425–452.
11	Bossert, W.; Brams, S.J.; Kilgour, D.M. Cooperative vs. non-cooperative truels: Little agreement, but does that matter? Games Econ. Behav. 2002, 40, 185–202.
12	Brams, S.J.; Kilgour, D.M. The truel. Math. Mag. 1997, 70, 315–326.
13	Brams, S.J.; Kilgour, D.M. Games That End in a Bang or a Whimper; preprint; CV Starr Center for Applied Economics: New York, NY, USA, 2001.
14	Brams, S.J.; Kilgour, D.M.; Dawson, B. Truels and the Future. Math Horiz. 2003, 10, 5–8.
15	Amengual, P.; Toral, R. Distribution of winners in truel games. AIP Conf. 2005, 779, 128–141.
16	Amengual, P.; Toral, R. A Markov chain analysis of truels. In Proceedings of the 8th Granada Seminar on Computational Physics, Granada, Spain, 7–11 February 2005.
17	Amengual, P.; Toral, R. Truels, or survival of the weakest. Comput. Sci. 2006, 8, 88–95.
18	Toral, R.; Amengual, P. Distribution of winners in truel games. In AIP Conference Proceedings; American Institute of Physics: College Park, MD, USA, 2005; Volume 779.
19	Dorraki, M.; Allison, A.; Abbott, D. Truels and strategies for survival. Sci. Rep. 2019, 9, 1–7.
20	Xu, X. Game of the truel. Synthese 2012, 185, 19–25.
21	Wegener, M.; Mutlu, E. The good, the bad, the well-connected. Int. J. Game Theory 2021, 50, 759–771.
22	Barron, E.N. Game Theory: An Introduction; John Wiley & Sons: Hoboken, NJ, USA, 2013.
23	Dresher, M. Games of Strategy: Theory and Applications; Rand Corp.: Santa Monica, CA, USA, 1961.
24	Karlin, S. Mathematical Methods and Theory in Games, Programming, and Economics; Addison-Wesley: Boston, MA, USA, 1959; Volume 2.
25	Cole, S.; Phillips, J.; Hartman, A. Test of a model of decision processes in an intense conflict situation. Behav. Sci. 1977, 22, 186–196.
26	Brams, S.J. Theory of moves. Am. Sci. 1993, 81, 562–570.
27	Dubovik, A.; Parakhonyak, A. Selective Competition; Econstor; Tinbergen Institute: Amsterdam, The Netherlands; Rotterdam, The Netherlands, 2009.
28	Crump, D. Game theory, legislation, and the multiple meanings of equality. Harv. J. Legis. 2001, 38, 331–412.
29	Archetti, M. Survival of the weakest in N-person duels and the maintenance of variation under constant selection. Evolution 2012, 66, 637–650.
30	Abbott, D. Developments in Parrondo’s paradox. In Applications of Nonlinear Dynamics; Springer: Berlin/Heidelberg, Germany, 2009; pp. 307–321.
31	Alberts, B.; Lewis, J.; Raff, M.; Roberts, K.; Walter, P. Molecular Biology of the Cell, 4th ed.; Garland Science: New York, NY, USA, 2002.
32	Ratzke, C.; Denk, J.; Gore, J. Ecological suicide in microbes. Nat. Ecol. Evol. 2018, 2, 867–872.
33	Sobel, M.J. Noncooperative stochastic games. Ann. Math. Stat. 1971, 42, 1930–1935.
34	Chakravarty, S.R.; Mitra, M.; Sarkar, P. A Course on Cooperative Game Theory; Cambridge University Press: Cambridge, UK, 2014.
35	Peters, H. Game Theory: A Multi-Leveled Approach; Springer: Berlin/Heidelberg, Germany, 2015.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
