Bargaining Mechanisms for One-Way Games

Abeliuk, Andrés; Berbeglia, Gerardo; Van Hentenryck, Pascal

doi:10.3390/g6030347


Bargaining Mechanisms for One-Way Games^†

by Andrés Abeliuk ^1,2,*, Gerardo Berbeglia ^1,3 and Pascal Van Hentenryck ^1,4

^1 National Information and Communications Technology Australia, NICTA Victoria Lab, West Melbourne, VIC 3003, Australia

^2 Department of Computing and Information Systems, The University of Melbourne, Parkville, VIC 3010, Australia

^3 Centre for Business Analytics, Melbourne Business School, The University of Melbourne, Parkville, VIC 3010, Australia

^4 Department of Industrial and Operations Engineering, University of Michigan, 1205 Beal Avenue, Ann Arbor, MI 48109, USA

^* Author to whom correspondence should be addressed.

^† An earlier, shorter version of this paper appeared in the Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), 2015 [1].

Games 2015, 6(3), 347-367; https://doi.org/10.3390/g6030347

Submission received: 1 June 2015 / Revised: 20 August 2015 / Accepted: 27 August 2015 / Published: 8 September 2015

(This article belongs to the Special Issue Bargaining Games)


Abstract:

We introduce one-way games, a two-player framework whose distinguishing feature is that the private payoff of one (independent) player is determined only by her own strategy and does not depend on the actions taken by the other (dependent) player. We show that the equilibrium outcome in one-way games without side payments and the social cost of any ex post efficient mechanism can be far from the optimum. We also show that it is impossible to design a Bayes–Nash incentive-compatible mechanism for one-way games that is budget-balanced, individually rational and efficient. To address this negative result, we propose a privacy-preserving mechanism based on a single bargaining offer made by the dependent player that leverages the intrinsic advantage of the independent player. In this setting, the outside option of the dependent player is not known a priori; however, we show that the mechanism satisfies individual rationality conditions, is incentive-compatible and budget-balanced, and produces an outcome that is more efficient than the equilibrium without payments. Finally, we show that a randomized multi-offer extension brings no additional benefit in terms of efficiency.

Keywords:

bargaining; mechanism design; price of anarchy; distributed problem solving

1. Introduction

When modeling economic interactions between agents, it is standard to adopt a general framework where payoffs of individuals are dependent on the actions of all other decision-makers. However, some agents may have payoffs that depend only on their own actions, not on actions taken by other agents. In this paper, we explore the consequences of such asymmetries among agents. Since these features lead to a restricted version of the general model, the hope is that we can identify mechanisms that produce efficient outcomes by exploiting the properties of this specific setting.

A classic application of this setting is Coase’s example of a polluter and a single victim, e.g., a steel mill that affects a laundry. The Coase theorem [2] is often interpreted as a demonstration of why private negotiations between polluters and victims can yield efficient levels of pollution without government interference. However, in an influential article, Hahnel and Sheeran [3] criticize the Coase theorem by showing that, under more realistic conditions, it is unlikely that an efficient outcome will be reached. They emphasize that the solution is a negotiation and not a market-based transaction, as described by Coase. As such, incomplete information plays an important role, and game theory and bargaining games can explain inefficient outcomes.

This paper aims at taking the first step in this direction by proposing a class of two-player one-way dependent decision settings, which formalizes many of Hahnel and Sheeran’s critiques. We present a number of negative and positive results on one-way games. We first show that Nash equilibria in one-way games, under no side payments, can be arbitrarily far from the optimal social welfare. Moreover, in contrast to the Coase theorem, we show that when side payments are allowed in a Bayes–Nash incentive-compatible setting, there is no ex post efficient individually rational and budget-balanced mechanism for one-way games. To address this negative result, we focus on mechanisms that are budget-balanced, individually rational and incentive-compatible that are relatively efficient. Our main positive result is a single-offer bargaining mechanism, which, under reasonable assumptions for the players, increases the social welfare compared to the setting where no side payments are allowed. We also show that this single-offer mechanism cannot be improved by a (randomized) multi-offer mechanism.

This paper is an extended version of the work presented at IJCAI-15 [1], and contains additional details, examples, and several generalizations and extensions of the model. More precisely, we have generalized our framework by analyzing the case where the outside option of one of the players is not known a priori in the bargaining mechanism. We also introduce a randomized multi-offer extension of the single-offer bargaining and show that it brings no additional benefit in terms of efficiency.

The rest of this paper is organized as follows. In Section 2, we define one-way games and study the properties of Nash equilibria without payments. Section 3 discusses related work. In Section 4, we review the desirable properties of Bayes–Nash mechanisms, and in Section 4.1, we prove an impossibility result. The single-offer mechanism is presented and analyzed in Section 5 and Section 5.1, respectively. Finally, in Section 6, we present a multi-offer mechanism, and we show that it does not improve the efficiency with respect to the single-offer one.

2. One-Way Games

One-way games feature two players A and B. Each player

i \in \{A, B\}

has a public action set

S_{i}

, and we write

S = S_{A} \times S_{B}

to denote the set of joint action profiles. As most commonly done in mechanism design, we model private information by associating each agent i with a payoff function

u_{i} : S \times Θ_{i} \to R^{+}

, where

u_{i} (s, θ_{i})

is the agent utility for strategy profile s when the agent has type

θ_{i}

. We assume that the player types are stochastically independent and drawn from a distribution f that is common knowledge. We denote by

Θ_{i}

the set of possible types of player i and write

Θ = Θ_{A} \times Θ_{B}

. If

θ \in Θ

, we use

θ_{i}

to denote the type of player i in θ. Similar conventions are used for strategies, utilities and type distributions.

A key feature of one-way games is that the payoff

u_{A} ((s_{A}, s_{B}), θ_{A})

of player A is determined only by her own strategy and does not depend on B’s actions, i.e.,

\forall s_{A}, s_{B}, s_{B}^{'}, θ_{A} : u_{A} ((s_{A}, s_{B}), θ_{A}) = u_{A} ((s_{A}, s_{B}^{'}), θ_{A})

As a result, for the ease of notation, we use

u_{A} (s_{A}, θ_{A})

to denote A’s payoff. Obviously, player B must act according to what player A chooses to do, and we use

s_{B} (s_{A}, θ_{B})

to denote the best response of player B given that A plays action

s_{A}

and player B has type

θ_{B}

, i.e.,

s_{B} (s_{A}, θ_{B}) = \underset{s_{B} \in S_{B}}{arg-max} u_{B} ((s_{A}, s_{B}), θ_{B})

where ties are broken arbitrarily, as we assume for every arg-max expression in this paper.

One-way games assume that players are risk-neutral agents and that after having observed the realization of their own types, players simultaneously choose their actions. As a consequence, if side payments are not allowed, player A will play an action

s_{A}^{N}

that yields her a maximum payoff, i.e.,

s_{A}^{N} (θ_{A}) = \underset{s_{A} \in S_{A}}{arg-max} u_{A} (s_{A}, θ_{A})

Player B will pick

s_{B}^{N} (θ_{B})

, such that her expected payoff is maximized, i.e.,

s_{B}^{N} (θ_{B}) = \underset{s_{B} \in S_{B}}{arg-max} E_{θ_{A}} [u_{B} ((s_{A}^{N} (θ_{A}), s_{B}), θ_{B})]

The set of Nash equilibria (NE) is thus characterized by

s^{N} (θ) = (s_{A}^{N} (θ_{A}), s_{B}^{N} (θ_{B})) \subseteq S

. The best response

s_{B}^{N} (θ_{B})

of player B may be a bad outcome for her even when B has a much greater potential payoff. Player A achieves her optimal payoff, but our motivating applications aim at optimizing a global welfare function:

S W ((s_{A}, s_{B}), θ) = u_{A} (s_{A}, θ_{A}) + u_{B} ((s_{A}, s_{B}), θ_{B})

Thus, the global welfare achieved by the Nash equilibria can be expressed as

S W (s^{N} (θ), θ)

. We quantify the quality of the Nash equilibrium outcome with the price of anarchy (PoA).

Definition 1.

The price of anarchy of

s^{N} (θ) \subseteq S

given type

θ \in Θ

is defined as:

P o A (θ) = \frac{{max}_{s \in S} S W (s, θ)}{{min}_{s \in s^{N} (θ)} S W (s, θ)}

A natural extension for the price of anarchy is to quantify the expected worst-case equilibrium.

Definition 2.

The Bayes–Nash price of anarchy is defined as:

P o A = E_{θ} [P o A (θ)]

Note that the price of anarchy given type

θ \in Θ

can be used to obtain a lower and an upper bound on the Bayes–Nash price of anarchy in the following way,

min_{θ \in Θ} P o A (θ) \leq P o A \leq max_{θ \in Θ} P o A (θ)

Throughout this paper, we use the following two running examples to illustrate key concepts.

Example 1.

Consider the instance where player A has two possible actions

s_{A}^{1}, s_{A}^{2} \in S_{A}

. Player A’s type

θ_{A} \in [0, 100]

is drawn from a uniform distribution between zero and 100. Action

s_{A}^{1}

has a payoff

u_{A} (s_{A}^{1}, θ_{A}) = θ_{A}

, while action

s_{A}^{2}

has a constant payoff

u_{A} (s_{A}^{2}, θ_{A}) = 100

. Let player B have only one possible type

θ_{B}

. The set of dominant actions for player B corresponds to the set of best responses

s_{B} (s_{A}^{1}, θ_{B})

and

s_{B} (s_{A}^{2}, θ_{B})

, and we set payoffs to be

u_{B} (s_{B} (s_{A}^{1}, θ_{B})) = x

and

u_{B} (s_{B} (s_{A}^{2}, θ_{B})) = 0

, where x is a positive constant. When no transfers are allowed, player A will always play action

s_{A}^{2}

, yielding a social welfare of

100 + 0

. If player A plays

s_{A}^{1}

, her expected payoff is 50, and the expected social welfare is thus

50 + x

. The price of anarchy is

\frac{50 + x}{100}

if

x \geq 50

and one otherwise. Notice that the PoA is an increasing function of x.
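The arithmetic of Example 1 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are ours, and welfare is taken in expectation over θ_A, exactly as in the example.

```python
def nash_welfare() -> float:
    """Expected welfare without transfers: A always plays s_A^2 (payoff 100), B gets 0."""
    return 100.0 + 0.0


def optimal_welfare(x: float) -> float:
    """Best expected welfare over A's two actions: 50 + x for s_A^1, 100 for s_A^2."""
    return max(50.0 + x, 100.0)


def poa_no_transfers(x: float) -> float:
    """Price of anarchy of Example 1 when side payments are not allowed."""
    return optimal_welfare(x) / nash_welfare()


if __name__ == "__main__":
    for x in (10, 50, 150, 950):
        print(x, poa_no_transfers(x))
```

The ratio stays at one until x reaches 50 and then grows linearly in x, which is the unbounded behavior discussed in the example.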

Example 2.

Consider the instance where player A has n possible actions

s_{A}^{1}, s_{A}^{2}, \dots, s_{A}^{n} \in S_{A}

. Player A’s type

θ_{A} \in {[0, 1]}^{n}

is drawn from n independent and identically distributed uniform distributions between zero and one. Each action has a payoff

u_{A} (s_{A}^{i}, θ_{A}) = θ_{A, i}

, where

θ_{A, i}

denotes the i-th element of vector

θ_{A}

. For player B, consider the set of best responses

s_{B} (s_{A}^{1}, θ_{B}), s_{B} (s_{A}^{2}, θ_{B}), \dots, s_{B} (s_{A}^{n}, θ_{B})

. Let the expected payoffs be

E_{θ_{B}} [u_{B} (s_{B} (s_{A}^{i}, θ_{B}), θ_{B})] = μ_{i}

for

i = 1, 2, \dots, n

, where

μ_{1} \geq μ_{2} \geq \dots \geq μ_{n}

. All other payoffs are set to zero, i.e., for any

θ_{B} \in Θ_{B}

,

s_{A} \in S_{A}

and

s_{B} \neq s_{B} (s_{A}, θ_{B})

:

u_{B} ((s_{A}, s_{B}), θ_{B}) = 0

.

When no transfers are allowed, player A chooses her (realized) maximizing payoff action. Such an action is distributed according to the largest order statistic, i.e., the maximum between all payoffs. The largest order statistic between n standard uniforms follows a

B e t a (n, 1)

distribution [4], with mean

\frac{n}{n + 1}

. Hence, the expected payoff of player A is

\frac{n}{n + 1}

.

By symmetry of player A’s payoff functions, all of her actions will be played with probability

\frac{1}{n}

. Thus, player B’s expected payoff is maximized when choosing

s_{B} (s_{A}^{1}, θ_{B})

, yielding her an expected payoff of

\frac{μ_{1}}{n}

. As a result, the expected social welfare in equilibrium is

\frac{n}{n + 1} + \frac{μ_{1}}{n}

.

To compute the optimal social welfare, note that if player A selects

s_{A}^{1}

with probability one, her expected payoff is

\frac{1}{2}

, and the expected payoff of player B is

μ_{1}

; and so, the social welfare is

\frac{1}{2} + μ_{1}

, which is greater than

\frac{n}{n + 1} + \frac{μ_{1}}{n}

if

\frac{1}{2} + μ_{1} \geq 1 + \frac{μ_{1}}{n}

, which in turn holds whenever

μ_{1} \geq 1

for every value of

n \geq 2

. The price of anarchy is

(\frac{1}{2} + μ_{1}) / (\frac{n}{n + 1} + \frac{μ_{1}}{n})

when

μ_{1} \geq 1

, and in the limit as

n \to \infty

, the PoA becomes

\frac{1}{2} + μ_{1}

. Notice that the PoA is an increasing function of

μ_{1}

.
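The closed-form expressions of Example 2 are easy to evaluate exactly. The sketch below uses Python's `fractions` module; the function names are ours, and the ratio is meaningful only in the regime μ_1 ≥ 1, n ≥ 2 considered in the example.

```python
from fractions import Fraction


def nash_welfare(n: int, mu1) -> Fraction:
    """Expected equilibrium welfare: n/(n+1) for player A plus mu1/n for player B."""
    return Fraction(n, n + 1) + Fraction(mu1) / n


def committed_welfare(mu1) -> Fraction:
    """Expected welfare when A always plays s_A^1: 1/2 for A plus mu1 for B."""
    return Fraction(1, 2) + Fraction(mu1)


def poa(n: int, mu1) -> Fraction:
    """Ratio reported in Example 2 (assumes mu1 >= 1 and n >= 2)."""
    return committed_welfare(mu1) / nash_welfare(n, mu1)


if __name__ == "__main__":
    for n in (2, 10, 1000):
        print(n, float(poa(n, 5)))
```

For a fixed μ_1, the printed values approach 1/2 + μ_1 as n grows, in line with the limit stated in the example.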

Examples 1 and 2 illustrate that, without transfers, the price of anarchy grows without bound as player B’s payoff becomes large relative to that of player A. We now generalize this idea to quantify the price of anarchy in one-way games. Notice that, for this argument, private information plays no role. For instance, in Example 1, even if player A’s type were public knowledge, this would not change her strategy. However, private information will have an important role when we explore side payments in Section 5.

Proposition 1.

In one-way games, the price of anarchy when no payments are allowed satisfies, for any type θ,

\frac{{max}_{s \in S} u_{B} (s, θ_{B})}{{max}_{s \in S} u_{A} (s, θ_{A}) + u_{B} (s^{N} (θ), θ_{B})} \leq P o A (θ) \leq 1 + \frac{{max}_{s \in S} u_{B} (s, θ_{B})}{{max}_{s \in S} u_{A} (s, θ_{A})}

Proof.

Let

{\bar{u}}_{i} (θ_{i}) = {max}_{s \in S} u_{i} (s, θ_{i})

,

i \in \{A, B\}

. The independence of player A implies that, for all

θ \in Θ

, her payoff is

u_{A} (s^{N} (θ), θ_{A}) = {\bar{u}}_{A} (θ_{A})

. It follows that:

\begin{aligned} \frac{\max \{{\bar{u}}_{A} (θ_{A}), {\bar{u}}_{B} (θ_{B})\}}{{\bar{u}}_{A} (θ_{A}) + u_{B} (s^{N} (θ), θ_{B})} & \leq P o A (θ) \leq \frac{{\bar{u}}_{A} (θ_{A}) + {\bar{u}}_{B} (θ_{B})}{{\bar{u}}_{A} (θ_{A}) + u_{B} (s^{N} (θ), θ_{B})} \\ & \leq \frac{{\bar{u}}_{A} (θ_{A}) + {\bar{u}}_{B} (θ_{B})}{{\bar{u}}_{A} (θ_{A})} = 1 + \frac{{\bar{u}}_{B} (θ_{B})}{{\bar{u}}_{A} (θ_{A})} \end{aligned}

☐
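As a sanity check, the bounds of Proposition 1 can be verified numerically on small finite games. The following sketch is ours and makes simplifying assumptions: the type profile is fixed, payoffs are stored in plain lists, player A's payoffs are strictly positive, and player B best-responds to A's payoff-maximizing action.

```python
import random


def poa_and_bounds(u_a, u_b):
    """Return (lower bound, PoA, upper bound) of Proposition 1 for one type profile.

    u_a[i] is A's payoff for action i; u_b[i][j] is B's payoff for profile (i, j).
    """
    best_a = max(u_a)
    i_nash = u_a.index(best_a)            # A's payoff-maximizing action
    u_b_nash = max(u_b[i_nash])           # B best-responds (assumption of this sketch)
    max_b = max(max(row) for row in u_b)  # B's best payoff over all profiles
    opt = max(u_a[i] + max(u_b[i]) for i in range(len(u_a)))
    nash_sw = best_a + u_b_nash
    return max_b / nash_sw, opt / nash_sw, 1 + max_b / best_a


def random_game(rng, n_a=4, n_b=3):
    """Random non-negative payoffs; A's payoffs are kept away from zero."""
    u_a = [rng.uniform(0.1, 1.0) for _ in range(n_a)]
    u_b = [[rng.uniform(0.0, 10.0) for _ in range(n_b)] for _ in range(n_a)]
    return u_a, u_b


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(poa_and_bounds(*random_game(rng)))
```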

The price of anarchy can thus be arbitrarily large. When it is large enough, Proposition 1 indicates that

{max}_{s \in S} u_{B} (s, θ_{B}) \geq {max}_{s \in S} u_{A} (s, θ_{A}) \geq u_{B} (s^{N} (θ), θ_{B})

. In this case, player B has bargaining power to incentivize player A monetarily, so that she moves from her equilibrium and cooperates to overcome bad social welfare. This paper explores this possibility by analyzing the social welfare when side payments are allowed.

3. Related Work

Before moving to the main results, it is useful to discuss related games. One-way games may seem to resemble Stackelberg games with their notions of leader and follower. The key difference, however, is that, in one-way games, the leader does not depend on the action taken by the follower. In addition, in one-way games, players do not have complete information, and moves are simultaneous. Jackson and Wilkie [5] studied one-way instances derived from their more general framework of endogenous games. However, they tackled the problem from a different perspective and assumed complete information (i.e., the player utilities are not private). Jackson and Wilkie gave a characterization of the outcome when players make binding offers of side payments, deriving the conditions under which a new outcome becomes a Nash equilibrium or remains one. They analyzed a subclass, called ‘one-sided externality’, which is essentially a one-way game, but with complete information. They showed that the efficient outcome is an equilibrium in this setting, supporting Coase’s claim that a polluter and his victim can reach an efficient outcome. Under perfect information, the victim can determine the minimal transfer necessary to support the efficient play. Naturally, this result does not hold under incomplete information [6]. In what follows, we design a bargaining mechanism that is able to cope with the incomplete information setting.

4. Bayesian–Nash Mechanisms

In this section, we consider a Bayesian Nash setting with quasi-linear preferences. Both players A and B have private utilities and beliefs about the utilities of the other players. By the revelation principle, we can restrict our attention to direct mechanisms that implement a social choice function. A social choice function in quasi-linear environments takes the form of

f (θ) = (k (θ), t (θ))

, where, for every

θ \in Θ

,

k (θ) \in S

is the allocation function and

t_{i} (θ) \in R

represents a monetary transfer to agent i. The main objective of mechanism design is to implement a social choice function that achieves near efficient allocations, while respecting some desirable properties. For completeness, we specify these key properties.

Definition 3.

A social choice function is ex post efficient if, for all

θ \in Θ

, we have:

k (θ) \in {arg-max}_{s \in S} \sum_{i} u_{i} (s, θ_{i})

.

Definition 4.

A social choice function is budget-balanced (BB) if, for all

θ \in Θ

, we have

\sum_{i} t_{i} (θ) = 0

.

In other words, there are no net transfers out of the system or into the system. Taken together, ex post efficiency and budget-balance imply Pareto optimality. An essential condition of any mechanism is to guarantee that agents report their true types. The following property captures this notion when agents have prior beliefs on the types of other agents.

Definition 5.

A social choice function is Bayes–Nash incentive compatible (IC) if for every player i:

E_{θ_{- i} | θ_{i}} [u_{i} (k (θ_{i}, θ_{- i}), θ_{i}) + t_{i} (θ_{i}, θ_{- i})] \geq E_{θ_{- i} | θ_{i}} [u_{i} (k ({\hat{θ}}_{i}, θ_{- i}), θ_{i}) + t_{i} ({\hat{θ}}_{i}, θ_{- i})]

where

θ_{i} \in Θ_{i}

is the type of player i,

{\hat{θ}}_{i}

is the type player i reports and

E_{θ_{- i} | θ_{i}}

denotes player i’s expectation over prior beliefs

θ_{- i}

of the types of other agents given her own type

θ_{i}

.

The most natural definition of individual rationality (IR) is interim IR, which states that every agent type has non-negative expected gains from participation.

Definition 6.

A social choice function is interim individually rational if, for all types

θ \in Θ

, it satisfies:

E_{θ_{- i} | θ_{i}} [u_{i} (k (θ), θ_{i}) + t_{i} (θ)] \geq {\bar{u}}_{i} (θ_{i})

where

{\bar{u}}_{i} (θ_{i})

is the expected utility for non-participation.

In the context of one-way games, both players have positive outside options that depend only on their types. In particular, the outside options are given by the Nash equilibrium outcome under no side payments. For players A and B, the expected utilities for non-participation are

{\bar{u}}_{A} (θ_{A}) = u_{A} (s_{A}^{N} (θ_{A}), θ_{A})

and

{\bar{u}}_{B} (θ_{B}) = u_{B} (s^{N} (θ), θ_{B})

, respectively.

4.1. Impossibility Result

This section shows that there exists no mechanism for one-way games that is efficient and satisfies the traditional desirable properties. The result is derived from the Myerson–Satterthwaite [6] theorem, a seminal impossibility result in mechanism design. The Myerson–Satterthwaite theorem considers a bargaining game with two-sided private information, and it states that, for a bilateral trade setting, there exists no Bayes–Nash incentive-compatible mechanism that is budget balanced, ex post efficient and that gives every agent type non-negative expected gains from participation (i.e., ex interim individual rationality).

Our contribution is two-fold: we present an impossibility result for one-way games, and we relate them with bargaining games, an idea that we will further explore in the following sections. We now formalize the impossibility result for one-way games.

Consider the Myerson–Satterthwaite bilateral bargaining setting.

Definition 7.

Myerson–Satterthwaite bargaining game:

A seller (Player 1) owns an object for which her valuation is $v_{1} \in V_{1}$, and a buyer (Player 2) wants to buy the object at a valuation $v_{2} \in V_{2}$.
Each player i knows her valuation $v_{i}$ at the time of the bargaining, and Player 1 (respectively, 2) has a probability density function $f_{2} (v_{2})$ (respectively, $f_{1} (v_{1})$) for the other player’s valuation.
Both distributions are assumed to be continuous and positive on their domain, and the intersection of the domains is not empty.

By the revelation principle, we can restrict our attention to incentive-compatible direct mechanisms. A direct mechanism for bargaining games is characterized by two functions: (1) a probability distribution

σ : V_{1} \times V_{2} \to [0, 1]

that specifies the probability that the object is transferred from the seller to the buyer; and (2) a monetary transfer scheme

p : V_{1} \times V_{2} \to R^{2}

. In this setting, ex post efficiency is achieved if

σ (v_{1}, v_{2}) = 1

when

v_{1} < v_{2}

, and zero otherwise.

Our result consists of showing that a mechanism

M^{'}

for the Myerson–Satterthwaite setting can be constructed using a mechanism

M

for a one-way game in such a way that, if

M

is efficient, individually rational (IR), incentive compatible (IC) and budget-balanced (BB), then

M^{'}

is efficient, IR, IC and BB. The Myerson–Satterthwaite impossibility theorem states that such a mechanism

M^{'}

cannot exist, which implies the following impossibility result for one-way games.

Theorem 1.

There is no ex post efficient, individually rational, incentive-compatible and budget-balanced mechanism for one-way games.

Proof.

For any bargaining setting, consider the following transformation into a one-way game instance:

S_{A} = \{s_{A}^{1}, s_{A}^{2}\}, S_{B} = \{s_{B}\}

\forall v_{1} \in V_{1} : u_{A} (s_{A}^{1}, v_{1}) = v_{1}, u_{A} (s_{A}^{2}, v_{1}) = 0

\forall v_{2} \in V_{2} : u_{B} ((s_{A}^{1}, s_{B}), v_{2}) = 0, u_{B} ((s_{A}^{2}, s_{B}), v_{2}) = v_{2}

where player types

(v_{1}, v_{2}) \in V_{1} \times V_{2}

are drawn from distribution

f_{1} \times f_{2}

. Two possible outcomes may occur,

(s_{A}^{1}, s_{B})

or

(s_{A}^{2}, s_{B})

, with social welfare

v_{1}

and

v_{2}

, respectively.

Let us assume

M = (k, t)

is a direct mechanism for one-way games and that

M

is ex post efficient, IR, IC and BB. We now construct a mechanism

M^{'} = (σ, p)

, where

σ (v_{1}, v_{2})

is the probability that the object is transferred from the seller to the buyer and

p (v_{1}, v_{2})

is the payment of each player. We define

M^{'}

such that:

σ (v_{1}, v_{2}) = \begin{cases} 0 & \text{if } k (v_{1}, v_{2}) = (s_{A}^{1}, s_{B}) \\ 1 & \text{if } k (v_{1}, v_{2}) = (s_{A}^{2}, s_{B}) \end{cases}

and:

p (v_{1}, v_{2}) = t (v_{1}, v_{2})

It remains to show that

M^{'}

satisfies all of the desired properties. An ex post efficient mechanism

M

in the one-way instance satisfies:

k (v_{1}, v_{2}) = \begin{cases} (s_{A}^{1}, s_{B}) & \text{if } v_{1} \geq v_{2} \\ (s_{A}^{2}, s_{B}) & \text{if } v_{1} < v_{2} \end{cases}

Therefore,

σ (v_{1}, v_{2})

will assign the object to the buyer iff

v_{1} < v_{2}

. That is, the player with the highest valuation will always get the object, meeting the restriction of ex post efficiency. The budget-balanced constraint in

M

implies that

p_{1} (v_{1}, v_{2}) + p_{2} (v_{1}, v_{2}) = 0

for all possible valuations, so

M^{'}

is budget-balanced.

The individual rationality property for

M^{'}

comes from noticing that the default strategy of player A when no payments are allowed is

s_{A}^{1}

, and the corresponding payoff is

v_{1}

. Therefore, the seller utility is guaranteed to be at least her valuation

v_{1}

. Analogously, the buyer will not have a negative utility given that

u_{B} ((s_{A}^{1}, s_{B}), v_{2}) = 0

.

Incentive compatibility follows directly from the definition. If

M^{'}

were not incentive compatible, then at least one player could benefit from reporting a false type in mechanism

M

, contradicting the assumption that it is incentive compatible.

Such a mechanism

M^{'}

cannot exist, since it contradicts the Myerson–Satterthwaite impossibility result, which concludes our proof. ☐
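The reduction used in the proof can be made concrete with a short sketch. The code below is ours and purely illustrative: the labels `"s_A1"` and `"s_A2"` stand for the two actions of player A, player B's single action is left implicit, and only the allocation part of the mechanism is modeled (transfers are copied unchanged in the proof).

```python
def one_way_instance(v1: float, v2: float):
    """Payoffs of the one-way game built from seller value v1 and buyer value v2."""
    u_a = {"s_A1": v1, "s_A2": 0.0}   # A keeps the object (s_A1) or gives it up (s_A2)
    u_b = {"s_A1": 0.0, "s_A2": v2}   # B only benefits when A plays s_A2
    return u_a, u_b


def efficient_allocation(v1: float, v2: float) -> str:
    """Ex post efficient k(v1, v2): the profile with the larger welfare (ties go to s_A1)."""
    u_a, u_b = one_way_instance(v1, v2)
    return max(("s_A1", "s_A2"), key=lambda s: u_a[s] + u_b[s])


def sigma(v1: float, v2: float) -> int:
    """Induced trade rule of M': the object moves to the buyer iff k selects s_A2."""
    return 1 if efficient_allocation(v1, v2) == "s_A2" else 0


if __name__ == "__main__":
    print(sigma(3.0, 5.0), sigma(5.0, 3.0))
```

Trade occurs exactly when v_1 < v_2, which is the ex post efficiency condition of the bargaining setting.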

An immediate consequence of this result is that Bayesian Nash mechanisms can achieve at most two of the three properties: ex post efficiency, individual rationality and budget balance. For instance, the well-known Vickrey–Clarke–Groves (VCG) mechanism and the d’Aspremont, Gérard-Varet [7] and Arrow [8] (dAGVA) mechanism are part of the Groves family of mechanisms that truthfully implement social choice functions that are ex post efficient. VCG has no guarantee of budget balance, while dAGVA is not guaranteed to meet the individual rationality constraints. We refer the reader to Williams [9] and Krishna and Perry [10] for alternative derivations of the impossibility result for bilateral trading under the Groves family of mechanisms.

5. Single-Offer Mechanism

In this section, we propose a simple bargaining mechanism for player B to increase her payoff. The literature about bargaining games is extensive, and we refer readers to a broad review by Kennan and Wilson [11].

Given the nature of our applications, individual rationality imposes a necessary constraint. Otherwise, player A can always decline to participate in the mechanism and achieve her maximal payoff independently of the type of player B. Additionally, we search for Bayesian Nash mechanisms without subsidies, i.e., budget-balanced mechanisms. The absence of subsidies gives rise to a decentralized mechanism that does not require a third agent to perform the computations needed by the mechanism. However, a third party is needed to ensure compliance with the agreement reached by both players.

An interesting starting point for one-way games is the recognition that, whenever player B has a better payoff than A, player A may let player B play her optimal strategy in exchange for money. The resulting outcome can be viewed as swapping the roles of both players, i.e., player B chooses her optimal strategy and A plays her best response to B’s strategy. In this case, as in Proposition 1, the worst outcome would be:

1 + \frac{{max}_{s \in S} u_{A} (s, θ_{A})}{{max}_{s \in S} u_{B} (s, θ_{B})}

This observation together with Proposition 1 leads to the following lemma.

Lemma 1.

Consider the social choice function that selects the best action that maximizes the payoff of either player A or player B, i.e., the strategy:

s^{'} (θ) = \underset{s \in S}{arg-max} (max (u_{A} (s, θ_{A}), u_{B} (s, θ_{B})))

In the one-way game, the price of anarchy of implementing strategy

s^{'} (θ)

is at most two (i.e.,

\forall θ : P o A (θ) \leq 2

).

The intuition behind this result is that the optimal social welfare is upper bounded by twice the utility of the action that maximizes the payoff of either player, whereas in the worst case scenario, the payoff of one of the players may be zero.
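The factor-two argument can be checked numerically on random finite games. The sketch below is ours: it fixes a type profile, stores payoffs in lists, draws strictly positive payoffs and compares the optimal welfare with the welfare of the profile s' selected by the social choice function of Lemma 1.

```python
import random


def ratio_for_s_prime(u_a, u_b):
    """Optimal welfare divided by the welfare of s' for one fixed type profile.

    u_a[i] is A's payoff for action i; u_b[i][j] is B's payoff for profile (i, j).
    """
    profiles = [(i, j) for i in range(len(u_a)) for j in range(len(u_b[0]))]

    def welfare(s):
        return u_a[s[0]] + u_b[s[0]][s[1]]

    # s' maximizes the larger of the two individual payoffs.
    s_prime = max(profiles, key=lambda s: max(u_a[s[0]], u_b[s[0]][s[1]]))
    return max(welfare(s) for s in profiles) / welfare(s_prime)


if __name__ == "__main__":
    rng = random.Random(0)
    worst = 0.0
    for _ in range(1000):
        u_a = [rng.uniform(0.01, 1.0) for _ in range(3)]
        u_b = [[rng.uniform(0.01, 1.0) for _ in range(3)] for _ in range(3)]
        worst = max(worst, ratio_for_s_prime(u_a, u_b))
    print(worst)
```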

Unfortunately, this social choice function cannot be implemented in dominant strategies without violating individual rationality. Player A may have a smaller payoff by following strategy

s^{'}

instead of the Nash equilibrium strategy

s^{N}

. Indeed, when

S W (s^{'}, θ) < S W (s^{N}, θ)

, for any budget-balanced transfer function, at least one of the players must be worse off than under the Nash equilibrium strategy

s^{N}

. Lemma 1, however, gives us hope for designing a budget-balanced mechanism that has a constant price of anarchy. Indeed, a simple and distributed implementation would ask player B to propose an action to be implemented, and player A would receive a monetary compensation for deviating from her maximal strategy.

We now present such a distributed implementation based on a bargaining mechanism. The mechanism is inspired by the model of two-person bargaining under incomplete information presented by Chatterjee and Samuelson [12]. In their model, both the seller and the buyer submit sealed offers, and a trade occurs if there is a gap in the bids. The price is then set to be a convex combination of the bids. Our single-offer mechanism adapts this idea to one-way games. In particular, to counteract player A’s advantage, player B makes the first and final offer. Moreover, the structure of our mechanism makes it possible to quantify the price of anarchy and to provide a quality guarantee on the mechanism outcome. Our single-offer mechanism is defined as follows:

Player B selects an action $s_{A} \in S_{A}$ to propose to player A.
Player B also computes her outside option $s_{B}^{O} (s_{A}, θ_{B})$ in case player A rejects action $s_{A}$; we denote by $u_{B}^{O} (s_{A}, θ_{B})$ the expected payoff from this outside option.
Player B offers player A a monetary value of $γ \cdot δ_{B} (s_{A}, θ_{B})$, with $δ_{B} (s_{A}, θ_{B}) = u_{B} (s_{B} (s_{A}, θ_{B}), θ_{B}) - u_{B}^{O} (s_{A}, θ_{B})$ and $γ \in R_{[0, 1]}$, in the hope that she agrees to play strategy $s_{A}$ instead of strategy $s_{A}^{N}$.
Player A decides whether to accept the offer.
If player A accepts the offer, the outcome of the game is $(s_{A}, s_{B} (s_{A}, θ_{B}))$. Otherwise, the outcome of the game is the outside option $(s_{A}^{N} (θ_{A}), s_{B}^{O} (s_{A}, θ_{B}))$.

It is worth observing that a broker is required in this mechanism to ensure that the outcome

(s_{A}^{N} (θ_{A}), s_{B}^{O} (s_{A}, θ_{B}))

is implemented if player A rejects the unique offer, and no counteroffers are made. A key feature of the single-offer mechanism is that it requires a minimum amount of information from player A (i.e., whether she accepts or rejects the offer).

To derive the equilibrium strategy for the single-offer mechanism, we assume that players are expected utility maximizers. The parameter

γ \in R_{[0, 1]}

has been chosen so that player B, satisfying individual rationality, never offers more than

δ_{B} (s_{A}, θ_{B})

, and her payoff is never worse than her expected outside option

u_{B}^{O} (s_{A}, θ_{B})

. Whereas the mechanism can only guarantee interim individual rationality for player B, it provides ex post individual rationality for player A, as shown in the following proposition.

Proposition 2.

If players A and B play the single-offer mechanism, for any

(θ_{A}, θ_{B}) \in Θ

, player A accepts the offer

(γ, s_{A})

whenever:

u_{A} (s_{A}, θ_{A}) + γ \cdot δ_{B} (s_{A}, θ_{B}) \geq u_{A} (s_{A}^{N} (θ_{A}), θ_{A})
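One round of the single-offer mechanism, together with the acceptance rule of Proposition 2, can be sketched as follows. The function and argument names are ours; the sketch assumes realized types, takes B's payoffs as precomputed numbers, and reports B's expected outside-option payoff when the offer is rejected.

```python
def single_offer_round(u_a, s_offer, gamma, u_b_accept, u_b_outside):
    """Play one round of the single-offer mechanism for realized types.

    u_a         : dict mapping each action of A to her payoff u_A(s_A, theta_A)
    s_offer     : action s_A proposed by B
    gamma       : share in [0, 1] of B's surplus offered to A
    u_b_accept  : u_B(s_B(s_A, theta_B), theta_B), B's payoff if A plays s_offer
    u_b_outside : u_B^O(s_A, theta_B), B's expected outside-option payoff
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    delta_b = u_b_accept - u_b_outside   # B's surplus from acceptance
    transfer = gamma * delta_b           # monetary offer made to A
    s_nash = max(u_a, key=u_a.get)       # A's payoff-maximizing action
    if u_a[s_offer] + transfer >= u_a[s_nash]:   # acceptance rule of Proposition 2
        return {"accepted": True, "action_a": s_offer,
                "payoff_a": u_a[s_offer] + transfer,
                "payoff_b": u_b_accept - transfer}
    return {"accepted": False, "action_a": s_nash,
            "payoff_a": u_a[s_nash], "payoff_b": u_b_outside}


if __name__ == "__main__":
    # Numbers in the spirit of Example 1: theta_A = 70, x = 80, outside payoff 0.
    print(single_offer_round({"s1": 70.0, "s2": 100.0}, "s1", 0.5, 80.0, 0.0))
```

Because the transfer paid by B equals the transfer received by A, the sketch is budget-balanced by construction, and A never ends up below her payoff under s_A^N.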

In case player A rejects the offer

(γ, s_{A})

, she will choose her utility maximizing action

s_{A}^{N} (θ_{A})

as her outside option. Note that by Proposition 2, if

s_{A} = s_{A}^{N} (θ_{A})

, player A would never reject the proposed action

s_{A}

. Accordingly, if proposed action

s_{A}

is rejected, then

s_{A} \neq s_{A}^{N} (θ_{A})

. This observation leads to the following proposition.

Proposition 3.

For every action

s \in S_{A}

, let

Θ_{A}^{s} = Θ_{A} \setminus \{θ_{A} \in Θ_{A} : s = {arg-max}_{x \in S_{A}} u_{A} (x, θ_{A})\}

. In the single-offer mechanism, player B will pick outside option

s_{B}^{O} (s_{A}, θ_{B})

such that her expected payoff is maximized, i.e.,

s_{B}^{O} (s_{A}, θ_{B}) = \underset{s_{B} \in S_{B}}{arg-max} E_{θ_{A}^{s_{A}}} [u_{B} ((s_{A}^{N} (θ_{A}), s_{B}), θ_{B})]
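For a finite type space, the outside option of Proposition 3 can be computed by discarding the types of player A for which the proposed action is already her best action and renormalizing the remaining probabilities. The sketch below is ours and relies on assumptions not made in the text: types and actions are finite, player B's type is fixed, and payoffs are supplied as plain Python functions.

```python
def outside_option(s_offer, actions_a, actions_b, types_a, u_a, u_b):
    """Return (s_B^O, expected payoff) for player B after s_offer is rejected.

    types_a : list of (theta_A, probability) pairs
    u_a     : function (s_A, theta_A) -> payoff of A
    u_b     : function (s_A, s_B) -> payoff of B for her fixed type
    """
    # A's payoff-maximizing action for every type.
    nash = {theta: max(actions_a, key=lambda s: u_a(s, theta)) for theta, _ in types_a}
    # Keep only the types whose best action differs from the proposed one.
    kept = [(theta, p) for theta, p in types_a if nash[theta] != s_offer]
    total = sum(p for _, p in kept)
    if total == 0:
        raise ValueError("s_offer is A's best action for every type")

    def expected(s_b):
        return sum(p * u_b(nash[theta], s_b) for theta, p in kept) / total

    best = max(actions_b, key=expected)
    return best, expected(best)


if __name__ == "__main__":
    acts_a, acts_b = ["a1", "a2", "a3"], ["b1", "b2", "b3"]
    value = {("a1", "b1"): 10.0, ("a2", "b2"): 4.0, ("a3", "b3"): 3.0}
    print(outside_option(
        "a1", acts_a, acts_b, [(0, 0.25), (1, 0.25), (2, 0.5)],
        lambda s, theta: 1.0 if s == acts_a[theta] else 0.0,
        lambda s_a, s_b: value.get((s_a, s_b), 0.0),
    ))
```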

Example 1.

(continued) The payoff of player B is higher if action

s_{A}^{1}

is played by player A. Hence, player B has incentives to submit an offer c that triggers action

s_{A}^{1}

. Player A of type

θ_{A}

accepts the offer if

c + u_{A} (s_{A}^{1}, θ_{A}) \geq u_{A} (s_{A}^{2}, θ_{A}) = 100

. Given that

u_{A} (s_{A}^{1}, θ_{A})

follows a uniform distribution, the probability that player A accepts the offer is

\frac{c}{100}

if

c \leq 100

and one otherwise. For player B, this offer has an expected payoff of

\frac{c}{100} \cdot (x - c)

if

c \leq 100

and

x - c

otherwise. The optimal value for the offer is given by

c^{*} = \frac{x}{2}

if

x \leq 200

and

c^{*} = 100

if

x > 200

. This leads to an expected social welfare for the single-offer mechanism of:

SW = \begin{cases} 100 + x (\frac{x}{200} - \frac{1}{4}) & \text{if } x \leq 200 \\ 50 + x & \text{if } x > 200 \end{cases}

Recall that the optimal social welfare is

50 + x

if

x \geq 50

and 100 otherwise. Therefore, the mechanism has a price of anarchy,

PoA = \begin{cases} \frac{100}{100 + x (\frac{x}{200} - \frac{1}{4})} & \text{if } x \leq 50 \\ \frac{50 + x}{100 + x (\frac{x}{200} - \frac{1}{4})} & \text{if } 50 \leq x \leq 200 \\ \frac{50 + x}{50 + x} = 1 & \text{if } 200 \leq x \end{cases}

The PoA is bounded by a constant, and in fact,

PoA \leq 1.21

for any x. This contrasts with the unbounded PoA obtained when no side payments are allowed.
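The bound for Example 1 can be checked numerically. The following is a minimal sketch that encodes the closed-form expressions stated above and scans a grid of values of x; the function names, the grid range (0 to 1000) and the grid resolution are choices made here for illustration and are not part of the original analysis.

```python
def optimal_offer(x):
    """Player B's optimal offer c* in Example 1 (c* = x/2, capped at 100)."""
    return x / 2 if x <= 200 else 100.0


def mechanism_welfare(x):
    """Expected social welfare of the single-offer mechanism, as stated in the text."""
    if x <= 200:
        return 100 + x * (x / 200 - 1 / 4)
    return 50 + x


def optimal_welfare(x):
    """Optimal social welfare: 50 + x if x >= 50, and 100 otherwise."""
    return 50 + x if x >= 50 else 100.0


def price_of_anarchy(x):
    return optimal_welfare(x) / mechanism_welfare(x)


def best_offer_on_grid(x, steps=10_000):
    """Grid search over c in [0, 100] for B's expected payoff (c/100)(x - c)."""
    grid = (100 * k / steps for k in range(steps + 1))
    return max(grid, key=lambda c: (c / 100) * (x - c))


# Scan x from 0 to 1000 in steps of 0.01 and record the worst ratio.
xs = [k / 100 for k in range(100_001)]
worst_x = max(xs, key=price_of_anarchy)
worst_poa = price_of_anarchy(worst_x)

print(f"worst x = {worst_x:.2f}, PoA = {worst_poa:.4f}")
print(f"grid-optimal offer for x = 120: {best_offer_on_grid(120):.2f}")
```

On this grid the worst case should sit near x ≈ 108 with a ratio of roughly 1.203, which is consistent with the stated bound of 1.21. The grid search over c is only a sanity check of the closed form for the optimal offer when x is at most 200.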

Example 2.

(continued) The payoff of player B is higher if action

s_{A}^{1}

is played by player A. Hence, player B has an incentive to submit a monetary offer

c \leq 1

that triggers action

s_{A}^{1}

. Player A of type

θ_{A}

accepts the offer if

c + u_{A} (s_{A}^{1}, θ_{A}) \geq {max}_{s_{A} \in S_{A}} u_{A} (s_{A}, θ_{A})

. It can be shown that the probability that player A accepts the offer is

\frac{c n - c^{n}}{n - 1}

. In case of acceptance, the expected payoff is

μ_{1} - c

for player B and

\frac{1}{2} + c

for player A. In case of rejection, it is guaranteed that player A will not play

s_{A}^{1}

, and hence, player B’s outside option is action

s_{B} (s_{A}^{2}, θ_{B})

with an expected payoff of

\frac{μ_{2}}{n - 1}

. Player A’s expected outside option is

\frac{n}{n + 1}

, corresponding to her expected maximum payoff derived in Example 2. As a result, by offering c, player B has an expected payoff of:

\frac{c n - c^{n}}{n - 1} (μ_{1} - c) + (1 - \frac{c n - c^{n}}{n - 1}) \frac{μ_{2}}{n - 1}

When n is large, the probability of acceptance is approximately:

\lim_{n \to \infty} \frac{c n - c^{n}}{n - 1} = c

Accordingly, in case of rejection, the expected payoffs become

\lim_{n \to \infty} \frac{μ_{2}}{n - 1} = 0

for player B and

\lim_{n \to \infty} \frac{n}{n + 1} = 1

for player A. Player B’s expected payoff is thus

c (μ_{1} - c)

and is maximized when she offers

c^{*} = \frac{μ_{1}}{2}

if

μ_{1} \leq 2

and

c^{*} = 1

otherwise. This leads to an expected social welfare for the single-offer mechanism of

SW = c^{*} (μ_{1} + \frac{1}{2}) + (1 - c^{*}) (0 + 1)

. Recall that the optimal social welfare is

\frac{1}{2} + μ_{1}

if

μ_{1} \geq \frac{1}{2}

and one otherwise. Therefore, the mechanism has the following price of anarchy,

PoA = \begin{cases} \frac{1}{\frac{μ_{1}}{2} (μ_{1} + \frac{1}{2}) + (1 - \frac{μ_{1}}{2})} & \text{if } μ_{1} \leq \frac{1}{2} \\ \frac{\frac{1}{2} + μ_{1}}{\frac{μ_{1}}{2} (μ_{1} + \frac{1}{2}) + (1 - \frac{μ_{1}}{2})} & \text{if } \frac{1}{2} \leq μ_{1} \leq 2 \\ \frac{\frac{1}{2} + μ_{1}}{\frac{1}{2} + μ_{1}} = 1 & \text{if } 2 \leq μ_{1} \end{cases}

and the PoA has a maximum value of

\frac{4}{31} (3 + 2 \sqrt{10}) \approx 1.203

. This contrasts with the unbounded PoA obtained by the Nash equilibrium when no side payments are allowed.
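The stated maximum for Example 2 can be recovered by hand on the middle branch; the first branch peaks at about 1.03 and the third equals one. The following is a sketch of that calculation, where g is a name introduced here for the denominator and is not notation from the original text.

```latex
\begin{aligned}
g(μ_{1}) &= \frac{μ_{1}}{2} (μ_{1} + \frac{1}{2}) + (1 - \frac{μ_{1}}{2}) = \frac{μ_{1}^{2}}{2} - \frac{μ_{1}}{4} + 1, \qquad g'(μ_{1}) = μ_{1} - \frac{1}{4} \\
\frac{d}{d μ_{1}} \frac{μ_{1} + \frac{1}{2}}{g(μ_{1})} = 0
  &\iff g(μ_{1}) = (μ_{1} + \frac{1}{2}) (μ_{1} - \frac{1}{4})
  \iff 4 μ_{1}^{2} + 4 μ_{1} - 9 = 0 \\
μ_{1} &= \frac{\sqrt{10} - 1}{2} \approx 1.081 \in [\frac{1}{2}, 2] \\
PoA &= \frac{μ_{1} + \frac{1}{2}}{g(μ_{1})} = \frac{1}{g'(μ_{1})} = \frac{4}{2 \sqrt{10} - 3} = \frac{4}{31} (3 + 2 \sqrt{10}) \approx 1.203
\end{aligned}
```

The step from the ratio to 1 / g'(μ_{1}) uses the first-order condition itself: at a stationary point of a ratio, the ratio equals the ratio of the derivatives.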

We now generalize the analysis done in Examples 1 and 2. We proceed by studying the utility-maximizing strategy

(s_{A}, γ)

for player B and then derive the expected social welfare of the outcome for the single-offer mechanism. Note that, in case of agreement, the action of player B of type

θ_{B}

is solely defined by

s_{A}

, as she has no incentive to deviate from her best response

s_{B} (s_{A}, θ_{B})

. By Proposition 2, player A accepts an offer whenever

δ_{A} (s_{A}, θ_{A}) \leq γ \cdot δ_{B} (s_{A}, θ_{B})

, where

δ_{A} (s_{A}, θ_{A}) = u_{A} (s_{A}^{N} (θ_{A}), θ_{A}) - u_{A} (s_{A}, θ_{A})

. Player B obviously aims at choosing γ and

s_{A}

to maximize her payoff, and we now study this optimization problem. In the case of an agreement, player B is left with a profit of:

u_{B} (s_{B} (s_{A}, θ_{B}), θ_{B}) - γ \cdot δ_{B} (s_{A}, θ_{B})

Otherwise, player B gets an expected payoff of

u_{B}^{O} (s_{A}, θ_{B})

.

Definition 8.

The probability that player A accepts the offer

(s_{A}, γ)

, given that player B has type

θ_{B} \in Θ_{B}

, is:

P (s_{A}, γ, θ_{B}) = Pr [γ \cdot δ_{B} (s_{A}, θ_{B}) \geq δ_{A} (s_{A}, θ_{A})] = \int_{θ_{A} \in Θ_{A}} f_{A} (θ_{A}) \cdot 1 (s_{A}, γ \cdot δ_{B} (s_{A}, θ_{B}), θ_{A}) d θ_{A}

with:

1 (s_{A}, x, θ_{A}) = \begin{cases} 1 & \text{if } x \geq δ_{A} (s_{A}, θ_{A}) \\ 0 & \text{otherwise} \end{cases}

The expected profit of players A and B for proposed action

s = (s_{A}, s_{B})

and γ when player B has type

θ_{B}

is given by:

\begin{aligned} E_{θ_{A}} [U_{B} (s_{A}, γ, θ_{B})] &= u_{B}^{O} (s_{A}, θ_{B}) + P (s_{A}, γ, θ_{B}) \cdot (1 - γ) \cdot δ_{B} (s_{A}, θ_{B}) \\ E_{θ_{A}} [U_{A} (s_{A}, γ, θ_{B})] &= E_{θ_{A}} [u_{A}^{N} (θ_{A})] + P (s_{A}, γ, θ_{B}) \cdot (γ \cdot δ_{B} (s_{A}, θ_{B}) - E_{θ_{A}} [δ_{A} (s_{A}, θ_{A})]) \end{aligned}

The optimal strategy of player B is specified in the following lemma.

Lemma 2.

In the single-offer mechanism, player B chooses

s_{A}^{*} (θ_{B})

and

γ^{*} (s_{A}^{*}, θ_{B})

such that:

s_{A}^{*} (θ_{B}) = \underset{s_{A} \in S_{A}}{arg-max} E_{θ_{A}} [U_{B} (s_{A}, γ^{*}, θ_{B})]

where:

γ^{*} (s_{A}, θ_{B}) = \underset{γ \in R_{[0, 1]}}{arg-max} P (s_{A}, γ, θ_{B}) \cdot (1 - γ)
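For a finite type space, the acceptance probability of Definition 8 and the inner maximization of Lemma 2 reduce to a few lines of code. The sketch below is an illustration under assumptions made here: player A has finitely many types, each described by its loss δ_A and a probability, δ_B is strictly positive, and all function names and numbers are hypothetical.

```python
def acceptance_probability(gamma, delta_b, delta_a_by_type, prob_by_type):
    """P(s_A, γ, θ_B): probability mass of types θ_A with δ_A <= γ · δ_B.

    The comparison is done on the ratio δ_A / δ_B so that the thresholds
    computed in optimal_gamma are matched exactly in floating point.
    """
    return sum(p for d, p in zip(delta_a_by_type, prob_by_type) if d / delta_b <= gamma)


def optimal_gamma(delta_b, delta_a_by_type, prob_by_type):
    """γ* = argmax over γ in [0, 1] of P(s_A, γ, θ_B) · (1 - γ)."""
    # P is a step function of γ and (1 - γ) is decreasing, so only γ = 0 and
    # the jump points δ_A / δ_B that fall inside [0, 1] can be maximisers.
    candidates = {0.0} | {d / delta_b for d in delta_a_by_type if 0 <= d <= delta_b}

    def objective(gamma):
        return acceptance_probability(gamma, delta_b, delta_a_by_type, prob_by_type) * (1 - gamma)

    return max(candidates, key=objective)


def expected_utility_b(gamma, delta_b, outside_b, delta_a_by_type, prob_by_type):
    """E[U_B] = u_B^O + P(s_A, γ, θ_B) · (1 - γ) · δ_B."""
    p = acceptance_probability(gamma, delta_b, delta_a_by_type, prob_by_type)
    return outside_b + p * (1 - gamma) * delta_b


# Hypothetical instance: three types of player A and δ_B = 100.
delta_a = [10.0, 30.0, 80.0]
probs = [0.5, 0.3, 0.2]
gamma_star = optimal_gamma(100.0, delta_a, probs)
print(gamma_star, expected_utility_b(gamma_star, 100.0, 20.0, delta_a, probs))
```

In this instance the candidate ratios are 0, 0.1, 0.3 and 0.8, and the product P · (1 - γ) is largest at γ = 0.3. The outer maximization over the proposed action in Lemma 2 would repeat this computation for each action and keep the one with the highest expected utility.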

5.1. Price of Anarchy

We now analyze the quality of the outcomes in the single-offer mechanism. The first step is the derivation of a lower bound for the expected social welfare of the single-offer mechanism. Inspired by Lemma 1, instead of considering all pairs

〈 s_{A}, γ 〉

, the analysis restricts attention to a single action

s_{A}^{'} = {arg-max}_{s_{A} \in S_{A}} u_{B} (s_{B} (s_{A}, θ_{B}), θ_{B})

. We prove that, when offering to player A action

s_{A}^{'}

and its associated optimal value for γ, the expected social welfare is no higher than that obtained with the optimal pair

〈 s_{A}^{*}, γ^{*} 〉

. As a result, we obtain an upper bound to the price of anarchy of the single-offer mechanism.

To make the discussion precise, consider the strategy where player B offers

〈 s_{A}^{'}, γ^{*} (s_{A}^{'}, θ_{B}) 〉

, with

γ^{*} (s_{A}^{'}, θ_{B})

being the optimal choice of γ given

s_{A}^{'}

, following the notation used in Lemma 2.

Lemma 3.

For any type

θ_{B} \in Θ_{B}

of player B, the expected social welfare achieved by the single-offer mechanism is at least the expected social welfare achieved by the strategy

〈 s_{A}^{'}, γ^{*} (s_{A}^{'}, θ_{B}) 〉

.

Proof.

Let

γ^{*} = γ^{*} (s_{A}^{*}, θ_{B})

and

γ^{'} = γ^{*} (s_{A}^{'}, θ_{B})

. The optimality condition of

s^{*}

implies that:

E_{θ_{A}} [U_{B} (s^{'}, γ^{'}, θ_{B})] \leq E_{θ_{A}} [U_{B} (s^{*}, γ^{*}, θ_{B})]

(1)

Two cases can occur. The first case is:

P (s_{A}^{'}, γ^{'}, θ_{B}) \leq P (s_{A}^{*}, γ^{*}, θ_{B})

i.e., the probability of player A accepting offer

(s_{A}^{*}, γ^{*})

is greater than if offered

(s_{A}^{'}, γ^{'})

. Then, it must be that the expected payoff of player A is greater when offered

(s_{A}^{*}, γ^{*})

, i.e.,

E_{θ_{A}} [U_{A} (s_{A}^{'}, γ^{'}, θ_{B})] \leq E_{θ_{A}} [U_{A} (s_{A}^{*}, γ^{*}, θ_{B})]

This, together with Inequality (1), results in the single-offer mechanism having a greater expected social welfare.

The second case is:

P (s_{A}^{'}, γ^{'}, θ_{B}) > P (s_{A}^{*}, γ^{*}, θ_{B})

Consider

γ^{″}

such that

P (s_{A}^{'}, γ^{″}, θ_{B}) = P (s_{A}^{*}, γ^{*}, θ_{B})

. The fact that the probabilities of acceptance are the same implies that the expected payoff of player A is the same in both cases, i.e.,

E_{θ_{A}} [U_{A} (s_{A}^{'}, γ^{″}, θ_{B})] = E_{θ_{A}} [U_{A} (s_{A}^{*}, γ^{*}, θ_{B})]

. This, together with Inequality (1), yields:

E_{θ_{A}} [S W (s^{*}, γ^{*}, θ_{B})] \geq E_{θ_{A}} [S W (s^{'}, γ^{″}, θ_{B})]

This is equivalent to:

u_{B} (s^{*}, θ_{B}) + E_{θ_{A}} [u_{A} (s_{A}^{*}, θ_{A})] \geq u_{B} (s^{'}, θ_{B}) + E_{θ_{A}} [u_{A} (s_{A}^{'}, θ_{A})]

(2)

Similarly, consider

γ^{* *}

such that:

P (s_{A}^{'}, γ^{'}, θ_{B}) = P (s_{A}^{*}, γ^{* *}, θ_{B})

which implies:

E_{θ_{A}} [U_{A} (s_{A}^{'}, γ^{'}, θ_{B})] = E_{θ_{A}} [U_{A} (s_{A}^{*}, γ^{* *}, θ_{B})]

The existence of

γ^{* *}

is guaranteed by Inequality (2), which states that there is more money in expectation to transfer to player A when choosing

s^{*}

over

s^{'}

. The fact that the acceptance probabilities are the same, together with Inequality (2), implies that:

E_{θ_{A}} [S W (s^{*}, γ^{* *}, θ_{B})] \geq E_{θ_{A}} [S W (s^{'}, γ^{'}, θ_{B})]

Given that the expected payoff of player A is the same in both cases, it must be the case that the expected payoff of player B is higher when using

(s_{A}^{*}, γ^{* *})

.

Therefore, we have found an offer for the single-offer mechanism with greater expected social welfare and a greater payoff for player B compared to strategy

〈 s_{A}^{'}, γ^{'} 〉

. ☐

We are ready to derive an upper bound for the induced price of anarchy for the single-offer mechanism. We first derive the price of anarchy of strategy

〈 s_{A}^{'}, γ^{'} 〉

in the case of agreement and disagreement of player A.

Lemma 4.

Consider action

s^{'} = \arg\max_{s \in S} u_{B} (s, θ_{B})

, and let

PoA^{A} (γ)

and

PoA^{R} (γ)

denote the induced price of anarchy if player A accepts and rejects the offer, respectively, given a proposed γ. Then,

PoA^{A} (γ) = 1 + γ \quad \text{and} \quad PoA^{R} (γ) = 1 + \frac{1}{γ}

Proof.

Let

s_{A}^{N} = s_{A}^{N} (θ_{A})

,

u_{A}^{N} = u_{A} (s_{A}^{N}, θ_{A})

,

u_{A}^{'} = u_{A} (s^{'}, θ_{A})

,

u_{B}^{'} = u_{B} (s^{'}, θ_{B})

,

s_{B}^{O} = s_{B}^{O} (s_{A}^{'}, θ_{B})

and

u_{B}^{O} = u_{B}^{O} (s_{A}^{'}, θ_{B})

. Player B offers action

s_{A}^{'}

and a monetary value of

γ δ_{B} (s_{A}^{'}) = γ (u_{B}^{'} - u_{B}^{O})

to player A. Two cases can occur.

Player A accepts:

u_{A}^{'} + γ δ_{B} (s_{A}^{'}) \geq u_{A}^{N}

. Strategy

(s_{A}^{'}, s_{B}^{'})

is played.

\begin{aligned} PoA^{A} & \leq \frac{u_{A}^{N} + u_{B}^{'}}{u_{A}^{'} + u_{B}^{'}} \leq \frac{u_{A}^{'} + u_{B}^{'} + γ \cdot u_{B}^{'}}{u_{A}^{'} + u_{B}^{'}} \\ & = 1 + γ \frac{u_{B}^{'}}{u_{A}^{'} + u_{B}^{'}} \leq 1 + γ \end{aligned}

Player A rejects:

u_{A}^{'} + γ δ_{B} (s_{A}^{'}) < u_{A}^{N}

. Strategy

(s_{A}^{N}, s_{B}^{O})

is played.

PoA^{R} \leq \frac{u_{A}^{N} + u_{B}^{'}}{u_{A}^{N} + u_{B}^{O}} \leq 1 + \frac{u_{B}^{'}}{u_{A}^{N} + u_{B}^{O}} \leq 1 + \frac{1}{γ}

where the last inequality comes from:

u_{A}^{'} + γ δ_{B} (s_{A}^{'}) < u_{A}^{N} \Leftrightarrow γ u_{B}^{'} < u_{A}^{N} + γ u_{B}^{O} - u_{A}^{'} < u_{A}^{N} + u_{B}^{O}

☐

When

γ = 1

, the price of anarchy is two, but player B has no incentive to choose such a value. If

γ = 0.5

, the price of anarchy is three. Of course, player B will choose

γ^{'} = γ^{*} (s_{A}^{'}, θ_{B})

. Lemma 4 indicates that the worst-case price of anarchy is

1 + γ^{'}

when player A accepts, which happens with probability

P (s_{A}^{'}, γ^{'}, θ_{B})

, and

1 + \frac{1}{γ^{'}}

otherwise. This yields the following result.

Theorem 2.

The Bayesian Nash price of anarchy of the single-offer mechanism for one-way games is at most:

\frac{γ^{'} + 1}{γ^{'}} (1 - P (s_{A}^{'}, γ^{'}, θ_{B}) (1 - γ^{'}))

where:

γ^{'} = \underset{γ \in R_{[0, 1]}}{arg-max} P (s_{A}^{'}, γ, θ_{B}) (1 - γ)

Proof.

By combining Lemmas 3 and 4, we can derive the following upper bound for the PoA.

\begin{aligned} PoA & \leq P (s_{A}^{'}, γ^{'}, θ_{B}) PoA^{A} (γ^{'}) + (1 - P (s_{A}^{'}, γ^{'}, θ_{B})) PoA^{R} (γ^{'}) \\ & = P (s_{A}^{'}, γ^{'}, θ_{B}) (1 + γ^{'}) + (1 - P (s_{A}^{'}, γ^{'}, θ_{B})) (1 + \frac{1}{γ^{'}}) \\ & = 1 + \frac{1}{γ^{'}} + P (s_{A}^{'}, γ^{'}, θ_{B}) (γ^{'} - \frac{1}{γ^{'}}) \\ & = \frac{γ^{'} + 1}{γ^{'}} + P (s_{A}^{'}, γ^{'}, θ_{B}) (\frac{(γ^{'})^{2} - 1}{γ^{'}}) \\ & = \frac{γ^{'} + 1}{γ^{'}} (1 - P (s_{A}^{'}, γ^{'}, θ_{B}) (1 - γ^{'})) \end{aligned}

☐

To get a better idea of how the mechanism improves the social welfare, it is useful to quantify the price of anarchy in Theorem 2 for a specific class of distributions.

Corollary 1.

If

δ_{A} (s_{A}^{'}, θ_{A})

has a cumulative distribution function

F (x) = {(x / δ_{B})}^{β}

between zero and

δ_{B}

, with

0 < β \leq 1

, then

γ = \frac{β}{β + 1}

, and the price of anarchy is at most:

(2 + \frac{1}{β}) (1 - β^{β} {(1 + β)}^{- (β + 1)})

For example, if

β = 1

, then

F (x)

is the uniform distribution,

γ = \frac{1}{2}

, and the expected price of anarchy is at most

2.25

.

This corollary, in conjunction with Lemma 1, gives us the cost of enforcing individual rationality, moving from a price of anarchy of two to a price of 2.25 in the case of a uniform distribution.
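The numbers in Corollary 1 can be reproduced directly from Theorem 2. The sketch below assumes, as in the corollary, that the acceptance probability is P = γ^β (the CDF evaluated at γ · δ_B); the function names and the grid-search check are additions made here for illustration.

```python
def theorem2_bound(gamma, accept_prob):
    """Bound of Theorem 2: ((γ + 1) / γ) · (1 - P · (1 - γ))."""
    return (gamma + 1) / gamma * (1 - accept_prob * (1 - gamma))


def corollary1(beta):
    """Optimal ratio and PoA bound for F(x) = (x / δ_B)^β, 0 < β <= 1."""
    gamma = beta / (beta + 1)
    accept_prob = gamma ** beta  # P = F(γ · δ_B) = γ^β
    return gamma, theorem2_bound(gamma, accept_prob)


def closed_form(beta):
    """Closed form stated in Corollary 1."""
    return (2 + 1 / beta) * (1 - beta ** beta * (1 + beta) ** (-(beta + 1)))


def grid_gamma(beta, steps=10_000):
    """Numerical check of γ = β / (β + 1): maximise γ^β · (1 - γ) on a grid."""
    return max((k / steps for k in range(1, steps)), key=lambda g: g ** beta * (1 - g))


for beta in (0.25, 0.5, 1.0):
    gamma, bound = corollary1(beta)
    print(f"beta={beta}: gamma={gamma:.4f}, bound={bound:.4f}, closed form={closed_form(beta):.4f}")
```

For β = 1 this gives γ = 1/2 and a bound of 2.25, matching the uniform case discussed above. The grid search is only a sanity check that β / (β + 1) maximizes γ^β (1 - γ).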

The strategy

〈 s_{A}^{'}, γ^{'} 〉

is of independent interest. It indicates how a player with limited computational power can achieve an outcome that satisfies individual rationality without optimizing over all strategies.

6. Multi-Offer Mechanism

This section extends the single-offer mechanism by allowing player B to make multiple monetary offers for the same proposed action. Our main result shows that making counteroffers under commitment does not improve efficiency over the single-offer mechanism. By commitment, we mean that player B must be able to guarantee that the price schedule she originally announces will not be modified in the future. In this setting, incentive-compatibility and individual rationality conditions refer to the optimality of each player’s complete plan of action, in which players initially commit directly to entire strategies.

The single-offer mechanism was characterized by an action

s_{A} \in S_{A}

and a single value

γ \in R_{[0, 1]}

. The multi-offer mechanism is characterized by a four-tuple:

(s_{A} \in S_{A}, n, γ = (γ_{1}, \dots, γ_{n}) \in R_{[0, 1]}^{n}, p = (p_{1}, \dots, p_{n}) \in R_{[0, 1]}^{n})

where n is the number of offers,

(γ_{1}, \dots, γ_{n})

is a sequence of numbers in

R_{[0, 1]}^{n}

to compute the ratios of

δ_{B} (s_{A}, θ_{B}) = u_{B} (s_{B} (s_{A}, θ_{B}), θ_{B}) - u_{B}^{O} (s_{A}, θ_{B})

to be offered and

(p_{1}, \dots, p_{n})

is a sequence of probabilities for continuing to make offers where we assume that

p_{1} = 1

. The multi-offer mechanism is defined as follows:

1. Player B selects an action $s_{A} \in S_{A}$ to propose to player A.
2. Player B also computes her outside option $s_{B}^{O} (s_{A}, θ_{B})$ in case player A rejects action $s_{A}$, and we denote by $u_{B}^{O} (s_{A}, θ_{B})$ the expected payoff from her outside option.
3. Player B selects $γ = (γ_{1}, \dots, γ_{n}) \in R_{[0, 1]}^{n}$ and $p = (p_{1}, \dots, p_{n}) \in R_{[0, 1]}^{n}$, with $p_{1} = 1$. Player B has to commit to this sequence of values (despite what she learns from player A’s actions).
4. At step $1 \leq i \leq n$, player B proposes a monetary value of $γ_{i} \cdot δ_{B} (s_{A}, θ_{B})$ with $δ_{B} (s_{A}, θ_{B}) = u_{B} (s_{B} (s_{A}, θ_{B}), θ_{B}) - u_{B}^{O} (s_{A}, θ_{B})$ and $γ_{i} \in R_{[0, 1]}$ to player A in the hope that she accepts to play strategy $s_{A}$ instead of strategy $s_{A}^{N}$.
5. Player A decides whether to accept the offer.
6. If player A accepts the offer, the outcome of the game is $(s_{A}, s_{B} (s_{A}, θ_{B}))$.
7. If player A rejects the offer, set $i \leftarrow i + 1$, and go to Step 4 with probability $p_{i}$.
8. Otherwise, the outcome of the game is the outside option $(s_{A}^{N} (θ_{A}), s_{B}^{O} (s_{A}, θ_{B}))$.

For ease of notation, we denote

δ_{B} (s_{A}) = δ_{B} (s_{A}, θ_{B})

and

δ_{A} (s_{A}) = δ_{A} (s_{A}, θ_{A})

for the rest of this section, where

δ_{A} (s_{A}, θ_{A}) = u_{A} (s_{A}^{N} (θ_{A}), θ_{A}) - u_{A} (s_{A}, θ_{A})

. In the multiple-offer mechanism, player B makes a sequence of offers

γ_{i} δ_{B} (s_{A})

to player A to play strategy

s_{A}

. The first offer is

γ_{1} δ_{B} (s_{A})

. If player A refuses the offer, then player B makes a second offer

γ_{2} δ_{B} (s_{A})

with probability

p_{2}

. Hence, with probability

1 - p_{2}

, player B makes no offer, and the outcome of the game is

(s_{A}^{N} (θ_{A}), s_{B}^{O} (s_{A}, θ_{B}))

. In general, at iteration i, player B makes an offer

γ_{i} δ_{B} (s_{A})

with probability

p_{i}

and the outside option is played with probability

1 - p_{i}

. The mechanism stops when player A accepts an offer or when player B stops making offers to player A. In this last case, once again, the outside option is played.

Observe that player A could reject an offer even if it is more profitable than playing her utility-maximizing action

s_{A}^{N} (θ_{A})

because she may expect a better offer in the future. To avoid this behavior, the multi-offer mechanism imposes a condition on the

γ_{i}

’s and

p_{i}

’s to ensure that player A accepts the first offer that gives her a higher payoff than her default action

s_{A}^{N} (θ_{A})

. Two conditions must hold for player A to accept an offer in step

i \in [1, \dots, n]

:

(a) Individual rationality:

\begin{matrix} γ_{i} δ_{B} (s_{A}) \geq δ_{A} (s_{A}) \end{matrix}

(3)

which is equivalent to Proposition 2.

(b) Greater expected utility in step i than in step

i + 1

:

γ_{i} δ_{B} (s_{A}) + u_{A} (s_{A}, θ_{A}) \geq p_{i + 1} (γ_{i + 1} δ_{B} (s_{A}) + u_{A} (s_{A}, θ_{A})) + (1 - p_{i + 1}) u_{A} (s_{A}^{N} (θ_{A}), θ_{A})

which is equivalent to:

\begin{matrix} \frac{γ_{i} - p_{i + 1} γ_{i + 1}}{1 - p_{i + 1}} δ_{B} (s_{A}) \geq δ_{A} (s_{A}) \end{matrix}

(4)
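The algebra behind this equivalence is short. The sketch below assumes p_{i + 1} < 1, an assumption made here so that the final division is well defined, and uses the definition of δ_{A} (s_{A}) given earlier in this section.

```latex
\begin{aligned}
γ_{i} δ_{B} (s_{A}) + u_{A} (s_{A}, θ_{A})
  &\geq p_{i + 1} (γ_{i + 1} δ_{B} (s_{A}) + u_{A} (s_{A}, θ_{A})) + (1 - p_{i + 1}) u_{A} (s_{A}^{N} (θ_{A}), θ_{A}) \\
(γ_{i} - p_{i + 1} γ_{i + 1}) δ_{B} (s_{A})
  &\geq (1 - p_{i + 1}) (u_{A} (s_{A}^{N} (θ_{A}), θ_{A}) - u_{A} (s_{A}, θ_{A})) = (1 - p_{i + 1}) δ_{A} (s_{A}) \\
\frac{γ_{i} - p_{i + 1} γ_{i + 1}}{1 - p_{i + 1}} δ_{B} (s_{A})
  &\geq δ_{A} (s_{A})
\end{aligned}
```

The second line collects the terms in u_{A} (s_{A}, θ_{A}) on the right-hand side, and the third divides both sides by 1 - p_{i + 1}.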

We now show that the multiple-offer mechanism is in fact equivalent to the single-offer mechanism. We use the notation:

S_{i} = \begin{cases} 0 & \text{if } i = 0 \\ \frac{γ_{i} - p_{i + 1} γ_{i + 1}}{1 - p_{i + 1}} & \text{if } n > i > 0 \\ γ_{n} & \text{if } i = n \end{cases}

so that Condition (4) can be expressed as

S_{i} δ_{B} (s_{A}) \geq δ_{A} (s_{A})

.
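The eight steps of the multi-offer mechanism, combined with the acceptance thresholds S_i, can be sketched as a small simulation. This is an illustration under assumptions made here: player A accepts at step i exactly when S_i · δ_B is at least δ_A, every continuation probability after the first is strictly below one (so that S_i is well defined), and the function names, return format and numbers are hypothetical.

```python
import random


def acceptance_thresholds(gammas, probs):
    """S_1..S_n: S_i = (γ_i - p_{i+1} γ_{i+1}) / (1 - p_{i+1}) for i < n, and S_n = γ_n."""
    thresholds = []
    for i in range(len(gammas) - 1):
        p_next = probs[i + 1]  # assumed < 1 so the division is defined
        thresholds.append((gammas[i] - p_next * gammas[i + 1]) / (1 - p_next))
    thresholds.append(gammas[-1])
    return thresholds


def run_multi_offer(delta_a, delta_b, gammas, probs, rng):
    """Play one round; return (outcome, step, payment)."""
    thresholds = acceptance_thresholds(gammas, probs)
    for i, (gamma, threshold) in enumerate(zip(gammas, thresholds)):
        # After a rejection, the next offer is only made with probability p_i.
        if i > 0 and rng.random() >= probs[i]:
            return ("outside option", None, 0.0)
        # Player A accepts the first offer whose threshold covers her loss δ_A.
        if threshold * delta_b >= delta_a:
            return ("accepted", i + 1, gamma * delta_b)
    return ("outside option", None, 0.0)


gammas, probs = [0.3, 0.6], [1.0, 0.25]
print(acceptance_thresholds(gammas, probs))
print(run_multi_offer(0.25, 1.0, gammas, probs, random.Random(0)))
```

With these values the first threshold is about 0.2, below the first offer ratio of 0.3. A player A with δ_A = 0.25 therefore rejects a first offer that already covers her loss, because waiting for the second offer is worth more to her in expectation.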

Note that if player A refuses an offer with

γ_{i}

, she will also refuse offers with smaller ratios. This observation leads to the following proposition.

Proposition 4.

In the multi-offer mechanism,

γ_{i + 1} > γ_{i}, \forall i \in [1, \dots, n - 1]

In other words, Proposition 4 states that counteroffers should increase over time.

Lemma 5.

In the multiple-offer mechanism, for all

i \in [1, \dots, n]

,

γ_{i} \geq S_{i}

Proof.

Assume that

γ_{i} < S_{i}

. By the definition of

S_{i}

, it follows that

γ_{i} - p_{i + 1} γ_{i} < γ_{i} - p_{i + 1} γ_{i + 1}

, and hence,

γ_{i} > γ_{i + 1}

. This contradicts Proposition 4, which states that the γ’s form an increasing sequence. ☐

Corollary 2.

Condition (4) implies Condition (3).

Proof.

S_{i} \leq γ_{i}

and

δ_{A} (s_{A}) \leq S_{i} δ_{B} (s_{A})

implies

δ_{A} (s_{A}) \leq S_{i} δ_{B} (s_{A}) \leq γ_{i} δ_{B} (s_{A})

. ☐

If player A rejected the offer in step

i - 1

, then Conditions (3) and (4) were not both satisfied in step

i - 1

. By Lemma 5, only two cases may occur:

If $γ_{i} δ_{B} (s_{A}) < δ_{A} (s_{A})$ , then it must be the case that:

$S_{i - 1} δ_{B} (s_{A}) \leq γ_{i - 1} δ_{B} (s_{A}) < δ_{A} (s_{A})$

(5)
If $γ_{i} δ_{B} (s_{A}) \geq δ_{A} (s_{A})$ , then:

$S_{i - 1} δ_{B} (s_{A}) < δ_{A} (s_{A}) \leq γ_{i - 1} δ_{B} (s_{A})$

(6)

The disjunction of Conditions (5) and (6) yields the following inequality:

S_{i - 1} δ_{B} (s_{A}) < δ_{A} (s_{A})

(7)

By Corollary 2, if player A accepts in step i given that she rejected in step

i - 1

, we have that:

δ_{A} (s_{A}) \leq S_{i} δ_{B} (s_{A})

(8)

Recalling Definition 8, the cumulative distribution function of random variable

δ_{A} (s_{A})

was denoted by,

P (s_{A}, γ) = Pr [δ_{A} (s_{A}) \leq γ δ_{B} (s_{A})]

Hence, the probability of acceptance in step i can be derived from Conditions (7) and (8) as:

Pr [S_{i - 1} δ_{B} (s_{A}) < δ_{A} (s_{A}) \leq S_{i} δ_{B} (s_{A})] = P (s_{A}, S_{i}) - P (s_{A}, S_{i - 1})

Player B aims at choosing the

γ_{i}

’s, the probabilities

p_{i}

’s and action

s_{A}

to maximize her expected utility, which is equivalent to the following optimization problem.

\begin{aligned} \max_{γ, p} \quad & \sum_{i = 1}^{n} [(\prod_{j = 1}^{i} p_{j}) (P (s_{A}, S_{i}) - P (s_{A}, S_{i - 1})) (1 - γ_{i})] && (9) \\ \text{s.t.} \quad & p_{1} = 1 && (10) \\ & S_{0} = 0 && (11) \\ & S_{i} = \frac{γ_{i} - p_{i + 1} γ_{i + 1}}{1 - p_{i + 1}} && (12) \\ & S_{n} = γ_{n} && (13) \\ & S_{1} \leq S_{2} \leq \dots \leq S_{n} \leq 1 && (14) \end{aligned}

where the term

\prod_{j = 1}^{i} p_{j}

is the probability of reaching the i-th offer. We are now ready to state the main result of this section.

Theorem 3.

The multi-offer mechanism is equivalent to the single-offer mechanism in one-way games.

Proof.

By Equation (12),

(1 - p_{i + 1}) (1 - S_{i}) = (1 - γ_{i}) - p_{i + 1} (1 - γ_{i + 1})

Then, by using Equation (13) and grouping the

P (s_{A}, S_{i})

terms, the objective function becomes:

\sum_{i = 1}^{n - 1} [(\prod_{j = 1}^{i} p_{j}) (1 - p_{i + 1}) P (s_{A}, S_{i}) (1 - S_{i})] + (\prod_{j = 1}^{n} p_{j}) P (s_{A}, γ_{n}) (1 - γ_{n})

(15)

Observe that each term in the objective function features an expression of the form

P (s_{A}, x) (1 - x)

. Hence, the objective is bounded from above by:

\text{Expression (15)} \leq \sum_{i = 1}^{n - 1} [(\prod_{j = 1}^{i} p_{j}) (1 - p_{i + 1}) \cdot C] + C \cdot \prod_{j = 1}^{n} p_{j}

where

C = {max}_{x} P (s_{A}, x) (1 - x)

. We show that, for any given probabilities p, there is a unique solution that meets this upper bound. Let

x^{*} = \arg\max_{x} P (s_{A}, x) (1 - x)

. The right-hand term in Equation (15) is optimized by setting

γ_{n} = x^{*}

. We show by induction that all of the other terms are optimized by setting

γ_{i} = x^{*}

. Assume that this holds for

γ_{i + 1}, \dots, γ_{n}

. We need to optimize

P (s_{A}, S_{i}) (1 - S_{i})

. By induction,

S_{i} = \frac{γ_{i} - p_{i + 1} x^{*}}{1 - p_{i + 1}}

and assigning

x^{*}

to

γ_{i}

gives

S_{i} = x^{*}

and

P (s_{A}, S_{i}) (1 - S_{i}) = C

. Since all

γ_{i}

are equal, this concludes the proof. ☐
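The argument can be checked numerically on a concrete distribution. The sketch below assumes that δ_A / δ_B is uniform on [0, 1], so that P(s_A, x) = x and C = 1/4 is attained at x* = 1/2; this distribution, the function names and the random sampling scheme are choices made here for illustration. Schedules are generated from the thresholds S_i, and the γ_i are recovered by inverting Equations (12) and (13).

```python
import random


def accept_cdf(x):
    """Assumed acceptance probability P(s_A, x): δ_A / δ_B uniform on [0, 1]."""
    return min(max(x, 0.0), 1.0)


def gammas_from_thresholds(thresholds, probs):
    """Invert (12)-(13): γ_n = S_n and γ_i = (1 - p_{i+1}) S_i + p_{i+1} γ_{i+1}."""
    gammas = [0.0] * len(thresholds)
    gammas[-1] = thresholds[-1]
    for i in range(len(thresholds) - 2, -1, -1):
        gammas[i] = (1 - probs[i + 1]) * thresholds[i] + probs[i + 1] * gammas[i + 1]
    return gammas


def multi_offer_objective(thresholds, probs):
    """Objective (9), with S_0 = 0 and reach = p_1 · ... · p_i."""
    gammas = gammas_from_thresholds(thresholds, probs)
    total, reach, previous = 0.0, 1.0, 0.0
    for s, g, p in zip(thresholds, gammas, probs):
        reach *= p
        total += reach * (accept_cdf(s) - accept_cdf(previous)) * (1 - g)
        previous = s
    return total


C = 0.25  # max over x of x (1 - x), attained at x* = 1/2

rng = random.Random(0)
best = 0.0
for _ in range(2000):
    n = rng.randint(1, 5)
    thresholds = sorted(rng.random() for _ in range(n))  # S_1 <= ... <= S_n <= 1
    probs = [1.0] + [rng.random() for _ in range(n - 1)]  # p_1 = 1
    best = max(best, multi_offer_objective(thresholds, probs))

constant_schedule = multi_offer_objective([0.5, 0.5, 0.5], [1.0, 0.3, 0.7])
print(best, constant_schedule, C)
```

No sampled schedule should exceed C, and the constant schedule with every γ_i equal to x* attains it. This agrees with the observation that the weights in Expression (15) sum to p_1 = 1, so the upper bound equals the value of the best single offer.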

The above derivation is related to a well-known result from Sobel and Takahashi [13], which models an iterative bargaining process in which a buyer with a private reservation price faces a seller with reservation price zero who makes all of the offers. Each player has a known, fixed discount factor, and when these discount factors are equal (which is equivalent to having a probability for the next offer), they showed that, under commitment, the infinite-horizon bargaining is equivalent to the single-shot one. There are differences between their model and ours: in our model, the buyer is making the offers; the probabilities are not fixed a priori (player B can choose them); and both outside options are private.

7. Conclusions

In one-way games, the utility of one player does not depend on the decisions of the other player. We showed that, in this setting, the outcome of a Nash equilibrium when no side payments are allowed can be arbitrarily far from the social welfare solution. When it is far enough, player B has bargaining power to incentivize player A monetarily, so that she moves from her equilibrium and cooperates to overcome bad social welfare. We have explored this possibility by analyzing the social welfare when side payments are allowed. In the setting with private information and when side payments are allowed, we proved that it is impossible to design a Bayes–Nash incentive-compatible mechanism for one-way games that is budget-balanced, individually rational and efficient. To alleviate these negative results, we proposed two privacy-preserving mechanisms, a single-offer and a multi-offer mechanism, and showed that both are equivalent in terms of the equilibrium outcomes reached.

The single-offer mechanism is simple for both parties, as well as for the broker, who just makes sure that the players follow the protocol. This mechanism also requires minimal information from the agents, who perform all of the combinatorial computations, while it incentivizes them to cooperate towards social welfare in a distributed setting. Moreover, the mechanism has the following desirable properties: it is budget-balanced and satisfies the individual rationality constraints and Bayesian incentive-compatibility conditions. Whereas the mechanism can only guarantee interim individual rationality for player B, since her outside option is not known a priori, it provides ex post individual rationality for player A. Additionally, we showed that, in a realistic setting, where agents have limited computational resources, a simpler version of the mechanism can be implemented without overly deteriorating the social welfare.

It is an open question whether there exists another mechanism (possibly more complex) that could lead to a better efficiency, while keeping the above properties. Indeed, in one-way games, player A has an intrinsic advantage over player B, which is not easy to overcome. One possible promising mechanism consists of player B setting rewards for all player A’s actions and player A choosing one in return for that money. This is known as the Bayesian unit-demand item-pricing problem (BUPP) [14]. Recent work has shown this problem to be NP-hard [15], but a factor three approximation to the optimal expected revenue of player B is obtained in [14] (subsequently improved to two in [16]). In the context of our paper, several interesting questions arise from the Bayesian unit-demand item-pricing problem: What is the efficiency achieved by the BUPP in one-way games? What is the impact of a constant factor approximation for the revenue on the social welfare in one-way games?

Acknowledgments

National Information and Communications Technology Australia (NICTA) is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program.

Author Contributions

All authors contributed equally to this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Abeliuk, A.; Berbeglia, G.; Van Hentenryck, P. A Bargaining Mechanism for One-Way Games. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, 25–31 July 2015.
2. Coase, R.H. The problem of social cost. J. Law Econ. 1960, 3, 1–44.
3. Hahnel, R.; Sheeran, K.A. Misinterpreting the Coase theorem. J. Econ. Issues 2009, 43, 215–238.
4. Gentle, J.E. Computational Statistics; Springer: Berlin, Germany, 2009; Volume 308.
5. Jackson, M.O.; Wilkie, S. Endogenous games and mechanisms: Side payments among players. Rev. Econ. Stud. 2005, 72, 543–566.
6. Myerson, R.B.; Satterthwaite, M.A. Efficient mechanisms for bilateral trading. J. Econ. Theory 1983, 29, 265–281.
7. D’Aspremont, C.; Gérard-Varet, L.A. Incentives and incomplete information. J. Public Econ. 1979, 11, 25–45.
8. Arrow, K. The Property Rights Doctrine and Demand Revelation Under Incomplete Information. In Economics and Human Welfare: Essays in Honour of Tibor Scitovsky; Boskin, M., Ed.; Academic Press: New York, NY, USA, 1979.
9. Williams, S.R. A characterization of efficient, Bayesian incentive compatible mechanisms. Econ. Theory 1999, 14, 155–180.
10. Krishna, V.; Perry, M. Efficient Mechanism Design; Working Paper; Department of Economics, The Pennsylvania State University: University Park, PA, USA, 1998.
11. Kennan, J.; Wilson, R. Bargaining with private information. J. Econ. Lit. 1993, 31, 45–104.
12. Chatterjee, K.; Samuelson, W. Bargaining under incomplete information. Oper. Res. 1983, 31, 835–851.
13. Sobel, J.; Takahashi, I. A multistage model of bargaining. Rev. Econ. Stud. 1983, 50, 411–426.
14. Chawla, S.; Hartline, J.D.; Kleinberg, R. Algorithmic pricing via virtual valuations. In Proceedings of the 8th ACM Conference on Electronic Commerce (ACM), San Diego, CA, USA, 11–15 June 2007; pp. 243–251.
15. Chen, X.; Diakonikolas, I.; Paparas, D.; Sun, X.; Yannakakis, M. The Complexity of Optimal Multidimensional Pricing. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SIAM), SODA ’14, Portland, OR, USA, 5–7 January 2014; pp. 1319–1328.
16. Chawla, S.; Hartline, J.D.; Malec, D.L.; Sivan, B. Multi-parameter Mechanism Design and Sequential Posted Pricing. In Proceedings of the 42nd ACM Symposium on Theory of Computing (ACM), STOC ’10, Cambridge, MA, USA, 6–8 June 2010; pp. 311–320.

^*An earlier, shorter version of this paper appeared in the Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), 2015 [1].

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
