Bargaining Mechanisms for One-Way Games

We introduce one-way games, a framework motivated by applications in large-scale power restoration, humanitarian logistics, and integrated supply chains. The distinguishing feature of these games is that the payoff of some player is determined only by her own strategy and does not depend on actions taken by other players. We show that both the equilibrium outcome in one-way games without payments and the social cost of any ex-post efficient mechanism can be far from the optimum. We also show that it is impossible to design a Bayes-Nash incentive-compatible mechanism for one-way games that is budget-balanced, individually rational, and efficient. To address this negative result, we propose a privacy-preserving mechanism that is incentive-compatible and budget-balanced, satisfies ex-post individual rationality conditions, and produces an outcome that is more efficient than the equilibrium without payments. The mechanism is based on single-offer bargaining, and we show that a randomized multi-offer extension brings no additional benefit.


Introduction
When modeling economic interactions between agents, it is standard to adopt a general framework where payoffs of individuals are dependent on the actions of all other decision-makers. However, some agents may have payoffs that depend only on their own actions, not on actions taken by other agents. In this paper, we explore the consequences of such asymmetries among agents. Since these features lead to a restricted version of the general model, the hope is that we can identify mechanisms that produce efficient outcomes by exploiting the properties of this specific setting.
A classic application of this setting is Coase's example of a polluter and a single victim, e.g., a steel mill that affects a laundry. The Coase theorem (1960) is often interpreted as a demonstration of why private negotiations between polluters and victims can yield efficient levels of pollution without government interference. However, in an influential article, Hahnel and Sheeran (2009) criticize the Coase theorem by showing that, under more realistic conditions, it is unlikely that an efficient outcome will be reached. They emphasize that the solution is a negotiation, and not a market-based transaction as described by Coase. As such, incomplete information plays an important role and game theory and bargaining games can explain inefficient outcomes.
Other real-life applications are found in large-scale restoration of interdependent infrastructures after significant disruptions [Cavdaroglu et al., 2013; Coffrin et al., 2012], humanitarian logistics over multiple states or regions [Van Hentenryck et al., 2010], supply chain coordination (see, e.g., Voigt (2011)), integrated logistics, and the joint planning and control of gas and electricity networks. Consider, for example, the restoration of the power system and the telecommunication network after a major disaster. As explained in Cavdaroglu et al. [2013], there are one-way dependencies between the power system and the telecommunication network. This means, for instance, that some power lines must be restored before some parts of the telecommunication network become available. It is possible to use centralized mechanisms for restoring the system as a whole. However, these restorations are often performed by different agencies with independent objectives, and selfish behavior may have a strong impact on the social welfare. It is thus important to study whether it is possible to find high-quality outcomes in decentralized settings when stakeholders proceed independently and do not share complete information about their utilities.
This paper takes a first step in this direction by proposing a class of two-player one-way dependent decision settings that abstracts some of the salient features of these applications and formalizes many of Hahnel and Sheeran's critiques. We present a number of negative and positive results on one-way games. We first show that Nash equilibria in one-way games, under no side payments, can be arbitrarily far from the optimal social welfare. Moreover, in contrast to the Coase theorem, we show that when side payments are allowed in a Bayes-Nash incentive-compatible setting, there is no ex-post efficient, individually rational, and budget-balanced mechanism for one-way games. To address this negative result, we focus on mechanisms that are budget-balanced, individually rational, and incentive-compatible, and that are relatively efficient. Our main positive result is a single-offer bargaining mechanism which, under reasonable assumptions on the players, increases the social welfare compared to the setting where no side payments are allowed. We also show that this single-offer mechanism cannot be improved by a (randomized) multi-offer mechanism.
The rest of this paper is organized as follows. In Section 2 we define one-way games and study the properties of Nash equilibria. In Section 4.1 we prove an impossibility result. The single-offer mechanism is presented and analyzed in Sections 5 and 5.1, respectively. Finally, in Section 6 we present a multi-offer mechanism and show that it does not improve efficiency with respect to the single-offer one.

One-Way Games
One-way games feature two players, A and B. Each player $i \in \{A, B\}$ has a public action set $S_i$, and we write $S = S_A \times S_B$ to denote the set of joint action profiles. As is common in mechanism design, we model private information by associating each agent $i$ with a payoff function $u_i : S \times \Theta_i \to \mathbb{R}_+$, where $\Theta_i$ denotes the set of possible types of player $i$ and $u_i(s, \theta_i)$ is the agent's utility for strategy profile $s$ when the agent has type $\theta_i$. We assume that the player types are stochastically independent and drawn from a distribution $f$ that is common knowledge. We write $\Theta = \Theta_A \times \Theta_B$. If $\theta \in \Theta$, we use $\theta_i$ to denote the type of player $i$ in $\theta$. Similar conventions are used for strategies, utilities, and type distributions.
A key feature of one-way games is that the payoff $u_A((s_A, s_B), \theta_A)$ of player A is determined only by her own strategy and does not depend on B's actions, i.e., $u_A((s_A, s_B), \theta_A) = u_A((s_A, s'_B), \theta_A)$ for all $s_B, s'_B \in S_B$. As a result, for ease of notation, we use $u_A(s_A, \theta_A)$ to denote A's payoff. Player B, in contrast, must act according to what player A chooses to do, and we use $s_B(s_A, \theta_B)$ to denote the best response of player B given that A plays action $s_A$ and player B has type $\theta_B$, i.e., $s_B(s_A, \theta_B) = \arg\max_{s_B \in S_B} u_B((s_A, s_B), \theta_B)$. In this paper, we always assume that ties are broken arbitrarily in arg-max expressions.
One-way games assume that players are risk-neutral agents and that, after having observed the realization of their own types, players simultaneously choose their actions. As a consequence, if side payments are not allowed, player A will play an action $s_A^N$ that yields her a maximum payoff, i.e., $s_A^N(\theta_A) = \arg\max_{s_A \in S_A} u_A(s_A, \theta_A)$. Player B will pick $s_B^N(\theta_B)$ such that her expected payoff is maximized, i.e., $s_B^N(\theta_B) = \arg\max_{s_B \in S_B} E_{\theta_A}[u_B((s_A^N(\theta_A), s_B), \theta_B)]$. The set of Nash equilibria (NE) is thus characterized by $s^N(\theta) = (s_A^N(\theta_A), s_B^N(\theta_B)) \subseteq S$. The best response $s_B^N(\theta_B)$ of player B may be a bad outcome for her even when B has a much greater potential payoff. Player A achieves her optimal payoff, but our motivating applications aim at optimizing a global welfare function $SW(s, \theta) = u_A(s_A, \theta_A) + u_B(s, \theta_B)$. Thus, the global welfare achieved by the Nash equilibria can be expressed as $SW(s^N(\theta), \theta)$. We quantify the quality of the Nash equilibrium outcome with the price of anarchy (PoA).
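Definition 1. The price of anarchy of $s^N(\theta) \subseteq S$ given type $\theta \in \Theta$ is defined as $\mathrm{PoA}(\theta) = \max_{s \in S} SW(s, \theta) \,/\, \min_{s \in s^N(\theta)} SW(s, \theta)$.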
A natural extension for the price of anarchy is to quantify the expected worst-case equilibrium.
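Definition 2. The Bayes-Nash price of anarchy is defined as $\mathrm{PoA} = E_\theta[\max_{s \in S} SW(s, \theta)] \,/\, E_\theta[\min_{s \in s^N(\theta)} SW(s, \theta)]$.
Note that the price of anarchy given type $\theta \in \Theta$ can be used to obtain a lower and an upper bound on the Bayes-Nash price of anarchy in the following way: $\min_{\theta \in \Theta} \mathrm{PoA}(\theta) \le \mathrm{PoA} \le \max_{\theta \in \Theta} \mathrm{PoA}(\theta)$.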
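The equilibrium without side payments and its price of anarchy can be computed directly for a finite instance. The sketch below is only illustrative: the function names, the dictionary-based prior, and the toy payoffs are our own assumptions, and the price of anarchy is taken to be the ratio of the optimal welfare to the equilibrium welfare for a given type profile.

```python
def nash_outcome(S_A, S_B, u_A, u_B, prior_A, theta_B):
    """Equilibrium play without side payments in a finite one-way game.

    u_A(a, theta_A) and u_B(a, b, theta_B) are payoff callables;
    prior_A maps each type of player A to its probability (hypothetical layout).
    """
    # Player A ignores B: for each of her types she maximizes her own payoff.
    s_N_A = {t: max(S_A, key=lambda a: u_A(a, t)) for t in prior_A}

    # Player B best-responds in expectation to A's type-dependent play.
    def expected_payoff(b):
        return sum(p * u_B(s_N_A[t], b, theta_B) for t, p in prior_A.items())

    return s_N_A, max(S_B, key=expected_payoff)


def price_of_anarchy(S_A, S_B, u_A, u_B, s_N_A, s_N_B, theta_A, theta_B):
    """Optimal welfare divided by equilibrium welfare for one type profile."""
    def welfare(a, b):
        return u_A(a, theta_A) + u_B(a, b, theta_B)

    optimum = max(welfare(a, b) for a in S_A for b in S_B)
    return optimum / welfare(s_N_A[theta_A], s_N_B)


# Toy instance (assumed numbers): A earns her type from "s1" and 100 from "s2";
# B earns 150 only under the profile ("s1", "b1").
S_A, S_B = ["s1", "s2"], ["b1", "b2"]
prior_A = {20: 0.5, 80: 0.5}
u_A = lambda a, t: float(t) if a == "s1" else 100.0
u_B = lambda a, b, t: 150.0 if (a, b) == ("s1", "b1") else 0.0

s_N_A, s_N_B = nash_outcome(S_A, S_B, u_A, u_B, prior_A, theta_B=None)
poa_80 = price_of_anarchy(S_A, S_B, u_A, u_B, s_N_A, s_N_B, 80, None)
poa_20 = price_of_anarchy(S_A, S_B, u_A, u_B, s_N_A, s_N_B, 20, None)
```

In this toy instance player A always plays "s2", so player B earns nothing although a joint deviation to ("s1", "b1") would raise the welfare for both types of A.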
Throughout this paper, we use the following two running examples to illustrate key concepts.
Example 1. Consider the instance where player A has two possible actions $s_A^1, s_A^2 \in S_A$. Action $s_A^1$ has a payoff $u_A(s_A^1)$ distributed according to a uniform distribution between 0 and 100, while action $s_A^2$ has a constant payoff $u_A(s_A^2) = 100$. The set of dominant actions for player B corresponds to the set of best responses $s_B(s_A^1)$ and $s_B(s_A^2)$. Let player B have only one type, and set the payoffs to $u_B(s_B(s_A^1)) = x$ and $u_B(s_B(s_A^2)) = 0$, where $x$ is a positive constant. When no transfers are allowed, player A will always play action $s_A^2$, yielding a social welfare of $100 + 0$. If player A plays $s_A^1$, her expected payoff is 50 and the expected social welfare is thus $50 + x$. The price of anarchy is $(50 + x)/100$ if $x \ge 50$ and 1 otherwise. Notice that the PoA is a nondecreasing function of $x$.
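The arithmetic of Example 1 can be checked with a few lines of code. This is a minimal sketch; the function name is hypothetical and the welfare values are the expected values quoted in the example.

```python
def example1_no_payments(x):
    """Expected welfare figures of Example 1 for a given payoff x of player B."""
    nash_welfare = 100.0 + 0.0   # A plays s^2_A (payoff 100); B earns 0
    alt_welfare = 50.0 + x       # A plays s^1_A (mean of U(0, 100) is 50); B earns x
    poa = max(alt_welfare, nash_welfare) / nash_welfare
    return nash_welfare, alt_welfare, poa


poa_large_x = example1_no_payments(150.0)[2]   # (50 + 150) / 100
poa_small_x = example1_no_payments(30.0)[2]    # the equilibrium is already best
```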
Example 2. Consider the instance where player A has $n$ possible actions $s_A^1, s_A^2, \dots, s_A^n \in S_A$. Each action has a payoff $u_A(s_A^i)$ that is independent and identically distributed according to a uniform distribution between 0 and 1. For player B, consider the set of best responses $s_B(s_A^1, \theta_B), s_B(s_A^2, \theta_B), \dots, s_B(s_A^n, \theta_B)$. Let the expected payoffs be $E_{\theta_B}[u_B(s_B(s_A^i, \theta_B), \theta_B)] = \mu_i$ for $i = 1, 2, \dots, n$, where $\mu_1 \ge \mu_2 \ge \dots \ge \mu_n$. All other payoffs are set to 0, i.e., for any $j \ne i$, $u_B((s_A^j, s_B(s_A^i, \theta_B)), \theta_B) = 0$. When no transfers are allowed, player A chooses the action with her highest realized payoff. This payoff is distributed according to the largest order statistic, i.e., the maximum of all payoffs. The largest order statistic of $n$ standard uniforms follows a Beta(n, 1) distribution, with mean $n/(n+1)$. Hence, the expected payoff of player A is $n/(n+1)$. By symmetry of player A's payoff functions, each of her actions is played with probability $1/n$. Thus, player B's expected payoff is maximized when choosing $s_B(s_A^1, \theta_B)$, yielding her an expected payoff of $\mu_1/n$. As a result, the expected social welfare in equilibrium is $n/(n+1) + \mu_1/n$. To compute the optimal social welfare, note that if player A selects $s_A^1$ with probability 1, her expected payoff is $1/2$ and the expected payoff of player B is $\mu_1$, so the social welfare is $1/2 + \mu_1$, which is greater than $n/(n+1) + \mu_1/n$ if $\mu_1 \ge 1/2$. The price of anarchy is $(1/2 + \mu_1) / (n/(n+1) + \mu_1/n)$, and in the limit as $n \to \infty$, the PoA becomes $1/2 + \mu_1$. Notice that the PoA is an increasing function of $\mu_1$.
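The closed-form quantities of Example 2 can be compared against a small Monte Carlo simulation. The sketch below is illustrative only: the function names, the number of trials, and the seed are our own choices, and player B's payoff is replaced by its mean $\mu_1$ whenever player A happens to play her first action.

```python
import random


def example2_closed_form(n, mu1):
    """Equilibrium welfare, welfare when A always plays s^1_A, and their ratio."""
    nash_welfare = n / (n + 1) + mu1 / n
    alt_welfare = 0.5 + mu1
    return nash_welfare, alt_welfare, max(alt_welfare, nash_welfare) / nash_welfare


def simulate_nash_welfare(n, mu1, trials=100_000, seed=0):
    """Average welfare when A plays her realized best action and B commits
    to the best response to A's first action (index 0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        payoffs = [rng.random() for _ in range(n)]
        best = max(range(n), key=payoffs.__getitem__)
        # B earns mu1 in expectation only when A actually plays action 0.
        total += payoffs[best] + (mu1 if best == 0 else 0.0)
    return total / trials


nash_sw, alt_sw, poa = example2_closed_form(n=4, mu1=2.0)
simulated_sw = simulate_nash_welfare(n=4, mu1=2.0)
```

For $n = 4$ and $\mu_1 = 2$ the closed form gives an equilibrium welfare of $0.8 + 0.5 = 1.3$, and the simulated average should be close to that value.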
Examples 1 and 2 illustrate that, when no transfers are allowed, the price of anarchy can be arbitrarily large when player B's payoff is large compared with the payoff of player A. We now generalize this idea to quantify the price of anarchy in one-way games.
Proposition 1. In one-way games, the price of anarchy when no payments are allowed satisfies, for any type $\theta$, $\mathrm{PoA}(\theta) \le 1 + \max_{s \in S} u_B(s, \theta_B) \,/\, \max_{s_A \in S_A} u_A(s_A, \theta_A)$.
The price of anarchy can thus be arbitrarily large. When it is large enough, Proposition 1 indicates that $\max_{s \in S} u_B(s, \theta_B) > \max_{s_A \in S_A} u_A(s_A, \theta_A)$. In this case, player B has bargaining power to incentivize player A monetarily so that she moves from her equilibrium and cooperates to overcome a bad social welfare. This paper explores this possibility by analyzing the social welfare when side payments are allowed.
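One bound consistent with Proposition 1 follows in three steps from the non-negativity of payoffs. The derivation below is our own sketch, not necessarily the argument used by the authors, and it assumes $\max_{s_A} u_A(s_A, \theta_A) > 0$.

```latex
\begin{align*}
\max_{s \in S} SW(s,\theta)
  &\le \max_{s_A \in S_A} u_A(s_A,\theta_A) + \max_{s \in S} u_B(s,\theta_B), \\
SW(s^N(\theta),\theta)
  &\ge u_A(s_A^N(\theta_A),\theta_A) = \max_{s_A \in S_A} u_A(s_A,\theta_A)
  \qquad \text{(since } u_B \ge 0\text{)}, \\
\mathrm{PoA}(\theta)
  &\le 1 + \frac{\max_{s \in S} u_B(s,\theta_B)}{\max_{s_A \in S_A} u_A(s_A,\theta_A)}.
\end{align*}
```

In particular, a price of anarchy above 2 is only possible when player B's best payoff exceeds player A's best payoff, which is the situation in which B can afford to compensate A.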

Related Work
Before moving to the main results, it is useful to discuss related games. One-way games may seem to resemble Stackelberg games with their notions of leader and follower. The key difference however is that, in one-way games, the leader does not depend on the action taken by the follower. In addition, in one-way games, players do not have complete information and moves are simultaneous. Jackson and Wilkie (2005) studied one-way instances derived from their more general framework of endogenous games. However, they tackled the problem from a different perspective and assumed complete information (i.e., the player utilities are not private). Jackson and Wilkie gave a characterization of the outcome when players make binding offers of side payments, deriving the conditions under which a new outcome becomes a Nash equilibrium or remains one. They analyzed a subclass, called 'one sided externality', which is essentially a one-way game but with complete information. They showed that the efficient outcome is an equilibrium in this setting, supporting Coase's claim that a polluter and his victim can reach an efficient outcome. Under perfect information, the victim can determine the minimal transfer necessary to support the efficient play. Naturally, this result does not hold under incomplete information [Myerson and Satterthwaite, 1983]. In what follows, we design a bargaining mechanism that is able to cope with the incomplete information setting.

Bayesian-Nash Mechanisms
In this section, we consider a Bayesian-Nash setting with quasi-linear preferences. Both players A and B have private utilities and beliefs about the utilities of the other player. By the revelation principle, we can restrict our attention to direct mechanisms that implement a social choice function. A social choice function in quasi-linear environments takes the form $f(\theta) = (k(\theta), t(\theta))$ where, for every $\theta \in \Theta$, $k(\theta) \in S$ is the allocation and $t_i(\theta) \in \mathbb{R}$ represents a monetary transfer to agent $i$. The main objective of mechanism design is to implement a social choice function that achieves near-efficient allocations while respecting some desirable properties. For completeness, we specify these key properties.
Definition 3. A social choice function is ex-post efficient if, for all $\theta \in \Theta$, we have $k(\theta) \in \arg\max_{s \in S} \sum_i u_i(s, \theta_i)$.

Definition 4. A social choice function is budget-balanced if, for all $\theta \in \Theta$, $\sum_i t_i(\theta) = 0$.
In other words, there are no net transfers out of the system or into the system. Taken together, ex-post efficiency and budget balance imply Pareto optimality. An essential condition of any mechanism is to guarantee that agents report their true types. The following property captures this notion when agents have prior beliefs on the types of other agents.

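Definition 5. A social choice function is Bayes-Nash incentive compatible (IC) if, for every player $i$ and all $\theta_i, \hat\theta_i \in \Theta_i$, $E_{\theta_{-i}|\theta_i}[u_i(k(\theta_i, \theta_{-i}), \theta_i) + t_i(\theta_i, \theta_{-i})] \ge E_{\theta_{-i}|\theta_i}[u_i(k(\hat\theta_i, \theta_{-i}), \theta_i) + t_i(\hat\theta_i, \theta_{-i})]$,
where $\theta_i \in \Theta_i$ is the type of player $i$, $\hat\theta_i$ is the type player $i$ reports, and $E_{\theta_{-i}|\theta_i}$ denotes player $i$'s expectation over prior beliefs $\theta_{-i}$ of the types of other agents given her own type $\theta_i$.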
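The most natural definition of individual rationality (IR) is interim IR, which states that every agent type has non-negative expected gains from participation.
Definition 6. A social choice function is interim individually rational if, for every player $i$ and every $\theta_i \in \Theta_i$, $E_{\theta_{-i}|\theta_i}[u_i(k(\theta), \theta_i) + t_i(\theta)] \ge \bar u_i(\theta_i)$,
where $\bar u_i(\theta_i)$ is the expected utility for non-participation.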
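In the context of one-way games, both players have positive outside options that depend only on their types. In particular, the outside options are given by the Nash equilibrium outcome under no side payments. For players A and B, the expected utilities for non-participation are $\bar u_A(\theta_A) = u_A(s_A^N(\theta_A), \theta_A)$ and $\bar u_B(\theta_B) = E_{\theta_A}[u_B((s_A^N(\theta_A), s_B^N(\theta_B)), \theta_B)]$.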

Impossibility Result
This section shows that there exists no mechanism for one-way games that is efficient and satisfies the traditional desirable properties. The result is derived from the Myerson-Satterthwaite (1983) theorem, a seminal impossibility result in mechanism design. The Myerson-Satterthwaite theorem considers a bargaining game with two-sided private information and states that, for a bilateral trade setting, there exists no Bayes-Nash incentive-compatible mechanism that is budget-balanced, ex-post efficient, and gives every agent type non-negative expected gains from participation (i.e., ex interim individual rationality). Our contribution is twofold: we present an impossibility result for one-way games and we relate them to bargaining games, an idea that we explore further in the following sections. We now formalize the impossibility result for one-way games.
Consider the Myerson-Satterthwaite bilateral bargaining setting.
Definition 7. Myerson-Satterthwaite bargaining game:
1. A seller (player 1) owns an object for which her valuation is $v_1 \in V_1$, and a buyer (player 2) wants to buy the object at a valuation $v_2 \in V_2$.
2. Each player $i$ knows her valuation $v_i$ at the time of the bargaining, and player 1 (resp. 2) has a probability density $f_2(v_2)$ (resp. $f_1(v_1)$) for the other player's valuation.
3. Both distributions are assumed to be continuous and positive on their domains, and the intersection of the domains is not empty.
By the revelation principle, we can restrict our attention to incentive-compatible direct mechanisms. A direct mechanism for bargaining games is characterized by two functions: (1) a function $\sigma : V_1 \times V_2 \to [0, 1]$ that specifies the probability that the object is transferred from the seller to the buyer and (2) a monetary transfer scheme $p : V_1 \times V_2 \to \mathbb{R}^2$. In this setting, ex-post efficiency is achieved if $\sigma(v_1, v_2) = 1$ when $v_1 < v_2$, and 0 otherwise.
Our result consists in showing that a mechanism $M'$ for the Myerson-Satterthwaite setting can be constructed using a mechanism $M$ for a one-way game in such a way that, if $M$ is efficient, individually rational (IR), incentive-compatible (IC), and budget-balanced (BB), then $M'$ is efficient, IR, IC, and BB. The Myerson-Satterthwaite impossibility theorem states that such a mechanism $M'$ cannot exist, which implies the following impossibility result for one-way games.
Theorem 1. There is no ex-post efficient, individually rational, incentive-compatible, and budget-balanced mechanism for one-way games.
Proof. For any bargaining setting, consider the following transformation into a one-way game instance: $S_A = \{s_A^1, s_A^2\}$, $S_B = \{s_B\}$, $u_A(s_A^1, v_1) = v_1$, $u_A(s_A^2, v_1) = 0$, $u_B((s_A^1, s_B), v_2) = 0$, and $u_B((s_A^2, s_B), v_2) = v_2$, where player types $(v_1, v_2) \in V_1 \times V_2$ are drawn from distribution $f_1 \times f_2$. Two possible outcomes may occur, $(s_A^1, s_B)$ or $(s_A^2, s_B)$, with social welfare $v_1$ and $v_2$, respectively. Let us assume $M = (k, t)$ is a direct mechanism for one-way games and that $M$ is ex-post efficient, IR, IC, and BB. We now construct a mechanism $M' = (\sigma, p)$, where $\sigma(v_1, v_2)$ is the probability that the object is transferred from the seller to the buyer and $p(v_1, v_2)$ is the payment of each player. We define $M'$ such that $\sigma(v_1, v_2) = 1$ if $k(v_1, v_2) = (s_A^2, s_B)$ and 0 otherwise, and $p(v_1, v_2) = t(v_1, v_2)$. It remains to show that $M'$ satisfies all the desired properties.
An ex-post efficient mechanism $M$ in the one-way instance satisfies $k(v_1, v_2) = (s_A^2, s_B)$ if $v_1 < v_2$ and $k(v_1, v_2) = (s_A^1, s_B)$ otherwise. Therefore, $\sigma(v_1, v_2)$ assigns the object to the buyer iff $v_1 < v_2$. That is, the player with the highest valuation always gets the object, meeting the restriction of ex-post efficiency. The budget-balance constraint in $M$ implies that $p_1(v_1, v_2) + p_2(v_1, v_2) = t_A(v_1, v_2) + t_B(v_1, v_2) = 0$. The individual rationality property for $M'$ comes from noticing that the default strategy of player A when no payments are allowed is $s_A^1$ and the corresponding payoff is $v_1$. Therefore, the seller's utility is guaranteed to be at least her valuation $v_1$. Analogously, the buyer will not have a negative utility given that $u_B((s_A^1, s_B), v_2) = 0$. Incentive compatibility follows by contradiction: if $M'$ were not incentive-compatible, then in mechanism $M$ at least one player could benefit from reporting a false type, contradicting the incentive compatibility of $M$.
Such a mechanism $M'$ cannot exist since it contradicts the Myerson-Satterthwaite impossibility result, which concludes our proof.
An immediate consequence of this result is that Bayesian-Nash mechanisms can achieve at most two of the three properties: ex-post efficiency, individual rationality, and budget balance. For instance, VCG and dAGVA [d'Aspremont and Gérard-Varet, 1979; Arrow, 1979] are part of the Groves family of mechanisms that truthfully implement social choice functions that are ex-post efficient. VCG has no guarantee of budget balance, while dAGVA is not guaranteed to meet the individual-rationality constraints. We refer the reader to Williams (1999) and Krishna and Perry (1998) for alternative derivations of the impossibility result for bilateral trading under the Groves family of mechanisms.

Single-Offer Mechanism
In this section, we propose a simple bargaining mechanism for player B to increase her payoff. The literature about bargaining games is extensive and we refer readers to a broad review by Kennan and Wilson (1993).
Given the nature of our applications, individual rationality imposes a necessary constraint. Otherwise, player A can always defect from participating in the mechanism and achieve her maximal payoff independently of the type of player B. Additionally, we search for Bayesian-Nash mechanisms without subsidies, i.e., budget-balanced mechanisms. The absence of a subsidy in this case gives rise to a decentralized mechanism that does not require a third agent to perform the computations needed by the mechanism. However, a third party is needed to ensure compliance with the agreement reached by both players.
An interesting starting point for one-way games is the recognition that, whenever player B has a better payoff than A, player A may let player B play her optimal strategy in exchange for money. The resulting outcome can be viewed as swapping the roles of both players, i.e., player B chooses her optimal strategy and A plays her best response to B's strategy. In this case, as in Proposition 1, the worst outcome would be $\mathrm{PoA}(\theta) \le 1 + \max_{s_A \in S_A} u_A(s_A, \theta_A) \,/\, \max_{s \in S} u_B(s, \theta_B)$. This observation leads to the following lemma.
Lemma 2. Consider the social choice function that selects the strategy that maximizes the payoff of either player A or player B, i.e., the strategy $s'(\theta) = \arg\max_{s \in S} \max\{u_A(s_A, \theta_A), u_B(s, \theta_B)\}$. In the one-way game, strategy $s'(\theta)$ has a price of anarchy of at most 2 (i.e., $\forall\theta$, $\mathrm{PoA}(\theta) \le 2$).
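The factor of 2 in Lemma 2 can be checked numerically on random finite instances. The sketch below reads the lemma as selecting the profile that maximizes the larger of the two players' payoffs, which is our interpretation of the statement; the function names and the sampling ranges are assumptions made for the illustration.

```python
import random


def lemma2_ratio(u_A, u_B):
    """Optimal welfare divided by the welfare of the profile s'.

    u_A[a] is A's payoff for action a; u_B[a][b] is B's payoff for profile (a, b).
    """
    profiles = [(a, b) for a in range(len(u_A)) for b in range(len(u_B[a]))]

    def welfare(s):
        return u_A[s[0]] + u_B[s[0]][s[1]]

    # s' maximizes the better of the two individual payoffs (first maximizer on ties).
    s_prime = max(profiles, key=lambda s: max(u_A[s[0]], u_B[s[0]][s[1]]))
    return max(welfare(s) for s in profiles) / welfare(s_prime)


def worst_ratio(trials=2000, seed=1):
    """Largest ratio observed over randomly generated games."""
    rng = random.Random(seed)
    worst = 1.0
    for _ in range(trials):
        n_a, n_b = rng.randint(1, 4), rng.randint(1, 4)
        u_A = [rng.uniform(0.01, 10.0) for _ in range(n_a)]
        u_B = [[rng.uniform(0.01, 10.0) for _ in range(n_b)] for _ in range(n_a)]
        worst = max(worst, lemma2_ratio(u_A, u_B))
    return worst


observed_worst = worst_ratio()
# A tight instance: both profiles tie on the individual maximum, but only the
# second one also gives the other player a payoff of 1.
tight = lemma2_ratio([1.0, 1.0], [[0.0], [1.0]])
```

The tight instance shows that the factor 2 cannot be improved without a more careful tie-breaking rule.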
Unfortunately, this social choice function cannot be implemented in dominant strategies without violating individual rationality. Player A may have a smaller payoff by following strategy $s'$ instead of the Nash equilibrium strategy $s^N$. Indeed, when $SW(s', \theta) < SW(s^N, \theta)$, at least one of the players must be worse off than when playing the Nash equilibrium strategy $s^N$. Lemma 2, however, gives us hope for designing a budget-balanced mechanism that has a constant price of anarchy. Indeed, a simple and distributed implementation would ask player B to propose an action to be implemented, and player A would receive a monetary compensation for deviating from her maximal strategy.
We now present such a distributed implementation based on a bargaining mechanism. The mechanism is inspired by the model of two-person bargaining under incomplete information presented by Chatterjee and Samuelson (1983). In their model, both the seller and the buyer submit sealed offers and a trade occurs if there is a gap in the bids. The price is then set to be a convex combination of the bids. Our single-offer mechanism adapts this idea to one-way games. In particular, to counteract player A's advantage, player B makes the first and final offer. Moreover, the structure of our mechanism makes it possible to quantify the price of anarchy and provide a quality guarantee on the mechanism outcome. Our single-offer mechanism is defined as follows:
1. Player B selects an action $s_A \in S_A$ to propose to player A.
2. Player B computes her outside option $s_B^O(s_A, \theta_B)$.
3. Player B offers the payment $\gamma \Delta_B(s_A, \theta_B)$, with $\Delta_B(s_A, \theta_B) = u_B((s_A, s_B(s_A, \theta_B)), \theta_B) - u_B^O(s_A, \theta_B)$ and $\gamma \in [0, 1]$, to player A in the hope that she accepts to play strategy $s_A$ instead of strategy $s_A^N$.
4. Player A decides whether to accept the offer.
5. If player A accepts the offer, the outcome of the game is $(s_A, s_B(s_A, \theta_B))$. Otherwise the outcome of the game is the outside option $(s_A^N(\theta_A), s_B^O(s_A, \theta_B))$.
It is worth observing that a broker is required in this mechanism to ensure that the outcome $(s_A^N(\theta_A), s_B^O(s_A, \theta_B))$ is implemented if player A rejects the unique offer, and no counteroffers are made. A key feature of the single-offer mechanism is that it requires a minimum amount of information from player A (i.e., whether she accepts or rejects the offer).
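The steps of the mechanism can be sketched as one function that settles a single round for realized payoffs. This is only an illustration under stated assumptions: the offer is taken to be $\gamma$ times player B's gain over her expected outside option, player A is assumed to accept exactly when the payment covers her loss from abandoning her preferred action, and all names are hypothetical.

```python
def single_offer_round(u_A, u_B_agree, u_B_outside, s_A, gamma):
    """Settle one round of a single-offer bargaining sketch.

    u_A          -- dict mapping each action of A to her realized payoff
    u_B_agree    -- B's payoff from best-responding to the proposed action s_A
    u_B_outside  -- B's expected payoff from her outside option
    Returns (accepted, payoff_A, payoff_B).
    """
    s_N = max(u_A, key=u_A.get)          # A's preferred action without payments
    delta_B = u_B_agree - u_B_outside    # B's gain from an agreement (assumed form)
    offer = gamma * delta_B              # payment proposed to A
    loss_A = u_A[s_N] - u_A[s_A]         # what A gives up by playing s_A

    if loss_A <= offer:                  # assumed acceptance rule
        # The payment moves from B to A, so no money enters or leaves the system.
        return True, u_A[s_A] + offer, u_B_agree - offer
    return False, u_A[s_N], u_B_outside


accepted = single_offer_round({"s1": 70.0, "s2": 100.0}, 120.0, 0.0, "s1", 0.5)
rejected = single_offer_round({"s1": 20.0, "s2": 100.0}, 120.0, 0.0, "s1", 0.5)
```

In the accepted case the two payoffs sum to $70 + 120$, the welfare of the agreed profile, which reflects budget balance; player A only reveals whether she accepts.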
To derive the equilibrium strategy for the single-offer mechanism, we assume players are expected-utility maximizers. The parameter $\gamma \in [0, 1]$ is chosen so that player B, satisfying individual rationality, never offers more than $\Delta_B(s_A, \theta_B)$ and her payoff is never worse than her expected outside option $u_B^O(s_A, \theta_B)$. Whereas the mechanism can only guarantee interim individual rationality for player B, it provides ex-post individual rationality for player A, as shown in the following proposition.
Proposition 2. If players A and B play the single-offer mechanism, then for any $(\theta_A, \theta_B) \in \Theta$, player A accepts the offer $(\gamma, s_A)$ whenever $u_A(s_A^N(\theta_A), \theta_A) - u_A(s_A, \theta_A) \le \gamma \Delta_B(s_A, \theta_B)$.
In case player A rejects the offer $(\gamma, s_A)$, she will choose her utility-maximizing action $s_A^N(\theta_A)$ as her outside option. Note that by Proposition 2, if $s_A = s_A^N(\theta_A)$, player A would never reject the proposed action $s_A$. Accordingly, if the proposed action $s_A$ is rejected, then $s_A \ne s_A^N(\theta_A)$. This observation leads to the following proposition.

Proposition 3. For every task s ∈ S
In the single-offer mechanism, player B will pick outside option $s_B^O(s_A, \theta_B)$ such that her expected payoff is maximized, i.e., $s_B^O(s_A, \theta_B) = \arg\max_{s_B \in S_B} E_{\theta_A}[u_B((s_A^N(\theta_A), s_B), \theta_B) \mid s_A^N(\theta_A) \ne s_A]$.
Example 1. (continued) The payoff of player B is higher if action $s_A^1$ is played by player A. Hence, player B has an incentive to submit an offer $c$ that triggers action $s_A^1$. Player A accepts the offer if $c + u_A(s_A^1) \ge u_A(s_A^2) = 100$. Given that $u_A(s_A^1)$ follows a uniform distribution, the probability that player A accepts the offer is $c/100$ if $c \le 100$ and 1 otherwise. For player B, this offer has an expected payoff of $(c/100)(x - c)$ if $c \le 100$ and $x - c$ otherwise. The optimal value for the offer is given by $c^* = x/2$ if $x \le 200$ and $c^* = 100$ if $x > 200$. This leads to an expected social welfare for the single-offer mechanism of $SW = (c^*/100)(x + 50) + (1 - c^*/100)(0 + 100)$. Recall that the optimal social welfare is $50 + x$ if $x \ge 50$ and 100 otherwise. Therefore, the mechanism has a price of anarchy $\mathrm{PoA} = \max\{50 + x, 100\} / SW$. The PoA is bounded by a constant and in fact, $\mathrm{PoA} \le 1.21$ for any $x$. This contrasts with the unbounded PoA obtained when no side payments are allowed.
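The constant bound quoted in Example 1 can be reproduced numerically. The sketch below assumes that the expected welfare has the same form as the expression stated for Example 2 (continued), namely the acceptance probability times $(50 + x)$ plus the rejection probability times 100; this form and the function names are our assumptions.

```python
def example1_single_offer(x):
    """Optimal offer, expected welfare, and PoA of the single-offer mechanism."""
    c_star = x / 2 if x <= 200 else 100.0          # B's optimal offer
    p_accept = min(c_star / 100.0, 1.0)            # P(u_A(s^1_A) >= 100 - c)
    welfare = p_accept * (50.0 + x) + (1.0 - p_accept) * 100.0  # assumed form
    optimum = max(50.0 + x, 100.0)
    return c_star, welfare, optimum / welfare


def worst_case():
    """Grid search for the largest PoA over x in [0, 400] with step 0.01."""
    xs = [i / 100 for i in range(40001)]
    return max((example1_single_offer(x)[2], x) for x in xs)


worst_poa, worst_x = worst_case()
```

Under this assumption the grid search peaks near $x \approx 108$ with a ratio of about 1.203, in line with the bound of 1.21, and the ratio returns to 1 once $x > 200$ because the offer is then always accepted.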

Example 2. (continued)
The payoff of player B is higher if action $s_A^1$ is played by player A. Hence, player B has an incentive to submit a monetary offer $c \le 1$ that triggers action $s_A^1$. Player A accepts the offer if $c + u_A(s_A^1) \ge \max_{s_A \in S_A} u_A(s_A)$. It can be shown that the probability that player A accepts the offer is $(cn - c^n)/(n - 1)$. In case of acceptance, the expected payoff is $\mu_1 - c$ for player B and $1/2 + c$ for player A. In case of rejection, it is guaranteed that player A will not play $s_A^1$ and hence player B's outside option is action $s_B(s_A^2, \theta_B)$ with an expected payoff of $\mu_2/(n - 1)$. Player A's expected outside option is $n/(n+1)$, corresponding to her expected maximum payoff derived in Example 2. As a result, player B, by offering $c$, has an expected payoff of $\frac{cn - c^n}{n - 1}(\mu_1 - c) + \left(1 - \frac{cn - c^n}{n - 1}\right)\frac{\mu_2}{n - 1}$. When $n$ is large, the probability of acceptance is approximately $\lim_{n \to \infty} (cn - c^n)/(n - 1) = c$. Accordingly, in case of rejection, the expected payoffs become $\lim_{n \to \infty} \mu_2/(n - 1) = 0$ for player B and $\lim_{n \to \infty} n/(n+1) = 1$ for player A. Player B's expected payoff is thus $c(\mu_1 - c)$ and is maximized when she offers $c^* = \mu_1/2$ if $\mu_1 \le 2$ and $c^* = 1$ otherwise. This leads to an expected social welfare for the single-offer mechanism of $SW = c^*(\mu_1 + 1/2) + (1 - c^*)(0 + 1)$. Recall that the optimal social welfare is $1/2 + \mu_1$ if $\mu_1 \ge 1/2$ and 1 otherwise. Therefore the mechanism has the following price of anarchy, $\mathrm{PoA} = \max\{1/2 + \mu_1, 1\} / (c^*(\mu_1 + 1/2) + (1 - c^*))$, and the PoA has a maximum value of $\frac{4}{31}(3 + 2\sqrt{10}) \approx 1.203$. This contrasts with the unbounded PoA obtained by the Nash equilibrium when no side payments are allowed.
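Two quantities in this example lend themselves to a numerical check. The sketch below reads the acceptance probability as $(cn - c^n)/(n - 1)$ and interprets it as conditional on $s_A^1$ not already being player A's preferred action; both the reading and the interpretation are our assumptions, as are the function names, the seed, and the number of trials.

```python
import math
import random


def acceptance_probability(n, c):
    """Assumed closed form for the acceptance probability."""
    return (c * n - c ** n) / (n - 1)


def simulate_acceptance(n, c, trials=200_000, seed=0):
    """Estimate P(u_1 + c >= max_j u_j | action 1 is not A's maximizer)."""
    rng = random.Random(seed)
    accepted = considered = 0
    for _ in range(trials):
        payoffs = [rng.random() for _ in range(n)]
        top = max(payoffs)
        if payoffs[0] == top:
            continue                      # s^1_A is already A's own choice
        considered += 1
        accepted += payoffs[0] + c >= top
    return accepted / considered


def limit_poa(mu1):
    """PoA of the single-offer mechanism in the large-n limit."""
    c_star = min(mu1 / 2, 1.0)
    welfare = c_star * (mu1 + 0.5) + (1 - c_star) * 1.0
    return max(0.5 + mu1, 1.0) / welfare


mu_star = (math.sqrt(10) - 1) / 2             # stationary point of limit_poa
closed_form = 4 / 31 * (3 + 2 * math.sqrt(10))
estimate = simulate_acceptance(n=5, c=0.3)
```

The stationary point $\mu_1 = (\sqrt{10} - 1)/2 \approx 1.08$ is obtained by setting the derivative of the ratio to zero for $1/2 \le \mu_1 \le 2$, and the ratio evaluated there coincides with the stated maximum.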
We now generalize the analysis done in Examples 1 and 2. We proceed by studying the utility-maximizing strategy $(s_A, \gamma)$ for player B and then derive the expected social welfare of the outcome for the single-offer mechanism. Note that, in case of agreement, the action of player B of type $\theta_B$ is solely defined by $s_A$, as she has no incentive to deviate from her best response $s_B(s_A, \theta_B)$. By Proposition 2, player A accepts an offer whenever $\Delta_A(s_A, \theta_A) \le \gamma \Delta_B(s_A, \theta_B)$, where $\Delta_A(s_A, \theta_A) = u_A(s_A^N(\theta_A), \theta_A) - u_A(s_A, \theta_A)$ is her loss from playing $s_A$. Player B aims at choosing $\gamma$ and $s_A$ to maximize her payoff, and we now study this optimization problem. In the case of an agreement, player B is left with a profit of $u_B((s_A, s_B(s_A, \theta_B)), \theta_B) - \gamma \Delta_B(s_A, \theta_B)$. Otherwise, player B gets an expected payoff of $u_B^O(s_A, \theta_B)$.
Definition 8. The probability that player A accepts the offer $(s_A, \gamma)$, given that player B has type $\theta_B \in \Theta_B$, is $P(s_A, \gamma, \theta_B) = \Pr_{\theta_A}[\Delta_A(s_A, \theta_A) \le \gamma \Delta_B(s_A, \theta_B)]$.
The expected profit of players A and B for proposed action $s = (s_A, s_B)$ and $\gamma$ when player B has type $\theta_B$ is given by
The optimal strategy of player B is specified in the following lemma.
Lemma 3. In the single-offer mechanism, player B chooses $s_A^*(\theta_B)$ and $\gamma^*(s_A^*, \theta_B)$ such that her expected profit is maximized over all pairs $(s_A, \gamma)$.

Price of Anarchy
We now analyze the quality of the outcomes in the single-offer mechanism. The first step is the derivation of a lower bound for the expected social welfare of the single-offer mechanism. Inspired by Lemma 2, instead of considering all pairs $(s_A, \gamma)$, the analysis restricts attention to a single action $s'_A$, namely player A's component of $s' = \arg\max_{s \in S} u_B(s, \theta_B)$. We prove that, when offering to player A action $s'_A$ and its associated optimal value for $\gamma$, the expected social welfare is no greater than that obtained with the optimal pair $(s_A^*, \gamma^*)$. As a result, we obtain an upper bound on the price of anarchy of the single-offer mechanism.
To make the discussion precise, consider the strategy where player B offers $(s'_A, \gamma^*(s'_A, \theta_B))$, with $\gamma^*(s'_A, \theta_B)$ being the optimal choice of $\gamma$ given $s'_A$, following the notation used in Lemma 3.
Lemma 4. For any type $\theta_B \in \Theta_B$ of player B, the expected social welfare achieved by the single-offer mechanism is at least the expected social welfare achieved by the strategy $(s'_A, \gamma^*(s'_A, \theta_B))$.
Proof. Let $\gamma^* = \gamma^*(s_A^*, \theta_B)$ and $\gamma' = \gamma^*(s'_A, \theta_B)$. The optimality condition of $s^*$ implies that player B's expected payoff under $(s_A^*, \gamma^*)$ is at least her expected payoff under $(s'_A, \gamma')$; we refer to this as Inequality (1). Two cases can occur.
The first case is $P(s_A^*, \gamma^*, \theta_B) \ge P(s'_A, \gamma', \theta_B)$, i.e., the probability of player A accepting offer $(s_A^*, \gamma^*)$ is at least as large as when offered $(s'_A, \gamma')$. Then, it must be that the expected payoff of player A is greater when offered $(s_A^*, \gamma^*)$. This, together with Inequality (1), results in the single-offer mechanism having a greater expected social welfare.
The second case is $P(s'_A, \gamma', \theta_B) > P(s_A^*, \gamma^*, \theta_B)$. Consider $\gamma''$ such that $P(s'_A, \gamma'', \theta_B) = P(s_A^*, \gamma^*, \theta_B)$. The fact that the probabilities of acceptance are the same implies that the expected payoff of player A is the same in both cases. This, together with Inequality (1), yields Inequality (2), which states that there is more money in expectation to transfer to player A when choosing $s^*$ over $s'$. Similarly, consider $\gamma^{**}$ such that $P(s_A^*, \gamma^{**}, \theta_B) = P(s'_A, \gamma', \theta_B)$. Existence of $\gamma^{**}$ is guaranteed by Inequality (2). Since the acceptance probabilities are the same, the expected payoff of player A is the same in both cases; together with Inequality (2), it must be the case that the expected payoff of player B is higher when using $(s_A^*, \gamma^{**})$. Therefore, we have found an offer for the single-offer mechanism with greater expected social welfare and a greater payoff for player B compared with strategy $(s'_A, \gamma')$.
We are ready to derive an upper bound for the induced price of anarchy for the single-offer mechanism. We first derive the price of anarchy of strategy $(s'_A, \gamma')$ in case of agreement and disagreement of player A.
Lemma 5. Consider action $s' = \arg\max_{s \in S} u_B(s, \theta_B)$ and let $\mathrm{PoA}_A(\gamma)$ and $\mathrm{PoA}_R(\gamma)$ denote the induced price of anarchy if player A accepts and rejects the offer, respectively, given a proposed $\gamma$. Then, $\mathrm{PoA}_A(\gamma) = 1 + \gamma$ and $\mathrm{PoA}_R(\gamma) = 1 + 1/\gamma$.
to player A. Two cases can occur.
where the last inequality comes from
When $\gamma = 1$, the price of anarchy is 2, but player B has no incentive to choose such a value. If $\gamma = 0.5$, the price of anarchy is 3. Of course, player B will choose $\gamma' = \gamma^*(s'_A, \theta_B)$. Lemma 5 indicates that the worst-case outcome is $(1 + \gamma')$ when player A accepts, which happens with probability $P(s'_A, \gamma', \theta_B)$, and $(1 + 1/\gamma')$ otherwise. This yields the following result.
Theorem 6. The Bayesian price of anarchy of the single-offer mechanism for one-way games is at most
Proof. By combining Lemmas 4 and 5, we can derive the following upper bound for the PoA.
To get a better idea of how the mechanism improves the social welfare, it is useful to quantify the price of anarchy in Theorem 6 for a specific class of distributions.
Corollary 7. If $\Delta_A(s'_A, \theta_A)$ has a cumulative distribution function $F(x) = (x/\Delta_B)^\beta$ between 0 and $\Delta_B$, with $0 < \beta \le 1$, then $\gamma = \beta/(\beta + 1)$ and the price of anarchy is at most $\gamma^\beta (1 + \gamma) + (1 - \gamma^\beta)(1 + 1/\gamma)$.
For example, if $\beta = 1$, then $F(x)$ is the uniform distribution, $\gamma = 1/2$, and the expected price of anarchy is at most 2.25. This corollary, in conjunction with Lemma 2, gives us the cost of enforcing individual rationality, moving from a price of anarchy of 2 to a price of 2.25 in the case of a uniform distribution.
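The values quoted in the corollary can be reproduced with a short computation. The sketch rests on two assumptions of ours: player B's expected gain is proportional to $F(\gamma \Delta_B)(1 - \gamma) = \gamma^\beta (1 - \gamma)$, and the bound weights the two cases of Lemma 5 by the acceptance probability $\gamma^\beta$. The function names and the grid resolution are hypothetical.

```python
def gamma_star(beta, grid=100_000):
    """Grid search for the gamma maximizing B's assumed gain gamma**beta * (1 - gamma)."""
    gammas = [i / grid for i in range(1, grid)]
    return max(gammas, key=lambda g: g ** beta * (1 - g))


def poa_bound(beta):
    """Acceptance-weighted combination of the two cases of Lemma 5."""
    g = beta / (beta + 1)              # closed-form maximizer stated in the corollary
    p_accept = g ** beta               # F(g * Delta_B) = g ** beta
    return p_accept * (1 + g) + (1 - p_accept) * (1 + 1 / g)


gamma_uniform = gamma_star(1.0)
gamma_half = gamma_star(0.5)
bound_uniform = poa_bound(1.0)
```

The first-order condition $\beta(1 - \gamma) = \gamma$ gives $\gamma = \beta/(\beta + 1)$, which the grid search recovers for $\beta = 1$ and $\beta = 1/2$.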
The strategy $(s'_A, \gamma')$ is of independent interest. It indicates how a player with limited computational power can achieve an outcome that satisfies individual rationality without optimizing over all strategies.

Multi-Offer Mechanism
This section extends the single-offer mechanism by allowing player B to make multiple monetary offers for the same proposed action. Our main result shows that making counteroffers under commitment does not improve efficiency over the single-offer mechanism. By commitment we mean that player B must be able to guarantee that the price schedule she originally announces will not be modified in the future.
The single-offer mechanism was characterized by an action $s_A \in S_A$ and a single value $\gamma \in \mathbb{R}_{[0,1]}$. The multi-offer mechanism is characterized by a 4-tuple $(s_A, n, (\gamma_1, \dots, \gamma_n), (p_1, \dots, p_n))$, where $s_A \in S_A$ is the proposed action, $n$ is the number of offers, $(\gamma_1, \dots, \gamma_n)$ is a sequence of numbers in $\mathbb{R}^n_{[0,1]}$ used to compute the ratios to be offered, and $(p_1, \dots, p_n)$ is a sequence of probabilities for continuing to make offers, where we assume that $p_1 = 1$. The multi-offer mechanism is defined as follows:
1. Player B selects an action $s_A \in S_A$ to propose to player A.

4. Player B offers $\gamma_i \Delta_B(s_A)$, with $\gamma_i \in \mathbb{R}_{[0,1]}$, to player A in the hope that she agrees to play strategy $s_A$ instead of strategy $s^N_A$.

5. Player A decides whether to accept the offer.
6. If player A accepts the offer, the outcome of the game is $(s_A, s_B(s_A, \theta_B))$.
7. If player A rejects the offer, set $i \leftarrow i + 1$ and go to step 4 with probability $p_i$.
8. Otherwise, the outcome of the game is the outside option $(s^N_A(\theta_A), s^O_B(s_A, \theta_B))$.
For ease of notation, we denote

In the multi-offer mechanism, player B makes a sequence of offers $\gamma_i \Delta_B(s_A)$ to player A to play strategy $s_A$. The first offer is $\gamma_1 \Delta_B(s_A)$. If player A refuses the offer, then player B makes a second offer $\gamma_2 \Delta_B(s_A)$ with probability $p_2$. Hence, with probability $1 - p_2$, player B makes no offer and the outcome of the game is $(s^N_A(\theta_A), s^O_B(s_A, \theta_B))$. In general, at iteration $i$, player B makes an offer $\gamma_i \Delta_B(s_A)$ with probability $p_i$ and the outside option is played with probability $1 - p_i$. The mechanism stops when player A accepts an offer or when player B stops making offers to player A. In this last case, once again, the outside option is played.
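The offer loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: player A is assumed to accept the first offer that is at least her loss `delta_a` from deviating from her default action (the behavior the mechanism's conditions are designed to induce), and the names `run_multi_offer`, `delta_a` and `delta_b` are hypothetical.

```python
import random
from typing import Optional, Sequence, Tuple


def run_multi_offer(
    delta_a: float,
    delta_b: float,
    gammas: Sequence[float],
    probs: Sequence[float],
    rng: random.Random,
) -> Tuple[Optional[int], float]:
    """Simulate one run of the offer loop.

    Returns (step, payment): the 1-based step at which player A accepted and
    the payment she received, or (None, 0.0) if the outside option is played.
    """
    for step, (gamma, p_continue) in enumerate(zip(gammas, probs), start=1):
        # Offer i is made only with probability p_i (p_1 = 1 by assumption).
        if rng.random() >= p_continue:
            break
        payment = gamma * delta_b
        # Assumed acceptance rule: the offer covers player A's loss.
        if payment >= delta_a:
            return step, payment
    # Player B stopped making offers, or all n offers were rejected.
    return None, 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    print(run_multi_offer(0.6, 1.0, [0.3, 0.5, 0.7], [1.0, 1.0, 1.0], rng))
```

With all continuation probabilities equal to 1, a player A whose loss is 0.6 rejects the first two offers and accepts the third; setting $p_2 = 0$ ends the run after the first rejection.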
Observe that player A could reject an offer even if it is more profitable than playing her utility-maximizing action $s^N_A(\theta_A)$, because she may expect a better offer in the future. To avoid this behavior, the multi-offer mechanism imposes a condition on the $\gamma_i$'s and $p_i$'s to ensure that player A accepts the first offer that gives her a higher payoff than her default action $s^N_A(\theta_A)$. Two conditions must hold for player A to accept an offer in step $i \in \{1, \dots, n\}$:

(a) Individual Rationality:
$$\gamma_i \Delta_B(s_A) \ge \Delta_A(s_A), \qquad (3)$$
which is equivalent to Proposition 2.

(b) Greater expected utility in step $i$ than in step $i + 1$:
$$\gamma_i \Delta_B(s_A) - \Delta_A(s_A) \ge p_{i+1}\bigl(\gamma_{i+1} \Delta_B(s_A) - \Delta_A(s_A)\bigr),$$
which is equivalent to
$$(1 - p_{i+1})\,\Delta_A(s_A) \le (\gamma_i - p_{i+1}\gamma_{i+1})\,\Delta_B(s_A). \qquad (4)$$

We now show that the multi-offer mechanism is in fact equivalent to the single-offer mechanism. We use the notation
$$S_i = \frac{\gamma_i - p_{i+1}\gamma_{i+1}}{1 - p_{i+1}},$$
so that Condition (4) can be expressed as $\Delta_A(s_A) \le S_i \Delta_B(s_A)$.

Note that if player A refuses an offer with $\gamma_i$, she will also refuse offers with smaller ratios. This observation leads to the following proposition.
Lemma 8. In the multi-offer mechanism, for all $i \in \{1, \dots, n\}$, $\gamma_i \ge S_i$.

Proof. Assume that $\gamma_i < S_i$. By definition of $S_i$, it follows that $\gamma_i - p_{i+1}\gamma_i < \gamma_i - p_{i+1}\gamma_{i+1}$ and hence $\gamma_i > \gamma_{i+1}$. This contradicts Proposition 4, which states that the $\gamma$'s are defined as a non-decreasing sequence.
If player A rejected the offer in step $i - 1$, then Conditions (3) and (4) were not both satisfied in step $i - 1$. By Lemma 8, only two cases may occur:
1. If $\gamma_i \Delta_B(s_A) < \Delta_A(s_A)$ then it must be the case that
The disjunction of Conditions (5) and (6) yields the following inequality
By Corollary 9, if player A accepts in step $i$ given that she rejected in step $i - 1$, we have that
Recalling Definition 8, the cumulative distribution function of random variable $\Delta_A(s_A)$ was denoted by
Hence the probability of acceptance in step $i$ can be derived from Conditions (7) and (8) as
Player B aims at choosing the $\gamma_i$'s, the probabilities $p_i$'s, and action $s_A$ to maximize her expected utility, which is equivalent to the following optimization problem.
where the term $\prod_{j=1}^{i} p_j$ is the probability of reaching the $i$-th offer. We are now ready to state the main result of this section.
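One way to write the acceptance probability and player B's objective is sketched below. It is consistent with the proof of Lemma 8, but it rests on assumptions that are not stated in this section: $S_i = (\gamma_i - p_{i+1}\gamma_{i+1})/(1 - p_{i+1})$ with $p_{i+1} < 1$, the conventions $p_{n+1} = 0$ and $P(s_A, S_0) = 0$, a non-decreasing sequence of thresholds $S_i$, and $P(s_A, x) = \Pr[\Delta_A(s_A) \le x\,\Delta_B(s_A)]$.

```latex
% Sketch under the assumptions listed in the lead-in; not the paper's numbered equations.
\begin{align*}
  \text{rejected in step } i-1 &\;\Longrightarrow\; \Delta_A(s_A) > S_{i-1}\,\Delta_B(s_A), \\
  \text{accepted in step } i   &\;\Longrightarrow\; \Delta_A(s_A) \le S_i\,\Delta_B(s_A), \\
  \Pr[\text{accept in step } i] &= P(s_A, S_i) - P(s_A, S_{i-1}), \\
  \max_{s_A,\,\gamma,\,p}\;\; &\Delta_B(s_A) \sum_{i=1}^{n} \Bigl(\prod_{j=1}^{i} p_j\Bigr)
      \bigl[P(s_A, S_i) - P(s_A, S_{i-1})\bigr]\,(1 - \gamma_i).
\end{align*}
% Grouping the P(s_A, S_i) terms: the coefficient of P(s_A, S_i) is
%   \prod_{j \le i} p_j \,[(1 - \gamma_i) - p_{i+1}(1 - \gamma_{i+1})]
%     = \prod_{j \le i} p_j \,(1 - p_{i+1})\,(1 - S_i),
% and these weights \prod_{j \le i} p_j (1 - p_{i+1}) telescope to p_1 = 1.
```

Under this reading, the objective is a convex combination of terms of the form $P(s_A, x)(1 - x)$, which is the structure the proof of Theorem 10 relies on.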
Theorem 10. The multi-offer mechanism is equivalent to the single-offer mechanism in one-way games.
Proof. By Equation (12),

Then, by using (13) and grouping the $P(s_A, S_i)$ terms, the objective function becomes

Observe that each term in the objective function features an expression of the form $P(s_A, x)(1 - x)$. Hence the objective is bounded from above by

We show that, for any given probabilities $p$, there is a unique solution that meets this upper bound. Let $x^* = \arg\max_x P(s_A, x)(1 - x)$. The right-hand term in (15) is optimized by setting $\gamma_n = x^*$. We show by induction that all the other terms are optimized by setting $\gamma_i = x^*$. Assume that this holds for $\gamma_{i+1}, \dots, \gamma_n$. We need to optimize $P(s_A, S_i)(1 - S_i)$. By induction, $S_i = (\gamma_i - p_{i+1} x^*)/(1 - p_{i+1})$, and assigning $x^*$ to $\gamma_i$ gives $S_i = x^*$ and $P(s_A, S_i)(1 - S_i) = C$. Since all $\gamma_i$ are equal, the multi-offer mechanism reduces to a single offer, which concludes the proof.
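Theorem 10 can be illustrated numerically in the uniform case. The sketch below is a sanity check under explicit assumptions, not part of the proof: $\Delta_A(s_A)$ is uniform on $[0, \Delta_B(s_A)]$ so that $P(s_A, x) = x$ on $[0, 1]$, $\Delta_B(s_A)$ is normalized to 1, $S_i = (\gamma_i - p_{i+1}\gamma_{i+1})/(1 - p_{i+1})$ with $p_{n+1} = 0$, and player B's objective is $\sum_i \bigl(\prod_{j \le i} p_j\bigr)\,[P(s_A, S_i) - P(s_A, S_{i-1})]\,(1 - \gamma_i)$. All function names are illustrative.

```python
import random
from typing import List, Sequence


def accept_cdf(x: float) -> float:
    """P(s_A, x) for Delta_A uniform on [0, Delta_B]: x clamped to [0, 1]."""
    return min(1.0, max(0.0, x))


def thresholds(gammas: Sequence[float], probs: Sequence[float]) -> List[float]:
    """S_i = (gamma_i - p_{i+1} * gamma_{i+1}) / (1 - p_{i+1}), with p_{n+1} = 0."""
    n = len(gammas)
    result = []
    for i in range(n):
        p_next = probs[i + 1] if i + 1 < n else 0.0
        g_next = gammas[i + 1] if i + 1 < n else 0.0
        result.append((gammas[i] - p_next * g_next) / (1.0 - p_next))
    return result


def multi_offer_objective(gammas: Sequence[float], probs: Sequence[float]) -> float:
    """Player B's expected gain (Delta_B normalized to 1) under the assumed objective."""
    total, reach, prev_cdf = 0.0, 1.0, 0.0
    for gamma, p, s_i in zip(gammas, probs, thresholds(gammas, probs)):
        reach *= p  # probability that offer i is made at all
        cdf = accept_cdf(s_i)
        total += reach * (cdf - prev_cdf) * (1.0 - gamma)
        prev_cdf = cdf
    return total


def random_search(trials: int, seed: int) -> float:
    """Best objective over random non-decreasing ratio sequences and probabilities."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        n = rng.randint(1, 5)
        gammas = sorted(rng.random() for _ in range(n))
        probs = [1.0] + [rng.uniform(0.0, 0.95) for _ in range(n - 1)]
        best = max(best, multi_offer_objective(gammas, probs))
    return best


if __name__ == "__main__":
    single_offer_optimum = 0.25  # max of x * (1 - x), attained at x = 0.5
    print("single offer:", single_offer_optimum)
    print("all ratios 0.5:", multi_offer_objective([0.5, 0.5, 0.5], [1.0, 0.3, 0.6]))
    print("random search:", random_search(2000, seed=1))
```

In this setting no sampled schedule exceeds the single-offer optimum of 0.25, and the schedule with all ratios equal to $x^* = 0.5$ attains it, in line with the statement of the theorem.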
The above derivation is related to a well-known result from Sobel and Takahashi (1983), who model an iterated bargaining game between a buyer with a private reservation price and a seller with reservation price 0 who makes all the offers. There is a known fixed discount factor for each player and, when these discount factors are equal (which is equivalent to having a probability for a next offer), they showed that, under commitment, the infinite-horizon bargaining is equivalent to the single-shot game. There are differences between their model and ours: in our model, the buyer is making the offers, the probabilities are not fixed a priori (player B can choose them), and both outside options are private.

Discussion
In one-way games, the utility of one player does not depend on the decisions of the other player. We showed that, in this setting, the outcome of a Nash equilibrium can be arbitrarily far from the socially optimal solution. We also proved that it is impossible to design a Bayes-Nash incentive-compatible mechanism for one-way games that is budget-balanced, individually rational, and efficient. To alleviate these negative results, we proposed two privacy-preserving mechanisms, a single-offer and a multi-offer mechanism, and showed that the two are equivalent.
The single-offer mechanism is simple for both parties, as well as for the broker who just makes sure that the players follow the protocol. This mechanism also requires minimal information from the agents who perform all the combinatorial computations, while it incentivizes them to cooperate towards the social welfare in a distributed setting. Moreover, the mechanism has the following desirable properties: it is budget-balanced and satisfies the individual rationality constraints and Bayesian incentive-compatibility conditions. Additionally, we showed that, in a realistic setting, where agents have limited computational resources, a simpler version of the mechanism can be implemented without overly deteriorating the social welfare.
It is an open question whether there exists another mechanism (possibly more complex) that could lead to better efficiency while keeping the above properties. Indeed, in one-way games, player A has an intrinsic advantage over player B, which is not easy to overcome. One promising mechanism consists of player B setting rewards for all of player A's actions, and player A choosing one in return for that money. This is known as the Bayesian Unit-demand Item-Pricing Problem (BUPP) [Chawla et al., 2007]. Recent work has shown this problem to be NP-hard [Chen et al., 2014], but a factor 3 approximation to the optimal expected revenue of player B is obtained in [Chawla et al., 2007] (subsequently improved to 2 in [Chawla et al., 2010]). In the context of our paper, several interesting questions arise from the Bayesian Unit-demand Item-Pricing Problem. What is the efficiency achieved by the BUPP in one-way games? What is the impact of a constant factor approximation for the revenue on the social welfare in one-way games?
There are also many other directions for future research. It is important to generalize one-way games to multiple players. Moreover, there are applications where the dependencies are in both directions, e.g., the restoration of the power and the gas systems considered in Coffrin et al. (2012). These applications typically have multiple components to restore and the dependencies form an acyclic graph. Hence such a mechanism would likely need to consider this internal structure to obtain efficient outcomes.