The Vote with the Wallet Game: Responsible Consumerism as a Multiplayer Prisoner’s Dilemma

Abstract: Socially responsible consumers and investors are increasingly using their consumption and saving choices as a 'vote with the wallet' to reward companies that are at the vanguard of reconciling the creation of economic value with social and environmental sustainability. In our paper, we model the vote with the wallet as a multiplayer prisoner's dilemma, outline equilibria and possible solutions to the related coordination failure problem in evolutionary games, apply our analysis to domains in which the vote with the wallet is empirically more relevant, and provide policy suggestions.


Introduction
The vote with the wallet is a phenomenon of growing relevance in the contemporary economic scenario. We understand the vote with the wallet as the propensity of consumers to factor sellers' social and environmental responsibility into their consumption and saving choices, in order to stimulate companies to 'retail' bundles of private and public goods [1], which may ultimately be in consumers' and investors' own interest (e.g., in terms of healthier food, better job opportunities, and higher corporate fiscal responsibility).
Consumers may interpret their consumption as a voting behaviour [2] to send general messages to governments and other citizens, increase activism and engagement, and boost sustainable development [3]. Other consumers who are not aware of their political consumerism, however, also face purchase decisions between responsible and non-responsible products. In this introduction we describe four examples showing how the vote with the wallet is currently working. These real-life examples document the empirical relevance of the theoretical approach we develop in our paper. The model outlined in the second section explains why the vote with the wallet is a special case of multiplayer Prisoner's Dilemma (PD): each consumer voting with the wallet for the responsible product produces a positive externality for the other consumers, because her vote contributes to the public good benefit of a higher socially the European investment market, while voluntary exclusions not related to CM & APL cover about 23 percent (€4.0 trillion) of the market. In the same year the USSIF report finds that sustainable, responsible, and impact investing assets expanded by 76 percent over two years, up to $6.57 trillion at the start of 2014 (Report on US Sustainable, Responsible and Impact Investing Trends 2014), accounting for a market share of around 17 percent of all assets under professional management in the United States. A novel and relevant initiative in this direction is the Montreal Pledge (Montréal Pledge, 2014. Retrieved from http://montrealpledge.org.), signed by a coalition of funds accounting for $3 trillion of assets under management. The initiative requires signatories to "commit to measure and publicly disclose the carbon footprint of their investment portfolios on an annual basis" and to progressively reduce their footprint, providing a new frontier of application of the vote with the wallet.
A third relevant example of the vote with the wallet comes from the Oxfam Behind the Brands campaign. In February 2013 Oxfam rated the 10 largest food multinationals by evaluating the social and environmental responsibility of their supply chains on different domains (land, women, farmers, workers, climate, transparency, and water) in terms of awareness, knowledge and disclosure, commitment, and supply chain management. Oxfam then asked campaign supporters around the world to take action by voting with the wallet (i.e., buying products of the highest-ranked companies) or by sending ad hoc messages to companies expressing their disappointment in case of low scores. At the beginning of 2015, nearly 700,000 actions had been taken and 32 major investment funds accounting for around $1.5 trillion had joined Oxfam in asking the 10 biggest companies to improve their social and environmental stance. As a result of the campaign, 9 out of the 10 biggest food companies took actions to improve their scores. (More specifically, Oxfam reports that 9 companies out of 10 improved their scores from February 2013 to October 2014. Advancements concerned, among others, policies that commit to implementing the principles of Free Prior and Informed Consent, women's rights, farmers, and the environment.)
A fourth vote with the wallet practice is that of "Community supported agriculture". These small networks commit to buying local agricultural products that are socially and environmentally responsible directly from producers, and compete with traditional distributors of traditional product chains with a larger geographical extension (as of March 2015 around 1600 solidarity-based purchasing groups were present in Italy; for further details see http://www.retegas.org/index.php?module=pagesetter&func=viewpub&tid=2&pid=10).
The four examples described above document that millions of people are currently playing the vote with the wallet game, and more so if we consider that also many of those who do not choose 'responsible' products face in any case the alternative between conventional and alternative products. Understanding how the vote with the wallet works and how players can be successful in overcoming the related coordination failure problem is an interesting and still partially unexplored field of research.
The literature has so far analyzed in depth the supply side of the vote with the wallet phenomenon, with oligopolistic models that investigate how companies compete to attract socially and environmentally responsible consumers. Ref. [1] outlines a model where producers compete to attract SR consumers by retailing public goods (the industrial organization literature models competition in CSR by considering the latter an additional feature of the product; see, among others, [13,14]). Other contributions [15] document, under reasonable parametric conditions, that the market entry of not-for-profit pioneers triggers (partial) imitation as the optimal reaction of profit-maximizing incumbents, thereby identifying in the vote with the wallet one of the originating causes of Corporate Social Responsibility (CSR) and of the above described contagion observed in fields such as fair trade. However, the current literature lacks a demand side analysis with a more in-depth game theoretical inspection of consumers' interactions when voting with the wallet. On this side, the practice of rewarding ethical firm behaviour through consumption has been analysed by [3,16], who respectively discuss the nature and impact of ethical consumer decision making and how buycott and boycott are used as a form of political consumerism. A theory of boycott has also been modelled by [17]. In his model, individuals can first boycott a firm and then bargain private policy with the latter according to their preferences and available information. The vote with the wallet differs from the boycott in four ways. First, while boycott typically reduces demand, the vote with the wallet is an action aimed at redirecting (and in many cases increasing) consumer demand. Second, the vote with the wallet does not imply a bargaining process, since individuals just reveal their preferences by consuming the products of their preferred firms. Third, the vote with the wallet is a practice that individuals engage in every day by consuming, while boycotting is in general an extraordinary action that can be chosen under specific circumstances. Fourth, the vote with the wallet is a positive action which aims to reward the most virtuous firms, creating emulation, while boycotting is a negative action which penalises the worst firms. For these reasons, even though voting with the wallet for a responsible product implies not buying the alternative conventional product, the vote with the wallet and boycott cannot be considered as symmetric problems.
Our paper provides a contribution to the vote with the wallet phenomenon. In what follows we argue that coordination among consumers voting with the wallet creates a typical multiplayer PD with some qualifying characteristics that make the game unique. We then explore equilibria and potential solutions to the coordination problem from the simplest one-shot two-player up to the one-shot and infinitely repeated multiplayer games. More specifically, we outline conditions under which the PD can be overcome with grim strategies in Folk theorems, Pavlov and proportional tit-for-tat strategies for evolutionary games, and with the identification and creation of coalitions of voters who adopt proper strategies to enforce mutual voting equilibrium in the game.
The paper is divided into six sections (including introduction and conclusion). In the second section we outline the basic characteristics of the two-player one-shot version of the game. We then illustrate its multiplayer extension and discuss how the PD can be overcome. In the third section we analytically illustrate how Folk theorems and memory-one strategies in evolutionary games may enforce the mutual voting equilibrium in the repeated multiplayer game. The fourth section examines the power of coordination, illustrating how coalitions may enforce strategies which overcome the PD. In the fifth section we discuss how our findings may inform the policy debate on responsible consumerism. The final section concludes.

The Simplest Model Representation: A 2-Player Static Game
In the simplest version of the game there are 2 players, i = 1, 2, who can vote with the wallet for the SR product (vR) or for the standard product (vS). The payoff of player i is

$$U_i(S) = \frac{j(S)}{2}\, b + (a - c)\,\mathbb{1}_{\{S_i = vR\}}, \tag{1}$$

where $S := (S_i, S_{-i}) \in \{vR, vS\}^2$ is the strategy profile, $j(S)$ is the number of players choosing vR in $S$, and $\mathbb{1}_{\{S_i = vR\}}$ equals one if player i votes for the SR product and zero otherwise. The payoff function (1) depends on three crucial factors: the public good benefit accruing from the choice of the SR product ($b \in (0, +\infty)$), weighted by the share of players choosing the strategy vR, the enjoyment arising from players' other-regarding preferences ($a \in [0, +\infty)$), and the extra cost of voting for the SR product ($c \in [0, +\infty)$).
The first factor (b) is the economic benefit accruing to the individual from company behaviour change due to the vote with the wallet. This element hinges on the assumption that voters' actions have an impact on companies in proportion to the share of responsible voters and can direct them toward a more responsible behavior. Valuable examples of this benefit are higher chances of getting a job or higher job satisfaction in a more socially responsible company and health benefits or amenities in a more environmentally sustainable company. Other examples may relate to tax or cultural corporate responsibility. According to the former, consumers vote with the wallet for a company abstaining from tax dodging practices that reduce tax financed welfare services in their country. According to the latter, they vote with the wallet for a company that finances local cultural inheritance with its CSR policies. (An example of cultural corporate responsibility comes from Expedia, Inc., a world's leading online travel company establishing a partnership of the World Heritage Alliance with the UNESCO World Heritage Centre. The World Heritage Alliance includes 59 corporate members and partners (such as the Fairmont Hotels and Resorts and Mandarin Oriental) promoting environmental, cultural and social responsibility, and supporting local community tourism initiatives at World Heritage sites with grants or promotion in favour of responsible tourists contributions. The alliance is currently involved in the protection of 20 World Heritage sites in seven countries including Mexico, Costa Rica, Belize, Jordan, Dominica, Ecuador, and the United States. For other case studies of cultural corporate responsibility see [18,19].) 
In all these cases it is reasonable to assume that the responsible vote with the wallet produces a utility for consumers through a benefit which has public good features, since it is clearly non-rivalrous and non-excludable (a socially, environmentally, fiscally, or culturally responsible company cannot limit the enjoyment of its responsible stance to consumers voting for SR products, excluding free riders voting for conventional products). (We consider b as exogenous for simplicity (and not related to oligopolistic models of CSR competition) since we focus on the perspective of consumers, who are reasonably assumed to have their own approximate idea of the positive externality arising from the vote with the wallet and not to have a sophisticated knowledge of the competition model behind it.)
The second factor (a) is the contribution, if any, of the responsible purchase to the utility function of the voter if she has some form of other-regarding preferences, an element which has been demonstrated not to be uncommon in the experimental literature. (Empirical findings from Dictator Games [20], Gift Exchange Games [21,22], Public Good Games [23][24][25], Trust Games [26,27], and Ultimatum Games [28,29] provide ample evidence documenting the existence of other-regarding preferences. Evidence from behavioural studies highlights that individuals have other-regarding elements in their preferences ranging from (positive and negative) reciprocity [30], inequity aversion [31,32], other-regarding preferences [33], social welfare preferences [34], and various forms of pure and impure (warm glow) altruism [35,36]. A meta-study by [37] examines results from around 328 different Dictator game experiments for a total of 20,813 observations. The result is that only around 36 percent of individuals follow Nash rationality and give zero (based on these numbers the author can reject the null hypothesis that the dictator amount of giving is 0 with z = 35.44, p < 0.00001) and more than half give no less than 20 percent.)
The third factor (c) measures the cost of voting with the wallet, namely the extra cost, if any, paid by the consumer when choosing a product of a responsible company vis-à-vis a product of comparable quality and lower price of another company which falls below the responsibility standards of the former. (We assume $c \in [0, +\infty)$ since it represents the extra cost of a SR product relative to a standard product, when the latter is cheaper than the former. Alternatively, it could be assumed that $c \in (-\infty, +\infty)$. However, a negative c makes the problem trivial. Our theoretical analysis hence applies to the more frequent and reasonable cases in which CSR adds extra costs.)
We also assume for simplicity that $Y_i > c$ for all i = 1, 2 (where $Y_i$ is the income of player i), that is, all players' decisions to vote or not for SR products depend only on utility considerations and are not constrained by lack of income.
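The above described game can be represented by $G = (N, (S_i)_{i \in N}, (U_i)_{i \in N})$, where $N = \{1, 2\}$ is the set of players, $(S_i)_{i \in N}$ is the set of actions, and $(U_i)_{i \in N}$ is the set of payoffs described in (1). The payoff matrix reads

$$\begin{array}{c|cc}
 & \text{Player 2: } vR & \text{Player 2: } vS \\ \hline
\text{Player 1: } vR & (b + a - c,\ b + a - c) & (\tfrac{1}{2} b + a - c,\ \tfrac{1}{2} b) \\
\text{Player 1: } vS & (\tfrac{1}{2} b,\ \tfrac{1}{2} b + a - c) & (0,\ 0)
\end{array}$$

The game G always has a unique NE, which is (vS, vS) if $\tfrac{1}{2} b + a < c$ and (vR, vR) otherwise. The parametric conditions creating a PD in the game are

$$\tfrac{1}{2} b + a < c < b + a. \tag{2}$$

That is, when (2) holds, the (unique) NE (vS, vS) is Pareto dominated by the strategy pair (vR, vR), which yields the highest payoff for both players.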
By considering b and a as product and individual characteristics respectively, and c as a parameter which may differ depending on the characteristics of the market, the three regions of equilibria generated by different values of the cost of voting responsibly are illustrated in Figure 1. More specifically, given (2), we are no longer in the PD area when the cost of voting responsibly is too high ($c \ge b + a$) or too low ($c < \tfrac{1}{2} b + a$), which could be the case of products with a relatively low (or zero) cost of voting for social responsibility (see our discussion in Section 2.2 which follows). Please note that if $c = b + a$, the NE are (vR, vR) and (vS, vR), and both are inefficient, while if $c = \tfrac{1}{2} b + a$, we have that (vR, vR) is efficient and (vS, vS) is inefficient. Based on the above described features, the originality of the game in the PD literature lies in its 'hybrid' provision-PD game characteristics ([38] classify PDs into four categories (provision, commons, altruism, selfish) according to the private/public benefits and costs to players and to the action/inaction choices related to the 'cooperation' and 'defection' strategies), where both classical 'cooperation' and 'defection' strategies require an action. Another difference with respect to the standard provision-PD game is given by self-regarding preferences adding a private benefit to the 'cooperative' strategy, typically displayed by the vote with the wallet experience. As we will see in what follows, these specific features and the framework of the game produce original attempts to overcome the dilemma (Section 2.2) and original theoretical results vis-à-vis the standard provision-PD game in terms of the interval of the PD area (Section 2.1), Folk theorem threshold patience (Section 3.1), and renegotiation proofness conditions (Section 4.2).
The specific features of the game also produce an altruism paradox (Section 4.1) which can be overcome with mechanism designs that are unique to the vote with the wallet framework (Section 4.3).
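The equilibrium regions of the two-player game can be checked numerically. The following is a minimal sketch, assuming the stage payoff takes the form "share of vR voters times b, plus (a − c) for a player who votes vR"; the function names `payoff`, `pure_nash`, and `is_prisoners_dilemma` are illustrative and not part of the paper.

```python
from itertools import product

ACTIONS = ("vR", "vS")


def payoff(own, other, b, a, c):
    """Stage payoff of one player in the 2-player game.

    Assumed form: (share of players voting vR) * b, plus (a - c)
    if the player herself votes vR.
    """
    share = ((own == "vR") + (other == "vR")) / 2
    return share * b + ((a - c) if own == "vR" else 0.0)


def pure_nash(b, a, c):
    """Enumerate pure-strategy Nash equilibria by checking unilateral deviations."""
    equilibria = []
    for s1, s2 in product(ACTIONS, repeat=2):
        best1 = all(payoff(s1, s2, b, a, c) >= payoff(d, s2, b, a, c) for d in ACTIONS)
        best2 = all(payoff(s2, s1, b, a, c) >= payoff(d, s1, b, a, c) for d in ACTIONS)
        if best1 and best2:
            equilibria.append((s1, s2))
    return equilibria


def is_prisoners_dilemma(b, a, c):
    """PD region of the 2-player game: b/2 + a < c < b + a."""
    return 0.5 * b + a < c < b + a


if __name__ == "__main__":
    # Illustrative parameters: b = 4, a = 1, so the PD region is 3 < c < 5.
    for cost in (2.0, 4.0, 6.0):
        print(cost, pure_nash(4.0, 1.0, cost), is_prisoners_dilemma(4.0, 1.0, cost))
```

With these illustrative parameters, a low cost yields mutual responsible voting as the only equilibrium, while intermediate and high costs yield mutual conventional voting; only the intermediate cost falls in the PD region, where mutual responsible voting would give both players more.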

The Multiplayer Game
When the number of players is n ≥ 2, the game is represented by $G_n = (N, (S_i)_{i \in N}, (U_i)_{i \in N})$, where $N = \{1, \ldots, n\}$ and

$$U_i(S) = \frac{j}{n}\, b + (a - c)\,\mathbb{1}_{\{S_i = vR\}},$$

with j being the number of players who play vR in $S$. The game $G_n$ always has a unique NE, which is mutual voting for conventional products if $\tfrac{1}{n} b + a < c$ and mutual voting for SR products otherwise (we prove this result in Appendix A). However, if

$$\tfrac{1}{n} b + a < c < b + a,$$

we fall again in the PD and the mutual conventional voting equilibrium is not efficient since, for all players, the highest payoff is a + b − c, which is obtained under mutual responsible voting. Figure 2 clearly shows that, when the number of players grows, the lower bound of the PD area in the voting with the wallet game decreases. This implies that in standard global consumer markets where the number of players is very large, the mutual responsible voting equilibrium can be attained only if the other-regarding preference parameter is higher than the cost differential parameter for all players (this is because $\lim_{n \to +\infty} \tfrac{1}{n} b + a = a$). As a result, the PD is a highly relevant problem in the vote with the wallet game with a high number of players and whenever the value of c is not negligible.
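How the PD region widens with the number of players can be illustrated with a short sketch. It assumes the multiplayer stage payoff is (j/n)·b plus (a − c) for a responsible voter, with j the number of responsible voters; the helper names are hypothetical.

```python
def payoff(votes_responsibly, j_others, n, b, a, c):
    """Assumed stage payoff when j_others co-players vote vR."""
    j = j_others + (1 if votes_responsibly else 0)
    return j / n * b + ((a - c) if votes_responsibly else 0.0)


def responsible_vote_is_dominant(n, b, a, c):
    """True if vR is at least as good as vS whatever the co-players do."""
    return all(
        payoff(True, j, n, b, a, c) >= payoff(False, j, n, b, a, c)
        for j in range(n)
    )


def pd_interval(n, b, a):
    """(lower, upper) bounds on c for the PD region: b/n + a < c < b + a."""
    return (b / n + a, b + a)


if __name__ == "__main__":
    # Illustrative parameters: b = 4, a = 1. The lower bound approaches a as n grows.
    for n in (2, 4, 10, 1000):
        print(n, pd_interval(n, 4.0, 1.0))
```

The gain from switching to vR is b/n + a − c regardless of what the others do, which is why the check over all j reduces to a single threshold and why the lower bound of the PD region tends to a for large n.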

[Figure: regions of equilibria by cost of voting responsibly (low cost, intermediate cost, high cost).]

Discussion and Possible Extensions to Find Solution to the Vote with the Wallet PD
Before looking at formal solutions to the multiplayer PD, we shortly discuss in this section how the dilemma can be practically solved.
A first obvious and simple solution is lowering as much as possible the extra cost of the responsible vote. This is what occurs in two of the four examples we made in the introduction: SR investment funds, if the universe of investable funds is large enough to eliminate the cost of missed diversification, and Oxfam's Behind the Brands campaign, where some of the proposed actions (such as posting a tweet or a Facebook message to a company) involve no purchase and no extra economic or opportunity cost. (The literature highlights that managers of SR investment funds voting with the wallet have three potential additional costs vis-à-vis managers of conventional investment funds: costs of acquiring information on the SR stance of investable stocks, missed diversification opportunities due to the application of their exclusion criteria, and the cost of disinvesting when a stock enters the exclusion list. Theoretical analysis however shows that the second cost becomes negligible or null as long as the universe of investable stocks is large enough [39]. Empirical evidence confirms that risk adjusted returns of SR investment funds are not significantly different from those of conventional funds [40,41].)
A second type of solution is a government intervention that may facilitate lowering the extra cost in different ways (preferential access to public procurement according to the CSR stance of the bidders/characteristics of the product, ad hoc tax allowances such as green consumption taxes, etc.). If the government aims at providing some public goods, it may find this kind of intervention optimal in order to foster the production of these public goods in the market by companies that internalise the externalities. Some of these interventions are currently pursued by various institutions around the world. (The most relevant example is represented by feed-in tariff schemes for renewable energy adopted in 63 jurisdictions worldwide [42]. To mention other examples, in many countries dedicated outlets selling FT products have a preferential fiscal treatment and green consumption taxes create fiscal advantages for more environmentally responsible value chains. These fiscal advantages can be directly on consumer prices or, when on producer prices, can be transmitted on consumers prices depending on demand/supply elasticities thereby reducing c in our model. In addition to it, governments directly vote with the wallet for the responsible product giving preferential treatment to SR products in procurement rules (i.e., Green Public Procurement rules are a relevant example, for their application in the EU see http://ec.europa.eu/environment/gpp/index_en.htm).).
A third type of solution relies on how individual consumers may solve the coordination problem with their own bottom-up actions, given the extra cost c. A standard approach consists of applying the class of Folk theorems to the infinitely repeated game. Another interesting approach is the development of zero-determinant (ZD) strategies. ZD strategies are memory-one strategies (i.e., strategies in which a player's behaviour depends only on the actions taken in the previous period) unilaterally enforced by a single (focal) player who chooses a linear reaction to other players' behaviour as a strategy. The literature in this respect documents that the action of the focal player is more important than what may be intuitively thought [43]. The focal player adopting a ZD strategy can set a linear relationship between her payoff and her co-players' average payoff. Stewart and Plotkin (2013) [44] demonstrate that generous ZD strategies have strong power to make a population evolve toward cooperation.
A fourth type of solution concerns the action of institutions that may organise coalitions of players that represent an important share of the market in order to enforce mutual cooperation [43].
In what follows, we discuss these last two types of solutions by providing a contribution on how they can enforce mutual responsible voting equilibria in the repeated game and give rise to new practical solutions of the multiplayer PD.

The Repeated Multiplayer Game
Suppose now the multiplayer game is repeated for T stages. At any stage t (t = 1, . . . , T), we have n players and each player i chooses an action $S_i \in \{vR, vS\}$ and obtains a payoff $U_i$. Now $G := (N, (S_i)_{i \in N}, (U_i)_{i \in N})$ represents each stage of the game.
We know from the Folk theorem (see, among others, [45]) literature that we can overcome the PD (i.e., we can solve the inefficiency of the NE) if and only if we play the game an infinite number of times as in doing so we can reach every feasible and enforceable payoff-and in particular mutual responsible voting-as a NE.

A Folk Theorem for the Vote with the Wallet
Suppose each player adopts a grim strategy, that is, each player plays vR as long as all her co-players do so and, if a deviation occurs, each player plays vS forever. In the static game this strategy profile is not an equilibrium, since each player is better off by voting with the wallet for the conventional product.
However, when the game is repeated infinitely many times (as is well known, the assumption of an infinite number of game rounds is not necessarily unrealistic, since it may simply mean that players do not know when the game ends, and the discount rate may be assumed higher the closer the expected termination of the game), if one player votes for conventional products (while the other players keep voting responsibly), then her payoff at the first stage increases by $\frac{n-1}{n} b - (b + a - c)$, but is reduced at later stages by $b + a - c$. Alternatively, in case of mutual responsible voting each player obtains $b + a - c$ discounted at each stage by $1 - \delta$, where δ is the discount rate that represents the level of patience (the lower δ, the more patient are the individuals). This problem can be written as

$$\sum_{t=0}^{\infty} (1 - \delta)^{t}\, (b + a - c) \ \ge\ \frac{n-1}{n}\, b.$$

Hence, the discount rate that ensures the mutual responsible voting equilibrium is (see Appendix A for details)

$$\delta \ \le\ \frac{n}{n-1}\left(1 - \frac{c - a}{b}\right). \tag{5}$$

We can define the factor $\frac{c-a}{b}$ in (5) as the standardised cost of responsible voting with the wallet, that is, the net cost of voting responsibly (extra cost from purchasing the responsible product minus the other-regarding preference benefit) as a proportion of the responsible voting benefit b. Inequality (5) suggests that a higher standardised cost of responsible voting with the wallet requires a higher level of players' patience to make the mutual responsible voting equilibrium enforceable with a grim strategy in the infinitely repeated game. Likewise, a higher number of players requires a higher degree of patience. This is because, as n increases, the one-stage gain that a player can obtain by switching to conventional voting also increases, since it is $-\frac{1}{n} b - a + c$. Note as well that in the PD area $\frac{n}{n-1}\left(1 - \frac{c-a}{b}\right) \in (0, 1)$ if and only if $n > \frac{b}{c-a}$, which ensures reasonable discount rates for sufficiently large n.
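The grim-strategy threshold can be evaluated with a few lines of code. This is a sketch under the convention described in the text, which is taken here as an assumption: stage payoffs are weighted by (1 − δ)^t, mutual responsible voting pays b + a − c per stage, and a deviator earns (n − 1)/n · b once and zero afterwards. All function names are illustrative.

```python
def cooperation_value(delta, b, a, c):
    """Discounted value of mutual vR: sum over t >= 0 of (1 - delta)^t (b + a - c).

    The geometric series equals (b + a - c) / delta for 0 < delta <= 1.
    """
    return (b + a - c) / delta


def deviation_value(n, b):
    """One-off payoff from voting vS while the n - 1 co-players vote vR,
    followed by zero forever once the grim punishment starts."""
    return (n - 1) / n * b


def grim_sustains_cooperation(delta, n, b, a, c):
    """True if no player gains by deviating from mutual vR."""
    return cooperation_value(delta, b, a, c) >= deviation_value(n, b)


def max_discount_rate(n, b, a, c):
    """Largest delta compatible with cooperation: n/(n-1) * (1 - (c - a)/b)."""
    return n / (n - 1) * (1 - (c - a) / b)


if __name__ == "__main__":
    # Illustrative parameters in the PD area: b = 4, a = 1, c = 3.
    for n in (4, 10, 100):
        print(n, round(max_discount_rate(n, 4.0, 1.0, 3.0), 4))
```

In this parametrisation the admissible discount rate shrinks as n grows or as the standardised cost (c − a)/b rises, which is the sense in which more players or a higher net cost call for more patience.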

Evolutionary Strategies and Mutual Responsible Voting
Originally introduced in 1973 by Smith and Price [46] to analyse the evolution of populations in biology, evolutionary game theory has been gaining popularity among economists. While standard game theory assumes individuals behave rationally, evolutionary game theory allows for different individual behaviour such as the adoption of pre-determined strategies that can be adapted based on players' experience and history of the game. Following [43], our analysis will focus on different strategies that an individual, namely the focal player, can adopt to enforce mutual responsible voting in the repeated vote with the wallet game.
First, we define a memory-one strategy adopted by player i as the 2n-dimensional vector

$$p = (p_{vR,n-1}, \ldots, p_{vR,0},\ p_{vS,n-1}, \ldots, p_{vS,0}),$$

where $p_{S_i,j} \in [0, 1]$ denotes the probability to vote responsibly in the next stage provided that in the previous stage player i played the strategy $S_i \in \{vR, vS\}$ and j co-players voted responsibly (more formally, a memory-one strategy should be defined together with an initial strategy. However, in our setting the initial strategy does not influence the final outcome, since the game is repeated infinitely many times and the strategy profile will converge to a NE regardless of the initial strategy [43]). (Since the game is assumed to be symmetric (i.e., the payoff does not depend on who is the responsible voter, but on the number of responsible voters only), memory-one strategies do not depend on player i and accordingly we drop index i for the sake of notation.) Then, we say that a zero-determinant (ZD) strategy is a memory-one strategy of the form

$$p = p^{R} + \phi\left[(1 - \sigma)\,(l\,\mathbf{1} - g^{i}) + g^{i} - g^{-i}\right], \tag{6}$$

where $p^{R}$ denotes player i's strategy played in the previous stage, $g^{i} := (g^{i}_{S_i,j})$ and $g^{-i} := (g^{-i}_{S_i,j})$ are the 2n-dimensional vectors of possible payoffs for player i and average payoff of i's co-players respectively, (Following [43], in our game the average payoff of i's co-players is defined . Hence, the vectors $g^{i}$ and $g^{-i}$ can be written as , . . . , $\frac{n-1}{n} b + a - c$) respectively.) $\mathbf{1}$ denotes the 2n-dimensional vector of ones, φ is a payoff parameter inversely related to σ (more precisely, $\sigma = \alpha / \phi$ for any $\alpha \in \mathbb{R}$, and therefore we require $\phi \neq 0$), σ is the strategy slope, and l is the baseline payoff of the ZD strategy (please note that we set $l = b + a - c$, since we look at those strategies that allow for mutual responsible voting).
From [47] (in particular, following the notation in [43], we apply ([47], Theorem 1.3) to our voting with the wallet game), we know that player i who applies a ZD strategy of the form (6) can enforce the following payoff relation

$$G^{-i} = \sigma\, G^{i} + (1 - \sigma)\, l,$$

where $G^{i}$ is player i's payoff in the repeated game and $G^{-i}$ is the average payoff of i's co-players in the repeated game. ([43] defines player i's payoff in the repeated game as $G^{i} := \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} U_i(t)$, where $U_i(t)$ is player i's payoff at stage t, and the average payoff of i's co-players in the repeated game as $G^{-i} := \sum_{k \neq i} \frac{1}{n-1} G^{k}$. It can be easily shown that we have $G^{i} = g^{i} \cdot v$ and $G^{-i} = g^{-i} \cdot v$, where $v := \lim_{t \to \infty} v(t)$ is the limit point of the 2n-dimensional vector $v(t) := (v_{S_i,j}(t))$ and $v_{S_i,j}(t)$ denotes the probability that at stage t the focal player i plays $S_i \in \{vR, vS\}$ and j of i's co-players vote responsibly.) Thus, the strategy slope σ captures the variation of the co-players' average payoff $G^{-i}$ as player i's payoff $G^{i}$ varies, and the baseline payoff is the payoff obtained by each player when all play the same strategy. When σ < 1 we say that the strategy is generous, and when σ = 1 we say that the strategy is fair [43].
In what follows we show how the PD can be overcome with two strategies, a pure memory-one Pavlov strategy and the ZD proportional Tit-for-tat strategy, which respectively extend the notion of Pavlov and Tit-for-tat strategies in standard game theory.

The Pavlov Strategy
The Pavlov strategy is a memory-one strategy that can be represented by the 2n-dimensional vector $p^{Pav} = (1, 0, \ldots, 0, 1)$, that is, the focal player who applies a Pavlov strategy will vote responsibly only after mutual responsible voting or after mutual conventional voting. The intuition is that players will continue to vote responsibly if all the other players do the same, and they will also vote responsibly after all players voted for conventional products in the previous stage, giving a new opportunity for a mutual responsible voting equilibrium. The Pavlov strategy therefore differs from the grim strategy defined in Section 3.1 in this new opportunity for cooperation, given despite other players' defection in the previous round.
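The 'new opportunity' offered by Pavlov can be seen in a small deterministic simulation. The sketch below assumes the 2n entries of the strategy vector are ordered as (own vote vR with n−1, ..., 0 responsible co-players; then own vote vS with n−1, ..., 0 responsible co-players), which is consistent with the vector (1, 0, ..., 0, 1) above; the function names are illustrative.

```python
def pavlov_vector(n):
    """Pavlov as a 2n-vector: vote vR only after unanimous vR or unanimous vS."""
    p = [0] * (2 * n)
    p[0] = 1    # own vote vR and all n - 1 co-players voted vR
    p[-1] = 1   # own vote vS and no co-player voted vR
    return p


def pavlov_move(own_vR, j_coplayers_vR, n):
    """Next vote (True = vR) given own last vote and the number of vR co-players."""
    block = 0 if own_vR else n                # vR entries first, then vS entries
    index = block + (n - 1 - j_coplayers_vR)  # j runs from n - 1 down to 0
    return pavlov_vector(n)[index] == 1


def step(profile):
    """Advance one stage when every player uses Pavlov; profile is a list of bools."""
    n = len(profile)
    total = sum(profile)
    return [pavlov_move(own, total - own, n) for own in profile]


def simulate(profile, rounds):
    """Return the list of profiles visited, starting from the given one."""
    history = [list(profile)]
    for _ in range(rounds):
        history.append(step(history[-1]))
    return history


if __name__ == "__main__":
    # One player votes vS by mistake; everyone else votes vR.
    for stage in simulate([False, True, True, True], 3):
        print(["vR" if vote else "vS" for vote in stage])
```

After a single deviation the whole group votes conventionally for one stage and then returns to mutual responsible voting, whereas under the grim strategy the group would stay at mutual conventional voting forever.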
By applying [43] necessary and sufficient conditions for a memory-one strategy to be a NE, we have for our voting with the wallet game that

Proposition 1 (Pavlov strategy conditions allowing for mutual responsible voting). Mutual responsible voting with the Pavlov strategy is a NE if and only if
Proposition 1 tells us that mutual responsible voting with the Pavlov strategy is always a NE if the cost is sufficiently small (Figure 3). When the cost is too low ($c \le \tfrac{1}{n} b + a - c$), this is not surprising, since mutual responsible voting was already an (efficient) NE. Then, what the Pavlov strategy solves vis-à-vis the standard multiplayer game described in Section 2.1 is the PD for $\tfrac{1}{n} b + a < c < \tfrac{1}{2} b + a - c$, that is, the case in which the cost of voting responsibly is the lowest within the intermediate area of PD (Figure 3). However, at the right side of the segment, where the cost c is too high, the Pavlov strategy is unable to prevent mutual conventional voting from being a NE (please note that in Proposition 1 the number of players n is a natural number, so that in the high differential cost area we cannot have n smaller than 1 and therefore mutual responsible voting is not a NE).

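[Figure 3: cost regions under the Pavlov strategy (low cost, intermediate cost, high cost), with the threshold $\tfrac{1}{n} b + a$ marked on the cost axis; in the high cost region mutual vR is not a NE.]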

The Proportional Tit-for-Tat Strategy
To apply the proportional tit-for-tat strategy we need to make explicit some properties that characterise ZD strategies allowing for stable cooperation (i.e., buying the responsible product) to be a NE of the vote with the wallet game.

Proposition 2.
Consider the vote with the wallet game described above. If the focal player sets the baseline payoff l = b + a − c and applies a ZD strategy with parameter then mutual responsible voting is a NE.

Proof. See Appendix A.
Proposition 2 links the power of the focal player to enforce a linear relationship between her payoff and the average payoff of the other players with the Nash equilibrium of the game. The assumption on l requires the baseline payoff to be the maximum payoff $b + a - c$. Intuitively, and in terms of efficiency, we want the group payoff to be maximised and equal to $b + a - c$ when all players play the same strategy.
The condition on σ concerns the fairness of the focal player, because it requires the slope to be very close (but not equal) to one. This means that the focal player should adopt a strategy that gives a slightly higher average payoff to her co-players. From Section 3.2 we know that a slope equal to one corresponds to a fair strategy. Hence, the focal player's strategy in Proposition 2 can be considered generous, but not too generous. Another implication of this proposition is that the focal player will be more generous as long as the number of players is low (σ is decreasing in n), while generosity is 'inefficient' in bringing other players toward the responsible voting strategy when the number of players is too high.
The proportional tit-for-tat strategy is an example of a ZD strategy and can be viewed as a generalised version of the tit-for-tat strategy. When the proportional tit-for-tat is played, the probability of voting responsibly is given by the proportion of responsible voters among co-players in the previous round. More formally, the proportional tit-for-tat (pTFT) strategy is a mixed strategy that can be represented by the following 2n-dimensional vector: $p = \left(\frac{n-1}{n-1}, \frac{n-2}{n-1}, \ldots, 0, \frac{n-1}{n-1}, \frac{n-2}{n-1}, \ldots, 0\right)$, and the probability of cooperating at stage t = 1 is equal to one. When the number of players is equal to 2, pTFT becomes the standard TFT strategy used in Section 3.1 in the Folk theorem. The pTFT strategy is a ZD strategy (in Appendix A we show that it can be obtained from Equation (6) by setting $\sigma = 1$ and $\varphi = \frac{1}{c-a}$). In particular, it is a fair strategy. Therefore, from Proposition 2, the pTFT is a NE for the responsible voting with the wallet game.
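The claim that pTFT is a ZD strategy with $\sigma = 1$ and $\varphi = \frac{1}{c-a}$ can be checked numerically. The sketch below is illustrative only and rests on labelled assumptions: stage payoffs follow the Appendix A proof of Equation (4) (a responsible voter earns $\frac{m}{n} b + a - c$ and a conventional voter $\frac{m}{n} b$ when m players vote responsibly), the ordering of the 2n entries (own last move responsible first, then conventional, each from n − 1 down to 0 responsible co-players) is a convention chosen here, and all function names are hypothetical.

```python
from fractions import Fraction


def payoff(responsible, m, n, b, a, c):
    """Stage payoff when m of the n players (including this one) vote responsibly."""
    public = Fraction(m, n) * b
    return public + a - c if responsible else public


def ptft_vector(n):
    """pTFT: vote responsibly with probability j/(n-1), where j is the number
    of co-players who voted responsibly in the previous round."""
    probs = [Fraction(j, n - 1) for j in range(n - 1, -1, -1)]
    return probs + probs  # same probabilities whatever the player's own last move


def zd_vector(n, b, a, c):
    """Evaluate p = p_R + (g_i - g_minus_i) / (c - a) state by state."""
    b, a, c = Fraction(b), Fraction(a), Fraction(c)
    phi = 1 / (c - a)  # requires c != a
    out = []
    for own in (True, False):            # own last move: responsible, conventional
        for j in range(n - 1, -1, -1):   # responsible co-players last round
            m = j + (1 if own else 0)
            g_i = payoff(own, m, n, b, a, c)
            g_others = (j * payoff(True, m, n, b, a, c)
                        + (n - 1 - j) * payoff(False, m, n, b, a, c)) / (n - 1)
            p_repeat = 1 if own else 0   # "repeat your last move" component
            out.append(p_repeat + phi * (g_i - g_others))
    return out


if __name__ == "__main__":
    print(ptft_vector(2))
    print(zd_vector(5, 1, Fraction(1, 10), Fraction(1, 2)) == ptft_vector(5))
```

Exact rational arithmetic is used so that the comparison between the two vectors is not affected by floating-point rounding.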

The Power of Coordination
Applications of the Folk theorem in the multiperiod game where players adopt grim strategies, as well as the Pavlov and pTFT strategies enacted by individual players in evolutionary games, significantly restrict the area of the PD. However, three considerations arise in real-world scenarios: (i) Folk theorems are difficult to enforce with a large number of players and a non-infinite number of rounds, due to the well-known endgame problems; (ii) the time needed to reach the mutual responsible voting equilibrium in evolutionary games may be too long; (iii) individuals may have greater power to enforce the mutual responsible voting equilibrium if they act as coalitions, like labor associations, labor unions, or political parties.
Coalitions are particularly useful since we have shown that the power of strategies enacted by individual players to enforce mutual responsible voting becomes much weaker as the number of players increases (Section 2.1).
Suppose a coalition is composed of $k_{Coal}$ members, with $1 \le k_{Coal} < n$, who are able to set a strategy p to be played during the game. By applying the results of [43] on strategy alliances, we find that a coalition composed of $k_{Coal}$ members can enforce mutual responsible voting if and only if the coalition adopts either a fair strategy, or a generous strategy satisfying condition (9) (proof in Appendix A). We have shown in Proposition 2 that individuals can enforce mutual responsible voting only with strategies that are generous but not too generous. On the other hand, (9) shows that the larger the coalition (high $k_{Coal}$), the more generous the strategy can be (low σ) (as we have seen above, one example of a ZD strategy a coalition can apply is the proportional TFT).

The Coalition of the Willing and the Paradox of Altruism
Here we provide a more intuitive example illustrating how a coalition can work. Suppose there exists a share $\pi^*$ of non-income-constrained individuals with other-regarding preferences of the form $a^*$, where $a^* > c - \pi^* b$ or, equivalently, $\pi^* > \frac{c - a^*}{b}$. These individuals would nonetheless vote with the wallet for the responsible product if they could coordinate and form a coalition of the willing which synchronises their voting choices. Based on the above inequality, $\pi_{min} = \frac{c - a^*}{b}$ can be defined as the minimum coalition threshold for the responsible vote with the wallet required by individuals with other-regarding preferences greater than or equal to $a^*$. Figure 4 shows the three-dimensional $(\pi, a^*, c)$ threshold of feasible parameters for a 'coalition of the willing' when conveniently normalising b = 1. The plane c = 0 always belongs to our set because it corresponds to the situation in which the cost of responsible voting is null, and therefore the benefit for each individual of the coalition satisfies $b + a^* \ge b$ for all $a^*$. Hence, when c = 0, we have $\pi_{min} = 0$. However, when c increases, the other-regarding preference coefficient $a^*$ also needs to increase in order to keep $\pi_{min} = 0$ (segment OC). This may be interpreted as follows: the higher the cost of voting responsibly, the higher the other-regarding preferences of individuals must be to ensure compatibility with the lowest coalition threshold. On the other hand, the coalition of the willing must be larger with a higher cost c in order to convince individuals with other-regarding preferences to vote responsibly. As an extreme case, when the cost is too high and other-regarding preferences are too small, $\pi_{min} = 1$, implying that there will always be at least one individual for whom buying the conventional product is the best strategy (segment AB).
Note as well that when all players are willing to vote responsibly and every individual has a preference to vote responsibly equal to the benefit of the public good component (the numeraire b), then the extra cost can be up to twice the numeraire b (point B).
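The coalition threshold above lends itself to a small numerical illustration. The sketch below is not taken from the paper: the function names are hypothetical, and flooring the threshold at zero is an assumption made here so that the result can be read as a population share (the text reports $\pi_{min} = 0$ on the plane c = 0).

```python
def pi_min(c, a_star, b=1.0):
    """Minimum coalition share (c - a*) / b, floored at zero (assumption)."""
    return max(0.0, (c - a_star) / b)


def willing_to_vote_responsibly(pi, c, a_star, b=1.0):
    """Condition from the text: a* > c - pi * b, for a coalition share pi."""
    return a_star > c - pi * b


if __name__ == "__main__":
    # Point B in the text: all players willing (pi = 1), a* = b, cost up to 2b.
    print(pi_min(c=2.0, a_star=1.0, b=1.0))
    # A zero-cost case: any coalition share is enough.
    print(pi_min(c=0.0, a_star=0.3))
    print(willing_to_vote_responsibly(pi=0.6, c=0.5, a_star=0.1))
```

With b normalised to 1, a cost of 2 and $a^* = 1$ give a threshold of 1, which matches the extreme case described for point B.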
Suppose now that an organisation can form a coalition of the willing so that a share $\pi^* > \pi_{min}$ of altruists (this is a necessary condition to make the responsible voting with the wallet strategy incentive compatible) reveal their strategy $p_{t_1} = (1, 1, 1, 1)$ at $t_1$ (regardless of the outcome at $t_0$). For instance, the strategy can be revealed through a cash mob (cash mobs are media (video-recorded) events where an organised group of people gathers in a retail outlet to buy a given product and intends to communicate its decision to the general public; for a reference on US cash mobs see http://cash-mobs.com/), where the coalition plays the responsible voting with the wallet strategy and announces its strategy for the future. After the strategy is revealed, the remaining players will however vote for the standard product if $c < (1 - \pi) b + a^*$, or because they fall into the PD. Assuming that the PD condition holds, in order to avoid the PD the coalition of the willing needs to announce its strategy at $t_1$ as $p_{t_1} = (1, 0, 1, 0)$, which consists in punishing the free riders at period $t_1$ by not voting with the wallet responsibly.
Given the coalition of the willing's strategy, myopic self-interested individuals who buy the conventional product obtain a benefit in $t_1$ (the temptation). The potential loss for out-of-coalition individuals (the punishment) occurs when members of the coalition of the willing deviate from their responsible voting strategy, and it is discounted at δ, the discount rate measuring players' patience. The potential conventional buyers will vote responsibly if the punishment is higher than the temptation, which is the condition stated in inequality (10). Inequality (10) outlines an altruism paradox since, ceteris paribus, in a finite (two-period) number of rounds a larger coalition (generated by a higher number of individuals with sufficiently strong other-regarding preferences) increases the propensity to free ride, given the specific characteristics of the vote with the wallet game. This is because, as is clear from inequality (10), with a higher π the marginal benefit of free riding (buying the conventional product) is higher than the marginal cost.
To analyse the effect of a coalition on players' patience, we elaborate a Folk theorem in the presence of coalition action. Suppose a coalition of k voters votes responsibly at each stage regardless of the other players' strategies. In other words, we are now assuming a coalition of players who decide to vote responsibly even if the other players vote for the conventional product. Each voter then solves the corresponding optimisation problem. Hence, within the PD area, when the paradox of altruism holds, the patience parameter δ is higher than the patience parameter measured in the absence of a coalition (the altruism paradox does not imply that cash mobs and coalitions are not useful since, when the other-regarding preferences of coalition members are not high enough, they have the effect of triggering the vote with the wallet of coalition members, who would have voted for the standard product if playing the game without any coalition).

Renegotiation Proofness
We now ask whether the strategies described above to enforce the mutual responsible voting equilibrium are renegotiation proof. The cost of punishing for the punisher (that is, what she loses by executing the punishment) is $b + a - c$ if the alternative is full coordination, or $\pi^* b + a - c$ if the alternative is partial coordination. Hence the strategy is renegotiation proof if we reasonably assume that it is not possible to force free riders to play cooperatively (i.e., to buy the responsible product) at time t + 1 after they free-rode at time t, and if $b + a > c > \pi^* b + a$, where $\pi^*$ is the share of punishers. Under this condition the tit-for-tat strategy announced by the coalition of the willing is renegotiation proof, that is, punishers have no interest in renegotiating the strategy after the violation by free riders and before the punishment for that violation is enacted, since the PD area in the multiplayer PD is such that $\frac{1}{n} b + a < c < b + a$.
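The two inequalities in this paragraph can be turned into simple checks. The following sketch only restates them in code under the paper's notation; the function names are hypothetical and the numerical values are arbitrary illustrations, not calibrated parameters.

```python
def in_pd_area(b, a, c, n):
    """PD area of the multiplayer game as stated in the text: b/n + a < c < b + a."""
    return b / n + a < c < b + a


def renegotiation_proof(b, a, c, pi_star):
    """Condition from the text: b + a > c > pi* b + a, with pi* the share of punishers."""
    return b + a > c > pi_star * b + a


if __name__ == "__main__":
    b, a, c, n = 1.0, 0.1, 0.8, 10
    print(in_pd_area(b, a, c, n))
    print(renegotiation_proof(b, a, c, pi_star=0.5))
    print(renegotiation_proof(b, a, c, pi_star=0.9))
```

In this illustration the threat is renegotiation proof for a share of punishers of 0.5 but not for 0.9, since a larger share of punishers raises the lower bound $\pi^* b + a$ above the cost c.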

The Optimal Cash Mob
Based on what we observed above about the power of coalitions and the paradox of altruism in the vote with the wallet game, we outline the characteristics of a bottom-up mechanism design which can lead to the mutual responsible voting equilibrium in the infinitely repeated game.
A coalition of the willing may reveal its existence and strategy by organising a cash mob able to:
1. Announce the number of coalition members k;
2. Communicate to the general public the crucial parameters of the game and, more specifically, b, c and n;
3. Communicate the 'permanent' commitment of coalition members to play the responsible strategy (to avoid the paradox of altruism documented in Section 4.1) by subscribing to a pre-authorised debit for the purchase of the SR product, which is automatically renewed in the absence of a cancellation notice;
4. Define the 'threat', that is, the commitment of coalition members to buy the conventional product if other players in the game do not buy the responsible product. As shown above in Section 4.2, the threat is renegotiation proof as long as parameters are within the PD area, since $c > \frac{1}{n} b + a$.
The cash mob plays the role of a reinforced signal. It is more effective than a press conference because the public announcement works together with a credible commitment to enact the strategy. The success of the cash mob depends on the rationality of the non-coalition players and their agreement on the model parameters (the benefit b and the cost c), which is ensured ex ante by assumption in the theoretical model but not in reality. (As stated in the introduction, the perception of b is easier in some specific dimensions of corporate responsibility. For instance, a rise in corporate fiscal responsibility should produce a clearly identifiable increase in domestic fiscal revenues and therefore in resources available for local public goods.)

Discussion
The vote with the wallet game described in this paper postulates that solutions to the related PD rely on four fundamental factors: (i) the non-rivalrous and non-excludable positive effect on consumers induced by the corporate move toward higher CSR, stimulated by the responsible vote; (ii) players' other-regarding preferences; (iii) the extra purchasing cost generated by the responsible vote with the wallet; and (iv) the share of individuals voting responsibly, which acts as a weight for the societal benefit.
The findings and their discussion allow us to compare the mechanism designs devised above with real-life experiences, such as cash mobs and the behaviour of sustainable purchasing groups. These experiences speak to the public policy debate in two directions. First, they address the paradox of altruism, suggesting how a mechanism may strengthen the capacity of cash mobs to enforce the mutual responsible voting equilibrium. Second, they work on both the extra cost and the coordination problem. Many existing voting with the wallet experiences do try to reduce the cost of the responsible vote and enhance coordination among responsible voters. In doing so, these experiences reduce the PD inefficiency.
This paper illustrates theoretical insights that sustainability policies should take into account when implementing tax/subsidy mechanisms and designing behavioural policies. Our paper offers future policies a tool to target markets where responsible consumption may be more easily achieved. Governments aiming at implementing sustainable consumption may use our vote with the wallet game to identify the following market characteristics: positive extra cost, easily identifiable societal benefit, high number of consumers, and consumers' expected responsiveness. These characteristics are all linked according to what our model predicts and what the mechanism suggests.

Conclusions
Consumers' willingness to pay and the revealed preferences implicit in the non-negligible market shares of SR consumption and savings document that the vote with the wallet is becoming an increasingly relevant feature of contemporary economies. The growth of FT products and SR investment funds documents that non-price demand elasticity (where consumers consider CSR as one of the product characteristics on which they base their choices) plays an important role.
Our paper deals with these novel features of contemporary markets by focusing on the game theoretical demand side characteristics of the vote with the wallet. The paper also investigates the embedded multiplayer PD generated by the public good features of the benefits produced by the vote for the responsible product.
We analyse the vote with the wallet game, its associated PD, and possible solutions, as this game is actually played by millions of consumers. First, we document that the area of the dilemma widens as the number of players increases, as is standard in global consumer markets. Second, we apply the Folk theorem and outline the conditions allowing the PD inefficiency to be overcome with grim strategies. We then show that two crucial parameters affect the threshold patience level, namely the number of players and the standardised cost of responsible voting (i.e., the net cost of responsible voting as a proportion of the responsible voting benefit). Third, we investigate how Pavlov and pTFT strategies enacted by individual players in evolutionary games may lead to the mutual responsible voting equilibrium. Finally, we show that the formation of stable coalitions of responsible players may lead to a larger positive externality generated by higher CSR, but may fall into the paradox of altruism by increasing other players' propensity to free ride.
The findings presented in this paper through theorems and qualitative mechanisms may inform the policy debate on responsible consumerism. People do actually vote with the wallet, and what we model is the link between key market characteristics, that is, the extra cost and the societal benefit of responsible products, consumers' other-regarding preferences, the number of consumers, and the role of timing and coalitions. Future research can extend our theoretical findings with behavioural components that may act as an additional stimulus to reduce the PD area, as well as with empirical evidence showing the link between market components and firms' CSR.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Proof of Equation (4). We want to show that if $\frac{1}{n} b + a > c$, mutual responsible voting is the unique NE of the game, and if $\frac{1}{n} b + a < c$, then mutual conventional voting (buying the conventional product) is the unique NE of the game (the extensive form game is represented in Figure A1).
Without loss of generality, assume that there are n − k responsible voters and k conventional voters, and analyse whether this profile is a Nash equilibrium. If $\frac{1}{n} b + a > c$ (respectively, if $\frac{1}{n} b + a < c$), then each responsible voter (respectively, conventional voter) has no profitable deviation, while each conventional voter (respectively, responsible voter) always has a profitable deviation, since $\frac{n-k+1}{n} b + a - c > \frac{n-k}{n} b$ (respectively, $\frac{n-k+1}{n} b + a - c < \frac{n-k}{n} b$). Then the unique equilibrium is mutual responsible voting (i.e., k = 0) if and only if $\frac{1}{n} b + a > c$, and mutual conventional voting (i.e., k = n) otherwise.

Proof of Equation (5). The result follows by solving for δ.
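The deviation argument in the proof of Equation (4) can be verified by brute force. The sketch below is illustrative: it assumes the stage payoffs used in that proof (a responsible voter earns $\frac{m}{n} b + a - c$ and a conventional voter $\frac{m}{n} b$ when m players vote responsibly), exploits the symmetry of the game by indexing profiles by m only, and uses hypothetical function names.

```python
from fractions import Fraction


def payoff(responsible, m, n, b, a, c):
    """Stage payoff when m of the n players (including this one) vote responsibly."""
    public = Fraction(m, n) * b
    return public + a - c if responsible else public


def nash_profiles(n, b, a, c):
    """Return every m (number of responsible voters) at which no player
    strictly gains from a unilateral switch."""
    b, a, c = Fraction(b), Fraction(a), Fraction(c)
    stable = []
    for m in range(n + 1):
        # A responsible voter who switches lowers the count to m - 1.
        responsible_stay = (m == 0 or
                            payoff(True, m, n, b, a, c) >= payoff(False, m - 1, n, b, a, c))
        # A conventional voter who switches raises the count to m + 1.
        conventional_stay = (m == n or
                             payoff(False, m, n, b, a, c) >= payoff(True, m + 1, n, b, a, c))
        if responsible_stay and conventional_stay:
            stable.append(m)
    return stable


if __name__ == "__main__":
    print(nash_profiles(5, 1, 0, Fraction(1, 10)))  # b/n + a > c
    print(nash_profiles(5, 1, 0, Fraction(1, 2)))   # b/n + a < c
```

In the knife-edge case $\frac{1}{n} b + a = c$ every profile passes the check, because no switch changes the switching player's payoff; the proof excludes this case by using strict inequalities.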
Proof of Proposition 1. [43] (Supporting Information, Proposition 4) characterises all pure memory-one strategies that allow for mutual cooperation in a social dilemma. Hence, for the vote with the wallet game, a pure memory-one strategy allowing for mutual responsible voting must satisfy the following conditions: (i) $p_{VR,n-1} = 1$, (ii) $p_{VR,n-2} = 0$, (iii) $p_{A,1} \le \ldots$, since $b + a - c - \frac{1}{n} \ge 0$ and $\frac{n-1}{n} b - (b + a - c) \ge 0$. Lastly, for condition (iv) we have ... and, since n is a natural number and $\frac{b}{2(c-a)-b} < 0$ if $c < \frac{b}{2} + a$, then we have that ...

Proof (pTFT is a ZD strategy). From Equation (6), setting $\varphi = \frac{1}{c-a}$ and $\sigma = 1$, we have that $p = p_R + \frac{1}{c-a} (g_i - g_{-i})$ writes ...

Proof of Proposition 2. We apply [43] (Supporting Information, Proposition 3) to our game, and we assume $\sigma \ge \frac{n-2}{n-1}$ and $l = b + a - c$. By contradiction, we also assume that the ZD strategy $(l, \sigma)$ is not a Nash equilibrium. Then there exists (at least) a player i who strictly prefers to deviate from conventional voting, and who obtains a payoff $G_i > b + a - c$. By Equation (7), we have that the (at most) n − 2 ...

Figure A1. The extensive form of the vote with the wallet game.