Payoffs and Coherence of a Quantum Two-Player Game in a Thermal Environment

A two-player quantum game is considered in the presence of thermal decoherence modeled in terms of a rigorous Davies approach. It is shown how energy dissipation and pure decoherence affect the payoffs of the players in the quantum version of the prisoner's dilemma. The impact of the thermal environment on the coherence of the game, treated as a quantum system, is also presented.


Introduction
We believe that there is but one fundamental framework for all physical phenomena: quantum theory. Information processing is a fundamental physical phenomenon, and therefore, one cannot separate information theory from either applied or fundamental physics. Recently, attention to the quantum aspects of information processing has opened new perspectives in computation, cryptography, communication methods, etc. Numerous examples of game-like physical systems have been found that allow a quantum description and reveal advantages over the classical situation. However, does quantum mechanics offer more effective mechanisms of playing games? In game theory, one often has to consider mixed strategies, that is, probabilistic mixtures of "genuine" (pure) strategies [1][2][3]. In some cases, they can be intertwined in more subtle ways by exploiting interference or entanglement [4][5][6]. There are situations in which it can be assumed that quantum theory can enlarge the set of possible strategies in a sensible way. Technically, this is a quite nontrivial problem, because quantum systems are usually unstable, and their preparation and maintenance might be difficult to implement due to the practically unavoidable decoherence, i.e., the destructive interactions with the environment [7,8]. Actually, quantum formalism can be used in game theory in a more abstract way without any reference to physical quantum states; decoherence is not a problem in such cases [9][10][11][12][13][14], but this is not our objective in this publication. The real question is whether genuine quantum games are of any practical value. Some commercial cryptographic and communication methods/products are already available on the market. Therefore, in some sense, the answer is positive. Here, we aim to analyze the fundamental and unfortunately complex problem of the resistance of a quantum game to distortion caused by thermal noise. The influence of the environment on the course of a quantum game has already been discussed, among others, in [15][16][17][18]. Our approach differs in various aspects since, instead of using a phenomenological approach, we adopt the Davies approximation [19], which is derived for a very general class of open systems and is valid under conditions that are well established both mathematically and physically. Moreover, Davies modeling of open systems is known to agree with thermodynamic predictions for non-equilibrium quantum systems [20].
Quantum game theory covers a broad spectrum of problems. Therefore, a general definition of a quantum game can be arbitrarily complex due to a variety of possible contexts. Here, by a quantum game, we understand a quantum system that can be manipulated by some parties and for which utilities of moves (payoffs) can be at least operationally defined. Although we use the concept of a quantum game in an abstract quantum theoretical sense (note that the implementation of quantum auctions seems to be feasible [21]), we assume that the system in question can be prepared with satisfactory accuracy and mathematically represented by a density operator on a corresponding finite-dimensional Hilbert space: we assume that the corresponding structures are definable and implementable. Nevertheless, we neglect the possible technical problems with the actual identification of the state, etc. We suppose that all players know the state of the game at all crucial stages of the actual game being played (we do not consider quantum games in Nature; in such cases, the agents might not even be aware of playing the game!). In addition, the implementation of a quantum game should include measuring apparatuses and information channels that provide the necessary information on the state of the game at all stages and that specify the moment and methods of its termination. We focus on the influence of the environment on the course of the game. Related problems have been discussed previously, among others in [15][16][17][18][22]. Our approach extends recent studies by a detailed analysis of the coherence of a quantum game affected by a thermal environment. We show that despite an unavoidable coherence loss, a quantum game preserves most of its unusual properties.
The paper is organized as follows. We begin with a brief presentation of the quantum game formalism. Then, we describe our approach to decoherence in quantum games, with emphasis on the payoffs of the players and the coherence of the state of the pair of qubits used in the game. Finally, we point out some problems that should be addressed in the near future.

Quantum Game
The general analysis of decoherence in a quantum game is a very hard task. Therefore, to obtain concrete results, we consider only some specific two-player quantum games; the generalization to the N-player case is straightforward [6]. We suppose that a two-player quantum game, mathematically represented by the tuple $\Gamma = (\mathcal{H}, \rho_i, S_A, S_B, P_A, P_B)$, is completely specified by the underlying Hilbert space $\mathcal{H}$ of the quantum system, the initial state given by the density matrix $\rho_i \in S(\mathcal{H})$, where $S(\mathcal{H})$ is the associated state space, the sets $S_A$ and $S_B$ of quantum operations representing moves (strategies) of the players and the payoff (utility) functions $P_A$ and $P_B$, which specify the payoff for each player after the final measurement performed on the final state $\rho_f$. A quantum strategy $s_A \in S_A$, $s_B \in S_B$ is a collection of admissible quantum operations, that is, mappings of the space of states onto itself. One usually supposes that they are completely positive trace-preserving maps. Schematically, we have: This scheme for a quantum two-player game can be implemented as a quantum map: where initially: describes the identical starting positions of Alice (A) and Bob (B). $J$ describes the process of the creation of entanglement in the system and $D$ the possible destructive noise effects, which will be neglected here. (The operation $J$ is introduced for technical reasons only; its use is often criticized, as a "quantum strategy" can be understood as any manipulation of the system in question, cf. [23].) The use of entanglement is one of the possible ways to utilize the power of quantum mechanics in quantum games. Note that we follow here the "scenario" put forward in [5]. This approach can be criticized as unduly restricting the setting and strategies [24,25], but it is widely used and convenient for our aims. One of the possibilities is that the states of the players are entangled by using the transformation: with: into an entangled state. Here, $I$ and $\sigma_x$ denote the identity operator and the Pauli matrix, respectively. The additional parameter $\gamma$ describes the possible destructive role of the environment (noise). Equation (3) has the following explicit matrix form for the initial state given by Equation (2) [26]: The individual strategies of the players $S_X$, $X = A$(lice), $B$(ob), are implemented as unitary transformations of the form: where the quantum strategy is realized by unitary transformations. For example, both $S_A$ and $S_B$ can have the general matrix form in the two-dimensional case [26]:
Noise affects virtually all physical systems. Quantum systems are particularly susceptible to the destructive influence of their environment. Dynamical decoherence resulting from "spontaneous monitoring" of the system by the environment leads to the suppression of interference. Therefore, any game that involves phenomena such as entanglement or interference in a fundamental way is doomed to decoherence from the start. The key issue is thus the analysis of the possible destructive ways in which decoherence can affect both the implementation process and the results of a quantum game. As mentioned earlier, we do not pretend to offer a complete analysis of this problem.
We focus on the famous prisoner's dilemma [27], which is widely used in the analysis of cooperation in various contexts and has attracted widespread and increasing attention in a variety of disciplines. The class of game theoretical problems known as the prisoner's dilemma was devised and discussed by Merrill Flood and Melvin Dresher in the 1950s, when they worked at the Rand Corporation on possible applications of game theory to nuclear weapon strategies. The title "prisoner's dilemma" and the idea of prison sentences as payoffs were put forward by Albert Tucker, who wanted to make Flood and Dresher's ideas more accessible to a wider audience. Tucker's story says that two rational prisoners, the actual players, have to decide without communication whether to cooperate or not. They might decide not to cooperate, even if it is obvious that they are better off if they do so; hence the term dilemma. Payoffs, the results of the game, can be calculated as an expectation value, weighted by certain game-dependent real numbers $a, b, c, d$ constituting the payoff matrix: Here, Bob and Alice are the players, who can adopt two strategies denoted by zero and one. In general, the strategies denoted by zero and one may consist of different "physical actions" for the players in question! The payoffs of Alice and Bob are given by ordered pairs $(x, y)$ corresponding to the strategies that they adopted. Such a general game reduces to the prisoner's dilemma provided that the parameters in the payoff Equation (8) fulfill the relation $c > a > b > d$ [27]. Its quantum extension is no less popular, at least as a theoretical tool, cf. [9,28]. It was formulated in the seminal paper [5] and later generalized to various types of network problems [29,30]. In the quantum version, the game ceases to be paradoxical for some classes of quantum strategies, but we should stress that the dilemma disappears due to a dramatic enlargement of the set of strategies for both agents. Therefore, the quantum prisoner's dilemma is quite a new game, which reduces to the classical one only if the strategy sets are properly restricted.
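The scheme described in this section (entangle, apply local strategies, disentangle, measure) can be illustrated by a short Python sketch for the noiseless case. The explicit forms of the entangling operation and of the three-parameter strategy are not reproduced in the text above, so the code assumes the forms commonly used in the scenario of [5]: $J(\gamma) = \cos(\gamma/2)\, I \otimes I + i \sin(\gamma/2)\, \sigma_x \otimes \sigma_x$ and a unitary $U(\theta, \alpha, \beta)$. The function names, the ordering of the tensor factors (Alice's qubit first) and the initial state $|00\rangle$ are likewise assumptions made for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)


def entangler(gamma):
    """Assumed entangling operation: cos(g/2) I(x)I + i sin(g/2) sx(x)sx."""
    return (np.cos(gamma / 2) * np.kron(I2, I2)
            + 1j * np.sin(gamma / 2) * np.kron(SX, SX))


def strategy(theta, alpha, beta):
    """Assumed three-parameter unitary strategy U(theta, alpha, beta)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [np.exp(1j * alpha) * c, 1j * np.exp(1j * beta) * s],
        [1j * np.exp(-1j * beta) * s, np.exp(-1j * alpha) * c],
    ])


def outcome_probabilities(s_a, s_b, gamma):
    """Probabilities of measuring |00>, |01>, |10>, |11> without noise."""
    J = entangler(gamma)
    psi0 = np.zeros(4, dtype=complex)
    psi0[0] = 1.0                      # assumed initial state |00>
    # Entangle, apply Alice's and Bob's local moves, then disentangle.
    psi_f = J.conj().T @ np.kron(s_a, s_b) @ J @ psi0
    return np.abs(psi_f) ** 2


IDENTITY = strategy(0.0, 0.0, 0.0)     # classical move "0"
FLIP = strategy(np.pi, 0.0, 0.0)       # classical move "1", equal to i*sigma_x

# Classical moves reproduce classical outcomes for any gamma.
print(outcome_probabilities(IDENTITY, FLIP, np.pi / 2))
```

With these assumed forms, the classical moves commute with the entangler, so the classical game is recovered as a special case, while a general `strategy(theta, alpha, beta)` produces outcome distributions with no classical counterpart.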

Decoherence and Its Description
The only fully natural source of decoherence affecting quantum systems is their environment, which causes both energy and information dissipation. In our model analysis, we assume that only one of the two qubits (say Bob's) in the state $\rho = J(\rho_0)$ shared by Alice and Bob just before applying their strategies $S_{A,B}$ interacts with the environment $E$, whereas the second one is isolated and its evolution in time is unitary. As Alice and Bob can be separated from each other, we neglect any direct interaction between their qubits (via the proper Hamiltonian term). In other words, the Hamiltonian of the total system is of the form: where where subscripts $A$ and $B$ refer to Alice's and Bob's systems, respectively, and subscript $E$ to the environment. $H_E$ is the free Hamiltonian of the environment, and $H_{int}$ is an interaction term; their explicit forms are specified later in this section. In order to describe the reduced (with respect to the environment) dynamics of Bob's qubit, encoded in a density matrix $\rho_B$, we use a semigroup approach, where the evolution of a state is given by the following master equation: with a formal solution that defines the one-parameter semigroup $\Lambda_t$ as: obeying the condition $\Lambda_{t+s} = \Lambda_t \Lambda_s$. $G$ is the so-called generator of the semigroup $\Lambda_t = e^{Gt}$, and it is given by $G = i\delta + L$, where the action of the operator $\delta$ is the unitary part of the evolution governed by the Hamiltonian $H_0$, i.e., and $L$ is called the dissipator, which, for a quantum Markov process, is of the form: There are many different types of operators $K_i$ that can model physical situations such as systems coupled to heat baths, damped harmonic oscillators, models of decoherence, quantum Brownian particles and many others [31]. In our model, we use Davies maps [32], which are elements of Davies dynamical semigroups at some particular time. We briefly review the criteria for a generator $G_d = i\delta + L_d$ of the Davies map $D_t = e^{G_d t}$: the dissipator $L_d$ commutes with $\delta$ and is self-adjoint with respect to the Gibbs state $\rho_\beta$ in the following sense [32]: for any $N \times N$ complex matrices $A$ and $B$, where the Gibbs state is defined as $\rho_\beta = e^{-\beta H}/Z$, with partition function $Z = \mathrm{Tr}\, e^{-\beta H}$ and inverse temperature $\beta$. The Davies approximation has been successfully used in studies of various problems in quantum information and the physics of open quantum systems, including entanglement dynamics [33,34], quantum discord [35], properties of geometric phases of qubits [36] and the thermodynamic properties of nano-systems [37]. The main properties of Davies semigroups are that the unitary part and the dissipative part decouple and that, in equilibrium, transition rates between eigenstates of the Hamiltonian $H$ obey a micro-reversibility condition, i.e., the transition in one direction is exactly balanced by the opposite one.
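The Gibbs state and the balance between upward and downward transitions can be made concrete with a minimal numerical sketch. The explicit qubit Hamiltonian is not reproduced above, so the code assumes $H_0 = (\omega/2)\,\sigma_z$ with $|0\rangle$ as the ground state; the names `gibbs_state`, `omega` and the chosen numerical values are hypothetical.

```python
import numpy as np


def gibbs_state(hamiltonian, beta):
    """Return exp(-beta H) / Tr exp(-beta H) for a Hermitian matrix H."""
    energies, vectors = np.linalg.eigh(hamiltonian)
    # Shifting by the lowest energy avoids overflow; the shift cancels in the ratio.
    weights = np.exp(-beta * (energies - energies.min()))
    weights /= weights.sum()
    # Rebuild the operator from its spectral decomposition.
    return (vectors * weights) @ vectors.conj().T


omega = 1.0                                  # hypothetical level spacing
beta = 2.0                                   # hypothetical inverse temperature
H0 = 0.5 * omega * np.diag([-1.0, 1.0])      # assumed: |0> ground, |1> excited

rho_beta = gibbs_state(H0, beta)
excited_population = rho_beta[1, 1].real

# Populations of the Gibbs state satisfy the Boltzmann ratio exp(-beta * omega).
print(excited_population / rho_beta[0, 0].real, np.exp(-beta * omega))
```

For any non-negative temperature, the excited-state population computed in this way lies between 0 (zero temperature) and 1/2 (infinite temperature).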
We also want to stress that, although a Davies semigroup was defined axiomatically above, it was historically derived rigorously and consistently from microscopic models of open systems in the weak coupling limit [19], and it satisfies most of the desired thermodynamic and statistical-mechanical properties, such as the detailed balance condition [20,38]. For our model, we can choose $H_E$ as an infinite set of harmonic oscillators [33] with Hamiltonian $H_E = \sum_n \omega_n a_n^\dagger a_n$, and the interaction term can be written as $H_{int} = \sum_{i=1}^{3} \sigma_i \otimes B_i$, where $\sigma_i$ are the Pauli matrices and $B_i$ are self-adjoint operators on the environment Hilbert space [38]. For the Hamiltonian Equation (9), the time-evolution operator is $U_t = \exp(-iHt) = U_A \otimes U_{B,E}$. Then, the reduced density matrix $\rho_{AB}$ of the qubits shared by Alice and Bob is the following: where $\omega_E$ is some (unspecified) initial state of the environment. If we expand the initial state $\rho_0$ in the eigenbasis $\{|0\rangle, |1\rangle\}$ of the Hamiltonian $H_0$, see Equation (10): then Equation (18) can be rewritten as: Since $U_A$ is the dynamics of a noiseless qubit governed by the Hamiltonian $H_0$, we get: where $\omega_{jk} = (j - k)\omega$. We will not go into detailed microscopic descriptions in terms of the Hamiltonian parameters; instead, we use a "phenomenological" form of the Davies map given by: where $D_t = e^{L_d t}$ acts on basis states as follows [32]: where $p \in [0, 1/2]$ is related to the temperature (here, we set $k_B = 1$) via: The parameters $A = 1/\tau_R$ and $G = 1/\tau_D$, if interpreted in terms of spin relaxation dynamics [39], are related to the energy relaxation time $\tau_R$ and the dephasing time $\tau_D$, respectively [32]. The inequalities [39]: guarantee that the Davies map is a trace-preserving completely positive map. The limiting case $A = 0$ and $G \neq 0$ corresponds to pure dephasing without dissipation of energy. Finally, we obtain: In the following quantum game scenario (see the next section), we consider three possibilities: either the qubit shared by Alice or the qubit shared by Bob couples to the environment, or neither of them does (the last as a natural and obvious reference to the role played by decoherence), cf. Equations (30)-(32).
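A minimal Python sketch of a qubit map of the Davies type, applied to one qubit of a two-qubit state, is given here for orientation. Since the explicit action of the map on basis states is not reproduced in the text, the form used in the code is an assumption: populations relax towards $(1-p, p)$ at rate $A$, coherences decay at rate $G$, and complete positivity is taken to require $G \geq A/2 \geq 0$, which is consistent with the bound $A = 2G$ quoted later. Only the dissipative part is modeled; the unitary rotation, which commutes with it, is omitted. All function names are hypothetical.

```python
import numpy as np


def davies_map(rho, t, A, G, p):
    """Assumed dissipative qubit Davies-type map acting on a 2x2 matrix."""
    if not (G >= A / 2 >= 0):
        raise ValueError("assumed positivity condition G >= A/2 >= 0 violated")
    x = 1.0 - np.exp(-A * t)           # fraction of the relaxation completed
    out = np.zeros((2, 2), dtype=complex)
    # Populations relax towards the equilibrium values (1 - p, p).
    out[0, 0] = (1 - p * x) * rho[0, 0] + (1 - p) * x * rho[1, 1]
    out[1, 1] = p * x * rho[0, 0] + (1 - (1 - p) * x) * rho[1, 1]
    # Coherences are damped at the dephasing rate G.
    out[0, 1] = np.exp(-G * t) * rho[0, 1]
    out[1, 0] = np.exp(-G * t) * rho[1, 0]
    return out


def apply_to_second_qubit(rho_ab, t, A, G, p):
    """Apply the map to the second qubit; the first qubit is left untouched."""
    r = np.asarray(rho_ab, dtype=complex).reshape(2, 2, 2, 2)
    out = np.zeros_like(r)
    for a in range(2):
        for a2 in range(2):
            # The map is linear, so it can be applied block by block.
            out[a, :, a2, :] = davies_map(r[a, :, a2, :], t, A, G, p)
    return out.reshape(4, 4)


# Maximally entangled state (|00> + |11>)/sqrt(2) with one noisy qubit.
bell = np.zeros(4)
bell[[0, 3]] = 1 / np.sqrt(2)
rho_bell = np.outer(bell, bell)
rho_noisy = apply_to_second_qubit(rho_bell, t=1.0, A=2.0, G=1.0, p=0.1)
print(np.round(rho_noisy.real, 3))
```

Setting `A = 0` with `G > 0` in this sketch gives pure dephasing: the populations stay fixed while the off-diagonal elements decay.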

Payoffs for the Quantum Prisoner's Dilemma
Payoffs, the results of the game, can be calculated as an expectation value, weighted by certain game-dependent real numbers $a, b, c, d$.
For example, the strategy profile $(S_A = 0, S_B = 1)$ is encoded in the quantum state $|01\rangle$ and results in payoffs $d$ for Bob and $c$ for Alice. The trace operation is due to the projective measurement of the output. We describe the influence of the thermal environment in terms of the payoff difference: where the tilde denotes the "noisy" player, cf. Equations (31) and (30), respectively. The noiseless quantity in Equation (37), calculated for the purely unitary evolution Equation (32), is used as a reference. The sign of the $\Delta$'s in Equations (35)-(36) allows one to identify the winner of the game. For the highly symmetric case of Davies maps considered in this paper acting on a maximally entangled Bell state (which is an example of the so-called X-states), it does not matter which of the two players, Alice or Bob, is "noisy". That is why, in the subsequent discussion, we assume that Alice's qubit is affected by the thermal environment.
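The payoff expectation values and their difference can be sketched in a few lines of Python. The payoff operators are taken to be diagonal in the measurement basis; the assignment for $|01\rangle$ follows the example in the text ($c$ for Alice, $d$ for Bob), while the assignments for the remaining outcomes, the numerical values of $a, b, c, d$ and the sign convention of the difference (Alice minus Bob) are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical payoff values satisfying c > a > b > d.
a, b, c, d = 3.0, 1.0, 5.0, 0.0

# Diagonal payoff operators in the basis |00>, |01>, |10>, |11>
# (Alice's qubit first). The |01> entries follow the example in the text;
# the other entries are assumed by symmetry.
PAYOFF_ALICE = np.diag([a, c, d, b])
PAYOFF_BOB = np.diag([a, d, c, b])


def payoffs(rho_f):
    """Expected payoffs (Alice, Bob) as traces over the final state."""
    alice = float(np.trace(PAYOFF_ALICE @ rho_f).real)
    bob = float(np.trace(PAYOFF_BOB @ rho_f).real)
    return alice, bob


def payoff_difference(rho_f):
    """Assumed sign convention: Alice's payoff minus Bob's payoff."""
    alice, bob = payoffs(rho_f)
    return alice - bob


# Final state |01><01|, i.e., the strategy profile (S_A = 0, S_B = 1).
ket01 = np.zeros(4)
ket01[1] = 1.0
rho01 = np.outer(ket01, ket01)
print(payoffs(rho01), payoff_difference(rho01))
```

A positive value of `payoff_difference` identifies Alice as the winner under this convention, and a negative value identifies Bob.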
To be more precise, we restrict ourselves to the study of one of the best known examples of a game: the celebrated prisoner's dilemma (PD). The general quantum game considered so far reduces to the PD provided that the parameters in the payoff Equation (33) are given by: We analyze in detail mixed strategies of the players (or prisoners), assuming that one of them (Bob) can apply both classical and quantum strategies. Our aim is to present the dependence of the difference between Bob's and Alice's payoffs on the parameters of the Davies environment, provided that the entangling parameter $\gamma = \pi/2$ in Equation (4) is fixed. The Alice-Bob strategy is the mixed strategy $(V_c, V_c U)$ with $V_c = I/2 + F/2$, where $F = i\sigma_x$ is a flip operation, and the quantum strategy of Equation (7) with $U = U(\pi/2, 0, \pi/2)$.
Notice that the pair $(V_c, V_c)$ consists of two "miracle moves" [40].
Apart from the temperature parameter $p$ of Equation (27), there are two parameters in the Davies map Equation (23) that characterize the role played by the environment: they are related to pure dephasing (for $G > 0$ and $A = 0$) and to energy dissipation, when both $A$ and $G$ (constrained by the condition Equation (28)) are non-zero. We present their effect on the payoff difference $\Delta\$_A$ in Figure 1. We use, as a reference, the purely unitary case, when the time evolution of the players is governed by Equation (32). First, let us consider pure decoherence, presented in the upper panel of Figure 1. Increasing $G$ results in damping of the oscillations of $\Delta\$_A$ caused by the Hamiltonian part of the Davies map. For large $G$, the payoff difference becomes steady and negative. If energy dissipation is additionally present in the system, cf. the lower panel of Figure 1, the steady value of the payoff difference increases up to the maximal value $\Delta\$_A \approx 0$ for $A = 2G$, i.e., for the maximum allowed by the condition Equation (28). Moreover, for a sufficiently large ratio $A/G$, one can expect that the payoff difference $\Delta\$_A$ changes its sign. This suggests the possibility of modifying the payoff difference by a proper adjustment or choice of the parameters of the environment.

Coherence of a Quantum Game
In this section, we consider the coherence of an output of the game: calculated just before applying the final (un)entangling operation in Equation (1). In other words, we investigate how coherent the system is after applying the players' strategies $S$, either quantum or classical. There are various quantifiers of the coherence of quantum systems. Usually, they are "definition-dependent" and studied in very different contexts [7,8,41]. Here, we use the relative entropy of coherence discussed in [41], $C(\rho) = S(\rho_{diag}) - S(\rho)$, where $S(\rho)$ is the quantum von Neumann entropy and the state $\rho_{diag}$ is obtained from $\rho$ by setting all off-diagonal elements to zero, i.e., $[\rho_{diag}]_{i,j} = [\rho]_{i,j}\,\delta_{ij}$ (no summation). The quantifier $C$ has a clear interpretation as the information content carried by the off-diagonal elements of the density matrix. The relative entropy of coherence possesses all of the desired properties of a coherence measure, as discussed in [41].
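The relative entropy of coherence is straightforward to evaluate numerically. The following Python sketch computes it as the entropy of the dephased state minus the entropy of the state itself; the base-2 logarithm (coherence measured in bits), the eigenvalue cutoff and the function names are choices made here for illustration, not taken from the text.

```python
import numpy as np


def von_neumann_entropy(rho, cutoff=1e-12):
    """Von Neumann entropy in bits; eigenvalues below the cutoff are dropped."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > cutoff]
    return float(-np.sum(evals * np.log2(evals)))


def relative_entropy_of_coherence(rho):
    """C(rho) = S(rho_diag) - S(rho) in the computational basis."""
    rho_diag = np.diag(np.diag(rho))   # zero out all off-diagonal elements
    return von_neumann_entropy(rho_diag) - von_neumann_entropy(rho)


# A pure equal superposition of two basis states carries one bit of coherence.
bell = np.zeros(4, dtype=complex)
bell[0], bell[3] = 1 / np.sqrt(2), 1j / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
print(relative_entropy_of_coherence(rho_bell))

# Any state that is diagonal in the reference basis has no coherence.
print(relative_entropy_of_coherence(np.eye(4) / 4))
```

Because the quantifier refers to a fixed basis, the same state can have different values of $C$ in different bases; the computational (measurement) basis is used in the sketch.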
Before analyzing the influence of the environment on the coherence of (the state of) the game, let us consider how the parameter $\gamma$ in Equation (4) affects $C$ at $t = 0$, i.e., if one neglects the duration of the process of implementing strategies. The result is plotted in Figure 2. Let us notice that increasing $\gamma$, which corresponds to increased entanglement between the players, lowers $C$. As the "quantum power" of the game seems to be most pronounced for maximal initial entanglement between the players' qubits, this result can be considered counter-intuitive. The effect of pure decoherence, presented in the upper panel of Figure 3, is correlated with that reported for the payoffs in the previous section: the oscillations of the relative entropy $C$ are damped, and its value stabilizes at some level. As intuitively expected, increasing $G$ (with $A = 0$) results in stronger damping. However, while $C$ is at any time largest for the unitary evolution, this is not the case for the payoffs. In fact, the intersection points of the graphs for the payoffs in Figure 1 occur at times $t$ when the relative entropy of coherence has its maximum value. Moreover, the slow decrease in time of the minimal value of $C$ (to values smaller than the minimal value for the unitary evolution) is accompanied by an increase of the maximal value of the payoffs. In the lower panel of Figure 3, we present the effect of energy dissipation. As for pure decoherence, $C$ has the largest value for the unitary evolution at any time $t$, but in this case, $C$ for dissipating systems first exhibits a deep minimum and then increases to its asymptotic value (contrary to decohered systems, where $C$ decreases asymptotically). This is correlated with the fact that payoffs for these systems are more often larger than those governed by the unitary evolution. This is especially visible for the extreme example discussed before, $A = 2G$, where the payoffs are always larger than for the unitary evolution and the relative entropy $C$, before the quick damping of oscillations (as in pure decoherence), passes through a minimum, which results in an increase of the payoff values (which was not seen for the decohered system). In other words, we can say that the damping of oscillations "freezes" the value of the payoffs, both for pure decoherence and with dissipation, but in the latter case, the deep initial loss of coherence results in an increase of the payoffs. Let us also emphasize that increasing $A$ does not influence the long time properties of $C$.
The asymptotic value of $C$ is strongly affected by the temperature parameter $p$ of the environment. In other words, one can control and design a desired value of coherence at a given time instant just by a proper choice of the properties of the environment. The results presented in Figure 4 confirm the natural expectation that increasing $p$ decreases the coherence. In particular, for infinite temperature, $p = 1/2$, the coherence is $C = 0$, as presented in the lower panel of Figure 4.

Conclusions
The environment is an essential ingredient of any realistic physical object. This is also the case for quantum systems, which almost always require modeling using open system techniques. A quantum game described as a closed system is always an idealization. In this work, we attempt to take one step beyond this approximation and consider the prisoner's dilemma in the presence of a thermal environment modeled via the rigorous Davies approach, Equation (23). Such a description is the most general one applicable to Markovian open systems weakly coupled to a thermal bath. We consider the most natural quantifier studied in the context of game theory: the difference between the payoffs of the players, Equation (35). We show that there are at least two aspects of the influence of the Davies bath on the results of the game. The first is obvious, as one expects that the thermal bath generates damping of all properties that can be derived from the conservative part of the qubit dynamics. The next is probably more interesting: for a proper relation between pure dephasing and energy dissipation, one can obtain a payoff difference with the desired properties. For example, we show that there are environments changing the sign of the payoff difference, cf. Figure 1. In other words, there are environments that can change the result of the game for a given strategy of the players. The second problem discussed in the paper is more fundamental. We attempt to relate the properties of the quantum game and the thermal environment to the coherence of the pair of qubits used by the players. We show that the coherence quantified in terms of the relative entropy of coherence, Equation (40), strongly depends on the properties of the environment. It is possible to obtain the desired value of coherence provided that one can control the properties of the environment, e.g., its temperature.
To summarize, this work is a modest contribution to our understanding of the properties of quantum games in the presence of decoherence of a very general type. All of the results presented here can serve as a guideline for designing experiments that implement quantum games with desired properties.

Figure 1. Payoff difference $\Delta\$_A$, Equation (35), taken for $\gamma = \pi/2$ versus time $t$ for a mixed Alice-Bob strategy $(V_c, V_c U)$ with $V_c = I/2 + F/2$ ($F = i\sigma_x$ is a flip operation) and the quantum strategy Equation (7) with $U = U(\pi/2, 0, \pi/2)$ for the thermal Davies environment: with different values of $G$ and $A = p = 0$ (upper panel); with different values of $A$ and $G = 1$ and $p = 0$ (lower panel).

Figure 2. Initial (calculated at $t = 0$) relative entropy of coherence $C$, Equation (40), as a function of $\gamma$.

Figure 3. Relative entropy of coherence $C$, Equation (40), taken for $\gamma = \pi/2$ versus time $t$ for a mixed Alice-Bob strategy $(V_c, V_c U)$ with $V_c = I/2 + F/2$ and the quantum strategy Equation (7) with $U = U(\pi/2, 0, \pi/2)$ for the thermal Davies environment: with different values of $G$ and $A = p = 0$ (upper panel); with different values of $A$ and $G = 1$ and $p = 0$ (lower panel).

Figure 4. Relative entropy of coherence $C$, Equation (40), taken for $\gamma = \pi/2$ for a mixed Alice-Bob strategy $(V_c, V_c U)$ with $V_c = I/2 + F/2$ and the quantum strategy Equation (7) with $U = U(\pi/2, 0, \pi/2)$ for the thermal Davies environment: versus time $t$ with different values of $p$ (upper panel); versus $p$ in the long time limit (lower panel). Other parameters: $A = 2G = 2$.