Toward a Theory of Play: A Logical Perspective on Games and Interaction

Logic and game theory have been in contact for a few decades now, with the classical results of epistemic game theory as major highlights. In this paper, we emphasize a recent new perspective toward “logical dynamics”, designing logical systems that focus on the actions that change information, preference, and other driving forces of agency. We show how this dynamic turn works out for games, drawing on some recent advances in the literature. Our key examples are the long-term dynamics of information exchange, as well as the much-discussed issue of extensive game rationality. Our paper also proposes a new, broader interpretation of what is happening here. The combination of logic and game theory provides a fine-grained perspective on information and interaction dynamics, and we are witnessing the birth of something new which is not just logic, nor just game theory, but rather a Theory of Play.

and their combination raises interesting questions of definability, axiomatization and computational complexity [1][2][3][4]. Epistemic game theory, cf. [5], has added one more element to this mix, again familiar to logicians: the role of factual and higher-order information. This much is well understood, and there are excellent sources that we need not reproduce here, though we will recall a few basics in what follows.
In this paper we will take one step further, assuming that the reader knows the basics of logic and game theory. We are going to take a look at all these components from a dynamic logical perspective, emphasizing actions that make information flow, change beliefs, or modify preferences, in ways to be explained below. For us, understanding social situations as dynamic logical processes where the participants interactively revise their beliefs, change their preferences, and adapt their strategies is a step towards a more finely-structured theory of rational agency. In a simple phrase that sums it up, this joint offspring "in the making" of logic and game theory might be called a Theory of Play instead of a theory of games.
The paper starts by laying down the main components of such a theory, a logical take on the dynamics of actions, preferences, and information (Sections 1 and 2). We then show that this perspective has already shed new light on the long-term dynamics of information exchange (Section 3), as well as on the question of extensive game rationality (Section 4). We conclude with general remarks on the relation between logic and game theory, pleading for cross-fertilization instead of competition. This paper is introductory and programmatic throughout. Our treatment is heavily based on evidence from a number of recent publications demonstrating a variety of new developments.

An Encounter Between Logic and Games
A first immediate observation is that games as they stand are natural models for many existing logical languages: epistemic, doxastic and preference logics, as well as conditional logics and temporal logics of action. We do not aim at an encyclopedic description of these systems; [2] is a relatively up-to-date overview. This section just gives some examples setting the scene for our later, more detailed dynamic-logic analyses.

Strategic Games
Even simple strategic games call for logical analysis, with new questions arising at once. To a logician, a game matrix is a semantic model of a rather special kind that invites the introduction of well-known languages. Recall the main components in the definition of a strategic game for a set $N$ of $n$ players: (1) a nonempty set $A_i$ of actions for each $i \in N$, and (2) a utility function or preference ordering on the set of outcomes. For simplicity, one often identifies the outcomes with the set $S = \Pi_{i\in N} A_i$ of strategy profiles. As usual, given a strategy profile $\sigma \in S$ with $\sigma = (a_1, \ldots, a_n)$, $\sigma_i$ denotes the $i$th projection (i.e., $\sigma_i = a_i$) and $\sigma_{-i}$ denotes the choices of all agents except agent $i$: $\sigma_{-i} = (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n)$.
Games as models. Now, from a logical perspective, it is natural to treat the set $S$ of strategy profiles as a universe of "possible worlds".¹ These worlds then carry three natural relations that are entangled in various ways. For each $\sigma, \sigma' \in S$, define for each player $i \in N$:
• $\sigma \geq_i \sigma'$ iff player $i$ prefers the outcome $\sigma$ at least as much as outcome $\sigma'$,
• $\sigma \sim_i \sigma'$ iff $\sigma_i = \sigma'_i$: this epistemic relation represents player $i$'s "view of the game" at the ex interim stage where $i$'s choice is fixed but the choices of the other players are unknown,
• $\sigma \approx_i \sigma'$ iff $\sigma_{-i} = \sigma'_{-i}$: this relation of "action freedom" gives the alternative choices for player $i$ when the other players' choices are fixed.²
This can all be packaged in a relational structure with $S$ the set of strategy profiles and the relations just defined.
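To make the three relations concrete, here is a minimal Python sketch. The game itself (two players, the action names, and the utilities) is a hypothetical example chosen only for illustration, and the function names `pref`, `view` and `freedom` are our own labels, not notation from the literature.

```python
from itertools import product

# Hypothetical two-player game; action names and utilities are illustrative only.
actions = {1: ["a", "b"], 2: ["c", "d"]}
players = sorted(actions)
S = list(product(*(actions[i] for i in players)))  # strategy profiles = worlds

utility = {
    ("a", "c"): {1: 2, 2: 2}, ("a", "d"): {1: 0, 2: 0},
    ("b", "c"): {1: 0, 2: 0}, ("b", "d"): {1: 1, 2: 1},
}

def pos(i):
    """Position of player i's action inside a profile tuple."""
    return players.index(i)

def pref(i, s, t):
    """s >=_i t: player i likes outcome s at least as much as outcome t."""
    return utility[s][i] >= utility[t][i]

def view(i, s, t):
    """s ~_i t: player i makes the same choice in s and t."""
    return s[pos(i)] == t[pos(i)]

def freedom(i, s, t):
    """s ≈_i t: every player other than i makes the same choice in s and t."""
    return all(s[pos(j)] == t[pos(j)] for j in players if j != i)

# Player 1's view at (a, c): the profiles she cannot tell apart ex interim.
cell = [t for t in S if view(1, ("a", "c"), t)]
```

In this toy game, `cell` collects the row of the matrix in which player 1 plays `a`, while `freedom(1, s, t)` links profiles in the same column.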
Matching modal game languages. The next question is: what is the "right" logical language to reason about these structures? The goal here is not simply to formalize standard game-theoretic reasoning. That could be done in a number of ways, often in the first-order language of these relational models. Rather, the logician will aim for a well-behaved language, with a good balance between the level of formalization and other desirable properties, such as perspicuous axiomatization, low computational complexity of model checking and satisfiability, and the existence of an elegant meta-theory for the system. In particular, the above game models suggest the use of modal languages, whose interesting balance of expressive power and computational complexity has been well-researched over the last decades.³
Our first key component, players' desires or preferences, has been the subject of logical analysis since at least the work of [10].⁴ Here is a modern take on preference logic [12,14]. A modal betterness model for a set $N$ of players is a tuple $\mathcal{M} = \langle W, \{\geq_i\}_{i\in N}, V\rangle$ where $W$ is a nonempty set of states, for each $i \in N$, $\geq_i \subseteq W \times W$ is a preference ordering, and $V$ is a valuation function $V : At \to \wp(W)$ (At is a set of atomic propositions describing the ground facts about the situation being modeled). Precisely which properties $\geq_i$ should have has been the subject of debate in philosophy: in this paper, we assume that the relation is reflexive and transitive. For each $\geq_i$, the corresponding strict preference ordering is written $>_i$.
A modal language to describe betterness models uses modalities $\langle\geq_i\rangle\varphi$ saying that "there is a world at least as good as the current world satisfying $\varphi$", and likewise for strict preference:
• $\mathcal{M}, w \models \langle\geq_i\rangle\varphi$ iff there is a $v$ with $v \geq_i w$ and $\mathcal{M}, v \models \varphi$
• $\mathcal{M}, w \models \langle>_i\rangle\varphi$ iff there is a $v$ with $v \geq_i w$, $w \not\geq_i v$, and $\mathcal{M}, v \models \varphi$
Standard techniques in modal model theory apply to definability and axiomatization in this modal preference language: we refer to ([9], Chapter 3) and [13] for details. Both [12] and [13] show how this language can also define "lifted" generic preferences between propositions, i.e., properties of worlds.
Next, the full modal game language for the above models must also include modalities for the relations that we called the "view of the game" and the "action freedom".But this is straightforward, as these are even closer to standard notions studied in epistemic and action logics.
Again, we start with a set At of atomic propositions that represent basic facts about the strategy profiles.⁵ Now, we add obvious modalities for the other two relations to get a full modal logic of strategic games:
• $\mathcal{M}, \sigma \models [\sim_i]\varphi$ iff for all $\sigma'$, if $\sigma \sim_i \sigma'$ then $\mathcal{M}, \sigma' \models \varphi$
• $\mathcal{M}, \sigma \models [\approx_i]\varphi$ iff for all $\sigma'$, if $\sigma \approx_i \sigma'$ then $\mathcal{M}, \sigma' \models \varphi$
Some issues in modal game logic for strategic games. A language allows us to say things about structures. But what about a calculus of reasoning: what is the logic of our modal logic of strategic games? For convenience, we restrict attention to 2-player games. First, given the nature of our three relations, the separate logics are standard: modal S4 for preference, and modal S5 for epistemic outlook and action freedom. What is of greater interest, and logical delicacy, is the interaction of the three modalities. For instance, the following combination of two modalities makes $\varphi$ true in each world of a game model: $[\sim_i][\approx_i]\varphi$. Thus, the language also has a so-called "universal modality". Moreover, this modality can be defined in two ways, since we also have that $[\sim_i][\approx_i]\varphi \leftrightarrow [\approx_i][\sim_i]\varphi$. This validity depends on the geometrical "grid property" of game matrices: if one can go $x \sim_i y \approx_i z$, then there exists a point $u$ with $x \approx_i u \sim_i z$.
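The role of the grid property can be checked by brute force on small matrices. The sketch below is only an illustration under stated assumptions: the action names are hypothetical, players are indexed 0 (row) and 1 (column), propositions are represented as sets of profiles, and we assume that the universal modality is obtained by stacking the box for the "view" relation on the box for "action freedom". The helper names `sim`, `approx`, `box` and `commutes` are ours.

```python
from itertools import product

rows, cols = ["a", "b", "c"], ["x", "y"]   # hypothetical action sets
S = set(product(rows, cols))               # full matrix: every profile is present

def sim(i, s, t):
    """s ~_i t: player i (0 = row, 1 = column) makes the same choice."""
    return s[i] == t[i]

def approx(i, s, t):
    """s ≈_i t: the other player makes the same choice."""
    return s[1 - i] == t[1 - i]

def box(rel, i, prop, worlds):
    """Worlds where prop holds at all rel_i-successors inside `worlds`."""
    return {s for s in worlds if all(t in prop for t in worlds if rel(i, s, t))}

def commutes(worlds, i=0):
    """Check that the two boxes commute for every proposition over `worlds`."""
    ws = sorted(worlds)
    for bits in product([False, True], repeat=len(ws)):
        P = {w for w, b in zip(ws, bits) if b}
        lhs = box(sim, i, box(approx, i, P, worlds), worlds)
        rhs = box(approx, i, box(sim, i, P, worlds), worlds)
        if lhs != rhs:
            return False
    return True

# On the full matrix, stacking the two boxes behaves like a universal modality:
P = S - {("c", "y")}
everywhere = box(sim, 0, box(approx, 0, P, S), S)   # empty, since P misses a world
```

On the full matrix `commutes(S)` succeeds, while removing a single profile, as in `commutes(S - {("c", "y")})`, already produces a proposition on which the two orders of stacking disagree.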
This may look like a pleasant structural feature of matrices, but its logical effects are delicate. It is well-known that the general logic of such a bi-modal language on grid models is not decidable, and not even axiomatizable: indeed, it is "$\Pi^1_1$-complete".⁷ In particular, satisfiability in grid-like models can encode computations of Turing machines on their successive rows, or alternatively, they can encode geometrical "tiling problems" whose complexity is known to be high. From a logical point of view, simple-looking strategic matrix games can be quite complex computational structures.
⁷ Cf. [15][16][17][18] for formal details behind the assertions in this paragraph.
However, there are two ways in which these complexity results can be circumvented. One is that we have mainly looked at finite games, where additional validities hold,⁸ and then the complexity may be lower. Determining the precise modal logic of finite game matrices appears to be an open problem.
Here is another interesting point. It is known that the complexity of such logics may go down drastically when we allow more models, in particular, models where some strategy profiles have been ruled out. One motivation for this move has to do with dependence and independence of actions.⁹ Full matrix models make players' actions independent, as reflected in the earlier grid property. By contrast, general game models omitting some profiles can represent dependencies between players' actions: changing a move for one may only be possible by changing a move for another. The general logic of game models allowing dependencies does not validate the above commutation law. Indeed, it is much simpler: it is just multi-agent modal S5. Thus, the complexity of logics matches interesting decisions on how we view players: as independent, or correlated.
Against this background of available actions, information, and freedom, the preference structure of strategic games adds further interesting features. One benchmark for modal game logics has been the definition of the strategy profiles that are in Nash Equilibrium. This requires defining the usual notion of best response for a player. One can actually prove¹⁰ that best response is not definable in the language that we have so far. One extension that would do the job is taking an intersection modality:
• $\mathcal{M}, \sigma \models \langle\approx_i \cap >_i\rangle\varphi$ iff there is a $\sigma'$ with $\sigma \approx_i \sigma'$, $\sigma' >_i \sigma$, and $\mathcal{M}, \sigma' \models \varphi$
Then the best response for player $i$ is defined as $\neg\langle\approx_i \cap >_i\rangle\top$. Questions of complexity and complete axiomatization then multiply. But we can also deal with preference structure in other ways. Introduce proposition letters "Best(i)" for players $i$ saying that the profiles where they hold are best responses for $i$ in the game model. Then one finds interesting properties of such models reflected in the logic. One example is that each finite game model has a cycle of points where (for simplicity assume there are only two players $i$ and $j$): . Such loops represent subgames where all players are "strongly rational" in the sense of considering it possible that their current move is a best response to what their opponent is doing. Thus, the logic encodes basic game theory.¹¹
Our main point with this warm-up discussion for our logical Theory of Play is that the simple matrix pictures that one sees in a beginner's text on game theory are already models for quite sophisticated logics of action, knowledge and preference. Thus, games of even the simplest sort have hidden depths for logicians: there is much more to them than we might think, including immediate open problems for logical research.¹²
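As a small illustration of best response and Nash Equilibrium in relational terms, consider the following sketch. The payoff table (a Prisoner's Dilemma) is a hypothetical example, players are indexed 0 and 1, and we assume the reading of best response as "there is no profile that keeps the other player's choice fixed and that the player strictly prefers". The function names are ours.

```python
# Hypothetical payoff table (a Prisoner's Dilemma): (row utility, column utility).
payoff = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 4),
    ("D", "C"): (4, 0), ("D", "D"): (1, 1),
}
S = list(payoff)

def better_alternatives(i, s):
    """Profiles t that keep the other player's choice and that i strictly prefers to s."""
    return [t for t in S if t[1 - i] == s[1 - i] and payoff[t][i] > payoff[s][i]]

def best_response(i, s):
    """True when player i has no strictly better alternative at s."""
    return not better_alternatives(i, s)

# Nash equilibria: profiles that are best responses for both players.
nash = [s for s in S if all(best_response(i, s) for i in (0, 1))]
```

Here `nash` contains only the profile where both players defect, and `best_response` plays the role of the proposition letters Best(i) mentioned above.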

Extensive Games
Just like strategic games, interactive agency in the more finely-structured extensive games offers a natural meeting point with logic.We will demonstrate this with a case study of Backwards Induction, a famous benchmark at the interface, treated in a slightly novel way.Our treatment in this section will be rather classical, that is static and not information-driven.However, in Section 4 we return to the topic, giving it a dynamic, epistemic twist.
Dynamic logic of actions and strategies. The first thing to note is that the sequential structure of players' actions in an extensive game lends itself to logical analysis. A good system to use for this purpose is propositional dynamic logic (PDL), originally designed to analyze programs and computation (see [27] for the original motivation and subsequent theory). Let Act be a set of primitive actions. An action model is a tuple $\mathcal{M} = \langle W, \{R_a \mid a \in Act\}, V\rangle$ where $W$ is an abstract set of states, or stages in an extensive game, and for each $a \in Act$, $R_a \subseteq W \times W$ is a binary transition relation describing possible transitions from states $w$ to $w'$ by executing the action $a$. On top of this atomic repertoire, the tree structure of extensive games supports complex action expressions, constructed by the standard regular operations of "indeterministic choice" ($\cup$), "sequential composition" ($;$) and "unbounded finitary iteration" ($*$: Kleene star):
$\alpha ::= a \mid \alpha \cup \beta \mid \alpha ; \beta \mid \alpha^*$
This syntax recursively defines complex relations in action models:
• $R_{\alpha\cup\beta} = R_\alpha \cup R_\beta$
• $R_{\alpha;\beta} = R_\alpha \circ R_\beta$
• $R_{\alpha^*} = \bigcup_{n\geq 0} R_\alpha^n$, where $R_\alpha^0$ is the identity relation and $R_\alpha^{n+1} = R_\alpha^n \circ R_\alpha$
The key dynamic modality $[\alpha]\varphi$ now says that "after the move described by the program expression $\alpha$ is taken, $\varphi$ is true":
$\mathcal{M}, w \models [\alpha]\varphi$ iff for each $v$, if $w R_\alpha v$ then $\mathcal{M}, v \models \varphi$
PDL has been used for describing solution concepts on extensive games by many authors [2,4,28]. An extended discussion of logics that can explicitly define strategies in extensive games is found in [29].
Adding preferences: the case of Backwards Induction. As before, a complete logical picture must bring in players' preferences on top of PDL, along the lines of our earlier modal preference logic. To show how this works, we consider a key pilot example: the Backwards Induction (BI) algorithm. This procedure marks each node of an extensive game tree with values for the players (assuming that distinct end nodes have different utility values):¹³
BI Algorithm: At end nodes, players already have their values marked. At further nodes, once all daughters are marked, the player to move gets her maximal value that occurs on a daughter, while the other, non-active player gets his value on that maximal node.
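The regular operations on moves can be sketched directly as operations on sets of pairs. The following Python fragment is a minimal illustration on a hypothetical five-node game tree; the state and action names, and the helper names `union`, `compose`, `star` and `box`, are our own choices and not part of PDL itself.

```python
def union(r, s):
    """Choice: a step of r or a step of s."""
    return r | s

def compose(r, s):
    """Sequential composition: an r-step followed by an s-step."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def star(r, states):
    """Reflexive transitive closure of r over `states` (finite iteration)."""
    closure = {(w, w) for w in states}
    frontier = set(closure)
    while frontier:
        frontier = compose(frontier, r) - closure   # pairs reached for the first time
        closure |= frontier
    return closure

def box(rel, prop, states):
    """States all of whose rel-successors lie in prop."""
    return {w for w in states if all(v in prop for (u, v) in rel if u == w)}

# Hypothetical game tree: root --a--> l, root --b--> r, l --c--> ll, l --d--> lr.
states = {"root", "l", "r", "ll", "lr"}
R = {
    "a": {("root", "l")}, "b": {("root", "r")},
    "c": {("l", "ll")}, "d": {("l", "lr")},
}
move = union(union(R["a"], R["b"]), union(R["c"], R["d"]))
end = {w for w in states if not any(u == w for (u, _) in move)}
```

For instance, `box(move, end, states)` collects the states from which every single move leads to an end node, and `star(move, states)` relates the root to every node of the tree.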
The resulting strategy for a player selects the successor node with the highest value.The resulting set of moves for all players (still a function on nodes given our assumption on end nodes) is the "bi strategy".
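The functional version of the procedure can be sketched as a short recursion. The tree below is a hypothetical two-player example (players indexed 0 and 1) in which distinct end nodes carry different utilities for each player, as the algorithm assumes; the data layout and the function name are our own.

```python
# Hypothetical game tree: internal nodes map to (player to move, {move: child});
# end nodes map to a pair of utilities (player 0, player 1).
tree = {
    "root": (0, {"L": "x", "R": "n1"}),
    "n1": (1, {"l": "y", "r": "z"}),
}
leaves = {"x": (1, 0), "y": (0, 2), "z": (2, 1)}

def backward_induction(tree, leaves, node, strategy):
    """Return the value pair marked at `node`, recording the chosen moves."""
    if node in leaves:
        return leaves[node]                  # end nodes are already marked
    player, moves = tree[node]
    values = {m: backward_induction(tree, leaves, child, strategy)
              for m, child in moves.items()}
    best = max(values, key=lambda m: values[m][player])  # mover maximizes her value
    strategy[node] = best
    return values[best]                      # the other player inherits his value there

strategy = {}
root_value = backward_induction(tree, leaves, "root", strategy)
```

In this example player 1 would choose `l` at `n1`, so player 0 prefers `L` at the root, and the root is marked with the values of end node `x`.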
Relational strategies and set preference. But to a logician, a strategy is best viewed as a subrelation of the total move relation. It is advice to restrict one's next choice in some way, similar to the more general situation where our plans constrain our choices. Mathematically, this links up with the usual way of thinking about programs and procedures in computational logic, in terms of the elegant algebra of relations and its logic PDL as defined earlier.
When the above algorithm is modified to a relational setting (we can now drop assumptions about uniqueness at end-points), we find an interesting new feature: special assumptions about players. For instance, it makes sense to take a minimum value for the passive player at a node over all highest-value moves for the active player. But this is a worst-case assumption: my counter-player does not care about my interests after her own are satisfied. But we might also assume that she does, choosing a maximal value for me among her maximum nodes. This highlights an important feature: solution methods are not neutral, they encode significant assumptions about players.
One interesting way of understanding the variety that arises here has to do with the earlier modal preference logic.We might say in general that the driving idea of Rationality behind relational BI is the following: I do not play a move when I have another whose outcomes I prefer.
But preferences between moves that can lead to different sets of outcomes call for a notion of "lifting" the given preference on end-points of the game to sets of end-points. As we said before, this is a key topic in preference logic, and there are many options: the game-theoretic rationality behind BI has a choice point. One popular version in the logical literature is this:
$\forall x \in X\ \exists y \in Y: x \leq_i y$
This says that we choose a move with the highest maximal value that can be achieved. A more demanding notion of preference for a set $Y$ over $X$ in the logical literature [10] is the $\forall\forall$ clause:
$\forall x \in X\ \forall y \in Y: x <_i y$
Here is what relational BI looks like when we follow the latter stipulation, which makes Rationality less demanding, and hence the method more cautious:
First mark all moves as active. Call a move $a$ dominated if it has a sibling move all of whose reachable endpoints via active nodes are preferred by the current player to all reachable endpoints via $a$ itself. The second version of the BI algorithm works in stages:
At each stage, mark dominated moves in the $\forall\forall$ sense of preference as passive, leaving all others active.
Here "reachable endpoints" by a move are all those that can be reached via a sequence of moves that are still active at this stage.
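The staged, relational procedure can be sketched as follows. This is a minimal illustration under assumptions of our own: moves are identified with (node, child) pairs, players are indexed 0 and 1, preference between endpoints is read strictly, and only siblings that are still active are considered as dominating moves. The example trees are hypothetical.

```python
def relational_bi(tree, leaves):
    """Return the set of moves (node, child) that stay active."""
    active = {(n, c) for n, (_, kids) in tree.items() for c in kids}

    def ends(node):
        """End nodes reachable from `node` through moves that are still active."""
        if node in leaves:
            return {node}
        return {e for (n, c) in active if n == node for e in ends(c)}

    while True:
        dominated = set()
        for n, (player, kids) in tree.items():
            live = [c for c in kids if (n, c) in active]
            for a in live:
                # a is dominated if some sibling b is better on ALL pairs of endpoints
                if any(all(leaves[y][player] > leaves[x][player]
                           for x in ends(a) for y in ends(b))
                       for b in live if b != a):
                    dominated.add((n, a))
        if not dominated:
            return active
        active -= dominated          # one stage: mark dominated moves as passive

# Hypothetical tree: player 0 moves at the root, player 1 at n1.
tree = {"root": (0, ["x", "n1"]), "n1": (1, ["y", "z"])}
strict = relational_bi(tree, {"x": (1, 0), "y": (0, 2), "z": (2, 1)})
tied = relational_bi(tree, {"x": (1, 0), "y": (0, 2), "z": (2, 2)})
```

With distinct utilities (`strict`) the result is a single path, as in the functional algorithm. When player 1 is indifferent between `y` and `z` (`tied`), no move is dominated in the ∀∀ sense and all moves stay active, which shows the cautious character of this version.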
We will analyze just this particular algorithm in our logics to follow, but our methods apply much more widely.
Defining Backwards Induction in logic. Many logical definitions for the BI strategy have been published (cf. again the survey in [2], Section 3). Here is a modal version combining the logics of action and preferences presented earlier, significantly involving operator commutations between these:
Theorem 1.1 ([30]). For each extensive game form, the strategy profile $\sigma$ is a backward induction solution iff $\sigma$ is played at the root of a tree satisfying the following modal axiom for all propositions $p$ and players $i$:
$(turn_i \wedge [\sigma^*](end \to p)) \to [move_i]\langle\sigma^*\rangle(end \wedge \langle pref_i\rangle p)$
Here $move_i = \bigcup_{a \text{ is an } i\text{-move}} a$, $turn_i$ is a propositional variable saying that it is $i$'s turn to move, and $end$ is a propositional variable true at only end nodes. Instead of a proof, we merely develop the logical notions involved a bit further.
The meaning of the crucial axiom follows by a modal frame correspondence ([9], Chapter 3).¹⁴ Our notion of Rationality reappears:
Fact 1.2. A game frame makes $(turn_i \wedge [\sigma^*](end \to p)) \to [move_i]\langle\sigma^*\rangle(end \wedge \langle pref_i\rangle p)$ true for all $i$ at all nodes iff the frame has this property for all $i$:
RAT: No alternative move for the current player $i$ guarantees outcomes via further play using $\sigma$ that are all strictly better for $i$ than all outcomes resulting from starting at the current move and then playing $\sigma$ all the way down the tree.
A typical picture to keep in mind here, and also later on in this paper, is this:
¹⁴ "Game frames" here are extensive games extended with one more binary relation $\sigma$.
More formally, RAT is this confluence property for action and preference: Now, a simple inductive proof on the depth of finite game trees shows for our cautious algorithm that: This result is not very deep, but it opens a door to a whole area of research.
The general view: fixed-point logics for game trees. We are now in the realm of a well-known logic of computation, viz. first-order fixed-point logic LFP(FO) [31]. The above analysis really tells us:
Theorem 1.4. The BI relation is definable as a greatest-fixed-point formula in the logic LFP(FO).
Here is the explicit definition in LFP(FO): The crucial feature making this work is a typical logical point: the occurrences of the relation $S$ in the property CF are syntactically positive, and this guarantees upward monotonic behaviour. We will not go into technical details of this connection here, except for noting the following.
Fixed-point formulas in computational logics like this express at the same time static definitions of the bi relation, and procedures computing it.¹⁵ Thus, fixed-point logics are an attractive language for extensive games, since they analyze both the statics and dynamics of game solution.
This first analysis of the logic behind extensive games already reveals the fruitfulness of putting together logical and game-theoretical perspectives. But it still leaves untouched the dynamics of deliberation and information flow that determine players' expectations and actual play as a game unfolds, an aspect of game playing that both game theorists and logicians have extensively studied in the last decades. In what follows we make these features explicit, deploying the full potential of the fine-grained Theory of Play that we propose.

Information Dynamics
The background to the logical systems that follow is a move that has been called a "Dynamic Turn" in logic, making informational acts of inference, but also observations, or questions, into explicit first-class citizens in logical theory that have their own valid laws that can be brought out in the same mathematical style that has served standard logic so well for so long.The program has been developed in great detail in [19,33] drawing together a wide range of relevant literature, but we will only use some basic components here: single events of information change and, later on in this paper, longer-term interactive processes of information change.Towards the end of the paper, we will also briefly refer to other dynamic components of rational agency, with dynamic logics for acts of strategy change, or even preference change.
Players' informational attitudes can be broadly divided into two categories: hard and soft information [34,35].¹⁶ Hard information, and its companion attitude, is information that is veridical and not revisable. This notion is intended to capture what agents are fully and correctly certain of in a given game situation. So, if an agent has hard information that some fact ϕ is true, then ϕ really is true. In the absence of better terminology and following common usage in the literature, we use the term knowledge to describe this very strong type of informational attitude. By contrast, soft information is, roughly speaking, anything that is not "hard": it is not necessarily veridical, and it is revisable in the presence of new information. As such, it comes much closer to beliefs or more generally attitudes that can be described as "regarding something as true" [36]. This section introduces some key logical systems for describing players' hard and soft information in a game situation, and how this information can change over time.

Hard Information and Public Announcements
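Recall that $N$ is the set of players, and At a set of atomic sentences $p$ describing ground facts, such as "player $i$ chooses action $a$" or "the red card is on the table". A non-empty set $W$ of worlds or states then represents possible configurations of plays for a fixed game. Typically, players have hard information about the structure of the game, e.g., which moves are available, and what their own preferences and choices are, at least in the ex interim stage of analysis.
Static epistemic logic. Rather than directly representing agents' information in terms of syntactic statements, in this paper, we use standard epistemic models for "semantic information" encoded by epistemic "indistinguishability relations". Setting aside some conceptual subtleties for the purpose of exposition, we will assume that indistinguishability is an equivalence relation. Each agent has some "hard information" about the situation being modeled, and agents cannot distinguish between any two states that agree on this information. This is essentially what we called the player's "view of the game" in Section 1. Technically, we then get well-known structures: an epistemic model is a tuple $\mathcal{M} = \langle W, \{\sim_i\}_{i\in N}, V\rangle$ where $W$ is a nonempty set of states, each $\sim_i \subseteq W \times W$ is an equivalence relation, and $V : At \to \wp(W)$ is a valuation function. A simple modal language describes properties of these structures. Formally, $\mathcal{L}_{EL}$ is the set of sentences generated by the grammar:
$\varphi ::= p \mid \neg\varphi \mid \varphi \wedge \varphi \mid K_i\varphi$
where $p \in At$ and $i \in N$. The propositional connectives $\to, \leftrightarrow, \vee$ are defined as usual, and the dual $L_i\varphi$ of $K_i$ is $\neg K_i\neg\varphi$. The intended interpretation of $K_i\varphi$ is "according to agent $i$'s current (hard) information, $\varphi$ is true" (in popular jargon, "$i$ knows that $\varphi$ is true"). Here is the standard truth definition:
• $\mathcal{M}, w \models p$ iff $w \in V(p)$
• $\mathcal{M}, w \models \neg\varphi$ iff $\mathcal{M}, w \not\models \varphi$
• $\mathcal{M}, w \models \varphi \wedge \psi$ iff $\mathcal{M}, w \models \varphi$ and $\mathcal{M}, w \models \psi$
• $\mathcal{M}, w \models K_i\varphi$ iff for all $v \in W$, if $w \sim_i v$ then $\mathcal{M}, v \models \varphi$
Given the definition of the dual of $K_i$, it is easy to see that:
$\mathcal{M}, w \models L_i\varphi$ iff there is a $v \in W$ with $w \sim_i v$ and $\mathcal{M}, v \models \varphi$
This says that "$\varphi$ is consistent with agent $i$'s current hard information".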
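The truth definition translates almost literally into a small model checker. The sketch below is an illustration only: formulas are encoded as nested tuples (an encoding of our own), indistinguishability relations are built from partitions, and the two-world "red card" model with agents `ann` and `bob` is hypothetical.

```python
def equivalence(partition):
    """Equivalence relation (as a set of pairs) induced by a partition."""
    return {(u, v) for cell in partition for u in cell for v in cell}

def holds(model, w, f):
    """Truth of formula f at world w; atoms are strings, other formulas are tuples."""
    W, rel, val = model
    if isinstance(f, str):                      # atomic proposition
        return w in val[f]
    op = f[0]
    if op == "not":
        return not holds(model, w, f[1])
    if op == "and":
        return holds(model, w, f[1]) and holds(model, w, f[2])
    if op == "K":                               # ("K", agent, formula)
        return all(holds(model, v, f[2]) for v in W if (w, v) in rel[f[1]])
    raise ValueError(f"unknown operator: {op!r}")

def L(i, f):
    """Dual of K_i: f is consistent with agent i's hard information."""
    return ("not", ("K", i, ("not", f)))

# Hypothetical model: the card is red in w1 only; ann sees the card, bob does not.
W = {"w1", "w2"}
rel = {"ann": equivalence([{"w1"}, {"w2"}]), "bob": equivalence([{"w1", "w2"}])}
model = (W, rel, {"red": {"w1"}})
```

At `w1`, `holds(model, "w1", ("K", "ann", "red"))` is true, while for bob only the weaker `L("bob", "red")` holds.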
Information update.Now comes a simple concrete instance of the above-mentioned "Dynamic Turn".Typically, hard information can change, and this crucial phenomenon can be added to our logic explicitly.
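The most basic type of information change is a public announcement [37,38]. This is an event where some proposition $\varphi$ (in the language $\mathcal{L}_{EL}$) is made publicly available, in full view, and with total reliability. Clearly, the effect of such an event should be to remove all states that do not satisfy $\varphi$: new hard information shrinks a current range of uncertainty.
Let $\mathcal{M} = \langle W, \{\sim_i\}_{i\in N}, V\rangle$ be an epistemic model and $\varphi$ an epistemic formula. The model updated by the public announcement of $\varphi$ is the structure $\mathcal{M}^\varphi = \langle W^\varphi, \{\sim_i^\varphi\}_{i\in N}, V^\varphi\rangle$ where $W^\varphi = \{w \in W \mid \mathcal{M}, w \models \varphi\}$, for each $i \in N$, $\sim_i^\varphi = \sim_i \cap (W^\varphi \times W^\varphi)$, and for all $p \in At$, $V^\varphi(p) = V(p) \cap W^\varphi$. Clearly, if $\mathcal{M}$ is an epistemic model then so is $\mathcal{M}^\varphi$. The two models describe two different moments in time, with $\mathcal{M}$ the current information state of the agents and $\mathcal{M}^\varphi$ the information state after the information that $\varphi$ is true has been incorporated in $\mathcal{M}$. This temporal dimension can be represented explicitly in our logical language: let $\mathcal{L}_{PAL}$ extend $\mathcal{L}_{EL}$ with expressions of the form $[\varphi]\psi$ with $\varphi \in \mathcal{L}_{EL}$. The intended interpretation of $[\varphi]\psi$ is "$\psi$ is true after the public announcement of $\varphi$" and truth is defined as
$\mathcal{M}, w \models [\varphi]\psi$ iff $\mathcal{M}, w \models \varphi$ implies $\mathcal{M}^\varphi, w \models \psi$
Now, in the earlier definition of public announcement, we can also allow formulas from the extended language $\mathcal{L}_{PAL}$: the recursion will be in harmony. As an illustration, a formula like $\neg K_i\psi \wedge [\varphi]K_i\psi$ says that "agent $i$ currently does not know $\psi$ but after the announcement of $\varphi$, agent $i$ knows $\psi$". So, the language $\mathcal{L}_{PAL}$ describes what is true both before and after the announcement while explicitly mentioning the informational event that achieved this.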
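Model restriction under announcement is easy to simulate when propositions are represented semantically, as sets of worlds. The following sketch makes that simplifying assumption; the four-world model, the agent name `i`, and the function names `K` and `announce` are hypothetical choices of our own.

```python
def K(rel, P, W):
    """Worlds of W where the agent with relation `rel` knows proposition P."""
    return {w for w in W if all(v in P for v in W if (w, v) in rel)}

def announce(W, rels, val, P):
    """Public announcement of P: keep only the P-worlds and restrict everything."""
    W2 = W & P
    rels2 = {i: {(u, v) for (u, v) in r if u in W2 and v in W2}
             for i, r in rels.items()}
    val2 = {p: ws & W2 for p, ws in val.items()}
    return W2, rels2, val2

# Hypothetical model: agent i cannot distinguish any of the four worlds.
W = {"w1", "w2", "w3", "w4"}
rels = {"i": {(u, v) for u in W for v in W}}
val = {"p": {"w1", "w2"}, "q": {"w1", "w2", "w3"}}

before = K(rels["i"], val["q"], W)                 # where i knows q initially
W2, rels2, val2 = announce(W, rels, val, val["p"])
after = K(rels2["i"], val2["q"], W2)               # where i knows q after announcing p
```

Initially `before` is empty, since world `w4` falsifies `q`; after the announcement of `p` the agent knows `q` at every remaining world, which is the pattern of "not knowing before, knowing after" described above.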
While this is a broad extension of traditional conceptions of logic, standard methods still apply. A fundamental insight is that there is a strong logical relationship between what is true before and after an announcement, in the form of so-called reduction axioms:
Theorem 2.4. On top of the static epistemic base logic, the following reduction axioms completely axiomatize the dynamic logic of public announcement:
• $[\varphi]p \leftrightarrow (\varphi \to p)$, for atomic $p$
• $[\varphi]\neg\psi \leftrightarrow (\varphi \to \neg[\varphi]\psi)$
• $[\varphi](\psi \wedge \chi) \leftrightarrow ([\varphi]\psi \wedge [\varphi]\chi)$
• $[\varphi]K_i\psi \leftrightarrow (\varphi \to K_i(\varphi \to [\varphi]\psi))$
Going from left to right, these axioms reduce syntactic complexity in a stepwise manner. This recursive style of analysis has set a model for the logical analysis of informational events generally. Thus, information dynamics and logic form a natural match.

Group Knowledge
Both game theorists and logicians have extensively studied a next phenomenon after the individual notions considered so far: group knowledge and belief.¹⁷ We assume that the reader is familiar with the relevant notions, recalling just the merest basics. For a start, the statement "everyone in the (finite) group $G \subseteq N$ knows $\varphi$" can be defined as follows: $E_G\varphi := \bigwedge_{i\in G} K_i\varphi$. In general, we need to add a new operator $C_G\varphi$ to the earlier epistemic language for this. It takes care of all iterations of knowledge modalities by inspecting all worlds reachable through finite sequences of epistemic accessibility links for arbitrary agents. Let $\mathcal{M} = \langle W, \{\sim_i\}_{i\in N}, V\rangle$ be an epistemic model, with $w \in W$. Truth of formulas of the form $C_G\varphi$ is defined by:
$\mathcal{M}, w \models C_G\varphi$ iff for all $v \in W$, if $w R^*_G v$ then $\mathcal{M}, v \models \varphi$
where $R^*_G := (\bigcup_{i\in G} \sim_i)^*$ is the reflexive transitive closure of $\bigcup_{i\in G} \sim_i$. As for valid laws of reasoning, the complete epistemic logic of common knowledge expresses principles of "reflective equilibrium", or mathematically, fixed-points:¹⁹
• Fixed-Point Axiom:
Studying group knowledge is just a half-way station to a more general move in current logics of agency. Common knowledge is a notion of group information that is definable in terms of what the individuals know about each other. But taking collective agents (a committee, a scientific research community) seriously as logical actors in their own right brings us beyond this reductionist perspective.
¹⁷ [39] and [40] provide an extensive discussion.
¹⁸ Cf. [42] for an alternative reconstruction.
Finally, what about dynamic logics for group modalities? Baltag, Moss and Solecki [44] proved that the extension of $\mathcal{L}_{EL}$ with common knowledge and public announcement operators is strictly more expressive than with common knowledge alone. Nonetheless, a technical reduction axiom-style recursive analysis is still possible, as carried out in [45].
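The difference between "everyone knows" and common knowledge shows up already in a three-world example. The sketch below computes common knowledge by reachability along the indistinguishability links of the group; propositions are sets of worlds, and the model, the agent names `a` and `b`, and the function names are hypothetical choices of our own.

```python
def equivalence(partition):
    """Equivalence relation (as a set of pairs) induced by a partition."""
    return {(u, v) for cell in partition for u in cell for v in cell}

def everyone_knows(W, rel, group, P):
    """Worlds where every agent in the group knows P."""
    return {w for w in W
            if all(v in P for i in group for (u, v) in rel[i] if u == w)}

def common_knowledge(W, rel, group, P):
    """Worlds from which every world reachable via group links satisfies P."""
    result = set()
    for w in W:
        seen, stack = {w}, [w]          # reflexive: w itself is reachable
        while stack:
            x = stack.pop()
            for i in group:
                for (u, v) in rel[i]:
                    if u == x and v not in seen:
                        seen.add(v)
                        stack.append(v)
        if seen <= P:
            result.add(w)
    return result

# Hypothetical model: a cannot tell 1 from 2, b cannot tell 2 from 3.
W = {1, 2, 3}
rel = {"a": equivalence([{1, 2}, {3}]), "b": equivalence([{1}, {2, 3}])}
P = {1, 2}
E = everyone_knows(W, rel, ["a", "b"], P)
C = common_knowledge(W, rel, ["a", "b"], P)
```

At world 1 both agents know `P`, so `E` contains 1, but `C` is empty: a considers world 2 possible, where b considers world 3 possible, and `P` fails there.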

Soft Information and Soft Announcements
But rational agents are not just devices that keep track of hard information, and produce indubitable knowledge all the time. What seems much more characteristic of intelligent behaviour, as has been pointed out by philosophers and psychologists alike, is our creative learning ability of having beliefs, perhaps based on soft information, that overshoot the realm of correctness. And the dynamics of that is found in our skills in revising those beliefs when they turn out to be wrong. Thus, the dynamics of "correction" is just as important to rational agency as that of "correctness".
Models of belief via plausibility. While there is an extensive literature on the theory of belief revision, starting with [46], truly logical models of the dynamics of beliefs, hard and soft information have only been developed recently. For a start, we need a static base, extending epistemic models with softer, revisable informational attitudes. One appealing approach is to endow epistemic ranges with a plausibility ordering for each agent: a pre-order (reflexive and transitive) $w \preceq_i v$ that says "player $i$ considers world $v$ at least as plausible as $w$." As a convenient notation, for $X \subseteq W$, we set $Min_{\preceq_i}(X) = \{v \in X \mid v \preceq_i w \text{ for all } w \in X\}$, the set of minimal elements of $X$ according to $\preceq_i$. The plausibility ordering $\preceq_i$ represents which possible worlds an agent considers more likely, encoding soft information. Such models have been used by logicians [35,47,48], game theorists [49], and computer scientists [50,51]:
¹⁹ Cf. [43] for an easy way of seeing why the next principles do the job.

Definition 2.5 (Epistemic-Doxastic Models). An epistemic-doxastic model is a tuple $\mathcal{M} = \langle W, \{\sim_i\}_{i\in N}, \{\preceq_i\}_{i\in N}, V\rangle$ where $\langle W, \{\sim_i\}_{i\in N}, V\rangle$ is an epistemic model and, for each $i \in N$, $\preceq_i$ is a well-founded²⁰ reflexive and transitive relation on $W$ satisfying, for all $w, v \in W$:
• plausibility implies possibility: if $w \preceq_i v$ then $w \sim_i v$.
• locally-connected: if $w \sim_i v$ then either $w \preceq_i v$ or $v \preceq_i w$.²¹
These richer models can define many basic soft informational attitudes:
• Belief: $\mathcal{M}, w \models B_i\varphi$ iff for all $v \in Min_{\preceq_i}([w]_i)$, $\mathcal{M}, v \models \varphi$, where $[w]_i$ is the equivalence class of $w$ under $\sim_i$.
This is the usual notion of belief, which satisfies standard properties. Safe belief, written $\Box_i\varphi$, is a stronger notion: $\varphi$ is safely believed if $\varphi$ is true in all states the agent considers more plausible. This stronger notion of belief has also been called certainty by some authors ([52], Section 13.7).²²
Soft attitudes in terms of information dynamics. As noted above, a crucial feature of soft informational attitudes is that they are defeasible in light of new evidence. In fact, we can characterize these attitudes in terms of the type of evidence which can prompt the agent to adjust them. To make this precise, consider the natural notion of a conditional belief in an epistemic-doxastic model $\mathcal{M}$. We say $i$ believes $\varphi$ given $\psi$, denoted $B_i^\psi\varphi$, if
$\mathcal{M}, w \models B_i^\psi\varphi$ iff for all $v \in Min_{\preceq_i}([w]_i \cap \{u \mid \mathcal{M}, u \models \psi\})$, $\mathcal{M}, v \models \varphi$
So, $B_i^\psi$ encodes what agent $i$ will believe upon receiving (possibly misleading) evidence that $\psi$ is true.²³ Unlike beliefs, conditional beliefs may be inconsistent (i.e., $B_i^\psi\bot$ may be true at some state). In such a case, agent $i$ cannot (on pain of inconsistency) revise by $\psi$, but this will only happen if the agent has hard information that $\psi$ is false. Indeed, $K_i\neg\varphi$ is logically equivalent to $B_i^\varphi\bot$ over the class of epistemic-doxastic models. This suggests the following dynamic characterization of hard information as unrevisable belief:
• $\mathcal{M}, w \models K_i\varphi$ iff $\mathcal{M}, w \models B_i^\psi\varphi$ for all $\psi$.
Safe belief can be similarly characterized by restricting the admissible evidence:
• $\mathcal{M}, w \models \Box_i\varphi$ iff $\mathcal{M}, w \models B_i^\psi\varphi$ for all $\psi$ with $\mathcal{M}, w \models \psi$,
i.e., $i$ safely believes $\varphi$ iff $i$ continues to believe $\varphi$ given any true formula.
Baltag and Smets [55] give an elegant logical characterization of all these notions by adding the safe belief modality $\Box_i$ to the epistemic language $\mathcal{L}_{EL}$.
²⁰ Well-foundedness is only needed to ensure that for any set $X$, $Min_{\preceq_i}(X)$ is nonempty. This is important only when $W$ is infinite, and there are ways around this in current logics. Moreover, the condition of connectedness can also be lifted, but we use it here for convenience.
²¹ We can even prove the following equivalence: where $[w]_i$ is the equivalence class of $w$ under $\sim_i$. This has been studied by [53,54].
²³ We can define belief $B_i\varphi$ as $B_i^\top\varphi$: belief in $\varphi$ given a tautology.
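The soft attitudes discussed here can be illustrated with a single agent whose plausibility order is given by a numerical rank. This is a simplifying assumption of our own (lower rank means more plausible, which yields a connected, well-founded order), as are the three-world model and the function names; propositions are again sets of worlds.

```python
def most_plausible(X, rank):
    """The most plausible worlds of X (lowest rank); empty if X is empty."""
    if not X:
        return set()
    best = min(rank[w] for w in X)
    return {w for w in X if rank[w] == best}

def believes(w, cell, rank, phi, given=None):
    """Belief in phi at w, conditional on `given` when it is supplied."""
    X = cell(w) if given is None else cell(w) & given
    return most_plausible(X, rank) <= phi

def safely_believes(w, cell, rank, phi):
    """phi holds in all epistemically possible worlds at least as plausible as w."""
    return {v for v in cell(w) if rank[v] <= rank[w]} <= phi

def knows(w, cell, phi):
    """Hard information: phi holds throughout the agent's information cell."""
    return cell(w) <= phi

# Hypothetical model: one information cell, w1 most plausible, w3 least.
W = {"w1", "w2", "w3"}
rank = {"w1": 0, "w2": 1, "w3": 2}

def cell(w):
    return W

p, q = {"w1", "w2"}, {"w1"}
not_q = W - q
```

In this model the agent believes `q` everywhere, but gives it up when conditioning on `not_q`; at `w2` the belief in `q` is therefore not safe, since `not_q` is true there. Conditioning on a proposition that is false throughout the cell makes the conditional belief inconsistent, in line with the equivalence between hard information and inconsistent conditional belief noted above.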
Belief change under hard information. Let us now turn to the systematic logical issue of how beliefs change under new hard information, i.e., the logical laws governing $[!\varphi]B_i\psi$. One might think this is taken care of by conditional belief $B_i^{\varphi}\psi$, and indeed it is when ψ is a ground formula not containing any modal operators. But in general, the two are different.
[Figure: epistemic-doxastic model for agents 1 and 2]
In this model, the solid lines represent agent 2's hard and soft information (the box is 2's hard information $\sim_2$ and the arrow represents 2's soft information $\preceq_2$) while the dashed lines represent 1's hard and soft information. Reflexive arrows are not drawn, to keep down the clutter in the picture. Note that at state $w_1$, agent 2 knows p and q (e.g., $w_1 \models K_2(p \wedge q)$), and agent 1 believes p but not q ($w_1 \models B_1 p \wedge \neg B_1 q$). Now, although agent 1 does not know that agent 2 knows p, agent 1 does believe that agent 2 believes q ($w_1 \models B_1 B_2 q$). Furthermore, agent 1 maintains this belief conditional on p: $w_1 \models B_1^{p} B_2 q$. However, publicly announcing the true fact p removes state $w_3$, and so we have $w_1 \models [!p]\neg B_1 B_2 q$. Thus a belief in ψ conditional on ϕ is not the same as a belief in ψ after the public announcement of ϕ. The reader is invited to check that $B_i^{p}(p \wedge \neg K_i p)$ is satisfiable but $[!p]B_i(p \wedge \neg K_i p)$ is not satisfiable. (fn. 24)
Footnote 24: The example is also interesting as the announcement of a true fact misleads agent 1 by forcing her to drop her belief that agent 2 believes q ([33], p. 182).
Despite these intricacies, the logical situation is clear: The dynamic logic of changes in absolute and conditional beliefs under public announcement is completely axiomatizable by means of the static base logic of belief over plausibility models plus the following complete reduction axiom:
$[!\varphi]B_i^{\psi}\chi \leftrightarrow (\varphi \rightarrow B_i^{\varphi \wedge [!\varphi]\psi}[!\varphi]\chi)$
Belief change under soft information. Public announcement assumes that agents treat the source of the incoming information as infallible. But in many scenarios, agents trust the source of the information only up to a point. This calls for softer announcements, which can also be brought under our framework. We only make some introductory remarks: see ([33], Chapter 7) and [55] for more extensive discussion.
How can we incorporate less-than-conclusive evidence that ϕ is true into an epistemic-doxastic model M? Eliminating worlds is too radical for that: it makes all updates irreversible. What we need for a soft announcement of a formula ϕ is thus not to eliminate worlds altogether, but rather to modify the plausibility ordering that represents an agent's current hard and soft information state. The goal is to rearrange all states in such a way that ϕ is believed, and perhaps other desiderata are met. There are many "policies" for doing this [57], but here we only mention two that have been widely discussed in the literature on belief revision. The following picture illustrates the setting: Suppose the agent considers all states in C at least as plausible as all states in A ∪ D, which she, in turn, considers at least as plausible as all states in B ∪ E. If the agent gets evidence in favor of ϕ from a source that she barely trusts, how is she to update her plausibility ordering?
Perhaps the most ubiquitous policy is conservative upgrade, which lets the agent only tentatively accept the incoming information ϕ by making the best ϕ-worlds the new minimal set and keeping the old plausibility ordering the same on all other worlds. In the above picture, a conservative upgrade with ϕ moves only the best ϕ-worlds to the top of the new ordering. The general logical idea here is this: "plausibility upgrade is model reordering". (fn. 25) This view can be axiomatized in a dynamic logic in the same style as we did with earlier scenarios (see [33], Chapter 7 for details).
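In what follows, we will focus on a more radical policy for belief upgrade, between the soft conservative upgrade and hard public announcements. The idea behind such radical upgrade is to move all ϕ-worlds ahead of all other worlds, while keeping the order inside these two zones the same. In the picture above, a radical upgrade by ϕ would put all ϕ-states ahead of all ¬ϕ-states. The precise definition of radical upgrades goes as follows. Let $[\![\varphi]\!]_i^w = \{x \mid M, x \models \varphi\} \cap [w]_i$ (where $[w]_i$ is the equivalence class of w under $\sim_i$) denote this set of ϕ-worlds:
Definition 2.7 (Radical Upgrade). Given an epistemic-doxastic model $M = \langle W, \{\sim_i\}_{i \in N}, \{\preceq_i\}_{i \in N}, V \rangle$ and a formula ϕ, the radical upgrade of M with ϕ is the model $M^{\Uparrow\varphi} = \langle W^{\Uparrow\varphi}, \{\sim_i^{\Uparrow\varphi}\}_{i \in N}, \{\preceq_i^{\Uparrow\varphi}\}_{i \in N}, V^{\Uparrow\varphi} \rangle$ with $W^{\Uparrow\varphi} = W$, $\sim_i^{\Uparrow\varphi} = \sim_i$ for each i, $V^{\Uparrow\varphi} = V$, and finally, for all $i \in N$ and $w \in W^{\Uparrow\varphi}$: every world in $[\![\varphi]\!]_i^w$ is strictly more plausible than every world in $[\![\neg\varphi]\!]_i^w$, while $\preceq_i^{\Uparrow\varphi}$ agrees with $\preceq_i$ inside each of these two sets.
A logical analysis of this type of information change uses modalities $[\Uparrow_i\varphi]\psi$ meaning "after i's radical upgrade of ϕ, ψ is true", interpreted as follows: $M, w \models [\Uparrow_i\varphi]\psi$ iff $M^{\Uparrow\varphi}, w \models \psi$. (fn. 26)
Here is how belief revision under soft information can be treated:
Theorem 2.8. The dynamic logic of radical upgrade is completely axiomatized by the complete static epistemic-doxastic base logic plus, essentially, a recursion axiom for conditional beliefs.
This result is from [58], and its proof shows how revision policies as plausibility transformations really give agents not just new beliefs, but also new conditional beliefs, a point sometimes overlooked in the literature.
Footnote 25: The most general dynamic point is this: "Information update is model transformation".
Footnote 26: Conservative upgrade is the special case of radical upgrade with the modal formula $best_i(\varphi, w)$.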
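The two upgrade policies can be illustrated with a small Python sketch. It assumes a single agent and a single information cell, and it represents the plausibility order as a list of layers running from most to least plausible. The representation, the function names, and the choice of ϕ as the set {a, b} are illustrative assumptions, not taken from the definitions above.

```python
from typing import List, Set

Layers = List[Set[str]]  # index 0 holds the most plausible worlds


def radical_upgrade(layers: Layers, phi: Set[str]) -> Layers:
    """Move all phi-worlds ahead of all other worlds, keeping the old order inside each zone."""
    inside = [layer & phi for layer in layers if layer & phi]
    outside = [layer - phi for layer in layers if layer - phi]
    return inside + outside


def conservative_upgrade(layers: Layers, phi: Set[str]) -> Layers:
    """Promote only the best phi-worlds; every other world keeps its old relative position."""
    best = next((layer & phi for layer in layers if layer & phi), set())
    if not best:  # no phi-world at all: nothing to promote
        return [set(layer) for layer in layers]
    return [best] + [layer - best for layer in layers if layer - best]


def believes(layers: Layers, prop: Set[str]) -> bool:
    """Plain belief: prop holds at all most plausible worlds."""
    return layers[0] <= prop


# Zones ordered as in the text: C, then A and D, then B and E.
# One world per zone is a simplifying assumption.
order: Layers = [{"c"}, {"a", "d"}, {"b", "e"}]
phi = {"a", "b"}  # hypothetical extension of phi

radical = radical_upgrade(order, phi)            # a, b, c, d, e
conservative = conservative_upgrade(order, phi)  # a, c, d, then b and e together
```

Both policies make ϕ believed, but only the radical one also places the second-best ϕ-world b ahead of the non-ϕ worlds, which is what makes the resulting conditional beliefs differ.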

The General Logical Dynamics Program
Our logical treatment of update with hard and soft information reflects a general methodology, central to the Theory of Play that we advocate here. Information dynamics is about steps of model transformation, either in their universe of worlds, or their relational structure, or both.
Other dynamic actions and events. These methods work much more generally than we are able to show here, including for model update with information that may be partly private, but also for various other relevant actions, such as inference manipulating finer syntactic information, or questions modifying a current agenda of issues for investigation. These methods even extend beyond the agents' informational attitudes, for instance to the dynamics of preferences expressing their "evaluation" of the world. (fn. 27)
From local to global dynamics. One further important issue is this. Most information flow only makes sense in a longer-term temporal setting, where agents can pursue goals and engage in strategic interaction. This is the realm of epistemic-doxastic temporal logics that describe a "Grand Stage" of histories unfolding over time. By now, there are several studies linking the dynamic logics of local informational steps that we have emphasized with abstract long-term temporal logics. We refer to [33,59] for these new developments, which are leading to complete logics of information dynamics with "protocols" and what may be called procedural information that agents have about the process they are in. Obviously, this perspective is very congenial to extensive games, and in the rest of this paper, it will return in many places, though always concretely. (fn. 28)

Long-term Information Dynamics
We now discuss a first round of applications of the main components of the Theory of Play outlined in the previous sections. We leave aside games for the moment, and concentrate on the dynamics of information in interaction. These applications have in common that they use single update steps, but then iterate them, according to what might be called "protocols" for conversation, learning, or other relevant processes. It is the resulting limit behavior that will mainly occupy us in this section.
We first consider agreement theorems, well known to game theorists, showing how repeated conditioning and public announcements lead to consensus in the limit. This opens the door to a general analysis of fixed-points of repeated attitude changes, raising new questions for logic as well as for interactive epistemology. Next we discuss underlying logical issues, including extensions to scenarios of belief merge and formation of group preferences in the limit. Finally we return to a concrete illustration, viz. learning scenarios, a fairly recent chapter in logical dynamics, at the intersection of logic, epistemology, and game theory.

Agreement Dynamics
Agreement Theorems, introduced in [60], show that common knowledge of disagreement about posterior beliefs is impossible given a common prior. Various generalizations have been given to other informational attitudes, such as probabilistic common belief [61] and qualitative non-negatively introspective "knowledge" [62]. These results naturally suggest dynamic scenarios, and indeed [63] have shown that agreement can be dynamically reached by repeated Bayesian conditioning, given common prior beliefs.
The logical tools introduced above provide a unifying framework for these various generalizations, and allow us to extend them to other informational attitudes. For the sake of conciseness, we will not cover static agreement results in this paper. The interested reader can consult [64,65].
For a start, we will focus on a comparison between agreements reached via conditioning and via public announcements, reporting the work of [65]. In the next section, we show how generalized scenarios of this sort can also deal with softer forms of information change, allowing for diversity in update policies within groups.
Repeated Conditioning Leads to Agreements. The following example, inspired by a recent Hollywood production, illustrates how agreements are reached by repeated belief conditioning: Cobb needs to convince Mal, otherwise dreadful consequences will ensue. For the sake of the example, let us assume that Cobb knows they are not dreaming, but Mal mistakenly believes that they are: state $w_1$ in Figure 1. The solid and dashed rectangles represent, respectively, Cobb's and Mal's hard information. The arrow is their common plausibility ordering.
With some thinking, Mal can come to agree with Cobb. The general procedure for achieving this goes as follows: A sequence of simultaneous belief conditioning acts starts with the agents' simple belief about ϕ, i.e., for all i, the first element $B_{1,i}$ in the sequence is $B_i\varphi$ if $M, w \models B_i\varphi$, and $\neg B_i\varphi$ otherwise. Agent i's beliefs about ϕ at a successor stage are defined by taking her beliefs about ϕ, conditional upon learning the others' belief about ϕ at that stage. Formally, for two agents i, j: $B_{n+1,i}$ is $B_i^{B_{n,j}}\varphi$ if $M, w \models B_i^{B_{n,j}}\varphi$, and $\neg B_i^{B_{n,j}}\varphi$ otherwise. (fn. 29)
Following the zones marked with an arc in Figure 1, the reader can check that, at $w_1$, Mal needs three rounds of conditioning to switch her belief about their waking, and thus reach an agreement with Cobb. Her belief stays the same upon learning that Cobb believes that they are not dreaming. Let us call this fact ϕ. The turning point occurs when she learns that Cobb would not change his mind even if he were to learn ϕ. Conditional on this, she now believes that they are indeed not dreaming. Note that Cobb's beliefs stay unchanged throughout, since he knows the true state at the outset.
Iterated conditioning thus leads to agreement, given common priors. Indeed, conditioning induces a decreasing map from subsets to subsets, which guarantees the existence of a fixed point, where all agents' conditional beliefs stabilize. Once the agents have reached this fixed-point, they have eliminated all higher-order uncertainties concerning the posterior beliefs about ϕ of the others. Their posterior beliefs are now common knowledge:
Theorem ([65]). At the fixed-point n of a sequence of simultaneous conditioning acts on ϕ, for all w ∈ W and i ∈ I, agent i's posterior belief about ϕ is common knowledge at w.
The reader accustomed to static agreement theorems will see that we are now only a small step away from concluding that sequences of simultaneous conditionings lead to agreements, as is indeed the case in our example. Since common prior and common belief of posteriors suffice for agreement, we get:
Corollary 3.3. Take any sequence of conditioning acts for a formula ϕ, as defined above, in a finite model with common prior. At the fixed point of this sequence, either all agents believe ϕ or they all don't believe ϕ.
This recasts, in our logical framework, the result of [63], showing how "dialogs" lead to agreements. Still, belief conditioning has a somewhat private character. (fn. 30) In the example above, Cobb remains painfully uncertain of Mal's thinking process until he sees her changing her mind, that is, until she makes the last step of conditioning. Luckily for Cobb, they can do better, as we will now proceed to show.
Repeated Public Announcements Lead to Agreements. Figure 2 shows another scenario, where Cobb and Mal publicly and repeatedly announce their beliefs at $w_1$. They keep announcing the same thing, but each time, this induces important changes in both agents' higher-order information. Mal is led stepwise to realize that they are not dreaming, and crucially, Cobb also knows that Mal receives and processes this information. As the reader can check, at each step in the process, Mal's beliefs are common knowledge.
Once again, Figure 2 exemplifies a general fact. We first define a dialogue about ϕ as a sequence of public announcements. Let M, w be a finite pointed epistemic-doxastic model. (fn. 31) Now let $B^w_{1,i}$, i's original belief state at w, be $B_i\varphi$ if this formula holds at w, and $\neg B_i\varphi$ otherwise. Agent i's n + 1 belief state, written $B^w_{n+1,i}$, is defined as $[!\bigwedge_{j \in I} B^w_{n,j}]B_i\varphi$ if $M, w \models [!\bigwedge_{j \in I} B^w_{n,j}]B_i\varphi$, and as $[!\bigwedge_{j \in I} B^w_{n,j}]\neg B_i\varphi$ otherwise. Intuitively, a dialogue about ϕ is a process in which all agents in a group publicly and repeatedly announce their posterior beliefs about ϕ, while updating with the information received in each round.
In dialogues, just like with belief conditioning, iterated public announcements induce decreasing maps between epistemic-doxastic models, and thus are bound to reach a fixed point, where no further discussion is needed. At this point, the protagonists are guaranteed to have reached consensus:
Theorem 3.4 ([65]). At the fixed-point $M_n, w$ of a public dialogue about ϕ among agents in a group I, the agents' belief states about ϕ are common knowledge.
Corollary ([65]). For any public dialogue about ϕ, if there is a common prior that is a well-founded plausibility order, then at the fixed-point $M_n, w$, either all agents believe ϕ or all do not believe ϕ.
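A public dialogue of this kind can be sketched in a few lines of Python. The three-world model below is a hypothetical stand-in for the Cobb and Mal scenario (it is not the model of Figure 2): each agent has a partition for hard information, both share one plausibility ranking as common prior, and each round publicly announces every agent's current belief or disbelief in ϕ.

```python
from typing import Dict, List, Set, Tuple

Cells = Dict[str, Set[str]]  # world -> information cell of one agent


def believes(cells: Cells, rank: Dict[str, int], phi: Set[str],
             domain: Set[str], world: str) -> bool:
    """phi holds at the most plausible worlds of the agent's current cell."""
    cell = cells[world] & domain
    best = min(rank[v] for v in cell)
    return all(v in phi for v in cell if rank[v] == best)


def dialogue(agents: Dict[str, Cells], rank: Dict[str, int], phi: Set[str],
             worlds: Set[str], actual: str) -> Tuple[Set[str], List[Dict[str, bool]]]:
    """Iterate joint public announcements of the agents' beliefs about phi."""
    domain, history = set(worlds), []
    while True:
        profile = {i: believes(c, rank, phi, domain, actual) for i, c in agents.items()}
        history.append(profile)
        # Announcing the profile keeps exactly the worlds where it is true.
        kept = {v for v in domain
                if all(believes(c, rank, phi, domain, v) == profile[i]
                       for i, c in agents.items())}
        if kept == domain:  # fixed point: nothing left to learn
            return domain, history
        domain = kept


# Hypothetical model: w1 is the actual world, phi says "we are not dreaming".
worlds = {"w1", "w2", "w3"}
phi = {"w1", "w3"}
rank = {"w3": 0, "w2": 1, "w1": 2}  # common prior, 0 = most plausible
agents = {
    "Cobb": {"w1": {"w1"}, "w2": {"w2", "w3"}, "w3": {"w2", "w3"}},
    "Mal": {"w1": {"w1", "w2"}, "w2": {"w1", "w2"}, "w3": {"w3"}},
}
final, history = dialogue(agents, rank, phi, worlds, "w1")
```

In this toy model Mal's first-order belief only flips in the third round, after two announcements that change nothing but higher-order information, and the fixed point contains just the actual world.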
As noted in the literature [63,64], the preceding dynamics of agreement is one of higher-order information. In the examples above, Mal's information about the ground facts of dreaming or not dreaming does not change until the very last round of conditioning or public announcement. The information she gets by learning about Cobb's beliefs affects her higher-order beliefs, i.e., what she believes about Cobb's information. This importance of higher-order information flow is a general phenomenon, well-known to epistemic game theorists, which the present logical perspective treats in a unifying way.
Agreements and Dynamics: Further Issues.
Here are a few points about the preceding scenarios that invite generalization. Classical agreement results require the agents to be "like-minded" [66]. Our analysis of agreement in dynamic-epistemic logic reveals that this like-mindedness extends beyond the common prior assumption: it also requires the agents to process the information they receive in the same way. (fn. 32) One can easily find counter-examples to the agreement theorems when the update rule is not the same for all agents. Indeed, the issue of "agent diversity" is largely unexplored in our logics (but see [12] for an exception).
A final point is this. While agreement scenarios seem special, to us they demonstrate a general topic, viz. how different parties in a conversation, say a "Skeptic" and an ordinary person, can modify their positions interactively. In the epistemological literature, this dynamic conversational feature has been neglected, and the above, though not solving things in a general way, at least suggests that there might be interesting structure here of epistemological interest.

Logical Issues about Hard and Soft Limit Behavior
One virtue of our logical perspective is that we can study the above limit phenomena in much greater generality.
Hard information. For a start, for purely logical reasons, iterated public announcement of any formula ϕ in a model M must stop at a limit model lim(M, ϕ) where ϕ has either become true throughout (it has become common knowledge), or its negation is true throughout. (fn. 33) This raises an intriguing open model-theoretic problem of telling, purely from syntactic form, when a given formula is uniformly "self-fulfilling" (the case where common knowledge is reached), or when "self-refuting" (the case where common knowledge is reached of the negation). Game-theoretic assertions of rationality tend to be self-fulfilling, as we shall see in Section 4 below. But there is no stigma attached to the self-refuting case: e.g., the ignorance assertion in the famous Muddy Children puzzle is self-refuting in the limit. Thus, behind our single scenarios, there is a whole area of limit phenomena that have not yet been studied systematically in epistemic logic. (fn. 34)
In addition to definability, there is complexity and proof. Van Benthem [4] shows how announcement limit submodels can be defined in various known epistemic fixed-point logics, depending on the syntactic shape of ϕ. Sometimes the resulting formalisms are decidable, e.g., when the driving assertion ϕ has "existential positive form", as in the mentioned Muddy Children puzzle, or simple rationality assertions in games.
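The self-refuting case can be made concrete with a short Python sketch of the Muddy Children puzzle with three children. The encoding of worlds as bit tuples, the function names, and the uniform (model-wide) way of iterating the announcement are assumptions made for illustration. The assertion announced in each round is the ignorance statement "no child knows her own status".

```python
from itertools import product
from typing import Callable, Set, Tuple

World = Tuple[int, ...]  # world[i] == 1 means child i is muddy


def knows_own_status(world: World, i: int, domain: Set[World]) -> bool:
    """Child i sees all others, so her alternatives differ from `world` at most at position i."""
    own = {v[i] for v in domain if v[:i] + v[i + 1:] == world[:i] + world[i + 1:]}
    return len(own) == 1


def nobody_knows(world: World, domain: Set[World]) -> bool:
    return not any(knows_own_status(world, i, domain) for i in range(len(world)))


def announcement_limit(domain: Set[World],
                       assertion: Callable[[World, Set[World]], bool]):
    """Announce `assertion` repeatedly until it holds everywhere or nowhere."""
    domain, rounds = set(domain), 0
    while True:
        kept = {w for w in domain if assertion(w, domain)}
        if kept == domain:
            return domain, "self-fulfilling", rounds
        if not kept:
            return domain, "self-refuting", rounds
        domain, rounds = kept, rounds + 1


# After the father's statement, the world where nobody is muddy is excluded.
start = set(product((0, 1), repeat=3)) - {(0, 0, 0)}
limit, verdict, rounds = announcement_limit(start, nobody_knows)
```

Here the limit model is the single world where all three children are muddy, reached after two announcements, and the ignorance assertion is false there.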
But these scenarios are still quite special, in that the same assertion gets repeated. There is a large variety of further long-term scenarios in the dynamic logic literature, starting from the "Tell All" protocols in [69][70][71] where agents tell each other all they know at each stage, turning the initial distributed knowledge of the group into explicit common knowledge.
Soft information. In addition to the limit dynamics of knowledge under hard information, there is the limit behavior of belief, making for more realistic dialog scenarios. This allows for more interesting phenomena in the earlier update sequences. An example is iterated hard information dovetailing agents' opinions, flipping sides in the disagreement until the very last steps of the dialogue (cf. [33] and [72], pp. 110-111). Such disagreement flips can occur until late in the exchange, but as we saw above, they are bound to stop at some point.
Footnote 32: Thanks to Alexandru Baltag for pointing out this feature to us.
Footnote 33: We omit some details with pushing the process through infinite ordinals. The final stage is discussed further in terms of "redundant assertions" in [67].
Footnote 34: Even in the single-step case, characterizing "self-fulfilling" public announcements has turned out quite involved [68].
All these phenomena get even more interesting mathematically with dialogs involving soft announcements $[\Uparrow\varphi]$, when limit behavior can be much more complex, as we will see in the next section. Some relevant observations can be found in [71], and in Section 4 below. First, there need not be convergence at all; the process can oscillate:
Example 3.6. Suppose that ϕ is the formula $r \vee (B^{\neg r}q \wedge p) \vee (B^{\neg r}p \wedge q)$ and consider the one-agent epistemic-doxastic models pictured below.
[Figure: one-agent epistemic-doxastic models for Example 3.6]
In line with this, players' conditional beliefs may keep changing along the stages of an infinite dialog. (fn. 35) But still, there is often convergence at the level of agents' absolute factual beliefs about what the world is like. Indeed, here is a result from [71]:
Theorem 3.7. Every iterated sequence of truthful radical upgrades stabilizes all simple non-conditional beliefs in the limit.
Belief and Preference Merge. Finally, we point at some further aspects of the topics raised here. Integrating agents' orderings through some prescribed process has many similarities with other areas of research. One is belief merge, where groups of agents try to arrive at a shared group plausibility order, either as a way of replacing individual orders, or as a way of creating a further group agent that is a most reasonable amalgam of the separate components. And this scenario is again much like those of social choice theory, where individual agents have to aggregate preference orders into some optimal public ordering. This naturally involves dynamic analysis of the processes of deliberation that lead to the eventual act of voting. (fn. 36) Thus, the technical issues raised in this section have much wider impact. We may be seeing the contours of a systematic logical study of conversation, deliberation and related social processes.
Footnote 35: Infinite iteration of plausibility reordering is in general a non-monotonic process, closer to theories of truth revision in the philosophical literature [73,74]. The technical theory developed on the latter topic in the 1980s may be relevant to our concerns here [75].
Footnote 36: Van Benthem [33], Chapter 12, elaborates this connection in more technical detail.

Learning
We conclude this section with one concrete setting where many of the earlier themes come together, viz. formal learning theory: see [76][77][78]. The paradigm we have in mind is identification in the limit of correct hypotheses about the world (cf. [79] on language learning), though formal learning theory in epistemology has also studied concrete learning algorithms for inquiry of various sorts.
The learning setting shows striking analogies with the dynamic-epistemic logics that we have presented in this paper. What follows is a brief summary of recent work in [80,81], to show how our logics link up with learning theory. For broader philosophical backgrounds in epistemology, we refer to [82].
The basic scenario of formal learning theory is one of an agent trying to formulate correct and informative hypotheses about the world, on the basis of an input stream of evidence (in general, an infinite history) whose totality describes what the world is like. At each finite stage of such a sequence, an agent outputs a current hypothesis about the world, which can be modified as new evidence comes in. Success of such a learning function in recognition can be of two kinds: either a correct hypothesis is identified uniformly on all histories by some finite stage (the strong notion of "finite identifiability"), or more weakly, each history reaches a point where a correct hypothesis is stated, but when that happens may vary according to the history ("identifiability in the limit"). There is a rich mathematical theory of learning functions and what classes of hypotheses can, and cannot, be described by them.
Now, it is not hard to recognize many features here of the logical dynamics that we have discussed. The learning function outputs beliefs, which get revised as new hard information comes in (we think of the observation of the evidence stream as a totally reliable process). Indeed, it is possible to make very precise connections here. We can take the possible hypotheses as our possible worlds, each of which allows those evidence streams (histories of investigation) that satisfy that hypothesis. Then observing successive pieces of evidence is a form of public announcement allowing us to prune the space of worlds. The beliefs involved can be modeled as we did before, by a plausibility ordering on the set of worlds for the agent, which may be modified by successive observations.
On the basis of this simple analogy, [83] prove results like the following, making connections very tight:
Theorem 3.8. Public announcement-style eliminative update is a universal method: for any learning function, there exists a plausibility order that encodes the successive learning states as current beliefs. The same is true, taking observations as events of soft information, for radical upgrade of plausibility orders.
Theorem 3.9. When evidence streams may contain a finite number of errors, public announcement-style update is no longer a universal learning mechanism, but radical upgrade still is.
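The analogy between learning and eliminative update can be illustrated with a toy Python sketch. It is not the construction behind the theorems of [83]: the three finite hypotheses, the prior ordering, and the function name are hypothetical. Hypotheses play the role of worlds, each datum acts like a public announcement that removes incompatible hypotheses, and the learner's conjecture is the most plausible hypothesis still standing.

```python
from typing import Dict, FrozenSet, List, Optional, Sequence


def conjectures(hypotheses: Dict[str, FrozenSet[int]],
                prior: Sequence[str],
                stream: Sequence[int]) -> List[Optional[str]]:
    """Return the learner's belief after each datum of the evidence stream."""
    alive = list(prior)  # surviving hypotheses, most plausible first
    out: List[Optional[str]] = []
    for datum in stream:
        # Eliminative update: drop every hypothesis the datum rules out.
        alive = [h for h in alive if datum in hypotheses[h]]
        out.append(alive[0] if alive else None)
    return out


# Hypothetical hypothesis space: each hypothesis is the set of data it allows.
hypotheses = {
    "H1": frozenset({0}),
    "H2": frozenset({0, 1}),
    "H3": frozenset({0, 1, 2}),
}
prior = ["H1", "H2", "H3"]  # smaller hypotheses are considered more plausible

beliefs = conjectures(hypotheses, prior, [0, 1, 0, 2, 1])
```

On this stream the conjecture moves from H1 to H2 and then settles on H3. Because elimination is irreversible, a single erroneous datum could remove the correct hypothesis for good, which is the intuition behind preferring a reordering policy when evidence may contain errors.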
With these bridges in place, one can also introduce logical languages in the learning-theoretic universe. [80] show how many notions in learning theory then become expressible in dynamic-epistemic or epistemic-temporal languages, say convergence in the limit as necessary future truth of knowledge of a correct hypothesis about the world. (fn. 37) Thus, we seem to be witnessing the beginning of merges between dynamic logic, belief revision theory and learning theory. Such combinations of dynamic epistemic logic and learning theory also invite comparison with game theory. Learning, for instance, to coordinate on a Nash equilibrium in repeated games, has been extensively studied, with many positive and negative results; see, for example, [84]. (fn. 38)
This concludes our exploration of long-term information dynamics in our logical setting. We have definitely not exhausted all possible connections, but we hope to have shown how a general Theory of Play fits in naturally with many different areas, providing a common language between them.

Solution Dynamics on Extensive Games
We now return to game theory proper, and bring our dynamic logic perspective to bear on an earlier benchmark example: Backwards Induction. This topic has been well-discussed already by eminent authors, but we hope to add a number of new twists suggesting broader ramifications in the study of agency.
In the light of logical dynamics, the main interest of a solution concept is not its "outcome", its set of strategy profiles, but rather its "process", the way in which these outcomes are reached. Rationality seems largely a feature of procedures we follow, and our dynamic logics are well-suited to focus on that.

First Scenario: Iterated Announcement of Rationality
Here is a procedural line on Backwards Induction as a rational process. We can take BI to be a process of prior off-line deliberation about a game by players whose minds proceed in harmony, though they need not communicate in reality. The treatment that follows was proposed by [22] (which mainly deals with strategic games), and studied in much greater detail by [85].
As we saw in Section 3, public announcements saying that some proposition ϕ is true transform an epistemic model M into its submodel $M|\varphi$ whose domain consists of just those worlds in M that satisfy ϕ. Now the driving assertion for the Backwards Induction procedure is the following. It states essentially the notion of Rationality discussed in our static analysis of Section 1. As before, at a turn for player i, a move a is dominated by a sibling b (a move available at the same node) if every history through a ends worse, in terms of i's preference, than every history through b:
(rat) "at the current node, no player ever chose a strictly dominated move coming here"
This makes an informative assertion about nodes in a game tree, which can be true or false. Thus, announcing this formula rat as a fact about the players will in general make the current game tree smaller. But then we get a dynamics of iteration as in our scenarios of Section 3. In the new smaller game tree, new nodes may become dominated, and hence announcing rat again (saying that it still holds after this round of deliberation) makes sense, and so on. As we have seen, this process must reach a limit:
Example [Solving games through iterated assertions of Rationality.] Consider a game with three turns, four branches, and pay-offs for A, E in that order:
[Figure: game tree and its successive reductions under announcements of rat]
We see how the BI solution emerges from the given game step by step. The general result follows from a simple correspondence between subrelations of the total move relation and sets of nodes ([85] has a precise proof with full details):
Theorem 4.1. In any game tree M, the model $(\mathit{rat}, M)^{\#}$ is the actual subtree computed by the BI procedure.
The logical background here is just as we have seen earlier in our epistemic announcement dynamics. The actual BI play is the limit sub-model, where rat holds throughout. In terms of our earlier distinction, this means that Rationality is a "self-fulfilling" proposition: its announcement eventually makes it true everywhere, and hence common knowledge of rationality emerges in the process. Thus, the algorithmic definition of the BI procedure in Section 1 and our iterated announcement scenario amount to the same thing. One might say then that our deliberation scenario is just a way of "conversationalizing" a mathematical fixed-point computation. Still, it is of independent interest. Viewing a game tree as a logical model, we see how repeated announcement of Rationality eventually makes this property true throughout the remaining model: it has made itself into common knowledge.
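The iterated announcement of rat can be sketched in Python as repeated pruning of a game tree. The nested-dictionary encoding, the helper names, and the concrete pay-offs are assumptions made for illustration (the tree has three turns and four branches, but it is not claimed to be the one in the lost figure). A move is removed when every history through it ends strictly worse for the active player than every history through some sibling.

```python
from typing import Any, Dict, List, Tuple

Tree = Dict[str, Any]


def leaf(a: int, e: int) -> Tree:
    return {"payoff": {"A": a, "E": e}}


def node(player: str, **moves: Tree) -> Tree:
    return {"player": player, "moves": moves}


def outcomes(tree: Tree) -> List[Dict[str, int]]:
    """Pay-offs of all histories running through this node."""
    if "payoff" in tree:
        return [tree["payoff"]]
    return [p for child in tree["moves"].values() for p in outcomes(child)]


def dominated(tree: Tree, move: str) -> bool:
    """Every history through `move` is worse than every history through some sibling."""
    i = tree["player"]
    best_here = max(p[i] for p in outcomes(tree["moves"][move]))
    return any(best_here < min(p[i] for p in outcomes(child))
               for other, child in tree["moves"].items() if other != move)


def announce_rat(tree: Tree) -> Tree:
    """One announcement: drop all nodes reached via a move dominated in the current tree."""
    if "payoff" in tree:
        return tree
    return {"player": tree["player"],
            "moves": {m: announce_rat(child)
                      for m, child in tree["moves"].items() if not dominated(tree, m)}}


def limit(tree: Tree) -> Tuple[Tree, int]:
    """Repeat the announcement until the tree no longer shrinks."""
    rounds = 0
    while True:
        smaller = announce_rat(tree)
        if smaller == tree:
            return tree, rounds
        tree, rounds = smaller, rounds + 1


# Hypothetical game: A moves, then E, then A again; pay-offs listed for A, E.
game = node("A", left=leaf(1, 0),
            right=node("E", left=leaf(0, 5),
                       right=node("A", left=leaf(6, 4), right=leaf(5, 5))))
solved, rounds = limit(game)
```

In this example three announcements are needed, each removing one more dominated move from the bottom up, and the limit tree contains only A's initial move to the left, which is also what the standard Backwards Induction computation selects.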

Second Scenario: Belief and Soft Plausibility Upgrade
Many foundational studies in game theory view Rationality as choosing a best action given what one believes about the current and future behaviour of the players. An appealing alternative take on the BI procedure does not eliminate any nodes of the initial game, but rather endows it with "progressive expectations" on how the game will proceed. This is the plausibility dynamics that we studied in Section 3, now performing a soft announcement of rat, where the appropriate action is the "radical upgrade" studied earlier. The essential information produced by the algorithm is then in the binary plausibility relations that it creates inductively for players among end nodes in the game, standing for complete histories or "worlds":
Example [The BI outcome in a soft light.] A soft scenario does not remove nodes but modifies the plausibility relation. To implement this, we start with all endpoints of the game tree incomparable. (fn. 39) Next, at each stage, we compare sibling nodes, using this notion: A move x for player i dominates its sibling y in beliefs if the most plausible end nodes reachable after x along any path in the whole game tree are all better for the active player than all the most plausible end nodes reachable in the game after y.
Rationality* (rat*) says no player plays a move that is dominated in beliefs. Now we perform essentially a radical upgrade $\Uparrow rat^*$: (fn. 40)
If game node x dominates node y in beliefs, make all end nodes reachable from x more plausible than those reachable from y, keeping the old order inside these zones.
This changes the plausibility order, and hence the pattern of dominance-in-belief, so that iteration makes sense. Here are the stages in our earlier example, where letters x, y, z stand for the end nodes of the game:
[Figure: stages of the upgrade on the example game, ending in the plausibility order x > y > z]
In the first game tree, going right is not yet dominated in beliefs for A by going left. rat* only has bite at E's turn, and an upgrade takes place that makes (0, 100) more plausible than (99, 99). After this upgrade, however, going right has now become dominated in beliefs, and a new upgrade takes place, making A's going left most plausible. Here is the general result [33,85]:
Theorem 4.2. On finite trees, the Backwards Induction strategy is encoded in the plausibility order for end nodes created by iterated radical upgrade with rationality-in-belief.
Again this is "self-fulfilling": at the end of the procedure, the players have acquired common belief in rationality. An illuminating way of proving this uses an idea from [86]:
Strategies as plausibility relations. Each sub-relation R of the total move relation induces a total plausibility order ord(R) on endpoints of a game: x ord(R) y iff, looking up at the first node z where the histories of x, y diverged, if x was reached via an R move from z, then so is y.
More generally, relational strategies correspond one-to-one with "move-compatible" total orders of endpoints. In particular, conversely, each such order ≤ induces a strategy rel(≤). Now we can relate the computation in our upgrade scenario for belief and plausibility to the earlier relational algorithm for BI in Section 1:
Fact 4.3. For any game tree M and any k, $rel((\Uparrow rat^*)^k, M) = BI^k$.
Thus, the algorithmic view of Backwards Induction and its procedural doxastic analysis in terms of forming beliefs amount to the same thing. Still, as with our iterated announcement scenario, the dynamic logical view has interesting features of its own. One is that it yields fine-structure to the plausibility relations among worlds that are usually taken as primitive in doxastic logic. Thus games provide an underpinning for the possible worlds semantics of belief that seems of interest per se.
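The soft scenario of this section can also be sketched in Python. The encoding is an assumption made for illustration: the plausibility relation is a set of pairs (u, v) meaning "end node u is more plausible than end node v", starting empty so that all end nodes are incomparable, and all upgrades of one round are computed from the relation as it stood at the start of that round. The pay-offs (0, 100) and (99, 99) are those mentioned in the text; the pay-off (1, 0) for end node x is hypothetical.

```python
from typing import Any, Dict, List, Set, Tuple

Tree = Dict[str, Any]
Order = Set[Tuple[str, str]]  # (u, v): end node u is more plausible than v


def leaf(name: str, a: int, e: int) -> Tree:
    return {"name": name, "payoff": {"A": a, "E": e}}


def node(player: str, *children: Tree) -> Tree:
    return {"player": player, "children": list(children)}


def end_nodes(tree: Tree) -> List[Tree]:
    if "name" in tree:
        return [tree]
    return [l for c in tree["children"] for l in end_nodes(c)]


def internal_nodes(tree: Tree) -> List[Tree]:
    if "name" in tree:
        return []
    return [tree] + [n for c in tree["children"] for n in internal_nodes(c)]


def most_plausible(tree: Tree, above: Order) -> List[Tree]:
    """End nodes below `tree` that no other end node below `tree` beats."""
    names = {l["name"] for l in end_nodes(tree)}
    return [l for l in end_nodes(tree)
            if not any((o, l["name"]) in above for o in names)]


def dominates_in_beliefs(player: str, x: Tree, y: Tree, above: Order) -> bool:
    best_x = most_plausible(x, above)
    best_y = most_plausible(y, above)
    return min(l["payoff"][player] for l in best_x) > max(l["payoff"][player] for l in best_y)


def upgrade_rat(tree: Tree, above: Order) -> Order:
    """One round: put the end nodes below each dominating move ahead of those below its sibling."""
    new = set(above)
    for n in internal_nodes(tree):
        for x in n["children"]:
            for y in n["children"]:
                if x is not y and dominates_in_beliefs(n["player"], x, y, above):
                    new |= {(u["name"], v["name"])
                            for u in end_nodes(x) for v in end_nodes(y)}
    return new


def soft_bi(tree: Tree) -> Tuple[Order, int]:
    above, rounds = set(), 0
    while True:
        new = upgrade_rat(tree, above)
        if new == above:
            return above, rounds
        above, rounds = new, rounds + 1


game = node("A", leaf("x", 1, 0), node("E", leaf("y", 0, 100), leaf("z", 99, 99)))
order, rounds = soft_bi(game)
```

Two rounds suffice here: the first only ranks y above z at E's turn, and the second then ranks x above both, giving the order x > y > z without removing any node from the tree.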

Logical Dynamic Foundations of Game Theory
We have seen how several dynamic approaches to Backwards Induction amount to the same thing. To us, this means that the notion is logically stable. Of course, extensionally equivalent definitions can still have interesting intensional differences. For instance, the above analysis of strategy creation and plausibility change seems the most realistic description of the "entanglement" of belief and rational action in the behaviour of agents. But as we will discuss soon, a technical view in terms of fixed-point logics may be the best mathematical approach linking up with other areas.
No matter how we construe them, one key feature of our dynamic announcement and upgrade scenarios is this: unlike in the usual epistemic foundation results, common knowledge or belief of rationality is not assumed, but produced by the logic. This reflects our general view that rationality is primarily a property of procedures of deliberation or other logical activities, and only secondarily a property of outcomes of such procedures.

Logics of Game Solution: General Issues
Our analysis does not just restate existing game-theoretic results; it also raises new issues in the logic of rational agency. Technically, all that has been said in Sections 2 and 3 can be formulated in terms of existing fixed-point logics of computation, such as the modal "µ-calculus" and the first-order fixed-point logic LFP(FO). This link with a well-developed area of computational logic is attractive, since many results are known there, and we may use them to investigate game solution procedures that are quite different from Backwards Induction.⁴¹ But the analysis of game solutions also brings some new logical issues to this area.
Game solution and fragments of fixed-point logics. Game solution procedures need not use the full power of fixed-point languages for recursive procedures. It makes sense to use small decidable fragments where appropriate. Still, it is not quite clear right now what the best fragments are. In particular, our earlier analysis intertwines two different relations on trees: the move relation of action and computation, and the preference relations for players on endpoints. And the question is what happens to known properties of computational logics when we add such preference relations.
The complexity of rationality. In combined logics of action and knowledge, it is well-known that apparently harmless assumptions such as Perfect Recall for agents make the validities undecidable, or non-axiomatizable, sometimes even $\Pi^1_1$-complete [15]. The reason is that these assumptions generate commuting diagrams for actions $\mathit{move}$ and epistemic uncertainty $\sim$ satisfying a "confluence property":
$\forall x \forall y\, ((x \;\mathit{move}\; y \wedge y \sim z) \rightarrow \exists u\, (x \sim u \wedge u \;\mathit{move}\; z))$
These patterns serve as the basic grid cells in encodings of complex "tiling problems" in the logic.⁴² Thus, the logical theory of games for players with perfect memory is more complex than that of forgetful agents [15,18]. But now consider the non-epistemic property of rationality studied above, which mixes action and preference. Our key property CF in Section 1 had a confluence flavour, too, with a diagram involving action and preference:
[Diagram: confluence pattern for action and preference.]
So, what is the complexity of fixed-point logics for players with this kind of regular behaviour? Can it be that Rationality, a property meant to make behaviour simple and predictable, actually makes its theory complex?
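To make the epistemic confluence property for action and uncertainty concrete, here is a small sketch that tests it on finite relations given as sets of pairs. The two toy models, the world names, and the function names are hypothetical and serve only as an illustration; the check treats the variable z as universally quantified.

```python
def confluent(move, sim):
    """Whenever x move y and y ~ z, there must be a u with x ~ u and u move z."""
    sources = {x for x, _ in move}
    for x, y in move:
        for y2, z in sim:
            if y2 == y and not any(
                (x, u) in sim and (u, z) in move for u in sources
            ):
                return False
    return True


def equivalence(*blocks):
    """Build an equivalence relation, as a set of pairs, from its blocks."""
    return {(a, b) for block in blocks for a in block for b in block}


# Two worlds s and t, each with one successor (hypothetical names).
move = {("s", "s1"), ("t", "t1")}

# The agent cannot tell s from t, and still cannot tell s1 from t1 afterwards.
remembers = equivalence({"s", "t"}, {"s1", "t1"})

# The agent could tell s from t, but confuses s1 and t1 after the move.
forgets = equivalence({"s"}, {"t"}, {"s1", "t1"})

print(confluent(move, remembers))
print(confluent(move, forgets))
```

In the first model every uncertainty after the move traces back to an uncertainty before it, so the property holds; in the second, the new uncertainty between s1 and t1 has no source, and the property fails.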
Zooming in and zooming out: modal logics of best action. The main trend in our analysis has been toward making dynamics explicit in richer logics than the usual epistemic-doxastic-preferential ones, in line with the program in [33]. But in logical analysis, there are always two opposite directions intertwined: getting at important reasoning patterns by making things more explicit, or rather, by making things less explicit!
In particular, in practical reasoning, we are often only interested in what our best actions are, without all the details of their justification. As a mathematical abstraction, it would then be good to extract a simple surface logic for reasoning with best actions, while hiding most of the machinery: Can we axiomatize the modal logic of finite game trees with a move relation and its transitive closure, turns and preference relations for players, and a new relation best computed by Backwards Induction? Further logical issues in our framework concern extensions to infinite games, games with imperfect information, and scenarios with diverse agents. See [12,72,87] for some first explorations.

From Games to Their Players
We end by highlighting a perhaps debatable assumption of our analysis so far. It has been claimed that the very Backwards Induction reasoning that ran so smoothly in our presentation is incoherent when we try to "replay" it in the opposite order, when a game is actually played.⁴³
⁴² Recall our earlier remarks in Section 1 on the complexity of strategic games.
⁴³ There is a large literature focused on this "paradox" of backwards induction which we do not discuss here. See, for example, [88].
Example [The 'Paradox of Backwards Induction']. Recall the style of reasoning toward a Backward Induction solution, as in our earlier simple scenario:
[Figure: the earlier game tree, with A's move to the right marked by a thick black line.]
Backwards Induction tells us that A will go left at the start, on the basis of logical reasoning that is available to both players. But then, if A plays right (as marked by the thick black line), what should E conclude? Does this not mean that A is not following the BI reasoning, and hence that all bets are off as to what he will do later on in the game? It seems that the very basis for the computations in our earlier sections collapses.⁴⁴
Responses to this difficulty vary. Many game-theorists seem under-impressed. The characterization result of [89] assumes that players know that rationality prevails throughout.⁴⁵ One can defend this behaviour by assuming that the other player only makes isolated mistakes. Baltag, Smets and Zvesper [86] essentially take the same tack, deriving the BI strategy from an assumption of "stable true belief" in rationality, a gentler form of stubbornness stated in terms of dynamic-epistemic logic.
Players' revision policies. We are more inclined toward the line of [91,92]. A richer analysis should add an account of the types of agent that play the game. In particular, we need to represent the belief revision policies of the players, which determine what they will do when making a surprising observation contradicting their beliefs in the course of a game. There are many different options for such policies in the above example, such as "It was just an error, and A will go back to being rational", "A is telling me that he wants me to go right, and I will be rewarded for that", "A is an automaton with a general rightward tendency", and so on.⁴⁶ Our analysis so far has omitted this type of information about players of the game, since our algorithms made implicit uniform assumptions about their prior deliberation, as well as what they are going to do as the game proceeds.
This matching up of two directions of thought, backwards in the "off-line dynamics" of deliberation and forwards in the "on-line dynamics" of playing the actual game, is a major issue in its own right, beyond specific scenarios. Belief revision policies and other features of players must come in as explicit components of the theory, in order to deal with the dynamics of how players update knowledge and revise beliefs as a game proceeds. But all this is exactly what the logical dynamics of Section 2 is about. Our earlier discussion has shown how acts of information change and belief revision can enter logic in a systematic manner.
⁴⁴ The drama is clearer in longer games, when A has many comebacks toward the right.
⁴⁵ Samet [90] calls this "rationality no matter what", a stubborn unshakable belief that players will act rationally later on, even if they have never done so up until now.
⁴⁶ One reaction to these surprise events might even be a switch to an entirely new style of reasoning about the game. That would require more finely-grained syntax-based views of revision: cf. the discussion in [93].
Thus, once more, the richer setting that we need for a truly general theory of game solution is a perfect illustration for the general Theory of Play that we have advocated.

Conclusion
Logic and game theory form a natural match, since the structures of game theory are very close to being models of the sort that logicians typically study. Our first illustrations reviewed existing work on static logics of game structure, drawing attention to the fixed-point logic character of game solution methods. This suggests a broader potential for joining forces between game theory and computational logic, going beyond specific scenarios toward more general theory. To make this more concrete, we then presented the recent program of "logical dynamics" for information-driven agency, and showed how it throws new light on basic issues studied in game theory, such as agreement scenarios and game solution concepts.
What we expect from this contact is not the solution of problems afflicting game theory through logic, or vice versa, remedying the aches and pains of logic through game theory. Of course, game theorists may be led to new thoughts by seeing how a logician treats (or mistreats) their topics, and also, as we have shown, logicians may see interesting new open problems through the lens of game theory.
But fruitful human relations are usually not therapeutic: they lead to new facts, in the form of shared offspring. In particular, one broad trend behind much of what we have discussed here is this. Through the fine-structure offered by logic, we can see the dynamics of games as played in much more detail, making them part of a general analysis of agency that also occurs in many other areas, from "multi-agent systems" in computer science to social epistemology and the philosophy of action. It is our expectation that the offspring of this contact might be something new, neither fully logic nor game theory: a Theory of Play, rather than just a theory of games.

Theorem 1.3. BI is the largest subrelation S of the move relation in a game such that (a) S has a successor at each intermediate node, and (b) S satisfies CF.

Example 2.6. [Dynamic Belief Change versus Conditional Belief] Consider state $w_1$ in the following epistemic-doxastic model:

Figure 2. Cobb and Mal's discussion on the window ledge.
Stage 0 rules out u, the only point where rat fails; Stage 1 rules out z and the node above it (the new points where rat fails); and Stage 2 rules out y and the node above it. In the remaining game, Rationality reigns supreme: