Moving Information Horizon Approach for Dynamic Game Models

Abstract: In this paper, a new class of dynamic game models with a moving information horizon, or dynamic updating, is studied. In this class of games, players do not have full information about the game structure (motion equations, payoff functions) on the interval on which the game is defined. It is supposed that at each stage of the dynamic game, the players have only truncated information about the game structure, defined by the information horizon. Both cooperative and noncooperative settings are considered. The results are illustrated using an oligopoly advertising game model, and a comparison between the solution in the initial game model and in the game model with a moving information horizon is presented, together with simulation results.


Introduction
Noncooperative game theory deals with strategic interactions among multiple decision makers whose objective functions depend on the choices of all the players. It is supposed that the players do not cooperate, and it is important to model their behavior using some optimality principle. Many results have been obtained in the field of noncooperative game theory; for a general description, see the fundamental book [1]. In the papers [2,3], the solution concept called the Nash equilibrium was proposed; in the current paper, this solution concept is used, but in a dynamic setting. The book [4] can be used to study the main recent results in the field of dynamic cooperative game theory, which is the focus of the current paper.
Cooperative dynamic game theory offers socially convenient and group efficient solutions to different decision problems involving strategic actions. One of the fundamental questions in the theory of cooperative dynamic games is how to define the allocation rule for the cooperative payoff or the set of imputations based on the optimal behavior of players; the notion of the characteristic function, which measures the strength of a coalition S, is used for that. In this paper, it is defined as in [5] as the total payoff of the players from coalition S in the Nash equilibrium of the game with the following set of players: coalition S (acting as one player) and the players from the set N\S. To compute the feedback Nash equilibrium, we follow [4]. A set of imputations or solutions of the game is determined using the characteristic function at the beginning of each subinterval. In the paper [6], Haurie noted the problem of time inconsistency of the Nash bargaining solution in a differential game model. The notion of time consistency was formalized mathematically by Petrosyan in the paper [7]. In the next paper on time consistency [8], L. Petrosyan defined the notion of the Imputation Distribution Procedure (IDP), which is used to compose a time consistent cooperative solution or a single imputation. Later on, L. Petrosyan defined the notion of strong time consistency in the paper [9]. See the recent publications on this topic in [10][11][12]. In order to determine a solution for the game model with dynamic updating, it is necessary to combine partial solutions and their IDPs on subintervals. The notions of time consistency and strong time consistency introduced by L. Petrosyan in [7,9] are also studied for the proposed solution.
Classical dynamic game models assume that the game structure does not change on the interval on which the game is defined or that the players have full information about the changes of the game structure. However, if we consider a long term process, these assumptions do not reflect reality. In the class of games with dynamic updating, it is supposed that the players have or use only the information about the motion equations and payoff functions defined on an interval whose length is equal to the value of the information horizon. This information is updated as the current time evolves; see Figure 1. In order to define the best possible behavior of players in this type of dynamic game, it is necessary to use a special approach, which is called the Looking Forward Approach (LFA).
The concept of the looking forward approach was introduced in [13] for differential games, where a new class of games was presented and the foundation for further study of games with dynamic updating or a moving information horizon was laid. To get more information about the approach, one may consider the following papers [13][14][15][16][17][18][19][20]. In the paper [13], the looking forward approach was applied to the cooperative differential game with a finite horizon. The paper [17] was focused on studying the looking forward approach with stochastic forecast and dynamic adaptation. In the paper [15], the looking forward approach was applied to a cooperative differential game of pollution control. In the paper [16], the looking forward approach was applied to the cooperative differential game with an infinite horizon. The paper [14] was devoted to studying cooperative differential games with an infinite horizon and different types of forecasts and information structures. The latest papers on the looking forward approach [18,21] were devoted to studying the looking forward approach for cooperative differential games with nontransferable utility and to a real-life application of the looking forward approach to economic simulation. The class of differential games with continuous updating was considered in the papers [19,20]; there, it was supposed that the updating process evolves continuously in time. In the paper [19], the system of Hamilton-Jacobi-Bellman equations was derived for the Nash equilibrium in the game with continuous updating. In the paper [20], the class of linear-quadratic differential games with continuous updating was considered, and the explicit form of the Nash equilibrium was derived. In this paper, we apply the looking forward approach to the class of games with dynamic updating.
The importance of the results of this paper is supported by the applicability of dynamic models to real-life processes, the simpler structure of dynamic models, and the dynamic nature of numerical model predictive control. The papers described above studied differential games. Another important contribution is that in this paper, we study the dynamic oligopoly marketing model of advertising and apply the looking forward approach to it. This result is essentially new.
In order to demonstrate the looking forward approach, we present the dynamic marketing model of advertising with a finite horizon. We refrain from reviewing the literature on advertising models, given that there are recent surveys on this topic. Huang, Leng, and Liang [22] gave an extensive and comprehensive survey of dynamic advertising competition since 1994, their coverage starting where the previous survey by Feichtinger, Hartl, and Sethi [23] stopped. Jørgensen and Zaccour [24] covered advertising models in oligopolies (horizontal strategic interactions) and in marketing channels (vertical strategic interactions), whereas He, Prasad, Gutierrez, and Sethi [25] focused on Stackelberg differential game models in supply chains and marketing channels that include advertising. Finally, Aust and Buscher [26] and Jørgensen and Zaccour [27] concentrated on cooperative advertising in marketing channels. Our model is based on the model presented by Naik, Prasad, and Sethi in [28]. The model in [28] is the extension of the Sethi model for awareness of auto brands with a churn term, which extends the decaying market share term of monopoly models capturing forgetting and noise. The closed-loop Nash equilibrium concept is used to obtain the optimal advertising expenditure in the noncooperative game model. Considering the applications of game theory in different fields of research, many subjects from society modeling have been covered; see, for example, [29]. In this paper, the looking forward approach is applied to the dynamic case of the model presented in [28]. Both noncooperative and cooperative models are considered.
The paper is structured as follows. In Section 2, a description of the looking forward approach is presented for a general dynamic game model, and a solution for a game with dynamic updating is presented for both fixed and random information horizons. In Section 3, the looking forward approach is applied to a dynamic oligopoly marketing model of advertising for both the noncooperative and cooperative game models. In Section 4, simulation results for the marketing game model are presented.

Looking Forward Approach
Information plays one of the main roles in game theory. Classical differential game models generally consider only complete information, which means that the information about the game structure is known and does not change during the whole game. This paper, however, focuses on the case when information is updated dynamically: players use only truncated information and update it at each stage. In order to model the behavior of players when information is updated dynamically, a novel approach, called the looking forward approach, was introduced in [13].
Consider a general N-stage n-person nonzero-sum discrete-time dynamic game Γ(x_0, N) [30] with initial state x_0 ∈ R^m. The state dynamics of the game are defined by the following difference equation:

x_{k+1} = f(x_k, u_1^k, …, u_n^k), x_1 = x_0,

where u_i^k ∈ U_i ⊂ R^{m_i} is the control vector of player i at stage k, x_k ∈ X is the state of the game, and k ∈ {1, 2, ..., N} ≡ T.
The payoff function of player i is:

K_i(x_0, u_1, …, u_n) = Σ_{k=1}^{N} g_i(x_k, u_1^k, …, u_n^k) (1 + r)^{−(k−1)},

where r is the discount rate. This is a game with complete information; however, in real-life processes, players do not have full information about the game structure on the interval on which the game is defined. We apply the looking forward approach in order to model the behavior of players when the information about the game is updated dynamically.
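As an illustration, the interplay between the state dynamics and the discounted payoff sum above can be sketched in a few lines of Python; the dynamics f, the stage payoffs g_i, and the feedback strategies used here are hypothetical placeholders, not the paper's model:

```python
def play_game(x0, strategies, f, g, N, r):
    """Simulate an N-stage game for given feedback strategies u_i = phi_i(k, x).

    Returns each player's discounted payoff sum and the terminal state."""
    x = x0
    payoffs = [0.0] * len(strategies)
    for k in range(1, N + 1):
        u = [phi(k, x) for phi in strategies]            # each player's control
        for i, gi in enumerate(g):
            payoffs[i] += gi(x, u) / (1 + r) ** (k - 1)  # discounted stage payoff
        x = f(x, u)                                      # state transition
    return payoffs, x
```

For example, with illustrative linear dynamics f(x, u) = x + u_1 + u_2, stage payoffs g_i = x − u_i², constant controls u_i ≡ 1, N = 2, and r = 0, each player's stage losses and gains cancel out to zero.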

Truncated Subgame
Assume that at each stage k, players have full information about the motion equations and payoff functions within k + T̄ stages, where T̄ is a fixed value, namely the information horizon. At the stage k, the information about the game is updated. At the stage k + 1, players have full information about the game structure on the interval until the stage k + 1 + T̄; see Figure 1.
The motion equation and payoff function of a truncated subgame on the stage interval [k, k + T̄] coincide with those of the initial game on the same stage interval. The motion equation and initial condition of the truncated subgame have the following form:

x_{k,l+1} = f(x_{k,l}, u_1^{k,l}, …, u_n^{k,l}), x_{k,0} = x_k

(denote x_{k,l} = x_{k+l}, u_i^{k,l} = u_i^{k+l}, l = 0, …, T̄ for the truncated subgame Γ_k(x_{k,0}, k, k + T̄)). The payoff function of player i in the truncated subgame Γ_k(x_{k,0}, k, k + T̄) has the form:

K_i^k(x_{k,0}, u_1, …, u_n) = Σ_{l=0}^{T̄} g_i(x_{k,l}, u_1^{k,l}, …, u_n^{k,l}) (1 + r)^{−(k+l−1)},

for i ∈ {1, 2, ..., n} ≡ I.

Figure 1. The behavior of players in the game with truncated information can be modeled using a series of truncated subgames Γ_k(x_{k,0}, k, k + T̄), k = 1, ..., N − T̄.
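The bookkeeping of the stage windows [k, k + T̄] covered by the truncated subgames can be sketched as follows (a minimal helper; the function name and interface are our own):

```python
def truncated_windows(N, T_bar):
    """Stage windows [k, k + T_bar] of the truncated subgames Gamma_k,
    for k = 1, ..., N - T_bar (the last window ends exactly at stage N)."""
    return [(k, min(k + T_bar, N)) for k in range(1, N - T_bar + 1)]
```

For N = 8 and T̄ = 2, as in the numerical example of Section 4, this yields the windows (1, 3), (2, 4), ..., (6, 8).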

Noncooperative Outcomes in the Game Model with a Moving Information Horizon
Considering each truncated subgame, let u_k^{NE,l}(x) = (u_1^{NE,k,l}(x), . . . , u_n^{NE,k,l}(x)), l ∈ [0, T̄] be a vector of strategies that provides a feedback Nash equilibrium solution for the truncated subgame Γ_k(x_{k,0}, k, k + T̄), and denote by V̄_i^k(l, x) the payoff of player i ∈ I in the feedback Nash equilibrium of the subgame defined on the interval [l, k + T̄], l = k, ..., k + T̄:

Theorem 1 ([31]). Suppose that there exist functions V̄_i^k(l, x) satisfying (5); then the strategies that maximize the right hand side of (5) constitute the feedback Nash equilibrium.

Denote the strategies obtained by maximizing the right hand side of (5) by u^{NE,l}, and denote by x_k^{NE} the trajectory obtained with the strategies u_k^{NE,l}(x) involved. The presented solution is valid only for a truncated subgame, not for the whole game defined on the interval [0, N]. It is also impossible to construct the Nash equilibrium using the classical approaches, since the information about the game is updated dynamically and is different on different intervals. In order to construct the resulting equilibrium strategies and the corresponding trajectory for the class of games with dynamic updating, we introduce the notions of the resulting strategies and the corresponding resulting trajectory, which is a combination of x_k^{NE} for each truncated subgame Γ_k(x_{k,0}^{NE}, k, k + T̄). Along the resulting noncooperative trajectory, players receive payoffs according to the following formula:
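For intuition, the backward recursion behind Theorem 1 can be sketched for finite state and action sets; this brute-force search over action profiles illustrates the fixed-point condition of a feedback Nash equilibrium, and is not the paper's computational method:

```python
from itertools import product

def feedback_nash(states, actions, f, g, T):
    """Backward induction for a feedback Nash equilibrium on finite
    state/action sets (f maps (x, u) to a state in `states`;
    g[i](x, u) is player i's stage payoff)."""
    n = len(actions)
    # V[l][x][i]: value to player i from stage l on; zero after the horizon
    V = [{x: [0.0] * n for x in states} for _ in range(T + 2)]
    policy = [dict() for _ in range(T + 1)]
    for l in range(T, 0, -1):
        for x in states:
            # find a profile where each player's action is a best response
            for u in product(*actions):
                best = True
                for i in range(n):
                    val = g[i](x, u) + V[l + 1][f(x, u)][i]
                    for a in actions[i]:
                        ud = u[:i] + (a,) + u[i + 1:]   # unilateral deviation
                        if g[i](x, ud) + V[l + 1][f(x, ud)][i] > val + 1e-12:
                            best = False
                            break
                    if not best:
                        break
                if best:
                    policy[l][x] = u
                    V[l][x] = [g[i](x, u) + V[l + 1][f(x, u)][i] for i in range(n)]
                    break
    return V, policy
```

On a one-stage prisoner's-dilemma-style payoff matrix, for instance, the recursion recovers mutual defection as the stage equilibrium.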

Cooperative Game Model with a Moving Information Horizon
The theory of cooperative dynamic games offers socially convenient and group efficient solutions to different decision problems involving strategic actions. One of the fundamental problems in the theory of cooperative dynamic games is the formulation of optimal or cooperative behavior of players and the definition of the allocation rule or cooperative solution, which is the set of imputations chosen by the players in the game. We suppose that the players decide to cooperate in each truncated subgame. By Γ_k^c(x_{k,0}, k, k + T̄), we denote a truncated cooperative subgame defined on the interval [k, k + T̄] with the initial condition x_{k,0}. The maximal joint payoff of all players in this game can be defined by maximizing the functional: subject to: Assume that the Bellman function for each truncated subgame Γ_k^c(x_{k,0}, k, k + T̄) has the following form [32]: for l = 0, …, T̄, such that the following recursive relations are satisfied: where W_k(T̄ + 1, x) = 0. Assume that the maximum in (12) is achieved under the controls u_i^*(k); then u_i^*(k) is optimal in the control problem (8), (9).

Definition 4. The resulting cooperative strategies {û_i^{*,j}}_{j=1}^{N}, ∀i ∈ I are constructed in the following way: In a similar way, we define the resulting cooperative trajectory. By u^*, denote the optimal strategies obtained by maximizing the right hand side of (12). The corresponding trajectory is defined by the motion equation. On the interval [k, k + 1], the resulting cooperative trajectory coincides with the cooperative trajectory x_k^* in the truncated cooperative subgame Γ_k^c(x_k^*, k, k + T̄). At the stage k + 1, the information about the game structure is updated. On the interval [k + 1, k + 2], the resulting trajectory coincides with the cooperative trajectory x_{k+1}^* in the truncated cooperative subgame Γ_{k+1}^c(x_{k+1}^*, k + 1, k + 1 + T̄), etc.
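The stitching rule described above for the resulting cooperative trajectory (keep only the first step of each truncated subgame, and the whole tail of the last one) can be sketched as follows; `solve_truncated` is an assumed interface returning a subgame's cooperative state sequence:

```python
def resulting_trajectory(x0, solve_truncated, N, T_bar):
    """Stitch the resulting trajectory: at each stage k, solve the truncated
    subgame on [k, k + T_bar] starting from the current state, and keep only
    its first step; the last subgame contributes its whole tail."""
    x, path = x0, [x0]
    for k in range(1, N - T_bar + 1):
        sub = solve_truncated(x, k, T_bar)   # states x_k, ..., x_{k+T_bar}
        if k < N - T_bar:
            path.append(sub[1])              # keep only the first step
            x = sub[1]
        else:
            path.extend(sub[1:])             # last subgame covers the tail
    return path
```

With a toy subgame solver whose trajectory simply increments the state each stage, the stitched trajectory covers all N stages without gaps or overlaps.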
The characteristic function of a coalition is an essential concept in the theory of cooperative games. In this paper, it is defined as in [5] as the total payoff of the players from coalition S ⊆ I in the Nash equilibrium of the game with the following set of players: coalition S (acting as one player) and the players from the set I\S (acting as individuals). For each coalition S ⊆ I, define the values of the characteristic function for each truncated subgame as was done in [16]: where Ṽ_k(S, x_{k,0}^*) is defined as the total payoff of the players from coalition S in the feedback Nash equilibrium ū^{NE} = (ū_1^{NE}, ..., ū_n^{NE}) of the game with the following set of players: coalition S (acting as one player) and the players from the set I \ S, i.e., in the game with |I \ S| + 1 players. For |S| = 1, the cooperative case turns into the noncooperative case. In the other cases, the players from coalition S act as one player maximizing their total payoff.

Resulting Cooperative Solution and Its Properties
An imputation ξ_k(x^*, k, k + T̄) in the truncated subgame is defined as an arbitrary vector satisfying the conditions of individual and group rationality. Denote the set of all possible imputations for each truncated subgame by E_k(x^*, k, k + T̄). Suppose that for each truncated subgame, a non-empty cooperative solution is defined by: it can be the core, the nucleolus, or the Shapley value. In order to guarantee the time consistency property [30] in each cooperative truncated subgame, it is necessary to use the notion of the imputation distribution procedure. The property of time consistency was introduced in 1977 by L. Petrosyan [33].
Following the continuous time analysis of Yeung and Petrosyan [34] for cooperative differential games, we formulate a discrete time version of the imputation distribution procedure [8] so that the agreed upon imputations will be time consistent. By B_i^k(x_k^*), denote the payment that firm i receives at stage k under the cooperative agreement along the cooperative trajectory {x_k^*}_{k=1}^{N}. The payment scheme involving B_i^k(x_k^*) constitutes an IDP in the sense that the imputation of firm i over the stages from k to N can be expressed as: The agreed upon optimality principle or cooperative solution is time consistent if the condition (6) is maintained at any time instant throughout the game along the cooperative trajectory.
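A minimal sketch of the IDP construction: given the stage-k values-to-go ξ(k) of an imputation along the cooperative trajectory, the stage payments telescope so that every tail sum reproduces the corresponding to-go imputation (the list-based interface is our own simplification):

```python
def idp_payments(xi):
    """Imputation distribution procedure from to-go imputations.

    xi[k] is the imputation value over the stages from k+1 to N (0-indexed).
    Payment at stage k is beta_k = xi[k] - xi[k+1], and the final payment
    equals the last to-go value, so sums from any stage onward telescope
    back to xi[k]."""
    return [xi[k] - xi[k + 1] for k in range(len(xi) - 1)] + [xi[-1]]
```

For instance, to-go values [10, 6, 3, 1] yield the payments [4, 3, 2, 1]; summing the payments from the second stage onward recovers 6, which is exactly the time consistency requirement.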
In order to construct the resulting cooperative solution, i.e., the cooperative solution in the game with dynamic updating Γ(x_0, N), it is necessary to use a special approach, because the standard approach can only be applied to each truncated subgame with dynamic updating defined on the interval [k, k + T̄]. The IDP also provides the time consistency property of the new solution concept and the ability to determine solutions within the interval [k, k + T̄].
Suppose that in each truncated subgame Γ_k^c(x_k^*, k, k + T̄), a cooperative solution M_k(x^*, k, k + T̄) ≠ ∅ along the cooperative trajectory x_k^* is selected. Suppose that the imputation: and the corresponding IDP for k ∈ {1, ..., N − T̄} are selected in such a way as to guarantee the time consistency property: The corresponding IDP β_{k,l}(x_{k,l}^*) can be obtained in the following form [30]: where ξ_k(x_{k,T̄+1}^*, k + T̄ + 1, k + T̄) = 0. However, time consistency holds only on the interval [k, k + T̄] for each truncated subgame Γ_k^c(x_k^*, k, k + T̄), k ∈ {1, ..., N}. In order to guarantee time consistency on the whole interval [1, N], we introduce the resulting IDP.

Definition 7. The resulting IDP β̂_k(x^*) is defined for a set of chosen imputations in each truncated subgame ξ_k(x_{k,0}^*, k, k + T̄) ∈ M_k(x^*, k, k + T̄) using the corresponding β_k(x_k^*). Using the resulting IDP β̂_k(x^*), we define the vector ξ̂, constructed in the following way, for stages k = 1, ..., N: in particular, ξ̂(x^*, N) = Σ_{l=1}^{N} β̂_l(x^*).
Using the notion of the resulting imputation, it is possible to define the allocation rule for the joint payoff:

Definition 9. The resulting solution M̂_k(x^*, N − k) is the set of resulting imputations ξ̂(x^*, N − k) for all possible resulting IDPs β̂_k(x^*) (defined by different imputations in each truncated subgame).
Any resulting imputation ξ̂(x^*, N) ∈ M̂(x^*, N) and the corresponding resulting IDP β̂_k(x^*) allocate the total payoff of the players along the resulting cooperative trajectory x̂_l^* in the initial game with prescribed duration, i.e., the following condition holds: The resulting solution M̂_k(x^*, N − k) is time consistent by construction. However, it can also be proven that an arbitrary resulting cooperative solution is strong time consistent in the game model with dynamic updating. By an arbitrary solution, we understand a solution that is composed, for example, of some vectors from the core at the first stage, the Shapley value at the second stage, etc.
where a ⊕ A = {a + a′ : a′ ∈ A}.
Theorem 3. The arbitrary resulting cooperative solutionM(x * , N) is strong time consistent in the game model with dynamic updating.

Construction of the Characteristic Function in the Game Model with Dynamic Updating
In the previous section, we defined the notion of the resulting cooperative solution, i.e., a way of allocating the cooperative payoff along the chosen trajectory in the game with dynamic updating. However, we did not show that the constructed set is a set of imputations, i.e., a set of vectors satisfying the individual and group rationality properties. In order to justify the resulting cooperative solution, we need to define the characteristic function for the game model with dynamic updating.

Definition 11. The resulting characteristic function V̂(S; x̂_k^*, N − k), S ⊆ I in the game Γ(x̂_k^*, N − k) with dynamic updating is the function calculated using the values of the characteristic functions V(S; x_k^*, k, k + T̄) in every truncated subgame Γ_k^c(x_{k,0}, k, k + T̄) along the resulting cooperative trajectory x̂_k^* for k ∈ {1, ..., N}:

Using the resulting characteristic function, we can show that the resulting imputation ξ̂(x^*, N) is an imputation in the game with dynamic updating:

Theorem 4. The resulting imputation ξ̂(x^*, N) is an imputation in the game Γ(x_0, N) with dynamic updating, if for k ∈ {1, ..., N − T̄}, the following condition is satisfied: where l = 0, …, T̄.
The proofs of Theorems 4, 5, 6, 8, 9, 10, and 11 can be found in Appendix A.

Relationship of the Solutions in Truncated Subgames and Resulting Solutions
In this section, it is shown that if the players choose the imputations ξ_k(x_k^*, k, k + T̄) ∈ E_k(x^*, k, k + T̄) based on V(S; x^*, k, k + T̄) for k ∈ {1, ..., N − T̄}, S ⊆ I in every truncated subgame by the same rule, then the resulting imputation ξ̂(x^*, N − k) corresponds to the imputation chosen by the same rule using the resulting characteristic function V̂(S; x^*, N − k), S ⊆ I. Further, we prove this for a number of classical optimality principles.
We show that if in every truncated subgame Γ_k^c(x_k^*, k, k + T̄), the players choose the Shapley value Sh_k(x_k^*, k, k + T̄) as the imputation, then the corresponding resulting imputation ξ̂(x^*, N − k) coincides with the Shapley value Ŝh(x_k^*, N − k) calculated using the resulting characteristic function V̂(S; x̂^*, N − k), S ⊆ I.
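For reference, the Shapley value of a characteristic function — the imputation used here in every truncated subgame — can be computed by averaging marginal contributions over player orderings; this is the textbook definition, not code from the paper:

```python
from itertools import permutations
from math import factorial

def shapley(players, v):
    """Shapley value of characteristic function v (v takes a frozenset)."""
    phi = {i: 0.0 for i in players}
    for order in permutations(players):
        coal = frozenset()
        for i in order:
            phi[i] += v(coal | {i}) - v(coal)   # marginal contribution of i
            coal = coal | {i}
    n_fact = factorial(len(players))            # number of orderings
    return {i: phi[i] / n_fact for i in players}
```

For a two-player example with v({1}) = 1, v({2}) = 2, v({1, 2}) = 5, the Shapley value is (2, 3), which is efficient: the components sum to v({1, 2}).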
Suppose that in every truncated subgame Γ_k^c(x_k^*, k, k + T̄), the players choose the core C(x_k^*, k, k + T̄) as the cooperative solution; then each resulting imputation ξ̂(x^*, N − k) belongs to the core defined using the resulting characteristic function V̂(S, x̂_k^*, N − k).

Random Information Horizon
In this section, we consider the case when the length of the information horizon is not fixed; this means that the players have certain information on the truncated intervals [k, T̄_k], but the length of these intervals is a random variable (see Figure 2). By T̄_k, we denote the length of the information horizon at the stage k. T̄_k can take values from the interval [T̄_{k−1} + 1, N], where T̄_{k−1} is the realization of the random horizon in the previous truncated subgame; in particular, T̄_0 = 0. It is worth mentioning that once T̄_k = N, the updating procedure stops. T̄_k is a discrete random variable. At the initial position x_0, the following probabilities are defined for the realizations on the interval [1, N]: {γ_1, . . . , γ_N}. At the position x_1 (the second position), the probabilities are recalculated in the following way: The motion equation on the interval [k, T̄_k] is: The payoff function of player i ∈ {1, . . . , n} for the k-th truncated subgame is: Consider the noncooperative game model. In order to define the feedback Nash equilibrium in this game, denote the Bellman function V_i^k(x, k) for the k-th truncated subgame as follows: with the set of strategies {u_i^j = ψ_i^j, j = k, …, T̄_k, i ∈ I}; then: As in the theorem in [35], a feedback Nash equilibrium in the noncooperative case of each truncated subgame with a random horizon can be characterized as follows.

Theorem 7.
A set of strategies {ψ_i^j, j = k, …, N, i ∈ I} provides a feedback Nash equilibrium solution to the game (25), (26), if there exist functions V_i^k(x, j), for j ∈ {k, . . . , N} and i ∈ I, such that the following recursive relations are satisfied: Consider the case when all players agree to cooperate in each truncated subgame. To maximize their expected joint payoff in the k-th truncated subgame, we need to solve the following discrete time dynamic programming problem: subject to (25). As in the noncooperative case, we can characterize an optimal solution for (25) and (28) as follows:

Corollary 1. A set of strategies {φ_i^j, j = k, …, N, i ∈ I} provides optimal cooperative strategies for the game (25) and (28), if there exist functions W^k(x, j), for j ∈ {k, . . . , N}, such that the following recursive relations are satisfied: Using the set of optimal strategies {φ_i^j, j = k, …, N, i ∈ I}, x_k^* can be generated from (25), and we obtain the corresponding cooperative trajectory.
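Two small helpers sketch the random-horizon ingredients of this section: the recalculation of the horizon probabilities after a realization (assuming, as stated above, that the new horizon exceeds the previous one) and the expected total payoff under a random terminal stage; both interfaces are our own:

```python
def recalc_probs(gamma, realized):
    """Recalculate horizon probabilities after a realization `realized`:
    condition on the new horizon exceeding the previous one (consistent
    with T_k taking values in [T_{k-1} + 1, N])."""
    mass = sum(gamma[realized:])          # probability mass still possible
    return [0.0] * realized + [g / mass for g in gamma[realized:]]

def expected_payoff(stage_payoffs, gamma):
    """Expected total payoff with a random terminal stage:
    E = sum_j gamma_j * (g_1 + ... + g_j)."""
    cum, total = 0.0, 0.0
    for j, g in enumerate(stage_payoffs, start=1):
        cum += g                          # cumulative payoff up to stage j
        total += gamma[j - 1] * cum       # weighted by P(horizon = j)
    return total
```

With uniform probabilities γ_1 = ... = γ_4 = 1/4 and a realized horizon of 2, the updated distribution puts mass 1/2 on each of the remaining values 3 and 4.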

Initial Game Model
Consider an n-firm dynamic oligopoly marketing model of advertising defined on the closed time interval [t_0, T]. The solution of the noncooperative case of this model in differential form was presented in [28]. In this paper, the model is transformed to the dynamic setting and studied not only for the noncooperative case, but mainly for the cooperative case [36]. Advertising efforts are used as strategies by the firms competing on the oligopoly market: each firm tries to increase its market share, while its competitors try to decrease it using their own advertising efforts. Denote by x_i^k the market share of firm i ∈ I ≡ {1, ..., n} at stage k. The full list of notations is presented in Table 1. The dynamic oligopoly marketing model of advertising is defined as follows. The motion equation of firm i ∈ I, i.e., of its market share, is: where z_i^0 is a positive constant. The payoff function of firm i ∈ I in the dynamic game model, i.e., the profit of firm i ∈ I, is: Unlike the payoff function (2) in the general model, here we use the discount rate r = 0. The main reason is that we do not want to complicate the results of the model and want first to show the difference between the looking forward approach and the initial game. Besides, we may model the advertising process on a short time interval, i.e., with a small value of T.
Table 1. The list of the parameters of the model.

x_i^k — market share of firm i ∈ I ≡ {1, ..., n} at stage k.
u_i^k ≥ 0 — advertising effort rate of firm i ∈ I at stage k.
ρ_i > 0 — advertising effectiveness parameter of firm i ∈ I.
δ > 0 — churn parameter.
m_i > 0 — industry sales multiplied by the per unit profit margin of firm i ∈ I.
C(u_i^k) — cost of advertising of firm i ∈ I at stage k, parameterized by (u_i^k)^2.
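A simulation sketch of market-share dynamics of this Sethi type with a churn term follows; the paper's exact difference equation appears in an omitted display, so the functional form below is an assumption consistent with the parameters in Table 1:

```python
import math

def simulate_market(x0, u, rho, delta, n_stages):
    """Assumed discrete-time Sethi-type market-share dynamics with churn:

        x_{k+1,i} = x_{k,i} + rho_i * u_{k,i} * sqrt(1 - x_{k,i})
                    - delta * (x_{k,i} - 1/n)

    The churn term pulls every share toward the symmetric point 1/n."""
    n = len(x0)
    x = list(x0)
    path = [list(x)]
    for k in range(n_stages):
        x = [x[i] + rho[i] * u[k][i] * math.sqrt(max(0.0, 1 - x[i]))
             - delta * (x[i] - 1.0 / n)
             for i in range(n)]
        path.append(list(x))
    return path
```

Note that with zero advertising and shares already at the symmetric point 1/n, the state is a fixed point of the assumed dynamics, which matches the interpretation of churn as a restoring force.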
The initial marketing game model is defined on the interval [0,N]. As in Section 2, suppose that players have full information about the motion equations and payoff functions on the interval [k, k +T]. In order to construct a solution in the game model with dynamic updating, we use the notion of truncated subgame Γ k (x k,0 , k, k +T).
The payoff function of firm i ∈ I in truncated subgame Γ k (x 0 k , k, k +T) has the form: In Expression (30), there is a logical consistency requirement that the market shares should satisfy:

Noncooperative Outcomes in a Truncated Subgame
Firstly, we consider the noncooperative case for each truncated subgame Γ_k(x_k^0, k, k + T̄), k ∈ {0, . . . , N − T̄} with the initial condition x_k^0. According to Theorem 1, the set of strategies {ū_i^{k,l}, for i ∈ {1, ..., n}, l = 0, …, T̄ and k ∈ {0, ..., N − T̄}} provides a feedback Nash equilibrium solution to the game (30), (32) if there exist functions V_i^k(l, x_i^{k,l}), for i ∈ {1, ..., n}, l = 0, …, T̄ and k ∈ {0, ..., N − T̄} in each truncated subgame Γ_k(x_k^0, k, k + T̄), such that the following recursive relations are satisfied for each i ∈ I:

Theorem 8. The payoff of the players in each truncated subgame Γ_k(x_k^0, k, k + T̄) in the feedback Nash equilibrium on the interval [k, k + T̄] has the following form: where Ã_i^{k,l}, B̃^{k,l}(i) are determined from the relations: The corresponding feedback Nash equilibrium strategies are:

Consider now the case of a random horizon for a truncated subgame. Suppose that the players know certain information about the game structure, but the duration for which this information stays valid (or unchanged) is unknown. Suppose that the probabilities are symmetric on the interval [1, N]: γ_1 = · · · = γ_N = 1/N. In our model, the first truncated subgame starts at Stage 0. Invoking Theorem 7 for the k-th truncated subgame, if there exist functions V_i^k(x, j), for i ∈ I and j ∈ {k, . . . , N}, such that the following recursive relations are satisfied: then the strategies u_i, i ∈ I that maximize the right hand side of System (35) constitute the feedback Nash equilibrium. The corresponding Bellman functions can be obtained as follows:

Theorem 9. The Bellman functions of the feedback Nash equilibrium of the k-th truncated subgame in (35) are: where the coefficients satisfy recursive relations with the initial conditions A_i^{k,N} = m_i, B^{k,N+1}(i) = 0, i ∈ {1, ..., n}, and A_j^{k,N} = 0 for j ∈ I \ {i}.

Cooperative Outcomes in a Truncated Subgame
In this section, we focus on the cooperative setting, where all firms agree to cooperate in each truncated subgame. Denote by Γ_k^c(x_k^0, k, k + T̄) the truncated cooperative subgame on the interval [k, k + T̄] with the initial condition x_k^0. All firms aim to maximize their joint payoff: Suppose that the set of strategies {u_i^{*,k,l}, for i ∈ {1, ..., n}, l = 0, …, T̄ and k ∈ {0, ..., N − T̄}} provides an optimal control for the game (30), (38). Invoking Theorem 2, if there exist functions {W^k(l, x_k^l), x_k^l = (x_1^{k,l}, . . . , x_n^{k,l})}, for l = 0, …, T̄ and k ∈ {0, ..., N − T̄} in each truncated subgame Γ_k^c(x_k^0, k, k + T̄), then the following recursive relations are satisfied:

Theorem 10. The maximum joint payoff of the players in each truncated subgame Γ_k^c(x_k^0, k, k + T̄) has the following form: where Ĉ_i^{k,l}, D̂_i^{k,l}, i ∈ {1, ..., n}, k ∈ {0, ..., N − T̄}, l = 0, …, T̄ satisfy the relations: The optimal cooperative strategies for each truncated subgame have the form:

Denote by the cooperative trajectory x_k^{*,l} = (x_1^{*,k,l}, . . . , x_n^{*,k,l}), l = 0, …, T̄ the solution of (39) with the optimal strategies u^* involved. The cooperative trajectory in each truncated subgame can be calculated using (30) as follows, k ∈ {0, . . . , N − T̄}: So far, we have formalized the case when all firms cooperate in the game model with dynamic updating and a fixed information horizon. Hence, we proceed to the case when the information horizon is random, where T̄_1 has symmetric probabilities on the interval [1, N], γ_1 = · · · = γ_N = 1/N. In the k-th cooperative truncated subgame, there must exist functions W^k(x, f), for f ∈ {k, . . . , T̄_{k−1}}, and Ŵ^k(x, τ), for τ ∈ {T̄_{k−1} + 1, . . . , N}, such that the following recursive relations are satisfied:

Characteristic Function in a Truncated Subgame
Using the characteristic function formula (14), we define the characteristic function for each truncated subgame: where Ṽ_k(S, x_k^{*,0}) is defined as the total payoff of the players from coalition S in the feedback Nash equilibrium ū^{NE} = (ū_1, ..., ū_n) of the game with the following set of players: coalition S (acting as one player) and the players from the set I \ S, i.e., in the game with |I \ S| + 1 players.
Suppose that the Bellman function for coalition S ⊆ I and the players i ∈ I \ S in the truncated subgame Γ_k^c(x_k^0, k, k + T̄) has the following form: then the corresponding system of Bellman equations has the form:

Numerical Simulation for a Dynamic Advertising Game Model with Updating
We consider a specific three firm oligopoly case on the stage interval with N = 8 and a fixed information horizon T̄ = 2. In order to illustrate our model, Figures 5 and 6 show the dynamics of the players' market shares corresponding to the feedback Nash equilibrium strategies of each firm in the noncooperative case and the dynamics of the market shares corresponding to the optimal cooperative strategies in the cooperative case, respectively, for both the initial game and the game model with dynamic updating.

Figure 9 shows the difference between the resulting Shapley value and the Shapley value in the initial game model. As we can see, the resulting Shapley value of the game with a moving information horizon changes more steadily over the stages than the Shapley value of the initial game. It can also be seen that the competition among the companies is fiercer in the initial game than in the game with a moving information horizon, considering the market shares in Figure 6 and the advertising expenditures in Figure 7. The Shapley value of each company barely changes in the last stage, which means that the profit of each company does not change much, as is shown in Figure 9. Suppose that the realizations of the information horizon are T̄_1 = 2, T̄_2 = 5, T̄_3 = 8. Figure 10 shows the difference between the resulting noncooperative trajectories with a fixed information horizon and a random information horizon, respectively.

Figure 10. The resulting noncooperative trajectory with a fixed information horizon (dashed line) and the resulting noncooperative trajectory with a random information horizon (solid line).

Figure 11 shows the difference between the resulting noncooperative outcomes of each firm in the noncooperative case with the fixed information horizon and random information horizon approaches.

Figure 11. The resulting noncooperative outcomes with a fixed information horizon (dashed line) and the resulting noncooperative outcomes with a random information horizon (solid line).

Figure 12 shows the difference between the resulting cooperative trajectories with fixed and random information horizons.

Figure 12. The resulting cooperative trajectory with a fixed information horizon (dashed line) and the resulting cooperative trajectory with a random information horizon (solid line).

Conclusions
In this paper, a novel approach was presented for defining the solution of a dynamic game model in which information is updated dynamically. Here, the players do not have full information about the game structure over the whole time interval on which the game is defined, but they have certain information about the game structure on a fixed interval, namely the information horizon. As time evolves, the information about the game structure is updated. Moreover, a random information horizon was considered for this approach. Therefore, the approach can be used to perform a more realistic modeling of real-life conflict processes.
The dynamic oligopoly marketing model of advertising was considered, and the corresponding numerical example was used to illustrate the comparison between the initial dynamic marketing model and the corresponding model with dynamic updating. Besides, a comparison between the looking forward approach with a fixed information horizon and the looking forward approach with a random information horizon was presented.

Appendix A

Denote the function: According to Definition 8, the right hand side of (A1) can be rewritten as: The right hand side of (A3) is equal to V̂(I; x̂_k^*, N − k) according to Definition 11; therefore, (A1) is correct. Now, consider (A2), whose left hand side can be rewritten as:

The theorem is proven. Anticipating that the controls will be shown to be positive, substituting them into (A13) and collecting the terms together, the parameters Ã_i^{k,l} and B̃^{k,l}(i) can be expressed as: where Z_i = hρ_i/(2(n − 1)).
Appendix A.5. Proof of Theorem 9

Consider the Bellman function in the last stage, the stage N. Performing the indicated maximization in it yields: Invoking (36) and V_i^k(x_i^N, N) = m_i x_i^N, the second equation in (35) becomes: Performing the indicated maximization in the above yields: for i ∈ I and τ ∈ {T̄_{k−1} + 1, . . . , N − 1}; then the feedback Nash equilibrium strategy of firm i ∈ I can be obtained in the form: Anticipating that the controls will be shown to be positive, substituting this into the second equation in (40) and collecting the terms together, the parameters C̃_i^{k,τ} and D̃_i^{k,τ} can be expressed as: