Strong Time-Consistent Solution for Cooperative Differential Games with Network Structure

Abstract: A class of cooperative differential games on networks is considered. It is assumed that interaction on the network is possible not only between neighboring players, but also between players connected by paths. Various cooperative optimality principles and their properties for such games are investigated. A construction of the characteristic function is proposed that takes into account the network structure of the game and the ability of players to cut off connections. The conditions under which a strong time-consistent subcore is not empty are studied. A formula for explicit calculation of the Shapley value is derived. The results are illustrated by a differential marketing game.


Introduction
One of the current directions of research in game theory is network games. Such games study the interactions of individuals who are connected through a network and whose behavior depends on the behavior of their neighbors. An overview and structuring of the main trends in network games can be found in [1][2][3][4].
The theory of differential games on networks is commonly used when decision-making evolves continuously in time. Papers investigating applied models of differential games on networks confirm the relevance of this topic. For example, a differential game of a duopoly with network externalities is examined in [5]. Differential games with network structure in marketing are considered in [6]. An application of such games to regional cooperation is developed in [7]. An adaptation of differential games to the problem of network traffic is proposed in [8]. Cooperative differential games on networks were first considered by L. Petrosyan [9], who introduced a new type of strategies that includes the possibility of cutting links with neighboring players during the game. When evaluating a coalition's worth, the classical methods assume that players who are not in the coalition minimize the coalition's payoff. The new type of strategies makes it possible to measure a coalition's worth without considering the actions of players who are not members of the coalition. Based on this, a novel form of characteristic function, called the cooperative-trajectory characteristic function, is proposed in [10]. Moreover, it satisfies the convexity property. In [11], formulas for the Shapley value and some core imputations are obtained using the new characteristic function. In this paper, we generalize the approach used in [10,11] to the more general case in which the payoff of a player depends not only on his neighbors' actions but also on the actions of players connected to him by paths in the network. This case differs essentially from the one considered in [10]. In such games, the convexity of the new characteristic function depends on the structure of the network on which the game is defined.
In this paper, we consider cooperative differential games on networks defined on a given time interval. We suppose that payoffs are transferable. By cooperation, we mean that the players choose their strategies to maximize the sum of payoffs and then define a rule (optimality principle) for allocating this joint payoff among them. There are different optimality principles in classical and differential game theory, such as the core, the NM-solution, the Shapley value, and others. However, a serious problem arises with time-consistency and, in the case of set-valued optimality principles, also with strong time-consistency, which very often is not satisfied (see [12,13]). This calls into question the application of the optimality principles mentioned above.
Thus, our main contributions to the literature on differential network games are the following:

1. A new characteristic function for the game with interactions through paths is defined.
2. It is found that the properties of such a characteristic function depend on the game's network structure.
3. A condition under which the characteristic function is convex is obtained.
4. Under this condition, we prove the non-emptiness of the strong time-consistent subcore and obtain explicit forms for the Shapley value and for the imputation distribution procedure (IDP) of imputations from the strong time-consistent subcore.
The paper is structured as follows. The definition of the cooperative differential game on a network is given in Section 2. In Section 3, the characteristic function based on cooperative strategies used by players from a coalition is defined. As mentioned above, the basic difference from previous approaches is that, when defining the value of the characteristic function for a given coalition, it is supposed that the left-out players, trying to minimize the payoff of the coalition, cut connections with players from this coalition. Based on the defined characteristic function, the core is constructed, and a strong time-consistent solution as a subset of the core is proposed in Section 4. The formula for the dynamic Shapley value is derived in Section 5. As an illustrative example, a differential marketing game on the network is investigated in Section 6. For this game, the characteristic function, the strong time-consistent subset of the core, and the dynamic Shapley value are computed in explicit form.

Problem Formulation
Consider a class of $n$-person differential games with prescribed duration $T$. Let $N = \{1, 2, \ldots, n\}$ be the set of players. The players are connected in a network system. A pair $(N, L)$ is called a network, where $N$ is a set of nodes and $L \subset N \times N$ is a given set of links. The nodes are used to represent the players. If the pair $(i, j) \in L$, there is a link connecting players $i \in N$ and $j \in N$. It is supposed that all links are undirected.
Denote the set of players directly connected to player $i$ by $K(i) = \{j : (i, j) \in L\}$. Denote by $K^m(i)$, where $m \ge 2$, the set of players connected with player $i \in N$ by a minimal path containing exactly $m$ edges (only paths without cycles and loops are considered), and let $K^1(i) = K(i)$. Every player $i \in N$ can, at any instant of time, cut the connection with any player from $K(i)$.
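The distance classes of a player can be computed with a breadth-first search. The following is a minimal sketch, assuming that "minimal path with exactly m edges" means shortest-path distance m; the function name `distance_classes` and the four-player tree are hypothetical and serve only as an illustration.

```python
from collections import deque

def distance_classes(players, links, i):
    """Return {m: K^m(i)}: players whose shortest path to i has exactly m edges."""
    adjacency = {p: set() for p in players}
    for a, b in links:            # links are undirected
        adjacency[a].add(b)
        adjacency[b].add(a)
    distance = {i: 0}
    queue = deque([i])
    while queue:                  # breadth-first search from player i
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in distance:
                distance[w] = distance[v] + 1
                queue.append(w)
    classes = {}
    for player, m in distance.items():
        if m >= 1:                # player i itself is not a member of any class
            classes.setdefault(m, set()).add(player)
    return classes

# Hypothetical 4-player tree with links 1-2, 2-4, 1-3
links = [(1, 2), (2, 4), (1, 3)]
print(distance_classes([1, 2, 3, 4], links, 4))   # {1: {2}, 2: {1}, 3: {3}}
```

Players that cannot be reached from player i do not appear in any class, which matches the convention that only connected players contribute to the payoff.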
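The system dynamics is given by
$$\dot{x}_i(\tau) = f_i(x_i(\tau), u_i(\tau)), \quad x_i(t_0) = x_i^0, \quad \tau \in [t_0, T], \quad i \in N. \tag{1}$$
Here, $x_i(t) \in \mathbb{R}^m$ is the state variable of player $i \in N$ at time $t$ and $u_i(t) \in U_i \subset \mathbb{R}^k$ is the control variable of player $i \in N$. The function $f_i(x_i, u_i)$ is continuously differentiable in $x_i$ and $u_i$.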
The payoff function of player $i$ depends on his state variable, the state variables of players from the sets $K^m(i)$, $1 \le m \le n - 1$, and his control variable. In particular, the payoff of player $i$ is given as
$$H_i(x_0, T - t_0; u_1, \ldots, u_n) = \int_{t_0}^{T} \Big[ h_i^i(x_i(\tau); x_i(\tau); u_i(\tau)) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K^m(i)} h_i^j(x_i(\tau); x_j(\tau); u_i(\tau)) \Big] d\tau. \tag{2}$$
The term $h_i^j(x_i(\tau); x_j(\tau); u_i(\tau))$ is the instantaneous gain that player $i$ can obtain through interaction with player $j \in K^m(i)$, and $h_i^i(x_i(\tau); x_i(\tau); u_i(\tau))$ is the instantaneous gain that player $i$ can obtain by himself. Assume that $\delta \in (0, 1)$. The multiplier $\delta^{m-1}$ shows that the farther players are in the network from player $i$, the less their behavior influences the payoff of this player. Suppose that the functions $h_i^j(x_i(\tau); x_j(\tau); u_i(\tau))$, for $j \in K^m(i)$, $j \ne i$, are non-negative. We denote by $x_0 = (x_1^0, x_2^0, \ldots, x_n^0)$ the vector of initial conditions. We say that we have the game $\Gamma(x_0, T - t_0)$ if the network $(N, L)$ is defined, the system dynamics (1) and the sets of feasible controls $U_i$, $i \in N$, are given, and the players' payoffs are determined by (2). Each player, choosing a control variable $u_i$ from his set of feasible controls, steers his own state according to the differential Equation (1) and seeks to maximize his objective functional (2).
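Suppose that players can cooperate in order to achieve the maximum joint payoff subject to dynamics (1). The optimal cooperative strategies of players $\bar{u}(t) = (\bar{u}_1(t), \ldots, \bar{u}_n(t))$, for $t \in [t_0, T]$, are defined as follows:
$$\bar{u}(t) = \arg\max_{u_1, \ldots, u_n} \sum_{i \in N} \int_{t_0}^{T} \Big[ h_i^i(x_i(\tau); x_i(\tau); u_i(\tau)) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K^m(i)} h_i^j(x_i(\tau); x_j(\tau); u_i(\tau)) \Big] d\tau. \tag{4}$$
The trajectory corresponding to the optimal cooperative strategies $(\bar{u}_1(t), \ldots, \bar{u}_n(t))$ is the optimal cooperative trajectory $\bar{x}(t) = (\bar{x}_1(t), \bar{x}_2(t), \ldots, \bar{x}_n(t))$. The maximum joint payoff can be expressed as
$$\sum_{i \in N} \int_{t_0}^{T} \Big[ h_i^i(\bar{x}_i(\tau); \bar{x}_i(\tau); \bar{u}_i(\tau)) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K^m(i)} h_i^j(\bar{x}_i(\tau); \bar{x}_j(\tau); \bar{u}_i(\tau)) \Big] d\tau \tag{5}$$
subject to dynamics
$$\dot{\bar{x}}_i(\tau) = f_i(\bar{x}_i(\tau), \bar{u}_i(\tau)), \quad \bar{x}_i(t_0) = x_i^0, \quad \tau \in [t_0, T], \quad i \in N. \tag{6}$$
To determine how to allocate the maximum total payoff among the players under an agreeable scheme, it is necessary to define the characteristic function.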
There are many approaches to define the characteristic function (see [14][15][16]). In [9], for differential games on networks, it is supposed that the worth of the coalition S ⊂ N does not take into account the actions of players from the coalition N \ S, since the worst thing they can do for the coalition S is to cut the connection with players from S. In [10], it is also proposed to find the value of the characteristic function of S on the cooperative trajectory when the players from S use cooperative strategies under the condition that the connections with players from N \ S are cut off. The characteristic function constructed in this way is easier to compute and possesses some advantageous properties. In this paper, we apply this approach to the class of games under consideration.
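Let $S \subset N$. A pair $(S, L_S)$ is called a subnet (subgraph) if it contains only the subset $S$ of the vertices (players) of the original network and $L_S$ contains all links from $L$ whose initial and final vertices both lie in $S$. For player $i \in S$, denote by $K_S^m(i)$, where $m \ge 2$, the set of players connected with player $i$ by a minimal path in $L_S$ containing exactly $m$ edges, and let $K_S^1(i) = K(i) \cap S$. The worth of coalition $S$ in the game is evaluated along the cooperative trajectory:
$$V(S; x_0, T - t_0) = \sum_{i \in S} \int_{t_0}^{T} \Big[ h_i^i(\bar{x}_i(\tau); \bar{x}_i(\tau); \bar{u}_i(\tau)) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_S^m(i)} h_i^j(\bar{x}_i(\tau); \bar{x}_j(\tau); \bar{u}_i(\tau)) \Big] d\tau, \tag{7}$$
where $\bar{x}_i(t)$ and $\bar{u}_i(t)$ are the solutions obtained in (4) and (6). Similarly, the cooperative-trajectory characteristic function of the subgame $\Gamma(\bar{x}(t), T - t)$ starting at time $t \in [t_0, T]$ can be evaluated as
$$V(S; \bar{x}(t), T - t) = \sum_{i \in S} \int_{t}^{T} \Big[ h_i^i(\bar{x}_i(\tau); \bar{x}_i(\tau); \bar{u}_i(\tau)) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_S^m(i)} h_i^j(\bar{x}_i(\tau); \bar{x}_j(\tau); \bar{u}_i(\tau)) \Big] d\tau. \tag{8}$$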
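As a small worked illustration of this construction (a sketch on assumed data, not taken from the example of Section 6), consider a hypothetical three-player path network with links (1, 2) and (2, 3), and write $h_i^j$ for the instantaneous gain of player $i$ from player $j$ evaluated along the cooperative trajectory. Players 1 and 3 are at distance 2 in the full network, but they are disconnected in the subnet induced by $\{1, 3\}$, so they gain nothing from each other there:

```latex
\begin{align*}
V(\{1,2\}; x_0, T-t_0) &= \int_{t_0}^{T} \big[ h_1^1 + h_2^2 + h_1^2 + h_2^1 \big]\, d\tau, \\
V(\{1,3\}; x_0, T-t_0) &= \int_{t_0}^{T} \big[ h_1^1 + h_3^3 \big]\, d\tau, \\
V(N; x_0, T-t_0) &= \int_{t_0}^{T} \big[ h_1^1 + h_2^2 + h_3^3
      + h_1^2 + h_2^1 + h_2^3 + h_3^2 + \delta\,(h_1^3 + h_3^1) \big]\, d\tau .
\end{align*}
```

The factor $\delta$ in the last line comes from the multiplier $\delta^{m-1}$ with $m = 2$.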

Properties of the Characteristic Function
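In [10,11], the characteristic function is constructed in a similar way, but it is assumed that $h_i^j = 0$ if the players $i$ and $j$ are not connected by an edge. It was shown that such a characteristic function is convex. A characteristic function $V(S; x_0, T - t_0)$ is called convex (or supermodular) if for any coalitions $S_1, S_2 \subseteq N$ the following condition holds:
$$V(S_1 \cup S_2; x_0, T - t_0) + V(S_1 \cap S_2; x_0, T - t_0) \ge V(S_1; x_0, T - t_0) + V(S_2; x_0, T - t_0).$$
A game is called convex if its characteristic function is convex. However, in our case, additional restrictions on the network are required for the characteristic function to be convex.
Define functions $W(S; t)$, which can be interpreted as instantaneous values of the characteristic function, according to the following rule:
$$W(S; t) = \sum_{i \in S} \Big[ h_i^i(\bar{x}_i(t); \bar{x}_i(t); \bar{u}_i(t)) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_S^m(i)} h_i^j(\bar{x}_i(t); \bar{x}_j(t); \bar{u}_i(t)) \Big]. \tag{9}$$

Proposition 1.
Let $S_2 \subset S_1 \subset N$. If there are no cycles in the network $(N, L)$, then the following inequality holds for each $i \in N \setminus S_1$ and each $t \in [t_0, T]$:
$$W(S_1 \cup \{i\}; t) - W(S_1; t) \ge W(S_2 \cup \{i\}; t) - W(S_2; t). \tag{10}$$
Proof. For simplicity, we denote $h_i^j(t) = h_i^j(\bar{x}_i(t); \bar{x}_j(t); \bar{u}_i(t))$. The absence of cycles in the network (tree or forest) means that there can be only one path between any two vertices. Suppose $i \notin S_1$. If vertex $i$ lies on the path between some vertices in the coalition $S_1 \cup \{i\}$, then the path between these vertices does not exist in $(S_1, L_{S_1})$. Let $P_S^m(i)$ be the set of pairs of vertices $\{p, q\}$ such that $p \in S \setminus i$, $q \in S \setminus i$, the distance between them equals $m$, all the vertices of the path between $p$ and $q$ belong to $S$, and $i$ lies on this path. Then,
$$W(S_1 \cup \{i\}; t) - W(S_1; t) = h_i^i(t) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_{S_1 \cup \{i\}}^m(i)} \big( h_i^j(t) + h_j^i(t) \big) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{\{p,q\} \in P_{S_1 \cup \{i\}}^m(i)} \big( h_p^q(t) + h_q^p(t) \big), \tag{11}$$
$$W(S_2 \cup \{i\}; t) - W(S_2; t) = h_i^i(t) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_{S_2 \cup \{i\}}^m(i)} \big( h_i^j(t) + h_j^i(t) \big) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{\{p,q\} \in P_{S_2 \cup \{i\}}^m(i)} \big( h_p^q(t) + h_q^p(t) \big). \tag{12}$$
Since $S_2 \subset S_1$, every interaction term on the right-hand side of (12) also appears on the right-hand side of (11), and all interaction terms are non-negative. It follows that (10) is satisfied.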

Remark 1.
Note that the presence of a cycle in the network can lead to the violation of property (10). Indeed, the presence of a cycle in the network allows several paths between two vertices. Let us assume that there are two paths of length $r$ between vertices $p \in S_2$ and $q \in S_2$. Suppose that all vertices of the first path belong to $S_2 \cup \{i\}$, and vertex $i$ lies on this path. Assume also that the second path contains vertices from $S_1 \setminus S_2$ and vertex $i$ does not lie on this path.
Note that there exists a path between $p$ and $q$ in $(S_2 \cup \{i\}, L_{S_2 \cup \{i\}})$, but there is no path between these vertices in $(S_2, L_{S_2})$, since the first path goes through player $i$, who is no longer in the coalition, and the second path does not belong to $(S_2, L_{S_2})$. Then, there is a term $\delta^{r-1}(h_q^p(t) + h_p^q(t))$ on the right-hand side of (12).
There are two paths of length $r$ between vertices $p$ and $q$ in $(S_1 \cup \{i\}, L_{S_1 \cup \{i\}})$. One of them passes through vertex $i$ and the other does not. Thus, there exists a path between $p$ and $q$ in $(S_1, L_{S_1})$. This means that, on the right-hand side of (11), there is no term corresponding to the vertices $p$ and $q$.

Corollary 1.
If there are no cycles in the network $(N, L)$, then the characteristic function defined in (7) and (8) is convex.
Proof. It is shown that, for each $\tau \in [t_0, T]$, $S_1 \subset N$, $S_2 \subset S_1$, and each $i \in N \setminus S_1$, the following inequality holds:
$$W(S_1 \cup \{i\}; \tau) - W(S_1; \tau) \ge W(S_2 \cup \{i\}; \tau) - W(S_2; \tau).$$
Integrating both sides of this inequality with respect to $\tau$ over $[t, T]$, we have for each $t \in [t_0, T]$,
$$V(S_1 \cup \{i\}; \bar{x}(t), T - t) - V(S_1; \bar{x}(t), T - t) \ge V(S_2 \cup \{i\}; \bar{x}(t), T - t) - V(S_2; \bar{x}(t), T - t).$$
This means that the characteristic function defined in (7) and (8) is convex (see [17]).
The absence of cycles guarantees the convexity of the characteristic function. However, even if there are cycles in the network, the characteristic function defined according to rule (7) and (8) may be convex for some parameter values. Therefore, formula (7) for the characteristic function is relevant not only for networks without cycles.
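The role of cycles can be checked numerically on a small case. The sketch below is an illustration under stated assumptions, not the authors' computation: the gains are taken constant in time, own gains are omitted because they are additive and do not affect convexity, and the function names `W` and `is_convex`, the gain values, and both four-player networks are hypothetical.

```python
from collections import deque
from itertools import combinations

def W(S, links, gain, delta):
    """Instantaneous coalition value on the subnet induced by S (own gains omitted)."""
    S = frozenset(S)
    adjacency = {v: [] for v in S}
    for a, b in links:
        if a in S and b in S:     # keep only links with both endpoints in S
            adjacency[a].append(b)
            adjacency[b].append(a)
    total = 0.0
    for i in S:
        distance = {i: 0}
        queue = deque([i])
        while queue:              # shortest-path lengths from i inside the subnet
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
        total += sum(delta ** (m - 1) * gain(i, j)
                     for j, m in distance.items() if j != i)
    return total

def is_convex(players, value, tol=1e-12):
    """Check supermodularity v(A | B) + v(A & B) >= v(A) + v(B) for all coalitions."""
    subsets = [frozenset(c) for r in range(len(players) + 1)
               for c in combinations(players, r)]
    return all(value(a | b) + value(a & b) >= value(a) + value(b) - tol
               for a in subsets for b in subsets)

players = [1, 2, 3, 4]
# Assumed gains: players 1 and 3 benefit strongly from each other, all other pairs weakly
gain = lambda i, j: 10.0 if {i, j} == {1, 3} else 1.0
path_links = [(1, 2), (2, 3), (3, 4)]          # tree: no cycles
cycle_links = path_links + [(4, 1)]            # adding one link closes a cycle

print(is_convex(players, lambda S: W(S, path_links, gain, 0.5)))    # True
print(is_convex(players, lambda S: W(S, cycle_links, gain, 0.5)))   # False
```

On the cycle, the coalitions {1, 2, 3} and {1, 3, 4} each connect players 1 and 3 through their own path, so the pair's gain is counted in both coalitions but only once in their union, which is the mechanism described in Remark 1.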

Strong Time-Consistent Subcore
The next step in defining a cooperative solution is to determine an allocation rule that distributes the maximum joint payoff among the players. We again assume that there are no cycles in the network $(N, L)$.
The set of all imputations in the game $\Gamma(x_0, T - t_0)$, denoted $L(x_0, T - t_0)$, is given by
$$L(x_0, T - t_0) = \Big\{ \xi = (\xi_1, \ldots, \xi_n) : \sum_{i \in N} \xi_i = V(N; x_0, T - t_0), \ \xi_i \ge V(\{i\}; x_0, T - t_0), \ i \in N \Big\}.$$
Definition 1 ([17]). The core $C(x_0, T - t_0)$ of the game $\Gamma(x_0, T - t_0)$ is the subset of the imputation set $L(x_0, T - t_0)$ such that
$$\sum_{i \in S} \xi_i \ge V(S; x_0, T - t_0) \quad \text{for all } S \subset N. \tag{15}$$
Similarly, for every $t \in [t_0, T]$ denote by $L(\bar{x}(t), T - t)$ the set of all imputations and by $C(\bar{x}(t), T - t)$ the core in the subgame $\Gamma(\bar{x}(t), T - t)$ along the cooperative trajectory.
The convexity of the characteristic function (7) and (8), which is proved above, guarantees that the core is not empty for every $t \in [t_0, T]$ [17].
The time consistency of cooperative solutions is an important issue in dynamic games. Suppose the optimality principle chosen by the players contains a set of imputations. In that case, as the game evolves along the cooperative trajectory $\bar{x}(t)$, the players can switch from the imputation chosen at the initial time to any other imputation of the same optimality principle in the current subgame. Will the resulting payments still correspond to an imputation from the optimality principle chosen initially? This is true only if the cooperative solution is strong time-consistent. This property of cooperative solutions was introduced by L. Petrosyan [13]. Let us show that, in the class of games under consideration, there is a subset of the core that is strong time-consistent.
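Definition 2 (see [18]). A function $\beta(\tau) = (\beta_1(\tau), \ldots, \beta_n(\tau))$, $\tau \in [t_0, T]$, is called an imputation distribution procedure (IDP) for the imputation $\alpha \in L(x_0, T - t_0)$ if
$$\alpha_i = \int_{t_0}^{T} \beta_i(\tau)\, d\tau, \quad i \in N.$$
Definition 3 (see [13]). An optimality principle $M(x_0, T - t_0)$ is called strong time-consistent if the following conditions hold:
1. $M(\bar{x}(t), T - t) \ne \emptyset$ for all $t \in [t_0, T]$.
2. For each $\alpha \in M(x_0, T - t_0)$ there exists an IDP $\beta(\tau) = (\beta_1(\tau), \ldots, \beta_n(\tau))$, $\tau \in [t_0, T]$, such that
$$\int_{t_0}^{t} \beta(\tau)\, d\tau \oplus M(\bar{x}(t), T - t) \subset M(x_0, T - t_0) \quad \text{for all } t \in [t_0, T].$$
For $a \in \mathbb{R}^n$, $B \subset \mathbb{R}^n$, the symbol $\oplus$ means the following: $a \oplus B = \{a + b : b \in B\}$.
In [19], a method for constructing a strong time-consistent subset of the core is proposed. Furthermore, conditions under which such a subset exists are obtained. In this paper, we implement this approach for the class of games under consideration, taking into account the peculiarities of the definition of the characteristic function.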
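Using the functions $W(S; t)$ introduced in (9), we define $B(t)$ for each $t \in [t_0, T]$ as the set of functions $\beta(t) = (\beta_1(t), \ldots, \beta_n(t))$ such that
$$\sum_{i \in N} \beta_i(t) = W(N; t), \qquad \sum_{i \in S} \beta_i(t) \ge W(S; t) \quad \text{for all } S \subset N. \tag{17}$$
Note that, if we consider $W(S; t) : 2^N \to \mathbb{R}$ as a characteristic function of a stationary game, then $\beta(t) \in B(t)$ is an imputation in this game and $B(t)$ coincides with its core. Taking into account property (10), we can conclude that, for each $t \in [t_0, T]$, the set $B(t)$ is not empty in the considered class of games, since the characteristic function $W(S; t)$ is convex at each moment $t$ [17].
To prove that $\beta(t) \in B(t)$ is an IDP in the initial differential game $\Gamma(x_0, T - t_0)$, let us consider $\beta(t) \in B(t)$. Now, we define $\xi_i(x_0, T - t_0)$ in the following way:
$$\xi_i(x_0, T - t_0) = \int_{t_0}^{T} \beta_i(\tau)\, d\tau, \quad i \in N.$$
According to (17),
$$\sum_{i \in N} \xi_i(x_0, T - t_0) = \int_{t_0}^{T} W(N; \tau)\, d\tau = V(N; x_0, T - t_0), \qquad \xi_i(x_0, T - t_0) \ge \int_{t_0}^{T} W(\{i\}; \tau)\, d\tau = V(\{i\}; x_0, T - t_0).$$
Then,
$$\xi(\bar{x}(t), T - t) = \int_{t}^{T} \beta(\tau)\, d\tau \tag{20}$$
is an imputation in $\Gamma(\bar{x}(t), T - t)$. Denote by $\overline{C}(\bar{x}(t), T - t)$ the set of all imputations $\xi(\bar{x}(t), T - t)$ specified by Equation (20) for all $\beta(t) \in B(t)$.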

Proposition 2.
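In the class of games under consideration, the sets $\overline{C}(\bar{x}(t), T - t)$ are non-empty for every $t \in [t_0, T]$ and $\overline{C}(\bar{x}(t), T - t) \subset C(\bar{x}(t), T - t)$.
Proof. The sets $B(t)$ are non-empty for each $t \in [t_0, T]$ because the functions $W(S; t)$ are convex for each $t \in [t_0, T]$. Then, the sets $\overline{C}(\bar{x}(t), T - t)$ are non-empty for each $t \in [t_0, T]$. Now, we prove that $\overline{C}(\bar{x}(t), T - t) \subset C(\bar{x}(t), T - t)$. Let $\xi(\bar{x}(t), T - t) \in \overline{C}(\bar{x}(t), T - t)$, so that $\xi_i(\bar{x}(t), T - t) = \int_t^T \beta_i(\tau)\, d\tau$ for some $\beta(\tau) \in B(\tau)$. Then, for every $S \subset N$,
$$\sum_{i \in S} \xi_i(\bar{x}(t), T - t) = \int_{t}^{T} \sum_{i \in S} \beta_i(\tau)\, d\tau \ge \int_{t}^{T} W(S; \tau)\, d\tau = V(S; \bar{x}(t), T - t). \tag{22}$$
From (22), it follows that the imputation $\xi(\bar{x}(t), T - t)$ belongs to the core $C(\bar{x}(t), T - t)$ for each $t \in [t_0, T]$. This means that $\overline{C}(\bar{x}(t), T - t) \subset C(\bar{x}(t), T - t)$ for each $t \in [t_0, T]$.

Proposition 3.
In the class of games under consideration, the subcore $\overline{C}(x_0, T - t_0)$ is strong time-consistent.
Proof. The proof follows directly from Theorem 5.1 in [19].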
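Consider also a rule for constructing some IDP from the set $B(t)$. Denote by $D^m(i)$ the set of pairs $\{k, l\}$, where $k \in N$, $l \in N$, the distance between $k$ and $l$ equals $m$, and vertex $i$ belongs to the path between $k$ and $l$ ($i$ can coincide with $k$ or $l$). Let $D_S^m(i)$ be the set of pairs $\{k, l\}$, where $k \in S$, $l \in S$, the distance between $k$ and $l$ equals $m$, all vertices of the path between them belong to $S$, and vertex $i$ lies on this path.
For each pair of vertices $\{p, q\}$, $p, q \in N$, we denote by $\Phi_{p,q}$ the set of vectors
$$\varphi_{p,q} = \big( \varphi_{p,q}^{\gamma_1}, \ldots, \varphi_{p,q}^{\gamma_{m+1}} \big), \qquad \varphi_{p,q}^{\gamma_j} \ge 0, \qquad \sum_{j=1}^{m+1} \varphi_{p,q}^{\gamma_j} = 1.$$
Here, we assume the length of the path between $p$ and $q$ equals $m$ (for each vertex $\gamma_j$, $j = 1, \ldots, m + 1$, belonging to the path between $p$ and $q$, the coefficient $\varphi_{p,q}^{\gamma_j}$ is given). Writing $h_i^j(t) = h_i^j(\bar{x}_i(t); \bar{x}_j(t); \bar{u}_i(t))$, consider the functions
$$\beta_i(t) = h_i^i(t) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{\{k,l\} \in D^m(i)} \varphi_{k,l}^{i} \big( h_k^l(t) + h_l^k(t) \big), \quad i \in N. \tag{23}$$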

Proposition 4.
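In the class of games under consideration, the IDP defined by Equation (23) belongs to the set $B(t)$ for any $\varphi_{p,q} \in \Phi_{p,q}$ and each pair of vertices $p \in N$, $q \in N$.
Proof. We prove that the vector defined by Equation (23) satisfies conditions (17). Indeed, since the coefficients of each pair sum to one over the vertices of the path between them,
$$\sum_{i \in N} \beta_i(t) = \sum_{i \in N} h_i^i(t) + \sum_{i \in N} \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K^m(i)} h_i^j(t) = W(N; t),$$
and, since the gains are non-negative, for every $S \subset N$
$$\sum_{i \in S} \beta_i(t) \ge \sum_{i \in S} h_i^i(t) + \sum_{i \in S} \sum_{m=1}^{n-1} \delta^{m-1} \sum_{\{k,l\} \in D_S^m(i)} \varphi_{k,l}^{i} \big( h_k^l(t) + h_l^k(t) \big) = W(S; t).$$
This concludes the proof.
Denote the set of all IDPs constructed with Equation (23) by $B_\varphi(t)$. Proposition 4 shows that $B_\varphi(t) \subset B(t)$ for each $t \in [t_0, T]$. Formula (23) allows finding an IDP in explicit form.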
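The sharing rule can be tried out on a small tree. The sketch below is only an illustration under stated assumptions: the network is a connected tree, the gains are constant in time, the joint discounted gain of each pair of players is split among the vertices of the path between them, and equal shares are used for every pair without prescribed coefficients. The helper names (`tree_pairs`, `idp`, `in_B`) and all numerical values are hypothetical.

```python
from collections import deque
from itertools import combinations

def tree_pairs(players, links, gain, delta):
    """For each pair (k, l): vertices of the unique path and the discounted joint gain."""
    adjacency = {v: [] for v in players}
    for a, b in links:
        adjacency[a].append(b)
        adjacency[b].append(a)
    pairs = {}
    for k, l in combinations(players, 2):
        parent = {k: None}
        queue = deque([k])
        while queue:                      # BFS from k records parent pointers
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in parent:
                    parent[w] = v
                    queue.append(w)
        path = [l]
        while path[-1] != k:              # walk back from l to k
            path.append(parent[path[-1]])
        m = len(path) - 1                 # number of edges on the path
        pairs[(k, l)] = (path, delta ** (m - 1) * (gain(k, l) + gain(l, k)))
    return pairs

def W(S, own, pairs):
    """Instantaneous value: own gains plus pair gains whose whole path lies in S."""
    S = set(S)
    return sum(own[i] for i in S) + sum(g for path, g in pairs.values() if set(path) <= S)

def idp(players, own, pairs, phi):
    """Split each pair's gain among the path vertices; equal shares unless phi is given."""
    beta = {i: own[i] for i in players}
    for key, (path, g) in pairs.items():
        shares = phi.get(key, {v: 1 / len(path) for v in path})
        for v in path:
            beta[v] += shares[v] * g
    return beta

def in_B(players, beta, own, pairs, tol=1e-9):
    """Efficiency plus coalitional rationality with respect to W."""
    if abs(sum(beta.values()) - W(players, own, pairs)) > tol:
        return False
    return all(sum(beta[i] for i in S) >= W(S, own, pairs) - tol
               for r in range(1, len(players)) for S in combinations(players, r))

players = [1, 2, 3, 4]
links = [(1, 2), (2, 4), (1, 3)]                  # hypothetical tree
own = {i: 1.0 for i in players}
pairs = tree_pairs(players, links, lambda i, j: 1.0 + 0.1 * i, 0.5)
phi = {(1, 4): {1: 0.2, 2: 0.5, 4: 0.3}}          # assumed shares along the path 1-2-4
beta = idp(players, own, pairs, phi)
print(in_B(players, beta, own, pairs))
```

Any non-negative shares that sum to one along each path pass the check, while moving payment away from a player without compensation (for example, lowering one component of `beta` by a large amount and raising another) breaks the coalition constraints.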

The Shapley Value
Consider also the problem of constructing the Shapley value (see [20]) in differential games on networks. It turns out that, with the characteristic function (7), constructing the Shapley value in the considered class of games does not require computing the characteristic function for every $S \subset N$. Proposition 5 demonstrates this result.

Proposition 5.
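If the characteristic function in the cooperative differential game on network (1) and (2) is defined by rule (7) and the network $(N, L)$ has no cycles, then the Shapley value in this game has the following form:
$$Sh_i(x_0, T - t_0) = \int_{t_0}^{T} \Big[ h_i^i(\tau) + \sum_{m=1}^{n-1} \frac{\delta^{m-1}}{m+1} \sum_{\{k,l\} \in D^m(i)} \big( h_k^l(\tau) + h_l^k(\tau) \big) \Big] d\tau, \quad i \in N,$$
where $h_i^j(\tau) = h_i^j(\bar{x}_i(\tau); \bar{x}_j(\tau); \bar{u}_i(\tau))$.
Proof. The formula for the Shapley value [20] is as follows:
$$Sh_i(x_0, T - t_0) = \sum_{S \subset N,\ i \in S} \frac{(s-1)!\,(n-s)!}{n!} \big[ V(S; x_0, T - t_0) - V(S \setminus i; x_0, T - t_0) \big], \quad s = |S|.$$
Consider the difference $V(S; x_0, T - t_0) - V(S \setminus i; x_0, T - t_0)$. This is the value that the coalition $S$ loses when player $i$ leaves it. This value consists of the following items:
$\int_{t_0}^{T} h_i^i(\bar{x}_i(\tau); \bar{x}_i(\tau); \bar{u}_i(\tau))\, d\tau$ is the payoff that player $i$ can obtain by himself.
$\int_{t_0}^{T} \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_S^m(i)} h_i^j(\bar{x}_i(\tau); \bar{x}_j(\tau); \bar{u}_i(\tau))\, d\tau$ is the payoff that player $i$ receives from interaction with players from $S \setminus i$.
$\int_{t_0}^{T} \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_S^m(i)} h_j^i(\bar{x}_j(\tau); \bar{x}_i(\tau); \bar{u}_j(\tau))\, d\tau$ is the payoff that players from $S \setminus i$ receive from interaction with player $i$.
$\int_{t_0}^{T} \sum_{m=1}^{n-1} \delta^{m-1} \sum_{\{p,q\} \in P_S^m(i)} \big[ h_q^p(\bar{x}_q(\tau); \bar{x}_p(\tau); \bar{u}_q(\tau)) + h_p^q(\bar{x}_p(\tau); \bar{x}_q(\tau); \bar{u}_p(\tau)) \big]\, d\tau$ are the payoffs that players $p$ and $q$ receive from interaction with each other, where $p$ and $q$ are vertices from $S \setminus i$ such that there exists a path of length $m$ between them in $S$ and player $i$ lies on this path (so there is no path between $p$ and $q$ in $S \setminus i$ once player $i$ leaves the set $S$).
Rewrite the values from the last three items as one summand using the definition of the sets $D_S^m(i)$:
$$V(S; x_0, T - t_0) - V(S \setminus i; x_0, T - t_0) = \int_{t_0}^{T} \Big[ h_i^i(\tau) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{\{k,l\} \in D_S^m(i)} \big( h_k^l(\tau) + h_l^k(\tau) \big) \Big] d\tau.$$
Consider two fixed players $k$ and $l$ with the length of the path between them equal to $m$. These two vertices (players), together with all vertices (players) of the path between them, are included in $C_{n-m-1}^{s-m-1}$ coalitions of size $s$. Then,
$$Sh_i(x_0, T - t_0) = \int_{t_0}^{T} \Big[ h_i^i(\tau) \sum_{s=1}^{n} \frac{(s-1)!\,(n-s)!}{n!} C_{n-1}^{s-1} + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{\{k,l\} \in D^m(i)} \big( h_k^l(\tau) + h_l^k(\tau) \big) \sum_{s=m+1}^{n} \frac{(s-1)!\,(n-s)!}{n!} C_{n-m-1}^{s-m-1} \Big] d\tau,$$
and, since $\sum_{s=m+1}^{n} \frac{(s-1)!\,(n-s)!}{n!} C_{n-m-1}^{s-m-1} = \frac{1}{m+1}$ for $m = 0, 1, \ldots, n - 1$, the stated formula follows. This concludes the proof.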
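The counting step of the proof reduces to a combinatorial identity: the Shapley weights of all coalitions of size s that contain a fixed path with m + 1 vertices add up to 1/(m + 1). The following sketch checks this identity in exact rational arithmetic for small n; the function name `path_weight` is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def path_weight(n, m):
    """Sum over coalition sizes s of the Shapley weight (s-1)!(n-s)!/n! times the
    number of size-s coalitions that contain a fixed path with m + 1 vertices."""
    return sum(
        Fraction(factorial(s - 1) * factorial(n - s), factorial(n)) * comb(n - m - 1, s - m - 1)
        for s in range(m + 1, n + 1)
    )

# m = 0 corresponds to the player's own gain, which is counted with total weight 1
for n in range(2, 8):
    for m in range(0, n):
        assert path_weight(n, m) == Fraction(1, m + 1)
print("identity holds for n = 2..7")
```

The value 1/(m + 1) is the probability that a given vertex is the last of the m + 1 path vertices in a uniformly random ordering of the players, which is why each path vertex receives an equal share.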

Remark 2.
$Sh(x_0, T - t_0)$ belongs to the strong time-consistent subcore. This can be verified by choosing an IDP for it, constructed according to (23) with all coefficients $\varphi_{k,l}^{i}$ equal to $\frac{1}{m+1}$.
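The equal-share form of the Shapley value can be compared with a direct enumeration over coalitions. The sketch below assumes a connected tree and gains that are constant in time, so that integration over the interval reduces to multiplication by its length (called `horizon` here); all function names and numerical values are hypothetical, and the code is an illustration rather than the computation used in Section 6.

```python
from collections import deque
from itertools import combinations
from math import factorial

def tree_pairs(players, links, gain, delta):
    """List of (path vertices, discounted joint gain) for every pair of players."""
    adjacency = {v: [] for v in players}
    for a, b in links:
        adjacency[a].append(b)
        adjacency[b].append(a)
    pairs = []
    for k, l in combinations(players, 2):
        parent = {k: None}
        queue = deque([k])
        while queue:                      # BFS from k records parent pointers
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in parent:
                    parent[w] = v
                    queue.append(w)
        path = [l]
        while path[-1] != k:              # walk back from l to k
            path.append(parent[path[-1]])
        m = len(path) - 1
        pairs.append((frozenset(path), delta ** (m - 1) * (gain(k, l) + gain(l, k))))
    return pairs

def coalition_value(S, own, pairs, horizon):
    """V(S) for constant gains: a pair counts only if its whole path lies in S."""
    S = frozenset(S)
    return horizon * (sum(own[i] for i in S) + sum(g for path, g in pairs if path <= S))

def shapley_brute_force(players, value):
    """Classical Shapley formula: weighted marginal contributions over all coalitions."""
    n = len(players)
    result = {}
    for i in players:
        others = [p for p in players if p != i]
        result[i] = sum(
            factorial(r) * factorial(n - r - 1) / factorial(n)
            * (value(set(c) | {i}) - value(c))
            for r in range(n) for c in combinations(others, r))
    return result

def shapley_closed_form(players, own, pairs, horizon):
    """Each pair's gain is split equally among the m + 1 vertices of its path."""
    return {i: horizon * (own[i] + sum(g / len(path) for path, g in pairs if i in path))
            for i in players}

players = [1, 2, 3, 4]
links = [(1, 2), (2, 4), (1, 3)]                      # hypothetical tree
own = {1: 2.0, 2: 1.0, 3: 0.5, 4: 0.0}
pairs = tree_pairs(players, links, lambda i, j: 1.0 + 0.1 * i, 0.5)
brute = shapley_brute_force(players, lambda S: coalition_value(S, own, pairs, 3.0))
closed = shapley_closed_form(players, own, pairs, 3.0)
print(all(abs(brute[i] - closed[i]) < 1e-9 for i in players))
```

The enumeration evaluates the characteristic function on all coalitions, whereas the closed form only needs the own gains and the pairwise gains, which is the computational advantage noted before Proposition 5.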

One Cooperative Differential Network Marketing Game
As an illustrative example, we consider a simple model of network marketing. There are $n$ distributors (players) selling the same product or brand. Let $x_i$ denote the sales rate of player $i$. The sales rate of player $i$ evolves according to an accumulation equation in which $u_i(t) \ge 0$ is the promotional effort of distributor $i$ and $x_i^0 \ge 0$ is the initial sales rate. We assume that each player's payoff depends on his own sales and those of other distributors. First, we consider the general case where a player's payoff depends on all participants' sales, taking into account the distance between players on the network. Then, we apply the obtained solution to a multi-level marketing model on a directed network. The directions of arcs are supposed to indicate the hierarchy between players. In this case, distributors can only benefit from down-line distributors.
If each agent extracts linear utility from his own sales and the sales of other participants and the cost function has a quadratic form, then the objective functions are given by
$$H_i(x_0, T - t_0; u_1, \ldots, u_n) = \int_{t_0}^{T} \Big[ a_i x_i(\tau) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K^m(i)} b_{i(j)} x_j(\tau) - c_i \big( u_i(\tau) \big)^2 \Big] d\tau, \quad i \in N.$$
Here, $a_i x_i(\tau)$ is the instantaneous gain that player $i$ can obtain by himself, $\delta^{m-1} b_{i(j)} x_j(\tau)$ is the instantaneous gain that player $i$ obtains through interaction with player $j \in K^m(i)$, and $c_i (u_i(\tau))^2$ is the cost function of agent $i$. The parameter $b_{i(j)}$ indicates the percentage of sales of player $j$ that player $i$ receives.
Suppose that the game is played in a cooperative scenario and the players cooperate in order to achieve the maximum total payoff, that is, they maximize the sum of the objective functions of all players $i \in N$. To define the cooperative strategies, we use the Bellman dynamic programming technique. We denote by $V(N, x, T - t)$ the Bellman function in the subgame starting at the moment $t$ from $x(t)$. This function satisfies the Hamilton-Jacobi-Bellman (HJB) equation (34), whose solution is sought in a form determined by functions $A_i(t)$, $i \in N$, and $B(t)$. The maximization problem in (34) yields a strategy for player $i$; substituting it into (34) gives the system of differential equations (37) for $A_i(t)$ and $B(t)$. Solving (37), we obtain the optimal cooperative strategies, the optimal cooperative trajectory, and the value function for the cooperative joint payoff of all $n$ players. According to (7), the characteristic function becomes
$$V(S; x_0, T - t_0) = \sum_{i \in S} \int_{t_0}^{T} \Big[ a_i \bar{x}_i(\tau) + \sum_{m=1}^{n-1} \delta^{m-1} \sum_{j \in K_S^m(i)} b_{i(j)} \bar{x}_j(\tau) - c_i \big( \bar{u}_i(\tau) \big)^2 \Big] d\tau.$$
Consider now a numerical example for the case of a multi-level marketing game. This model assumes that the graph is directed. The direction of an edge from player $i$ to player $j$ means that player $i$ has recruited player $j$. Figure 1 shows the structure of the network in the game. It is assumed that distributors can only benefit from down-line distributors. In the framework of the considered example of network marketing, this means that $b_{2(1)}$ and every other coefficient $b_{i(j)}$ for which $j$ is not a down-line distributor of $i$ are equal to zero.
Thus, to obtain the Shapley value, it is not necessary to calculate all the values of the characteristic function; it is enough to find the quantities entering the formula of Proposition 5. To construct an IDP from $B_\varphi(t)$, we use a set of coefficients that, for each pair of vertices, are non-negative and sum to one over the vertices of the path between them; the IDP is then given by formula (23). Choosing all coefficients $\varphi_{k,l}^{i}$ equal to $\frac{1}{m+1}$, in particular $\varphi_{1,4}^{1} = \varphi_{1,4}^{2} = \varphi_{1,4}^{4} = \frac{1}{3}$, we obtain the IDP for the Shapley value.
Finally, we show the graphical representation of the obtained solutions. Expressing $\xi_4 = V(N, x_0, T - t_0) - \xi_1 - \xi_2 - \xi_3$ and passing to the system of inequalities (15) with three variables to find the core, we plot the resulting area in Figure 2.
The Shapley value (red color) and imputations from the strong time-consistent subcore, calculated according to (23) (blue color), are also indicated in Figure 2. The example considered in Section 6 illustrates a possible application of the proposed model and shows how to distribute the joint payoff between the players connected by the network. This example also shows the possibility of applying the obtained solutions to the case of a directed graph.

Conclusions
A class of cooperative differential games on networks is investigated. It is assumed that each player's payoff depends not only on his neighbors' actions but also on the actions of players connected to him by paths in the network. A novel form of characteristic function for such games is proposed. It is shown that the convexity of this characteristic function does not always hold and depends essentially on the structure of the network on which the game is defined. It is proved that, if the network has no cycles, then the characteristic function is convex and the strong time-consistent subcore is not empty. A formula for explicit calculation of the Shapley value is derived. An algorithm for the construction of an IDP corresponding to imputations from the strong time-consistent subcore is given. An illustrative example demonstrating these results is considered.
Further research can be done on cooperative differential games on networks. It will be interesting to apply the approach based on cooperative-trajectory characteristic functions to differential games defined on general networks containing cycles. It seems that, in this case, the corresponding characteristic function will not be convex in general, but the existence of the core could be proved, and the dynamic Shapley value could be calculated in explicit form. It would also be interesting to find the values of δ for which the characteristic function turns out to be convex even in the presence of cycles in the network.