Hierarchical Structures and Leadership Design in Mean-Field-Type Games with Polynomial Cost

Abstract: This article presents a class of hierarchical mean-field-type games with multiple layers and non-quadratic polynomial costs. The decision-makers act in sequential order with informational differences. We first examine the single-layer case, where each decision-maker has no information about the control strategies of the others. We derive the Nash mean-field-type equilibrium and cost in a linear state-and-mean-field feedback form by using a partial integro-differential system. Then, we examine the Stackelberg two-layer problem with multiple leaders and multiple followers. Numerical illustrations show that, in the symmetric case, having only one leader is not necessarily optimal for the total sum cost. Having too many leaders may also be suboptimal for the total sum cost. The methodology is extended to multi-level hierarchical systems. It is shown that the order of play has a key role in the total performance of the system. We also identify a specific range of parameters for which the Nash equilibrium coincides with the hierarchical solution independently of the number of layers and the order of play. In the heterogeneous case, it is shown that the total cost is significantly affected by the design of the hierarchical structure of the problem.


Introduction
The idea of hierarchy dates back at least to 1934, when Stackelberg [1] introduced a game that models markets where some firms have a stronger influence on others. Stackelberg games consist of two players, a leader and a follower. The leader who moves first decides an optimal strategy after anticipating the best response of the follower. Then, the follower eventually chooses the anticipated best response to optimize their cost or payoff. Therefore, this game is a game with two-level hierarchy. A dynamic Linear-Quadratic (LQ) Stackelberg differential game was studied by Samaan and Cruz [2]. A stochastic LQ Stackelberg differential game was investigated by Bagchi and Başar [3]. Bensoussan et al. [4] derived a maximum principle for the leader's Stackelberg solution under the adapted closed-loop memoryless information structure.
With more than two players, the Stackelberg game is called a hierarchical game, and it becomes more interesting and involved due to its multi-layer structure, including various forms of information. The players act in sequential order, such that each one of them is a follower of the previous player and a leader for the next player in the hierarchy. For hierarchical mean-field-free differential games, see, for example, [5][6][7][8][9].

• A single decision-maker can have a strong impact on the mean-field terms;
• The expected payoffs are not necessarily linear with respect to the state distribution;
• The number of decision-makers is not necessarily infinite.
Games with non-linear distribution-dependent quantities of interest are very attractive in terms of applications, since the non-linear dependence of the payoff functions on the state distribution allows us to capture risk measures, which are functionals of variance, inverse quantile, and/or higher moments. In portfolio optimization, for instance, payoff functions may include the third and the fourth moments, known as the skewness and kurtosis, respectively (e.g., [32,33]). Generally, equilibrium solutions to mean-field-type games are presented as either open-loop or closed-loop solutions. Open-loop solutions are controls that do not explicitly depend on the state process at time $t$; they are adapted processes that depend only on time and the initial data. The stochastic maximum principle can be used as a methodology for finding such optimal control strategies. Closed-loop solutions (i.e., feedback solutions) are deterministic functions that depend on the state of the process at time $t$, as well as on its marginal distribution. The dual adjoint functions obtained from the Hamilton-Jacobi-Bellman (HJB) equations can be used for finding feedback optimal controls. We will use this approach throughout this paper. For linear-quadratic stochastic differential games, Sun and Yong [34] established that the existence of open-loop optimal control strategies is equivalent to the solvability of the corresponding optimality system, which is a forward-backward Stochastic Differential Equation (SDE), and that the existence of closed-loop optimal strategies is equivalent to the existence of a regular solution to the corresponding Riccati equation.
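To make the higher-moment risk terms mentioned above concrete, the following minimal Python sketch computes empirical central moments, together with the skewness (third standardized moment) and kurtosis (fourth standardized moment) of a sample. The function names are illustrative assumptions and are not taken from the paper; the sample merely stands in for draws from the state distribution.

```python
def central_moment(samples, order):
    """Empirical central moment of the given order (population convention)."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** order for x in samples) / n


def skewness(samples):
    """Third central moment divided by the variance to the power 3/2."""
    return central_moment(samples, 3) / central_moment(samples, 2) ** 1.5


def kurtosis(samples):
    """Fourth central moment divided by the squared variance (not excess kurtosis)."""
    return central_moment(samples, 4) / central_moment(samples, 2) ** 2


# Hypothetical sample: symmetric around zero, so its skewness vanishes.
sample = [-1.0, 0.0, 1.0]
print(skewness(sample), kurtosis(sample))
```

Both quantities are non-linear functionals of the distribution, which is why a payoff containing them cannot be written as a plain expectation of a function of the state.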
Our contribution can be summarized as follows. This work examines a class of hierarchical mean-field-type games with multiple layers, leaders, and followers. Based on infinite-dimensional partial integro-differential equations (PIDEs) on the space of measures, we provide semi-explicit solutions in closed-loop form for a class of master systems with hierarchical structure and non-quadratic cost, which are not covered in earlier works. Recall that non-quadratic costs allow for the analysis of other classes of higher-order risk terms, such as kurtosis [32,33]. The novelty of this paper lies mainly in the analysis of the effect of hierarchy and leadership on the solutions.
The rest of this article is structured as follows. We present the model setup in Section 2. Section 3 investigates the Nash equilibrium (no leader). Section 4 presents the Stackelberg solution. The multi-layer case is presented in Section 5. Numerical examples are presented in Section 6. Finally, concluding remarks are drawn in Section 7.

The Setup
There are $I \geq 2$ decision-makers interacting within the time horizon $[t_0, t_1]$, $t_0 < t_1$. The set of decision-makers is denoted by $\mathcal{I} = \{1, 2, \ldots, I\}$. Decision-maker $i \in \mathcal{I}$ has a control action $u_i \in U_i = \mathbb{R}$.
The state $x$ is driven by a drift-jump-diffusion process of mean-field type, given by where $\mathcal{P}(\mathbb{R})$ denotes the set of probability measures on $\mathbb{R}$. We assume that $x(t_0)$, $B$ and $N$ are mutually independent. The performance functional of decision-maker $i$ is where $m(t, dy) = \mathbb{P}_{x(t)}(dy)$ is the probability measure of the state $x(t)$ at time $t$, and In addition, each decision-maker is assumed to have a computational capability, such as being able to compute an aggregative term of $m$ from the model. Let $\mathcal{U}_i$ be the set of control strategies of decision-maker $i$ that are progressively measurable with respect to the filtration generated by the unions of events in $\{B, N\}$.

Games with Polynomial Cost
We investigate the mean-field-type game with the following data: where $k_i \geq 1$ and $\bar{k}_i \geq 1$ are natural numbers, and the coefficients are time-dependent. The coefficient functions $q_i$, $r_i$, $\bar{q}_i$ and $\bar{r}_i$ are nonnegative functions, and

Hierarchical Leader Design and Algorithmic Approach
The hierarchical leadership design consists of finding the optimal number of hierarchical layers $h$ and the non-empty subsets of players $\mathcal{I}_1, \ldots, \mathcal{I}_h$, partitioning the set of all players as $\mathcal{I} = \mathcal{I}_1 \cup \cdots \cup \mathcal{I}_h$ with $\mathcal{I}_k \cap \mathcal{I}_l = \emptyset$ for $k \neq l$. Here, we take into consideration three main game scenarios, described as follows. First, the game has a unique layer, that is, a situation in which all the players select their strategies simultaneously. Second, the game is played in two layers (bi-level hierarchy). The players are grouped into two sets ($h = 2$): leaders, who decide first and simultaneously among themselves; and followers, who react to the decisions of the leaders. Third, the game is structured with as many layers as players (fully hierarchical configuration with $h = I$), that is, the players select their strategies in sequence, one by one, in $I$ layers. For all configurations, let $L_i^*$ denote the optimal cost of player $i \in \mathcal{I}$ in the hierarchical mean-field-type game problem, and let $S(h, \mathcal{I}_1, \ldots, \mathcal{I}_h) = \sum_{i \in \mathcal{I}} L_i^*$ denote the total (social) cost at the hierarchical solution. The hierarchical leadership design consists in determining the optimal leaders, followers, and/or number of layers, such that the total cost is minimized.
Notice that, for both the bi-level and fully hierarchical cases, there are multiple combinations of the players. In the bi-level scenario, the set of all possible sets of leaders is given by the power set $2^{\mathcal{I}}$, and any set of leaders is denoted by $\mathcal{I}_L \in 2^{\mathcal{I}}$ (that is, $\mathcal{I}_L \subseteq \mathcal{I}$), with the corresponding set of followers $\mathcal{I}_F = \mathcal{I} \setminus \mathcal{I}_L$. Regarding the fully hierarchical game, there are as many possible strategic orderings as permutations of the set of players $\mathcal{I}$. All possible permutations of the players are considered.
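For the bi-level case, the optimal set of leaders is $\mathcal{I}_L^* \in \arg\min_{\mathcal{I}_1 \in 2^{\mathcal{I}}} S(2, \mathcal{I}_1, \mathcal{I}_2)$, where $\mathcal{I}_1$ is the set of leaders and $\mathcal{I}_2 = \mathcal{I} \setminus \mathcal{I}_1$ is the set of followers. On the other hand, for the fully hierarchical case, the optimal permutation is $(\mathcal{I}_1^*, \ldots, \mathcal{I}_I^*) \in \arg\min_{\mathcal{I}_1, \ldots, \mathcal{I}_I} S(I, \mathcal{I}_1, \ldots, \mathcal{I}_I)$.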
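The two searches above can be sketched as a brute-force enumeration. The snippet below is only an illustration under stated assumptions: `social_cost` is a hypothetical callable standing in for the total cost $S$ (which in the paper comes from solving the hierarchical game), and `toy_cost` is an invented placeholder, not a cost from the paper.

```python
from itertools import combinations, permutations


def bilevel_designs(players):
    """Yield every (leaders, followers) split of the player set.

    The empty and the full leader sets are included; both correspond to a
    single-layer game.
    """
    players = tuple(players)
    for size in range(len(players) + 1):
        for leaders in combinations(players, size):
            followers = tuple(p for p in players if p not in leaders)
            yield leaders, followers


def best_bilevel_design(players, social_cost):
    """Brute-force arg min of social_cost(leaders, followers) over the power set."""
    return min(bilevel_designs(players), key=lambda split: social_cost(*split))


def best_order(players, social_cost):
    """Brute-force arg min of social_cost(order) over all permutations."""
    return min(permutations(players), key=social_cost)


def toy_cost(leaders, followers):
    """Hypothetical placeholder cost that is smallest with exactly two leaders."""
    return abs(len(leaders) - 2)


print(best_bilevel_design(range(6), toy_cost))
```

The enumeration visits $2^I$ leader sets in the bi-level case and $I!$ orderings in the fully hierarchical case, so it is only practical for a small number of players.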
In this paper, we study the three aforementioned scenarios involving one, two, and $I$ layers, as presented in Figure 1. We also present the conditions under which all three configurations have the same solution, that is, when the Nash solution coincides with the hierarchical solutions at different layers. Furthermore, we present numerical examples considering different levels of hierarchy. The problem addressed in this paper can be interpreted as a mechanism design problem in which, instead of determining the appropriate cost functionals or utility functions to induce a desired output, we design the best hierarchical structure in order to reduce the overall social cost.
Remark 1 (Feasibility and Existence). The set of possible combinations for the layers/levels and players per level is non-empty and finite. Therefore, the optimal hierarchical leader design is feasible, and there exists an optimal solution (combination) such that the social cost is minimized.

Since the feasible set of possible combinations for the hierarchical configurations is non-empty and finite, it is possible to find the best hierarchical structure by means of Algorithm 1. The main results invoked in Algorithm 1, given by Propositions 1-3, are presented throughout the paper.
Algorithm 1 (final steps): $H(i)$ is a candidate optimal design; end; $i \leftarrow i + 1$; end. The optimal leadership design is $H^*$ with social cost $S^*$.
According to the procedure in Algorithm 1, one of the main concerns in the leadership design problem is the size of the feasible set of hierarchical structures (NP-hard problem). The total number of combinations is given by the number of ordered partitions of a set, which is computed by means of the ordered Bell number $B : \mathbb{N} \to \mathbb{N}$; that is, for $I$ players we have $B(I) = \sum_{k=0}^{I} \sum_{j=0}^{k} (-1)^{k-j} \binom{k}{j} j^{I}$. For instance, if $I = 2$, then there are $B(2) = 3$ possible leadership configurations, as shown in Figure 2; if $I = 3$, then there are $B(3) = 13$ possible leadership structures, presented in Figure 3; and $B(4) = 75$, $B(5) = 541$, and $B(6) = 4683$. Figure 4 illustrates the rapid growth of the number of combinations as the number of decision-makers increases. Notice that it is not possible to have more levels than players in the hierarchical game ($h \leq I$). The following sections are devoted to the presentation of semi-explicit solutions for hierarchical mean-field-type games with different numbers of levels, from one (Nash scenario) up to the number of players $I$ (fully hierarchical scenario).
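The counts quoted above can be checked with a short Python sketch. It uses the standard recurrence for ordered Bell numbers, $B(n) = \sum_{k=1}^{n} \binom{n}{k} B(n-k)$ with $B(0) = 1$, and also enumerates the hierarchies themselves. The function names are illustrative assumptions, not part of Algorithm 1.

```python
from itertools import combinations
from math import comb


def ordered_bell(n):
    """Number of ordered set partitions of n labelled players."""
    values = [1]  # B(0) = 1
    for m in range(1, n + 1):
        # choose the k players of the first layer, then order the remaining m - k
        values.append(sum(comb(m, k) * values[m - k] for k in range(1, m + 1)))
    return values[n]


def ordered_partitions(players):
    """Yield every hierarchy as a tuple of non-empty layers (first layer moves first)."""
    players = tuple(players)
    if not players:
        yield ()
        return
    for size in range(1, len(players) + 1):
        for first in combinations(players, size):
            rest = tuple(p for p in players if p not in first)
            for tail in ordered_partitions(rest):
                yield (first,) + tail


print([ordered_bell(n) for n in range(2, 7)])  # [3, 13, 75, 541, 4683]
```

Feeding each hierarchy produced by `ordered_partitions` to a cost evaluation gives an exhaustive search, which is feasible only for small $I$ given how quickly these counts grow.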

Nash Mean-Field-Type Equilibrium
The risk-neutral mean-field-type game is given by A risk-neutral Nash mean-field-type equilibrium is a solution of the following fixed-point problem: Let $\hat{V}_i(t, m)$ be the optimal cost-to-go from $m$ at time $t \in (t_0, t_1)$ given the strategies of the others, that is, $\hat{V}$ If $\int \tilde{m}(dx) = 0$, then adding a constant to $\hat{V}_{i,m}(t, x, m)$ does not change the value of the integral in (2). For any scalar $\lambda$ and $m \in \mathcal{P}(\mathbb{R})$ one has $\lambda = \int \lambda \, m(dx)$. Thus, $\lambda$ is also a Gâteaux derivative of the constant function $\lambda$. However, in our problem, the term $\hat{V}_{i,xm}$, which is the gradient of $x \mapsto \hat{V}_{i,m}(t, x, m)$, will be used in the Hamiltonian, and $\hat{V}_{i,xm}$ does not have the constant ambiguity. Let us denote the jump operator $J$ as

Let us introduce the integrand Hamiltonian as
A sufficient condition for a risk-neutral Nash equilibrium is given by the following PIDE system: We refer the reader to [35] for a derivation of this equilibrium system. The system (3) is an infinite-dimensional PIDE system in $m$ and it provides the Nash equilibrium values of the mean-field-type game. Notice that, from (3), the equilibrium strategies are best responses with respect to the integrand Hamiltonian and can be expressed as functions of $t$, $x$, $m$, $\hat{V}_{i,m}$, $\hat{V}_{i,xm}$, $\hat{V}_{i,xxm}$.
Next, we provide the Nash mean-field-type equilibrium semi-explicitly in linear state-and-mean-field feedback strategies. To do so, we use (3).

Proposition 1.
A risk-neutral Nash mean-field-type equilibrium is given in a semi-explicit way, as follows: whenever the above coefficient system admits a solution that does not blow up within $[t_0, t_1]$.
Proof. The proof is presented in Appendix A.
The following Remark discusses the existence and uniqueness of the η terms in Proposition 1.

Remark 2.
The uniqueness of the solution in $\eta$ of the coefficient system (4) requires a strong condition, that is,
• Let $I$ be an arbitrary integer and $k_i = k = 1$. The system in $\eta$ becomes linear and has a unique solution if, and only if, the determinant of the matrix $M$ is non-zero, with $M_{ii} = r_i$ and $M_{ij} = \epsilon_{ij}$, $i \neq j$. When the determinant is zero, the resulting control strategies become non-admissible and the costs become infinite.
• For $k_i = k = 2$ and $I = 2$, the system in $\eta$ becomes a binary cubic polynomial system, given by For $\epsilon_{12} = 0$, there is a unique solution, given by For $\epsilon_{12} \neq 0$, we derive from the first equation that By substituting it into the second equation, we arrive at The latter equation is a polynomial of odd degree 9. It has a unique real root in $\eta_1$ if its derivative has a constant sign. Its derivative is It has a constant sign if $\epsilon_{21}$ and $r_1 r_2 \epsilon_{12}$ have opposite signs. If $r_1$ and $r_2$ are positive, then the condition reduces to $\epsilon_{21} \epsilon_{12} \leq 0$.
• $I = 2$ and arbitrary $k_i \geq 1$. Thus, a sufficiency condition is that $\epsilon_{ji}$ and $(2k_i - 1)(2k_j - 1)$
• The same reasoning applies to the system in $\bar{\eta}$, which has a unique real solution if $\bar{\epsilon}_{ij} \bar{\epsilon}_{ji} \leq 0$.
• For $I \geq 3$ decision-makers and arbitrary $k_i \geq 1$, the system can be rewritten as a fixed-point equation which fulfils a contraction mapping condition if the norms of $r$ and $\epsilon$ are sufficiently small. In this case, there is a unique solution.
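The two-player cubic case of the remark can be illustrated numerically. The sketch below assumes a system of the form $r_1 \eta_1^3 + \epsilon_{12} \eta_2 = c_1$, $r_2 \eta_2^3 + \epsilon_{21} \eta_1 = c_2$, where the right-hand sides $c_1, c_2$ are hypothetical placeholders for the terms of system (4) that are not reproduced here, and $r_1, r_2 > 0$. Under the sign condition of the remark, the degree-9 polynomial obtained by substitution is monotone, so plain bisection finds its unique real root.

```python
import math


def signed_cbrt(x):
    """Real cube root that also handles negative arguments."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def solve_two_player_cubic(r1, r2, e12, e21, c1, c2, bracket=1e3, iters=200):
    """Solve r1*eta1**3 + e12*eta2 = c1 and r2*eta2**3 + e21*eta1 = c2.

    Assumed form of the coefficient system; c1 and c2 are placeholders.
    """
    if e12 == 0.0:
        # decoupled first equation: solve for eta1, then for eta2
        eta1 = signed_cbrt(c1 / r1)
        return eta1, signed_cbrt((c2 - e21 * eta1) / r2)
    if e21 * r1 * r2 * e12 > 0.0:
        raise ValueError("sign condition violated: several real roots are possible")

    def eta2_of(eta1):
        # first equation solved for eta2
        return (c1 - r1 * eta1 ** 3) / e12

    def g(eta1):
        # degree-9 polynomial obtained by substitution into the second equation
        return r2 * eta2_of(eta1) ** 3 + e21 * eta1 - c2

    lo, hi = -bracket, bracket
    if (g(lo) > 0.0) == (g(hi) > 0.0):
        raise ValueError("root not bracketed; enlarge the bracket")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (g(lo) > 0.0) == (g(mid) > 0.0):
            lo = mid
        else:
            hi = mid
    eta1 = 0.5 * (lo + hi)
    return eta1, eta2_of(eta1)


eta1, eta2 = solve_two_player_cubic(1.0, 1.0, 0.5, -0.5, 1.0, 1.0)
print(eta1, eta2)
```

Bisection is used here instead of a Newton iteration because monotonicity guarantees convergence from any bracket that contains the root.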
In the next section, we investigate the bi-level case with multiple leaders and multiple followers.

Multiple Leaders and Multiple Followers
We consider the description in (1) in a bi-level hierarchical game with two or more leaders, that is, $|\mathcal{I}_L| \geq 2$, and two or more followers, that is, $|\mathcal{I}_F| \geq 2$. We restrict our attention to admissible strategies that are Lipschitz in the state $x$. Given the strategies of the leaders $(u_i)_{i \in \mathcal{I}_L} \in \prod_{i \in \mathcal{I}_L} \mathcal{U}_i$, a risk-neutral best-response strategy of follower $j$ is a strategy that solves $\inf_{u_j \in \mathcal{U}_j} \mathbb{E} L_j$. The set of risk-neutral best responses of $j$ is denoted by $\mathrm{rnBR}_j((u_i)_{i \in \mathcal{I}_L}, (u_{j'})_{j' \in \mathcal{I}_F \setminus \{j\}})$.
A mean-field-type risk-neutral Nash equilibrium among the followers given the first movers' . The followers solve the following Nash game given the strategy of the leaders $(u_i)_{i \in \mathcal{I}_L}$, that is, Then, the leaders solve the following PIDE system: A minimizer of the integrand Hamiltonian $H^r_i$, denoted by provides a candidate Stackelberg strategy of the leader $i$. A mean-field-type risk-neutral Stackelberg solution between multiple leaders and multiple followers is a strategy $((u^{ss}_i)_{i \in \mathcal{I}_L}, (u^{ss}_j)_{j \in \mathcal{I}_F})$ of all decision-makers, such that and for every follower $j \in \mathcal{I}_F$, $u^{ss}_j \in \mathrm{rnBR}_j((u^{ss}_i)_{i \in \mathcal{I}_L}; (u^{ss}_{j'})_{j' \in \mathcal{I}_F \setminus \{j\}})$.
The next result presents the Stackelberg mean-field-type solution involving several leaders and followers in a semi-explicit manner.

Proposition 2.
The risk-neutral Stackelberg mean-field-type solution with multiple leaders and multiple followers is given in a semi-explicit way, as follows: j ∈ I F : whenever the above coefficient system admits a unique solution.
Proof. The proof is presented in Appendix A.

Remark 3.
Clearly, the mean-field-type Nash equilibrium in (4) differs from the Stackelberg solution in (7) when the control-coupling coefficients $\epsilon_{ij}$ are non-zero.
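The remark can be illustrated on a much simpler object than the mean-field-type game. The sketch below is a static two-player toy analogue, not the paper's PIDE system: player $i$ is assumed to minimize $\tfrac{1}{2} r_i u_i^2 + \epsilon_{ij} u_i u_j - c_i u_i$, where the linear terms $c_i$ are hypothetical. It compares the simultaneous (Nash) solution with the solution in which player 1 leads.

```python
def nash(r1, r2, e12, e21, c1, c2):
    """Simultaneous play: solve r1*u1 + e12*u2 = c1 and e21*u1 + r2*u2 = c2."""
    det = r1 * r2 - e12 * e21
    if det == 0.0:
        raise ValueError("singular first-order system")
    return (c1 * r2 - e12 * c2) / det, (r1 * c2 - e21 * c1) / det


def stackelberg(r1, r2, e12, e21, c1, c2):
    """Player 1 leads and anticipates the follower reaction u2 = (c2 - e21*u1) / r2."""
    curvature = r1 - 2.0 * e12 * e21 / r2  # second derivative of the leader's reduced cost
    if curvature <= 0.0:
        raise ValueError("leader problem is not strictly convex")
    u1 = (c1 - e12 * c2 / r2) / curvature
    u2 = (c2 - e21 * u1) / r2
    return u1, u2


print(nash(2.0, 2.0, 0.3, 0.3, 1.0, 1.0))
print(stackelberg(2.0, 2.0, 0.3, 0.3, 1.0, 1.0))
print(nash(2.0, 2.0, 0.0, 0.0, 1.0, 1.0), stackelberg(2.0, 2.0, 0.0, 0.0, 1.0, 1.0))
```

In this toy setting the leader's first-order condition contains the extra term $-2 \epsilon_{12} \epsilon_{21} u_1 / r_2$, which disappears when the coupling is zero, so the two solutions then coincide.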

No Control-Coupling within Classes
It follows from (7) that, for $\epsilon_{jj'} = 0 = \bar{\epsilon}_{jj'}$ with $(j, j') \in \mathcal{I}_F^2$, the term $\eta_j$ is explicitly given by

No Leader and All Followers
In this case, there is no leader. All decision-makers are followers. This case is similar to the model proposed in the Nash game above. The solution is given by (4).

One Leader and Multiple Followers
There is a unique leader in $\mathcal{I}_L$, and the remaining decision-makers in $\mathcal{I}_F$ are followers, with $\mathcal{I} = \mathcal{I}_L \cup \mathcal{I}_F$. We assume that the leader (decision-maker $1 \in \mathcal{I}_L$) uses a state-and-mean-field-type feedback strategy $u_1(t, x, m)$ and each of the followers (decision-maker $j \in \mathcal{I}_F$) finds a state-and-mean-field-type feedback strategy $u_j(t, x, m, u_1)$ given $u_1$. The followers solve a Nash game given the strategy $u_1$ of the leader.

Multiple Leaders and One Follower
Since there is only one follower, the reaction set of the follower will be computed given the strategies of the leaders.

All Leaders and No Follower
In this case, there is no follower. All decision-makers are leaders. In terms of the information structure, this case is similar to the model proposed in the Nash game above. The solution is given by (4).

Fully Hierarchical Game
In the previous sections, we considered only bi-level game problems. In this section, we consider as many levels as decision-makers, so there are $|\mathcal{I}|$ hierarchical levels. At each layer $i$, decision-maker $i$ chooses a control strategy $u_i$ knowing the control strategies of the preceding decision-makers, that is, $\{i - 1, \ldots, 1\}$. This becomes a sequential decision-making problem. We use a backward induction method to solve the hierarchical game problem. This means that the decision-making problem at the last layer $I$, which is the reaction of decision-maker $I$, can be seen as a mean-field-type control problem. This is because at the $i$-th level, the strategies $(u_{i'})_{i' \in \{1, \ldots, i-1\}}$ are already known by decision-maker $i$.
Proposition 3 next presents the multi-level hierarchical-structure solution in the context of mean-field-type games in a semi-explicit manner.

Proposition 3.
The risk-neutral $I$-level hierarchical mean-field-type solution is given in a semi-explicit way, as follows: where the coefficient functions are given by Level 1 : Level $i$ : Level $I$ : whenever these equations admit a solution.
Proof. This proof is presented in Appendix A.
From the analysis above, the following remarks are in order:

• For $\epsilon_{ij} \neq 0$, $\bar{\epsilon}_{ij} \neq 0$, the order of play matters because of the informational difference between the decision-makers at different levels of hierarchy in (8). One open question that we leave for future investigation is how to determine the optimal ordering among all permutations of heterogeneous decision-makers.
• When all the $\epsilon_{ij}$ and $\bar{\epsilon}_{ij}$ are zero, the Nash equilibrium coincides with the bi-level solution, which coincides with the hierarchical solution for any number of levels. The order of play and the informational difference do not generate an extra advantage for the first mover in this particular case. Consequently, the hierarchical leader design is only performed when the parameters satisfy $\epsilon_{ij} \neq 0$, $\bar{\epsilon}_{ij} \neq 0$.
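Both remarks can be explored on a static toy analogue solved by backward induction. The sketch below is not the paper's mean-field-type system: player $i$ is assumed to minimize $\tfrac{1}{2} r_i u_i^2 + u_i \sum_{j \neq i} \epsilon_{ij} u_j - c_i u_i$ with hypothetical linear terms $c_i$, one player moves per layer, and each player observes the actions of those earlier in the order. All names and numerical values are illustrative.

```python
from itertools import permutations


def sequential_solution(order, r, eps, c):
    """Backward induction for the toy game; returns {player: action}."""
    n = len(order)
    const = [0.0] * n                       # constant part of each reaction
    slope = [[0.0] * n for _ in range(n)]   # slope[s][q]: sensitivity of position s to position q
    for p in range(n - 1, -1, -1):
        i = order[p]
        later = range(p + 1, n)
        # reactions of later positions are affine in the actions at positions 0..p
        denom = r[i] + 2.0 * sum(eps[i][order[s]] * slope[s][p] for s in later)
        if denom <= 0.0:
            raise ValueError("problem at this layer is not strictly convex")
        const[p] = (c[i] - sum(eps[i][order[s]] * const[s] for s in later)) / denom
        for q in range(p):
            coupling = eps[i][order[q]] + sum(eps[i][order[s]] * slope[s][q] for s in later)
            slope[p][q] = -coupling / denom
        # substitute the new reaction so later positions depend on positions < p only
        for s in later:
            const[s] += slope[s][p] * const[p]
            for q in range(p):
                slope[s][q] += slope[s][p] * slope[p][q]
            slope[s][p] = 0.0
    return {order[p]: const[p] for p in range(n)}


def social_cost(actions, r, eps, c):
    """Sum of the individual toy costs at the given actions."""
    total = 0.0
    for i, u_i in actions.items():
        cross = sum(eps[i][j] * u_j for j, u_j in actions.items() if j != i)
        total += 0.5 * r[i] * u_i ** 2 + u_i * cross - c[i] * u_i
    return total


# Hypothetical heterogeneous three-player data.
r = [1.0, 2.0, 3.0]
c = [1.0, 1.0, 1.0]
eps = [[0.0, 0.2, 0.2], [0.2, 0.0, 0.2], [0.2, 0.2, 0.0]]
for order in permutations(range(3)):
    print(order, social_cost(sequential_solution(order, r, eps, c), r, eps, c))
```

Setting every entry of `eps` to zero makes each reaction independent of the earlier actions, so every order returns $u_i = c_i / r_i$ in this toy setting, mirroring the second remark.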

Numerical Investigation
In this section, we perform some numerical examples in order to analyze two main scenarios. We study the effect of the number of leaders on the total cost for both homogeneous and heterogeneous scenarios, and we investigate the effect of the hierarchical structure considering a heterogeneous scenario.

Effect of the Number of Leaders on the Total Cost
We investigate the effect of the number of leaders on the total performance of the system. The total cost at the Stackelberg solution is For $m_0 = \delta_{x_0}$ and $\bar{k}_i = \bar{k} \geq 1$, the total cost is .

Uniform Coupling and Homogeneous Players
When all parameters other than the role are identical across the players, $S(\mathcal{I}_L, m_0)$ can be expressed as a function of $|\mathcal{I}_L|$. It follows from (7) that The optimal number of leaders is given by where $\bar{\alpha}$ depends on $\chi$ as well. We observe that the latter function is not necessarily monotone in $\chi = |\mathcal{I}_L|$. This means that increasing the number of leaders in the interaction does not necessarily improve the total performance of the system. We numerically investigate $S(|\mathcal{I}_L|, \delta_{x_0})$ as a function of $\chi = |\mathcal{I}_L|$ for $|\mathcal{I}| = 6$. Let us consider a symmetric six-player game problem involving the parameters presented here: $\bar{b}_{2i} = \bar{b}_2 = 0.5$, $\bar{r}_i = \bar{r} = 2$, $\bar{q}_i = \bar{q} = 1$, $\bar{q}_{iT} = \bar{q}_T = 2$, for all $i \in \mathcal{I}$. (10)
Figure 5 and Table 1 show that, under the considered parameters, the lowest total cost is obtained when $|\mathcal{I}_L| = 2$, corresponding to a cost $S(|\mathcal{I}_L|, \delta_{x_0}) = 7.911$. These results offer an insight into the game's structural design for the sake of either individual or total costs. We observe that having only one leader is suboptimal for the total cost. Having too many leaders (where the majority of the decision-makers are leaders) is also suboptimal for the total cost. In this setting, there is a tradeoff between leaders and followers, so that the system's cost gets balanced.
Figure 5. Evolution of the differential equations $\bar{\alpha}_{\text{leader/follower}}$, and the corresponding initial values for the different numbers of leaders in the homogeneous scenario.

Uniform Coupling and Heterogeneous Players
Now we investigate the two-layer case with uniform coupling, that is, $\bar{\epsilon}_{ij} = 0.1$ for all combinations $i, j \in \mathcal{I}$, in the heterogeneous case with $|\mathcal{I}| = 3$. We consider the following parameters: Figure 6 shows the evolution of $\bar{\alpha}_1$, $\bar{\alpha}_2$, and $\bar{\alpha}_3$ for the different topologies presented in Table 2. It can be seen in Figure 7 that all the structures return a close value for the total cost. However, Table 2 shows that the best topology is the last one, where the third player acts as the unique leader, assuming an initial condition such that (10) holds.
Figure 6. Evolution of the differential equations $\bar{\alpha}_{\text{leader/follower}}$, and the corresponding initial values for the different numbers of leaders in the heterogeneous scenario.
Figure 7. Evolution of the sum of differential equations and the corresponding total cost for the heterogeneous scenario.

Impact of the Hierarchical Structures
Here, we analyze the impact of the order of the strategic selection, that is, the hierarchical order, in the heterogeneous case with $|\mathcal{I}| = 3$. We consider the following heterogeneous parameters: Table 3 shows the summary of the total costs for the six different possible hierarchical orders, assuming an initial condition such that (10) holds. It can be seen that the third configuration is the best to minimize the total cost. Moreover, Figure 8 presents the evolution of $\sum_{j \in \mathcal{I}} \bar{\alpha}_j(t)$ for all the possible structures.

Conclusions
In this paper, we have examined multi-layer hierarchical mean-field-type games with non-quadratic polynomial costs. We derived hierarchical mean-field-type solutions in linear state-and-mean-field feedback form by using a partial integro-differential system, and also established the relationship between the Nash and the hierarchical solutions. Furthermore, we studied the impact of the number of leaders on a bi-level Stackelberg problem for both symmetric and non-symmetric scenarios. In addition, we have shown that the number of layers, the permutations of the decision-makers per layer, and their identity significantly affect the total cost of the system. We have also shown numerically that the ordering among all permutations of heterogeneous decision-makers may reduce the cost by a significant proportion, depending on the horizon. One open question that we leave for future investigation is to find, theoretically, the optimal ordering among all permutations of heterogeneous decision-makers, and to examine the benefits/costs of structure design and leadership.

Conflicts of Interest:
There is no conflict of interest.

Appendix A
Proof of Proposition 1. Under the assumption of perfect state observation and perfect knowledge of the model, a sufficient condition for equilibrium is given by the PIDE system (3). We aim to solve (3). To do so, we start with the following guess functional of decision-maker $i$: where the coefficient functions $\alpha_i$ and $\bar{\alpha}_i$ need to be determined. Notice that, for $k_i = 1$, the functional $\hat{V}_i(t, m)$ becomes a mean-variance-dependent functional, and for an arbitrary parameter $k_i$, the functional may support higher-order moments. We compute the key terms $\hat{V}_{i,m}(t, m)$, $\hat{V}_{i,xm}(t, m)$, with $\int \tilde{m}(dy) = 0$. The integrand Hamiltonian is strictly convex in $(u_i - \bar{u}_i, \bar{u}_i)$. The optimal control strategy is the unique minimizer of By strict convexity and by orthogonality between $(u_i - \bar{u}_i)$ and $\bar{u}_i$, the following condition system holds: By solving the previously mentioned conditions, one obtains the optimal control input in closed-loop form. The linear state-and-mean-field-type feedback strategy $u_i = -\eta_i (x - \int y \, m(dy)) - \bar{\eta}_i \int y \, m(dy)$, $i \in \mathcal{I}$, solves the system if the coefficients satisfy $i \in \mathcal{I}$, The integrand Hamiltonian of $i$ becomes By identification, the coefficients $\alpha_i$ solve the following ordinary differential equation: The aggregate mean-field term $\int y \, m(t, dy)$ can be derived in a semi-explicit way by taking the expected value of the state dynamics. It follows that This completes the proof.
Proof of Proposition 2. For the data in (1), the integrand Hamiltonian $H^r_j$ has a unique minimizer, denoted by which provides the reaction strategies of the follower decision-makers. Following (1) with leaders in $\mathcal{I}_L$ and followers in $\mathcal{I}_F$, the first-order optimality condition yields and which provides $\{\eta_j, \bar{\eta}_j\}_{j \in \mathcal{I}_F}$ as functions of $\{\eta_i, \bar{\eta}_i\}_{i \in \mathcal{I}_L}$ and $\alpha$, $\bar{\alpha}$. Following (1) with leaders in $\mathcal{I}_L$ and followers in $\mathcal{I}_F$, the leaders' integrand Hamiltonian can be rewritten as follows In view of (A7), The optimal Stackelberg strategies of the leaders satisfy the following system: whose solution provides the coefficients $(\eta^{ss}_i, \bar{\eta}^{ss}_i)_{i \in \mathcal{I}_L}$.

Appendix A.1. I-th Hierarchical Level
Proof of Proposition 3. We use a backward induction procedure to prove the statement. When decision-maker $I$ optimizes, the preceding decision-makers have already chosen their strategies, which are known to $I$. Hence, the integrand Hamiltonian of $I$ is It follows from the strictly convex optimization above that the best-response strategy can be expressed as: I,i (x −x) 2(k I −1) (u i −ū i ) + b 2,IVI,xm + c I (x −x) 2k I −1 , I,i x 2(k I −1)ū i +b 2,IVI,xm +c Ix 2k I −1 .
In particular, $i \leq I - 1$ : At the hierarchical level $I - 1$, the preceding levels are $\{1, 2, \ldots, I - 2\}$ and the succeeding level is $I$. Having the expression of the optimal control strategies of the last layer $I$, we can move to the preceding layer, that is, $I - 1$. Decision-maker $I - 1$ knows $u_1, \ldots, u_{I-2}$ and the reaction $u^*_I$ of decision-maker $I$. Therefore, the integrand Hamiltonian of $I - 1$ is given by By identification from the first-order optimality condition, the coefficient functions $\eta_i$, $\bar{\eta}_i$ satisfy the following equations