Spatially Distributed Differential Game Theoretic Model of Fisheries

We consider a differential game of fisheries in a fan-like control structure of the type “supervisor–several agents”. The dynamics of the controlled system are described by a non-linear differential equation model identified on Azov Sea data. The model is averaged over two spatial coordinates. Different information structures of the game are generated by the control methods of compulsion (the supervisor restricts the feasible strategies of the agents) and impulsion (she influences their payoff functionals). Both Stackelberg and inverse Stackelberg games are considered. For the numerical investigation we use a discretization of the initial model and the method of qualitatively representative scenarios in simulation modeling.


Introduction and Related Work
The models of optimal exploitation of water biological resources have been investigated since the middle of the last century [1,2], particularly in the frame of sustainability science [3,4] and viability theory [5].
A comprehensive review of game-theoretic applications to fisheries is presented in [6]. From the societal point of view, overfishing is dangerous for biological and economic reasons, yet it persists [1]. Game theoretic models can help in resolving these issues [7]. The necessity of coordination was noted in the first seminal work [8]. In their two-player analysis, Levhari and Mirman [9] identified two essential features of game theoretic models in fisheries: both players influence the stock dynamics, and the players are interconnected. The term 'dynamic externality' is used to characterize these features [9]; it underlines the important role of game theory, which studies strategic interaction, in its application to fisheries [10].
An analysis of situations with three and more players requires consideration of coalitions that are formalized by cooperative games. The main issue here is an allocation of the payoff of the grand coalition among all players. As Yi [11] says, coalitions of players create favorable or unfavorable externalities for other players/coalitions in the game. The problem of group externalities in fisheries has been studied in [12,13].
Wirl [14] extends the dynamic tragedy of the commons to the case of uncertainty (an Ito process instead of a differential equation) and an arbitrary number of players. Using a complex approach, Wang and Ewald [15] obtain an intriguing result: a competitive ecological system has better characteristics under cooperation, while a predatory ecosystem fares better under competition between fisheries.
In systems of imperfect information and uneven power, the principal-agent analysis is applied [16]. In game theoretic terms this is referred to as a Stackelberg game [17]. Some examples of applications of [...]

Extending this model, we construct a hierarchical differential game of two players with the leader (the supervisory body of the fishing service) and the follower (a fishing company). Payoff functionals for the players have the following form.

The supervisor:

$$J_0 = \int_0^T \int_G \left[ H \left( P(x, y, z, t) - P^*(x, y, z, t) \right)^2 + R\left( q(x, y, z, t), w(x, y, z, t) \right) \right] dv\,dt \to \min. \quad (1)$$

Agents $i = 1, 2, \dots, n$: [...]

Here $(x, y, z) \in G$ are spatial coordinates; $0 \le t \le T$ is the time coordinate; $P(x, y, z, t)$ is the concentration of fish (red-finned mullet); $w(x, y, z, t) = (w_1(x, y, z, t), w_2(x, y, z, t), \dots, w_n(x, y, z, t))$, where $w_i(x, y, z, t)$ is the share of fish capture (the i-th agent's control); $q_i(x, y, z, t)$ is the fishing quota (in shares; the leader's compulsion control); $T$ is the length of the game; $R(q, w)$ is a convex quota control cost function; $s_i(x, y, z, t)$ is a penalty coefficient for overfishing (in shares; the leader's impulsion control variable); $a$ is the price per unit of fish biomass; $b_i$ is the fishing cost coefficient of the i-th agent; $P^*(x, y, z, t)$ is the optimal value of fish biomass from the point of view of sustainable development; $H$ is the coefficient of the penalty charged on the supervisor when the current biomass deviates from its ideal value.

Constraints on the controls have the following form. The supervisor (who chooses compulsion or impulsion, respectively): [...] Agents: [...]

The first term in Equation (1) reflects a penalty charged on the supervisor when the current fish population deviates from its target value in either direction. That is why the respective term is quadratic and equal to zero when its argument is equal to zero. The second term in Equation (1) describes the supervisor's control cost (monitoring, and so on).
In Equation (2) the first term represents the agents' revenue from fishing, and the second term describes their costs. The input functions are borrowed from [28,29].
Thus, compulsion in this model assigns fishing quotas and incurs costs, while impulsion charges penalties for overfishing. A condition of sustainable development has the form $\forall t,\ \forall (x, y, z) \in G:\ P(x, y, z, t) = P^*(x, y, z, t)$ or, in a weaker form, $\forall t,\ \forall (x, y, z) \in G:\ |P(x, y, z, t) - P^*(x, y, z, t)| \le \delta$, $\delta \ll 1$. If the supervisor violates it, a penalty is imposed with coefficient $H$.
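As a rough illustration of how the weaker sustainability condition and the quadratic penalty term of Equation (1) could be evaluated on sampled data, consider the following minimal sketch. It assumes that the weaker condition bounds the absolute deviation of P from P*, and that biomass is sampled on cells of equal volume; the function names and the `cell_volume` parameter are hypothetical and do not come from the paper.

```python
from typing import Sequence


def max_deviation(P: Sequence[float], P_star: Sequence[float]) -> float:
    """Largest absolute deviation |P - P*| over all sampled points."""
    if len(P) != len(P_star):
        raise ValueError("P and P_star must have the same length")
    return max(abs(p - ps) for p, ps in zip(P, P_star))


def is_sustainable(P: Sequence[float], P_star: Sequence[float], delta: float) -> bool:
    """Weaker sustainability condition (assumed form): |P - P*| <= delta everywhere."""
    return max_deviation(P, P_star) <= delta


def deviation_penalty(P: Sequence[float], P_star: Sequence[float],
                      H: float, cell_volume: float) -> float:
    """Riemann-sum approximation of the term H * (P - P*)^2 for one time moment.

    Assumes cells of equal volume; this is an illustration, not the authors' scheme.
    """
    return H * sum((p - ps) ** 2 for p, ps in zip(P, P_star)) * cell_volume
```

The penalty vanishes only when the sampled biomass coincides with its target everywhere, which mirrors the remark that the quadratic term equals zero when its argument is zero.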

The Averaged Model and the QRS Method in Simulation Modeling
An attempt to solve Equations (1)-(4) explicitly leads to complicated optimal control problems with phase constraints and with the supervisor's control variables as functional parameters. Thus, the model was investigated by computer simulation on the basis of the method of qualitatively representative scenarios (QRS) [39].
The idea is that in most applied dynamical models of complex real-world social-economic systems (differential games included) it is possible to choose a very small number of scenarios that give a satisfactory picture of the qualitatively different ways in which the system can evolve. For example, in fish capture the shares of 10% and 20% are principally different; at the same time, there is no essential difference between capture shares of 10% and 10.5%. We propose a mathematical formalization of this idea and give a numerical substantiation of it.
Let us describe the QRS method for the problem described by Equations (1)-(4). Assume that $w_i(x, y, z, t)$ and $v_i(x, y, z, t)$ are the controls of the i-th agent and of the supervisor in relation to the i-th agent ($i = 1, 2, \dots, n$), where $v_i = s_i$ in the case of impulsion and $v_i = q_i$ in the case of compulsion. Then $\Omega$ is the set of outcomes and $J_i : \Omega \to \mathbb{R}$ is the payoff functional of player $i = 0, 1, \dots, n$.
In the QRS method we suppose that finite sets of qualitatively representative strategies are given for the i-th agent and for the supervisor in relation to the i-th agent. Together they form the QRS set of the game. Denote $v = (v_1, v_2, \dots, v_n)$; $w = (w_1, w_2, \dots, w_n)$.
A set $QRS = \{(v, w)^{(1)}, (v, w)^{(2)}, \dots, (v, w)^{(m)}\}$ is called the QRS set of the game with precision $\Delta$ if two conditions are satisfied [39]: (a) any two QRS lead to an essential difference in the supervisor's payoff (internal stability), and (b) the difference of payoffs for any other scenario and one of the QRS is not essential (external stability).
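The two conditions can be made concrete with a small sketch. It assumes that an "essential" difference means that the supervisor's payoffs differ by more than the precision ∆, and it uses a simple greedy pass to contract a candidate set; the greedy contraction and all function names are illustrative assumptions, not the procedure of [39].

```python
from itertools import combinations
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")


def internally_stable(qrs: Sequence[S], payoff: Callable[[S], float], delta: float) -> bool:
    """Condition (a), assumed form: any two QRS differ in payoff by more than delta."""
    return all(abs(payoff(a) - payoff(b)) > delta for a, b in combinations(qrs, 2))


def externally_stable(qrs: Sequence[S], scenarios: Sequence[S],
                      payoff: Callable[[S], float], delta: float) -> bool:
    """Condition (b), assumed form: each scenario outside the QRS set
    is within delta of the payoff of some QRS element."""
    return all(
        any(abs(payoff(s) - payoff(r)) <= delta for r in qrs)
        for s in scenarios if s not in qrs
    )


def contract_to_qrs(scenarios: Sequence[S], payoff: Callable[[S], float],
                    delta: float) -> List[S]:
    """Greedy contraction: keep a scenario only if its payoff differs
    by more than delta from every scenario kept so far."""
    kept: List[S] = []
    for s in scenarios:
        if all(abs(payoff(s) - payoff(r)) > delta for r in kept):
            kept.append(s)
    return kept
```

By construction, the set returned by `contract_to_qrs` satisfies both checks for the same `delta`: every discarded scenario was within `delta` of a kept one, and kept scenarios are pairwise more than `delta` apart.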
In the case of spatial heterogeneity of the shallow water body in three directions and the presence of three agents, the QRS set contains more than 30,000 elements. Accordingly, a direct enumeration of this set requires considerable computer time. The average time of one numerical experiment on the computer with A10 microprocessor Intel Pentium G4620 and 4 GB of RAM is approximately 24 h in this case.
Therefore, the system of six equations of biological kinetics was averaged over two spatial coordinates, which resulted in a spatially linear model of the Taganrog Gulf ecosystem.
For the description of the dynamics of the fish population (red-finned mullet) an evolutionary equation was used [40]: [...]
Here $P(t, x)$ is the value of fish biomass at time $t$ at the point $x$; $D$ is a parameter of mobility of the fish; $w(t, x)$ is the share of the fish capture at time $t$ at the point $x$; the function $P_0(x)$ characterizes the value of fish biomass at the initial moment of time; $L$ is the linear length of the water body.
It is supposed that [...]

As boundary conditions for Equation (5) we consider the conditions of filling of the medium, which permit a free growth of the population at the boundary: [...]

After the averaging and simplifications, the payoff functionals of the players take the following form. For the supervisor: [...] For the agents ($i = 1, 2, \dots, n$): [...]

The following restrictions are used. In the case of compulsion: [...] In the case of impulsion: [...] Here $0 \le t \le T$; $0 \le x \le L$; $i = 1, 2, \dots, n$. In the case of compulsion, the function $R(q, w)$ is taken in the form [...], while in the case of impulsion it is equal to zero. Thus, the problem described by Equations (5)-(10) is solved. The equilibrium is built according to the chosen information structure.

A Discrete Version of the Model
In real-world control systems (for example, in fisheries) the strategies of agents remain constant during some periods of time due to the natural inertia of the control processes [37].
Suppose that the strategies of the agents are constant on equal time intervals and on the same spatial segments, i.e., [...] where $b_{i,j}$ are constant controls of the supervisor or agents when [...]; $M$ is the number of time intervals with constant controls; $x_j = j\Delta x$, $\Delta x = L/K$; $K$ is the number of spatial segments with constant controls.
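A piecewise-constant control of this kind can be represented by an M × K table of values. The sketch below assumes a uniform time step T/M (in line with the "equal time intervals" of the text) alongside the spatial step L/K, and half-open cells with the right end points attached to the last cell; the function name and these conventions are assumptions made for illustration.

```python
from typing import Callable, Sequence


def piecewise_constant_control(b: Sequence[Sequence[float]],
                               T: float, L: float) -> Callable[[float, float], float]:
    """Build w(t, x) that equals b[i][j] on the i-th time interval
    and the j-th spatial segment of a uniform M x K grid."""
    M = len(b)        # number of time intervals with constant controls
    K = len(b[0])     # number of spatial segments with constant controls
    dt = T / M        # assumed uniform time step
    dx = L / K        # spatial step, as in the text

    def w(t: float, x: float) -> float:
        if not (0 <= t <= T and 0 <= x <= L):
            raise ValueError("(t, x) lies outside [0, T] x [0, L]")
        i = min(int(t / dt), M - 1)   # clamp t == T into the last interval
        j = min(int(x / dx), K - 1)   # clamp x == L into the last segment
        return b[i][j]

    return w
```

With M = K = 2, for instance, each control is fully described by a 2 × 2 matrix, so the search over functions reduces to a search over four numbers per player.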
Then the control of player $m$ ($m = 1, 2, \dots, n$) becomes a grid function $w^{(m)}$ [...]. The payoff functionals of the players transform into payoff functions of many variables. For the supervisor: [...] For the agents ($m = 1, 2, \dots, n$): [...]

The control restrictions are ($i = 1, 2, \dots, M$; $j = 1, 2, \dots, K$; $m = 1, 2, \dots, n$) as follows. For the supervisor: [...] For the agents: [...]

Thus, the problem in Equations (5)-(10) is reduced to Equations (5), (6), and (11)-(14). Whereas in the former case an optimal control problem is solved and the extremum is sought over a class of functions, in the latter case, the optimization problem in Equations (5), (6), and (11)-(14), we seek an extremum of functions of many variables.

Computer Simulation on the Base of the QRS Method
An algorithm for constructing the Stackelberg equilibrium for the game in Equations (5), (6), and (11)-(14) by the QRS method has the following form.

1. The form and values of all input functions and parameters of the model in question are defined.

2. One of the potential QRS is chosen as the current supervisor's strategy.

3. The set of Nash equilibria is built by complete enumeration of the QRS sets of all agents under the fixed supervisor's strategy. The problems in Equations (5) and (6) are solved numerically.

4. The pair of strategies (the current supervisor's strategy and the set of best responses of the agents to it) is compared with the current best pair of strategies for the supervisor (from the point of view of Equation (11)). The current pair replaces the best one if it is better for the supervisor.

5. If the feasible scenarios for the supervisor are not exhausted, then her new strategy is chosen and the algorithm returns to step 3.
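The steps above can be sketched as a plain enumeration. The sketch assumes that the agents maximize their payoffs, that the supervisor minimizes hers, and that when several Nash equilibria exist the one most favourable to the supervisor is taken; the callbacks `agent_payoff` and `supervisor_cost` are hypothetical stand-ins for the numerically evaluated payoff functions, and none of the names come from the paper.

```python
from itertools import product
from typing import Callable, List, Optional, Sequence, Tuple

Profile = Tuple[float, ...]


def nash_equilibria(v: float, agent_qrs: Sequence[Sequence[float]],
                    agent_payoff: Callable[[int, float, Profile], float]) -> List[Profile]:
    """Step 3: all pure-strategy Nash profiles of the agents for a fixed
    supervisor strategy v, found by complete enumeration."""
    equilibria: List[Profile] = []
    for w in product(*agent_qrs):
        stable = True
        for i, options in enumerate(agent_qrs):
            current = agent_payoff(i, v, w)
            # A profitable unilateral deviation of agent i breaks the equilibrium.
            if any(agent_payoff(i, v, w[:i] + (alt,) + w[i + 1:]) > current
                   for alt in options):
                stable = False
                break
        if stable:
            equilibria.append(w)
    return equilibria


def stackelberg_by_enumeration(
    supervisor_qrs: Sequence[float],
    agent_qrs: Sequence[Sequence[float]],
    supervisor_cost: Callable[[float, Profile], float],
    agent_payoff: Callable[[int, float, Profile], float],
) -> Optional[Tuple[float, float, Profile]]:
    """Steps 2-5: loop over supervisor strategies and keep the best pair.
    Returns (cost, v, w) or None if no equilibrium was found."""
    best: Optional[Tuple[float, float, Profile]] = None
    for v in supervisor_qrs:                      # steps 2 and 5
        for w in nash_equilibria(v, agent_qrs, agent_payoff):
            cost = supervisor_cost(v, w)
            if best is None or cost < best[0]:    # step 4
                best = (cost, v, w)
    return best
```

Strategies are scalars here only to keep the example short; in the discretized game each strategy would be an M × K matrix, and each payoff evaluation would require a numerical solution of Equations (5) and (6).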
An algorithm for constructing the inverse Stackelberg equilibrium for the game in Equations (5), (6), and (11)-(14) by the QRS method has the following form.

1. All input functions and parameters of the model in question are defined.

2. The strategies of punishment of the agents by the supervisor are introduced as [...], where $v = q$ under compulsion and $v = s$ under impulsion. Then the guaranteed payoff of an agent who refuses to cooperate with the supervisor is equal to [...].

3. The best supervisor's strategy (from the point of view of the payoff functional (11)) [...]

Both under compulsion and impulsion, for each scenario Equations (5) and (6) are solved numerically on the basis of implicit conservative difference schemes that have second order of approximation with respect to the spatial grid step and first order of approximation with respect to the time step [41].

Let us evaluate the computational complexity of the proposed algorithms. Assume that the QRS set of the hierarchical game in Equations (5), (6), and (11)-(14) at any moment of time, in each of the considered spatial domains, for each agent and for the supervisor in relation to each agent, contains $A$ elements. Then $|QRS_i| = A^{2(M+K)}$; $i = 1, 2, \dots, n$.
Thus, the computational complexity of the proposed algorithms for finding the Stackelberg equilibrium is $O(A^{2n(M+K)})$ as $A \to \infty$. Therefore, for small values of the input model parameters ($A$ is the number of elements in the QRS set; $n$ is the number of agents; $K$ is the number of spatial segments with constant controls; $M$ is the number of time intervals with constant controls) numerical calculations are feasible and make it possible to construct the Stackelberg equilibrium.
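To get a feeling for how quickly this estimate grows, the count can be evaluated directly. The sketch assumes that the estimate is the exponential expression A raised to the power 2n(M+K), i.e. the per-agent count A^(2(M+K)) multiplied over n agents; the function names are hypothetical.

```python
def qrs_size_per_agent(A: int, M: int, K: int) -> int:
    """Assumed per-agent count: A ** (2 * (M + K))."""
    return A ** (2 * (M + K))


def enumeration_size(A: int, n: int, M: int, K: int) -> int:
    """Number of scenarios visited by complete enumeration over n agents,
    assuming the counts multiply: A ** (2 * n * (M + K))."""
    return qrs_size_per_agent(A, M, K) ** n


if __name__ == "__main__":
    # Growth of the enumeration size with the number of agents for M = K = 2.
    for n in (1, 2, 3):
        print(n, enumeration_size(A=2, n=n, M=2, K=2))
```

Even with only two representative values per grid cell, adding agents multiplies the exponent, which is why the contraction of the initial QRS set matters so much for running time.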
Convergence of the proposed algorithms is understood as the fulfilment of the conditions of internal and external stability of the equilibrium [39].
The following calculations were made for an initial distribution of the current fields in the Azov Sea under the northern wind [24–27].
Example. Suppose that M = K = 2; N = 1; P 00 = 0.  Table 1 presents some numerical results of checking condition (a) for the constructed initial QRS set for the example input data in the case of compulsion.
Here $J_0^{\min}$ is the global minimum of Equation (11). The constructed initial QRS set contains 2401 strategies.
We checked the set for "redundancy" and possible contraction. A part of the results of the respective numerical calculations is presented in Table 1. The numerical experiments have shown that the initial QRS set of 2401 elements can be reduced to a new set of 213 QRS, which is used in the subsequent calculations. As a result, the average time of one numerical experiment on the computer with A10 microprocessor Intel Pentium G4620 and 4 GB of RAM was 1.5 h for compulsion scenarios and 2 h for impulsion scenarios. Table 2 presents some numerical results of checking the second condition from the definition of a QRS set under compulsion, namely the determination of the value of the precision ∆. Each matrix contains two values in time and two values in space.

Table 1. Examination of the condition (a) for the built initial QRS set for the example input data in the case of compulsion.

Table 2. Examination of the condition (b) for the example qualitatively representative scenarios (QRS) set under compulsion.

Here we compared strategies that do not belong to the QRS set with some "close" strategies from the QRS set. The numerical calculations (a part of their results is presented in Table 2) have shown that the constructed QRS set satisfies the condition of external stability (b) from the QRS set definition when ∆ = 0.04. Similar calculations were made for the impulsion scenarios. In that case the initial QRS set was reduced to 215 elements, and ∆ = 0.078.
For the comparative evaluation of the efficiency of the methods of hierarchical control we used a system compatibility index (SCI), which is determined for minimization problems by the formula [...] The SCI describes the difference between the global minimum value of the supervisor's payoff functional and its equilibrium value. The value κ = 1 indicates complete system compatibility, or the possibility of providing it without hierarchical control. The more the index κ differs from one, the lower the degree of system compatibility.
We conducted numerical experiments for the comparative evaluation of the efficiency of different information structures and methods of hierarchical control on test input data sets. The subscripts com_1, com_2, imp_1, imp_2 denote the respective combinations of compulsion (com) or impulsion (imp) with Stackelberg games (subscript 1) or inverse Stackelberg games (subscript 2).
We conducted multiple numerical calculations with the constructed QRS set of the game. Tables 3-5 contain their results, which allow a comparative analysis of the different methods of hierarchical control and different information structures of the game from the point of view of the system compatibility index. The dependencies of the SCI on the price of a unit of fish biomass, the capture cost, and the penalty coefficient are presented in Tables 3-5, respectively. The following conclusions follow from the investigation:
- when the input model parameters vary in a wide range, the system compatibility index does not vary essentially. Therefore, a hierarchical control is required to ensure sustainable management;
- an increase of the price of a unit of fish biomass results in an increase of the agent's payoff. The supervisor's payoff does not change. The SCI is close to one. The system becomes more compatible;
- an increase of the capture cost results in a decrease of the agent's payoff. The supervisor's payoff does not decrease. As in the previous case, the system becomes more compatible;
- an increase of the penalty coefficient leads to a decrease of the supervisor's payoff. The system under compulsion becomes less compatible, and the role of the supervisor in the system increases;
- the best information structure for the supervisor is an inverse Stackelberg game under compulsion;
- in contrast, the best information structure for the agents is a Stackelberg game under impulsion;
- for most of the considered input data the SCI is closer to one in Stackelberg games under impulsion.

Conclusions and Future Work
The paper contains the following new approaches. Unlike the previous works [28,29], we consider here a fan-like control system (a differential game of a supervisor and several agents) and closed-loop strategies of the players. In the case of several agents, the set of their best responses to the supervisor's strategy is the set of Nash equilibria in the normal-form game of the agents. In addition, we added a spatial coordinate. The authors' approach to the modeling of fisheries is based on the concept of sustainable management. Within this framework, the hierarchical control mechanisms (methods of compulsion and impulsion) are formalized as solutions of Stackelberg games with phase constraints, which reflect the requirements on the state of the controlled dynamical system (the conditions of sustainable development). The numerical investigation of the model uses two of the authors' ideas. First, we suppose that even in continuous processes the values of control variables change only at some discrete moments of time, which allows for a discretization of the initial continuous model [37,38]. Second, in most real applications it is possible to choose a very small number of representative scenarios (values of control variables and model parameters) that give a qualitatively good description of the model trajectories of the controlled dynamical system, including the optimal one [39]. Finally, for a comparative evaluation of the efficiency of the methods of hierarchical control we use an index of system compatibility [36], which is a ratio of the value of the supervisor's payoff functional in a Nash equilibrium to the value of its global maximum (minimum).
The QRS method has proven its efficiency in the considered problem in the case of one spatial coordinate. Unfortunately, an attempt to consider spatial heterogeneity in two or three directions leads to considerable expenditures of computer time. This is the main shortcoming of the method, because the one-dimensional model of a sea is largely notional. To overcome this shortcoming, in future research it is recommended to use a stricter selection of the strategies for inclusion in the initial QRS set.
Certainly, the problem under consideration is very complicated, which imposes many limitations on the practical application of the developed model. In fact, in its present form the model just demonstrates the possibilities of a game theoretic analysis of the control mechanisms in fisheries. The model seems to be the most realistic for regions where a single regulatory body is in charge and where there are no transboundary disputes on fishing rights. The model is also limited by its use of a single type of penalty scheme and does not investigate in detail different forms of compulsory regulation, which are planned to be considered in future research.