1. Introduction
A widespread approach to the analysis of game-theoretic models (in particular, differential games) consists in comparing the results obtained under selfish behavior of the players (differential games in normal form), their hierarchical organization (Stackelberg games), and cooperation (the differential game is reduced to an optimal control problem). A successful illustration of this approach is presented by Zhang et al. [1]. A variant of the approach is described in Ougolnitsky [2]. Building on Basar and Zhu [3], Cairns and Martinet [4], and some other papers, we proposed in [2] a system of individual and collective indices of comparative (relative) efficiency for the quantitative evaluation of the different ways of organizing economic agents.
A convenient model for the comparative analysis of the different ways of organizing economic agents is Cournot oligopoly (see Maskin and Tirole [5], Geras’kin [6,7], and Algazin and Algazina [8,9]). For example, Xiao et al. [10,11] studied Cournot duopoly with bounded rationality and investigated its equilibria. Raoufinia et al. [12] analyzed open-loop and closed-loop solutions in a Cournot duopoly game with advertising. Al-Khedhairi [13] considered a non-trivial Cournot duopoly based on fractal differential equations. Julien [14] investigated Cournot oligopoly with several Stackelberg leaders and followers. A comparison of Cournot and Stackelberg equilibria was performed by Zouhar and Zouharova [15].
Together with the standard setup of the oligopoly model, which describes the competition of several firms in a market of homogeneous goods, it is interesting to consider the so-called green effect. Usually, the green effect is studied in the context of supply chains (see Azevedo et al. [16], Fahimnia et al. [17], and Sharma and Jain [18]). It is assumed in this case that the participants of a supply chain invest in environmental protection in the processes of production and transportation. The incurred costs are compensated by the willingness of environmentally minded consumers to pay more for products with a green label.
However, the studies of Cournot oligopoly do not include a systematic comparative analysis of the relative profit from the point of view of the whole society and different firms. Besides, the environmental externalities of economic activity and possible environmental protection efforts are not considered.
Thus, the main idea of this paper consists in a comparison of the different ways of organization of economic agents, such as competition, cooperation, and hierarchical control, for differential Cournot oligopoly with pollution dynamics and differential Cournot oligopoly with the green effect. The specific aim consists in building preference systems based on individual and collective indices of relative efficiency.
The contributions of this paper and its novelty are as follows:
- A differential game-theoretic model of Cournot oligopoly with consideration of pollution is built and investigated analytically and numerically for the general case and the case of symmetrical agents.
- A comparative analysis of selfish agents’ behavior (a differential game in normal form), their hierarchical organization (differential Stackelberg games), and cooperation (an optimal control problem) is conducted using individual and collective indices of relative efficiency.
- The same analysis is performed for the models with the green effect, in which players choose both output volumes and environmental protection efforts.
- Systems of collective and individual preferences are constructed.
In Section 2, we characterize the materials and methods of the investigation. In Section 3, we build and investigate a differential game model of Cournot oligopoly for the selfish behavior of players, their hierarchical organization, and cooperation. The case of symmetrical players and the general case are considered. We applied the Pontryagin maximum principle for the analytical investigation and simulation modeling for the numerical calculations. We used a system of individual and collective indices of relative efficiency for the quantitative comparison of the obtained results. In Section 4, similar work is performed for the models with the green effect, in which players invest in environmental protection. Section 5 concludes the paper.
2. Materials and Methods
The main analytical instrument of the investigation is the well-known Pontryagin maximum principle [19]. For numerical calculations, we used an original method of qualitatively representative scenarios in simulation modeling [20]. The idea consists in choosing a relatively small number of control scenarios that provide a sufficiently precise description of the dynamics of the controlled system. To substantiate sufficiency, two conditions are used, namely internal and external stability. Suppose an initial set of scenarios is chosen. It is internally stable if, for any two scenarios from this set, the respective payoffs of the players differ essentially. It is externally stable if, for any feasible scenario that does not belong to this set, we can find a scenario from this set such that the respective payoffs of the players are close. The precision of such approximations is chosen empirically and should not exceed 10% of the typical values of payoffs.
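The two stability conditions can be sketched in Python as follows. This is a minimal illustration and not the procedure of [20]: the names `differ_essentially`, `is_internally_stable`, and `is_externally_stable` are hypothetical, scenarios are encoded as tuples with one control value per agent, and "differ essentially" is read here as "at least one player's payoff changes by more than a threshold `delta`", which is an assumption. The toy payoff function is invented purely for demonstration.

```python
from itertools import combinations
from typing import Callable, Sequence

Scenario = tuple[float, ...]  # one control value per agent (hypothetical encoding)
PayoffFn = Callable[[Scenario], Sequence[float]]


def differ_essentially(p: Sequence[float], q: Sequence[float], delta: float) -> bool:
    """Assumed reading: some player's payoff changes by more than delta."""
    return any(abs(a - b) > delta for a, b in zip(p, q))


def is_internally_stable(qrs: Sequence[Scenario], payoff: PayoffFn, delta: float) -> bool:
    """Every pair of scenarios in the set yields essentially different payoffs."""
    return all(
        differ_essentially(payoff(s), payoff(t), delta)
        for s, t in combinations(qrs, 2)
    )


def is_externally_stable(
    qrs: Sequence[Scenario],
    feasible: Sequence[Scenario],
    payoff: PayoffFn,
    delta: float,
) -> bool:
    """Every feasible scenario outside the set has a close representative in it."""
    members = set(qrs)
    return all(
        any(not differ_essentially(payoff(s), payoff(t), delta) for t in qrs)
        for s in feasible
        if s not in members
    )


def toy_payoff(scenario: Scenario) -> tuple[float]:
    """Made-up single-agent payoff, linear in the control."""
    return (10.0 * scenario[0],)


qrs = [(0.0,), (5.0,), (10.0,)]
feasible = [(float(u),) for u in range(11)]
print(is_internally_stable(qrs, toy_payoff, delta=25.0))
print(is_externally_stable(qrs, feasible, toy_payoff, delta=25.0))
```

In practice `delta` would be tied to the precision requirement stated above (a fraction of the typical payoff values), and a set that fails external stability would be extended with additional scenarios.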
For the quantitative comparative evaluation of the different ways of organizing economic agents (information structures of the respective game-theoretic models) from the point of view of both the whole society and separate agents, we introduced a system of relative efficiency indices [2], namely:
- collective indices of relative efficiency;
The values determine the payoffs of the i-th player in the Stackelberg and inverse Stackelberg games, respectively, when the -th player is the leader. Any player can become the leader; in our examples, it is the first player.
- individual indices of relative efficiency.
The payoffs are supposed to be non-negative.
We proposed a mathematical model that is a dynamic version of Cournot oligopoly with consideration of environmental pollution. The main approach to its identification is expert estimation using available real data. The model has five main parameters: (1) the concentration of pollutants in the environment at the initial moment of time, (2) the coefficient of pollutant emission during production, (3) the coefficient of decay of pollutants in the environment, (4) the maximal output of any firm, and (5) the cost coefficient of each firm. For the numerical identification of their values, we used the following reasoning. As a pollutant, we can consider carbon monoxide (CO). It is toxic, and its admissible concentration in production premises is 20 mg/m³ during a working day, 50 mg/m³ during an hour, or 100 mg/m³ during 30 min. Based on this, the concentration of pollutants in the environment at the initial moment of time varied from 1 to 50 mg/m³. The coefficient of pollutant emission during production depends on the production volume. For example, a coke chemical plant annually emits about 2000 tons of carbon monoxide. Based on this, the value of pollutant emission varied from 0.1 to 30 tons per year, the maximal output of any firm varied from 5 to 70,000 tons per year, and the cost coefficient of each firm varied from 1 to 50. The decay of many pollutants is slow; for example, carbon monoxide decays only in the presence of a catalyst. Thus, we varied the coefficient of decay of the pollutants from 0.1 to 30 kg per year. In addition, discounting was considered in the model, and the discount factor was taken equal to 0.004, which corresponds to moderate inflation. The modeling was conducted over an interval of 1200 days.
3. Differential Game Model of Cournot Oligopoly with Consideration of Pollution
Let us consider a dynamic version of Cournot oligopoly with consideration of environmental pollution and the linear equation of dynamics:
Here, is a set of firms (agents, players) competing in the manner of Cournot oligopoly in a market of homogeneous goods; is the -th player’s profit at time ; ; is the output volume of the -th firm (its strategy); the expression in parentheses in Formula (1) determines the price of the produced good, depending on the demand, which is inversely proportional to the total output volume; are dimensional coefficients that ensure dimensional consistency (for simplicity, they are assumed to be equal to 1); is the volume of pollutants in the environment (a state variable); is the coefficient of emission in the production of the -th firm; is the demand parameter; is the cost coefficient of the i-th firm; is the coefficient of pollution decay; is the discount factor; and is the length of the game. The agents’ interaction is described by their strategies and the final value of the state variable at the moment of time .
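To make the role of the emission and decay coefficients concrete, the following Python sketch integrates a linear pollution equation. The specific form dx/dt = Σ k_i u_i(t) − m x(t) with x(0) = x0 is an assumption based on the verbal description above (emission proportional to output, linear decay), and the names `emission`, `decay`, `x0`, `simulate_pollution`, and `closed_form_constant` are hypothetical. For constant outputs, the variation-of-parameters solution x(t) = x0·e^(−mt) + (K/m)(1 − e^(−mt)), with K = Σ k_i u_i, serves as a check on the numerical scheme.

```python
import math
from typing import Callable, Sequence


def simulate_pollution(
    x0: float,
    emission: Sequence[float],
    decay: float,
    controls: Sequence[Callable[[float], float]],
    horizon: float,
    steps: int = 10_000,
) -> float:
    """Explicit Euler integration of the ASSUMED dynamics
    dx/dt = sum_i emission[i] * u_i(t) - decay * x, returning x(horizon)."""
    dt = horizon / steps
    x = x0
    for j in range(steps):
        t = j * dt
        inflow = sum(k * u(t) for k, u in zip(emission, controls))
        x += dt * (inflow - decay * x)
    return x


def closed_form_constant(
    x0: float,
    emission: Sequence[float],
    decay: float,
    outputs: Sequence[float],
    t: float,
) -> float:
    """Variation-of-parameters solution of the same equation for constant outputs."""
    inflow = sum(k * u for k, u in zip(emission, outputs))
    return x0 * math.exp(-decay * t) + inflow / decay * (1.0 - math.exp(-decay * t))


# Illustrative values only; they are not taken from the paper's tables.
emission = [0.5, 0.5]
outputs = [4.0, 6.0]
controls = [lambda t, u=u: u for u in outputs]  # constant open-loop strategies

numeric = simulate_pollution(10.0, emission, 0.2, controls, horizon=10.0)
exact = closed_form_constant(10.0, emission, 0.2, outputs, t=10.0)
print(numeric, exact)
```

Explicit Euler is used only for brevity; it requires the step size to be small relative to 1/decay, and a higher-order scheme would be preferable for stiff parameter choices.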
In the case of symmetrical agents
the model in Formulas (1)–(3) takes the form
We supposed that all players use open-loop piecewise continuous strategies. The agents may be selfish, in which case we have a differential game in normal form with Nash equilibrium as its solution. In addition, the agents may cooperate, in which case the game is reduced to an optimal control problem. Finally, a hierarchical organization is possible, which is formalized by differential Stackelberg and inverse Stackelberg games [21,22].
Let us first consider the selfish behavior of the agents and investigate the symmetrical model in Formulas (4)–(6) using the Pontryagin maximum principle [19]. The Hamilton function for each player has the form
where
is a conjugate variable. We obtain
and
Here,
. Therefore, the found value
maximizes the Hamilton function if
, or the value belongs to the domain of feasible strategies, Formula (5). Otherwise, the point of maximum coincides with the lower bound of the set of feasible strategies of the agent. A conjugate variable is determined from the boundary value problem
Then,
and Formula (7) is a maximizer of the Hamilton function if
.
These calculations show that Nash equilibrium exists and is unique.
Using Formulas (8) and (9), we conducted numerical calculations for different input data sets in the case of symmetrical agents. We performed about 100 numerical calculations, varying the following parameters:
n from 2 to 40,
from 5 to 70
from 1 to 50,
from 0.1 to 30,
from 0.1 to 30, and
from 1 to 50. The input data are presented in Table 1, and the agents’ payoffs for the input data from Table 1 for
= 1200 and
r = 0.001 are presented in Table 2. Here and elsewhere, NE stands for Nash equilibrium,
for cooperative solution, and
and
for Stackelberg and inverse Stackelberg games, respectively.
In the case of arbitrary agents, the Hamilton function for the
-th player is
Then,
and
. Therefore, the solutions of Formula (10) maximize the Hamilton function if they belong to the sets of feasible strategies. For conjugate variables, Formula (8) is applied.
Solving the system of equations in Formula (10), we obtain
Let us consider the case .
Substitute Formula (11) in the equation of dynamics and solve it by the method of variation of parameters:
The input data are presented in Table 3, and the results for three arbitrary agents at
for the input data from Table 3 are presented in Table 4.
When all agents cooperate, we obtain an optimal control problem:
In the symmetrical case
the problem takes the form
Similar to the case considered earlier, we obtain
:
If
, then
The third and fourth columns of Table 2 present the results of calculations in the case of cooperation for the input data from Table 1.
When arbitrary agents cooperated, the solution was found numerically [23,24] using the method of qualitatively representative scenarios in simulation modeling [20]. The initial sets of qualitatively representative scenarios consisted of three elements: 0, a big number (10,000 as a specific example), and their average value. All elements of the initial set were checked for completeness and redundancy [20], and the set was reduced or extended with new elements as necessary. The calculation results are presented in Table 4.
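The construction of the initial scenario sets can be sketched in Python as follows. The three levels (0, a big number, and their average) come from the text; everything else is an assumption made for illustration: scenarios are taken to be constant output levels, the joint set for several agents is formed as a Cartesian product, and `refine_levels` shows only one possible way of extending a set (inserting midpoints), not the rule used in [20]. All function names are hypothetical.

```python
from itertools import product


def initial_levels(low: float = 0.0, high: float = 10_000.0) -> list[float]:
    """Three-element initial set: low, the average of low and high, and high."""
    return [low, (low + high) / 2.0, high]


def initial_scenarios(
    n_agents: int, low: float = 0.0, high: float = 10_000.0
) -> list[tuple[float, ...]]:
    """ASSUMED joint scenario set: every combination of per-agent levels."""
    return list(product(initial_levels(low, high), repeat=n_agents))


def refine_levels(levels: list[float]) -> list[float]:
    """One possible extension rule (an assumption): insert midpoints between neighbours."""
    ordered = sorted(levels)
    refined = [ordered[0]]
    for left, right in zip(ordered, ordered[1:]):
        refined.extend([(left + right) / 2.0, right])
    return refined


levels = initial_levels()
scenarios = initial_scenarios(n_agents=3)
print(levels, len(scenarios), refine_levels(levels))
```

With three agents the initial joint set already contains 27 scenarios, which is why the checks for completeness and redundancy matter: each refinement step multiplies the number of simulations to run.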
Now, let us consider the case of hierarchical relations between agents in two versions of the information structure. In a Stackelberg game, one of the agents (e.g., the first one) becomes the leader (she). She chooses and reports to the other agents (followers) her open-loop strategy .
The followers play a differential game in normal form. The best response of the followers to the leader’s strategy is defined as Nash equilibrium in this game. We solved
optimal control problems (1)–(3) for
. A solution of each problem was found using the Pontryagin maximum principle, similar to Formulas (10) and (11), and had the form
where
;
Substitute Formula (13) into Formulas (1) and (3) and solve the problem in Formulas (1) and (3) using the Pontryagin maximum principle for
. An optimal strategy of the first player has the form
where
Thus, in Stackelberg equilibrium, the first player (leader) chooses her strategy, Formula (14). Given the leader’s strategy, other players choose their strategies according to Formula (13). Given all players’ strategies, the state variable is determined using the solution of Formula (3) and the payoffs are determined using Formula (1).
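The order of moves described above (the leader commits first, the followers respond optimally, and payoffs are then evaluated) can be illustrated on a much simpler problem. The Python sketch below is a static textbook Cournot duopoly with linear inverse demand P = a − q1 − q2 and constant marginal cost c. It is not the differential model of Formulas (1)–(3), and the demand form, the parameters `a` and `c`, and all function names are assumptions chosen only to show the backward-induction logic.

```python
def follower_best_response(q_leader: float, a: float, c: float) -> float:
    """Follower maximizes (a - q_leader - q - c) * q over q >= 0."""
    return max(0.0, (a - c - q_leader) / 2.0)


def profit(q_own: float, q_other: float, a: float, c: float) -> float:
    """Profit under linear inverse demand and constant marginal cost."""
    return (a - q_own - q_other - c) * q_own


def stackelberg_duopoly(
    a: float, c: float, grid_size: int = 9001
) -> tuple[float, float, float, float]:
    """Leader searches a grid on [0, a - c], anticipating the follower's response.

    Returns (leader output, follower output, leader profit, follower profit).
    """
    best_q, best_profit = 0.0, float("-inf")
    for j in range(grid_size):
        q1 = (a - c) * j / (grid_size - 1)
        value = profit(q1, follower_best_response(q1, a, c), a, c)
        if value > best_profit:
            best_q, best_profit = q1, value
    q2 = follower_best_response(best_q, a, c)
    return best_q, q2, best_profit, profit(q2, best_q, a, c)


q1, q2, leader_profit, follower_profit = stackelberg_duopoly(a=10.0, c=1.0)
print(q1, q2, leader_profit, follower_profit)
```

In this static analogue the closed-form answer is q1 = (a − c)/2 and q2 = (a − c)/4, so the grid search can be checked directly. In the differential setting the same logic applies, with each best response obtained from an optimal control problem rather than a one-line formula.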
In an inverse Stackelberg game [21,22] based on the model in Formulas (1)–(3), the leader reports to each follower her strategy, with feedback on their control
If a follower refuses to cooperate with the leader, then she punishes the follower using the punishment strategy
, which according to Formula (13) has the form
Then, a guaranteed result of the
-th follower is equal to
If the followers cooperate with the leader, then she chooses a reward strategy
The reward strategies
are found as solutions of the optimal control problem
A solution of the problem in Formulas (15) and (16) was found numerically by computer simulation. The condition in Formula (16) ensures that a reward is always more profitable for the followers than punishment. The payoffs of all players in the Stackelberg and inverse Stackelberg games are presented in Table 4.
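The reward and punishment mechanism of the inverse Stackelberg game can be sketched in Python as follows. This is only a schematic illustration: the function names, the tolerance `tol`, and the numeric values are hypothetical, and the actual reward and punishment strategies are those defined by the formulas of this section.

```python
def leader_feedback(
    follower_control: float,
    prescribed_control: float,
    reward: float,
    punishment: float,
    tol: float = 1e-9,
) -> float:
    """Leader's action as a function of the follower's observed control.

    The leader plays the reward strategy if the follower applies the
    prescribed control (up to tol) and the punishment strategy otherwise.
    """
    if abs(follower_control - prescribed_control) <= tol:
        return reward
    return punishment


def incentive_compatible(payoff_if_comply: float, guaranteed_payoff: float) -> bool:
    """Schematic counterpart of the incentive condition: complying with the
    leader must give the follower strictly more than the guaranteed payoff
    obtained under punishment."""
    return payoff_if_comply > guaranteed_payoff


# Invented numbers for demonstration only.
print(leader_feedback(2.0, 2.0, reward=1.0, punishment=9.0))
print(leader_feedback(3.0, 2.0, reward=1.0, punishment=9.0))
print(incentive_compatible(payoff_if_comply=12.0, guaranteed_payoff=8.0))
```

The design point is that the leader's strategy is a mapping from the follower's control to her own action, which is what distinguishes the inverse Stackelberg game from the ordinary Stackelberg game with a pre-announced open-loop strategy.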
The values of the individual and collective indices of relative efficiency for different information structures are given in Table 5. The last row of Table 5 presents the average values of the collective and individual efficiency indices over the set of simulation experiments. Thus, we obtained the following preference systems:
society: ;
agent-leader:;
agent-follower:.
As expected, cooperation is always preferable for the whole society and for the followers. However, for the leader, the information structure of the inverse Stackelberg game is, as a rule, the most profitable. That is why a struggle for leadership often arises.
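How a preference system follows from a table of payoffs can be sketched in Python. The index formulas of [2] are not reproduced in this text, so the sketch assumes the simplest ratio form: a structure's payoff divided by the corresponding cooperative payoff, which is well defined when payoffs are positive. The labels `"C"`, `"NE"`, `"St"`, and `"ISt"`, the function names, and all payoff values are invented for illustration and are not taken from Tables 4 or 5.

```python
def collective_index(
    payoffs: dict[str, list[float]], structure: str, benchmark: str = "C"
) -> float:
    """ASSUMED form: total payoff under a structure relative to the cooperative total."""
    return sum(payoffs[structure]) / sum(payoffs[benchmark])


def individual_index(
    payoffs: dict[str, list[float]], structure: str, player: int, benchmark: str = "C"
) -> float:
    """ASSUMED form: a player's payoff relative to her payoff under cooperation."""
    return payoffs[structure][player] / payoffs[benchmark][player]


def preference_order(scores: dict[str, float]) -> list[str]:
    """Structures sorted from most to least preferred (largest index first)."""
    return sorted(scores, key=scores.get, reverse=True)


# Invented payoffs of three players; player 0 acts as the leader.
toy_payoffs = {
    "C": [10.0, 10.0, 10.0],
    "NE": [8.0, 8.0, 8.0],
    "St": [11.0, 7.0, 7.0],
    "ISt": [14.0, 6.0, 6.0],
}

society = preference_order({s: collective_index(toy_payoffs, s) for s in toy_payoffs})
leader = preference_order({s: individual_index(toy_payoffs, s, 0) for s in toy_payoffs})
follower = preference_order({s: individual_index(toy_payoffs, s, 1) for s in toy_payoffs})
print(society, leader, follower)
```

The toy numbers were chosen so that cooperation ranks first for society and the followers while the leader prefers the inverse Stackelberg structure, in line with the qualitative conclusion above; the remaining positions in each ranking are artifacts of the invented data.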
4. A Model with Consideration of the Green Effect
Now, the model takes the form
where
characterizes green efforts of the
-th firm,
is the coefficient of demand increasing due to the green effect,
is the green effort coefficient, and
is the coefficient of additional decreasing of the pollution due to green efforts.
In the case of symmetrical agents, the model takes the form
The agents’ strategies contain two control actions (functions
and
). The Hamilton function for each player in the symmetrical model in Formulas (20)–(22) has the form
So, if
, then
If and , then the maximum is attained on the bound of the set of feasible controls (at least one of the optimal controls is not internal).
If
and
, then
is an arbitrary function that belongs to the set of feasible strategies, for example,
. Denote
Given
if
, then
Notice that .
Therefore, the found value is a maximizer if a sufficient condition
is true and the value belongs to the set of feasible strategies, i.e., for
;
. If at least one of the inequalities
is false, then, depending on the input model parameters, the maximum is attained at one of the boundary points
The state variable was calculated using the method of variation of parameters. Under the inequalities in Formula (18), it is given by the formula
when
The payoffs are equal to
where
In the case of cooperation, the model takes the form
The maximum is attained at the values
Nash equilibrium was calculated numerically using computer simulation [22,23]. The input data are given in Table 6, and the results for symmetrical agents with consideration of the green effect are presented in Table 7 for the input data from Table 6.
In the case of arbitrary agents (model in Formulas (17)–(19)), the Nash equilibria for selfish behavior and cooperative solutions were calculated numerically using computer simulation. In Table 8 the input data are given, and in Table 9 the results for three arbitrary agents at
for the input data from Table 8 are presented.
Now consider a hierarchical setup with consideration of the green effect. Let a specific agent (principal) maximize the functional
over the controls
.
The other agents’ payoff functionals retain the form of Formula (17), but now the maximization is conducted only over the controls . The equation of dynamics has the form of Formula (19). The control constraints are again of the form of Formula (18).
In the Stackelberg game, similar to the preceding case, we obtained the solution of the game of agents in the form of Formula (25).
In Table 10, the values of the indices of collective and individual relative efficiency with consideration of the green effect are presented. The last row of Table 10 contains the average values of the respective indices. The indices of individual efficiency were calculated only for the case of cooperation.
In this case, we obtained the same preference system for the whole society and the followers:
This preference system remains the same as the system without consideration of the green effect. However, the consideration of the green effect makes the agents’ interests more diverse, and the whole economic system becomes less (for some input data sets, essentially less) compatible. In this case, for the whole society, cooperation is much better than other ways of organization.