The System Dynamics (SD) Analysis of the Government and Power Producers' Evolutionary Game Strategies Based on Carbon Trading (CT) Mechanism: A Case of China

Climate warming caused by carbon emissions is one of the most serious problems faced by human beings, and the carbon trading (CT) mechanism is an effective way to promote carbon emission reduction and achieve green and low-carbon development. Scholars have mainly studied the impact of CT on the energy economy system, and few have studied the game process and behavior strategies of government and power producers in the implementation of a CT mechanism. This paper fills this gap. It first constructs the evolutionary game model of government and power producers based on CT, then simulates the evolutionary process of game behavior strategies by establishing a system dynamics (SD) model, and finally studies the influence of government-controllable key factors on system stability. The combination of evolutionary game and SD in our study not only clearly reveals the complex and dynamic evolution process of game models under bounded rationality, but also provides a qualitative and quantitative simulation platform for analyzing the dynamic game process between government and power producers. The results show that: (1) There is no evolutionarily stable strategy (ESS) in the game system between government and power producers under CT, and the system evolution is characterized by periodicity; (2) When the government implements dynamic subsidies or punitive measures, the mixed strategy of the game system has an ESS; (3) Reducing the unit subsidy and raising the unit fine can both promote the participation of power producers in CT, but the former increases the probability of government supervision; thus, it is best to increase the fines when the government makes strategic adjustments, followed by reducing subsidies.


Introduction
Greenhouse gas emissions are one of the major factors causing global warming, and the low-carbon economy has become a global hot spot. The 21st session of the United Nations Conference on Climate Change (Global Climate Change Conference) was held in Paris from 30 November to 12 December 2015, and countries reached a consensus on energy saving and emission reduction. China signed the Paris Agreement on 22 April 2016. As the world's second largest economy, China actively participates in global climate governance to achieve long-term goals for mitigating global climate change [1]. The Chinese government issued the National Carbon Emissions Trading Market Construction Program (Power Generation Industry) on 19 December 2017, which marked the completion of the overall design of China's carbon trading (CT) system, and the national CT system was officially launched [2]. CT is a market-based mechanism developed by the government to promote global greenhouse gas emission reductions. It aims to control emissions by establishing legal greenhouse gas emission permits and total control targets, and allowing such permits to be purchased and sold like commodities [3]. It can use the market mechanism to optimize the allocation of environmental capacity and resources, mobilize the enthusiasm of enterprises to control emissions, flexibly adjust the balance between economic development and environmental protection, and minimize the overall costs of social governance [4].
Many scholars have conducted numerous studies on CT, and the current research mainly focuses on two aspects. One is the effectiveness of the CT market and its impact on the energy and economic system. Zhao et al. [3] study the efficiency of the CT market in China on the basis of the effective market theory and fair game model. Song et al. [5] evaluate the effect of current CT policy and analyze whether the related policies can improve the operation efficiency of the current carbon market. Fang et al. [6] explore the optimization scheme of CT in China based on a novel energy-saving and emission-reduction system with carbon price constraints. Hu et al. [7] use a panel of 25 major developing countries during the years 1996-2012 to explore the role of renewable energy consumption and commercial services trade in generating carbon emissions. Lin and Jia [8] establish six countermeasure scenarios with different carbon right allocation decline schemes to explore the impact of these schemes on energy, economy, and the environment. Li and Jia [9] use a dynamic, recursive computable general equilibrium (CGE) model to simulate the CT market, and to explore the relationship between free quota ratio and CT price and the impact of the CT scheme on China's economy and environment. Yang et al. [10] use the difference-in-differences model to study various policies, including economy, energy, climate, and allowance policies, to analyze the determinants of carbon prices. Jiang et al. [11] focus on the initial allocation of carbon emission permits among the provinces of China from the perspective of fairness, and construct a model of the initial inter-provincial allocation of carbon emission permits.
The other is the influence of CT policy on enterprises' market decisions. Chen et al. [12] conduct a questionnaire survey of 570 companies in 29 regions nationwide to identify the influencing factors of carbon emission reduction by establishing regression models. Song et al. [13] present an optimization model to quantitatively assess carbon reduction strategies at enterprise level in the building sector under CT. Zhu et al. [14] study the power generation enterprises' optimization decision support approach for the risk of CT in electric power systems. Yang et al. [15] identify the factors affecting companies' awareness and perceptions of the CT system by conducting a national survey based on an online questionnaire from May to November 2015 in seven CT pilots. Wang et al. [16] study the manufacturing/remanufacturing decisions for a capital-constrained manufacturer considering carbon emission cap and trade. Wang et al. [17] use game models to study the supply chain enterprise operations and government carbon tax decisions considering CT. Qin et al. [18] propose a multi-criteria decision analysis model to examine the quota allocation pertaining to China's east coastal areas based on the principles of efficiency and equity.
The existing literature has usefully explored the effects of the implementation and development of a CT mechanism. These studies not only examine the impact of CT policies on energy economy development and the electricity market, but also reveal the key factors and the transaction process between the main bodies of the CT market. However, there are some deficiencies in the current studies. The government is the promulgator of the policy, and the purpose of the government's enactment of the policy is to guide policy implementers to develop in a specific direction, so as to achieve the intended effects of the policy. In this process, there is a game between the policy issuer and the implementer. Most of the existing literature studies the impact of CT on the market and power generation companies from a macro perspective, and few studies examine the game process and behavioral strategies of the government and power producers in the process of CT implementation from a micro perspective. This paper fills this gap. This study first analyzes the behavioral strategies of government and power producers in the evolutionary game process by constructing an evolutionary game model. Secondly, taking China's CT market as an example, a system dynamics (SD) model is established to simulate the evolution of the game's behavioral strategies. Finally, we study the impact of key factors of CT implementation on the stability of the system. The combination of evolutionary game and SD in our study not only clearly reveals the complex and dynamic evolution process of game models under bounded rationality, but also provides a qualitative and quantitative simulation platform for analyzing the dynamic game process between government and power producers, thereby providing effective theoretical support for policy makers.

Methodology
In traditional game theory, it is often assumed that the participants are completely rational and act under complete information. However, complete rationality and complete information are difficult to achieve in real economic life. Evolutionary game theory holds that humans usually reach a game equilibrium through trial and error, that is, under bounded rationality, rather than modeling humans as hyper-rational game players. It has commonalities with the principle of biological evolution, and emphasizes dynamic equilibrium rather than static equilibrium. The significance of evolutionary game analysis under bounded rationality is not to predict one-off game outcomes or short-term game equilibrium, but to analyze and compare the long-term stability trend of certain game relationships under a stable environment, which is consistent with the simulation characteristics of SD [19]. SD is a systems modeling and dynamic simulation methodology for the analysis of dynamic complexities in socio-economic and biophysical systems with long-term, cyclical, and low-precision requirements [20,21]. The research method based on the combination of evolutionary game and SD has been applied to the field of economics and management. For example, Kim [22] and Sice et al. [23] use SD to simulate the dynamic and complex evolutionary process of the hybrid strategy game model and duopoly game model. Liu et al. [24] simulate an evolutionary game model of two enterprise populations' dynamics and stability in the decision-making behavior process. It can be seen that SD provides an effective aid for studying the complex dynamic process of evolutionary games under incomplete information conditions.

Theoretical Framing Analysis
It is necessary to explain the CT process and the relationship between the government and power producers under CT before establishing the evolutionary game model, as shown in Figure 1, in which the solid line indicates the behaviors of the government and power producers, and the dotted line indicates the CT amounts. In the CT market, the government acts by setting caps, legislating, motivating participation, introducing trading, and standardizing and supervising the market [4]. The government takes carbon emissions as the total amounts of transactions in the CT market, and accordingly sets a certain percentage of the free quotas, which is the maximum carbon emissions permitted for the power producers. The portion beyond the free quotas requires the power producers to trade at a certain CT price in the market, thereby completing the government's mandatory carbon emission target for power producers. During the transaction process, the government will give certain incentives to the power producers involved in CT, that is, subsidies [20]. Similarly, the government will impose certain penalties on power producers that do not participate in CT, that is, fines [20]. We regard all the power producers in the market as one game subject in our study, and a dynamic game is formed between the government's incentive/punishment measures and the decision of the power producers to participate in CT or not.

Game Payment Function
(1) Participants in the game: The two participants in the evolutionary game of CT are the government and power producers, and both of them have bounded rationality.
(2) Participants' behavior strategies: Power producers have two strategies, which are to participate in CT or not participate in CT. "The strategy of participation in CT (PCT)" includes power producers reducing energy consumption through energy-efficient retrofitting of equipment, thereby reducing carbon emissions. "The strategy of no participation in CT (NPCT)" means that power producers do not take any measures to reduce carbon emissions. At the same time, the government has the duty to supervise the power producers. There are two strategies for the government, that is, supervise (S) and do not supervise (NS) whether the power producers participate in CT [25]. Whether a power producer participates in CT or not can be seen as a result of the game between the government and power producers.
(3) Probabilities of behavioral strategy: In the initial stage of the game between the government and power producers, we suppose that the probability of the government choosing "S" is x (0 ≤ x ≤ 1), and then the probability of choosing "NS" is 1 − x. The probability of power producers choosing "PCT" is y (0 ≤ y ≤ 1), and then the probability of choosing "NPCT" is 1 − y. The game strategy combination is shown in Table 1.
Table 1. The game strategy combination of the government and power producers. PCT, participation in carbon trading; NPCT, no participation in carbon trading; S, supervise; NS, do not supervise.

Game players              Power producers
                          PCT (y)        NPCT (1 − y)
Government  S (x)         (S, PCT)       (S, NPCT)
Government  NS (1 − x)    (NS, PCT)      (NS, NPCT)

(4) Game payment functions: The revenue of the game matrix between government and power producers is shown in Table 2. The payment functions of the game players under each strategy combination are given by Formulas (1)-(8).
All the variables and their economic meanings are shown in Appendix A. Formula (1) is the government's revenue when the government supervises and power producers participate in CT. It is the negative value of expenditure costs, including the costs paid by the government during supervision and the subsidies for power producers, which are the product of the unit subsidy and the carbon emissions involved in the CT market. Formula (2) is the government's revenue when the government supervises and power producers do not participate in CT. The incomes are penalties, which are the product of the unit fine and the carbon emissions that should be involved in the CT market, and the expenditures are supervision costs and treatment costs of environmental pollution. Formula (3), which equals 0, is the government's revenue when the government does not supervise and power producers participate in CT. Formula (4) is the government's revenue when the government does not supervise and power producers do not participate in CT. It is the negative of the treatment costs of environmental pollution.
Formula (5) is the power producers' revenue when power producers participate in CT and the government supervises. The incomes include the comprehensive benefits, the subsidies, and the market trading revenues when power producers participate in CT. The expenditures are the costs of energy-saving equipment. The comprehensive benefits are the reduction of energy consumption costs, which is reflected in the reduction of the long-term marginal cost of carbon emissions. Formula (6) is the power producers' revenue when power producers do not participate in CT and the government supervises. It is the negative of the penalties. Formula (7) is the power producers' revenue when power producers participate in CT and the government does not supervise. It equals Formula (5) without the subsidies. Formula (8), which equals 0, is the power producers' revenue when power producers do not participate in CT and the government does not supervise.
Table 2. The revenue of the game matrix between government and power producers.
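The body of Table 2 is not reproduced in this text, so the following is only a hedged reconstruction of the payoff matrix from the verbal description of Formulas (1)-(8). The symbols $Q$, $C_s$, $C_e$, $B$, $R$, and $C_r$ are placeholders introduced here for illustration (the paper's own notation is listed in Appendix A); $p_{sub}$ and $f$ are the unit subsidy and unit fine used later in the text. In each cell the first entry is the government's payoff and the second is the power producers' payoff.

```latex
% Placeholder symbols (assumptions, not the paper's notation):
%   Q   : carbon emissions involved in the CT market
%   C_s : supervision cost          C_e : treatment cost of environmental pollution
%   B   : comprehensive benefits    R   : market trading revenues
%   C_r : cost of energy-saving equipment
\[
\begin{array}{c|cc}
 & \text{PCT } (y) & \text{NPCT } (1-y) \\ \hline
\text{S } (x)
 & \bigl(-C_s - p_{sub}\,Q,\;\; B + p_{sub}\,Q + R - C_r\bigr)
 & \bigl(f\,Q - C_s - C_e,\;\; -f\,Q\bigr) \\
\text{NS } (1-x)
 & \bigl(0,\;\; B + R - C_r\bigr)
 & \bigl(-C_e,\;\; 0\bigr)
\end{array}
\]
```

Read row by row, the four government entries correspond to Formulas (1)-(4) and the four producer entries to Formulas (5)-(8).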

Evolutionarily Stable Strategy (ESS)
According to Tables 1 and 2, the expected return of "S" ($\pi_x$), the expected return of "NS" ($\pi_{1-x}$), and the average expected return ($\bar{\pi} = x\pi_x + (1-x)\pi_{1-x}$) are given by Formulas (9)-(11) [26-28]. Thus, the replicator dynamic equation of the government's evolutionary strategy, Formula (12), is $F(x, y) = dx/dt = x(\pi_x - \bar{\pi}) = x(1-x)(\pi_x - \pi_{1-x})$. Similarly, the expected return of "PCT" ($u_y$), the expected return of "NPCT" ($u_{1-y}$), and the average expected return ($\bar{u} = y u_y + (1-y) u_{1-y}$) are given by Formulas (13)-(15), and the replicator dynamic equation of the power producers' evolutionary strategy, Formula (16), is $G(x, y) = dy/dt = y(u_y - \bar{u}) = y(1-y)(u_y - u_{1-y})$. Formulas (12) and (16) together form a two-dimensional autonomous dynamical system, Formula (17). According to the stability theorem for two-dimensional differential systems, when the interior values satisfy $0 \le x^* \le 1$ and $0 \le y^* \le 1$, the equilibrium points of the system are (0, 0), (0, 1), (1, 0), (1, 1), and $(x^*, y^*)$, where $(x^*, y^*)$ is the mixed-strategy point at which $\pi_x = \pi_{1-x}$ and $u_y = u_{1-y}$. According to the method proposed by Friedman [26-28], when population dynamics are described by a system of differential equations, the stability of a local equilibrium point can be obtained by analyzing the system's Jacobian. Thus, the stability of the five equilibrium points described above can be analyzed according to the Jacobian of Formula (17), which is $J = \begin{pmatrix} \partial F/\partial x & \partial F/\partial y \\ \partial G/\partial x & \partial G/\partial y \end{pmatrix}$. The local stability of an equilibrium point is determined by both the determinant (det(J)) and the trace (tr(J)). When the equilibrium point satisfies $\det(J) = \frac{\partial F}{\partial x}\frac{\partial G}{\partial y} - \frac{\partial F}{\partial y}\frac{\partial G}{\partial x} > 0$ and $\mathrm{tr}(J) = \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y} < 0$, this equilibrium point is the ESS. If det(J) < 0, then this equilibrium point is a saddle point. The stability analysis is performed according to the analysis of the Jacobian, and the results are shown in Table 3.
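The displayed equations of this section did not survive in this text, so the block below is a hedged sketch of what the replicator system looks like when it is rebuilt from the verbal payoff descriptions. It assumes that the traded emissions are $Q = TCE - FQ$ and that the supervision cost is $UCC \times TCE$ (both inferred from the expression for $y^*$ quoted in the Results Analysis), and it lumps the producers' equipment cost net of comprehensive benefits and trading revenues into a single hypothetical symbol $N$.

```latex
% Q = TCE - FQ and N are placeholders introduced for this sketch.
\[
\begin{aligned}
\pi_x - \pi_{1-x} &= \bigl[f - y\,(f + p_{sub})\bigr]\,Q - UCC \times TCE, \\
u_y - u_{1-y}     &= x\,(p_{sub} + f)\,Q - N, \\
F(x,y) &= \frac{dx}{dt} = x(1-x)\,\bigl(\pi_x - \pi_{1-x}\bigr), \\
G(x,y) &= \frac{dy}{dt} = y(1-y)\,\bigl(u_y - u_{1-y}\bigr), \\
y^* &= \frac{f - \dfrac{UCC \times TCE}{TCE - FQ}}{f + p_{sub}}, \qquad
x^* = \frac{N}{(p_{sub} + f)\,Q}.
\end{aligned}
\]
```

Under these assumptions both payoff differences vanish at $(x^*, y^*)$, so the diagonal of the Jacobian is zero there: the trace is 0 and the determinant is $-\frac{\partial F}{\partial y}\frac{\partial G}{\partial x} > 0$, which is the signature of a center rather than an ESS.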
It can be seen that the game model has a center point at $(x^*, y^*)$ and four saddle points at (0, 0), (0, 1), (1, 0), and (1, 1). Therefore, there is no ESS in the evolution of the system, and any slight change may have a huge impact on the system's behavior [19].
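The determinant and trace test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' model: all quantities are expressed per unit of traded emissions, `F_FINE` and `P_SUB` use the static values f = 150 and p_sub = 100 quoted in the Discussion, and the two lumped constants `C_SUP` and `N_COST` are hypothetical values back-solved so that the interior point equals the reported (0.72, 0.23).

```python
# Lumped per-unit parameters; all four are assumptions explained in the lead-in.
F_FINE = 150.0   # unit fine f
P_SUB = 100.0    # unit subsidy p_sub
C_SUP = 92.5     # supervision cost per traded unit (hypothetical lump)
N_COST = 180.0   # producers' net cost of participating (hypothetical lump)
TOTAL = F_FINE + P_SUB


def gov_gain(y):
    """Payoff advantage of S over NS, pi_x - pi_(1-x)."""
    return F_FINE - C_SUP - y * TOTAL


def prod_gain(x):
    """Payoff advantage of PCT over NPCT, u_y - u_(1-y)."""
    return x * TOTAL - N_COST


def jacobian(x, y):
    """Partial derivatives of F = x(1-x)*gov_gain(y) and G = y(1-y)*prod_gain(x)."""
    dF_dx = (1 - 2 * x) * gov_gain(y)
    dF_dy = -x * (1 - x) * TOTAL
    dG_dx = y * (1 - y) * TOTAL
    dG_dy = (1 - 2 * y) * prod_gain(x)
    return dF_dx, dF_dy, dG_dx, dG_dy


def classify(x, y, tol=1e-9):
    """Label an equilibrium point by the sign of det(J) and tr(J)."""
    dF_dx, dF_dy, dG_dx, dG_dy = jacobian(x, y)
    det = dF_dx * dG_dy - dF_dy * dG_dx
    tr = dF_dx + dG_dy
    if det < -tol:
        return "saddle"
    if det > tol and tr < -tol:
        return "ESS"
    if det > tol and abs(tr) <= tol:
        return "center"
    return "unstable"


X_STAR = N_COST / TOTAL             # interior government probability
Y_STAR = (F_FINE - C_SUP) / TOTAL   # interior producer probability

if __name__ == "__main__":
    for point in [(0, 0), (0, 1), (1, 0), (1, 1), (X_STAR, Y_STAR)]:
        print(point, classify(*point))
```

With these assumed numbers the four corners come out as saddle points and the interior point as a center, which matches the classification stated in the text.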

SD Simulation Model
Based on the analysis of the theoretical framing and evolutionary game model of the relationship between the government and power producers in the CT market, we believe that there are complex nonlinear relationships between the two game players. In order to clearly represent the dynamic evolution of the behavioral strategies of the game players, we establish a stock and flow diagram (SFD) of the dynamic game between the government and power producers using Vensim, based on the evolutionary game model above, as shown in Figure 2. The SFD is a good tool for modeling the cause and effect relationships between various components of the SD model [29]. There are about 30 functions used to represent the relationships between various factors in the SFD. As most of the functional relationships have been explained in the evolutionary game model, only the other main formulas and important functional relationships in the flow chart are described here.
Formula (18) is the probability of the government choosing "S": a stock that accumulates the government's replicator flow dx/dt from the initial value $x_0$, where $x_0$ is the initial probability of government supervision. Similarly, Formula (19) is the probability of power producers choosing "PCT": a stock that accumulates dy/dt from $y_0$, where $y_0$ is the initial probability of power producers' participation in CT.
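The stock and flow reading of Formulas (18) and (19) can be sketched as a fixed-step accumulation. The sketch assumes explicit Euler integration (one of the integration methods Vensim offers; the setting actually used is not stated), works per unit of traded emissions, and uses the hypothetical lumped constants `c_sup = 92.5` and `n_cost = 180.0`, back-solved from the reported equilibrium (0.72, 0.23) with f = 150 and p_sub = 100. Time is in arbitrary units.

```python
def gov_flow(x, y, f=150.0, p_sub=100.0, c_sup=92.5):
    """Flow into the stock x: dx/dt = x(1-x) * (payoff advantage of supervising)."""
    return x * (1 - x) * (f - c_sup - y * (f + p_sub))


def prod_flow(x, y, f=150.0, p_sub=100.0, n_cost=180.0):
    """Flow into the stock y: dy/dt = y(1-y) * (payoff advantage of participating)."""
    return y * (1 - y) * (x * (f + p_sub) - n_cost)


def simulate(x0, y0, dt=0.0005, steps=600):
    """Accumulate both stocks from (x0, y0): stock(t+dt) = stock(t) + dt * flow."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        x, y = xs[-1], ys[-1]
        xs.append(x + dt * gov_flow(x, y))
        ys.append(y + dt * prod_flow(x, y))
    return xs, ys


if __name__ == "__main__":
    # Start the government at its equilibrium value and the producers below theirs.
    xs, ys = simulate(0.72, 0.2)
    print("x range:", min(xs), max(xs))
    print("y range:", min(ys), max(ys))
```

Even when x starts exactly at its equilibrium value, the producers' stock pulls the government's stock away from it, so both probabilities keep oscillating rather than settling.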

Data
We put the data of China's CT market into the SD model for simulation. The data are collated according to [20,30-34], and the main sources are related literature, the Chinese Statistical Yearbook, and survey data of the China Electricity Council (CEC). We assume that TCE is one unit, and the data of other variables are shown in Table 4.

Results Analysis
According to the data, we obtain $x^* = 0.72$ and $y^* = \left(f - \frac{UCC \times TCE}{TCE - FQ}\right)/(f + p_{sub}) = 0.23$, which both meet the prerequisites of evolutionary equilibrium. Since the data in our study are unit values based on the actual data, the simulation time in the SD model refers to a general time unit rather than a specific setting (such as year or month). In order to study whether the game players' strategies will settle at an equilibrium point on their own, we analyze three scenarios. By comparing the results of our study with those of the related literature [19,25,35,36], we find that the trend of game players' probabilities and the evolution of the mixed strategy in our study are substantially the same as those in the related literature; this indicates that our simulation results are consistent with those of other scholars.
Scenario 1: It is assumed that the initial probability of government supervision is the equilibrium value of the mixed strategy, and that of power producers participating in CT is varied. Let $x_0 = x^* = 0.72$, and let the initial probability of power producers' participation in CT be $y_0 = 0.2$ and $y_0 = 0.8$, respectively. The simulation results are shown in Figure 3. We can see that when the initial value of government supervision is set to the equilibrium value of the mixed strategy, and the initial value of y is given, the probability of power producers participating in CT fluctuates, and the system does not stabilize at the center point $(x^*, y^*)$. In addition, as time and the number of games increase, the amplitude of fluctuation of y gradually increases, and it is difficult for the game process to reach a steady state.
Scenario 2: It is assumed that the initial probability of power producers participating in CT is the equilibrium value of the mixed strategy, and that of government supervision is varied. Let $y_0 = y^* = 0.23$, and let the initial probability of government supervision be $x_0 = 0.2$ and $x_0 = 0.8$, respectively. The simulation results are shown in Figure 4. We can see that when the initial value of power producers participating in CT is set to the equilibrium value of the mixed strategy, and the initial value of x is given, the probability of government supervision fluctuates, and the system does not stabilize at the center point $(x^*, y^*)$. In addition, as time and the number of games increase, the amplitude of fluctuation of x gradually increases, and it is difficult for the game process to reach a steady state.
Scenario 3: It is assumed that both of the game players use the same value between 0 and 1 as the initial probability. Let $x_0 = 0.5$ and $y_0 = 0.5$; the simulation results are shown in Figure 5. We can see that the evolution process of the system is a closed-loop line with periodic motion around the center point, which indicates that the two game players, the government and power producers, show a periodic behavior pattern.
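Around a center, exact trajectories are closed curves, so whether a simulated amplitude stays constant or grows can depend on the numerical integrator. The sketch below illustrates this for Scenario 3 under assumptions: it uses a reconstructed per-unit form of the replicator system with hypothetical lumped constants `A = 57.5`, `B = 250.0`, and `N = 180.0` (back-solved from the reported equilibrium and the static values f = 150, p_sub = 100), and it tracks the quantity `invariant(x, y)`, which is constant along exact solutions of that reconstructed system. It is not the authors' Vensim model, and it does not establish which effect drives Figures 3-5.

```python
import math

# Hypothetical lumped constants: A = f - c_sup, B = f + p_sub, N = net participation cost.
A, B, N = 57.5, 250.0, 180.0


def rhs(x, y):
    """Right-hand side of the reconstructed replicator system."""
    return x * (1 - x) * (A - B * y), y * (1 - y) * (B * x - N)


def invariant(x, y):
    """Quantity conserved by the exact flow of rhs (its time derivative is zero)."""
    return (N * math.log(x) + (B - N) * math.log(1 - x)
            + A * math.log(y) + (B - A) * math.log(1 - y))


def euler_step(x, y, dt):
    dx, dy = rhs(x, y)
    return x + dt * dx, y + dt * dy


def rk4_step(x, y, dt):
    k1 = rhs(x, y)
    k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = rhs(x + dt * k3[0], y + dt * k3[1])
    return (x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)


def drift(step, x0=0.5, y0=0.5, dt=0.0005, steps=1000):
    """Absolute change of the invariant after integrating with the given stepper."""
    x, y = x0, y0
    start = invariant(x, y)
    for _ in range(steps):
        x, y = step(x, y, dt)
    return abs(invariant(x, y) - start)


if __name__ == "__main__":
    print("Euler drift:", drift(euler_step))
    print("RK4 drift:  ", drift(rk4_step))
```

A small drift means the orbit closes on itself, as in Scenario 3; a large drift means the orbit spirals, which is one way a growing amplitude can appear in a fixed-step simulation of a system whose exact orbits are periodic.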

Discussion
According to the simulation results, we can see that the game system cannot achieve equilibrium under the premise of given variables. The unit subsidy and fine are unchanged in the simulation, that is, the government subsidizes or punishes power producers at the same amount regardless of the probability of power producers participating in CT. The unit subsidy and unit fine are two important variables controlled by the government, which can be changed dynamically according to the degree of actual participation of the power producers in CT. In this section, we discuss the system's evolutionary process in the case of dynamic subsidies and dynamic penalties.

Dynamic Subsidies
The government adopts a subsidy incentive mechanism to promote the implementation of CT policy by power producers to fulfill emission reduction requirements. In the initial stage of subsidy implementation, when the proportion of power producers participating in CT is low, the government has a strong incentive to motivate power producers to participate in CT; thus, the unit subsidy is relatively high. On the contrary, the higher the proportion of power producers participating in CT, the better the implementation of CT policy. Power producers can complete or even exceed the emission reduction targets. At this time, the government's incentive to encourage power producers to participate in CT also weakens, thereby decreasing the unit subsidy. Thus, we assume that the government's subsidies decrease linearly with the probability that power producers participate in CT, that is, when power producers participate in CT and the government supervises, the unit subsidy changes from the constant $p_{sub}$ to $(1 - y) \times p_{sub}$.
First, we analyze the effect of dynamic subsidies on the system's stability according to the replicator dynamic equations and Jacobian stability analysis. Substituting $(1 - y) \times p_{sub}$ for $p_{sub}$ in Formula (17) gives the modified replicator system. It has five equilibrium points: (0, 0), (0, 1), (1, 0), (1, 1), and an interior mixed-strategy point $(x^*, y^*)$. The stability analysis is performed on the Jacobian of the modified system, and the results are shown in Table 5. It can be seen that the game system with dynamic subsidies has an ESS, which is the interior point $(x^*, y^*)$.
Table 5. The stability analysis of equilibrium points under dynamic subsidies (columns: local equilibrium point, det(J), tr(J)).
Secondly, we use SD simulation to verify the above conclusion. The simulation results under dynamic subsidies are shown in Figure 6a-c. We can see that, when the subsidy is static and $p_{sub} = 100$, x and y oscillate up and down as the time and number of games increase, the amplitude of fluctuation increases, and it is difficult for the process to reach a steady state. However, when the subsidy is dynamic and $p_{sub} = 100$, x and y converge gradually and eventually become stable, and the ESS is (0.8, 0.26). Through SD simulation, we verify that the system is stable under dynamic subsidies.
In addition, we study the changes in probability by changing the value of $p_{sub}$. If the unit subsidy is reduced, namely, when the subsidy is dynamic and $p_{sub} = 60$, x and y gradually become stable and both increase; that is, reducing the unit subsidy can promote the participation of power producers in CT, and at the same time, it increases the probability of government supervision.
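The effect of the dynamic subsidy can be checked with a short calculation. This is a sketch under assumptions, not the authors' derivation: it works per unit of traded emissions, keeps f = 150, and uses two hypothetical lumped constants, `C_SUP = 92.5` and `N_COST = 180.0`, back-solved from the static equilibrium (0.72, 0.23). The interior point follows from setting both payoff advantages to zero after replacing the unit subsidy by (1 - y) * p_sub.

```python
import math

C_SUP = 92.5     # supervision cost per traded unit (hypothetical lump)
N_COST = 180.0   # producers' net cost of participating (hypothetical lump)


def ess_dynamic_subsidy(p_sub, f=150.0):
    """Interior equilibrium when the unit subsidy is (1 - y) * p_sub."""
    # Government indifference: f - C_SUP - y*(f + (1 - y)*p_sub) = 0, i.e.
    # p_sub*y**2 - (f + p_sub)*y + (f - C_SUP) = 0; the smaller root lies in (0, 1).
    b = f + p_sub
    y = (b - math.sqrt(b * b - 4 * p_sub * (f - C_SUP))) / (2 * p_sub)
    # Producer indifference: x*(f + (1 - y)*p_sub) - N_COST = 0.
    x = N_COST / (f + (1 - y) * p_sub)
    return x, y


def det_and_trace(x, y, p_sub, f=150.0):
    """det(J) and tr(J) of the modified system F = x(1-x)*a(y), G = y(1-y)*b(x, y)."""
    a = f - C_SUP - y * (f + (1 - y) * p_sub)
    b = x * (f + (1 - y) * p_sub) - N_COST
    dF_dx = (1 - 2 * x) * a
    dF_dy = x * (1 - x) * (2 * p_sub * y - (f + p_sub))
    dG_dx = y * (1 - y) * (f + (1 - y) * p_sub)
    dG_dy = (1 - 2 * y) * b - y * (1 - y) * x * p_sub
    return dF_dx * dG_dy - dF_dy * dG_dx, dF_dx + dG_dy


if __name__ == "__main__":
    for p_sub in (100.0, 60.0):
        x, y = ess_dynamic_subsidy(p_sub)
        print(p_sub, round(x, 2), round(y, 2), det_and_trace(x, y, p_sub))
```

The negative trace comes entirely from the extra term in dG/dy, that is, from the subsidy shrinking as participation rises; that term is what turns the center of the static model into a stable point.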

Dynamic Penalties
Similar to the subsidy mechanism, the government adopts a punitive mechanism to promote the implementation of CT policy by power producers from the opposite direction to fulfill emission reduction requirements. In the initial stage of punishment, when the proportion of power producers participating in CT is low, the government has a strong incentive to punish power producers who do not participate in CT; thus, the unit fine is relatively high. On the contrary, the higher the proportion of power producers participating in CT, the better the implementation of CT policy. Power producers can complete or even exceed the emission reduction targets. At this time, the government's incentive to punish power producers also weakens, thereby decreasing the unit fine. Thus, we assume that the government's penalties decrease linearly with the probability that power producers participate in CT, that is, when power producers do not participate in CT and the government supervises, the unit fine changes from the constant f to $(1 - y) \times f$.
First, we analyze the effect of dynamic penalties on the system's stability according to the replicator dynamic equations and Jacobian stability analysis. Substituting $(1 - y) \times f$ for f in Formula (17) gives the modified replicator system. It has five equilibrium points, namely (0, 0), the remaining boundary points, and the interior mixed-strategy point $(x^*, y^*)$. The stability analysis is performed on the Jacobian of the modified system, and the results are shown in Table 6. It can be seen that the game system with dynamic penalties has an ESS, which is the interior point $(x^*, y^*)$.
Secondly, we use SD simulation to verify the above conclusion. The simulation results under dynamic penalties are shown in Figure 7a-c. We can see that, when the fine is static and f = 150, x and y oscillate up and down as the time and number of games increase, the amplitude of fluctuation increases, and it is difficult for the process to reach a steady state. However, when the fine is dynamic and f = 150, x and y converge gradually and eventually become stable, and the ESS is (0.79, 0.15). Through SD simulation, we verify that the system is stable under dynamic penalties.
In addition, we study the changes in probability by changing the value of f. If the unit fine is increased, namely, when the fine is dynamic and f = 180, x and y gradually become stable, and x decreases and y increases; that is, raising the unit fine can not only promote the participation of power producers in CT, but also reduce the probability of government supervision.
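The convergence under dynamic penalties can be illustrated with a small simulation. The sketch is an assumption-laden stand-in for the Vensim model: it works per unit of traded emissions, keeps p_sub = 100, uses the hypothetical lumped constants `C_SUP = 92.5` and `N_COST = 180.0` (back-solved from the static equilibrium (0.72, 0.23)), integrates with fixed-step Euler in arbitrary time units, and compares the end state with the closed-form interior point.

```python
import math

P_SUB = 100.0    # unit subsidy p_sub
C_SUP = 92.5     # supervision cost per traded unit (hypothetical lump)
N_COST = 180.0   # producers' net cost of participating (hypothetical lump)


def flows(x, y, f):
    """Replicator flows when the unit fine is (1 - y) * f."""
    fine = (1 - y) * f
    gov_gain = (1 - y) * fine - C_SUP - y * P_SUB
    prod_gain = x * (P_SUB + fine) - N_COST
    return x * (1 - x) * gov_gain, y * (1 - y) * prod_gain


def simulate(f, x0=0.5, y0=0.5, dt=0.0005, steps=10000):
    """Fixed-step Euler accumulation of both stocks; returns the final state."""
    x, y = x0, y0
    for _ in range(steps):
        dx, dy = flows(x, y, f)
        x, y = x + dt * dx, y + dt * dy
    return x, y


def ess_dynamic_penalty(f):
    """Closed-form interior equilibrium of the same reconstructed system."""
    # Government indifference: f*(1 - y)**2 - C_SUP - P_SUB*y = 0, i.e.
    # f*y**2 - (2*f + P_SUB)*y + (f - C_SUP) = 0; the smaller root lies in (0, 1).
    b = 2 * f + P_SUB
    y = (b - math.sqrt(b * b - 4 * f * (f - C_SUP))) / (2 * f)
    # Producer indifference: x*(P_SUB + (1 - y)*f) - N_COST = 0.
    x = N_COST / (P_SUB + (1 - y) * f)
    return x, y


if __name__ == "__main__":
    for f in (150.0, 180.0):
        print(f, simulate(f), ess_dynamic_penalty(f))
```

Starting from (0.5, 0.5), the trajectory spirals into the interior point instead of circling it, and raising f moves that point toward lower x and higher y, in line with the direction of change described above.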

Conclusions
Climate warming caused by carbon emissions has become a concern for human beings, and the CT mechanism is one of the effective means to promote carbon emission reduction and achieve green and low-carbon development. It is of great significance to study the evolutionary game process of the behavioral strategies of government and power producers when implementing CT policy. This paper first constructs the evolutionary game model of the game players, and then uses the SD model to simulate the evolution process of the behavioral strategies. Finally, the stability of the system under the government's dynamic strategies is discussed. This study provides theoretical guidance and countermeasures for the effective implementation of CT policy and carbon reduction. According to the results of the model, the conclusions of this paper are as follows.
(1) The evolutionary game model and SD simulation model constructed in this paper can clearly and effectively demonstrate the evolutionary game process of government and power producers' behavioral strategies under CT, which provides a reference for scholars studying related issues.
(2) There is a center point and four saddle points in the game system between government and power producers under CT, and there is no ESS. The evolution process of the system is a closed-loop line with periodic motion, which indicates that the two game players, the government and power producers, show a periodic behavior pattern.
(3) The trajectories of the game players spiral inward toward a stable focus when the government implements dynamic subsidies or penalties. This shows that the probabilities of government supervision and of power producers participating in CT gradually converge over time, and ultimately stabilize at the ESS in the mixed strategy, so that the game system can reach equilibrium.
(4) Reducing the unit subsidy and raising the unit fine can both promote the participation of power producers in CT, but the former increases the probability of government supervision. The purpose of CT implementation is to achieve the optimal allocation of resources through market-oriented means, and the government is only responsible for basic supervision [37,38]. Thus, it is best to increase the fines when the government makes strategic adjustments, followed by reducing subsidies.
There are still some improvements that could be made in future studies. We believe that the government's implementation of dynamic subsidies or penalties is conducive to the stability of the system, but in reality, it is difficult for the government to adjust the size of subsidies or fines at any time. Thus, scholars could make an in-depth study of the specific implementation process and mechanisms of dynamic subsidies and penalties in future studies.

Figure 1. The carbon trading (CT) process and the relationship between the government and power producers under CT.

Figure 2. The stock and flow diagram (SFD) of dynamic game between the government and power producers.

Figure 3. The evolution of the probability of power producers participating in CT under different initial values.

Figure 4. The evolution of the probability of government supervision under different initial values.

Figure 5. The game evolution process of the mixed strategy.

Figure 6. The system's game process under dynamic subsidies.

Figure 7. The system's game process under dynamic penalties.

Table 3. The stability analysis of equilibrium points.

Table 4. Data of simulation.

Table 6. The stability analysis of equilibrium points under dynamic penalties.