Research on Evolutionary Game Strategy Selection and Simulation Research of Carbon Emission Reduction of Government and Enterprises under the “Dual Carbon” Goal

Abstract: As one of the effective market instruments in carbon emission reduction policy, carbon trading can promote the smooth implementation of the “dual carbon” goal. Based on the evolutionary game method of information economics, this paper constructs a dynamic evolutionary game model of government and enterprise carbon emission reduction and analyzes how the strategies of the two sides evolve. We use carbon market trading data from Guangdong Province to simulate the evolutionary game paths of government and enterprise carbon emission reduction under the “dual carbon” goal and then compare strategy choices. The results show that (1) scientific adjustment of the carbon quota can effectively shorten the time it takes for the carbon emission reduction probability of high-pollution enterprises to reach 1, so that these enterprises obtain additional surplus carbon quota and earn extra carbon emission reduction income; (2) increasing financial subsidies can raise the probability of carbon emission reduction of high-pollution enterprises but cannot prevent periodic changes in that probability, which in turn helps prolong the “window period” of government regulation on carbon emission reduction; (3) increasing carbon emission penalties helps high-pollution enterprises actively reduce emissions and strengthens the motivation for government supervision; (4) the government can introduce a dynamic reward and punishment mechanism, and if it chooses the reward and punishment strategy properly, it need not pay additional subsidies, so that the government and enterprises can cooperate tacitly to achieve the goal of carbon emission reduction; (5) if the price of carbon emission permits is adjusted, high-pollution enterprises will actively reduce carbon emissions and gain greater benefits no matter what regulatory measures the government takes. The results of this study are significant for the carbon emission reduction strategies and government regulation of high-pollution enterprises and will help China achieve its “dual carbon” development goal.


Introduction
With the continuous development of science, technology and industry, global carbon dioxide emissions are increasing every year, leading to frequent natural disasters in recent years. As a result, countries are paying increasing attention to carbon dioxide emissions. The International Energy Agency (IEA) recently released the report "Global Energy Review: Carbon Dioxide Emissions in 2021". The IEA pointed out that in 2021, carbon dioxide emissions from the global energy sector reached 36.3 billion tons, a record high and a year-on-year increase of 6%. The 2021 BP World Energy Statistical Yearbook data

Literature Review
Scholars have studied government policy and enterprise emission reduction extensively.
Carbon emission reduction by high-pollution enterprises is both an important prerequisite and an important basis for realizing the "double carbon" goal [5,6]. Participation of the government, enterprises and the public in environmental co-governance is an important measure for realizing a green economic and social transformation [7–9]. The government should strengthen environmental regulation [10–12] and use subsidy policies with caution [13]. It should also coordinate different policies to push enterprises to reduce emissions and to achieve sustainable social development in the game between government and enterprise [14].
The government's policy design and choice play an important role in guiding and standardizing enterprise behavior. The government can set scientific and reasonable incentive and punishment policies to help enterprises achieve an optimal allocation of emission reduction resources [15,16]. The government should not only strengthen the supervision of high-pollution enterprises but also improve energy utilization efficiency to reduce carbon emissions; otherwise, loopholes in carbon emission reduction will remain [17,18]. The combination of rewards and punishments by the central and local governments can guide the development of a local low-carbon economy [19,20]. Carbon trading policies can improve the economic performance of high-pollution enterprises [21–23], stimulate innovation [24,25] and improve the carbon trading market [26,27]. Game theory is an effective analytical tool for studying conflict and cooperation between governments and enterprises in carbon emission reduction [28,29]. It has become a common method for analyzing government regulation of enterprise carbon emission reduction [30], and it is commonly applied to the infinitely repeated game between government regulatory departments and enterprises [31]. Factors that affect the outcome of the game include government green technology subsidies [32], the carbon trading mechanism [33], enterprise emission reduction cost [34], consumer satisfaction [35] and the government carbon tax [36].
Although many studies have recognized that the evolutionary game is an important tool for analyzing the relationship between government and enterprises, in the actual process of emission reduction the intensity of government policy changes with the emission reduction behavior of enterprises. It also changes with the carbon quota, the emission reduction investment of enterprises and their risk preference. Existing studies focus on enterprises in a general sense, whereas high-pollution enterprises are large carbon emitters and are the key to realizing the "double carbon" goal. Therefore, building on existing research, this paper establishes an evolutionary game model between government policies under carbon emission constraints and the emission reduction behavior of high-pollution enterprises. We focus on how factors such as scientific adjustment of the carbon emission quota, carbon trading price adjustment and dynamic changes in government rewards and punishments influence the evolutionary game strategy selection of government and enterprises, which will provide feasible suggestions for realizing China's "double carbon" goal.

Policy Background
On 22 September 2020, at the 75th United Nations General Assembly, President Xi Jinping solemnly announced to the world that China will strive to peak its carbon dioxide emissions by 2030 and to achieve carbon neutrality by 2060 (hereinafter referred to as "Double Carbon"). In March of the following year, "Double Carbon" was incorporated into the overall layout of ecological civilization construction, becoming an important part of accelerating green and low-carbon development during the "14th Five-Year Plan" period. "Double carbon" is an inherent requirement for promoting the high-quality development of China's economy. It is also an inevitable choice as the development of the economy, society and energy technology enters a new stage. Carbon emission trading ("carbon trading" for short) has become an important way to achieve the "double carbon" goal [37,38]. Figure 1 shows the growth of China's carbon dioxide emissions from 1980 to 2020.
China's carbon emissions have gone through three stages: low-speed growth, high-speed growth and high-quality development (Figure 1). The first stage is the low-speed growth stage (1980−1999 … 9.71% in 1999, with an average annual growth of 3.15%. The second stage is the high-speed growth stage (2000−2012). Carbon dioxide emissions increased from 20.17% in 2000 to 6.30% in 2012, with an average annual growth rate of 10.65%. The third stage is the high-quality development stage (2013−2020). Carbon dioxide emissions increased from 2.90% in 2013 to a decrease of 1.77% in 2020, with an average annual growth rate of 2.11%, which is in a relatively reasonable range.

Parametric Assumptions
This paper constructs a government-enterprise carbon emission reduction model and explains the game parameters between government and enterprise, which provides the basis for the subsequent model solution and analysis (see Table 1).
Table 1. Parameter symbols and the meaning of the parameters.
Parameter | Meaning of the Parameters
$\mu_1 E$ | Comprehensive benefits of the enterprise when enterprises take carbon emission reduction
$\mu_2 E$ | Comprehensive benefits of the enterprise when enterprises take No emission reduction
$E_{g1}$ | Government benefits when enterprises reduce emission
$E_{g2}$ | Government benefits when enterprises do not reduce emission
$C_1$ | Cost of enterprise emission reduction
$C_2$ | Cost of government regulation
$S$ | Government emission reduction subsidies
$F$ | Fines imposed by the government
$R_1$ | Constraints of the public will on the government
$R_2$ | Constraints of public will on enterprises
$R_3$ | Constraints of the will of the enterprises on the government
$\phi$ | Allocation coefficient of carbon trading volume
$P$ | Carbon trading price
$T$ | Carbon allowances allocated for free ($T > T_1$)
$T_1$ | Carbon emissions when enterprises reduce emissions
$T_2$ | Carbon emissions when enterprises do not reduce emissions
$xP_1$ | Revenue from the sale of carbon allowances
$(1 - x)P_2$ | Purchase cost of carbon allowances

Research Hypotheses
(1) The government is participant 1 in carbon emission reduction, and high-pollution enterprises are participant 2. Both parties are assumed to be boundedly rational, and each chooses its strategy in response to the behavior of the other party on the basis of limited information.
(2) If high-pollution enterprises adopt carbon emission reduction strategies, they need to upgrade equipment and adopt digital technologies. In this case, enterprises are assumed to receive green benefits $\mu_1 E$, while the development of digital technology raises their internal costs by $C_1$. If high-pollution enterprises instead adopt the No emission reduction strategy, they gain $\mu_2 E$. In the long run, the benefits of emission reduction far exceed the benefits of No emission reduction, that is, $\mu_1 E > \mu_2 E$.
(3) The strategies chosen by the government for high-pollution enterprises include subsidies and penalties. When high-pollution enterprises adopt active emission reduction strategies, the government conducts active supervision and rewards them with a subsidy $S$. Conversely, the government imposes administrative penalties on high-pollution enterprises that do not reduce emissions and charges a fine $F$. When the government adopts active supervision, it must spend human and material capital to supervise high-pollution enterprises, and it must also introduce incentives and punishments to encourage high-pollution …
(4) Under the constraint of public will, the behavior strategies of both government and enterprise will change. If high-pollution enterprises do not reduce emissions and the government does not regulate, the credibility of both will be affected, and the government's loss is $R_1$. If high-pollution enterprises do not reduce emissions, the public will condemn their behavior, and the damage to high-pollution enterprises from public opinion supervision is $R_2$. If high-pollution enterprises actively reduce emissions but the government does not supervise them, high-pollution enterprises can appeal to the government through industry associations and other channels. The government then needs to compensate the high-pollution enterprises for the lack of supervision, and the compensation amount is the loss of the government, denoted as $R_3$.
(5) To achieve the carbon emission standards of high-pollution enterprises, this paper introduces carbon emission trading (hereinafter referred to as "carbon trading") to analyze its impact on the game results between government and enterprise. Let the carbon quota allocated for free be $T$ and the carbon trading price be $P$, and let $P_1 = P(T - T_1)$. When the carbon emission of the high-pollution enterprise is $T_1$, which is lower than the carbon quota allocated for free, the enterprise can sell the excess carbon allowances to make a profit, so $P_1$ can be regarded as the profit of the high-pollution enterprise. Similarly, let $P_2 = P(T_2 - T)$, which is the cost that high-pollution enterprises need to pay when they take No emission reduction measures. The proportion of high-pollution enterprise groups adopting carbon emission reduction strategies is $x$; then $xP_1$ can be regarded as the income of high-pollution enterprises from the sale of carbon allowances, and $(1 - x)P_2$ as their cost of purchasing carbon allowances. From the government's point of view, $\phi$ is the allocation coefficient of carbon trading volume; if the government takes a positive attitude and supervises high-pollution enterprises, it can obtain profits through carbon trading, such as transaction fees, denoted as $\phi P_1$ and $\phi P_2$.
(6) Based on the assumptions of the model, under the constraints of public will, the government and high-pollution enterprises will not take an absolutely negative or positive attitude.
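The carbon trading terms in hypothesis (5) can be illustrated with a short calculation. The following is a minimal sketch, not the authors' code: the function names are hypothetical, `share_reducing` stands for the proportion written as x in hypothesis (5), and the numbers are the baseline values quoted later in the simulation section (P = 25 yuan, T = 46, T1 = 44, T2 = 48 in units of 10^7 tons, ϕ = 0.1).

```python
def allowance_sale_revenue(price, quota, emissions_reducing):
    """P1 = P * (T - T1): revenue from selling surplus allowances."""
    return price * (quota - emissions_reducing)


def allowance_purchase_cost(price, quota, emissions_not_reducing):
    """P2 = P * (T2 - T): cost of buying the missing allowances."""
    return price * (emissions_not_reducing - quota)


def expected_trading_scale(share_reducing, p1, p2):
    """x * P1 + (1 - x) * P2, the trading scale named in the text."""
    return share_reducing * p1 + (1 - share_reducing) * p2


def government_fee(phi, amount):
    """phi * P1 or phi * P2: the government's share of a trade."""
    return phi * amount


# Baseline values taken from the simulation section (quantities in 10^7 tons).
P, T, T1, T2, PHI = 25, 46, 44, 48, 0.1

p1 = allowance_sale_revenue(P, T, T1)    # in 10^7 yuan
p2 = allowance_purchase_cost(P, T, T2)   # in 10^7 yuan
print(p1, p2, expected_trading_scale(0.5, p1, p2), government_fee(PHI, p1))
```

With these inputs both P1 and P2 equal 50 (×10^7 yuan), which agrees with the values stated in the simulation section.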

Construction and Analysis of Evolutionary Model
Based on the above assumptions, four evolutionary game strategy combinations are formed between the government and high-pollution enterprises: (active supervision, emission reduction), (active supervision, No emission reduction), (non-supervision, emission reduction) and (non-supervision, No emission reduction). The payoff matrix for the evolutionary game of carbon emission reduction between government and enterprises is shown in Table 2.

Construction and Analysis of the Model
If the probability of the government's active supervision is $x$, the probability of non-supervision is $(1 - x)$. The probability of carbon emission reduction by high-pollution enterprises is $y$, and the probability of No emission reduction is $(1 - y)$, where $x \in (0, 1)$, $y \in (0, 1)$, and both $x$ and $y$ vary with time $t$. The replication dynamic equation describes the rate of change of the frequency with which a game subject chooses a specific strategy, and it can be used to analyze the strategy choices of the government and high-pollution enterprises [39,40]:
$$\frac{dw(t)}{dt} = w(t)\,\left[E_t(A) - \bar{E}\right]$$
Specifically, $dw(t)/dt$ represents the growth rate of the proportion of game subjects choosing a certain strategy $A$ at time $t$, and $w(t)$ represents the proportion of game subjects choosing strategy $A$, where the game subjects are the government or the high-pollution enterprises. $E_t(A)$ is the expected return of choosing strategy $A$, and $\bar{E}$ is the average return of the two strategies.

Construction of the Model
If the government's expected return for "active supervision" is $E_{11}$, its expected return for "non-supervision" is $E_{12}$, and the average government return is $E_1$, these quantities follow from the payoff matrix in Table 2. The replication dynamic equation of the proportion of active supervision by government departments is then formed [41], and both sides of that equation are differentiated with respect to $x$. Likewise, if the high-pollution enterprise chooses to actively reduce carbon emissions, its expected benefit is $E_{21}$; if it does not reduce emissions, its expected benefit is $E_{22}$; and the average benefit of the high-pollution enterprise is $E_2$. The expected benefits of carbon emission reduction and "No emission reduction" of high-pollution enterprises are given in Equations (7) and (8), and the average benefit is given in Equation (9).
The replication dynamic equation of the carbon emission reduction ratio selected by the group of high-pollution enterprises is formed in the same way [42], and both sides of it are differentiated with respect to $y$.
Sustainability 2022, 14, 12647
We use the Jacobian matrix to judge the stability of the evolutionary game system of government and enterprise carbon emission reduction [43]. If the determinant of the matrix is greater than zero and its trace is less than zero, the equilibrium point is stable. This evolutionary game system has five local equilibrium points, namely A (0, 0), B (0, 1), C (1, 0), D (1, 1) and E ($x_0$, $y_0$). The Jacobian matrix of the system is obtained from Equations (5) and (10).
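The payoff-specific expressions depend on the entries of Table 2, so they are not restated here. As a hedged sketch only, the generic structure that these equations take for any game with two populations and two strategies each can be written with the symbols defined above; $F(x)$ and $F(y)$ are shorthand assumed here for $dx/dt$ and $dy/dt$.

```latex
\begin{aligned}
E_1 &= x\,E_{11} + (1 - x)\,E_{12}, \\
F(x) &= \frac{dx}{dt} = x\,(E_{11} - E_1) = x(1 - x)(E_{11} - E_{12}), \\
E_2 &= y\,E_{21} + (1 - y)\,E_{22}, \\
F(y) &= \frac{dy}{dt} = y\,(E_{21} - E_2) = y(1 - y)(E_{21} - E_{22}), \\
J &= \begin{pmatrix}
\dfrac{\partial F(x)}{\partial x} & \dfrac{\partial F(x)}{\partial y} \\[2ex]
\dfrac{\partial F(y)}{\partial x} & \dfrac{\partial F(y)}{\partial y}
\end{pmatrix}, \\
\det J &= \frac{\partial F(x)}{\partial x}\,\frac{\partial F(y)}{\partial y}
        - \frac{\partial F(x)}{\partial y}\,\frac{\partial F(y)}{\partial x},
\qquad
\operatorname{tr} J = \frac{\partial F(x)}{\partial x} + \frac{\partial F(y)}{\partial y}.
\end{aligned}
```

The factorized forms follow by substituting the average return into the replication dynamic equation. An equilibrium point is locally stable when $\det J > 0$ and $\operatorname{tr} J < 0$, which is the criterion stated in the text.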

Analysis of the Model
The determinant and trace of the Jacobian matrix at each equilibrium point of the government and enterprise carbon emission reduction game system are calculated according to Formula (12), as shown in Table 3.
Table 3. Determinants and traces of the Jacobian matrix at the equilibrium points of the carbon emission reduction evolutionary game between government and enterprise.
Local Equilibrium Point | TrJ | Symbol | DetJ | Symbol | Stability
In Table 3, Y represents the evolutionary stability strategy, N represents the instability state, and ND indicates the absence of an evolutionary stability strategy.
According to the above descriptions, Figure 2 presents the replication dynamic relationship between the government and high-pollution enterprises in one coordinate system. Depending on the parameter conditions, the evolution of the game falls into the following cases:
(1) The game between the two sides converges to (0, 0), and the government and high-pollution enterprises choose the non-supervision and No emission reduction strategies, as shown in Figure 2a;
(2) The game between the two sides converges to (0, 1), and the government and high-pollution enterprises choose the non-supervision and carbon emission reduction strategies, as shown in Figure 2b;
(3) The game between the two sides converges to the equilibrium point (1, 0), and the government and high-pollution enterprises choose the active supervision and No emission reduction strategies, as shown in Figure 2c;
(4) The equilibrium point is (1, 1), and the government and high-pollution enterprises choose the active supervision and emission reduction strategies, as shown in Figure 2d;
(5) Both government and enterprise may also have opportunistic tendencies and may choose negative behavior. The system then has two evolutionary stability points, (0, 0) and (1, 1), as shown in Figure 2e;
(6) In Figure 2f, the government and high-pollution enterprises choose strategies according to each other's behavior, and no evolutionary stability strategy exists.
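The determinant and trace test used to classify the corner equilibria can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's calculation: the two payoff-gap functions `gov_gap` (standing for E11 − E12) and `ent_gap` (standing for E21 − E22) are hypothetical linear examples, because the actual payoff entries of Table 2 are not reproduced in the text, and the labels returned by `classify` are chosen here rather than taken from Table 3.

```python
def classify(det_j, tr_j):
    """Sign test for a 2x2 Jacobian evaluated at an equilibrium point."""
    if det_j > 0 and tr_j < 0:
        return "stable"        # evolutionary stable strategy
    if det_j > 0 and tr_j > 0:
        return "unstable"
    if det_j < 0:
        return "saddle"
    return "undetermined"      # zero determinant or zero trace


def corner_det_trace(x, y, gov_gap, ent_gap):
    """Determinant and trace of the Jacobian at a corner (x, y in {0, 1}).

    For F(x) = x(1-x)*gov_gap(y) and F(y) = y(1-y)*ent_gap(x), the
    off-diagonal entries carry a factor x(1-x) or y(1-y) and vanish at
    the corners, so the Jacobian there is diagonal.
    """
    j11 = (1 - 2 * x) * gov_gap(y)
    j22 = (1 - 2 * y) * ent_gap(x)
    return j11 * j22, j11 + j22


# Hypothetical payoff gaps, chosen only to illustrate the test.
gov_gap = lambda y: 2.0 - y        # assumed E11 - E12
ent_gap = lambda x: 1.0 + 2.0 * x  # assumed E21 - E22

corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
results = {pt: classify(*corner_det_trace(*pt, gov_gap, ent_gap)) for pt in corners}
for pt in corners:
    print(pt, results[pt])
```

With these assumed gaps only (1, 1) passes the test, which corresponds to the case in which the game converges to (active supervision, emission reduction); other payoff gaps would produce the other cases.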

Simulation Analysis
Using the MATLAB platform, this paper selects Guangdong carbon market transaction data for simulation analysis. The data on the carbon quota ($T$), the emission reduction cost of high-pollution enterprises ($C_1$), the government regulation cost ($C_2$), the carbon trading price ($P$) and government emission reduction subsidies come from the Guangdong Carbon Emissions Trading Center; the remaining data are set with reference to the parameter settings in References [44–47] and the China Carbon Emissions Trading Network. From 2017 to 2021, the average carbon price in Guangdong Province fluctuated around 25 yuan, and the average value of corporate carbon allowances ($T$) was around 460 million tons. Therefore, this paper sets the carbon trading price at 25 yuan and the initial value of carbon allowances ($T$) at $46 \times 10^7$ tons. The cumulative trading volume of carbon emission allowances is about $2 \times 10^7$ tons, so the initial values of $(T - T_1)$ and $(T_2 - T)$ are set to $2 \times 10^7$ tons. Both $P_1$ and $P_2$ are then $50 \times 10^7$ yuan, and the allocation coefficient of carbon trading volume ($\phi$) is set to 0.1. Following the research ideas of Zhong et al. [44], $C_1$ is set to 60 yuan per ton and $C_2$ to 50 yuan per ton. Zhe et al. [45] pointed out that high-pollution enterprises taking emission reduction measures are about 20% more profitable than those not taking such measures. Drawing on this result, this paper sets the comprehensive benefits of enterprises that reduce emissions ($\mu_1 E$) at 100 yuan per ton and those of enterprises that do not reduce emissions ($\mu_2 E$) at 80 yuan per ton. Following the research ideas of Wu et al. [46], this paper sets the loss to the government from the constraint of public will ($R_1$) at about 10% of the non-abatement revenue of high-pollution enterprises ($\mu_2 E$), that is, 8 yuan per ton. Referring to the research results of Ojansivu et al. [47], the loss ($R_2$) caused by the constraint of public will on high-pollution enterprises is about 20% of their non-abatement revenue ($\mu_2 E$) and is set at 16 yuan per ton. Data from the National Bureau of Statistics show that the output value of high-pollution enterprises in the country's heavy industry has accounted for about 68% of total industrial output value in recent years, so $R_3$ is set at about 68 yuan per ton.
The benefits $E_{g1}$ and $E_{g2}$ depend on whether enterprises reduce emissions, so they are excluded here; based on the above, the initial values of the other parameters are set as shown in Table 4.
Table 4. Initial parameter assignments.
In Table 4, all parameters except $P$, $T$, $T_1$, $T_2$, $P_1$ and $P_2$ are multiplied by $10^7$ tons, and all parameters except $\phi$ are expressed in units of $1 \times 10^7$ yuan. These initial values may, to a certain extent, reflect the initial operation of the Guangdong carbon trading market.
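For reference, the parameter values stated in the text above can be collected in one place and cross-checked against the percentages quoted for them. This is a minimal sketch: the dictionary keys are names chosen here, the values are per-ton or baseline figures exactly as quoted in the text, and the subsidy S and fine F are left out because their initial values are not stated in the text.

```python
import math

# Values quoted in the text (T, T1, T2 in 10^7 tons; the rest in yuan per ton).
baseline = {
    "P": 25, "T": 46, "T1": 44, "T2": 48, "phi": 0.1,
    "C1": 60, "C2": 50,
    "mu1E": 100, "mu2E": 80,
    "R1": 8, "R2": 16, "R3": 68,
}

# Derived carbon trading terms, in 10^7 yuan.
baseline["P1"] = baseline["P"] * (baseline["T"] - baseline["T1"])
baseline["P2"] = baseline["P"] * (baseline["T2"] - baseline["T"])


def ratio(name, reference):
    """Ratio between two stored parameters, used to check quoted shares."""
    return baseline[name] / baseline[reference]


print("R1 / mu2E   =", ratio("R1", "mu2E"))    # text: about 10%
print("R2 / mu2E   =", ratio("R2", "mu2E"))    # text: about 20%
print("mu1E / mu2E =", ratio("mu1E", "mu2E"))
print("P1, P2      =", baseline["P1"], baseline["P2"])
```

The ratio of 100 to 80 is 1.25, while 80 is 20% below 100; the "about 20%" quoted from [45] is therefore matched when the gap is measured relative to the benefits with emission reduction.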
The carbon trading price is $P = 25$ and the freely allocated carbon quota $T$ is 46 ($\times 10^7$ tons). If high-pollution enterprises actively reduce carbon emissions, $T_1$ is 44 ($\times 10^7$ tons); if they do not reduce emissions, $T_2$ is 48 ($\times 10^7$ tons). The probability that high-pollution enterprises adopt carbon emission reduction is $x$, so $xP_1$ is the scale of carbon quota that may be sold; the probability that they do not reduce carbon emissions is $(1 - x)$, so $(1 - x)P_2$ is the scale of carbon quota that they may buy, and the carbon quota trading scale is $xP_1 + (1 - x)P_2$. According to the above analysis, the maximum income from selling carbon quota is $50 \times 10^7$ yuan, and the maximum purchase cost is $50 \times 10^7$ yuan. This setting corresponds to the early stage of carbon emission reduction, when the scale of carbon trading is small. High-pollution enterprises are therefore reluctant to reduce emissions and obtain limited profits, and the government does not actively regulate. The evolution path of carbon emission reduction of high-pollution enterprises in this case is shown in Figure 3.
The horizontal axis represents time ($t$), and the vertical axis $y$ represents the probability of carbon emission reduction by high-pollution enterprises over time. This paper divides the initial intention of government and enterprise into low, middle and high levels, namely 0.2, 0.5 and 0.8. As can be seen from Figure 3, when the government's willingness to regulate is low ($x = 0.2$), the probability of carbon emission reduction is the lowest, and it reaches 1 at $t = 2$. When the probability of active government regulation $x$ is 0.5, the carbon emission reduction probability of high-pollution enterprises reaches 1 at $t = 1.5$. When $x = 0.8$, the probability of carbon emission reduction of high-pollution enterprises quickly reaches 1 ($t = 0.7$).
In this paper, the dynamic evolution scenario is set, and the simulation is carried out from three aspects: scientific adjustment of carbon quota, dynamic reward and punishment mechanism and adjustment of carbon emission right price.
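The kind of simulation described here can be sketched outside MATLAB as well. The following Python sketch integrates a generic two-population replicator system with an explicit Euler scheme for the three initial government intentions 0.2, 0.5 and 0.8. It is an illustration under loud assumptions, not the authors' code: the payoff-gap functions `gov_gap` and `ent_gap` are hypothetical, the initial enterprise probability 0.5 is chosen arbitrarily, and the resulting times are not those of Figure 3.

```python
def simulate(x0, y0, gov_gap, ent_gap, dt=0.001, t_end=10.0):
    """Euler integration of dx/dt = x(1-x)*gov_gap(y), dy/dt = y(1-y)*ent_gap(x)."""
    x, y, t = x0, y0, 0.0
    path = [(t, x, y)]
    for _ in range(int(round(t_end / dt))):
        dx = x * (1 - x) * gov_gap(y)
        dy = y * (1 - y) * ent_gap(x)
        x, y, t = x + dt * dx, y + dt * dy, t + dt
        path.append((t, x, y))
    return path


def first_time_above(path, level=0.99):
    """First time at which the enterprise probability y reaches `level`."""
    for t, _, y in path:
        if y >= level:
            return t
    return None


# Hypothetical payoff gaps (assumptions, not values from Table 2).
gov_gap = lambda y: 2.0 - y        # assumed E11 - E12
ent_gap = lambda x: 1.0 + 2.0 * x  # assumed E21 - E22

hitting_times = {
    x0: first_time_above(simulate(x0, 0.5, gov_gap, ent_gap))
    for x0 in (0.2, 0.5, 0.8)
}
for x0, t_hit in hitting_times.items():
    print(f"x0 = {x0}: y reaches 0.99 at t = {t_hit:.2f}")
```

Under these assumed gaps, a higher initial probability of government supervision brings the enterprise probability to its upper level sooner, which is the same qualitative ordering that the text reports for Figure 3.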

Dynamic Evolution Scenario 1, Scientific Adjustment of Carbon Quotas
Increasing the initial free carbon quota allows high-pollution enterprises to obtain more revenue and pay less cost, so they are more willing to actively reduce carbon emissions. Assuming that the carbon quota $T$ is increased from 46 to 47, Figure 4 shows the evolutionary game state (supervision, emission reduction) formed between government and enterprise.
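As a worked check of why a larger quota raises revenue and lowers cost, the carbon trading terms can be recomputed for $T = 47$. This sketch assumes that the price $P = 25$ and the emission levels $T_1 = 44$ and $T_2 = 48$ stay at their baseline values, which the text does not state explicitly.

```latex
\begin{aligned}
P_1 &= P\,(T - T_1) = 25 \times (47 - 44) = 75 \quad (\times 10^{7}\ \text{yuan}), \\
P_2 &= P\,(T_2 - T) = 25 \times (48 - 47) = 25 \quad (\times 10^{7}\ \text{yuan}).
\end{aligned}
```

Compared with the baseline values $P_1 = P_2 = 50$, the sale revenue of enterprises that reduce emissions rises by 25 and the purchase cost of enterprises that do not reduce emissions falls by 25 (both in units of $10^7$ yuan).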

Figure 4 depicts the dynamic evolution path of high-pollution enterprises in situation (4). Compared with Figure 3, when the probability of active government regulation $x$ is 0.2, the time that the emission reduction probability of high-pollution enterprises takes to reach 1 is shortened. High-pollution enterprises speed up carbon emission reduction, obtain more surplus carbon quotas, and earn more carbon emission reduction income.

In this situation, the government adopts active supervision to encourage high-pollution enterprises to increase carbon emission reduction, so that a (supervision, carbon emission reduction) strategy is formed between the government and high-pollution enterprises.
Suppose that $P = 30$, $T = 46$, $T_1 = 44$, $T_2 = 47$, $P_1 = 60$ and $P_2 = 30$. The case considered is in line with scenario (4): as the scale of carbon trading increases, the carbon price also increases (see Figure 5).
Figure 5 depicts the evolutionary game path between the government and high-pollution enterprises in situation (4). The red broken line represents the dynamic evolution path when the initial state of the government is 0.2. The behavior probabilities of the government and high-pollution enterprises quickly converge to 1. However, when the government's initial willingness for active supervision is very low, the emission reduction probability of high-pollution enterprises declines briefly and then quickly rises to 1. The reason may be that high-pollution enterprises exploit government regulatory loopholes, adopt the No emission reduction strategy, and obtain excess profits at the cost of environmental damage. When the government changes from non-supervision to active supervision, however, high-pollution enterprises change their business models and actively conduct carbon emission reduction. In this case, a (supervision, carbon emission reduction) strategy is formed between the government and high-pollution enterprises.

Dynamic Evolution Scenario 2, Financial Subsidies and Government Punishments
The government cannot fully grasp all the strategic selection trends of high-pollution enterprises and can only guide enterprises toward active carbon emission reduction strategies by providing financial subsidies or imposing penalties according to the information available to it. In this part, the expected profits of enterprises with x = 0.2 (y = 0.2, 0.5, 0.8) are considered. Other parameters are adjusted according to the evolution scenario (see Figures 6-10).
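As an illustration of how trajectories from x = 0.2 and y = 0.2, 0.5, 0.8 can be generated, the following is a minimal sketch of two-population replicator dynamics. The payoff structure (a simple inspection game) and the cost parameters C_GOV and C_ENT are assumptions made for this sketch and are not the payoff matrix used in this paper; only S = 25 and F = 30 are borrowed from the parameter values discussed in this section. In this toy system the trajectories cycle around an interior point, which is the kind of periodic behavior a fixed subsidy and penalty can produce.

```python
import math

# Hypothetical payoff parameters for illustration only (not the paper's payoff matrix).
S = 25.0      # subsidy received by a supervised enterprise that reduces emissions
F = 30.0      # penalty paid by a supervised enterprise that does not reduce
C_GOV = 10.0  # assumed cost of active supervision
C_ENT = 20.0  # assumed cost of emission reduction


def rhs(x, y):
    """Replicator equations; x = P(government supervises), y = P(enterprise reduces)."""
    dx = x * (1 - x) * ((F - C_GOV) - (S + F) * y)
    dy = y * (1 - y) * ((S + F) * x - C_ENT)
    return dx, dy


def simulate(x0, y0, t_end=3.0, dt=0.001):
    """Integrate the system with classical fourth-order Runge-Kutta steps."""
    x, y = x0, y0
    path = [(0.0, x, y)]
    for step in range(1, int(round(t_end / dt)) + 1):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        path.append((step * dt, x, y))
    return path


def invariant(x, y):
    """Quantity conserved by this toy system; constant along a closed orbit."""
    a, b, c = F - C_GOV, S + F, C_ENT
    return (a * math.log(y) + (b - a) * math.log(1 - y)
            + c * math.log(x) + (b - c) * math.log(1 - x))


# Government starts at x = 0.2; three enterprise types start at y = 0.2, 0.5, 0.8.
paths = {y0: simulate(0.2, y0) for y0 in (0.2, 0.5, 0.8)}
for y0, path in paths.items():
    ys = [y for _, _, y in path]
    print(f"y0={y0}: min y={min(ys):.3f}, max y={max(ys):.3f}")
```

The conserved quantity is specific to this assumed payoff structure; it is included only as a check that the integrator follows a closed orbit rather than drifting.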
Figure 6 depicts the dynamic evolution path between the government and high-pollution enterprises under the previous research hypothesis (5). Figure 6 shows that when the government subsidy (S) is low, regulation is not strict and the probability of carbon emission reduction is also low. When t = 0.6, the emission reduction probability of high-pollution enterprises tends to 0 for the first time (y→0). With subsidy incentives, the probability of carbon emission reduction increases. When t = 1.2, the emission reduction probability of high-pollution enterprises reaches the first peak (y = 0.2); when t = 1.4, it reaches the lowest point. However, as subsidies increase, the probability of carbon emission reduction rebounds rapidly. When t = 2.3, the probability of corporate carbon emission reduction reaches the first peak (y = 0.4). When t = 1.6, the emission reduction probability of high emission reduction enterprises tends to 0 for the first time (y→0), and then, as the subsidy level increases, the carbon emission reduction probability reaches its first peak at t = 2.7 (y = 0.4). When the government subsidy increases to 40, the time at which the emission reduction probability of the three different carbon emission reduction types tends to 0 (y→0) is shortened, the time to reach the peak value is shortened, and the peak value improves to different degrees (see Figure 9b).
It can be seen that increasing government subsidies can raise the probability of carbon emission reduction of high-pollution enterprises, but it cannot eliminate the cyclical changes in that probability.
Figure 7a depicts the dynamic evolution path of the government and enterprises under the previous research hypothesis (5). When the government punishment (F) is low, the carbon emission reduction probability of the different types of enterprises (y = 0.2, 0.5, 0.8) first increases and then rapidly tends to 0, showing a certain periodicity. The cycle of returning to zero is shorter than under fiscal subsidies, and the maximum peaks of carbon emission reduction converge. Figure 7b depicts the dynamic evolution path of the government-enterprise evolutionary game in situation (3). When F increases from 30 to 60, the carbon emission reduction probability of the different types of high-pollution enterprises quickly converges to the lowest point (y→0). This suggests that when the punishment measures were strengthened, many high-pollution enterprises were forced to shut down because they could not bear the high fines.
Figure 8a depicts the evolutionary game path of the government and enterprises in situation (6). When the financial subsidy is low (S = 25), the probability of active government supervision fluctuates between 0 and 1 in most cases, showing a certain periodicity. Moreover, the greater the probability that the government implements regulation, the more likely it is to obtain the maximum benefits. Figure 8b depicts the dynamic evolution path of the government-enterprise evolutionary game in situation (3). When the financial subsidy (S) is increased to 40, the probability of active government supervision quickly tends to 1. This shows that increasing fiscal subsidies (S) helps to extend the "window period" for the government to actively supervise carbon emission reduction.
The government can achieve the goal of carbon reduction by high-pollution enterprises by raising S, and financial subsidies have become an effective tool to encourage carbon reduction by high-pollution enterprises.
Figure 9 depicts the evolution path of the government and enterprises in situation (3). As shown in Figure 9a, when the government punishment is low (F = 30), the probability of the government's active supervision first rises and then falls. When t = 0.5, the government's active supervision probability tends to 0. When the carbon emission reduction probability of high-pollution enterprises is 0.8, the government takes the initiative to supervise. When t = 0.4, the government's active supervision probability tends to 1. When the punishment intensity is increased to 60 (Figure 9b), whatever carbon emission reduction probability the enterprise chooses, the probability of the government's active supervision increases rapidly. When t = 0.05, the carbon emission reduction probability quickly converges to 1 and does not change thereafter. Evidently, with the introduction of a dynamic punishment mechanism, high-pollution enterprises of every type quickly take carbon emission reduction actions, and the government does not lack the motivation for active supervision.
Figure 10 describes the dynamic path of the government-enterprise evolutionary game in situation (5). We found that whatever form of carbon emission reduction high-pollution enterprises take, once dynamic subsidies and penalties are introduced, the probability of carbon emission reduction eventually converges to 1. The difference is that when the carbon emission reduction probability of high-pollution enterprises is very high (y = 0.8), the evolution path of the government and enterprises first decreases and then increases before finally converging to 1.
When the probability of carbon emission reduction is lower, the emission reduction probability of high-pollution enterprises tends to 1 faster. Evidently, with the introduction of the dynamic reward and punishment mechanism, the government does not necessarily need to pay additional subsidies. As long as the reward and punishment strategies are properly selected, the tacit cooperation between the government and enterprises increases, high-pollution enterprises do not lack the motivation to reduce carbon emissions, and the government achieves the expected regulatory results.
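One way to see why a dynamic penalty can suppress the cycles observed under fixed subsidies and penalties is to compare the local stability of the interior equilibrium in the two cases. The sketch below does this for a toy inspection game, using a numerical Jacobian and the trace-determinant test. The payoff structure, the cost parameters c_gov and c_ent, and the penalty rule F(y) = f_max(1 − y) are all assumptions made for illustration and are not the model of this paper. In this toy system the damped trajectories settle at an interior point; whether they settle at 1, as reported for Figure 10, depends on the actual payoff matrix, which is not reproduced here.

```python
def static_rhs(x, y, s=25.0, f=30.0, c_gov=10.0, c_ent=20.0):
    """Toy replicator equations with a fixed penalty f (assumed payoffs)."""
    dx = x * (1 - x) * ((1 - y) * f - c_gov - s * y)
    dy = y * (1 - y) * (x * (s + f) - c_ent)
    return dx, dy


def dynamic_rhs(x, y, s=25.0, f_max=60.0, c_gov=10.0, c_ent=20.0):
    """Same toy game, but the penalty shrinks as more enterprises reduce (assumed rule)."""
    f = f_max * (1 - y)
    dx = x * (1 - x) * ((1 - y) * f - c_gov - s * y)
    dy = y * (1 - y) * (x * (s + f) - c_ent)
    return dx, dy


def jacobian(rhs, x, y, h=1e-6):
    """Central-difference Jacobian of rhs at (x, y), returned as two rows."""
    fxp, fxm = rhs(x + h, y), rhs(x - h, y)
    fyp, fym = rhs(x, y + h), rhs(x, y - h)
    row_x = ((fxp[0] - fxm[0]) / (2 * h), (fyp[0] - fym[0]) / (2 * h))
    row_y = ((fxp[1] - fxm[1]) / (2 * h), (fyp[1] - fym[1]) / (2 * h))
    return row_x, row_y


def classify(rhs, x, y, tol=1e-6):
    """Classify an equilibrium of a planar system by trace and determinant."""
    (a, b), (c, d) = jacobian(rhs, x, y)
    trace, det = a + d, a * d - b * c
    if det < -tol:
        return "saddle"
    if det > tol and trace < -tol:
        return "asymptotically stable"
    if det > tol and trace > tol:
        return "unstable"
    if det > tol:
        return "center (linearization only)"  # trace is numerically zero
    return "degenerate"


# Interior equilibria of the two toy systems, solved by hand for these parameters.
static_eq = (20.0 / 55.0, 20.0 / 55.0)
dynamic_eq = (1.0 / 3.0, 5.0 / 12.0)

print("fixed penalty:  ", classify(static_rhs, *static_eq))
print("dynamic penalty:", classify(dynamic_rhs, *dynamic_eq))
```

With a fixed penalty the linearization has zero trace, so the equilibrium is neutrally stable and nearby trajectories keep cycling. Making the penalty depend on y adds a negative diagonal term to the Jacobian, which is what damps the oscillation in this sketch.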

Dynamic Evolution Scenario 3, Adjusting the Price of Carbon Emission Rights
To stimulate high-pollution enterprises to adopt active carbon emission reduction strategies and avoid the negative effects of government non-supervision, the carbon emission price P is adjusted. Raising carbon prices increases both the maximum cost of buying carbon quotas (xP1) and the cost of buying carbon quotas (1 − x)P2. It not only encourages high-pollution enterprises to implement carbon emission reduction but also increases the government's benefits from carbon emission reduction and strengthens its active supervision, thereby realizing the dynamic evolution of carbon emission reduction between the government and enterprises. For this reason, we assume that the carbon price P increases from 25 to 60 and that P1 = P2 = 120, with other conditions unchanged (see Figure 11).

Figure 11. Dynamic evolution path of government-enterprise game (adjusting carbon emission price).
When the initial state of the government is very low (x = 0.1), high-pollution enterprises actively reduce carbon emissions and obtain greater benefits (see Figure 11). From the perspective of maximizing benefits, both the government and enterprises will adopt active carbon emission reduction strategies, so that each obtains greater benefits, and will actively promote the realization of the "dual carbon" goal.
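The effect of the price adjustment on quota purchase costs can be illustrated with a short calculation. The sketch below reads xP1 and (1 − x)P2 as the two components of an expected unit price, with x the probability of government supervision. That reading, and the use of P1 = 60 and P2 = 30 from Scenario 1 as the baseline before the adjustment, are assumptions made for illustration; the function name is hypothetical.

```python
def expected_quota_price(x, p1, p2):
    """Expected unit price of carbon quotas: p1 with probability x, p2 otherwise."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be a probability in [0, 1]")
    return x * p1 + (1 - x) * p2


# Low initial supervision probability, as in the Figure 11 discussion.
x0 = 0.1
before = expected_quota_price(x0, 60, 30)    # assumed baseline from Scenario 1
after = expected_quota_price(x0, 120, 120)   # adjusted prices, P1 = P2 = 120
print(f"expected price before: {before:.1f}, after: {after:.1f}")
```

Because P1 = P2 after the adjustment, the expected price no longer depends on x in this reading, which is consistent with enterprises reducing emissions regardless of the government's regulatory choice.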

Conclusions
This paper dynamically analyzes the possible decisions made by the government and enterprises in the context of carbon trading through an evolutionary game model. It conducts a simulation analysis using carbon trading market data from Guangdong Province and seeks the optimal strategy for reducing the carbon emissions of high-pollution enterprises. The following conclusions are drawn from the simulation.
First, in the high-carbon market, scientific carbon quota allocation shortens the realization time of enterprises' emission reduction probability, as discussed by [19,42]. This study finds that the scientific adjustment of carbon quotas shortens the time t to achieve the probability of carbon emission reduction for high-pollution enterprises by 0.7 when the probability of government supervision is low. It also improves the degree of carbon quotas so that high-pollution enterprises obtain additional surplus carbon quotas and win further carbon emission reduction income.
Second, the carbon emission reduction decisions of high-carbon enterprises are more significantly affected by adjustments to government subsidy intensity and the violation penalty ratio [48]. This paper shows that although increasing financial subsidies can raise the carbon emission reduction probability of high polluters, it cannot prevent cyclical changes in that probability. When financial subsidies are low, the probability of active government supervision changes cyclically most of the time and has no steady state. When financial subsidies are increased, the "window period" for government regulation of carbon emission reduction is extended.
Third, when government penalties are low, the carbon reduction probability of high polluters varies cyclically and cannot reach equilibrium. If the penalty is increased, the carbon emission reduction probability of high polluters rapidly tends to zero. When the dynamic penalty mechanism is introduced, high-pollution enterprises quickly take carbon emission reduction actions, and thus the government does not necessarily need to pay additional subsidies. When incentives and penalties are imposed by local government, they may directly affect the outcome of the game regardless of how enterprises take carbon emission reduction measures, as found by [32]. As long as the reward and punishment strategy is properly chosen, tacit cooperation between the government and high-pollution enterprises is enhanced, and both reach the win-win goal. That result is also observed in this paper.
Fourth, when the price of carbon emission rights is adjusted, high-pollution enterprises actively reduce carbon emissions and gain other benefits from doing so, regardless of the regulatory measures taken by the government. From the perspective of profit maximization, both governments and enterprises will adopt aggressive carbon emission reduction strategies. However, a prior study indicates that different types of enterprises are affected by the carbon emission rights trading price to different degrees, and the emission reduction effect differs significantly across enterprises [35].
This research still has the following limitations.
(1) The public is one of the important subjects in the evolutionary game strategy selection of government and enterprise carbon emission reduction under the "dual carbon" goal, but quantifying the constraints of public will on the government and enterprises is difficult. (2) Only financial subsidies and penalties are considered in policy choices. With the continuous emergence of low-carbon and environmental protection industries, more policy factors need to be considered in future research to obtain richer results. (3) Given that it is difficult to quantify the constraints of public will on the government, and given the limited space of the article, complex sensitivity analysis will be the focus of future research.
In the future, we should (1) further refine the high-pollution industries, conduct specific research on industries such as coal, electricity, and oil, and further disaggregate the causes of high emissions, so that valuable suggestions can be formed industry by industry. (2) Following existing research, the simulation is carried out in a small network, whereas the actual network structure of high-pollution enterprises is more complex. Therefore, we will collect additional data to fully characterize the network topology between the government and enterprises and expand the research from a spatial perspective.

Conflicts of Interest:
The authors declare no conflict of interest.