Behavioral Game Theory Model in Pollution Control with Additional Supervision

This work studies the impact of external control on the strategies of pollutant discharge enterprises and government regulators in the field of environmental protection. The authors construct a model of the relationship between these entities: an evolutionary game in which the players are the entities that generate pollutants and the government departments that implement pollution supervision. The strategy choices of both players and the evolutionary stability of the system under different levels of third-party supervision are analyzed. The authors then use simulation analysis based on this model to verify its evolutionary paths and evolutionary results under different conditions. The research shows that weak third-party supervision is not enough to promote the evolution of the behavioral decisions of the government and enterprises. An appropriate increase in the power of third-party supervision will change the strategy choices of the government and enterprises in the short term; however, due to the mutual influence of the strategies of both sides of the game, the evolutionary system does not reach a stable state in this situation. Strong third-party supervision will push enterprises to choose a pollution control strategy, change the intensity of government supervision, and replace government supervision to a certain extent. It is an interesting example of modeling the relationship of this system on the basis of evolutionary game theory. The findings can be regarded as a theoretical reference for environmental pollution control of enterprises.


Introduction
In his speech at the Eighth National Conference on Eco-Environmental Protection held in 2018, General Secretary Xi Jinping stressed that the construction of ecological civilization is the fundamental plan for the sustainable development of the Chinese nation and that the ecological environment is a major social issue related to people's livelihood [1]. China is in the decisive stage of building a well-off society in an all-round way, and fighting the battle of pollution prevention and control is a key link. Since reform and opening up, the economy of China has achieved a "miracle of growth" [2]. As the backbone of social progress and economic development, enterprises have played an indispensable role. However, due to their own attributes, some enterprises produce certain wastes in the production process [3]. If these wastes are discharged directly without treatment, environmental pollution occurs [4]. Although the state has promulgated relevant policies, regulations, and emission control measures to regulate the emission behavior of enterprises, these have failed to achieve the desired results.
Scholars have made several contributions to the effective treatment of environmental pollution caused by the pollutants of enterprises. Taschini [5] attempted to develop valid price models for emission permits. Petrosyan et al. [6] proposed that a cooperative dynamic game method could be used to study practical problems, such as the regulation of pollutant emissions. Against the background of the emission trading market, Zhang et al. [3] constructed a Stackelberg game model between government and enterprises to explore the game mechanism of enterprise input of pollution control and the dynamic consistency of emissions trading policy. Falbo et al. [7] optimized a market model of the emission price mechanism to address diverse questions in the setting of risk-averse market players. Hampf et al. [8] used nonparametric methods to analyze, on a sample of plants, the economic effects of implementing the EPA's new regulations for carbon dioxide on existing US coal-fired power plants; the results showed that the new regulations would not lead to a change in profits, since the increase in the intensity standard of carbon dioxide would reduce carbon dioxide emissions, which also revealed the tradeoff between environmental income and profit income of the US government.
The government is not only the maker of environmental regulations, but also the regulator of system implementation. Pollutant discharge enterprises are producers of pollutants, their production behaviors are constrained by environmental policies, and their environmental behaviors are regulated by the government. The process of enterprise pollution control is essentially a game between government regulators and pollutant discharge enterprises. By studying the behavior interaction between them in the process of pollution control, many problems can be found in the process of environmental policy implementation. The authors of [2] found that, if enterprises lack viability, the high pollution control cost might lead to their inability to obtain normal profits; even if the government has issued strict policies and regulations, the profit return of the enterprises will become a soft constraint on the policy implementation. The authors of [9] showed that government officials colluded with pollutant discharge enterprises in the pursuit of economic benefits. Furthermore, the authors of [10] found that economic benefits were not the only reason for the inaction of government regulators; immaterial benefits such as political achievements also affected the interaction mechanism between the government and enterprises. Thus, the main reasons for the difficulty of environmental pollution control can be summarized into two points: first, because pollution control requires cost investment, enterprises choose to discharge pollutants directly to the environment in pursuit of high profits; second, local governments may relax or even not regulate pollutant emissions because of their own interests, failing to give full play to their regulatory responsibilities. When the government regulatory mechanism fails, an external force is needed to restrict their behaviors.
With the deepening of the research on the interaction between government and enterprise environmental strategies, the authors of [11] drew the following conclusion: as an independent regulatory subject other than the government and enterprises, a third party could get rid of the interest temptation and external punishment constraints, as well as play a regulatory role more fairly. With the gradual enhancement of public awareness of rights, more and more incidents of environmental pollution rights protection were reported [12], and the regulatory role of the rapidly developing media and the internet is becoming increasingly prominent in environmental issues [13], which shows that third-party supervision is playing a more and more important role in environmental pollution regulation.
Existing research [14][15][16][17] provides a theoretical basis for third-party supervision of environmental pollution, but it also has some limitations, which are mainly reflected in the following two aspects. First, in most game analyses, third-party supervision is set as a game subject, whereas, in fact, it does not have a direct game relationship with the government or enterprises, but acts as an external factor affecting the strategic choices of the government and enterprises; thus, the conclusions have limited explanatory power. Second, the benefits of the game subjects in the existing models often focus only on economic benefits, ignoring the nonmaterial returns of the government, such as reputation income from active supervision and public blame due to dereliction of duty, as well as the nonmaterial losses of enterprises, such as negative comments from consumers caused by environmental pollution.
Evolutionary game theory combines the analysis of game theory with the analysis of dynamic evolutionary process; it no longer models the game players as completely rational game subjects, but takes the bounded rationality of the game players as the analysis framework. This theory holds that the members of the game group constantly adjust and modify their strategies through learning, imitation, and trial and error, finally reaching a certain equilibrium state. At this time, the game players tend toward a stable strategy, which is called the evolutionary stable strategy (ESS) [18]. The whole evolutionary process is consistent with the actual economic life situation; hence, it is widely used in all fields of social studies [19][20][21][22][23][24][25]. Through in-depth analysis of the environmental pollution issue, it is easy to find that government regulators and pollutant discharge enterprises also follow a similar law: enterprises constantly adjust their environmental behaviors according to the regulation policies of the government, while government regulators constantly update and improve environmental regulation policies according to the environmental behaviors of enterprises. The strategy selection of the two follows the basic characteristics of evolutionary game theory. Therefore, from the perspective of evolutionary game theory, it is very consistent with the actual situation to study the behavior law of the government and enterprises in the pollution control issue.
In view of the literature review and comprehensive analysis, this paper proposes an evolutionary game model of government and enterprises subject to the participation of third-party supervision, and makes the following contributions:

1. Our research is set in the government supervision of the pollutant discharge behaviors of enterprises; it introduces third-party supervision and embeds it into the game process between the government and enterprises in the form of a variable rather than a game subject.

2. Our evolutionary game model complements and refines the parameters: in addition to the economic benefits, the reputation benefits of the government and enterprises are also reflected in the actual benefits of both sides.

3. According to the different initial states of the government-enterprise evolutionary system, we use simulation analysis to verify the evolutionary paths and evolutionary results of government-enterprise strategies under different levels of third-party supervision.
The paper is organized as follows: Section 2 introduces the related work of environmental pollution control of enterprises; Section 3 introduces our model and related preparations; Section 4 analyzes the evolutionary stability of government-enterprise strategies subject to the influence of third-party supervision; Section 5 conducts a simulation analysis; Section 6 summarizes our findings and suggestions, as well as future work.

Related Work
The research on environmental pollution control of enterprises mainly focused on formulating and revising environmental protection policies and regulations, pollutant discharge supervision, and third-party pollution control.
Environmental protection policies and regulations are important rules to restrict behaviors of enterprises and an important basis for the government regulators to exercise their regulatory responsibilities. From the perspective of the impact of government policies on enterprise behavior, some scholars found that reasonable and effective environmental protection policies [26][27][28] can guide enterprises to control pollution actively and reduce environmental pollution; however, in the long-term communication process, enterprises update their strategies to deal with environmental policies [29].
In December 2014, the general office of the State Council issued the opinions on promoting the third-party governance of environmental pollution to promote the establishment of the third-party governance mechanism and improve the third-party governance market. However, this model encountered many problems in the process of practice, such as the failure to establish cooperative relationships between pollutant discharge enterprises and third-party governance enterprises due to unclear responsibility boundaries, with the high cost of technological innovation causing the third-party pollution governance enterprises to bear high market risks. In order to seek scientific decisions to break through the dilemma, the authors of [30] constructed an interest game model between the entrusting party and the entrusted party from the perspective of the third-party pollution control mode, to analyze the main influencing factors of the provision of third-party governance services; they provided policy suggestions for improving the third-party pollution control mechanism. The authors of [31] analyzed the stochastic differential cooperative game model between pollutants discharge enterprises and third-party governance enterprises, and they found that government regulators could guide third-party governance enterprises to actively enter the pollution control market and carry out technology research through balanced regulation and incentive measures.
Luo et al. [32] studied the third-party governance of environmental pollution from the principal-agent perspective. Du et al. [33] explored the rules of communication and the evolutionary law of system formation between the government and third-party pollution governance enterprises.
Pollution discharge is an external behavior of enterprises, which requires government supervision and guidance. Although local governments are more capable of controlling environmental pollution than the central government, local governments alone cannot solve the problem of environmental pollution [34]. Pollutant discharge enterprises and government regulators are the two basic related subjects in the problem of enterprise pollution control. Many scholars have carried out a series of studies on the mutual driving relationship between the two. Fairchild [35] used game theory to explore the strategic interaction between enterprises and government regulators, and found that government regulatory behavior and environmental awareness education can lead to an increase in green production behavior. Using a noncooperative game model, Tsebelis [36] found that increasing fines for enterprise violations would reduce the frequency of law enforcement. The author of [37] used evolutionary game analysis to demonstrate the important role of government regulation in corporate environmental governance, and the research showed that moderate government punishment and a certain probability of government regulation would lead enterprises to form a stable environmental protection strategy. Therefore, the active supervision strategy of government regulators favors environmental quality.
On the issue of enterprise pollution control, the relationship between government regulators and pollutant discharge enterprises is not only a simple relationship between supervising and being supervised; they exhibit both cooperation in economic benefits and conflict in environmental benefits. Therefore, they may have collusive behavior for their own interests [38] and choose a noncooperative strategy because of external constraints. At the same time, subject to the influence of information asymmetry and a complex game environment, both sides will keep learning and adjusting their strategies in the process of communication in order to obtain optimal profits. At present, scholars have demonstrated the importance of third-party supervision in the process of enterprise pollution control [14,39,40], and they found that the third party can avoid the weak supervision that arises from the economic cooperation relationship between the government and enterprises, which encourages enterprises to choose pollution control behaviors. Figure 1 intuitively shows the interactive relationships among the three. Many scholars have incorporated third-party supervision into the research of environmental pollution control. The authors of [15] proposed a central-local-public three-party noncooperative game model and found that third-party supervision can completely replace the administrative regulation role of the central government. According to the three-party evolutionary game model incorporating the government, business, and the overall interests of society, some others explored the role of government regulation and policy strategies in environmental pollution control [16]. The author of [17] introduced a dynamic penalty subsidy mechanism based on this game model, to analyze the impact of government dynamic regulation and public participation on pollution governance mode.
The authors of [15] structured a game theory model of environmental regulation involving four interest subjects as the central government, local governments, enterprises, and the third party, and they discussed the influence of third-party supervision on local governments and enterprise behavior choices.

Model Descriptions
Hypothesis 1: The game model has two players, the pollutant discharge enterprises and the government regulators, and both are boundedly rational. The pollutant discharge enterprises have two strategic choices: discharging the pollutant after complete treatment, or discharging it directly after partial treatment or even without treatment, abbreviated as (pollution control, no pollution control). We assume the probability of the enterprises choosing the pollution control strategy is x; hence, the probability of no pollution control is 1 − x. The government regulators can choose a strict supervision strategy, or a loose supervision or even no supervision strategy, recorded as (supervision, no supervision). We assume that the probability of choosing the supervision strategy is y; hence, the probability of no supervision is 1 − y, where 0 ≤ x, y ≤ 1.
Hypothesis 2: The enterprises can obtain economic benefits R if they invest in production costs C_1 to carry out production. If the pollutants produced in the production process are not treated and discharged directly, they will pollute the environment. At this time, the enterprises have no cost input; however, if they are found by the government regulators, they will be punished by P, such as fines or suspension of business for rectification. If enterprises pay attention to environmental protection, they will adopt a pollution control strategy, which will result in pollution control costs C_2, such as manpower and technical costs.
Hypothesis 3: Government regulators need to invest in human, financial, and other regulatory costs C_3 to supervise pollutant discharge enterprises, but they will be trusted and praised by higher authorities and the public because of their active supervision and gain reputation return H. In particular, when the enterprises actively control pollution and the government chooses no supervision strategy, the government will gain reputation return H without any cost.
Hypothesis 4: When the enterprises directly discharge pollutants and the government chooses no supervision strategy, the pollutant discharge behaviors of enterprises may be discovered by third-party supervision. Suppose that the probability of the third party supervising the pollutant discharge behaviors of enterprises is µ (0 ≤ µ ≤ 1), the size of µ expresses the power of third-party supervision, and a larger value of µ represents a stronger power of third-party supervision. Once the pollutant discharge behavior of enterprises is disclosed by the third party, they will bear a series of losses caused by the damage of social image, which can be expressed as W_1, and the government will also be criticized by the public and punished by their superiors for their laziness, which can be expressed as W_2. At this time, the benefits of enterprises and the government are a function of µ. When µ = 0, third-party supervision did not play a role, and the government did not implement its supervision responsibilities; thus, the pollution discharge behavior of the enterprises will not be discovered, with the return of the enterprise and the government being R − C_1 and H. When µ = 1, the return of the enterprises is R − C_1 − W_1, and the return of the government is −W_2.
Hypothesis 5: Third-party supervision is restricted by various conditions whereby they cannot directly supervise the pollutant discharge enterprises. When the enterprises have pollutant discharge behavior, if the government implements supervision, it will discover the pollutant discharge behavior before the third party.
Hypothesis 6: According to the supervision strategy of the government, the main factors affecting the strategy choice of the enterprises are the cost C_2 of pollution control and the punishment P from the government, and the size relationship between the two will directly affect the enterprise decision making. When P < C_2, the government punishment mechanism fails, and the enterprises will choose no pollution control strategy; at this time, the role of third-party supervision is obvious. Therefore, the subsequent analysis is based on the condition of P > C_2.
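On the basis of the above assumptions, the game payment matrix of the pollutant discharge enterprises and the government regulators subject to third-party supervision can be expressed as shown in Table 1.

Table 1. Game payment matrix of government-enterprises based on third-party supervision. Each cell lists (return of the enterprises, return of the government), as implied by Hypotheses 2 to 5.

| Pollutant discharge enterprises | Government: supervision (y) | Government: no supervision (1 − y) |
| Pollution control (x) | (R − C_1 − C_2, H − C_3) | (R − C_1 − C_2, H) |
| No pollution control (1 − x) | (R − C_1 − P, H − C_3) | (R − C_1 − µW_1, (1 − µ)H − µW_2) |

According to the game model between the government and enterprises, the expected returns of the enterprises choosing a pollution control strategy and choosing no pollution control strategy can be expressed as U_a1 and U_a2, and the average expected return can be expressed as U_a:

$$U_{a1} = y(R - C_1 - C_2) + (1 - y)(R - C_1 - C_2) = R - C_1 - C_2, \tag{1}$$
$$U_{a2} = y(R - C_1 - P) + (1 - y)(R - C_1 - \mu W_1), \tag{2}$$
$$U_a = xU_{a1} + (1 - x)U_{a2}. \tag{3}$$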
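Similarly, the expected returns of the government choosing a supervision strategy and choosing no supervision strategy can be expressed as U_b1 and U_b2, and the average expected return is U_b:

$$U_{b1} = x(H - C_3) + (1 - x)(H - C_3) = H - C_3, \tag{4}$$
$$U_{b2} = xH + (1 - x)\left[(1 - \mu)H - \mu W_2\right], \tag{5}$$
$$U_b = yU_{b1} + (1 - y)U_{b2}. \tag{6}$$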
From the Malthusian dynamic equation, the growth rate of the proportion of enterprises choosing the pollution control strategy, (dx/dt)/x, equals U_a1 − U_a, where t is time; thus, the replicated dynamic equation of the enterprises is F(x, y) = dx/dt = x(U_a1 − U_a) [41], which can be obtained after sorting as follows:

$$F(x, y) = \frac{dx}{dt} = x(1 - x)\left[\mu W_1 - C_2 + y(P - \mu W_1)\right]. \tag{7}$$

In the same way, the replication dynamic equation of the government regulators is obtained as follows:

$$G(x, y) = \frac{dy}{dt} = y(1 - y)\left[(1 - x)\mu(H + W_2) - C_3\right]. \tag{8}$$
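The expected returns and the replicated dynamic equations can be evaluated numerically. The following is a minimal sketch, assuming the payoffs implied by Hypotheses 2 to 5 (in particular, that the government earns H − C_3 whenever it supervises and (1 − µ)H − µW_2 when neither side acts). The class name `Params`, the function names, and any numeric values passed in are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Model parameters; the field names mirror the symbols in the hypotheses."""
    R: float   # economic benefit of production
    C1: float  # production cost
    C2: float  # pollution control cost
    C3: float  # government supervision cost
    P: float   # penalty imposed by the government on a polluting enterprise
    H: float   # reputation return of the government
    W1: float  # enterprise loss when exposed by the third party
    W2: float  # government loss when exposed by the third party
    mu: float  # power of third-party supervision, 0 <= mu <= 1


def enterprise_returns(p, y):
    """Expected returns of (pollution control, no pollution control)."""
    u_control = p.R - p.C1 - p.C2
    u_no_control = (y * (p.R - p.C1 - p.P)
                    + (1 - y) * (p.R - p.C1 - p.mu * p.W1))
    return u_control, u_no_control


def government_returns(p, x):
    """Expected returns of (supervision, no supervision)."""
    u_supervise = p.H - p.C3
    u_no_supervise = x * p.H + (1 - x) * ((1 - p.mu) * p.H - p.mu * p.W2)
    return u_supervise, u_no_supervise


def replicator_rhs(p, x, y):
    """Right-hand sides dx/dt = x(U_a1 - U_a) and dy/dt = y(U_b1 - U_b)."""
    ua1, ua2 = enterprise_returns(p, y)
    ub1, ub2 = government_returns(p, x)
    ua = x * ua1 + (1 - x) * ua2  # average return of the enterprises
    ub = y * ub1 + (1 - y) * ub2  # average return of the government
    return x * (ua1 - ua), y * (ub1 - ub)
```

Calling `replicator_rhs(p, x, y)` with a `Params` instance returns the pair (dx/dt, dy/dt) at the state (x, y); a negative first component means the share of enterprises choosing pollution control is shrinking at that state.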

Solution of System Equilibrium Point and Jacobian Matrix
The dynamic evolutionary system of government-enterprise strategies can be constructed immediately by combining Equations (7) and (8). Let F(x, y) = 0 and G(x, y) = 0; then, the five local equilibrium points of the system can be obtained: E_1(0, 0), the other three corner points (0, 1), (1, 0), and (1, 1), and the interior point E_5(x_e, y_e), where x_e = 1 − C_3/(µ(H + W_2)) and y_e = (C_2 − µW_1)/(P − µW_1). According to its definition, the Jacobian matrix J can be obtained by taking the first-order partial derivatives of Equations (7) and (8) with respect to x and y and arranging them as a matrix.
According to the nature of evolutionary game theory, the local stability of each equilibrium point can be analyzed with the help of the determinant Det(J) and the trace Tr(J) of the Jacobian matrix. When Det(J) > 0 and Tr(J) < 0 at an equilibrium point, that point is an evolutionary stable strategy of the system. Let a_11 = ∂F(x, y)/∂x, a_12 = ∂F(x, y)/∂y, a_21 = ∂G(x, y)/∂x, and a_22 = ∂G(x, y)/∂y; then,

$$J = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad \mathrm{Det}(J) = a_{11}a_{22} - a_{12}a_{21}, \qquad \mathrm{Tr}(J) = a_{11} + a_{22}.$$

Substituting the five equilibrium points into the Jacobian matrix J, the corresponding expressions of Det(J) and Tr(J) can be obtained as shown in Table 2.

Table 2. Expressions of Jacobian determinant and trace of each equilibrium point (columns: Equilibrium Points, Det(J), Tr(J)).

According to the results in Table 2, Tr(J) = 0 corresponds to the equilibrium point E_5(x_e, y_e); thus, this point is certainly not the evolutionary stable strategy of the system.
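The Det(J) > 0, Tr(J) < 0 test can be applied mechanically at the four corner points. The sketch below assumes the replicated dynamic equations take the form dx/dt = x(1 − x)[µW_1 − C_2 + y(P − µW_1)] and dy/dt = y(1 − y)[(1 − x)µ(H + W_2) − C_3], which is what Hypotheses 2 to 5 imply; the Jacobian entries are differentiated by hand from that form. Function names are hypothetical, and corner points are identified by their coordinates rather than by the paper's labels.

```python
def jacobian(x, y, mu, C2, C3, P, H, W1, W2):
    """Entries a11, a12, a21, a22 of the Jacobian of (dx/dt, dy/dt)."""
    a11 = (1 - 2 * x) * (mu * W1 - C2 + y * (P - mu * W1))
    a12 = x * (1 - x) * (P - mu * W1)
    a21 = -y * (1 - y) * mu * (H + W2)
    a22 = (1 - 2 * y) * ((1 - x) * mu * (H + W2) - C3)
    return a11, a12, a21, a22


def is_ess(x, y, **params):
    """Local stability test: Det(J) > 0 and Tr(J) < 0."""
    a11, a12, a21, a22 = jacobian(x, y, **params)
    det = a11 * a22 - a12 * a21
    tr = a11 + a22
    return det > 0 and tr < 0


def stable_corners(**params):
    """Return the corner points (x, y) that pass the stability test."""
    corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return [c for c in corners if is_ess(*c, **params)]


if __name__ == "__main__":
    # Hypothetical parameter values chosen only so that P > C2 holds.
    base = dict(C2=2, C3=1.5, P=3, H=2, W1=4, W2=3)
    for mu in (0.2, 0.4, 0.8):
        print(mu, stable_corners(mu=mu, **base))
```

At the interior point both diagonal entries vanish by construction, so the trace is zero there and the test above can never classify it as stable, in line with the remark on E_5.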

Solution of the Value Ranges of the Third-Party Supervision
In order to explore the influence of third-party supervision µ on the strategy choice of enterprises, we take the first-order partial derivative of the expression x_e with respect to µ, and get ∂x_e/∂µ > 0, which shows that the probability x_e of pollutant discharge enterprises choosing a pollution control strategy increases with the increase in µ; that is, increasing the power of third-party supervision can promote enterprises to choose a pollution control strategy. The joint supervision of the government and the third party will encourage enterprises to choose pollution control behavior, to achieve the purpose of improving the environment. Similarly, the first-order partial derivative of the expression y_e with respect to µ is ∂y_e/∂µ < 0, which shows that the probability y_e of the government regulators choosing the supervision strategy decreases with the increase in µ; that is, a stronger power of third-party supervision results in a greater probability of the government choosing no supervision strategy, indicating that third-party supervision is an alternative to government supervision. Therefore, it is necessary to analyze the impact of different values of µ on the evolutionary results for government-enterprise strategies.
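The signs of the two derivatives can be checked directly. The derivation below is a sketch that assumes the interior equilibrium has the form x_e = 1 − C_3/(µ(H + W_2)) and y_e = (C_2 − µW_1)/(P − µW_1), which is what setting the bracketed terms of the replicated dynamic equations to zero gives under Hypotheses 2 to 5. It also assumes µ > 0, positive C_3, W_1, and H + W_2, P ≠ µW_1, and P > C_2 from Hypothesis 6.

```latex
\begin{align*}
x_e &= 1 - \frac{C_3}{\mu (H + W_2)}
  &&\Longrightarrow&
  \frac{\partial x_e}{\partial \mu}
  &= \frac{C_3}{\mu^{2} (H + W_2)} > 0, \\[4pt]
y_e &= \frac{C_2 - \mu W_1}{P - \mu W_1}
  &&\Longrightarrow&
  \frac{\partial y_e}{\partial \mu}
  &= \frac{-W_1 (P - \mu W_1) + W_1 (C_2 - \mu W_1)}{(P - \mu W_1)^{2}}
   = \frac{W_1 (C_2 - P)}{(P - \mu W_1)^{2}} < 0.
\end{align*}
```

The sign of ∂y_e/∂µ rests entirely on P > C_2, so the substitution effect between third-party and government supervision depends on the penalty exceeding the pollution control cost.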
According to the expressions of x e and y e , the value ranges of µ are analyzed as below.

1. Let x_e = 0; thus, we can get µ = C_3/(H + W_2).

2. Let y_e = 0; thus, we can get µ = C_2/W_1, and the size of y_e is also related to the size of ; if y_e > 1, then

Formatting of Mathematical Components
The stability of the equilibrium points E_i (i = 1, 2, 3, 4) is analyzed using the judgment method of local stability of the evolutionary system and the Jacobian matrix J, according to the different value ranges of the third-party supervision µ. The specific results are shown in Table 3.

Table 3. System stability analysis of each equilibrium point with different values of µ.
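The value ranges of µ can be turned into a small classifier. The sketch below assumes the two critical values are C_3/(H + W_2) (where x_e = 0) and C_2/W_1 (where y_e = 0), and that µ above C_2/W_1 leads to (pollution control, no supervision); the function names, the returned labels, and the handling of boundary values are hypothetical choices, not part of the paper.

```python
def thresholds(C2, C3, H, W1, W2):
    """Critical values of mu for the government and for the enterprises."""
    mu_gov = C3 / (H + W2)  # value of mu at which x_e = 0
    mu_ent = C2 / W1        # value of mu at which y_e = 0
    return mu_gov, mu_ent


def regime(mu, C2, C3, H, W1, W2):
    """Classify the long-run outcome for a given third-party power mu."""
    mu_gov, mu_ent = thresholds(C2, C3, H, W1, W2)
    if mu < min(mu_gov, mu_ent):
        return "no_control_no_supervision"  # weak third-party supervision
    if mu_gov < mu < mu_ent:
        return "no_stable_strategy"         # intermediate supervision
    if mu > mu_ent:
        return "control_no_supervision"     # strong third-party supervision
    return "boundary"                       # mu sits exactly on a threshold


if __name__ == "__main__":
    # Hypothetical parameter values for illustration only.
    base = dict(C2=2, C3=1.5, H=2, W1=4, W2=3)
    print(thresholds(**base))
    for mu in (0.2, 0.4, 0.8):
        print(mu, regime(mu, **base))
```

The intermediate regime only exists when C_3/(H + W_2) < C_2/W_1; if the ordering is reversed, the classifier moves directly from the first label to the third as µ grows.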

Proposition 1.
When µ ∈ (0, min(C_2/W_1, C_3/(H + W_2))), the dynamic evolutionary system is stable at the equilibrium point E_1(0, 0) (no pollution control, no supervision).

Proof of Proposition 1. According to the stability analysis results of each equilibrium point in Proposition 1 of Table 3, the evolutionary stable strategy of pollutant discharge enterprises and government regulators is (no pollution control, no supervision). On the basis of Proposition 1, two conclusions can be drawn. Firstly, the strategy selection of the pollutant discharge enterprises changes with the change in government strategy. When the government regulators choose the supervision strategy, the benefit of pollutant discharge enterprises choosing a pollution control strategy is greater than that of choosing no pollution control strategy, so boundedly rational enterprises will choose the pollution control strategy; when government regulators choose no supervision strategy, the benefit of pollutant discharge enterprises choosing no pollution control strategy is greater than that of choosing a pollution control strategy, so boundedly rational enterprises will choose no pollution control strategy. In this situation, the pollution control behavior and the lack of pollution control behavior of enterprises coexist. Secondly, no matter whether pollutant discharge enterprises choose the pollution control strategy or not, the benefit of the government choosing no supervision strategy is always greater than that of choosing a supervision strategy, so the boundedly rational government will choose no supervision strategy. After a long-term evolutionary game, the system tends to the steady state in which pollutant discharge enterprises choose no pollution control strategy and the government regulators choose no supervision strategy. □
The above analysis procedure of the evolutionary stable strategy validates Proposition 1. At this time, the critical value of x_e is 0, and 0 < y_e < 1; hence, the point E_5 can be regarded as located on the y-axis. In Proposition 1, the benefits of no supervision strategy of government regulators are always greater than those of a supervision strategy. After long-term evolution, the government regulators finally tend to choose no supervision strategy. Pollutant discharge enterprises have no stable strategy choices, but constantly adjust and revise their own strategies along with the government strategies, finally stabilizing to no pollution control strategy. Figure 2 shows the dynamic evolutionary trend of government-enterprise strategies in this case.
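The evolutionary trend under weak third-party supervision can be reproduced with a simple numerical integration. The sketch below uses a forward Euler scheme on the replicated dynamic equations in the form dx/dt = x(1 − x)[µW_1 − C_2 + y(P − µW_1)] and dy/dt = y(1 − y)[(1 − x)µ(H + W_2) − C_3], which is assumed from Hypotheses 2 to 5. The step size, horizon, initial states, and parameter values are hypothetical and are not the settings used for Figure 2.

```python
def simulate(x0, y0, mu, C2, C3, P, H, W1, W2, dt=0.01, steps=20000):
    """Integrate the replicator system with forward Euler; return final (x, y)."""
    x, y = x0, y0
    for _ in range(steps):
        dx = x * (1 - x) * (mu * W1 - C2 + y * (P - mu * W1))
        dy = y * (1 - y) * ((1 - x) * mu * (H + W2) - C3)
        # Clamp to [0, 1] so rounding cannot push a probability out of range.
        x = min(1.0, max(0.0, x + dt * dx))
        y = min(1.0, max(0.0, y + dt * dy))
    return x, y


if __name__ == "__main__":
    # Hypothetical values with mu below both C3/(H + W2) = 0.3 and C2/W1 = 0.5.
    base = dict(mu=0.2, C2=2, C3=1.5, P=3, H=2, W1=4, W2=3)
    for start in [(0.5, 0.5), (0.9, 0.9), (0.1, 0.8)]:
        print(start, simulate(*start, **base))
```

With these illustrative values the government's share of supervision falls from every interior starting point, and the enterprises follow once y is low enough, which matches the description of enterprises adjusting to the government strategy before settling on no pollution control.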

Proposition 2.
When $\mu \in \left( \frac{C_3}{H + W_2}, \frac{C_2}{W_1} \right)$, there is no evolutionary steady state for the dynamic evolutionary system.

Proof of Proposition 2.
According to the stability analysis results of each equilibrium point in Proposition 2 of Table 3, there is no evolutionary stable strategy for the dynamic evolutionary government-enterprise system. From the value of µ in Proposition 2, we can get […], and $0 < x_e, y_e < 1$. Compared with the strategy selection of the two actors in Proposition 1, the behavior choice of the pollutant discharge enterprises has not changed; however, when the enterprises choose the no pollution control strategy, the government will change its choice to the supervision strategy. At this time, the supervision and no supervision behaviors of the government coexist, which shows that, under the influence of third-party supervision, government decision making changes with the strategy of the pollutant discharge enterprises, evolving from a stable strategy selection state to an unstable state. The evolutionary path of the system changes dynamically and with some uncertainty. Figure 3 describes the evolutionary process of the government-enterprise strategies in this case.

Corollary 2.
Although an appropriate increase in the power of third-party supervision changes the strategy choices of the government and enterprises in the short term, the strategies on both sides of the game influence each other, so the evolutionary system does not converge to a stable state in this situation.

Proposition 3.
When $\mu \in \left( \frac{C_2}{W_1}, \frac{C_3}{H + W_2} \right)$, the dynamic evolutionary system is stable at the equilibrium point $E_3(1, 0)$ (pollution control, no supervision).

Proof of Proposition 3.
According to the stability analysis results of each equilibrium point in Proposition 3 of Table 3, the evolutionary stable strategy of pollutant discharge enterprises and government regulators is (pollution control, no supervision). According to the value of µ, we can get […]. Compared with Proposition 1, the behavior choice of the government has not changed, i.e., the no supervision strategy of the government is still stable; however, when the government chooses the no supervision strategy, the enterprises will change to the pollution control strategy, which shows that the pollutant discharge enterprises will tend to choose the pollution control strategy under the influence of third-party supervision.
In Proposition 3, the benefits of the no supervision strategy of government regulators are always greater than those of the supervision strategy, and the benefits of the pollution control strategy of pollutant discharge enterprises are always higher than those of the no pollution control strategy. After long-term evolution, government regulators finally tend to choose the no supervision strategy, while enterprises are stable in the pollution control strategy. According to the sign of $\mu W_1 - P$ in $x_e$, the value of µ in this proposition can be divided into two cases. Firstly, when $\mu W_1 - P < 0$, we can get $\mu \in \left( \frac{C_2}{W_1}, \min\left( \frac{C_3}{H + W_2}, \frac{P}{W_1} \right) \right)$; at this time, the critical values of $x_e$ and $y_e$ are both 0, and point $E_5$ can be regarded as coinciding with the point (0, 0). The dynamic evolutionary phase diagram is shown in Figure 4a. Secondly, when $\mu W_1 - P > 0$, we can get $\mu \in \left( \frac{P}{W_1}, \frac{C_3}{H + W_2} \right)$; at this time, the critical value of $x_e$ is 0, the critical value of $y_e$ is 1, and point $E_5$ can be regarded as coinciding with the point (0, 1). The dynamic evolutionary phase diagram is shown in Figure 4b.

Corollary 3.
When the power of third-party supervision increases to a certain extent, it can push enterprises toward the pollution control strategy, but it is not enough to make the government evolve toward the supervision strategy.

Proof of Proposition 4.
According to the stability analysis results of each equilibrium point in Proposition 4 of Table 3, the evolutionary stable strategy of the pollutant discharge enterprises and government regulators is (pollution control, no supervision). According to the value of µ, we can get […]. At this time, the evolutionary stable strategy of the system is the same as that in Proposition 3; however, when the pollutant discharge enterprises choose the no pollution control strategy, the government will change its choice to the supervision strategy, which shows that the current power of third-party supervision can keep enterprises in a stable pollution control strategy, as well as change the intensity of government supervision. When the government does not care about the loss of reputation, it will still choose the no supervision strategy because third-party supervision is already highly effective.
In Proposition 4, the benefits of the pollution control strategy of pollutant discharge enterprises are always higher than those of the no pollution control strategy. After long-term evolution, enterprises finally tend to choose the pollution control strategy. The government regulators have no stable strategy choice, but constantly adjust and modify their own strategies along with the enterprises' strategies, finally stabilizing at the no supervision strategy. According to the sign of $\mu W_1 - P$ in $x_e$, the value of µ in this proposition can be divided into two cases. Firstly, when $\mu W_1 - P < 0$, we can get $\mu \in \left( \max\left( \frac{C_2}{W_1}, \frac{C_3}{H + W_2} \right), \frac{P}{W_1} \right)$; at this time, $0 < x_e < 1$, the critical value of $y_e$ is 0, and point $E_5$ can be regarded as located on the x-axis. The dynamic evolutionary phase diagram of government-enterprise strategies is shown in Figure 5a. Secondly, when $\mu W_1 - P > 0$, we can get $\mu \in \left( \max\left( \frac{C_3}{H + W_2}, \frac{P}{W_1} \right), 1 \right)$; in this situation, $0 < x_e < 1$, the critical value of $y_e$ is 1, and point $E_5$ can be regarded as located on the straight line y = 1. The dynamic evolutionary phase diagram is shown in Figure 5b.
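As a worked check of these two sub-cases, the following sketch plugs in the parameter values that the simulation section uses for Case 3 ($W_1 = 6$, $W_2 = 4$, $C_2 = 3$, $C_3 = 5$, $H = 3$, $P = 5$). Pairing these values with the sub-case split is an illustration added here, not a computation reported in the paper.

```latex
\begin{align*}
\frac{C_2}{W_1} &= \frac{3}{6} = \frac{1}{2}, \qquad
\frac{C_3}{H + W_2} = \frac{5}{3 + 4} = \frac{5}{7}, \qquad
\frac{P}{W_1} = \frac{5}{6}, \\
\mu W_1 - P < 0 &: \quad \mu \in \left( \max\left( \tfrac{1}{2}, \tfrac{5}{7} \right), \tfrac{5}{6} \right)
  = \left( \tfrac{5}{7}, \tfrac{5}{6} \right), \\
\mu W_1 - P > 0 &: \quad \mu \in \left( \max\left( \tfrac{5}{7}, \tfrac{5}{6} \right), 1 \right)
  = \left( \tfrac{5}{6}, 1 \right).
\end{align*}
```

Under these values, a choice such as $\mu = 0.8$ gives $\mu W_1 - P = 4.8 - 5 = -0.2 < 0$, so it falls in the first sub-case, where $E_5$ lies on the x-axis.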

Corollary 4.
The strong power of third-party supervision, coordinating with government supervision, stabilizes enterprises in the pollution control strategy, changes the intensity of government supervision, and replaces government supervision to a certain extent in this situation.
By summarizing the values of µ and evolutionary results in the above four propositions, three cases can be obtained, as shown in Table 4.

Table 4. Evolutionary stable strategies in three cases.
Case No. | Value Ranges of µ | ESS

Simulation Analysis
With the help of MATLAB software, simulation experiments were carried out on the conclusions of the above theories, intuitively showing the evolutionary trend of strategies on both sides of the game under different conditions.
In order to meet the conditions of Case 1 in Table 4, we set $W_1 = 6$, $W_2 = 4$, $C_2 = 2$, $C_3 = 4$, $H = 3$, and $P = 5$; the two groups of initial values of x and y at the beginning of the game were (0.3, 0.7) and (0.8, 0.2). After calculation, µ ∈ (0, 1/3) could be obtained; then, we selected µ = 0.2 from the value range for the simulation experiment, and the simulation results are shown in Figure 6. The simulation results show that, when the power of third-party supervision is weak, no matter the initial state of the game between the enterprises and the government, it will eventually evolve into the enterprises choosing the no pollution control strategy and the government regulators choosing the no supervision strategy.
In order to meet the conditions of Case 2 in Table 4, we set $W_1 = 6$, $W_2 = 5$, $C_2 = 3$, $C_3 = 2$, $H = 2$, and $P = 5$; the two groups of initial values of x and y at the beginning of the game were (0.1, 0.9) and (0.7, 0.2). After calculation, µ ∈ (2/7, 1/2) could be obtained, and we selected µ = 0.4 from the value range for the simulation experiment. The simulation results are shown in Figure 7. The simulation results show that, no matter the initial state of the game between the pollutant discharge enterprises and government regulators, an appropriate increase in third-party supervision can change the behavior choices of enterprises and the government in the short term, but it cannot make the enterprises and the government tend to a certain stable point. The strategy selections of both sides are correlated in the evolutionary process, the whole system oscillates cyclically, and there is no evolutionary stable strategy.
Figure 7. Simulation results of government-enterprise strategies after increasing the power of third-party supervision. In this case, the corresponding value range of µ is $\left( \frac{C_3}{H + W_2}, \frac{C_2}{W_1} \right)$.
In order to meet the conditions of Case 3 in Table 4, we set $W_1 = 6$, $W_2 = 4$, $C_2 = 3$, $C_3 = 5$, $H = 3$, and $P = 5$; the two groups of initial values of x and y at the beginning of the game were (0.2, 0.8) and (0.7, 0.3). After calculation, µ ∈ (1/2, 5/7) or µ ∈ (5/7, 1) could be obtained, and we selected µ = 0.6 and µ = 0.8 from the two value ranges for comparative simulation experiments. The simulation results are shown in Figure 8. The simulation results show that, no matter the initial state of the game between enterprises and government regulators, affected by the strong power of third-party supervision, enterprises evolve into the pollution control strategy and the government evolves into the no supervision strategy. By comparing Figure 8a,b, it can also be found that, under the same institutional system, increasing the power of third-party supervision can accelerate the evolutionary process of the enterprises choosing the pollution control strategy, but it has little impact on the evolutionary strategy path of the government.
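The value ranges quoted in the three simulation settings can be reproduced from the two critical values $C_2/W_1$ and $C_3/(H + W_2)$. The following Python sketch does that arithmetic exactly and reports which proposition's range contains each chosen µ. The helper names `thresholds` and `classify` are hypothetical. The range used for Proposition 1, $(0, \min(C_2/W_1, C_3/(H + W_2)))$, is an assumption inferred from the reported interval (0, 1/3) for Case 1, since this section does not state it explicitly. The sketch only checks the parameter ranges; it does not reproduce the MATLAB replicator-dynamics simulation.

```python
from fractions import Fraction


def thresholds(W1, W2, C2, C3, H):
    """Return the two critical values of mu: C2/W1 and C3/(H + W2)."""
    return Fraction(C2, W1), Fraction(C3, H + W2)


def classify(mu, W1, W2, C2, C3, H):
    """Name the proposition whose open mu range contains `mu` (hypothetical helper)."""
    a, b = thresholds(W1, W2, C2, C3, H)
    mu = Fraction(str(mu))  # exact value of a decimal such as 0.2
    if not 0 < mu < 1:
        raise ValueError("mu must lie strictly between 0 and 1")
    if mu in (a, b):
        return "boundary"  # the propositions only cover open intervals
    if mu < min(a, b):
        # Assumption: Proposition 1 covers (0, min(a, b)), inferred from Case 1.
        return "Proposition 1"
    if mu > max(a, b):
        return "Proposition 4"  # (max(C2/W1, C3/(H+W2)), 1)
    # mu lies strictly between the two critical values
    return "Proposition 2" if b < a else "Proposition 3"


# Parameter values and chosen mu from the three simulation settings.
# P is omitted because it does not enter the two critical values.
SETTINGS = {
    "Case 1": dict(W1=6, W2=4, C2=2, C3=4, H=3),
    "Case 2": dict(W1=6, W2=5, C2=3, C3=2, H=2),
    "Case 3": dict(W1=6, W2=4, C2=3, C3=5, H=3),
}
CHOSEN_MU = {"Case 1": [0.2], "Case 2": [0.4], "Case 3": [0.6, 0.8]}

for case, params in SETTINGS.items():
    a, b = thresholds(**params)
    for mu in CHOSEN_MU[case]:
        print(case, f"C2/W1={a}", f"C3/(H+W2)={b}", f"mu={mu}",
              classify(mu, **params))
```

Exact fractions are used rather than floats so that the interval endpoints (1/3, 2/7, 5/7) match the text without rounding.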
Figure 8. (a) The corresponding value range of µ is $\left( \frac{C_2}{W_1}, \frac{C_3}{H + W_2} \right)$. (b) Results of the strong power of third-party supervision. The corresponding value range of µ is $\left( \max\left( \frac{C_2}{W_1}, \frac{C_3}{H + W_2} \right), 1 \right)$.

Research Findings
From the perspective of evolutionary game theory, this paper constructed a game model between pollutant discharge enterprises and government regulators subject to the influence of third-party supervision, analyzed the impact of different degrees of third-party supervision on the evolutionary stability of government-enterprise strategies, and simulated the theoretical analysis with MATLAB. The results show that the strategy selection of the pollutant discharge enterprises influences that of the government. Moreover, third-party supervision will change the strategy choice of enterprises, as well as have an impact on the supervision behavior of the government. To a certain extent, it will also play a substitute role for government supervision. The findings are summarized below.
The first conclusion was obtained from Corollary 1 and the simulation results in Figure 6. When the power of third-party supervision is weak, the probability of pollutant discharge enterprises and the government bearing losses due to third-party supervision is very small; thus, it is impossible to change the strategy selection of the enterprises and the government.
The second conclusion was obtained from Corollary 2 and the simulation results in Figure 7. Appropriately increasing the power of third-party supervision will affect the behavior choice of pollutant discharge enterprises or the government in a short period of time. If the enterprises have a lower cost of pollution control, face higher government penalties for the no pollution control strategy, or suffer a greater loss after a third party discovers their no pollution control behavior, third-party supervision will push the enterprises to evolve into the pollution control strategy; otherwise, third-party supervision will not play a role. If the supervision cost of the government is small, the government attaches great importance to reputation benefits, or the loss after a third party discovers its no supervision behavior is large, the government will tend to choose the supervision strategy under the influence of third-party supervision; otherwise, third-party supervision is ineffective for government regulators. In this situation, the strategy selection of the pollutant discharge enterprises follows that of government regulators; thus, there is no evolutionary stable strategy between the government and enterprises under third-party supervision.
The third conclusion was obtained from Corollaries 3 and 4 and the simulation results in Figure 8. The strong power of third-party supervision will push the pollutant discharge enterprises to evolve into the pollution control strategy, even if the government chooses the no supervision strategy, and the third party can also replace the government in playing a regulatory role. A stronger power of third-party supervision accelerates the evolution of enterprises into the pollution control strategy.

Suggestions
On the basis of the above conclusions, some suggestions are put forward, in the hope of pushing pollutant discharge enterprises to choose environment-friendly strategies, making up for the shortcomings of government supervision caused by information asymmetry, replacing the government's regulatory responsibilities when it does not act or acts slowly, and eventually forming an environmental protection mechanism of social co-governance.
To improve the situation shown in the first conclusion, government regulators need to increase the punishment of pollutant discharge behaviors and actively perform their regulatory responsibilities. Meanwhile, superior governments can set up a reward and punishment system for local government regulators to urge them to choose a supervision strategy.
To improve the situation shown in the second conclusion, the government can subsidize enterprises by issuing pollution control awards or tax relief, so as to reduce the operating costs of enterprises and guide them to consciously choose the pollution control strategy; superior governments should provide effective support and help, and improve the cooperation efficiency between relevant departments, so as to reduce the supervision costs of government regulators, improve regulatory benefits, and break the periodic oscillation caused by the linked choice of government-enterprise strategies.
To improve the situation shown in the third conclusion, the construction of environmental protection awareness should be strengthened, and the public, the media, and other third parties should be guided to participate in the supervision of pollutant emissions.

Future Work
This paper discussed the strategy selection of local government regulators and pollutant discharge enterprises under the influence of third-party supervision. Due to the self-attributes of the game subjects, the nonmaterial benefits of game players involve reputation gains or losses and do not take environmental benefits or economic benefits at the national level into account; therefore, the model design has certain limitations. This model is only a prototype; in the future, we can consider introducing superior governments to expand the game problem to a multiagent noncooperative game problem, by introducing the environmental performance evaluation indicators of the superior government to the local government, and by perfecting the parameter settings to make the research results more convincing.