Research on Air Pollution Control in China: From the Perspective of Quadrilateral Evolutionary Games

Abstract: By constructing a quadrilateral evolutionary game model involving the central government, local governments, polluting enterprises, and the public, this paper attempts to comprehensively analyze the development and implementation of China's air pollution control policies. Using this model, the paper systematically studies the evolutionary stable strategies of the four parties and, with the help of a four-dimensional dynamic system, obtains 27 equilibrium points, their strategy sets, and their corresponding policy performance. The results show that five equilibrium points represent the least ideal scenario, in which the enterprises choose to violate the regulations on emissions regardless of the monitoring, regulation, or whistleblowing activities of the governments and the public; 14 equilibrium points represent the less-than-ideal scenario; four represent the ideal scenario; three represent the more-than-ideal scenario; and one represents the most ideal scenario. By analyzing the eight equilibrium points that represent the ideal, more-than-ideal, and most ideal scenarios, especially the four stable points, this paper obtains the conditions and policy implications of the four stable points in China's air pollution control campaign.


Introduction
Severe air pollution will not only lead to a high incidence of diseases and low level of social welfare but also impose immeasurable negative impacts on sustainable development in the long run [1][2][3]. As a large developing country that is at a critical stage of economic transformation, China has realized the significance of air pollution problems. The top leadership has clearly stated that we must "speed up structural reform on ecological civilization and build a beautiful China" [4].
In order to fight against severe air pollution, the Chinese government has issued a large number of air pollution control policies, such as the "Air Pollution Prevention and Control Action Plan" implemented in 2013 [5]; the "Temporary Provisions on the Management of Pollutant Discharge Permits" issued in December 2016, which accelerated the implementation of a permit system for pollutant emission control [6]; the revised "Air Pollution Prevention and Control Law" effective 26 October 2018, which requires clean production inspection in key industries including the steel, cement, and chemical industries, the adoption of advanced technologies, processes, and equipment, as well as clean production technology transformation for key areas and weak links in energy conservation and emission reduction campaigns [7]; and the "Key Points on 2019 Nation-wide Air Pollution Prevention and Control" published in March 2019, which requires local governments to make efforts on air pollution prevention and control and to continuously improve air quality [8]. Therefore, throughout the development and implementation of existing air pollution control policies, there are four participants: the central government, the local government, polluting enterprises, and the public (whistleblowers). During the overall process of policy development and implementation, the relationship between these four parties is quite complicated.

1. Owing to information asymmetry with the central government [14][15][16] and competition among local governments for official promotion and in the area of environmental regulation, local governments not only take orders from the central government but also adopt countermeasures against it and, at times, do not cooperate with it [17][18][19].

2. Polluting enterprises are one of the main sources of air pollution in China [20,21] and the main source of pollution explored in this paper. Because many polluting enterprises are major taxpayers on which the local government relies heavily, and some are even state-owned enterprises directly under the State-Owned Assets Supervision and Administration Commission (SASAC), the local governments have a certain collusive relationship with the polluting enterprises [22][23][24].

1. Most studies focus on dual-party games and do not consider the participation and influence of the public [34][35][36]. Although this simplifies the theoretical derivation, it cannot reflect the actual situation of China's air pollution control.

2. Although many studies have expanded dual-party games to tri-party games covering the central government, local governments, and polluting enterprises, these studies still fail to fully reflect the complex process of air pollution control in China [37]. In our tri-party game study in 2019, we included the public in the pollution control game but, regrettably, did not include polluting enterprises as a game participant in the research framework [38].
In view of this, and building on the above studies, this paper constructs a Quadrilateral Evolutionary Game Model covering the central government, local governments, polluting enterprises, and the public in order to comprehensively analyze the development and implementation process of China's air pollution control policies. Using this model, the paper systematically studies the evolutionary stable strategies of the four parties involved and proposes solutions to air pollution control in China, making theoretical and practical contributions to the construction of air pollution control systems in developing countries.
The structure of this paper is as follows: Section 2 constructs the Quadrilateral Evolutionary Game Model; Section 3 obtains the evolutionary stable strategies and stability conditions of the various parties; Section 4 reveals the conditions and formation process of the evolutionary stable strategies by constructing and solving the four-dimensional dynamic system of dynamic game evolution; and Section 5 concludes the paper and provides policy recommendations based on the results of the game analysis.
Sustainability 2020, 12, 1756

The Parties in the Game and Their Strategy Choices
The Central Government
The central government determines the performance of local governments in environmental supervision by checking whether the local government's supervision report is consistent with on-site inspection results, and determines whether the enterprises comply with the regulations on emissions by monitoring various emission indicators of the polluting enterprises [39]. Given the strategies of the local government and polluting enterprises, the central government can choose to monitor (let the probability be x) or not to monitor (let the probability be (1 − x)) the air pollution control work in different places. Let the long-term social welfare brought by the central government's long-term monitoring of environmental protection be $S_1$, the monitoring cost be $C_1$, and the cost of not monitoring be 0; let the reputation loss due to a lack of monitoring by the central government be $L_1$.

The Local Government
When the central government requires local governments to supervise whether enterprises comply with regulations on emissions, the local governments may implement regulations to improve the local environment. However, the local government may also choose not to regulate because of the high regulatory cost or concerns that strict regulations might result in lower fiscal revenues [40,41]. Therefore, the local governments' behavior strategies include regulating (let the probability be y) and not regulating (let the probability be (1 − y)). When the local governments choose to regulate, this will pressure the enterprises to comply with regulations on emissions and thereby improve the long-term reputation and political achievement of the local government (let it be $S_2$), but it will also incur a regulatory cost (let it be $C_2$). When the local government chooses not to regulate, public health will be adversely impacted, and the local government will face reputation loss and other losses due to population migration and a decrease in the labor force in the long term (let it be $L_2$). If the central government's environmental inspection finds that the local government has chosen not to regulate, the local government will be subject to political penalties (let it be $P_2$).

Enterprises
When the local government requires enterprises to strictly follow the stated emission allowance, the enterprises may, out of a sense of responsibility to the environment or fear of supervision by the central and local governments, choose to comply with the regulations (let the probability be z); the enterprises may also choose not to comply because of the technology investment cost and the potential negative impact on their operating income (let this cost be $C_3$, and let the probability of non-compliance be (1 − z)) [42,43]. If the illegal polluting activity of enterprises is discovered by the central or local governments, the enterprises will be subject to economic penalties (let it be $P_3$) collected by the central government [44] and suffer a reputation loss (let it be $L_3$).

The Public
When the public interest is violated by the excessive emissions of polluting enterprises within the jurisdiction of the local government, the public may choose not to blow the whistle, i.e., to tolerate the enterprises' non-compliant emissions and hope that the local government will enforce the regulations. In this case, the public will suffer a health loss of $L_4$. The public may also choose to blow the whistle in order to protect their legitimate rights and interests. If the non-compliant emissions or illegal polluting activities of the enterprise are confirmed, the whistleblower will receive a reward of $B_4$ [45]. Therefore, the public's behavior strategies include blowing the whistle (let the probability be θ) and not blowing the whistle (let the probability be (1 − θ)). As for the local government, regarding the excessive emissions of polluting enterprises within its jurisdiction, it may choose to regulate for the benefit of public interests; it may also choose not to regulate due to concerns that the local economy might suffer from strict environmental regulations [46,47]. When the public chooses to blow the whistle and the local government chooses to regulate, the whistleblower will receive extra compensation from the polluting enterprise for negative externalities (let it be $R_4$). However, when the public chooses to blow the whistle, they face a cost of $C_4$.

The Game Tree and Parameters
Based on the above-mentioned game participants and strategy choices, this paper has obtained the Quadrilateral Game Tree related to air pollution whistleblowing and air pollution control and supervision, as shown in Figure 2.
Further, Table 1 lists the parameter description, definition, and value range of the different game participants in Section 2.1, in which the parameters x, y, z, and θ are dimensionless, while the other parameters $S_1, C_1, \ldots, P_3, L_4$, etc. are economic variables of the same order of magnitude. This paper does not set specific units for them, which does not affect the model calculations or the analysis of results.
Table 1.

| Parameter | Definition |
| --- | --- |
| $S_1$ | The long-term social welfare due to air quality improvement when the central government monitors, the local government regulates polluting activities, and enterprises comply with the regulations |
| $C_1$ | The monitoring cost of the central government |
| $L_1$ | The reputation loss if the central government chooses not to monitor |
| $S_2$ | The long-term reputation gain and political achievement of the local government if it chooses to regulate polluting activities and encourage emission reduction |
| $C_2$ | The cost of the local government if it chooses to regulate polluting activities and the economic loss brought by strict regulation |
| $L_2$ | The reputation loss if the local government chooses not to regulate |
| $P_2$ | The punishment on the local government if the local government chooses not to regulate and enterprises' polluting activity is caught by the central government |
| $C_3$ | The technology investment cost required by enterprises to comply with the regulations and related impacts on their main operation income |
| $L_3$ | The reputation loss of the enterprise if it chooses to violate the regulations |
| $P_3$ | The penalty on enterprises if their illegal polluting activity is caught by the local government or central government, which belongs to the central government |
| $C_4$ | The cost of the public to blow the whistle |
| $B_4$ | The reward granted to the whistleblower by the central government |
| $R_4$ | The compensation to the whistleblower from polluting enterprises if the local government chooses to regulate |
| $L_4$ | The adverse health impact on the public if the local government chooses not to regulate ($L_4 > 0$) |

Based on the parameters introduced in Section 2.2, the payoff matrix of the Quadrilateral Evolutionary Game for air pollution control is shown in Table 2.
Table 2. The payoff matrix of the quadrilateral evolutionary game for air pollution control. (Headings: The Central Government (a), The Local Government (b).)
The elements of the payoff matrix are shown in Equation (1).
In the above payoff matrix, the four parties continuously adjust their strategies in order to maximize their expected returns. According to evolutionary game theory, when the return of a certain strategy is higher than the average return of the game system, this strategy will gradually spread in the system [48][49][50], i.e., the proportion of individuals adopting the strategy will grow at a rate greater than zero. This process is described by the replicator dynamics equation, a differential equation for the frequency with which a particular strategy is adopted in a system [51][52][53].
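To make the mechanism concrete, the following is a minimal sketch of replicator dynamics for a single party with two strategies. It assumes a constant payoff gap between the two strategies, which is a simplification: in the model of this paper the gap depends on the other parties' probabilities. The function names and the explicit-Euler integration are illustrative choices, not taken from the paper.

```python
def replicator_step(p, payoff_gap, dt=0.01):
    """One explicit-Euler step of dp/dt = p * (1 - p) * payoff_gap.

    p          -- share of the population playing the first strategy
    payoff_gap -- return of the first strategy minus return of the second
                  (assumed constant here; hypothetical simplification)
    """
    return p + dt * p * (1.0 - p) * payoff_gap


def simulate(p0, payoff_gap, steps=5000, dt=0.01):
    """Iterate the replicator equation from the initial share p0."""
    p = p0
    for _ in range(steps):
        p = replicator_step(p, payoff_gap, dt)
    return p


if __name__ == "__main__":
    # A strategy that pays more than the alternative spreads; one that
    # pays less dies out; with no payoff gap the share does not move.
    print(simulate(0.2, payoff_gap=1.0))
    print(simulate(0.8, payoff_gap=-1.0))
    print(simulate(0.5, payoff_gap=0.0))
```

The factor p(1 − p) keeps the share inside [0, 1] and makes both pure strategies rest points, which is why the later analysis only needs to decide which of 0 and 1 is stable.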
Based on the different strategies of the four parties and their corresponding payoff, this paper has established the replicator dynamics equation of each party as follows:

The Replicator Dynamics Equation of the Central Government
The expected return of the central government a when it chooses to monitor can be expressed as
The expected return of the central government a when it chooses not to monitor can be expressed as
Let the probabilities of the central government a choosing to monitor and not to monitor be x and (1 − x), respectively; then the expected return of the central government can be expressed as
The growth rate of the monitoring strategy by the central government, $\frac{dx}{dt}$, is positively correlated with the payoff of this strategy and with its difference in payoff from the other strategies. Therefore, the replicator dynamics equation of the central government can be calculated as follows:
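The specific expressions depend on the entries of the payoff matrix in Table 2. As a hedged sketch of their general shape only, and not a reproduction of the paper's Equation (6), write $U_{a1}$ and $U_{a2}$ for the expected returns of monitoring and not monitoring (these two symbols are assumptions introduced here). The standard two-strategy replicator form is then:

```latex
\begin{aligned}
\bar{U}_a &= x\,U_{a1} + (1 - x)\,U_{a2},\\
F(x) = \frac{dx}{dt} &= x\left(U_{a1} - \bar{U}_a\right) = x(1 - x)\left(U_{a1} - U_{a2}\right).
\end{aligned}
```

The same construction applies to y, z, and θ with the corresponding pairs of expected returns.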

The Replicator Dynamics Equation of the Local Government
The expected return of the local government b when it chooses to regulate emissions can be expressed as
The expected return of the local government b when it chooses not to regulate emissions can be expressed as
Let the probabilities of the local government b choosing to regulate and not to regulate emissions be y and (1 − y), respectively; then the expected return of the local government can be expressed as
The growth rate of the regulating strategy by the local government, $\frac{dy}{dt}$, is positively correlated with the payoff of this strategy and with its difference in payoff from the other strategies. Therefore, the replicator dynamics equation of the local government can be calculated as follows:

The Replicator Dynamics Equation of Enterprises
The expected return of the enterprise c when it chooses to comply with regulations on emissions can be expressed as
The expected return of the enterprise c when it chooses to violate the regulations on emissions can be expressed as
Let the probabilities of the enterprise c choosing to comply with and to violate the regulations on emissions be z and (1 − z), respectively; then the expected return of the enterprise can be expressed as
The growth rate of the compliance strategy by the enterprise, $\frac{dz}{dt}$, is positively correlated with the payoff of this strategy and with its difference in payoff from the other strategies. Therefore, the replicator dynamics equation of the enterprise can be calculated as follows:

The Replicator Dynamics Equation of The Public
The expected return of the public d when they choose to blow the whistle can be expressed as
The expected return of the public d when they choose not to blow the whistle can be expressed as
Let the probabilities of the public d choosing to blow the whistle and not to blow the whistle be θ and (1 − θ), respectively; then the expected return of the public can be expressed as
The growth rate of the whistleblowing strategy by the public, $\frac{d\theta}{dt}$, is positively correlated with the payoff of this strategy and with its difference in payoff from the other strategies. Therefore, the replicator dynamics equation of the public can be calculated as follows:
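Taken together, the four replicator equations form a coupled four-dimensional system in (x, y, z, θ). The sketch below shows one way such a system could be integrated numerically. The payoff-gap functions in `example_gaps` are placeholders invented for illustration and are not the paper's payoffs; only the generic structure dp/dt = p(1 − p) · gap is assumed.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]          # (x, y, z, theta)
Gap = Callable[[State], float]     # payoff advantage of a party's first strategy


def step(state: State, gaps: Sequence[Gap], dt: float = 0.01) -> State:
    """Advance (x, y, z, theta) by one explicit-Euler step.

    Each coordinate p follows dp/dt = p * (1 - p) * gap(state).
    """
    return tuple(p + dt * p * (1.0 - p) * g(state) for p, g in zip(state, gaps))


def run(state: State, gaps: Sequence[Gap], steps: int = 20000, dt: float = 0.01) -> State:
    """Integrate the coupled system for a fixed number of steps."""
    for _ in range(steps):
        state = step(state, gaps, dt)
    return state


# Placeholder gaps (hypothetical, NOT derived from Table 2). They only show
# how each party's incentive can depend on the other parties' probabilities.
example_gaps = (
    lambda s: 0.5 - s[2],          # central government: gap falls as compliance z rises
    lambda s: s[0] - 0.3,          # local government: gap rises with monitoring x
    lambda s: s[0] + s[1] - 0.8,   # enterprises: gap rises with x and y
    lambda s: 0.6 - s[1],          # public: gap falls as regulation y rises
)

if __name__ == "__main__":
    print(run((0.4, 0.4, 0.4, 0.4), example_gaps))
```

Replacing the placeholder lambdas with the payoff differences derived from the paper's payoff matrix would give trajectories for the actual model.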

Results
Based on the game model constructed in Section 2, this paper will discuss the stable strategies and stability conditions from the perspective of all parties.

The Dynamic Trend and Evolutionary Stable Points of the Central Government
It can be seen from Equation (6) that the main factors that determine the central government's tendency to choose the monitoring strategy include the following:
1. The probability of the other parties' strategy decisions, such as the probability of the local government choosing to regulate y, the probability of the enterprise choosing to comply with regulations on emissions z, and the probability of the public choosing to blow the whistle θ;
2. The costs and benefits of the central government's strategies, including the monitoring cost $C_1$, the long-term social welfare due to long-term monitoring $S_1$, the reputation loss due to lack of monitoring $L_1$, the economic or political penalties on local governments $P_2$, the penalties on non-compliant enterprises $P_3$, and the reward to whistleblowers $B_4$.
According to Equation (6), let A(y, z, θ) = 0. When any of the three conditions listed in Equation (19) is met, it can be obtained that F(x) ≡ 0, which means that when any of the probabilities listed above meets the specified conditions, the central government will choose not to monitor, and the game system will be in a stable state, as shown in Figure 3a.

In the case of $A(y, z, \theta) \neq 0$, let F(x) = 0; two stable points of x can be obtained: 0 and 1. Taking the derivative of F(x) in Equation (6) with respect to x yields Equation (20). In Equation (20), if $A(y, z, \theta) < 0$, i.e., $y > y_1$, $z > z_1$, $\theta > \theta_1$, then $\left.\frac{dF(x)}{dx}\right|_{x=0} < 0$ and $\left.\frac{dF(x)}{dx}\right|_{x=1} > 0$. In this case, x = 0 is the evolutionary stable point, which represents the only global evolutionary stable strategy; that is, the central government will tend to choose the stable strategy of not monitoring, as shown in Figure 3b. This means that when the probability of the local government choosing to regulate is higher than the critical value $y_1$, when the probability of the enterprise choosing to comply with regulations on emissions is higher than the critical value $z_1$, or when the probability of the public choosing to blow the whistle is higher than the critical value $\theta_1$, the enterprises will have a higher probability of compliant emissions, air pollution will be effectively controlled, and the central government will reduce its monitoring efforts. In this case, the optimal strategy choice of the central government is "not monitoring".
Conversely, if $A(y, z, \theta) > 0$, i.e., $y < y_1$, $z < z_1$, $\theta < \theta_1$, then $\left.\frac{dF(x)}{dx}\right|_{x=1} < 0$ and $\left.\frac{dF(x)}{dx}\right|_{x=0} > 0$. In this case, x = 1 is the evolutionary stable point, which represents the only global evolutionary stable strategy; that is, the central government will tend to choose the stable strategy of monitoring, as shown in Figure 3c. This means that when the probability of the local government choosing to regulate is lower than the critical value $y_1$, when the probability of the enterprise choosing to comply with regulations on emissions is lower than the critical value $z_1$, or when the probability of the public choosing to blow the whistle is lower than the critical value $\theta_1$, the local government will spend less effort on regulatory activities and the public will have less incentive to blow the whistle, resulting in a lower probability of compliant emissions by polluting enterprises, so air pollution will not be effectively controlled. In this case, the optimal strategy choice of the central government is "monitoring".
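The two cases can be checked mechanically. The sketch below assumes that Equation (6) has the factored form F(x) = x(1 − x)·A(y, z, θ); this form is consistent with the cases described above but is an assumption here, since the equation itself is not reproduced. A rest point is treated as evolutionarily stable when the derivative of F is negative there.

```python
def dF_dx(x, A):
    """Derivative of F(x) = x * (1 - x) * A with respect to x (A held fixed)."""
    return (1.0 - 2.0 * x) * A


def stable_point(A):
    """Return the stable rest point of x (0.0 or 1.0) for a given value of A.

    Returns None when A == 0, where F(x) is identically zero and every x
    is a rest point.
    """
    if A == 0:
        return None
    for x_star in (0.0, 1.0):
        # Stability requires a negative slope of F at the rest point.
        if dF_dx(x_star, A) < 0:
            return x_star
    return None


if __name__ == "__main__":
    print(stable_point(-0.5))  # A < 0: the "not monitoring" case
    print(stable_point(2.0))   # A > 0: the "monitoring" case
```

Under this assumption, the sign of A alone selects the stable strategy, and the critical values $y_1$, $z_1$, $\theta_1$ mark where that sign changes.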

The Dynamic Trend and Evolutionary Stable Points of the Local Government
It can be seen from Equation (10) that the main factors that determine the local government's tendency to choose the regulating strategy include:
1. The probability of the other parties' strategy decisions, such as the probability of the central government choosing to monitor x, the probability of the enterprise choosing to comply with regulations on emissions z, and the probability of the public choosing to blow the whistle θ;
2. The costs and benefits of the local government's strategies, including the regulatory cost of the local government $C_2$, the reputation and political achievement due to long-term regulatory efforts $S_2$, the reputation loss due to insufficient regulatory efforts $L_2$, and the economic or political penalties on local governments $P_2$.
According to Equation (10), let B(x, z, θ) = 0. When any of the three conditions listed in Equation (21) is met, it can be obtained that F(y) ≡ 0, which means that when any of the probabilities listed above meets the specified conditions, the local government will choose not to regulate, and the game system will be in a stable state, as shown in Figure 4a.
Figure 4. The evolutionary phase diagram of the local government's strategy choices: (a) $0 < x = x_2 < 1$, $0 < z = z_2 < 1$, $0 < \theta = \theta_2 < 1$; (b) $0 < x < x_2 < 1$, $0 < z < z_2 < 1$, $0 < \theta_2 < \theta < 1$; (c) $0 < x_2 < x < 1$, $0 < z_2 < z < 1$, $0 < \theta < \theta_2 < 1$.

In the case of $B(x, z, \theta) \neq 0$, let F(y) = 0; two stable points of y can be obtained: 0 and 1. Taking the derivative of F(y) in Equation (10) with respect to y yields Equation (22). In Equation (22), if $B(x, z, \theta) < 0$, i.e., $x < x_2$, $z < z_2$, $\theta > \theta_2$, then $\left.\frac{dF(y)}{dy}\right|_{y=0} > 0$ and $\left.\frac{dF(y)}{dy}\right|_{y=1} < 0$. In this case, y = 1 is the evolutionary stable point, which represents the only global evolutionary stable strategy; that is, the local government will tend to choose the stable strategy of regulating emissions, as shown in Figure 4b. This means that when the probability of the central government choosing to monitor is lower than the critical value $x_2$, when the probability of the enterprise choosing to comply with regulations on emissions is lower than the critical value $z_2$, or when the probability of the public choosing to blow the whistle is higher than the critical value $\theta_2$, with less monitoring by the central government, the enterprises will have a lower probability of compliant emissions, the public will have a stronger tendency to blow the whistle, and air pollution will not be effectively controlled. In this case, the optimal strategy choice of the local government is "regulating emissions".
Conversely, if $B(x, z, \theta) > 0$, i.e., $x > x_2$, $z > z_2$, $\theta < \theta_2$, then $\left.\frac{dF(y)}{dy}\right|_{y=0} < 0$ and $\left.\frac{dF(y)}{dy}\right|_{y=1} > 0$. In this case, y = 0 is the evolutionary stable point, which represents the only global evolutionary stable strategy; that is, the local government will tend to choose the stable strategy of not regulating emissions, as shown in Figure 4c. This means that when the probability of the central government choosing to monitor is higher than the critical value $x_2$, when the probability of the enterprise choosing to comply with regulations on emissions is higher than the critical value $z_2$, or when the probability of the public choosing to blow the whistle is lower than the critical value $\theta_2$, air pollution will be effectively controlled. In this case, from the cost perspective, the optimal strategy choice of the local government is "not regulating emissions".

The Dynamic Trend and Evolutionary Stable Points of the Enterprises
It can be seen from Equation (14) that the main factors that determine the enterprises' tendency to choose the compliance strategy include the following:
1. The probability of the other parties' strategy decisions, such as the probability of the central government choosing to monitor x, the probability of the local government choosing to regulate y, and the probability of the public choosing to blow the whistle θ;
2. The costs and benefits of the enterprises' strategies, including the cost of complying with regulations on emissions $C_3$, the penalty cost due to non-compliant emissions $P_3$, and the compensation paid to whistleblowers by polluting enterprises for the negative externalities caused $R_4$.
According to Equation (14), let C(x, y, θ) = 0. When any of the three conditions listed in Equation (23) is met, it can be obtained that F(z) ≡ 0, which means that when any of the probabilities listed above meets the specified conditions, the enterprises will choose to violate the regulations on emissions, and the game system will be in a stable state, as shown in Figure 5a.
Figure 5. The evolutionary phase diagram of the enterprises' strategy choices: (a) $0 < x = x_3 < 1$, $0 < y = y_3 < 1$, $0 < \theta = \theta_3 < 1$; (b) $0 < x_3 < x < 1$, $0 < y_3 < y < 1$, $0 < \theta_3 < \theta < 1$; (c) $0 < x < x_3 < 1$, $0 < y < y_3 < 1$, $0 < \theta < \theta_3 < 1$.

In the case of $C(x, y, \theta) \neq 0$, let $F(z) = 0$; two stable points of $z$ can be obtained: 0 and 1. It can be inferred from Equation (14) that if $C(x, y, \theta) > 0$, i.e., $x > x_3$, $y > y_3$, $\theta > \theta_3$, then $\left.\frac{dF(z)}{dz}\right|_{z=1} < 0$ and $\left.\frac{dF(z)}{dz}\right|_{z=0} > 0$. In this case, $z = 1$ is the evolutionary stable point, which represents the only global evolutionary stable strategy, that is, the enterprises will tend to choose the stable strategy of complying with regulations on emissions, as shown in Figure 5b. This means that when the probability of the central government choosing to monitor is higher than the critical value $x_3$, when the probability of the local government choosing to regulate is higher than the critical value $y_3$, or when the probability of the public choosing to blow the whistle is higher than the critical value $\theta_3$, the supervision and regulation on air pollution is tightening, with closer monitoring by the central government, stricter regulations on pollution emissions issued by the local government, and stronger willingness of the public to blow the whistle. In this case, in order to avoid penalties charged by the central government and compensation paid to the whistleblowers, the enterprises will choose the optimal strategy of "complying with regulations on emissions".
Conversely, if $C(x, y, \theta) < 0$, i.e., $x < x_3$, $y < y_3$, $\theta < \theta_3$, then $\left.\frac{dF(z)}{dz}\right|_{z=0} < 0$ and $\left.\frac{dF(z)}{dz}\right|_{z=1} > 0$. In this case, $z = 0$ is the evolutionary stable point, which represents the only global evolutionary stable strategy, that is, the enterprises will tend to choose the stable strategy of not complying with regulations on emissions, as shown in Figure 5c. This means that when the probability of the central government choosing to monitor is lower than the critical value $x_3$, when the probability of the local government choosing to regulate is lower than the critical value $y_3$, or when the probability of the public choosing to blow the whistle is lower than the critical value $\theta_3$, the supervision and regulation on air pollution is loosening, with less monitoring by the central government, fewer regulations on pollution emissions issued by the local government, and less initiative of the public to blow the whistle. In this case, from the cost perspective, the optimal strategy choice of the enterprises is "violating the regulations on emissions".
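The sign argument for the enterprises can be written out explicitly. The sketch below assumes that Equation (14), which is not restated in this section, has the standard replicator form $F(z) = z(1-z)\,C(x, y, \theta)$ with $C$ independent of $z$; this form is an assumption made for illustration, not a reproduction of the paper's equation.

```latex
\begin{aligned}
F(z) &= \frac{dz}{dt} = z(1-z)\,C(x, y, \theta), \\
\frac{dF(z)}{dz} &= (1-2z)\,C(x, y, \theta), \\
\left.\frac{dF(z)}{dz}\right|_{z=0} &= C(x, y, \theta), \qquad
\left.\frac{dF(z)}{dz}\right|_{z=1} = -\,C(x, y, \theta).
\end{aligned}
```

Under this assumption, $z = 1$ is stable exactly when $C(x, y, \theta) > 0$, $z = 0$ is stable exactly when $C(x, y, \theta) < 0$, and $F(z) \equiv 0$ when $C(x, y, \theta) = 0$, which matches the three cases of Figure 5.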

The Dynamic Trend and Evolutionary Stable Points of the Public
It can be seen from Equation (18) that the main factors that determine the public's tendency to choose the whistleblowing strategy include:

1. The probability of the other parties' strategy decisions, such as the probability of the central government choosing to monitor $x$, the probability of the local government choosing to regulate $y$, and the probability of the enterprises choosing to comply with regulations on emissions $z$;
2. The costs and benefits of the public's strategies, including the reward from the central government $B_4$, the cost of whistleblowing $C_4$, and the compensation from polluting enterprises for the negative externalities caused $R_4$.
According to Equation (18), let $D(x, y, z) = 0$. When any of the three conditions listed in Equation (25) is met, it can be obtained that $F(\theta) \equiv 0$, which means that when any of the probabilities listed above meets the specified conditions, the public will choose not to blow the whistle, and the game system will be in a stable state, as shown in Figure 6a.
Figure 6. The evolutionary phase diagram of the public's strategy choices: (a) $0 < x = x_4 < 1$, $0 < y = y_4 < 1$, $0 < z = z_4 < 1$; (b) $0 < x_4 < x < 1$, $0 < y_4 < y < 1$, $0 < z < z_4 < 1$; (c) $0 < x < x_4 < 1$, $0 < y < y_4 < 1$, $0 < z_4 < z < 1$.

In the case of $D(x, y, z) \neq 0$, let $F(\theta) = 0$; two stable points of $\theta$ can be obtained: 0 and 1. It can be inferred from Equation (18) that, because $(-x - y + xy)R_4 < 0$, if $D(x, y, z) > 0$, i.e., $x > x_4$, $y > y_4$, $z < z_4$, then $\left.\frac{dF(\theta)}{d\theta}\right|_{\theta=1} < 0$ and $\left.\frac{dF(\theta)}{d\theta}\right|_{\theta=0} > 0$. In this case, $\theta = 1$ is the evolutionary stable point, which represents the only global evolutionary stable strategy, that is, the public will tend to choose the stable strategy of blowing the whistle, as shown in Figure 6b. This means that when the probability of the central government choosing to monitor is higher than the critical value $x_4$, when the probability of the local government choosing to regulate is higher than the critical value $y_4$, or when the probability of the enterprises choosing to comply with regulations on emissions is lower than the critical value $z_4$, most enterprises would choose to violate the regulations on pollution emissions, so air pollution will not be effectively controlled. In this case, with stronger monitoring by the central government and growing regulatory efforts by the local government, from the perspective of health and financial compensation, the optimal strategy choice of the public is "blowing the whistle".
Conversely, if $D(x, y, z) < 0$, i.e., $x < x_4$, $y < y_4$, $z > z_4$, then $\left.\frac{dF(\theta)}{d\theta}\right|_{\theta=1} > 0$ and $\left.\frac{dF(\theta)}{d\theta}\right|_{\theta=0} < 0$. In this case, $\theta = 0$ is the evolutionary stable point, which represents the only global evolutionary stable strategy, that is, the public will tend to choose the stable strategy of not blowing the whistle, as shown in Figure 6c. This means that when the probability of the central government choosing to monitor is lower than the critical value $x_4$, when the probability of the local government choosing to regulate is lower than the critical value $y_4$, or when the probability of the enterprises choosing to comply with regulations on emissions is higher than the critical value $z_4$, most enterprises would choose to comply with the regulations on pollution emissions, so air pollution will be effectively controlled and the public's health will be protected. In this case, from a cost perspective, the optimal strategy choice of the public is "not blowing the whistle".
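The three cases for a single party (positive, negative, and zero payoff gap) can be illustrated numerically. The following is a minimal sketch that assumes the one-dimensional replicator form dp/dt = p(1 - p)G, where G stands in for $C(x, y, \theta)$ or $D(x, y, z)$ and is held constant as a simplification. The function name `simulate_replicator` and all numeric values are hypothetical and are not taken from the paper.

```python
def simulate_replicator(p0, payoff_gap, dt=0.01, steps=5000):
    """Euler-integrate dp/dt = p * (1 - p) * payoff_gap starting from p0.

    payoff_gap plays the role of C(x, y, theta) or D(x, y, z); it is held
    constant here, which is a simplifying assumption for illustration.
    """
    p = p0
    for _ in range(steps):
        p += dt * p * (1.0 - p) * payoff_gap
        p = min(1.0, max(0.0, p))  # keep the probability inside [0, 1]
    return p


# Hypothetical payoff gaps, chosen only to show the three regimes.
converges_up = simulate_replicator(0.1, 0.5)     # gap > 0: p moves towards 1
converges_down = simulate_replicator(0.9, -0.5)  # gap < 0: p moves towards 0
stays_put = simulate_replicator(0.3, 0.0)        # gap = 0: every p is a rest point

print(converges_up, converges_down, stays_put)
```

With a positive gap the share adopting the strategy drifts to 1, with a negative gap it drifts to 0, and with a zero gap it does not move, which corresponds to panels (b), (c), and (a) of Figures 5 and 6 respectively.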

The Four-Dimensional Dynamic System and Its Equilibrium Points
In order to reveal the conditions and process of the formation of the above evolutionary stable strategies, this section expands the analysis by constructing and solving a four-dimensional dynamic system of dynamic game evolution. According to Friedman, the stability of the equilibrium point of a group dynamic system represented by a differential equation can be determined by the stability analysis of the Jacobian matrix [54]. Therefore, this paper has adopted the Jacobian matrix stability analysis method to study the stability of the equilibrium points in the evolutionary game. A four-dimensional dynamic system is obtained based on the replicator dynamics equations of the four parties, as shown in Equation (27), which is the combination of Equations (6), (10), (14), and (18).
This paper solves this four-dimensional dynamic system made up of the game strategies of the central government, the local government, enterprises, and the public. When $F(x) = 0$, $F(y) = 0$, $F(z) = 0$, and $F(\theta) = 0$, this paper has obtained multiple feasible solutions:

1. There are 16 equilibrium points for four-party pure strategy solutions, given in Equation (28). In these solutions, the strategy selection probability of every participant in the quadrilateral game is either 0 or 1, so there are 16 strategy sets.
2. There are at least eight equilibrium points for dual-party pure strategy solutions, given in Equation (29). In these solutions, only two of the participants in the quadrilateral game have a strategy selection probability of 0 or 1, and the remaining two parties have strategy selection probabilities of uncertain values. According to Equation (29), there are at least eight strategy sets.
3. There are at least two equilibrium points for single-party pure strategy solutions, given in Equation (30). In these solutions, only three of the participants in the quadrilateral game have a strategy selection probability of 0 or 1. According to Equation (30), there are at least two strategy sets.
4. There may be a mixed strategy solution $E_{26}(x^*, y^*, z^*, \theta^*)$, whose existence requires the following conditions to be satisfied:

$$
\begin{cases}
A(y, z, \theta) = 0 \\
B(x, z, \theta) = 0 \\
C(x, y, \theta) = 0 \\
D(x, y, z) = 0
\end{cases}
\quad \text{and} \quad x^*, y^*, z^*, \theta^* \in (0, 1) \tag{31}
$$

To summarize, when solving Equation (27), we can get a large number of feasible solutions. However, many solutions have only mathematical meaning rather than practical significance. Therefore, this paper chooses feasible solutions that can represent real situations and can be represented by mathematical expressions, which include 16 feasible solutions for four-party pure strategy solutions ($E_0$ to $E_{15}$), eight feasible solutions for dual-party pure strategy solutions ($E_{16}$ to $E_{23}$), and two feasible solutions for single-party pure strategy solutions ($E_{24}$ and $E_{25}$).
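The count of 16 pure strategy solutions follows from the structure of replicator dynamics: if each equation has the form F(p) = p(1 - p)G, every corner of the unit hypercube is a rest point regardless of the payoff gaps. The sketch below checks this under that assumed form. The four gap functions are hypothetical stand-ins for $A$, $B$, $C$, and $D$ (whose expressions are not restated in this section), and no attempt is made to reproduce the paper's numbering of $E_0$ to $E_{15}$.

```python
from itertools import product


def replicator_field(point, gaps):
    """Return (F(x), F(y), F(z), F(theta)) with F_i = p_i * (1 - p_i) * G_i(point)."""
    return tuple(p * (1 - p) * gap(point) for p, gap in zip(point, gaps))


# Hypothetical payoff gaps for (x, y, z, theta); illustrative only,
# not the paper's A(y, z, theta), B(x, z, theta), C(x, y, theta), D(x, y, z).
GAPS = (
    lambda pt: 0.5 - 0.2 * pt[1] + 0.1 * pt[2],   # central government
    lambda pt: -0.1 + 0.6 * pt[0] - 0.2 * pt[3],  # local government
    lambda pt: -0.3 + 0.4 * pt[0] + 0.3 * pt[3],  # enterprises
    lambda pt: 0.2 + 0.1 * pt[0] - 0.5 * pt[2],   # public
)

# All combinations of pure strategies: each probability is 0 or 1.
pure_points = list(product((0, 1), repeat=4))

# Every pure strategy combination makes all four replicator equations vanish.
rest_points = [pt for pt in pure_points
               if replicator_field(pt, GAPS) == (0, 0, 0, 0)]

print(len(pure_points), len(rest_points))
```

Because the factor p(1 - p) vanishes at 0 and 1, the check does not depend on the particular gap functions; only the partially mixed and fully mixed solutions require the gaps themselves to be zero.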

The Stability of the Four-Dimensional Dynamic System
In a multiple-party evolutionary game, the necessary and sufficient condition for an evolutionary stable equilibrium $E$ is that $E$ represents a strict Nash equilibrium [55]. If the evolutionary stable equilibrium $E$ is asymptotically stable, then $E$ must satisfy a strict Nash equilibrium, and the strict Nash equilibrium must be a pure strategy equilibrium [56]. According to Lyapunov's stability theory [57,58], the eigenvalues of the Jacobian matrix can determine the asymptotic stability of the equilibrium points of the system, that is, the necessary and sufficient condition for an equilibrium point in a replicator dynamics system to represent an evolutionary stable strategy is that all the eigenvalues of its Jacobian matrix are negative real numbers [59]. The Jacobian matrix of the four-dimensional dynamic system is shown in Equation (32). When the equilibrium point is $E_0$ ($x = 0$, $y = 0$, $z = 0$, $\theta = 0$), the Jacobian matrix is written as Equation (33).
Similarly, the Jacobian matrix and its eigenvalues can be obtained for the 27 equilibrium points of the system, as shown in Table 3. According to Lyapunov's stability conditions, when all the eigenvalues $\lambda$ of the Jacobian matrix are negative real numbers ($\lambda < 0$), the corresponding equilibrium point is a stable point; when all the eigenvalues are positive real numbers, the corresponding equilibrium point is an unstable point [60,61]; and when the eigenvalues contain both negative and positive real numbers, the corresponding equilibrium point is a saddle point [62,63].

Table 3. The stability of equilibrium points in the quadrilateral evolutionary game (columns: equilibrium point, eigenvalues, asymptotic stability condition).

In Table 3, it is difficult to determine the evolutionary stability of the quadrilateral evolutionary game system based on the available information. Considering that this paper mainly focuses on the compliant emission behavior of the enterprises under the supervision and regulation of the governments and the monitoring of the public, this paper sets aside the equilibrium points in which the enterprises violate or partially comply with the regulations on emissions and keeps only the eight equilibrium points that represent the most ideal, more-than-ideal, and ideal scenarios. Then, this paper studies the stability conditions of these eight equilibrium points. Taking $E_2$ as an example, the asymptotic stability condition for $E_2$ is: $C_1 > L_1 + S_1$, $C_2 > L_2 + S_2$, $C_3 < L_3$, and $C_4 > 0$. According to the parameter settings in Section 2.2 based on real-world situations, $S_1 \gg \max\{C_1, L_1, P_2, P_3, B_4\} > 0$. Therefore, the condition $C_1 > L_1 + S_1$ cannot be satisfied, and thus the equilibrium point $E_2$ is not an asymptotically stable point. Similarly, the condition for $E_5$ to be asymptotically stable ($-B_4 - C_1 + L_1 + S_1 < 0$) cannot be satisfied either. After analyzing the remaining six equilibrium points using the same method, this paper has obtained four possible asymptotically stable points: $E_9$ ($x = 1$, $y = 0$, $z = 1$, $\theta = 0$), $E_{11}$ ($x = 1$, $y = 1$, $z = 1$, $\theta = 0$), $E_{13}$ ($x = 1$, $y = 0$, $z = 1$, $\theta = 1$), and $E_{15}$ ($x = 1$, $y = 1$, $z = 1$, $\theta = 1$). However, the local government and the public both require a positive payoff. Therefore, only $E_{15}$ ($x = 1$, $y = 1$, $z = 1$, $\theta = 1$) has the best chance of meeting the asymptotic stability condition. It can be inferred from the stability conditions of these four equilibrium points that the value of the environmental political achievement ($S_2$) is critical to the local government.
If the value of the environmental political achievement is lower than the environmental regulatory cost, the stable equilibrium point of the system will move towards $E_9$ and $E_{13}$, which will increase the pressure on the central government to enhance monitoring. The public's battle against pollution by enterprises puts restrictions on the enterprises' environmental behaviors and plays the role of third-party supervision. The reputation loss of enterprises if the whistle is blown ($P_3$) and the design of mechanisms of compensation for negative externalities ($R_4$) and reward ($B_4$) are particularly critical. If the whistleblowing cost is too high, the stable equilibrium point of the system will move towards $E_{11}$, greatly reducing the incentive of the public to play an active role in whistleblowing.
Therefore, in China's air pollution control campaign, in order to achieve the most ideal evolutionary stable strategy in which the central government monitors, the local governments regulate, and the enterprises follow the regulations, the key points include emphasizing the environmental political achievement of the local governments, the environmental reputation of enterprises, and the whistleblowing incentive mechanism innovation. The environmental management of China should utilize the third-party supervisory role of the public apart from administrative intervention and market mechanisms.
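The eigenvalue test used in this section can be sketched for the pure strategy equilibrium points. Assuming each replicator equation has the form F_i = p_i(1 - p_i)G_i, with G_i independent of p_i, the Jacobian at a corner of the unit hypercube is diagonal with entries (1 - 2p_i)G_i, so its eigenvalues can be read off directly. The gap functions and their coefficients below are hypothetical and do not use the paper's payoff parameters; the point is only to show how the stable, unstable, and saddle classifications arise.

```python
def corner_eigenvalues(point, gaps):
    """Eigenvalues of the Jacobian at a pure-strategy corner.

    For F_i = p_i * (1 - p_i) * G_i, the off-diagonal terms carry the factor
    p_i * (1 - p_i) and vanish at p_i in {0, 1}, leaving (1 - 2 * p_i) * G_i.
    """
    return [(1 - 2 * p) * gap(point) for p, gap in zip(point, gaps)]


def classify(eigenvalues):
    """Apply the sign rule described in the text."""
    if all(v < 0 for v in eigenvalues):
        return "stable"
    if all(v > 0 for v in eigenvalues):
        return "unstable"
    if any(v < 0 for v in eigenvalues) and any(v > 0 for v in eigenvalues):
        return "saddle"
    return "undetermined"  # at least one zero eigenvalue and no sign mix


# Hypothetical payoff gaps for (x, y, z, theta); illustrative values only.
GAPS = (
    lambda pt: 0.5 + 0.2 * pt[2],                 # central government
    lambda pt: 0.3 + 0.4 * pt[0],                 # local government
    lambda pt: -0.2 + 0.5 * pt[0] + 0.3 * pt[3],  # enterprises
    lambda pt: 0.4 - 0.3 * pt[2],                 # public
)

for corner in [(1, 1, 1, 1), (0, 0, 0, 0), (0, 0, 1, 0)]:
    print(corner, classify(corner_eigenvalues(corner, GAPS)))
```

With these illustrative gaps, every party gains from its active strategy when all others are active, so the all-ones corner has four negative eigenvalues. Changing the sign of any gap at that corner turns it into a saddle, which mirrors how the stability of $E_{15}$ depends on the cost and benefit parameters.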

The Ideality of the Solutions for the Four-Dimensional Dynamic System
This paper has classified the outcomes of 16 equilibrium points for four-party pure strategy solutions, eight equilibrium points for dual-party pure strategy solutions, two equilibrium points for single-party pure strategy solutions, and the mixed strategy solution into five categories, which in the order from the best to the worst scenarios are Most Ideal, More than Ideal, Ideal, Less than Ideal, and Least Ideal (see Table 4).
Among the outcomes, the least ideal scenario is that the enterprises choose to violate the regulations on emissions regardless of the monitoring, regulations or whistleblowing activities by the governments and the public, which is represented by five equilibrium points. The less-than-ideal scenario is that the enterprises choose to violate the regulations on emissions or partially comply with the regulations with the monitoring, regulations or whistleblowing activities by the governments and the public, which includes 14 equilibrium points. The ideal scenario is that the enterprises choose to comply with the regulations on emissions with the monitoring, regulations and whistleblowing activities by the governments and the public, which is represented by four equilibrium points. The more-than-ideal scenario is that the enterprises choose to comply with the regulations on emissions with either the monitoring, regulations or whistleblowing activities by the governments and the public, which covers three equilibrium points. The most ideal scenario is that the enterprises choose to comply with the regulations on emissions without the monitoring, regulations or whistleblowing activities by the governments and the public, which is represented by one equilibrium point.

Conclusions
This paper has innovatively constructed a quadrilateral evolutionary game model involving the central government, local governments, polluting enterprises, and the public in order to comprehensively analyze the development and implementation of China's air pollution control policies. By using the quadrilateral evolutionary game model, this paper has systematically studied the evolutionary stable strategies of the four parties involved and has obtained 27 equilibrium points, strategy sets, and their corresponding policy performance with the help of the four-dimensional dynamic system. The research results distinguish five scenarios. The least ideal scenario is that the enterprises choose to violate the regulations on emissions regardless of the monitoring, regulations or whistleblowing activities by the governments and the public, which includes five equilibrium points. The less-than-ideal scenario is that the enterprises choose to violate the regulations on emissions or partially comply with the regulations with the monitoring, regulations or whistleblowing activities by the governments and the public, which includes 14 equilibrium points. The ideal scenario is that the enterprises choose to comply with the regulations on emissions with the monitoring, regulations and whistleblowing activities by the governments and the public, which is represented by four equilibrium points. The more-than-ideal scenario is that the enterprises choose to comply with the regulations on emissions with either the monitoring, regulations or whistleblowing activities by the governments and the public, which covers three equilibrium points. The most ideal scenario is that the enterprises choose to comply with the regulations on emissions without the monitoring, regulations or whistleblowing activities by the governments and the public, which is represented by one equilibrium point.
By analyzing the eight equilibrium points that represent the ideal, more-than-ideal, and most ideal scenarios, especially the four asymptotically stable points among them, this paper has obtained the conditions for these four stable points as well as related policy implications.
In order to achieve the ideal or most ideal outcome of air pollution control policies when there are multiple parties involved, on the one hand, costs need to be reduced, including the monitoring cost, the regulatory cost, the compliance cost, and the whistleblowing cost; on the other hand, the supervisory responsibility of the central government on air pollution control should be shared with the local governments and the public. This requires further enhancing the local governments' understanding of and motivation for environmental regulation, further reducing the regulatory cost of local governments and the compliance cost of enterprises, and further encouraging the public to actively report air pollution incidents and sources. To enhance the local governments' understanding and motivation, performance evaluation could be used, with performance evaluation indicators set up in addition to economic indicators such as GDP growth, while special funds for air pollution could be used to reduce regulatory and compliance costs. In order to encourage the public to actively report air pollution incidents and participate in the battle against air pollution, first of all, it is necessary to strengthen the public's awareness of environmental protection. Apart from that, a better reward and compensation system for whistleblowing activities should be designed, including honors and cash rewards. Finally, better whistleblowing channels should be provided to the public, such as developing a smartphone application and a WeChat Applet for the "12369 Environmental Protection Whistleblowing Inter-Connected Management Platform".
The research in this paper is mainly a theoretical analysis of air pollution control and the quadrilateral regulatory system. Based on this research, we will evaluate the process of air pollution control in the future, which may help improve air pollution governance in China.

Conflicts of Interest:
The authors declare no conflict of interest.