Research on Evolutionary Game Analysis of Spatial Cooperation for Social Governance of Basin Water Pollution
Water 2022, 14, 2564

Abstract: Given that the two institutional arrangements of government regulation and market allocation cannot effectively resolve the conflict between individual and collective interests in water pollution control, this work explores a third institutional arrangement of environmental governance, social governance, to overcome the dilemma. Based on common pool resource theory and the multi-person prisoner's dilemma framework, it incorporates an environmental damage function, a spatial network structure, and a learning-based strategy update mechanism into the analysis. It then constructs a set of spatial cooperative evolutionary game models of basin water pollution social governance to test whether the conditions for spontaneous collective action by basin polluters can guarantee the long-term survival of the new system. Monte Carlo numerical simulation is used for the experiments. The results show that collective action relying entirely on emitters can form successfully, but that it requires a large initial scale of cooperation; that is, a majority of the emitter group must autonomously abide by credible commitments. In this process, transparent full information and active organizational mobilization have a positive effect on the development of collective action. Combining the two can better guide emitters to abide by credible commitments and achieve the optimal collective interest. The results provide a theoretical and practical reference for social governance mechanisms in large-scale basins.


Introduction
Heavy pollution discharge from rapid economic development poses a serious threat to the basin water environment and severely threatens water security and sustainability [1,2]. As water pollution in China's river basins has gradually emerged, various provinces and cities have attempted to govern it, but these efforts have not been enough to alleviate its severity [2][3][4]. China urgently needs to explore new approaches and modes of basin governance to improve the basin water environment and achieve sustainable water use.
Currently, China usually adopts two governance modes, government dominance and market allocation, to control basin water pollution [5]. In line with these two modes, many studies take the local governments of the river basin as the main bearers of governance responsibility and explore feasible schemes through the allocation of governance responsibility and the trading of emission rights [4][5][6][7][8][9][10][11][12][13][14][15]. However, neither mode seems to have achieved the expected results in actual governance [5,16]. One important reason is that they neglect the individual rationality, or interest demands, of the large number of polluters behind the government. Individual rationality cannot simply be replaced by collective rationality, especially given the inconsistency between individual and collective interests caused by pollution externality. These polluters are also victims of pollution and could therefore benefit from environmental governance [17,18]. Therefore, whether out of egoism or altruism, urging these polluters to overcome the conflict between individual and collective rationality and to break out of the "prisoner's dilemma" caused by pollution externality is the key to solving the water pollution problem [18].
The cooperative game is considered an appropriate method for solving such problems [19][20][21][22]. However, the traditional cooperative game is usually suited to a game among a few agents, as in [20,23,24]. Game models that can accommodate many players (populations) have been developed, and the public goods game is a representative example that can capture finitely repeated, multi-stage, population-level social dilemma games [25]. It is a standardized model that extends the two-person prisoner's dilemma to population interaction and succinctly describes the multi-person prisoner's dilemma, that is, whether the individuals in the population are willing to bear the cooperation cost together to realize the collective interest [26][27][28][29][30]. At present, research on the public goods game mainly focuses on the individual behavior mechanism [31], games with spatial structure [32,33], the strategy update mechanism [34,35], and strengthening mechanisms for promoting cooperation [36][37][38][39][40][41][42]. Unlike in the general public goods game, the benefits of water pollution treatment usually change with the treatment input: the benefits of environmental governance increase as the cooperation scale grows and decrease as it shrinks. Therefore, we develop a cooperative game model of a population of homogeneous individuals in a given spatial structure by linking the water environment dynamics with collective behavior. The model is used to explore the following three questions: (1) Can the population form cooperation spontaneously? (2) Is collective cooperation affected by the strategy distribution in the population? (3) Can cooperation be improved by intervention through individual conflicts and government?
This study's contributions can be summarized as follows: (1) it enriches theoretical research on basin water pollution control under a social governance system; (2) it constructs a spatial cooperation evolutionary game model of the basin water pollution social governance system, providing a mathematical basis and analysis tool for assessing the new system's effectiveness; (3) unlike the common public goods game, it considers the variability of environmental benefits; (4) it expounds the conditions for spontaneous collective action after the new system is supplied, providing a theoretical, practical, and policy reference for the basin water pollution governance system.

Environment Benefits from Pollution Governance
We assume a basin system in which the water environment can be improved by reducing emissions and in which the emission reduction is acceptable to the polluters. This system consists of the basin water environment, polluters, and the government. According to their compliance with pollution regulation, polluters are divided into cooperators and traitors. A cooperator's emission is set as $e_c$, an emission volume generally accepted by basin society. Correspondingly, a traitor, out of self-interest, maintains the original emission volume $e_d$ ($e_d > e_c$). Their joint emission constitutes the total basin water pollution.
Let $N$ denote the number of all agents and $f_c$ the cooperation percentage. The total pollution emission $E$ is then given by Formula (1):
$$E = N\left[f_c\, e_c + (1 - f_c)\, e_d\right] \tag{1}$$
Generally, different pollution discharge loads cause different environmental losses. Consequently, we assume that the outbreak probability of water environment risk is closely related to the total pollution emission and denote it by $P(E)$. $D_{\max}$ denotes the loss from the greatest potential hazard. The potential environmental loss caused by different pollution discharge loads can then be represented by Equation (2).
Obviously, total pollution increases with the number of traitors. All polluters must shoulder the loss when the total emission exceeds the water environment capacity. Conversely, cooperators bear the emission reduction cost, while the benefits of reduced water environment risk are shared by all members. This also means that the probability of a pollution hazard outbreak is determined by the cooperation scale in the basin. Consequently, this study uses the cooperation scale in the basin to reflect the outbreak risk of a basin water pollution hazard: a higher cooperation scale denotes a lower outbreak risk and better performance of the new governance system.
When all individuals violate the pollution regulation, the total emission is $E_d$, and basin emitters share the largest water environment loss $D_{\max}$. If some individuals are willing to cooperate, the total emission is $E$, and the joint water environment loss shared by all basin emitters is reduced to $D$ ($D \le D_{\max}$). The difference between the loss $D$ and the largest water environment loss $D_{\max}$ is the environment control benefit $R$ shared by all members, namely:
$$R = D_{\max} - D \tag{3}$$
The individual benefit obtained from environmental governance can be represented as Equation (4).
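The chain from cooperation share to shared benefit can be illustrated with a minimal Python sketch. It assumes the expected-loss form $D = P(E)\,D_{\max}$ for Equation (2), which is one plausible reading of the text, and uses a purely hypothetical outbreak probability that rises linearly once emissions exceed a capacity threshold; the function names and all numeric values are illustrative, not taken from the study.

```python
def total_emission(n_agents, f_c, e_c, e_d):
    """Total emission E for cooperation share f_c: cooperators emit e_c, traitors e_d."""
    return n_agents * (f_c * e_c + (1.0 - f_c) * e_d)


def outbreak_probability(emission, capacity, emission_all_defect):
    """Hypothetical P(E): zero up to the capacity, then linear up to 1 at full betrayal."""
    if emission <= capacity:
        return 0.0
    return min(1.0, (emission - capacity) / (emission_all_defect - capacity))


def governance_benefit(f_c, n_agents, e_c, e_d, capacity, d_max):
    """Shared benefit R = D_max - D, assuming D = P(E) * D_max (an assumption)."""
    e_all_defect = total_emission(n_agents, 0.0, e_c, e_d)
    emission = total_emission(n_agents, f_c, e_c, e_d)
    loss = outbreak_probability(emission, capacity, e_all_defect) * d_max
    return d_max - loss


# Illustrative values only: 2500 agents, e_c = 1, e_d = 2, capacity = 2500, D_max = 10.
benefit_half = governance_benefit(0.5, 2500, 1.0, 2.0, 2500.0, 10.0)
```

With these illustrative numbers the benefit vanishes under full betrayal and reaches $D_{\max}$ under full cooperation, which matches the qualitative statement that a larger cooperation scale lowers the shared loss.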
The Gompertz function is used in this study to simplify the above process. It describes the relationship between the environmental governance benefit and the cooperation scale as an S-shaped curve (Equation (5)):
$$r = r_{\max}\, e^{-\tau e^{-\sigma f_c}} \tag{5}$$
where $r_{\max}$ is the maximum environment benefit coefficient, $\tau$ is the displacement along the x-axis, which determines the inflection point of the curve, and $\sigma$ is the growth rate of the function, which reflects the marginal environmental benefit. To ensure the comparability and generality of the simulation results, the study draws on the parameter settings of traditional public goods game research. A study by Szolnoki et al. [43] shows that when $r$ is set to a fixed value with $r \le 3.74$, the cooperative strategy cannot survive in the population. Only when $r > 3.74$ are there individuals willing to participate in and maintain cooperation. When $r > 5.49$, all individuals participate in collective cooperation autonomously, and no traitors remain. A study by Perc et al. [44] shows that interaction on a grid lowers the cooperator survival threshold ($r = 3.74$) well below that of a well-mixed population; on this basis, the evolution of cooperation can be explored under severe ($r < 3.74$) or relaxed ($r > 3.74$) conditions. Therefore, we limit the environmental governance benefit $r$ to the interval [0, 6], which also means $r_{\max} = 6$. We set $\tau = 150$ and $\sigma = 10$ in this work.

Consequently, polluters face the dilemma of pursuing personal interests versus protecting collective interests. Individuals have to comply with pollution regulations and bear the cooperation cost (i.e., the opportunity cost of pollution reduction) if they want to protect collective interests. However, if there are enough cooperators in the basin, individuals may also benefit from their cooperation strategy because of the increased collective benefits.
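As a quick numerical illustration of the Gompertz relationship with the reported settings (maximum coefficient 6, $\tau = 150$, $\sigma = 10$), the following Python sketch evaluates the benefit coefficient for a given cooperation share. The function name and the symbol `r_max` are notation chosen here for illustration.

```python
import math


def benefit_coefficient(f_c, r_max=6.0, tau=150.0, sigma=10.0):
    """Gompertz curve: r = r_max * exp(-tau * exp(-sigma * f_c))."""
    return r_max * math.exp(-tau * math.exp(-sigma * f_c))


# Sample the S-shaped curve on a coarse grid of cooperation shares.
curve = [(k / 10, benefit_coefficient(k / 10)) for k in range(11)]
```

The coefficient stays near zero for small cooperation shares, rises steeply around the middle of the range, and approaches the maximum as the share nears 1, so whether $r$ lies above or below the 3.74 threshold depends on the current cooperation scale.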
In order to continuously optimize their own return in the dilemma, individuals will compare their own return with the return of others, so as to update their own emission strategy according to the behavior of others with higher income.
In the new governance system of social governance, in order to actively solve the collective problems, some polluters can also spontaneously punish the betrayers, and realize the increase in individual return by strengthening the formation of collective action. On the basis of voluntary social governance by the polluters, the government could implement punishment mechanisms to change the return of the polluters, so as to strengthen the scale of cooperation and achieve proper treatment of water pollution.
The conceptual model of the new governance system of basin water pollution control under social governance is presented in Figure 1. The basic emission decision function, social learning mechanism, the individual strengthened cooperation strategy, and government intervention are determined based on the conceptual model.

Decision Function for Pollutant Emission Behavior
Individual emission activity affects the local environment and thus feeds back into the total return of each emitter in the group. A spatial network structure can reflect this local influence, and it is represented by a square grid (Figure 2) in this work. Assume that the square grid has $N = L \times L$ vertices, where each vertex denotes a representative of a group of emitters and each edge stands for the direct interaction between representatives. In such a structure, each representative has four neighbors with which it interacts directly; the representative and its four neighbors form a group $G$ with $g = 5$ members. Besides its own group $G$, each emitter also belongs to the groups centered on its neighbors, so it takes part in five groups in total. This study also presumes that the acceptable emission amount can be reached through efficiency improvement, without reducing expected output. Then, in a group $G$, the total returns of an emitter's cooperative strategy and betrayal strategy are denoted $\pi_C^G$ and $\pi_D^G$, respectively. Both are functions of $n_c$, the number of existing cooperators in group $G$, and $c_g$, the cooperation cost. If the individual chooses the cooperation strategy, the number of cooperators in the group increases by one; if the non-cooperation strategy is chosen, the number of cooperators in the group does not change. These two functions show that the emission activity of each group member affects the total return of the individual. The non-cooperation strategy spares the individual the cooperation cost, but it also reduces environmental returns. The cooperation strategy improves the total return of the other emitters in the group, but it is not the best choice for the individual itself, especially when the other individuals in the group are uncooperative. In the public goods game, the cost of cooperation is often set as $c_g = 1$.
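The payoff comparison described above can be sketched in Python. The sketch assumes the standard public goods game form, $\pi_C^G = r\,(n_c + 1)/g - c_g$ and $\pi_D^G = r\,n_c/g$, which matches the verbal description (a cooperator adds one member to the cooperator count and pays $c_g$) but is not confirmed as the exact expression used in the study; the function name is hypothetical.

```python
def group_payoff(is_cooperator, n_c_others, r, g=5, c_g=1.0):
    """Payoff of one member from a single group G (standard public goods form, assumed).

    n_c_others counts the cooperators among the other g - 1 members.
    """
    if is_cooperator:
        # The focal individual joins the cooperators and pays the cooperation cost.
        return r * (n_c_others + 1) / g - c_g
    # A traitor shares the benefit produced by the others without paying.
    return r * n_c_others / g


# With r = 3 and four cooperating neighbours, betrayal pays more for the individual,
# even though every extra cooperator raises the return of all group members.
coop_return = group_payoff(True, 4, r=3.0)     # 3 * 5 / 5 - 1
betray_return = group_payoff(False, 4, r=3.0)  # 3 * 4 / 5
```

Under this form, cooperation only pays for the individual within a single group when $r/g > c_g$, which is why the lattice structure and the varying benefit coefficient matter for the outcome.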

Social Learning Strategy Update Based on Self-Organization
As global and local strategies change, individuals also adjust their own actions to protect their interests. The logit rule based on the Fermi function is a myopic strategy response rule with the best performance in evolutionary games, and it is especially suitable for simulating human behavior [44]. Therefore, this work adopts the Fermi function to represent the social learning and strategy update mechanism of individuals. Because of the square grid structure, an emitter affects the total return of its four neighbors, whose actions in turn affect the total return of the emitter. Each individual therefore takes part in the group games centered on itself and on each of its neighbors, so the number of groups $G$ is 5. Suppose an agent with strategy $s_x$ has an overall return benefit $\Pi_{s_x}$, which is the sum of its game payoffs over these groups:
$$\Pi_{s_x} = \sum_{G} \pi_{s_x}^{G}$$
Arbitrarily select an individual $x$ with current strategy $s_x$ and return benefit $\Pi_{s_x}$. Randomly select an individual $y$, one of $x$'s four neighbors, whose strategy $s_y$ differs from $s_x$ and whose return benefit is $\Pi_{s_y}$. The probability that individual $x$ imitates the strategy of individual $y$ is then given by the Fermi function of the two return benefits, with $K = 0.5$. When $\Pi_{s_y} > \Pi_{s_x}$, individual $x$ is more likely to imitate $s_y$. However, this does not mean that individual $x$ will receive a higher return benefit. Such decision-making errors can be attributed to incomplete information and external influences that affect the evaluation of the opponent [44,45].
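The imitation rule can be illustrated with a short Python sketch. It assumes the standard Fermi form $W(s_x \leftarrow s_y) = 1 / \left(1 + \exp[(\Pi_{s_x} - \Pi_{s_y})/K]\right)$ that is common in the spatial public goods game literature; the function names are hypothetical, and the overflow guard is an implementation detail added here.

```python
import math
import random


def fermi_probability(payoff_x, payoff_y, K=0.5):
    """Probability that x adopts y's strategy (standard Fermi form, assumed)."""
    exponent = (payoff_x - payoff_y) / K
    if exponent > 700.0:  # guard against overflow in math.exp
        return 0.0
    return 1.0 / (1.0 + math.exp(exponent))


def imitates(payoff_x, payoff_y, rng, K=0.5):
    """Draw one imitation decision for x against neighbour y."""
    return rng.random() < fermi_probability(payoff_x, payoff_y, K)


rng = random.Random(42)
p_better = fermi_probability(2.0, 3.0)  # neighbour earns more: imitation is likely
p_worse = fermi_probability(3.0, 2.0)   # neighbour earns less: imitation still possible
decision = imitates(2.0, 3.0, rng)
```

Because the probability is never exactly 0 or 1 for finite payoff differences, an individual can occasionally copy a worse-performing neighbour, which corresponds to the decision-making error discussed above.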

Strengthened Cooperation Strategy in the New Governance System
History has repeatedly shown that public goods, whether environmental protection or social welfare systems, are easily exploited by individuals who seek personal gain at the expense of others [46], and the enticing benefits of free-riding often lead to the breakdown of collective cooperation [47]. Therefore, how to set up effective guarantees for cooperation is a key concern for scholars. This work adopts the implementer perspective on strengthened cooperation of Perc [44] and analyzes two mechanisms: the individual strengthened cooperation mechanism (individual conflict and individual punishment) and the government supervision and punishment mechanism. These strengthened cooperation mechanisms are added to the "binary strategy" game as additional strategies, forming the game model of the corresponding mechanism.

Cooperation Game under Individual Strengthened Cooperation Mechanism
Individual conflicts often occur in a population. In the public goods game, individual conflict is represented by individual punishment, which is a strengthened strategy implemented spontaneously by individuals to punish traitors at a corresponding conflict and confrontation cost, and to reduce the benefit of free-riding behavior, thus encouraging emitters to cooperate. This constitutes the individual strengthened cooperation mechanism in the new governance system of basin water pollution control.
The individual punishment mechanism introduces a third competitive strategy on the basis of Equations (5) and (6), namely the individual punishment strategy (peer punishment, abbreviated as Pep). An individual who adopts the individual punishment strategy has the same environmental benefits and cooperation cost as an individual who adopts the cooperation strategy, which means the individual punisher is also a punishing cooperator. However, as a punisher, it also needs to shoulder the punishment or conflict cost $c_{Pep}$. An individual who adopts the betrayal strategy no longer simply enjoys the benefits of free-riding, but is also punished by individual punishers, namely with the fine $f_{Pep}$. Let $c_{Pep} = \alpha f_{Pep}$ with $\alpha > 0$. When $\alpha > 1$, the punishment cost exceeds the fine, so the individual punisher has to pay more to have a certain impact on the free-rider. When $\alpha < 1$, the individual punisher can pay a smaller cost to have the same impact. In the resulting payoff functions, $n_c$, $n_{Pep}$, and $n_d$ are the numbers of agents in group $G$ adopting the cooperation strategy, the individual punishment strategy, and the betrayal strategy, respectively, and $n_d/(g-1)$ and $n_{Pep}/(g-1)$ represent the proportions of traitors and individual punishers in the group. It is not difficult to see that an individual punisher bears the largest punishment cost when all its neighbors are traitors, and that a traitor receives the greatest punishment when all its neighbors are individual punishers.
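A minimal Python sketch of the three-strategy payoffs is given below. It assumes the form commonly used for peer punishment in spatial public goods games: cooperators and punishers share the same benefit and cooperation cost, punishers additionally pay $c_{Pep}\, n_d/(g-1)$, and traitors are fined $f_{Pep}\, n_{Pep}/(g-1)$. The exact expressions of the study may differ; the function name and the strategy labels `"C"`, `"Pep"`, `"D"` are illustrative.

```python
def peer_punishment_payoff(strategy, n_c, n_pep, n_d, r, c_pep, f_pep, g=5, c_g=1.0):
    """Single-group payoff under peer punishment (assumed standard form).

    n_c, n_pep, n_d count the OTHER g - 1 members by strategy.
    """
    if n_c + n_pep + n_d != g - 1:
        raise ValueError("neighbour counts must sum to g - 1")
    contributors = n_c + n_pep
    if strategy == "C":
        return r * (contributors + 1) / g - c_g
    if strategy == "Pep":
        # Same benefit and cost as a cooperator, plus a cost per share of traitors.
        return r * (contributors + 1) / g - c_g - c_pep * n_d / (g - 1)
    if strategy == "D":
        # Free-riding benefit reduced by a fine per share of punishers.
        return r * contributors / g - f_pep * n_pep / (g - 1)
    raise ValueError("strategy must be 'C', 'Pep' or 'D'")


# Extreme cases described in the text, with illustrative parameter values.
punisher_among_traitors = peer_punishment_payoff("Pep", 0, 0, 4, r=3.0, c_pep=0.5, f_pep=1.0)
traitor_among_punishers = peer_punishment_payoff("D", 0, 4, 0, r=3.0, c_pep=0.5, f_pep=1.0)
```

In this form the punisher surrounded only by traitors pays the full cost $c_{Pep}$, and the traitor surrounded only by punishers pays the full fine $f_{Pep}$, matching the two extreme cases noted above.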

Collective Cooperation Game under Government Intervention
In the process of pollution control, governments often take responsibility for supervision and sanctions. They can intervene in emitter behavior through leadership or resources so as to promote collective action. In the public goods game, pool punishment is an important strategy for representing government intervention. Sigmund [48] gave a clear explanation of this and treated it as synonymous with institutionalized punishment. In institutional punishment, the punisher needs to pay the administrative cost of government intervention, regardless of its necessity and efficiency.
In addition to the cooperation cost, individuals who adopt pool punishment also bear a pool penalty cost $c_{gp}$. Because of the existence of pool punishment, traitors need to accept administrative intervention and the associated fine $f_{gp}$. The three competitive strategies each have a benefit return function; that of the traitor is:
$$\pi_D^G = r\,\frac{n_c + n_{gp}}{g} - f_{gp}\, S(n_{gp}) \tag{15}$$
where $n_c$ and $n_{gp}$ are the numbers of individuals in group $G$ who are general cooperators and who adopt the pool punishment strategy, respectively, and $S(n_{gp})$ is a state function. The state function indicates that when there are individuals in the group adopting the pool punishment strategy, traitors are subject to administrative punishment. Otherwise, traitors still enjoy the short-term benefit of free-riding.
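The role of the state function can be illustrated with a small Python sketch. It assumes that $S(n_{gp})$ is a step function equal to 1 when at least one group member adopts pool punishment and 0 otherwise, and that cooperators and pool punishers share the same benefit term; both are assumptions in line with the common pool punishment formulation, and the names are illustrative.

```python
def state_function(n_gp):
    """Assumed step function: 1 if any pool punisher is present in the group, else 0."""
    return 1 if n_gp > 0 else 0


def pool_punishment_payoff(strategy, n_c, n_gp, r, c_gp, f_gp, g=5, c_g=1.0):
    """Single-group payoff under pool punishment (assumed form).

    n_c and n_gp count cooperators and pool punishers among the OTHER g - 1 members.
    """
    contributors = n_c + n_gp
    if strategy == "C":
        return r * (contributors + 1) / g - c_g
    if strategy == "GP":
        # The pool cost is paid up front, whether or not any traitor is present.
        return r * (contributors + 1) / g - c_g - c_gp
    if strategy == "D":
        return r * contributors / g - f_gp * state_function(n_gp)
    raise ValueError("strategy must be 'C', 'GP' or 'D'")


free_rider = pool_punishment_payoff("D", 4, 0, r=3.0, c_gp=0.5, f_gp=1.0)
fined_traitor = pool_punishment_payoff("D", 3, 1, r=3.0, c_gp=0.5, f_gp=1.0)
```

The comparison of `free_rider` and `fined_traitor` shows the switch-like effect: a single pool punisher in the group is enough to remove the full fine from the traitor's return.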

Experiment Design and Evolutionary Process
In evolutionary graph theory, the evolutionary equilibrium is no longer represented by the evolutionarily stable strategy but by the evolutionary steady state (evolutionary dynamic equilibrium). Evolutionary stability also determines the outcome of evolution. This study holds that the evolutionary steady state can be divided into two cases. First, one competitive strategy dominates the entire population, which means other strategies cannot compete with it; the evolution shows a clear growth trend for one strategy and a decline for the others. Second, as pointed out by Perc et al. [44], the average density of each strategy in the population determines the steady state of the evolutionary game, and after sufficient relaxation time the proportions of the different competitive strategies become stable and no longer change with time. The first case can also be understood as a special form of the second case.
Combined with the above conceptual model, the steady-state cooperation scale $f_c^{stable}$ is used to represent the outbreak risk of water environment hazards and the performance of the new governance system. In the evolutionary steady state, a larger cooperation scale means lower basin water environmental risk and better performance of the new governance system.
At the beginning of the evolution simulation, this study assigns the initial emission strategies of all individuals in the basin. Given that the initial strategy distribution state $f_c^{initial}$ greatly affects the chance of successful strategy evolution, this study considers two extremes of the strategy distribution [44], namely the random strategy distribution and the organized strategy distribution, referred to as the random state and the organized state, respectively.
In the random state, strategies are distributed uniformly at random over the vertices of the grid according to the proportion of each strategy; that is, each individual randomly selects a strategy from the strategy set with equal probability. It is often assumed that random states give different strategies the same chance of evolutionary success. In the organized state, the whole spatial structure is divided into several subsystems, and the individuals in each subsystem adopt the same competitive strategy. Emergent phenomena in human society, whether a movement or an initiative, all start locally, that is, from an organized initial state of like-minded individuals [44]. A strategy therefore first needs to form a stable cluster and then compete with other strategies, which ensures that different strategies have the same chance of success in evolution.
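The two initial states can be generated as in the following Python sketch. The random state places an exact share of cooperators at shuffled positions; the organized state is represented here by a single compact block of cooperators filling the grid row by row, which is only one possible way to form subsystems and is an assumption of this sketch. Cooperators are coded as 1 and traitors as 0.

```python
import random


def random_state(L, f_c, seed=None):
    """L x L grid with a share f_c of cooperators (1) at random positions."""
    rng = random.Random(seed)
    n = L * L
    n_coop = round(f_c * n)
    cells = [1] * n_coop + [0] * (n - n_coop)
    rng.shuffle(cells)
    return [cells[k * L:(k + 1) * L] for k in range(L)]


def organized_state(L, f_c):
    """L x L grid where cooperators form one contiguous block from the top rows."""
    n = L * L
    n_coop = round(f_c * n)
    cells = [1] * n_coop + [0] * (n - n_coop)
    return [cells[k * L:(k + 1) * L] for k in range(L)]


random_grid = random_state(50, 0.5, seed=1)
organized_grid = organized_state(50, 0.5)
```

Both grids contain the same number of cooperators, so any difference in the simulated outcome can be attributed to the spatial arrangement rather than to the initial cooperation scale.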
This study argues that both initial strategy distribution states may appear when a new system is established. At the beginning, representatives of the emitter group can freely decide, through the collective choice platform, whether to abide by the credible commitment, so that the population contains both compliant individuals (cooperators) and noncompliant individuals (traitors). This scenario is very similar to live voting and is reflected by the random state. Alternatively, the government can organize and mobilize the emitters in a local area, first ensuring that local emitters abide by the credible commitment and then expanding to the entire basin. This situation is represented by the organized state.
Three evolution simulation experiments were conducted to examine the effect of social governance in basin water pollution control and to explore the conditions for the formation of autonomous collective action: (1) experiment 1, spontaneous cooperation evolution simulation; (2) experiment 2, cooperation evolution simulation under the individual strengthened mechanism; and (3) experiment 3, cooperation evolution simulation under the government punishment mechanism. Each experiment runs a random-state and an organized-state simulation, which yields six scenarios (Table 1). In scenarios 1-2 of experiment 1, simulations were conducted with initial cooperation scales from 0 to 100%. These simulations were used to explore whether autonomous collective action can form spontaneously in the basin and what conditions are needed for it to form without any intervention. The criterion for judging a strengthened cooperation mechanism is whether the mechanism can produce a steady-state cooperation scale. Therefore, based on the results of experiment 1, experiments 2 and 3 only ran simulations with a 50% initial cooperation scale (details in Section 3.2).
Implementing a strategy under a strengthened cooperation mechanism requires a corresponding cost and the setting of reward and punishment values, namely the implementation conditions of the strengthened cooperation strategy. For these numerical settings, this study does not adopt the usual practice of cooperative game research in statistical physics, which treats the values of cost, reward, and punishment as continuous and looks for the evolution results of collective cooperation under different values, as in Szolnoki and Perc [30], Helbing et al. [46], Szolnoki et al. [36], and Perc and Szolnoki [37]. That method is indeed more rigorous, and small differences in numerical settings determine differences in the evolution results. However, in practical applications, it is difficult for the strategy implementer to control the implementation cost, especially small cost differences, and it is difficult for the strategy bearer to perceive small changes in rewards and punishments.
Based on these practical considerations, this work mainly considers the implementation cost and takes the cooperation cost as the standard of comparison. Agents adopting an enhanced cooperation strategy are also cooperators, so they pay the cooperation cost as well as the implementation cost of the enhanced cooperation strategy. To this end, the cost settings add payments of 0.1, 0.5, 1.0, and 2.0 on top of the cooperative governance cost. The size of the cost is regarded as the difficulty of implementing the enhanced cooperation mechanism. The higher the cost, the more difficult it is to apply and implement the enhanced cooperation mechanism, because the implementer needs to bear a high price.
For the numerical setting of punishments, only the influence of punishments on the strategy recipients needs to be considered. Assume a uniformly mixed population in which each individual plays a game with other individuals through random matching; when the punishment intensity is 1, traitors' short-term benefits are no longer guaranteed, which can effectively induce them to join collective cooperation. Considering the cluster effect of structured populations, this work takes the value 1 as the median and sets the punishment values to 0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, and 2.0, giving a total of 11 cases.
In short, costs, rewards, and punishments determine the conditions of a strengthened cooperation mechanism. Usually, a smaller cost makes it easier to implement the strategy of strengthening cooperation, while a larger reward or punishment better guarantees the effect of the strategy. The simulation experiments and their parameters are presented in Table 1.
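The discrete parameter settings described above can be enumerated with a few lines of Python. Combining every additional cost with every punishment value into a full grid is an assumption of this sketch; the constant and function names are illustrative.

```python
from itertools import product

# Additional implementation costs on top of the cooperation cost.
EXTRA_COSTS = [0.1, 0.5, 1.0, 2.0]
# Punishment values from 0 to 2.0 in steps of 0.2, with 1.0 as the median.
FINES = [round(0.2 * k, 1) for k in range(11)]


def parameter_grid():
    """All (cost, fine) combinations; the full cross product is an assumption."""
    return [{"cost": c, "fine": f} for c, f in product(EXTRA_COSTS, FINES)]


grid = parameter_grid()
```

Each entry of `grid` can then be passed to a simulation run, so that the effect of cost and punishment strength is compared on the same discrete set of values.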
This work adopts the Monte Carlo simulation method to simulate the evolution process. An individual is randomly selected from the game participants and decides, through learning, whether to update its strategy. Each such selection, whether the individual updates or keeps its strategy, counts as one Monte Carlo step, which is the basic unit of evolution time. In each Monte Carlo step only one individual can update its strategy; its neighbors do not update along with it, so all other strategies remain unchanged during that step. A complete Monte Carlo simulation usually needs at least L^2 steps, so that each individual has a chance to update its strategy. The simulation must also allow sufficient relaxation time for the proportions of the competing strategies to stabilize or to form a relatively clear evolution trend (for example, the cooperative strategy constantly increasing while other strategies constantly decrease), thus ensuring a complete evolution process.
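The elementary step described above can be sketched as follows. This is only an illustration under stated assumptions, not the model's actual implementation: the Fermi imitation rule, the von Neumann neighbourhood, the periodic boundary, and the `toy_payoff` function (a simple public-goods-style payoff with fixed `r`) are all hypothetical stand-ins for the learning mechanism and environmental benefit function used in this work.

```python
import math
import random

NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # von Neumann neighbourhood (assumed)


def toy_payoff(grid, x, y, r=3.0, cost=1.0):
    """Hypothetical payoff: equal share of the group's public good minus own cost."""
    L = len(grid)
    group = [(x, y)] + [((x + dx) % L, (y + dy) % L) for dx, dy in NEIGHBOURS]
    n_coop = sum(grid[i][j] for i, j in group)
    return r * cost * n_coop / len(group) - cost * grid[x][y]


def monte_carlo_step(grid, payoff, rng, noise=0.5):
    """One Monte Carlo step: a single random site may imitate one random neighbour."""
    L = len(grid)
    x, y = rng.randrange(L), rng.randrange(L)
    dx, dy = rng.choice(NEIGHBOURS)
    nx, ny = (x + dx) % L, (y + dy) % L  # periodic boundary (assumed)
    gap = payoff(grid, x, y) - payoff(grid, nx, ny)
    p_imitate = 1.0 / (1.0 + math.exp(gap / noise))  # Fermi rule (assumed)
    if rng.random() < p_imitate:
        grid[x][y] = grid[nx][ny]  # only the selected site can change in this step


def run(L, initial_coop, steps, payoff=toy_payoff, seed=0):
    """Run `steps` Monte Carlo steps from a random start; return the cooperator fraction."""
    rng = random.Random(seed)
    grid = [[1 if rng.random() < initial_coop else 0 for _ in range(L)]
            for _ in range(L)]
    for _ in range(steps):
        monte_carlo_step(grid, payoff, rng)
    return sum(map(sum, grid)) / (L * L)


final_fraction = run(L=10, initial_coop=0.7, steps=10 * 10)  # L^2 elementary steps
print(final_fraction)
```

Passing a different `seed` to `run` gives an independent realization, which is how repeated experiments could be averaged.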
For pollution control activities at the basin scale, even if representatives are appointed to stand in for large-scale control entities (polluters), the number of representatives is still very large, on the order of tens of thousands. For this reason, the game experiment can only be completed by reducing the number of game agents while meeting the other experimental requirements. This work sets L = 50, giving a total of 2500 individuals. All competitive strategies are distributed on the grid vertices, each representing the initial strategy choice of the individual at that point. Each strategy decision an agent makes after social learning is counted as one step. The evolutionary steady state can only be obtained when all agents have undergone sufficient strategy updates. Therefore, to ensure that each individual can achieve at least one strategy update and that the number of individuals adopting each competitive strategy tends to be stable, this work runs each simulation for at least 40,000 Monte Carlo steps. In subsequent sections, except for the strategy distribution state diagrams, the numerical results in each diagram are mean results, calculated as the arithmetic mean of 500 independent experiments. The purpose is to overcome the error caused by randomness in the Monte Carlo method, so as to ensure the accuracy and validity of the experimental results.

Spontaneous Cooperation Evolution Simulation
Figure 3 shows the spontaneous cooperation evolution process (without the strengthened cooperation mechanism) in different organizational states, namely the random and organized states (Scenario 1 and Scenario 2 in Table 1). Under variable environmental governance benefits, both strategy distribution states can develop cooperation given a sufficient initial partner scale. The organized state enables collective cooperation to develop at about 70% of the initial collaborator size, while the random state requires a higher initial collaborator size. Under the same initial conditions (e.g., at 85% of the initial collaborator size), the organized state also reaches full collective cooperation earlier and more stably, while the evolution trend curve of the random state shows "twists and turns".
To describe more clearly the critical point at which the two strategy distribution states trigger the development of collective cooperation, this work shows the distribution of cooperative evolution results for different initial cooperation scales through box plots (Figure 4). The x-axis in Figure 4 represents the initial collaborator scale, and the y-axis represents the evolved collaborator proportion (the same applies to graphs of this type below).
In a random state, it is possible to evolve a high level of collective cooperation when the initial cooperation scale reaches 70%, but the possibility is very small; the initial environmental governance benefit at this scale is r = 5.23. As the initial collaborator scale increases, the possibility gradually increases, and collective cooperation can be stably achieved when the scale reaches 76%. An initial partner scale of 76% also means that the environmental governance benefit r needs to reach 5.57, and the random state can only guarantee the stable occurrence of spontaneous collective cooperation under such benefit incentives. This result is clearly more stringent than the r > 5.49 condition in the public goods game.
When the cooperator scale ranges from 70% to 75%, the cooperation evolution in a random state is uncertain. The results are polarized: the population may achieve a high level of collective cooperation or may collapse. This uncertainty arises because the random state cannot guarantee every strategy an equal chance of survival, especially with three or more competing strategies [44]. In other words, the random strategy distribution may produce a high density of defection strategies in local areas, and the cooperators in these areas will betray the collective cooperation because of learning errors caused by the local environment. Once traitors increase to a certain number, the trend becomes irreversible and the collapse of collective cooperation inevitably follows. In contrast, in an area initially dominated by cooperative strategies, the consolidation and development of collective cooperation is a high-probability event. When the initial cooperation scale reaches 76% and above, the population has sufficient slack to resist the increase of betrayal strategies, which also gives traitors sufficient time to learn and participate in collective cooperation.
The organized state can effectively avoid the instability of the evolution result caused by the random initial state at small system scales [44], and, as Figure 3 shows, it removes the uncertainty of the evolutionary outcome. The organized state still cannot achieve the development of collective cooperation at a 65% initial collaborator scale, but once the initial collaborator scale increases by another 1%, this goal comes within reach. That is, when the environmental governance benefit r ≥ 4.89, the cooperative strategy can overcome the betrayal strategy, and collective cooperation can proceed spontaneously. By comparison, the random state requires a larger environmental governance benefit incentive (r ≥ 5.57) to ensure the success of collective cooperation.
To compare and analyze the evolution processes of the random initial state and the organized state more clearly, this study takes cooperator scales of 70% and 80% as the initial conditions, respectively, to show the individual strategies and their income distribution during spontaneous cooperative evolution. As shown in Figures 5 and 6, state maps of the strategy distribution and total profit distribution of the two states were captured at the beginning of the game evolution, during the process (at T = 5000, 10,000, 15,000, and 20,000 steps), and at the end of the game evolution. In the strategy evolution distribution diagrams, dark blue and yellow represent the betrayal strategy and the cooperation strategy, respectively. Figure 5 uses a cooperator scale of 70% as the initial condition for evolution. It can be seen from A1 that the cooperative strategy can hardly resist the betrayal strategy in a random state, while the cooperative strategy in the organized state gradually occupies the living space of the original betrayal strategy, which makes collective cooperation grow and develop.

Figure 5. Strategies Distribution and Returns Change Process with 70% Cooperative Scale Initial Condition (A1 and A2, respectively, represent the distribution of individual strategies and their total returns in a random state, while B1 and B2, respectively, represent the distribution of individual strategies and their total returns in an organized state; C denotes cooperation strategy, while D denotes betrayal strategy; the same below).
Figure 6 takes the cooperator scale of 80% as the initial condition for evolution. At an initial collaborator size of 80%, collective cooperation also develops in the random state. The cooperative strategy in the random state is not dominant in the initial period; instead, non-cooperators increase and form several clusters in the population. This process reflects the effect of betrayal strategy clusters, which is also shown in Figure 3, where the trend curve of the collaborator scale in a random state falls rather than rises in the initial stage. The formation of these clusters effectively separates the cooperative strategy and the betrayal strategy into several independent subsystems, each composed of a single strategy.
Notably, in the later stages of evolution, the number of collaborators continues to increase, but at a very slow rate. This slowness may arise because traitors cannot effectively grasp all the information, so erroneous learning leads them to maintain the betrayal strategy. As seen from A2 in Figure 5, traitors who join the collective cooperation would still increase their total income, so methods to avoid learning mistakes are the key to accelerating the development of collective cooperation. For the organized state, the evolution process is the same as that of the organized state in Figure 5; the increase in initial collaborator size only accelerates the progress of collective cooperation toward full cooperation.

Cooperative Evolution Simulation under Strengthened Cooperation Mechanism
It is necessary to establish a benchmark for the initial cooperation scale, so as to judge under which implementation conditions (i.e., different combinations of costs, rewards, and punishments) the strengthened cooperation strategy can be effectively maintained or can promote the development of collective cooperation. In the general form of public goods game research, an r value around 2.0 is usually regarded as a low-return payoff condition [45], and is widely used to test the effectiveness of strengthened cooperation strategies. Since r here is determined by the cooperation scale, this work sets the initial cooperation scale to 50%, which corresponds to an environmental governance benefit of r = 2.18. As the above section shows, a 50% initial condition cannot evolve the desired result, and collective cooperation is quickly disintegrated by free-rider behavior. If the strengthened cooperation mechanism is effective, it can at least ensure that collective cooperation does not collapse quickly, and may even promote cooperation development; if it is ineffective, collective cooperation collapses as before. In short, this initial condition provides a good basis for comparison and reference, and can identify effective implementation conditions for the strengthened cooperation strategy.

Cooperative Evolution Simulation under Individual Strengthened Mechanism
This study first examines the conditions for implementing the individual conflict strategy. Assuming a random state and an organized state with 50% cooperators, the cooperative evolution results for different combinations of individual punishment costs and fines are shown in Figure 7a,b, respectively. Figure 7a shows that, however the costs and fines of individual punishment are combined, the individual conflict mechanism in the random initial state cannot promote collective cooperation; here the mechanism is completely ineffective. Figure 7b gives a different result. As fines increase, the cooperator proportion after evolution also gradually increases, so fines and cooperator proportion are positively correlated. When f_Pep ≥ 1.0, the collaborator scale is basically maintained. The cost of the individual punishment strategy has no significant effect on cooperative evolution. On the whole, under the 50% initial cooperator scale, the individual conflict mechanism can be maintained only in the organized state. Some scholars have pointed out that costly individual punishment is difficult to maintain in a well-mixed population [49,50]. The experiments of this study also show that, even in a population with a spatial structure, whether in a random or organized initial state, individual punishment is still difficult to maintain without sufficient reward incentive.
Moreover, even the individual conflict mechanism cannot prevent the disintegration of collective cooperation. It is therefore necessary to further clarify the initial collaborator size needed to effectively activate individual conflict. Figure 8 shows the cooperation evolution processes for different initial collaborator sizes under the individual conflict mechanism with c_Pep = 0.1 and f_Pep = 1.0, so that the punisher only pays a small penalty cost; the figure thus reflects the activation conditions of the individual punishment strategy. Comparing the two strategy distribution states, the activation conditions of the individual conflict mechanism are better under the organized initial state than under the random initial state: the organized state can realize cooperation strategy growth at an initial partner scale of about 55%, while the random state requires a larger initial collaborator size (around 65%). In terms of the evolutionary trend, the cooperative strategies in both states grow. The random state grows rapidly at first and then gradually slows, while the organized state shows continuous and stable growth, with no sign of slowing in the later evolution stages. Therefore, in the early and middle stages, the individual conflict mechanism in a random state promotes collective cooperation more quickly. The growth trends also differ in another respect. In the random state, the smaller the initial scale, the greater the change in the initial growth rate, while the growth rate in the organized state is not related to the initial cooperation scale; all of its trend lines appear to promote cooperation at a roughly constant growth rate. Figure 9 shows more precisely the evolution of cooperation for different initial partner sizes.
It can be seen that there is still instability in the random state, but after the individual conflict mechanism is activated, it can achieve high-level cooperation more quickly. The organized state can more easily activate the individual conflict mechanism to achieve the development of collective cooperation, thus avoiding the uncertainty caused by evolutionary instability, though the evolutionary process is longer.
Compared to the "binary strategy", the individual conflict mechanism, whether in a random or organized state, provides more relaxed conditions for collective cooperation, and can realize collective cooperation development with a smaller initial cooperation scale.
To explain the evolution process of the strategy distribution in the two states more clearly, this study adopts an initial cooperator scale of 70% in the random state (35% cooperative strategy and 35% individual punishment strategy). The benefit of environmental governance is r = 5.23, and the cost and the fine of the individual punishment strategy are the same as above. This numerical setting highlights the influence of the individual conflict mechanism on cooperation evolution under ideal conditions. Under the initial condition of 70% cooperators, the individual punishment strategy distribution and the change in benefits are shown in Figure 10.

The result also confirms the conclusion that, in population cooperative evolution based on spatial structure, individual punishers and cooperators spontaneously differentiate into homogeneous tight clusters and then compete independently with the traitors to collective cooperation. The strategy that gains space from the betrayal strategy more effectively can lay the foundation for the demise of the other cooperative strategies [44].
In the random state, all three strategies appear to spontaneously differentiate into tight clusters. In the early stage, since individual punishers are scattered throughout the spatial population, traitors choose the cooperative strategy under the deterrent of individual punishment. However, when most of the space is occupied by cooperative strategies, individuals in local areas choose the betrayal strategy and gain higher returns from betrayal, which damages the interests of their partners. The large cost gives rise to second-order free riders, making it difficult for individual punishment to form an effective joint force and thus weakening the sanctions against traitors [38]. Therefore, in the random state, the individual conflict mechanism cannot guarantee the long and stable survival of the individual punishment strategy in the population. Drawing on Lucifer's positive side effect, some scholars have proposed stimulating punishing individuals so that they persist longer in the population, which can greatly accelerate the diffusion of individual punishment strategies. This ensures that the individual conflict system can play its role for a long period of time [51].
The differentiation process at the initial stage is avoided in the organized state, because within collective cooperation the total individual return in tight clusters of individual punishers and cooperators is much higher than that of betrayal, and the individual punishment cost is relatively low. The organized state also produces Lucifer's positive side effect: it accelerates the rate at which traitors join the collective cooperation and effectively prevents individuals who have joined from betraying again. Owing to the spatial structure, punishers can resist the encroachment of second-order free riders by forming clusters and can compete directly with traitors, thus maintaining the common interests of individuals and groups [36,46,52–54].
In conclusion, whether in a random or organized state, the individual conflict mechanism can promote the development of collective cooperation, and it requires a smaller initial collaborator size than the "binary strategy" simulation results. The individual conflict mechanism plays an active role in safeguarding the new system, and a method that ensures the long-term maintenance of the individual punishment strategy is an extremely critical factor.

Cooperative Evolution Simulation under Government Punishment Mechanism

Conditions for Government's Punishment Strategy Implementation
Recent studies have shown that government regulation and punishment play a pivotal role in the cooperative governance of water pollution in basins [55]. Figure 11 shows the average evolved cooperator scale for different cost and punishment combinations under an initial cooperation scale of 50%, reflecting government punishment performance under different implementation conditions.
As can be seen from Figure 11, the government supervision and punishment mechanism in a random state is only activated when the punishment intensity exceeds a certain threshold. At c_gp = 0.1, when the punishment intensity f_gp is 1.2, the evolved cooperator scale reaches 78.5%; when f_gp is greater than 1.2, the scale reaches over 95%. Punishment costs can hinder and limit the government's punishment mechanism. For example, when the punishment intensity f_gp is 1.6 and the punishment cost c_gp is 0.1 and 0.5, the mean size of the cooperators after evolution reaches 95.67% and 91.14%, respectively. Once the cost exceeds 0.5, the mean value is greatly reduced. Nevertheless, the punishment intensity is the decisive factor. When the punishment intensity f_gp increases to 1.8 and above, the cost constraint effect is greatly weakened: even with c_gp = 2, the average size of the evolved collaborators still reaches 90.44%. In short, the smaller the cost and the greater the punishment, the more likely the government punishment mechanism is to lead to a high level of collective cooperation.

Activation Conditions of Government Punishment Strategy
When c_gp = 0.1 and f_gp = 1.4, the government punishment system can promote collective cooperation development under the 50% initial cooperation scale in both states. However, what is the critical point for promoting collective cooperation development, and how stable is the evolution? Figures 12 and 13 answer these questions. In Figure 12, under the random state with a 47% initial cooperation scale, that is, an environmental governance benefit of r = 1.53, the partner scale curve shows an upward trend, indicating that the government punishment mechanism has the potential to promote collective cooperation development under such initial conditions. The organized state, by contrast, requires a 49% initial cooperation scale, that is, an environmental governance benefit of r = 1.96, at which the collaborator scale curve shows a weak upward trend. Combined with Figure 13, the random state can stably evolve a high level of collective cooperation under an initial cooperation scale of 49%, while the organized state requires an additional 1% of initial cooperation scale. Overall, compared with the organized state, the government punishment mechanism in the random state can be activated with a smaller initial cooperator scale, and can also form a high level of collective cooperation earlier.
In terms of result stability, the government punishment mechanism in the random initial state is also more robust. This is diametrically opposite to the performance of the individual conflict mechanism, self-organized adaptive punishment mechanism, and self-organized adaptive reward mechanism. As shown in Figure 13, the random-state government punishment mechanism is likely to promote collective cooperation development under an initial cooperator scale of 46%, and the possibility of a higher level of collective cooperation is greatly increased when the initial cooperator scale reaches 47% and 48%. The government punishment mechanism in the organized state is not only weaker than in the random state in promoting collective cooperation development, but also inferior in the stability of its evolution results. The organized state requires at least a 50% initial cooperator scale to ensure that all evolution results are not lower than the initial cooperator scale and that all results are distributed around their mean. Although the stability of the organized state's evolution results increases with the initial cooperator scale, under the same conditions it is still not as stable as the random state. To sum up, the government punishment mechanism can play a more positive role in the random state.
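The stability criterion used above (all evolution results not lower than the initial cooperator scale, and results concentrated around their mean) can be checked over repeated Monte Carlo runs with a small helper. The following is a minimal sketch only: the function name `summarize_runs`, its return keys, and the use of the population standard deviation as the measure of spread are assumptions made for illustration, not the procedure used to produce Figure 13.

```python
from statistics import mean, pstdev


def summarize_runs(initial_scale, final_scales):
    """Summarize repeated runs started from one initial cooperator scale.

    initial_scale: initial share of cooperators, between 0 and 1.
    final_scales:  final cooperator share of each independent run.
    """
    if not final_scales:
        raise ValueError("final_scales must contain at least one run")
    # A run counts as "not lower" if cooperation did not shrink.
    not_lower = [final >= initial_scale for final in final_scales]
    return {
        "mean": mean(final_scales),
        "spread": pstdev(final_scales),  # assumed stability measure
        "share_not_lower": sum(not_lower) / len(final_scales),
        "all_not_lower": all(not_lower),
    }
```

Under this reading, an initial scale would be called stable when `all_not_lower` is true and `spread` is small; what counts as small is a threshold the analyst has to choose.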
To describe more clearly the evolution process of the three competing strategies in the government punishment mechanism and the change in total return of individuals adopting different strategies, this study sets a 54% cooperator scale as the initial evolution condition, where the betrayal strategy, cooperation strategy and government punishment strategy account for 46%, 27%, and 27%, respectively. Taking c_gp = 0.1 and f_gp = 1.4 as an example, Figure 14 shows that in the initial stage of the random state, the government punishment strategy is widely distributed in space and has sufficient contact with traitors, so traitors are punished accordingly. The distribution of total benefits shows that traitors surrounded by the government punishment strategy have the smallest total benefits. In this case, traitors are forced to join the collective cooperation. As the number of traitors decreases, the costly government punishment strategy is gradually replaced by the cooperative strategy, and the individuals making this switch change from supervisors to second-order free riders. In the later stage of the evolutionary game, the government punishment strategy has basically disappeared, the second-order free riders have an absolute advantage, and a high level of collective cooperation has been formed. However, because of the large number of second-order free riders, the government punishment strategy can no longer effectively punish the remaining traitors, forming a state similar to the later stage of evolution in the random state of the "binary strategy", where evolutionary progress becomes very slow. Therefore, methods to reinvigorate the government punishment strategy among traitors are key to solving the problem.
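The initial condition given at the start of the paragraph above (46% betrayal, 27% cooperation, 27% government punishment) can be set up on a square lattice as sketched below. This is an illustration under stated assumptions, not the study's code: the function `initial_lattice`, the integer strategy labels, the lattice size, and in particular the reading of the organized state as one contiguous block of cooperators and punishers are hypothetical choices.

```python
import random

D, C, P = 0, 1, 2  # betrayal, cooperation, government punishment (assumed labels)


def initial_lattice(size, shares, organized, seed=0):
    """Return a size x size grid of strategies.

    shares:    (share_D, share_C, share_P), expected to sum to 1.
    organized: if True, cooperators and punishers fill the first cells
               in row-major order (assumed meaning of "organized state");
               if False, all strategies are placed at random.
    """
    n = size * size
    n_c = round(shares[1] * n)
    n_p = round(shares[2] * n)
    n_d = n - n_c - n_p  # remaining cells are traitors
    rng = random.Random(seed)

    contributors = [C] * n_c + [P] * n_p
    rng.shuffle(contributors)  # mix C and P among themselves
    cells = contributors + [D] * n_d
    if not organized:
        rng.shuffle(cells)  # random state: scatter every strategy
    return [cells[i * size:(i + 1) * size] for i in range(size)]


random_grid = initial_lattice(10, (0.46, 0.27, 0.27), organized=False)
organized_grid = initial_lattice(10, (0.46, 0.27, 0.27), organized=True)
```

Both grids contain the same strategy counts; only the spatial arrangement differs, which is the contrast the random and organized states are meant to capture.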
In the initial stage of the organized state, individuals who adopt the cooperative strategy have the largest total benefits, individuals who adopt the government punishment strategy have the second highest return, and traitors have the smallest total return. This shows that the costly punishment strategy is not competitive with the cooperative strategy within the cooperative collective. The positive role of the government punishment strategy nevertheless cannot be denied, since it ensures the cooperative collective is not threatened by the betrayal strategy and prevents the emergence of traitors. Although the government punishment strategy is gradually replaced by the cooperative strategy within the cooperative collective, unlike in the random state it does not disappear, but remains at the forefront of the confrontation with the betrayal strategy. Strikingly, Figure 14 shows that the government punishment strategy can expand more quickly into the space occupied by the betrayal strategy.
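The payoff ordering described above (cooperators first, punishers second, traitors last) and the second-order free-rider effect can be reproduced in a simplified group payoff calculation. The sketch below is an assumption-laden illustration, not the payoff function of this study: it uses a plain linear public goods game with a unit governance cost, charges each punisher c_gp per traitor in the group, fines each traitor f_gp per punisher in the group, and omits the environmental damage function. The function name `group_payoffs` is hypothetical.

```python
def group_payoffs(n_c, n_p, n_d, r, c_gp=0.1, f_gp=1.4):
    """Per-capita payoffs (cooperator, punisher, traitor) in one group.

    Assumed rules: every cooperator and punisher pays a cost of 1 into a
    pool that is multiplied by r and shared equally by all group members.
    """
    size = n_c + n_p + n_d
    if size == 0:
        raise ValueError("group must contain at least one individual")
    share = r * (n_c + n_p) / size
    pay_c = share - 1                # contributes, never punishes
    pay_p = share - 1 - c_gp * n_d   # also bears the cost of punishing
    pay_d = share - f_gp * n_p       # free rides but is fined
    return pay_c, pay_p, pay_d
```

Under these assumed rules a cooperator always earns c_gp * n_d more than a punisher in the same group, which is the second-order free-rider advantage, and the two payoffs coincide once no traitors remain. A traitor earns less than a cooperator whenever f_gp * n_p exceeds 1, which with f_gp = 1.4 already holds for a single punisher in the group.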

Discussions
In order to examine the effect of a social governance system in promoting collective action on water pollution control, this study developed a spatial cooperative evolutionary game model, and conducted evolutionary simulation experiments under spontaneous conditions, an individual strengthened mechanism and a government punishment mechanism. The cooperation scale is used to reflect the collective action formation and the effect of basin water pollution social governance. In addition, the process of cooperation evolution essentially reflects the process whereby new institutional rules are gradually accepted and followed by the public.
Simulation experiments show that it is very difficult to rely on emitters to spontaneously abide by institutional rules and form collective action, even though pollution poses a serious problem for basin sustainable development and for public production and life. Meanwhile, whether mobilization is organized or random, social governance needs the support of the vast majority of the emitter group. Otherwise, traitors unwilling to bear the governance cost will deviate from the collective, leading to the disintegration of collective action. This also means that to achieve voluntary social governance, both emitters and the external government need to actively mobilize the vast majority of emitters and ensure they comply with the social governance norms, so as to secure collective action. This obviously places extremely stringent requirements on the implementation of the social governance system. Even when these requirements can be met, the collective choice platform needs to ensure openness and timely delivery of information, so that basin emitters are aware of the environmental improvements resulting from their governance efforts and of the broader effects on themselves and others. Otherwise, emitters will rely on imperfect local information and deviate from collective cooperation. Moreover, in reality, individual behavior decisions need timely information feedback; without it, individuals cannot fully understand the impact of their behavior on the environment and on others. They may then be more determined to maximize their own interests and avoid governance investment, thereby betraying collective cooperation. These two factors will not only delay the formation of the social norm as a whole, but may even lead to the disintegration of the collective action of spontaneous autonomous organizations, namely, failure of the social governance system.
Compared to random states, organized states can stably and rapidly achieve collective cooperation with a smaller initial cooperator scale. Therefore, organized mobilization can better lead emitters to voluntarily comply with pollution regulation. With the continuous reduction of pollution, the emitter group will also receive higher water environment governance benefits, which could motivate more emitters to join the collective action and thus form a positive feedback loop. This also makes it easier for polluters to accept the new system rules and for those rules to become a new social norm and consensus that is seldom violated.
According to the simulation results, whether basin emitters spontaneously abide by the credible commitments depends to a large extent on the benefit-cost ratio of environmental governance. Only when this ratio is large enough can collective cooperation develop steadily and the spontaneous condition be relaxed, that is, the required initial cooperator scale be reduced. Therefore, fully displaying the comprehensive benefits of environmental governance and effectively saving governance input costs provide the basis for realizing spontaneous self-organized governance. In terms of the comprehensive benefits of environmental governance, this study holds that effectively expanding the water environment management paradigm is key to highlighting these benefits. They are not limited to basic economic interests, but also include a wider range of human health, social welfare, environmental and ecological interests. Nor are they limited to current explicit interests; they also include long-term, less visible interests.
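As a hedged illustration of why the benefit-cost ratio matters, consider the textbook linear public goods game rather than the full model of this study, which additionally includes an environmental damage function and spatial structure. The symbols G (group size) and n_C (number of contributors among the other G - 1 group members) are assumptions introduced here; the governance cost is normalized to 1 and r is the benefit multiplier.

```latex
\pi_C = \frac{r\,(n_C + 1)}{G} - 1, \qquad
\pi_D = \frac{r\,n_C}{G}, \qquad
\pi_C - \pi_D = \frac{r}{G} - 1 .
```

In this simplified setting, contributing is individually worse than free riding whenever r < G, while universal contribution (payoff r - 1 each) is better for everyone than universal defection (payoff 0) whenever r > 1. The social dilemma therefore spans 1 < r < G, and raising the benefit-cost ratio r shrinks the individual loss 1 - r/G from contributing, which is consistent with the observation that a larger ratio lowers the initial cooperator scale needed for cooperation to spread.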
There are various types of governance inputs. Project governance is only one aspect; the transformation of production and lifestyle is also a form of governance investment, e.g., reducing the use of non-essential chemical reagents, choosing green transportation for travel, selecting more scientific fertilization methods in agricultural production, and avoiding direct emissions to the environment in industrial production processes. These changes can effectively reduce pollution load emissions and save governance input costs, making them easier for individuals to accept and implement. Only after the comprehensive benefits of environmental governance and the available governance investment methods have been clarified can government, non-governmental environmental protection organizations, scientific groups, social organizations, and other institutions make the public fully aware of the influence and advantages of environmental governance, as well as of scientific and acceptable governance input methods, through environmental legislation, media publicity, and popular science education, thus effectively encouraging individuals to consciously recognize and abide by the corresponding social norms. Once a sufficient number of action agents adopt the social norms and join the social governance system, and the threshold conditions for collective action formation are crossed, the system will enter a positive feedback loop. This will promote more participation in collective action, ultimately achieving social governance.
There is also a very important conclusion regarding the collective action of spontaneous self-organization. Once complete collective cooperation is formed, no action agent will leave the collective cooperation. This conclusion indicates two points. First, under the analytical framework and premises of this study, social governance of water pollution in the basin is achievable, because whether spontaneously or under certain supervision and sanctions, action agents can achieve a mutually beneficial situation between individual and collective interests through social governance. Second, the formation of social norms will have a strong binding force on individuals, making it difficult for them to pursue short-term self-interest by violating social norms or breaking away from collective action.

Conclusions
In order to break through the social dilemma of conflict between individual and collective interests in basin water pollution control, this study developed a group of spatial cooperative evolutionary game models for river basin water pollution social governance. The models can systematically show the dynamic relationship between human pollution discharge activities and the water environment, reflect the impact of global, regional and individual behavior changes on the decisions of action agents, and provide a mathematical basis and analytical tools for solving new institutional supply problems. Based on the models, this study conducted collective action evolution simulation experiments under different social governance mechanisms, initial cooperation states and initial cooperation scales.
The results show that the social governance system may form autonomous collective action, but requires a large initial scale of cooperation. This means that without any intervention, the social governance system needs a very demanding social environment (i.e., most individuals independently comply with the emission regulations) to truly fulfill its role. The initial spatial distribution of emission strategies is a significant factor affecting collective cooperation. If an organized mobilization mechanism can be formed at the initial stage of basin water pollution, it will be easier for polluters to form collective cooperation and achieve effective social governance of basin water pollution.
Changing the cost-benefit ratio of environmental governance for polluters is also a critical way to promote collective action and achieve effective social governance of basin water pollution. Furthermore, intervention measures such as supervision and punishment by individuals and governments play an important role. In addition, when collective cooperation reaches a certain scale, it no longer faces the risk of disintegration, which means that the collective action formed by social governance may have a greater ability to resist water environment risks. This is of great significance for water environment governance under various external challenges, including climate change, population growth and social development. Future research can further explore the resilience of a social governance system to provide theoretical support for sustainable basin development.
Research on the social governance system of basin water pollution under a government-led governance system is a large and complex systematic undertaking. This study provides only a preliminary discussion of the issue, and many problems still require more in-depth and detailed research: (1) The study does not incorporate key elements such as basin upstream and downstream structures and water resource changes caused by climate change. Exploring these will help ensure the long-term effectiveness of social governance. (2) The study uses the S-curve function to describe the interactive relationship between the social system and the water resources system, but the relationship itself can take diverse forms, such as linear expressions or discrete relationships. How these differences affect cooperation evolution requires further in-depth research and evidence. In future research, the theory and methodology should be improved, and these important influencing factors and actual conditions should be incorporated, so as to enhance scientific rigor and comprehensiveness.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.