Evaluation of the Quality Supervision System for Construction Projects in China Considering the Quality Behavior Risk Transmission

Abstract: In construction projects, improper quality behavior of a participant results in quality behavior risks, which can transmit to the downstream participants and may cause detrimental effects on the quality of the entity finally constructed. Controlling the transmission of quality behavior risks is the key to effectively supervising and ensuring the quality of construction projects. In this study, the effectiveness of the quality supervision system of construction projects in China was investigated by considering the transmission of quality behavior risks. A multi-player evolutionary game model consisting of the players of quality supervision of a government department, upstream participant (UP), and downstream participant (DP) was generated. By using the system dynamics theory, the game model was simulated to determine the stability of the evolutionary system and to evaluate the effectiveness of China’s current quality supervision system under different scenarios. The results showed that there is no evolutionary stable strategy (ESS) in the evolutionary system of the current quality supervision system in China and there are fluctuations in the evolution process. It revealed that high risk exists in the current quality supervision system in China. To resolve the problem of the low efficiency of the current Chinese supervision system, a dynamic penalty and incentive method is developed, which has been proven to be able to effectively control the quality behavior risks in construction projects and hence ensure the quality of the entity finally constructed.

identifying investors' opportunistic behavior during performance evaluations [9]. Liu analyzed the game relationship between the regulator and the coal-mine participants to explore the effectiveness of coal-mine regulation in China [14]. Fan discussed the optimal strategy for the government to supervise the low-carbon subsidies of enterprises and showed that the overall optimal supervision probability was 0.24 [15]. Guo used evolutionary game theory to study the interactions among the project owner, supervising engineer, and contractor during construction quality supervision [16]. Concerning government supervision, some scholars stress the regulatory measures of the government, such as penalty and reward. Generally, punishment is recognized as an effective way to enforce social norms [17,18]. However, in the absence of an incentive mechanism [19], it is difficult to reach cooperation because of the conflict of interests in the quality supervision system. As a result, the illegal and unqualified quality behaviors of the participants cannot be controlled. Thus, a combination of penalty and reward has been put forward to curb illegal quality behavior. Different combinations of penalty and reward mechanisms have been successfully applied to many areas, such as coal-mine safety regulation [14,20] and railway transportation safety regulation [10].
In light of the above review, this paper addresses three aspects: (i) previous studies have focused mainly on the different interests and demands among some specific participants, e.g., the employer, supervision unit, and construction unit, while ignoring the dynamic transmission between the upstream and downstream participants. Thus, an upstream participant (UP) and a downstream participant (DP) were abstracted from the participants to investigate the transmission of the quality behavior risks; (ii) considering the inconsistency of interests and the bounded rationality between the participants and the government, an evolutionary game model was established to analyze the dynamic evolution of the selection of quality behaviors among the government, UP, and DP. The game was then simulated by using system dynamics to analyze the stability of the evolutionary system and its effectiveness in controlling the transmission of quality behavior risks under different scenarios; (iii) a dynamic penalty and incentive (DPI) mechanism was developed to restrain the fluctuation of the evolutionary process in the game, and the stability and effectiveness of the DPI mechanism were verified. This research provides theoretical support to the government for controlling the quality behavior risks, implementing effective supervision, and innovating the supervision mode in construction projects.

Game Design and Assumption
Based on the work sequence of participation in construction projects, the UP and DP are abstracted from the participants to study the transmission of the quality behavior risks, as illustrated in Figure 1. Each subject in Figure 1 is a boundedly rational economic agent that seeks maximum profit. Evolutionary game theory is well suited to analyzing the bounded rationality of players and the dynamic process of games. Assume that participant i (i = 1, 2), where i = 1 represents the UP and i = 2 represents the DP, chooses y i (0 ≤ y i ≤ 1) as its strategy in the game process, where y i represents the probability of following the regulations. y i = 1 means that the participant follows the regulations and adopts the legal and required quality behavior, and y i = 0 means that the participant adopts the illegal and unqualified quality behavior. During construction, R i (R i > 0, i = 1, 2) represents the quality cost incurred when the participant adopts the legal and required quality behavior, namely, the effort on the construction. If a participant chooses the illegal and unqualified quality behavior, it gains an extra profit T i (T i > 0, i = 1, 2), but at the same time the resulting quality behavior risks are transmitted to the DP. The DP can claim damages for the quality behavior risks of the UP, and S (S > 0) is the compensation paid to the DP.
Assume that the government chooses x (0 ≤ x ≤ 1) as its strategy in the game process, where x represents the probability of supervising. x = 1 means that the government adopts positive supervision by investing more labor, money, or material resources, such as purchasing testing services from a third party, carrying out unannounced inspections, and strengthening the training and incentives for government supervisors. x = 0 means that the government adopts negative supervision: it only makes the rules of supervision and delegates the authority of supervision to the participants. The government pays a cost C p (C p > 0) when adopting positive supervision, whereas the cost of negative supervision is so low that it is ignored in this paper. Assume that the inspection ability of the government is strong enough that, once the government adopts positive supervision, the quality behavior of the participants, whether illegal and unqualified or legal and required, will be discovered. The government imposes fines F i (F i > 0, i = 1, 2) on the participants when they adopt the illegal and unqualified quality behavior, e.g., by downgrading their credit rating. The participants are rewarded E i (E i > 0, i = 1, 2) by the government when they adopt the legal and required quality behavior, e.g., by upgrading their credit rating.
The participants bear the cost of quality behavior risks when they adopt the illegal and unqualified quality behavior. The cost of quality behavior risks refers to the cost of repairing the quality problems and quality defects that result from quality behavior risks, together with the loss incurred from quality accidents [21,22]. If a quality accident occurs, the government bears joint liability in China. In general, the costs of quality behavior risks borne by the government and by the responsible participant are linearly correlated. Assume that C i (C i > 0, i = 1, 2) represents the cost of quality behavior risks borne by the participants and that αC i is the cost of quality behavior risks borne by the government, where α (α ≥ 0) is the transmission coefficient of quality behavior risks. Furthermore, the influence of quality behavior risks commonly depends on the risk effect of the behavior itself and on the reaction time and reaction strength with which the problem is found and addressed. This can be reflected by applying a discount coefficient to the cost of quality behavior risks. When the government adopts positive supervision, the cost of quality behavior risks borne by the government is αβC i , where β (0 < β ≤ 1) is the discount coefficient of the cost of quality behavior risks.
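As a worked illustration of the two coefficients, the arithmetic below uses the values that appear later among the initial values of the simulation (α = 0.5, β = 0.6, and a risk cost of 3 for the DP); writing that cost as C_2 follows the indexing above, and the comparison itself is only an illustrative calculation.

```latex
\begin{align*}
\text{negative supervision:}\quad & \alpha C_2 = 0.5 \times 3 = 1.5,\\
\text{positive supervision:}\quad & \alpha\beta C_2 = 0.5 \times 0.6 \times 3 = 0.9 .
\end{align*}
```

With these values, positive supervision reduces the government's share of the DP's risk cost from 1.5 to 0.9.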
The variables of the multi-player game are shown in Table 1. Based on the above assumptions and analysis, the payoff matrix of the multi-player game can be obtained in Table 2.

Table 1. Meanings of the variables in the multi-player game.

Variables | Meanings | Notes
x | Ratio of positive supervision | 0 ≤ x ≤ 1
y i | Ratio of adopting the legal and required quality behavior | 0 ≤ y i ≤ 1, i = 1, 2
C p | Cost of positive supervision | C p > 0
R i | Quality cost of adopting the legal and required quality behavior | R i > 0, i = 1, 2
T i | Extra profit of adopting the illegal and unqualified quality behavior | T i > 0, i = 1, 2
C i | Cost of quality behavior risks | C i > 0, i = 1, 2
α | Transmission coefficient of cost of quality behavior risks | α ≥ 0
S | Compensation | S > 0
β | Discount coefficient of cost of quality behavior risks | 0 < β ≤ 1

Table 2. Payoff matrix of the multi-player game (strategy labels only).
Government: positive supervision (x); negative supervision (1 − x)
UP: adopting the legal and required quality behavior (y 1 ); adopting the illegal and unqualified quality behavior (1 − y 1 )
DP: adopting the legal and required quality behavior (y 2 ); adopting the illegal and unqualified quality behavior (1 − y 2 )

Game Solution
According to evolutionary game theory, replicator dynamics are used to represent the learning and evolution mechanism of individuals in the process of the transmission of quality behavior risks. The expected benefits of the government from positive supervision (U x ) and negative supervision (U 1−x ) are given by Equations (1) and (2), respectively.
The average expected benefit of the government is given by Equation (3).
Similarly, the expected benefits of the UP are given by Equations (4) and (5), and those of the DP by Equations (6) and (7).
According to the replicator dynamics, the change rate of x is given by Equation (8). Similarly, the change rates of y 1 and y 2 are given by Equations (9) and (10), respectively.
Equations (8)-(10) reflect the speed of the strategy adjustment of the government, UP, and DP. When the equations are equal to zero, the strategies no longer change and the game system reaches a relatively stable equilibrium state. Many researchers have followed the method proposed by Friedman to analyze the stability of the equilibrium points by calculating the determinant and trace of the Jacobian matrix of the game [23]. Since this general method requires a separate analysis for each combination of parameter values, the calculations are tedious and complicated for the multi-player game. Therefore, a computer simulation method is used in this paper to analyze the stability of the complex dynamic multi-player game.
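For reference, the generic replicator-dynamics form behind an average expected benefit and the three change rates can be sketched as follows. This is the standard textbook form, not a reproduction of the paper's Equations (3) and (8)-(10); the symbols U_{y_i} and U_{1-y_i} for the participants' expected benefits are assumed notation.

```latex
\begin{align*}
\bar{U}_x &= x\,U_x + (1-x)\,U_{1-x},\\
\frac{dx}{dt} &= x\,(U_x - \bar{U}_x) = x(1-x)\,(U_x - U_{1-x}),\\
\frac{dy_i}{dt} &= y_i(1-y_i)\,(U_{y_i} - U_{1-y_i}), \qquad i = 1, 2.
\end{align*}
```

Because each right-hand side carries the factor x(1 − x) or y_i(1 − y_i), all eight pure-strategy profiles with x, y_1, y_2 ∈ {0, 1} are rest points whatever the payoff values are; whether they are stable is a separate question.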

The Stability Analysis of the Multi-Player Game System
In the multi-player game, each player learns from the behavior of the other players by observing and comparing benefits and then adjusts its own strategy selection. This process is the feedback behavior in the game. System dynamics (SD) simulation is an effective method to study feedback behaviors by analyzing the causal relationships and interactions among variables in complex systems [14,16,24]. Using SD simulation to study the transmission of the quality behavior risks can not only show the dynamic process of quality behavior risks during construction, but also intuitively illustrate the effect of the participants' strategy selection on the quality of the entity finally constructed. Therefore, this section builds an SD model to analyze the stability of the equilibrium points in the multi-player game, which can provide an effective experimental reference for controlling the quality behavior risks and optimizing supervision strategies.

Building the SD Model
The multi-player evolutionary game SD model is built according to the above game assumptions and analysis by using Vensim PLE 7.2. The SD model of the multi-player game of the transmission of quality behavior risks is shown in Figure 2. It includes three state variables (the ratio of positive supervision, the ratio of the UP adopting the legal and required quality behavior, and the ratio of the DP adopting the legal and required quality behavior), three rate variables (the change rate of choosing positive supervision, the change rate of the UP adopting the legal and required quality behavior, and the change rate of the DP adopting the legal and required quality behavior), nine auxiliary variables, and fourteen external variables. The auxiliary variables are the expected benefits of the players (U x , U 1−x , and their counterparts for the UP and DP), and the external variables are those defined in Table 1. The values of the external variables influence the auxiliary variables, and the values of the auxiliary variables influence the rate variables. The state variables then change, which in turn affects the auxiliary variables, so the model forms a feedback system. Finally, the system is analyzed using numerical simulation.

Numerical Simulation
From the above analysis, we obtained the external variables that are expected to affect the simulation result. To cover more of the possible situations, an orthogonal experiment was designed to analyze various values of the parameters [25]. Taking three selected variables (T 1 , F 1 , and E 1 ) as factors, we set two levels for each and then simulated the four experimental plans from the orthogonal array. The factor-level table and orthogonal array are shown in Tables 3 and 4, respectively. To analyze the stability of the game, the values of the first experiment were taken as an example. The model settings in Vensim PLE 7.2 are Initial Time = 0, Final Time = 100, Time Step = 0.03125, Units for Time = Month, and Integration Type: Euler. Through interviews with practitioners, such as government supervisors and project managers, followed by pretreatment of the data, we determined the initial values of the other external variables, which are shown in Table 5.
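Three two-level factors in four runs match the standard L4(2^3) orthogonal array. The sketch below builds that array and checks its defining balance property. The level indices 1 and 2 stand in for the actual factor levels of Table 3, which are not reproduced here, the function names are illustrative, and the row order may differ from Table 4.

```python
from collections import Counter
from itertools import combinations, product

FACTORS = ("T1", "F1", "E1")  # factor names taken from the text


def l4_array():
    """Return the standard L4(2^3) orthogonal array as level indices (1 or 2)."""
    rows = []
    for a, b in product((1, 2), repeat=2):
        c = 1 if a == b else 2  # third column is the interaction of the first two
        rows.append((a, b, c))
    return rows


def is_orthogonal(rows):
    """Check that every pair of columns contains each level pair equally often."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((r[i], r[j]) for r in rows)
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


if __name__ == "__main__":
    for run, levels in enumerate(l4_array(), start=1):
        print(run, dict(zip(FACTORS, levels)))
    print("orthogonal:", is_orthogonal(l4_array()))
```

Four runs are enough here because every pair of factors still sees all four level combinations exactly once, compared with eight runs for the full factorial design.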

Stability Analysis
In an asymmetric game, an evolutionary stable strategy must be a pure strategy [26]. Since the multi-player game in this paper is an asymmetric game, we analyzed the stability of the pure strategies Z 1 to Z 8 , e.g., (1, 0, 1) and Z 8 (1, 1, 1), which are obtained from G(x, y 1 , y 2 ) = 0, H(x, y 1 , y 2 ) = 0, and I(x, y 1 , y 2 ) = 0 in Equations (8)-(10). The simulation results of Z 1 ∼ Z 8 show that the multi-player game reaches a relative equilibrium state in which no player changes its strategy. Taking one of the initial pure strategy equilibrium points, Z 8 = (1, 1, 1), as an example, the evolution process under the initial strategy Z 8 is shown in Figure 3. Curve 1 represents the ratio of positive supervision, namely, x; curve 2 represents the ratio of the UP adopting the legal and required quality behavior, namely, y 1 ; and curve 3 represents the ratio of the DP adopting the legal and required quality behavior, namely, y 2 .
Table 5. Initial values of the other external variables.

Variable | Meaning | Value
C p | Cost of positive supervision | 6
R 1 | Quality cost of the UP adopting the legal and required quality behavior | 15
R 2 | Quality cost of the DP adopting the legal and required quality behavior | 13
C 2 | Cost of quality behavior risks for the DP | 3
α | Transmission coefficient of cost of quality behavior risks | 0.5
S | Compensation | 1
β | Discount coefficient of cost of quality behavior risks | 0.6

In practice, the closer x is to 1, the more resources the government invests in supervision and the higher the supervision cost. The closer y 1 and y 2 are to 1, the more willing the UP and DP are to adopt the legal and required quality behavior. From the cost-benefit perspective of quality supervision in construction projects, the lower the supervision cost and the higher the ratios of legal and required quality behavior, the higher the quality supervision efficiency.
The simulation results of Z 8 show that none of the three curves fluctuate. This phenomenon can be explained as follows. During the process of supervision, if the three players choose the above strategies at the beginning, none of them will change. However, this state only represents the decision process at this point. The balance may be broken when a few individuals in a group are disturbed by external factors or by slight changes in other players' strategies. To verify the stability of Z 8 , assume that the initial strategy x is slightly mutated from 1 to 0.99; the evolution process under Z 8 ′ = (0.99, 1, 1) is shown in Figure 4.
As shown in Figure 4, the equilibrium state of Z 8 is unstable. Specifically, the quality behavior of the UP and the DP is not affected by the change in the ratio of positive supervision, and the equilibrium state evolves from Z 8 to Z 6 after x is changed. This phenomenon can be explained as follows. Although only a few individuals in the government group mutate their initial strategies, the mutation yields better benefits: supervision cost decreases while the participants still follow the legal and required quality behavior. The other individuals in the government group therefore keep adjusting their strategies for better benefits. Once a slight change happens in an initial strategy, the equilibrium state of the game is broken. This conclusion also applies to the other pure strategies. There is no evolutionary stable strategy in the multi-player game.
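The instability of Z 8 can be illustrated in a reduced form. With both participants fully compliant (y 1 = y 2 = 1), the assumptions above suggest that positive supervision costs the government C p plus the rewards E 1 + E 2 , while negative supervision costs nothing, so the payoff gap between the two is −(C p + E 1 + E 2 ). The sketch below integrates only the government's replicator equation with the Euler settings quoted earlier, holding y 1 = y 2 = 1 fixed. The gap expression is our reading of the assumptions rather than the paper's own equation, the function name is hypothetical, and the values C p = 6 and E 1 = E 2 = 2 are the baseline values quoted in the text.

```python
def simulate_government(x0, c_p=6.0, e1=2.0, e2=2.0,
                        final_time=100.0, dt=0.03125):
    """Euler-integrate dx/dt = x(1-x)*gap with y1 = y2 = 1 held fixed.

    gap = -(c_p + e1 + e2) is an ASSUMED payoff difference between positive
    and negative supervision when both participants comply.
    """
    gap = -(c_p + e1 + e2)
    steps = int(round(final_time / dt))
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * x * (1.0 - x) * gap  # one explicit Euler step
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    print(simulate_government(1.0)[-1])   # unperturbed start at Z8
    print(simulate_government(0.99)[-1])  # perturbed start (0.99, 1, 1)
```

Starting exactly at x = 1 the ratio never moves, whereas starting at 0.99 it decays toward 0 while the participants stay at 1, which is consistent with the drift away from Z 8 described above.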
Furthermore, more general initial strategies that are close to actual conditions should be simulated. Since the government pays attention to the social and ecological impacts of a project and to its own reputation loss in addition to the economic benefits, it tends to be more sensitive than the participants to the consequences of quality behavior risks in construction projects. We therefore input Z 9 (0.6, 0.5, 0.5) in the model. As shown in Figure 5, the system starting from Z 9 does not reach an ESS. Curves 1 and 2 fluctuate sharply, and the amplitude of the fluctuation grows over time. The bigger the amplitude, the higher the quality behavior risk for the construction projects. There is an alternating push-pull relationship between the government ratio and the UP ratio. This phenomenon can be explained as follows. When the ratio of the UP is low, the probability that the government detects quality behavior risks caused by the UP is high, and the government increases its supervision ratio to force the participant to adopt the legal and required quality behavior. When the ratio of the UP increases to a certain extent, the government relaxes its supervision to save money, which in turn leads the UP to decrease its investment in construction to get more profit. This is the main reason for the high occurrence probability of quality accidents in construction projects in China. This process forms the fluctuations between the government and the UP. It shows that the current supervision of the government can influence the quality behavior of the UP. However, this approach does not prevent or eliminate the quality behavior risks in construction projects. When curve 2 is close to or equal to 0, the risks can be transmitted and can evolve, and a quality accident may finally happen if the participant cannot eliminate them. The shortage of human resources and limited funding in the government supervision department make it more difficult for the government to deal with the actual problems.
Another phenomenon can be seen in Figure 5, where the DP seems immune to the interaction of the other players and adopts the legal and required quality behavior almost all the time. It can be explained as follows. The interplay between the UP and the government produces a deterrent effect on the DP. By comparing the price and profit of adopting the illegal and unqualified quality behavior of the UP, the strategy of the DP will be adjusted and the legal and required quality behavior will be chosen.

Analysis of Different Rewards and Penalties
It is generally acknowledged by scholars that increasing the penalties or rewards imposed by the government is an effective measure to control illegal behavior during construction [9,10,20]. In this section, the penalty and the reward are changed in turn to investigate the influence of the changes on the multi-player game. When the penalties or rewards are changed, the other parameters remain fixed.

Scenario 1. Changing the Penalties of the Participants
We changed the penalties (F 1 , F 2 ) from (9,7) to (7,5) or (11,9). For an initial strategy of Z 9 (0.6, 0.5, 0.5), the evolution process is shown in Figure 6. Figure 6a,b show the evolution process when the penalties (F 1 , F 2 ) are (7,5) and (11,9), respectively. Comparing Figure 5 with Figure 6a, after the government decreased the penalties of the UP and the DP to (7,5), the UP rapidly adjusted its strategy toward the illegal and unqualified quality behavior. Simultaneously, the DP's strategy changed from stable legal behavior to fluctuation. These ratios indicate a higher occurrence probability of a quality accident. Therefore, decreasing the penalty makes the quality behavior risks more serious.
Comparing Figure 5 with Figure 6b, after the government increased the penalties of the UP and the DP to (11,9), the multi-player game model gradually reached the pure strategy (0,1,1). This point is an ideal state for quality supervision in construction projects. At present, the Chinese government is devoted to exploring a mode of quality supervision that can reduce regulation while maximizing the participants' legal and required quality behavior. However, in Section 4, we showed that the pure strategy is unstable.

Scenario 2. Changing the Rewards of the Participants
We changed the rewards (E 1 , E 2 ) from (2,2) to (1,1) or (3,3). For an initial strategy of Z 9 (0.6, 0.5, 0.5), the evolution process is shown in Figure 7. Figure 7a,b show the evolution process when the rewards (E 1 , E 2 ) are (1,1) and (3,3), respectively. Comparing Figure 5 with Figure 7a, after the government decreased the rewards of the UP and the DP to (1,1), the fluctuation frequency of the ratio of the UP increased. The quality behavior of the UP became more unpredictable, which makes supervision harder for the government. Comparing Figure 5 with Figure 7b, after the government increased the rewards of the UP and the DP to (3,3), the amplitude of the fluctuation of the ratio of the UP gradually increased, which raises the uncertainty of the game system. The results show that changing the rewards of the participants cannot reduce the fluctuations and that an ESS is never reached in the system.
In conclusion, there is no ESS in the current multi-player game system for the set of values from the orthogonal experiment. Changing the penalties and the rewards of the participants separately under the strategy $Z_9$ shows that the fluctuation cannot be dampened. The simulation results of the model (Figures 5, 6b and 7) show that there is an alternating push-pull relationship between the government and the UP. The government urges the UP to improve the legal and required quality behavior. When supervision is relaxed, the ratio of the legal and required quality behavior decreases, which in turn prompts the government to strengthen its supervision. However, the simulation results show that changing the penalty or the reward cannot restrain the fluctuation, which means that the government's supervision of the UP is not efficient. Because of the deterrent effect, controlling the quality behavior of the UP is more important than controlling that of the DP. Moreover, the ratio of the legal and required quality behavior of the UP is too low, which may increase the probability of quality problems, quality defects, and even quality accidents.
In the other experiment, for an initial strategy of $Z_9$ (0.6, 0.5, 0.5), the simulations of the other three combinations of factors give results similar to those of the above experiment, as shown in Figure 8a-c. Although the levels of extra profit, penalty, and reward of the UP were changed from low to high, the game system remains unstable. It can be deduced that mechanically adjusting the levels of the variables does not reduce the quality behavior risks in the current quality supervision system. Thus, it is urgent for the government to design an incentive mechanism for the participants to control the risks in construction projects, since the DP can also become the UP for other participants. The following study continues to use the first set of values for consistency.

Analysis of the Dynamic Penalty and Incentive (DPI) Mechanism
Many research trials have shown that relating the penalty and reward to the actions of the players can effectively control the fluctuations in the evolutionary game [10,27,28]. This article introduces a dynamic penalty and incentive (DPI) mechanism, comprising a dynamic penalty and a dynamic reward. The dynamic penalty correlates with the ratio of the illegal and unqualified quality behavior, the extra profits, and the ratio of positive supervision. The dynamic reward correlates with the ratio of the legal and required quality behavior, the quality cost, and the ratio of positive supervision. We use $F_i'$ and $E_i'$ ($i = 1, 2$) to represent the dynamic penalty and the dynamic reward in the system, respectively. They are defined in Equation (14), where $\delta$ is the penalty coefficient and $\varepsilon$ is the reward coefficient. The SD model of the multi-player game under the DPI mechanism is shown in Figure 9.

Based on the initial values in Table 5, and assuming that $\delta$ and $\varepsilon$ are both set to 1 [27], for an initial strategy of $Z_9$ (0.6, 0.5, 0.5), the evolution process is shown in Figure 10. As shown in Figure 10, curves 1, 2, and 3 reach a steady state rapidly. The equilibrium strategy of the multi-player game system is $Z^*$ (0,1,1). To analyze the stability of $Z^*$, $Z_{10}$ (0.1, 0.1, 0.1) and $Z_{11}$ (0.9, 0.9, 0.9) are taken as the initial strategies; the corresponding evolution processes are shown in Figure 11a,b, respectively.
As we can see in Figure 11, regardless of whether the initial strategy has a high ratio or a low ratio, the system remains stable and converges to $Z^*$ (0,1,1). The results show that this stable state is unaffected by the initial strategy. The DPI mechanism can effectively dampen the fluctuation of the evolutionary process. Compared with the current incentive mechanism, the designed incentive mechanism can further improve the ratio of the legal and required quality behavior of the participants and reduce the ratio of positive supervision. This does not mean that the government is unimportant under the DPI mechanism. The government still needs to carry out the top-level design, such as formulating the incentive mechanism. Furthermore, the government is trying to develop a market-oriented quality supervision mode that achieves a lower investment and a higher ratio of the legal quality behavior in construction projects. The DPI mechanism meets this requirement well.
The stability of $Z^*$ was analyzed above by computer simulation. In this part, we use the characteristic values of the Jacobian matrix to verify the simulation result. It turns out that the characteristic value is less than zero in the Jacobian matrix, which indicates that the equilibrium solution is the ESS. We first substitute the DPI mechanism into Equations (8)-(10). The Jacobian matrix of the multi-player game is then obtained as:

$$
J = \begin{bmatrix}
\dfrac{\partial G(x, y_1, y_2)}{\partial x} & \dfrac{\partial G(x, y_1, y_2)}{\partial y_1} & \dfrac{\partial G(x, y_1, y_2)}{\partial y_2} \\
\dfrac{\partial H(x, y_1, y_2)}{\partial x} & \dfrac{\partial H(x, y_1, y_2)}{\partial y_1} & \dfrac{\partial H(x, y_1, y_2)}{\partial y_2} \\
\dfrac{\partial I(x, y_1, y_2)}{\partial x} & \dfrac{\partial I(x, y_1, y_2)}{\partial y_1} & \dfrac{\partial I(x, y_1, y_2)}{\partial y_2}
\end{bmatrix}
$$

The characteristic value $\rho_1$ is:

$$
\rho_1 = \frac{\partial G(x, y_1, y_2)}{\partial x} = (2x - 1)\left(\frac{28}{195}x + 13\right) + \frac{28}{195}x(x - 1).
$$

According to the stability theory proposed by Lyapunov, for the strategy $Z^*$ (0,1,1), because $x \neq 0$ in the DPI mechanism, we let $x \to 0$:

$$
\rho_1 = \lim_{x \to 0}\left[(2x - 1)\left(\frac{28}{195}x + 13\right) + \frac{28}{195}x(x - 1)\right] = -13 < 0.
$$

The results show that all characteristic values are less than zero, so $Z^*$ is the ESS, which is consistent with the computer simulation results.
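The limit of $\rho_1$ stated above can be checked numerically. The following is a minimal sketch rather than the authors' code: it assumes that the coefficient printed as "28 195" in the expression for $\rho_1$ is the fraction 28/195, and the function name `rho1` is chosen here for illustration only.

```python
def rho1(x: float) -> float:
    """Evaluate the characteristic value rho_1 as a function of x.

    Assumption: the coefficient in the expression is the fraction 28/195.
    """
    a = 28.0 / 195.0
    return (2.0 * x - 1.0) * (a * x + 13.0) + a * x * (x - 1.0)


# Evaluate rho_1 on a sequence of x values shrinking toward zero.
samples = [10.0 ** (-k) for k in range(1, 9)]
values = [rho1(x) for x in samples]

for x, v in zip(samples, values):
    print(f"x = {x:.0e}  rho1 = {v:.6f}")
```

Under this reading, the sign of $\rho_1$ is negative for every sampled $x$ near zero, and the values approach $-13$ as $x$ shrinks, which agrees with the limit in the text.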
The above results demonstrate the effectiveness and stability of the DPI mechanism in the multi-player game. The government can reduce its direct supervision cost by setting the rules of the game in the quality supervision system. As mentioned earlier, this fits the reform of quality supervision in China, in which the supervisory authority is being decentralized to the market and the supervision relationship is shifting from the micro level to the macro level. As the number of construction projects continues to grow, it would be impossible to rely on a government with limited funding and a shortage of supervisors to supervise the participants. As a result, the government inspects only a small part of the projects, so the ratio of positive supervision is not high in reality. The simulation results of the model fit this situation. Under the fixed incentive mechanism, there is a risk of fluctuations in the multi-player game, and the designed DPI mechanism can restrain such fluctuations effectively. Furthermore, the simulation results of the DPI mechanism show its advantage in stabilizing and improving the ratio of the legal and required quality behavior of the participants. It can therefore optimize the quality supervision system of construction projects in China.
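The contrast between a fixed and a dynamic penalty can be illustrated with a deliberately simplified toy model. The sketch below is not the three-player SD model of this paper: it assumes a two-player inspection game (one supervisor with supervision ratio `x`, one participant with legal-behavior ratio `y`), hypothetical parameter values `F`, `C`, and `P`, and an assumed dynamic penalty of the form `delta * (1 - y) * F`, which ties the penalty only to the ratio of illegal behavior. It uses plain Euler integration of the replicator equations.

```python
def simulate(dynamic_penalty: bool, x0: float = 0.5, y0: float = 0.3,
             steps: int = 60000, dt: float = 0.001) -> list:
    """Euler-integrate a toy two-player replicator system; return the y path.

    x: ratio of positive supervision, y: ratio of legal quality behavior.
    All parameter values are hypothetical and chosen only for illustration.
    """
    F = 10.0      # baseline penalty (hypothetical)
    C = 2.5       # supervision cost (hypothetical)
    P = 3.0       # extra profit from illegal behavior (hypothetical)
    delta = 1.0   # penalty coefficient (hypothetical)
    x, y = x0, y0
    ys = []
    for _ in range(steps):
        # Assumed dynamic form: penalty scales with the illegal-behavior ratio.
        penalty = delta * (1.0 - y) * F if dynamic_penalty else F
        # Supervisor: expected fines collected minus supervision cost.
        dx = x * (1.0 - x) * ((1.0 - y) * penalty - C)
        # Participant: expected penalty avoided minus extra profit forgone.
        dy = y * (1.0 - y) * (x * penalty - P)
        x += dt * dx
        y += dt * dy
        ys.append(y)
    return ys


def late_swing(ys: list, tail: int = 20000) -> float:
    """Peak-to-trough range of y over the last `tail` steps."""
    window = ys[-tail:]
    return max(window) - min(window)


fixed_swing = late_swing(simulate(dynamic_penalty=False))
dynamic_swing = late_swing(simulate(dynamic_penalty=True))
print(f"late swing of y, fixed penalty:   {fixed_swing:.4f}")
print(f"late swing of y, dynamic penalty: {dynamic_swing:.4f}")
```

In this toy setting the fixed penalty leaves `y` cycling with a large late-time swing, while the assumed dynamic penalty damps the swing toward zero. The toy model settles at an interior point rather than at (0,1,1), so it illustrates only the damping effect and not the equilibrium reported for the full model.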

Conclusions
Improper quality behavior results in quality behavior risks of the participants in construction projects. The quality behavior risks of a participant can be transmitted to the downstream participant if the participant fails to eliminate the risks. The government, with a shortage of supervisors and limited funding, cannot supervise the participants efficiently. To address the transmission of quality behavior risks in construction projects, this paper abstracts the participants as the UP and the DP. The evolutionary multi-player game in China's quality supervision system, including the government, the UP, and the DP, was investigated. The multi-player game was then simulated with the SD model to intuitively analyze the stability of the game. The influence of changing the penalties and rewards of the participants on the evolution process was explored. A DPI mechanism was developed to deal with the fluctuations in the game. The conclusions are as follows:
(1) Under the current incentive mechanism, the evolutionary process of the multi-player game system fluctuates and cannot reach an ESS, which means that high risk exists in the current quality supervision system of construction projects in China. This is the main reason for the high occurrence probability of quality accidents in construction projects in China. A reasonable penalty and reward mechanism has a significant impact on the strategy selections of the participants. A low penalty results in all participants adopting the illegal and unqualified quality behavior. A high penalty finally converges to a pure strategy, which is unstable. Excessively high or low rewards can worsen the amplitude and frequency of the fluctuations of the multi-player game, which raises the uncertainty of the game system and makes supervision harder for the government.
(2) The simulation results show that there is an alternating push-pull relationship between the government and the UP. The government urges the UP to improve the legal and required quality behavior. When supervision is relaxed, the ratio of the legal and required quality behavior decreases, which in turn prompts the government to strengthen its supervision. However, the results show that the supervision efficiency is poor. With the deterrent effect produced by the interplay between the government and the UP, the DP keeps the legal and required quality behavior. Under the transmission of the quality behavior risks, giving high priority to the UP helps to control the risks, namely, to hinder the transmission.
(3) The DPI mechanism can effectively restrain the fluctuation of the evolutionary process to control the quality behavior risks. The system rapidly reaches an ESS where the government has its lowest supervision cost and the participants have their highest quality responsibility to ensure quality. It can be deduced that introducing the DPI mechanism in the quality supervision of construction projects can help stabilize and improve the ratio of the legal and required quality behavior of the participants.
To implement the DPI mechanism and improve the efficiency of the quality supervision system, we propose the following policy recommendations for the government.
(1) The DPI mechanism requires a sound and robust supervision system. The government should devote time to enabling the implementation of the DPI mechanism, for example by establishing sound laws and rules, a scientific division of responsibilities, and clear regulatory boundaries and standards.
(2) Dynamic red lists and black lists of the participants can be adopted to encourage the participants to choose legal and required quality behavior during the supervision. This requires the government to strengthen the information integration of violations and to speed up the updating of the information. Dynamic and automatic detection techniques, such as building information modeling, also need to be introduced.
There are some limitations to this article. This paper considers the effectiveness of government supervision in controlling the quality behavior risks in construction projects. The supervision of the government is divided into positive supervision and negative supervision. Positive supervision comprises several approaches, such as purchasing testing services from a third party, unannounced inspections, strengthened training, and incentives for government supervisors. The supervision cost and the efficiency of these approaches are likely to differ, but this paper does not consider the difference. Therefore, the effectiveness of different approaches of positive supervision in controlling the quality behavior risks can be explored in future research. In addition, the simulation result of the DPI mechanism shows that the ratio of positive supervision is equal to zero. Although we interpret zero supervision as meaning that the government should focus on top-level design during the supervision, the optimal intensity of the supervision is not considered. Thus, the optimal intensity of the supervision in the quality supervision system can be further investigated.