Does China’s Municipal Solid Waste Source Separation Program Work? Evidence from the Spatial-Two-Stage-Least Squares Models

Abstract: This paper evaluates the impact of the second municipal solid waste (MSW) source separation program on municipal solid waste generation (MSWG) in China. Without considering the spatial interactions between cities, the second MSW source separation program has a nonsignificant negative effect on the per capita municipal solid waste generation (PMSWG). When we relax the stable-unit-treatment-value assumption (SUTVA), which is imposed in most of the previous literature on the estimation of treatment effects, by allowing for spatial spillover effects among cities, and correct for the endogeneity of the local policy choice, the program has a significantly negative but not robust impact on the PMSWG. The estimation results of the generalized nesting spatial regression models (GNS) imply that the spatial interaction characteristics among Chinese prefecture-level cities may, if neglected, lead to underestimation of the reduction effects of the second MSW source separation policy on the absolute amount of PMSWG. More importantly, our study indicates that although not all the spatial econometric models support a significant reduction effect of source separation on the absolute amount of PMSWG, the source separation program significantly reduces the relative amount of PMSWG, and this result is robust in all spatial models.


Introduction
The problem of municipal solid waste (MSW) caused by economic growth and urbanization is expected to be a major worldwide challenge in the future, especially for developing countries. From the perspective of economic theory, the following three types of tax/subsidy schemes can be used to address the problem of MSW: Purchase-relevant instruments, discard-relevant instruments, and jointly-relevant instruments [1], which are for pricing solid waste collection and disposal services at the different stages of municipal solid waste management (MSWM). The economic research has found multiple combinations of these policies that can encourage an efficient MSWM [2][3][4][5]. However, all of these policy designs are flawed to varying degrees [6,7]. Due to the feasibility of various tax and subsidy policies in practice, the municipal authorities of developing countries have limited policy tools and mainly rely on traditional Command and Control Policies. Those policies involve imposing restrictions on the recycling proportion of all packaging waste [2], using a certain percentage of recycled materials [8], and implementing MSW source separation programs.
MSW source separation is one of the major MSW reduction approaches adopted worldwide. It has been proved to be an effective method to achieve waste minimization and circular economy and some believe that source separation should be a key priority and that the program pilot in China should be extended to more cities [9].
Up to now, cities implementing source separation policies have formulated different separation standards according to their own characteristics. Most cities adopt the four-category approach, which divides the MSW into recyclable, food, hazardous, and other waste. However, the specific names used by each city differ slightly. For example, food waste is named perishable waste in Hangzhou, while Shanghai calls food waste wet waste and other waste dry waste to make it easier for citizens to distinguish them. In addition, the standard of Guangzhou even lists the categories of food waste in detail for the convenience of citizens. When community residents discard waste, they need to put each type of waste into the bin of the corresponding color according to the regulations. Different garbage trucks then collect the waste according to the color of the bins, and it is transferred to various waste facilities by local municipalities. Thus, source separation can increase the recycling rate of the formal collection system and also reduce the pressure on landfill or incineration disposal facilities.
Diverting waste from landfills has many environmental benefits, such as the reduction in greenhouse gas emissions, the minimization of waste pretreatment operations, and energy recovery by composting [29]. Aphale et al. (2015) [30] verified that recycling performance is positively related to separation efficiency. Thus, estimating the effectiveness of source separation is important when considering the actual impact of the program. The government could benefit from such an estimate and improve the performance of MSWM. The first pilot was implemented 19 years ago, when the public showed little environmental awareness. Additionally, the municipal authorities did not provide enough support for it, and Han and Zhang (2017) [31] have shown that the pilot was ineffective. The time interval between the second and the third pilot is relatively short, so a potential lagged effect (hysteresis) of the second pilot may exist. Therefore, this paper mainly examines the causal effects of the second MSW source separation program on the PMSWG.

Description of the Methodology
In most empirical works, program evaluation with nonexperimental data needs to impose a strong assumption known as the stable-unit-treatment-value assumption, or SUTVA [32]. SUTVA requires that the treatment effects of units are independent of each other, that is, the spillover effects between the potential outcomes of units must be ruled out.
SUTVA might be plausible in many biomedical experiments and has been adopted by many previous program evaluation studies [33,34]. However, it is quite intuitive that SUTVA may not be true in many scenarios, such as in the research and development (R&D) activities of companies and epidemiology. It might be difficult to believe in the absence of "spillovers" among units [20]. Similarly, and a fortiori, the impact of implementing the second program is not only on the reduction of PMSWG but also, perhaps, on spillovers to the neighboring cities. So far, only a few papers have formally included the interaction between units in the literature on the estimation of program evaluation [20], let alone in the MSWM research.
Meanwhile, some believe that the SUTVA is not testable in principle [34]. However, spatial factors actually play an important role in environmental and economic issues. Ignoring spatial factors may omit important variables, which can cause biased estimates of the model parameters [35]. According to the first law of geography [36], sample data based on geographical location are often not independent of each other but exhibit spatial dependence. Generally, the spatial dependence between adjacent units is stronger than that between distant units.
Emerging empirical literature has explored the spillover effects in the MSWM domain, such as the spillover effects of MSWM performance [37], MSW recycling outcomes [38], and PMSWG [39]. In terms of the spillover effect in the PMSWG case, there may be several reasons why PMSWG in one city can be influenced by policies and practices adopted in neighboring cities.
Firstly, at the individual behavior level, the decrease in PMSWG represents the improvement of residents' pro-environmental behaviors and indicators of sustainability. Psychological theory argues that the behavior consistency effect and the social identity effect may foster a positive spillover effect of pro-environmental behaviors [40]. Agovino et al. (2016) [41] adopted waste collection habits as a proxy of pro-environmental behaviors and found that provinces with good pro-environmental habits may positively influence neighboring ones in Italy. Crocita et al. (2016) [42] used the separate collection of MSW to represent a pro-environmental attitude and confirmed the presence of positive proximity effects among virtuous neighbors in Italian provinces. In other words, if spatial units are surrounded by units with good waste separation habits, they are more likely to adopt the same behavior since they feel part of a virtuous circle [37]. Thus, a decrease in PMSWG in one city may lead to a decrease in PMSWG in nearby cities due to similar social and cultural environments. The spillover effect decreases as the distance increases, which is embodied in the specification of the spatial weight matrix described below.
Secondly, considering the regional differences, neighboring cities with similar characteristics may be easier to exchange and emulate each other's experiences [38], and in this way, affect each other's MSWM policies and, in turn, the PMSWG. Furthermore, due to the similarity of economic development level and consumption culture in adjoining regions, the implementation of MSWM policies, such as source separation program in pilot cities, is more likely to affect the MSW separation awareness and classification knowledge of residents in adjacent cities. These positive proximity effects were also observed by the pioneering work in Oskamp et al. (1991) [43]. Thus, the PMSWG of a city may be influenced by neighboring cities.
Thirdly, MSW is a part of human life and will be generated along with human economic activities. Moreover, the change in economic mode will inevitably affect the production and types of MSW. Economic convergence clubs within China have been verified [44], which implies that there will be different economic agglomeration blocks, and the interaction between adjacent cities within the blocks will be greater than that between cities farther away. If the economic growth rate is relatively high in one city, the adjoining cities may imitate its economic development model and industrial structural arrangement, as by-products of economic activity, thus the types and production of MSW in neighboring cities will be affected by the economic development and the MSWM policies of this city.
Logically speaking, the PMSWG of each city in China will demonstrate spatially related characteristics. The choropleth maps of PMSWG in China in 2002, 2016, and the average of 2002 to 2016 are presented in Figure 1. The darker areas indicate a higher PMSWG. The PMSWG appears to be spatially dependent since the cities with higher PMSWG are clustered together. That is, the stylized fact that PMSWG exhibits homogeneity in adjacent cities is observed. If the PMSWG of each city was randomly distributed with regard to location, we would not see the clusters. A formal test of spatial correlation will be conducted below. In fact, in other fields, many scholars have also included the spatial interaction relationship in their research framework [45][46][47][48]. Therefore, if the pervasive spatial dependence is confirmed among cities in China, the spatial spillover effects should be considered in the program evaluation of MSW studies.
In addition, identifying the causal effect of the second program, $T_i$, on PMSWG faces seemingly insurmountable problems in nonexperimental data, where $T_i = 1$ if and only if city i implemented the second program, and $T_i = 0$ otherwise. Generally, in the program evaluation literature, the cities with $T_i = 1$ are called the treated group and those with $T_i = 0$ the control group. However, the program is usually not randomly assigned, and there may be various significant unobserved factors, including the alternatives available to cities, their environmental preferences, waste management achievement, and the potentially idiosyncratic political benefits for local bureaucrats in an interjurisdictional competition context [49]. More generally, the estimation of the causal effect of $T_i$ may be biased and inconsistent when $T_i$ is not randomly assigned, and this essential problem is called selection bias.
To control for the possibility of endogenous program choices, the decision of whether to adopt the second program or not was modeled as a function of observable exogenous variables. Then, the predicted probabilities for the program implementation were used to substitute the possible endogenous variable, T i , in the PMSWG equation using two-stage least squares (2SLS). With data on ex-ante and ex-post program intervention, 2SLS allows a more flexible strategy to deal with unobserved heterogeneity and permits these characteristics to vary with the duration of the program.
Finally, we relaxed the SUTVA by including different spatial lag terms in 2SLS to control for the spatial interaction effects, and the treatment effect of the second program was estimated by setting different spatial weighting matrices. The data and detailed empirical strategy are introduced in the following section.

Data
The sample data include surveys both before and after the second source separation program intervention. Therefore, panel data on 569 prefecture-level cities in mainland China between 2002 and 2016 were collected in this study. Using panel data can correct the bias caused by omitted variables and control for heteroskedasticity within different cities. We merged the panel data with the shapefile of prefecture-level cities [50], which contains the geographical information needed to create spatial weighting matrices, and this finally reduced our sample size to 288 prefecture-level cities.
All data are from Chinese official statistics, including the "China Statistical Yearbook", the "China Statistical Yearbook of Urban Construction" and City Statistical Yearbook of each city [51,52]. The indicator variable SEP2 = 1 if the city has implemented the second source separation program, and 0 otherwise. Let PROCAP = 1 if the city is the provincial capital city, and 0 otherwise. Some dummy variables of geographical location, EAST, MIDDLE, WEST, and NORTHEAST indicate that these cities are located in the east, middle, west, and northeast of China, respectively. As a summary, the descriptions of the variables are listed in Table 1.
The total and average MSWG in the treated cities are significantly larger than the average level of untreated cities. At the same time, the two groups have significant differences in terms of the GDP per capita, population density, and urbanization rate. The table illustrates that the data are unbalanced, and prefecture-level cities display considerable variation in MSWG, economic characteristics, and demographic variables, which may influence whether the cities implement the program or not, i.e., the program was not randomly assigned. This nontrivial difference may cause potential endogenous problems.

Specification with SUTVA
2SLS relaxes the exogeneity assumption of OLS and is robust to time-varying selection bias, which some identification strategies, such as the difference-in-differences (DID) estimation, cannot control. In the following section, we set the 2SLS econometric model with SUTVA, as in many previous program evaluation studies, and then relax the assumption by adding the spatial spillover effects in the model specification. The mean values of some characteristic variables between 2002 and 2016 in the treated and control cities are shown in Table 2.
Sustainability 2020, 12, 1664

A Discrete Choice Model
The first-stage regression constructs a nonlinear binary response model to calculate the predicted probability of each city being selected into the second program, i.e., it regresses the treatment dummy variable on the vector of exogenous covariates. A discrete choice model is used to depict the decision of whether each local government implements the second program or not. The most prominent types are Logit and Probit, both of which can be derived under the assumption of the utility-maximizing behavior of the local government. The discrete choice model can be derived by random utility theory [53]. Each local government is assumed to face a tradeoff between the potential benefits and costs of implementing the program. A latent variable $U_i^{SEP2}$ is defined as the net benefits for city i from adopting the program, as follows:
$$U_i^{SEP2} = Z_i^{SEP2}\gamma + \varepsilon_i^{SEP2} \tag{1}$$
where $U_i^{SEP2}$ is the net utility of whether city i chooses to implement the second source separation program (SEP2) or not. The vector $Z_i^{SEP2}$ includes all the variables that may help determine the local government's choices, such as the number of vehicles and equipment for municipal environmental sanitation, the number of Harmless Treatment Plants, and other exogenous characteristic variables of city i, and $\gamma$ is the corresponding vector of coefficients to be estimated. $\varepsilon_i^{SEP2}$ is the disturbance term. We cannot observe the net benefits $U_i^{SEP2}$ from implementing the program. Instead, we only know whether a city has implemented such a program, i.e., as follows:
$$SEP2_i = \begin{cases} 1 & \text{if } U_i^{SEP2} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
Therefore, city i chooses to implement the program with the following probability:
$$\Pr(SEP2_i = 1) = \Pr(U_i^{SEP2} > 0) = \Pr(\varepsilon_i^{SEP2} > -Z_i^{SEP2}\gamma) \tag{3}$$
If we assume that $\varepsilon_i^{SEP2}$ follows a standard logistic distribution, the familiar conditional Logit model arises:
$$\Pr(SEP2_i = 1 \mid Z_i^{SEP2}) = \frac{\exp(Z_i^{SEP2}\gamma)}{1 + \exp(Z_i^{SEP2}\gamma)} \tag{4}$$
We used this model to depict the choice of each city and to generate the predicted probability $\widehat{PR}$ that each city will choose to adopt the program. $\widehat{PR}$ is used as the instrument variable (IV) to substitute the possible endogenous variable, SEP2, in the second stage regression.
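As an illustration of this first stage, the following is a minimal sketch of fitting a Logit model by Newton-Raphson and producing predicted adoption probabilities. It is not the authors' code: the toy data, the single covariate, and the function names (`fit_logit`, `predict`, `solve`) are assumptions made only for the example.

```python
import math

def sigmoid(v):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-v))

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def predict(Z, gamma):
    """Predicted probability of adoption for each row of Z."""
    return [sigmoid(sum(z * g for z, g in zip(row, gamma))) for row in Z]

def fit_logit(Z, t, iterations=25):
    """Maximize the Logit log-likelihood by Newton-Raphson.
    Each row of Z must start with a constant 1.0."""
    k = len(Z[0])
    gamma = [0.0] * k
    for _ in range(iterations):
        p = predict(Z, gamma)
        grad = [sum((t[i] - p[i]) * Z[i][j] for i in range(len(Z))) for j in range(k)]
        info = [[sum(p[i] * (1.0 - p[i]) * Z[i][a] * Z[i][b] for i in range(len(Z)))
                 for b in range(k)] for a in range(k)]
        step = solve(info, grad)
        gamma = [g + s for g, s in zip(gamma, step)]
    return gamma

# Hypothetical toy data: one covariate (e.g., a sanitation-capacity index) and
# an adoption dummy for eight cities. These numbers are illustrative only.
covariate = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
adopted = [0, 0, 1, 0, 1, 0, 1, 1]
Z = [[1.0, x] for x in covariate]

gamma_hat = fit_logit(Z, adopted)
pr_hat = predict(Z, gamma_hat)  # plays the role of the predicted probability
```

Because the model contains a constant, the fitted probabilities sum to the number of adopting cities, which is a quick sanity check on convergence.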

The 2SLS Model
Conventionally, we use the following equation to estimate the treatment effect of the SEP2 on PMSWG:
$$Y_{it} = \beta\, SEP2_{it} + X_{it}\theta + \mu_i + \phi_t + \varepsilon_{it} \tag{5}$$
where $Y_{it}$ denotes PMSWG, $X_{it}$ is the vector of covariates that affect PMSWG and $\theta$ is the vector of coefficients on $X_{it}$. $\mu_i$ is the city-level fixed effect, $\phi_t$ represents the time effect, and $\varepsilon_{it}$ is the disturbance component. $\beta$ is the program impact that we want to estimate. The predicted probability $\widehat{PR}$ embodies only the exogenous variation in the treatment. Then, we use $\widehat{PR}$ as the exogenous IV for treatment placement, creating the following second-stage regression:
$$Y_{it} = \beta\, \widehat{PR}_{it} + X_{it}\theta + \mu_i + \phi_t + \varepsilon_{it} \tag{6}$$
In this study, taking the logarithmic form of some major continuous regressors can reduce the data volatility and the potential disturbance of heteroscedasticity.
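A minimal sketch of the second stage with city and time effects is given below. It assumes a balanced panel and, for brevity, a single regressor (the predicted probability) with no additional covariates. The data and the function names are hypothetical, and the within transformation shown here is one standard way to absorb both sets of effects, not necessarily the routine used in the study.

```python
def two_way_demean(a):
    """Within transformation for a balanced panel stored as a[city][year]:
    subtract the city mean and the year mean, then add back the grand mean."""
    n, T = len(a), len(a[0])
    city_mean = [sum(row) / T for row in a]
    year_mean = [sum(a[i][t] for i in range(n)) / n for t in range(T)]
    grand = sum(city_mean) / n
    return [[a[i][t] - city_mean[i] - year_mean[t] + grand for t in range(T)]
            for i in range(n)]

def two_way_fe_slope(y, x):
    """OLS slope of demeaned y on demeaned x (single-regressor case)."""
    yd, xd = two_way_demean(y), two_way_demean(x)
    num = sum(xd[i][t] * yd[i][t] for i in range(len(y)) for t in range(len(y[0])))
    den = sum(xd[i][t] ** 2 for i in range(len(y)) for t in range(len(y[0])))
    return num / den

# Hypothetical predicted probabilities for 3 cities over 4 years.
pr = [[0.1, 0.2, 0.6, 0.7],
      [0.3, 0.3, 0.4, 0.9],
      [0.2, 0.5, 0.5, 0.6]]
mu = [1.0, 2.0, 0.5]        # city effects (illustrative)
phi = [0.0, 0.1, 0.2, 0.3]  # year effects (illustrative)
beta_true = -0.4            # illustrative program impact

# Outcome generated without noise so that the slope is recovered exactly.
y = [[beta_true * pr[i][t] + mu[i] + phi[t] for t in range(4)] for i in range(3)]

beta_hat = two_way_fe_slope(y, pr)
```

With noiseless data the transformation removes both sets of effects exactly, so `beta_hat` coincides with the coefficient used to generate the outcome.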

Spatial Dependence Tests
In the first step, the spatial dependence of the sample data should be tested in advance. If there is no spatial dependence, the traditional regression model is sufficient. The Moran's I index [54] and the Moran scatter plot are the tools usually used to measure spatial dependence. Moran's I is defined as follows:
$$I = \frac{N}{S} \cdot \frac{e'We}{e'e} \tag{7}$$
where N is the number of observational spatial units, e is the residual vector of OLS, W is the spatial matrix, and S is the normalizing factor equal to the sum of all the elements in the weighting matrix. If we normalize the rows of the weighting matrix, then S = N and Equation (7) reduces to the following:
$$I = \frac{e'We}{e'e} \tag{8}$$
In our study, W(1) is the distance threshold spatial weighting matrix, based on the specification that if cities i and j are farther apart than a distance threshold d, then $w_{ij}$ is set to 0, otherwise $w_{ij} = 1$. To ensure that each city has at least one adjacent city, we set the distance threshold as the longest distance between city i and city j by using the method applied in spatial econometric practice [55]. Concretely, d equals 2109 km, which is the maximum arc distance between the two cities in the sample data. W(2) is specified in accordance with the queen contiguity rule: if two cities, i and j, have a common border or vertex, then $w_{ij} = 1$, otherwise $w_{ij} = 0$. With $d_{ij}$ denoting the distance between cities i and j, the specific forms of the two spatial matrices are as follows:
$$w_{ij}^{(1)} = \begin{cases} 1 & \text{if } d_{ij} \le d \\ 0 & \text{if } d_{ij} > d \end{cases} \qquad w_{ij}^{(2)} = \begin{cases} 1 & \text{if cities } i \text{ and } j \text{ share a common border or vertex} \\ 0 & \text{otherwise} \end{cases}$$
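The index can be illustrated with a short sketch. The four-city chain, the residual values, and the function names below are hypothetical, and the code simply evaluates the quadratic-form definition of Moran's I for a raw and a row-normalized weighting matrix.

```python
def row_normalize(W):
    """Divide each row of W by its row sum (rows with no neighbors stay zero)."""
    out = []
    for row in W:
        s = sum(row)
        out.append([w / s if s > 0 else 0.0 for w in row])
    return out

def morans_i(e, W):
    """Moran's I = (N / S) * (e'We) / (e'e), with S the sum of all weights."""
    n = len(e)
    S = sum(sum(row) for row in W)
    num = sum(W[i][j] * e[i] * e[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in e)
    return (n / S) * num / den

# Hypothetical example: four cities on a line, each adjacent to its neighbors.
W_raw = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
W_std = row_normalize(W_raw)

clustered = [1.0, 1.0, -1.0, -1.0]    # similar residuals sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbors always differ

i_raw = morans_i(clustered, W_raw)
i_std = morans_i(clustered, W_std)    # S equals N after row normalization
i_alt = morans_i(alternating, W_raw)
```

The clustered pattern gives a positive index and the alternating pattern a negative one, matching the interpretation of positive and negative spatial dependence.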

The S2SLS Models
If Moran's I test and the Moran scatter plot confirm the existence of spatial interaction, appropriate spatial econometric models should be set to control for spatial interference. We introduce the spatial correlation in the second stage of the above endogeneity-correcting regression process as follows:
$$\ln Y_{it} = \rho W \ln Y_{it} + \beta\, \widehat{PR}_{it} + X_{it}\theta + \delta W x_{jt} + \mu_i + \phi_t + (I - \sigma W)^{-1}\varepsilon_{it} \tag{10}$$
W is the spatial weighting matrix, which characterizes the spatial relationships between the observable units, and the elements $w_{ij}$ specify the potential spillover among interactive units i and j. $W \ln Y_{it}$ is the spatial lag of the dependent variable, and $\rho W$ measures the spillover effects of nearby outcomes. $W x_{jt}$ is the spatial lag of the spatially correlated independent variable $x_{jt}$, and $\delta W$ measures the spillover of nearby covariates $x_{jt}$ to the outcome of unit i. $\sigma$ is the spatial correlation parameter of the residuals, which accounts for the spatial interactions among unobservable neighbor factors. Equation (10) estimates the absolute reduction effect of SEP2. At the same time, we further examine the relative reduction impact of SEP2, that is, the treatment effect of SEP2 on the growth rate of PMSWG, in Equation (11). There, $d\ln Y_{it} = \ln Y_{it} - L.\ln Y_{it}$ is the increase of PMSWG and $L.\ln Y_{it}$ refers to the first lag of $\ln Y_{it}$, i.e., $\ln Y_{i(t-1)}$. Thus, Equation (11) replaces the dependent variable of Equation (10) with the growth rate of PMSWG.
The setting of Equation (10) is called the generalized nesting spatial model (GNS), which is the most general spatial econometric model. If δ = 0 and σ = 0, the GNS degenerates into a spatial lag model (SLM), also known as a spatial autoregressive model. If ρ = 0 and δ = 0, the GNS simply becomes a spatial error model (SEM). If σ = 0, the GNS reduces to a spatial Durbin model (SDM). Thus, SEM, SLM, and SDM are special forms of GNS. In this study, the GNS is used to estimate the spatial regression model, to take full account of the spatial interaction.
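To make the nesting explicit, the restrictions can be written out term by term. The following is a sketch that assumes the GNS contains a spatial lag of the outcome, spatial lags of the covariates, and a spatially autocorrelated error $u$, as described by the parameter definitions above; the stacked notation (subscripts dropped, $WX$ for the lagged covariates) is used here only for illustration.

```latex
\begin{align*}
\text{GNS:}\quad
  & \ln Y = \rho W \ln Y + \beta\,\widehat{PR} + X\theta + W X \delta + \mu + \phi + u,
    \qquad u = \sigma W u + \varepsilon \\
\text{SLM } (\delta = 0,\ \sigma = 0):\quad
  & \ln Y = \rho W \ln Y + \beta\,\widehat{PR} + X\theta + \mu + \phi + \varepsilon \\
\text{SEM } (\rho = 0,\ \delta = 0):\quad
  & \ln Y = \beta\,\widehat{PR} + X\theta + \mu + \phi + u,
    \qquad u = \sigma W u + \varepsilon \\
\text{SDM } (\sigma = 0):\quad
  & \ln Y = \rho W \ln Y + \beta\,\widehat{PR} + X\theta + W X \delta + \mu + \phi + \varepsilon
\end{align*}
```

Each restricted model drops one or two of the three spatial channels, so a test of the corresponding restrictions indicates whether the simpler model is adequate.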
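In the spatial econometric model, we control the spatial dependence by setting three different spatial weight matrices. The contiguity spatial matrix W(2) (CON matrix) continues to be defined by the rule of queen contiguity previously described. The elements of the inverse-distance spatial matrix W(3) (IDIS matrix) equal the reciprocal of the distance between cities, assuming that spillover effects are proportional to the inverse of the distance between cities. The inverse-distance contiguity spatial matrix W(4) (IDIS-CON matrix) is a weighting matrix that contains the inverse distance for adjacent cities and is 0 otherwise. With $d_{ij}$ denoting the distance between cities i and j ($i \neq j$), the three weighting matrices are set as follows:
$$w_{ij}^{(2)} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases} \qquad w_{ij}^{(3)} = \frac{1}{d_{ij}} \qquad w_{ij}^{(4)} = \begin{cases} 1/d_{ij} & \text{if } i \text{ and } j \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases}$$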
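The weighting schemes described in this paper can be sketched as small functions that take a pairwise distance matrix and a contiguity matrix as inputs. The three-city distances, the contiguity pattern, the 260 km threshold, and the function names are hypothetical and serve only to show how the matrices differ; diagonal elements are set to zero, which is the usual convention and an assumption here.

```python
def threshold_matrix(D, d):
    """Distance-threshold weights: 1 if two distinct cities are within d, else 0."""
    n = len(D)
    return [[1.0 if i != j and D[i][j] <= d else 0.0 for j in range(n)]
            for i in range(n)]

def inverse_distance_matrix(D):
    """IDIS weights: reciprocal of the distance between distinct cities."""
    n = len(D)
    return [[1.0 / D[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]

def inverse_distance_contiguity_matrix(D, C):
    """IDIS-CON weights: inverse distance for adjacent cities, 0 otherwise."""
    n = len(D)
    return [[1.0 / D[i][j] if i != j and C[i][j] == 1 else 0.0 for j in range(n)]
            for i in range(n)]

# Hypothetical distances in km between three cities.
D = [[0.0, 100.0, 300.0],
     [100.0, 0.0, 250.0],
     [300.0, 250.0, 0.0]]
# Hypothetical queen-contiguity matrix: city 1 borders both others.
C = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

W1 = threshold_matrix(D, 260.0)
W3 = inverse_distance_matrix(D)
W4 = inverse_distance_contiguity_matrix(D, C)
```

In this toy case cities 0 and 2 are neither within the threshold nor contiguous, so they are linked only under the pure inverse-distance scheme.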

The 2SLS Evaluation
The 2SLS is adopted to control the self-selection problem. A Logit model is applied in the first stage to depict the decision of whether each local government implements the second separation program or not at the city level, i.e., regressing the treatment policy dummy variable (SEP2 = 1) on the vector of city-level exogenous covariates. Thus, in the first stage, we include in Equation (1), as explanatory variables, a set of city-level exogenous variables that are correlated with $T_i$ but not correlated with the error term. Since the actual implementation of the program was in 2014, the probability of implementing the second MSW source separation program in each city was first estimated by using observable exogenous variables from the period before the implementation of the policy (2002-2013), which were not affected by the policy. The estimated probability then replaces the endogenous policy variable to test the impact of the second MSW source separation program. The first-stage analysis generates the predicted probability $\widehat{PR}$ that each city will choose to execute the program. $\widehat{PR}$ is then used as the instrument variable (IV) to substitute the possible endogenous variable, SEP2, in the second stage regression. The reduction effects of the second source separation program on PMSWG can be estimated based on the correction of endogenous problems.
The regression results from the Logit model defined in Equation (4) are presented in Table 3. The fourth column in Table 3 displays the marginal effect of a change in the particular explanatory variable on the probability that a city adopts the second MSW source separation program. After controlling for other covariates, the probability of implementing the program is estimated to increase by approximately 1.45% for an additional percentage of the number of vehicles and equipment for Municipal Environmental Sanitation (lnVEH) and to increase by 0.03% for a 1% increase in the percentage of Harmless Treatment Capacity per year (lnHLC). The number of Harmless Treatment Plants/Grounds (HLS) is also significantly positively related to the probability of enforcing this program. Perhaps the cities that have better performance regarding the infrastructure are more willing to apply for this top-down pilot program to acquire the potentially idiosyncratic political benefits.
The analysis also implies that the provincial capital cities are 8.89% more likely to implement the second MSW source separation program. In general, provincial capitals have more sufficient financial funds and developed economies, so they are more likely to adopt the program in response to this top-down political propaganda. From the marginal effects of other variables in Table 3, we can also infer the influence of other variables on the probability of cities choosing to implement the second source separation program. The results in Table 3 are instructive and interesting in their own right, but the primary purpose of estimating these discrete models is to create an exogenous prediction to substitute for the policy dummy variable SEP2 in Equation (5).
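The marginal effects discussed above can be illustrated with a short sketch. It assumes the standard Logit formulas: for a continuous regressor the effect is the coefficient scaled by p(1 - p), and for a dummy such as PROCAP it is the difference in predicted probabilities with the dummy switched on and off. The index values and coefficients below are hypothetical and are not taken from Table 3.

```python
import math

def logit_prob(index):
    """Adoption probability implied by a Logit index Z*gamma."""
    return 1.0 / (1.0 + math.exp(-index))

def marginal_effect_continuous(index, coef):
    """dP/dz for a continuous regressor: p * (1 - p) * coefficient."""
    p = logit_prob(index)
    return p * (1.0 - p) * coef

def marginal_effect_dummy(index_without, coef):
    """Discrete change in probability when a dummy switches from 0 to 1."""
    return logit_prob(index_without + coef) - logit_prob(index_without)

# Hypothetical values for illustration only.
me_at_half = marginal_effect_continuous(0.0, 1.0)   # evaluated where p = 0.5
me_capital = marginal_effect_dummy(-2.0, 1.0)       # e.g., a capital-city dummy
```

Because p(1 - p) peaks at p = 0.5, the same coefficient implies a smaller marginal effect for cities whose adoption probability is close to 0 or 1.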
In the second stage, the predicted probabilities of the second MSW source separation program ($\widehat{PR}$) from the Logit model are used as an exogenous predictor in Equation (6). To better approximate the exact relationship between environmental degradation and economic growth, Sobhee and K.Sanjeev (2004) [56] suggest that higher-order terms of major economic explanatory variables should be added to the regression. Therefore, we include GDP per capita (PGDP) in a nonlinear fashion in Equation (6), using the logarithmic form of PGDP and both its square and cubic form in the regression. The panel data estimation results of Equation (6) are reported in Table 4 when the SUTVA holds.
Table 4. Estimation results of the program's impact on PMSWG with SUTVA (columns: Individual-Fixed Effect, Two-Way-Fixed Effect, Random Effect).
As shown in Table 4, the estimated coefficients for lnPGDP, (lnPGDP)^2, and (lnPGDP)^3 in all estimation approaches are significantly positive, negative, and positive, respectively, and the estimates are very close to each other. Therefore, it is appropriate to include higher-order terms of the economic variable in the regression analysis. The results indicate that the implementation of the second MSW source separation program is estimated to decrease PMSWG, except in the random effect model. However, neither of these estimates is significantly different from zero, which implies that after correcting the endogenous local policy program choice, the second MSW source separation program did not significantly reduce PMSWG. Thus, these estimation results cannot support the claims that China should vigorously promote the source separation program and extend it to more cities [9].
Some of the previous literature includes an impact evaluation of the demographic variables, population density, urban population, and urbanization rate on PMSWG [57][58][59]. We incorporated these covariates into the regression model and controlled the cross-product term of GDP per capita with the demographic variables and location dummy variables.
Since the location dummy variables do not change with time in the panel data, they are omitted in the individual fixed effect model and the two-way fixed effect model. At the same time, we also included the MSW pricing policy variable, i.e., the logarithmic form of the MSW disposal fee of each city (lnFEE), in the regression model. The estimation shows that the influence of the waste charging mechanism on the PMSWG seems to be negative in most approaches except for the random effect result, but the coefficients are insignificant in all models, suggesting that the waste charging policy does not effectively reduce the PMSWG.
The null hypothesis that the constant terms are equal across the units of the individual fixed effect model is rejected. The statistic of the F test is 40.08, and the corresponding p-value is lower than 0.01 (with 287 degrees of freedom [df]), indicating that the pooled OLS would produce inconsistent estimates and the individual-specific heterogeneity should be controlled. In addition, the two-way fixed effect model adds the time effect to the individual fixed effect, and the time effects are jointly significant, at least at the 1% level (F = 9.94, df = 14, P < 0.01), suggesting that they should be included in the regression. The time effect is also controlled in the random effect model, and the chi2 test for the time effect also indicates that the null hypothesis of the insignificant time effect should be rejected (chi2 = 238.65, df = 14, P < 0.01). In the Hausman-test framework for fixed effect and random effect model selection, the null hypothesis that the random estimator is consistent is soundly rejected (chi2 = 51.06, df = 27, P < 0.01). Therefore, the result of the random effect estimator is inconsistent, and the two-way fixed effect estimator is preferred.
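The degrees of freedom reported for the test of equal intercepts follow from the number of cities. The sketch below shows the textbook F statistic that compares a pooled regression with an individual fixed effect regression; the residual sums of squares, the number of observations, and the number of regressors are hypothetical placeholders, and only the city count of 288 comes from the text.

```python
def fixed_effects_f(rss_pooled, rss_fe, n_units, n_obs, n_regressors):
    """F test of H0: all unit intercepts are equal.
    Returns the statistic with its numerator and denominator degrees of freedom."""
    df1 = n_units - 1                       # restrictions imposed by pooling
    df2 = n_obs - n_units - n_regressors    # residual df of the fixed effect model
    f_stat = ((rss_pooled - rss_fe) / df1) / (rss_fe / df2)
    return f_stat, df1, df2

# 288 cities as in the sample; the remaining inputs are illustrative assumptions
# (a balanced 15-year panel and 10 slope coefficients).
f_stat, df1, df2 = fixed_effects_f(
    rss_pooled=900.0, rss_fe=300.0, n_units=288, n_obs=288 * 15, n_regressors=10
)
```

With 288 cities the numerator has 287 degrees of freedom, which matches the figure quoted for the F test above.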
In most previous works, it was generally assumed that there is no interaction among individuals, excluding spatial dependence, i.e., holding the SUTVA. However, if there is spatial interference among individuals, the estimation coefficient is biased. Therefore, the estimation shown in Table 4 that is obtained by ignoring spatial dependence is subject to bias and is not reliable. In the following part of this study, after confirming the existence of spatial dependence, spatial factors are controlled in the regression analysis to correct the estimation bias caused by ignoring spatial correlation.

Spatial Dependence Tests
We calculate the Moran's I index and fit the Moran scatter plots using GeoDa. In Figure 2, the top plots use the spatial matrix W(1), and the bottom three adopt W(2). The low-low (lower left) and the high-high (upper right) quadrants of each plot imply positive spatial interactions, and the low-high (upper left) and the high-low (lower right) quadrants indicate negative spatial dependence. As illustrated in Figure 2, most of the 288 cities appear in the upper right and lower left quadrants. For each Moran's I index, 999 random permutation simulations were carried out, and the pseudo p-values were all below 0.01, i.e., significant at the 1% level.
These results suggest that there is a nonnegligible positive global spatial dependence of PMSWG among cities. The appropriate spatial lag terms should be included to estimate the parameters of Equation (6). Otherwise, the regression results will be biased due to the omission of important variables.
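To make the test concrete, the following is a minimal sketch of the global Moran's I statistic and a permutation-based pseudo p-value. It is not the GeoDa implementation: the toy weights matrix, the data, and the function names are hypothetical, and the weights are left binary rather than row-standardized.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x and spatial weights w (zero diagonal)."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    w = np.asarray(w, dtype=float)
    return len(z) / w.sum() * (z @ w @ z) / (z @ z)

def permutation_pvalue(x, w, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = morans_i(x, w)
    # Count how often randomly reshuffled values look at least as clustered.
    hits = sum(morans_i(rng.permutation(x), w) >= observed for _ in range(n_perm))
    return observed, (hits + 1) / (n_perm + 1)

# Toy example (hypothetical): eight "cities" on a line, neighbours share a border.
n = 8
w = np.zeros((n, n))
for i in range(n - 1):
    w[i, i + 1] = w[i + 1, i] = 1.0
values = np.arange(1.0, n + 1)  # smoothly increasing, so neighbours are alike

moran, p_value = permutation_pvalue(values, w)
print(moran, p_value)
```

A positive index with a small pseudo p-value, as in this toy case, corresponds to points concentrating in the high-high and low-low quadrants of the Moran scatter plot.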

The S2SLS Evaluation
In the last part of the empirical estimation process, we use Equation (10) to regress the PMSWG on the predicted probability of implementing the second MSW source separation program, obtained from the Logit model in Table 3. Meanwhile, we also include three spatial spillover relationships in the following three ways:
a. include $\rho W Y_{it}$ to allow adjacent outcomes to affect outcomes;
b. include $\delta W x_{jt}$ to allow adjacent covariates to affect outcomes;
c. include $(I - \sigma W)^{-1} \varepsilon_{it}$ to allow adjacent unobservable errors to affect outcomes.
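Putting the three channels together gives a specification of the following general form. This is a sketch in stacked notation for period $t$, not a reproduction of Equation (10): the symbols $X_t$, $\beta$, $\mu$, $\lambda_t$, and $\iota_N$ (covariates, their coefficients, city effects, time effects, and a vector of ones) are assumed here for illustration, while $\rho$, $\delta$, $\sigma$, and $W$ are as in the list above.

```latex
\begin{aligned}
Y_t &= \rho W Y_t + X_t \beta + \delta W x_t + \mu + \lambda_t \iota_N + u_t, \\
u_t &= (I - \sigma W)^{-1} \varepsilon_t .
\end{aligned}
```

Setting $\rho = \delta = \sigma = 0$ removes all three spillover channels and recovers the non-spatial two-way fixed effect model that maintains the SUTVA.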
According to the above analysis, the two-way fixed effect model outperforms the other two models in Table 4. Because the spatial econometric model adds spatial dependence to the traditional model that assumes the SUTVA, we include the spatial weighting matrices W(2), W(3) and W(4) in the two-way fixed effect model. Then, the quasi-maximum likelihood (QML) estimator of Lee and Yu (2010) [60] is implemented to fit the GNS model.
It can be seen from Table 5 that the estimated coefficients of the spatial lag terms of the dependent variable, the economic variable (lnPGDP) and the error term, that is, ρ, δ, and σ, respectively, are all significantly positive at the 1% level under all three spatial weighting matrices. The results indicate that the PMSWG, the economic variable, and the unobservable factors that influence the PMSWG are not independent across cities, that is, the SUTVA is invalid. An increase in the GDP per capita or the PMSWG in city i has positive spillover effects on the PMSWG in city j. Meanwhile, unobservable factors that affect the PMSWG also have positive spillover effects between cities i and j. Thus, spatial effects should be considered in the regression model.
After controlling the spatial effects, the estimated coefficients of the nonlinear terms of the major economic explanatory variable, lnPGDP, (lnPGDP)², and (lnPGDP)³, are all significant at the 1% level under all the spatial weighting matrices, and their signs are positive, negative and positive, respectively, which is consistent with the estimates of the conventional regression specification. Moreover, after including spatial factors, PR becomes significant at the 10% level under the IDIS matrix, i.e., the implementation of the second MSW source separation program does decrease the PMSWG. However, this finding is not supported by the specifications with the CON matrix and the IDIS-CON matrix, which implies that we cannot safely infer a reduction effect of the source separation program.

Table 5. Estimation results of the program's impact on PMSWG by GNS models.

At the same time, the signs of the estimated coefficients for the demographic variables are also consistent with the previous estimates that ignore spatial dependence, indicating that an increase in the urban population significantly reduces the PMSWG, which is consistent with the implication of the estimates for population density. This finding is similar to the observation that population agglomeration may increase the scale effect of MSWG [58], but it is contrary to the conclusion of Mazzanti and Zoboli (2008) [59]. Nonetheless, an increase in the urbanization rate does not necessarily represent the agglomeration of the population, but only a rise in the proportion of the urban population in a specific city administrative region. In this estimation, by contrast, an increase in the urbanization rate significantly increases the PMSWG, which confirms the conclusion of Johnstone and Labonne (2004) [57] based on the utility maximization model.

In the spatial econometric model, the cross-product terms are still included in the regression analysis, and the signs of their estimated coefficients remain consistent with the traditional estimates reported in Table 4. Additionally, the estimated coefficient of the MSW charging variable (lnFEE) is still not significantly different from zero under any spatial matrix specification, indicating that the waste pricing mechanism does not inhibit the PMSWG, which is consistent with the previous study [17].
From the above analysis considering spatial correlation, it can be seen that the implementation of the second MSW source separation program significantly decreases the PMSWG in one of the spatial models. Thus, the previous estimation that ignores the spatial spillover effects may be biased downward, that is, the traditional regression with SUTVA may underestimate the impact of the program relative to the spatial estimation.
Although the sample data are not able to support a significant and robust negative effect of the second MSW source separation program on PMSWG in all the spatial models, this negative estimation bias gives us important implications for further analysis.
Since the selection bias problem exists in our sample data, there are significant differences in the level of economic development between the cities that choose to implement the second MSW source separation and other cities. In the previous literature, empirical studies have been conducted on various linear and nonlinear connections of PMSWG with economic growth [61][62][63][64][65][66]. In many developing countries, the PMSWG has not been decoupled from economic growth, and even in some developed countries, absolute decoupling has not yet been achieved. That is, from the time trend perspective, the PMSWG has and will continue to grow for a long time.
Therefore, it makes more sense to test the impact of the source separation program on the relative growth of PMSWG. Next, we adopt the same spatial weighting matrices as above to control the spatial interdependence, and we replace the dependent variable with the growth rate of the PMSWG. We consider only the linear relationship between the growth rate and the proxy variable of economic growth (lnPGDP), i.e., high-order terms of the economic variable are excluded from the regression model. The effect of the second MSW source separation policy on the growth rate of the PMSWG is shown in Table 6.

Table 6. Estimation results of the program's impact on the growth rate of the PMSWG by GNS models.

The estimation results show that the estimated coefficients of the spatially lagged dependent variable (ρ), the economic variable (δ), and the error term (σ) are all significant at the 5% level or better. The significant coefficients of the spatial lag terms in Table 6 suggest that the GNS is appropriate to control the spatial correlations, but the growth rate of the PMSWG has a negative spillover effect on spatially related cities, which is in contrast to the estimation in Table 5. In addition, compared with the estimation in Table 5, the spatial spillover effect of the explanatory variable GDP per capita becomes significantly negative, namely, an increase in the GDP per capita in city i will curb the growth rate of the PMSWG in spatially neighboring cities. Thus, obtaining wealth may be one of the best ways we can find to improve our environment today. The model fitting results are consistent with the theoretical conclusion of Boucekkine and El Ouardighi (2016) [67].
The results also indicate that the higher the probability of implementing the second MSW source separation (PR), the lower the growth rate of the PMSWG, and the estimates are very close to each other under all three weighting matrices. By these estimates, when the probability of implementing the second MSW source separation increases by 1%, the growth rate of the PMSWG decreases by approximately 0.06% per person per year. However, the probability of implementing the second MSW source separation cannot be observed in reality; it serves as an instrumental variable for the endogenous policy dummy (SEP2 = 1 if treated, SEP2 = 0 otherwise) in the S2SLS regression context. We only observe whether a city has implemented the policy (SEP2 = 1) or not (SEP2 = 0), and there is no middle ground in reality. In other words, compared with cities that did not implement the separation program, the enforcement of the second MSW source separation results in a 5.79% decrease in the growth rate of the PMSWG, which is a considerable relative reduction effect. Thus, the second source separation program significantly reduced the growth rate of the PMSWG. Combined with the above analysis, the effect of the source separation program on the absolute amount of the PMSWG cannot be estimated robustly and consistently in all spatial models. However, the spatial models can still correct the negative bias caused by ignoring spatial dependence. Meanwhile, all three spatial models support the conclusion that the source separation policy significantly reduces the relative amount of PMSWG.
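The two magnitudes quoted above (0.06% and 5.79%) follow from the same coefficient. As a rough check, assume that the growth rate $g$ is measured in percentage points, that a "1%" increase in the probability means a change of 0.01 in PR, and that the coefficient on PR is about −5.79, the value implied by the 5.79% figure; the symbol $\hat{\beta}_{PR}$ is introduced here only for illustration.

```latex
\begin{aligned}
\Delta g &= \hat{\beta}_{PR}\,\Delta PR, \qquad \hat{\beta}_{PR} \approx -5.79, \\
\Delta PR = 0.01 &: \quad \Delta g \approx -5.79 \times 0.01 = -0.0579 \approx -0.06, \\
\Delta PR = 1 &: \quad \Delta g \approx -5.79 \times 1 = -5.79 .
\end{aligned}
```

Under these assumptions, the first case is the marginal effect of a small change in the predicted probability, and the second is the switch from an untreated city (SEP2 = 0) to a treated city (SEP2 = 1).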
In the evaluation of the policy effect of source separation on the growth rate of PMSWG, the estimated coefficients of population density are no longer significant in any of the three models, whereas the estimated coefficients of the urbanization rate and the urban population remain significantly positive and negative, respectively, at the 1% level. The cross-product term of the GDP per capita and the urbanization rate, as well as the cross-product term of the GDP per capita and the urban population, are still significantly negative and positive, respectively, at the 1% level in the three spatial models, which indicates that the impact of the urbanization rate and the urban population on the growth rate of PMSWG depends on GDP per capita. For a given rise in the urbanization rate, a higher GDP per capita dampens the growth rate of the PMSWG, whereas for a given increase in the urban population, a higher GDP per capita raises it.
The reason may be the law of diminishing margins, which is the potential basis for the so-called Environmental Kuznets Curve [68]. This curve refers to an inverted U-shaped relationship between environmental pollutants and economic growth indicators, following the pioneering work of Kuznets (1955) [69]. Different cities may be at various stages of development along the Environmental Kuznets Curve [70]. In addition, with the same GDP per capita, the growth rate of the PMSWG in provincial cities is lower than that in nonprovincial cities. Furthermore, the effect of the MSW charging policy is still not significant in any model, which points to the failure of the pricing policy of the MSWM in China.
Because the spatial econometric regression model captures the complex spatial dependency structure among spatial units, changing an explanatory variable or the dependent variable of a particular spatial unit affects not only that unit itself but also all other spatially correlated units. This mutual spatial dependence produces a feedback effect. Therefore, interpreting the relationship between spatial units directly from the estimated parameters of the spatial regression model leads to considerable bias. Table 7 shows the average direct, indirect, and total effects estimated by the Delta-Method under the specification of the IDIS matrix, and only the significant variables are listed.

The indirect effects of the predicted probability of the second MSW source separation program on the growth rate of the PMSWG are significantly negative under the three weighting matrix settings. This means that after correcting for the endogenous local policy, the source separation policy has a negative spillover effect on the PMSWG. Taking the contiguity spatial matrix as an example, when the probability of implementing the second MSW source separation program increases by 1%, the growth rate of the PMSWG in this city decreases by 0.06%, and its spillover effect leads to a decrease of 0.03% in the growth rate of the PMSWG in spatially related cities. Specifically, 31% of the total reduction effect is due to the spatial spillover effect.
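The decomposition into direct, indirect, and total effects can be sketched as follows. This is an illustration under stated assumptions rather than the computation behind Table 7: the weights matrix and the parameter values are hypothetical, the function name is ours, and the averages follow the usual summary measures based on the matrix of partial derivatives $(I - \rho W)^{-1}(\beta I + \delta W)$. The spatial error parameter σ does not enter, because it does not change the expected outcome.

```python
import numpy as np

def average_effects(w, rho, beta, delta=0.0):
    """Average direct, indirect, and total effects of one regressor."""
    n = w.shape[0]
    identity = np.eye(n)
    # Matrix of partial derivatives dE[Y_i]/dx_j: (I - rho*W)^{-1} (beta*I + delta*W).
    s = np.linalg.solve(identity - rho * w, beta * identity + delta * w)
    direct = np.trace(s) / n   # average own-city effect, including feedback
    total = s.sum() / n        # average effect of changing x in every city
    return direct, total - direct, total

# Hypothetical example: five cities on a ring, row-standardized contiguity weights.
n = 5
w = np.zeros((n, n))
for i in range(n):
    w[i, (i - 1) % n] = w[i, (i + 1) % n] = 0.5

# Hypothetical parameters, not the estimates reported in the paper.
direct, indirect, total = average_effects(w, rho=0.3, beta=-0.06, delta=0.0)
share = indirect / total  # fraction of the total effect due to spillovers
print(direct, indirect, total, share)
```

With row-standardized weights, the total effect equals $(\beta + \delta)/(1 - \rho)$, which gives a quick consistency check on any reported decomposition; the spillover share is the ratio of the indirect effect to the total effect, as in the 31% figure above.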
In the evaluation of the effect of the source separation policy on the growth rate of the PMSWG, the estimated coefficients of the population density are no longer significant and are not reported in Table 7. Additionally, the direct and indirect effects of the GDP per capita, urban population and urbanization rate are still significant at the 1% level under all the specifications of the spatial weighting matrices. Taking the contiguity spatial weighting matrix as an example, 55% of the total negative effect of a one percent growth in GDP per capita consists of spillover effects on spatially dependent cities. Approximately 30% of the total effects of an increase in the urban population and in the urbanization rate are negative and positive spillover effects, respectively.

Conclusions
Relaxing the SUTVA and correcting for endogenous local policy choices, we construct the GNS model to estimate the PMSWG reduction effect of China's second MSW source separation program. This study contributes to the empirical literature by evaluating the impacts of endogenous policies with consideration for spatial correlations that cannot be ignored.
The analysis indicates that, compared with the results estimated from the models with SUTVA, the spatial estimation increases the significance of the impact of the second MSW source separation program under the specification of the IDIS matrix. That is, the reduction effect of the program on the absolute amount of PMSWG was underestimated with SUTVA. Although not all the spatial models support this reduction in the absolute amount of PMSWG, the reduction in the relative amount of PMSWG is significant and stable across alternative specifications of spatial weighting matrices.
Given that the absolute MSWG growth in China currently remains high, it is difficult to achieve absolute decoupling between MSWG and economic growth in the short run. Seeking feasible policies to curb the growth of PMSWG has great practical significance for China's MSWM. According to the analysis of this study, the reduction effect of the source separation program on the growth rate of PMSWG is significant but still not satisfactory. Specifically, the estimates indicate a relative rather than an absolute reduction effect of the source separation program, and there is much room for policy improvement. In light of these findings, an urgent field for future research is to explore policy combinations that enhance the effectiveness of source separation. It may be necessary to reinforce coordination in the policy implementation process and across the whole chain of the MSWM. This study provides a quantitative factual basis for future policy revision and implementation.
Having confirmed nonnegligible positive spatial spillover effects in PMSWG at the city level in China, this study also encourages more researchers in developing countries to examine the existence of spatial dependence at different levels, such as the country level. If this interaction exists, enhancing the effectiveness of MSWM in a certain region not only protects its own environment by reducing the PMSWG but also has spillover effects on all other spatially related regions. It also provides insights for further cooperation in MSWM among developing countries. However, the idiosyncratic characteristics of each country should be considered when transferring the conclusions of this paper.