3.3.4. Robustness Tests
Although the DID method provides an initial estimate of the policy effect in the baseline regression, its results may be influenced by factors such as sample selection bias [81]. To verify the reliability of the conclusions, other methods are needed for robustness testing. Drawing on Mao Qilin (2024) [82] and Liu Naiquan (2017) [83], this study uses propensity score matching (PSM) and the synthetic control method (SCM) to conduct counterfactual causal inference and to address confounding factors in the estimated causal effect.
For propensity score matching, the treatment group data consist of the observed green energy efficiency indicators of pilot cities before and after policy implementation, which directly reflect the actual changes in green energy efficiency following the policy intervention. The control group is constructed by estimating, for each city, the probability of being selected as a pilot city, using a series of city characteristics from the year preceding policy implementation as inputs. Non-pilot cities whose propensity scores are similar to those of pilot cities are then matched to them, yielding a control group with pre-policy characteristics comparable to those of the treatment group. The specific construction is as follows:
First, consider the change in green energy efficiency for the treatment group cities from year t to year t + s, and the change in green energy efficiency that the same cities would have experienced over that period had they not been selected as low-carbon city pilots. The average treatment effect on the treated (ATT) of being selected as a low-carbon city pilot on urban green energy efficiency can be expressed as:
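In assumed notation (the symbols below are illustrative labels rather than ones taken from the original), let $\Delta eff^{1}_{k,t+s}$ be the change in green energy efficiency of city k between year t and year t + s when it is a pilot, $\Delta eff^{0}_{k,t+s}$ the change it would have shown without pilot status, and $D_{k}=1$ mark the pilot cities. A sketch of the effect described above is then:

```latex
\begin{equation}
\tau = E\left(\Delta eff^{1}_{k,t+s} \mid D_{k}=1\right) - E\left(\Delta eff^{0}_{k,t+s} \mid D_{k}=1\right) \tag{18}
\end{equation}
```

Both expectations condition on $D_{k}=1$, so $\tau$ measures the effect for the cities that were actually selected; the second term is the unobserved counterfactual.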
Since the changes in green energy efficiency that pilot cities would have experienced in the absence of the policy cannot be directly observed, this study constructs a counterfactual from prefecture-level cities that had similar characteristics in the year before policy implementation but did not participate in the initiative. In this case, Equation (18) can be transformed into:
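Under assumed notation (illustrative, not taken from the original: $\Delta eff^{1}_{k,t+s}$ and $\Delta eff^{0}_{k,t+s}$ are the changes in green energy efficiency with and without pilot status, and $D_{k}$ is the pilot indicator), the substitution described above can be sketched as replacing the unobserved term with the observed change of matched non-pilot cities:

```latex
\begin{equation}
\tau = E\left(\Delta eff^{1}_{k,t+s} \mid D_{k}=1\right) - E\left(\Delta eff^{0}_{k,t+s} \mid D_{k}=0\right)
\end{equation}
```

The second expectation is taken over the matched non-pilot cities only, which is what makes it a plausible stand-in for the counterfactual.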
Specifically, a Probit model is constructed:
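A sketch of such a model, in assumed notation ($D_{k,t}$ for the pilot indicator, $X_{k,t-1}$ for the lagged characteristics, and $\beta$ for the coefficients are illustrative labels, not taken from the original), is:

```latex
\begin{equation}
P\left(D_{k,t}=1 \mid X_{k,t-1}\right) = \Phi\left(\beta_{0} + \beta_{1} Finance_{k,t-1} + \beta_{2} Invest_{k,t-1} + \beta_{3} Pop_{k,t-1} + \beta_{4} Level_{k,t-1}\right)
\end{equation}
```

where $\Phi$ is the standard normal cumulative distribution function and the fitted probability is the propensity score used for matching.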
Here, the dependent variable indicates whether city k was selected as a pilot in year t, taking the value 1 if selected and 0 otherwise. The explanatory variables are a series of characteristics of city k in year t − 1, including financial development level (Finance), urban investment level (Invest), population density (Pop), and economic structure (Level).
Propensity score matching mitigates sample selection bias and confounding from observable characteristics, because the control group is constructed so that the treatment and control groups share similar observable characteristics prior to policy implementation. This comparison allows the policy effect to be assessed more accurately and serves as a check on the reliability of the baseline regression results.
A single matching method may introduce matching bias, meaning that systematic differences could still persist in the matched sample. Employing multiple matching methods helps identify and mitigate such bias while accommodating different data distributions, which strengthens the robustness and credibility of the findings. This study therefore employs three propensity score matching methods before re-estimating the baseline regression: 1:1 nearest neighbor matching, radius matching, and kernel matching [84].
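The three matching rules differ only in how they turn control cities into a counterfactual for each treated city. The sketch below is illustrative rather than the authors' implementation: it takes propensity scores as given, compares outcomes directly instead of re-running the regression, and uses made-up numbers; the caliper, the Gaussian kernel, and the bandwidth are assumed values.

```python
import math

def nearest_neighbor_att(treated, controls):
    """1:1 nearest-neighbor matching (with replacement) on the propensity score."""
    gaps = []
    for p_t, y_t in treated:
        _, y_c = min(controls, key=lambda c: abs(c[0] - p_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)

def radius_att(treated, controls, caliper=0.05):
    """Radius matching: average all controls whose score lies within the caliper."""
    gaps = []
    for p_t, y_t in treated:
        inside = [y_c for p_c, y_c in controls if abs(p_c - p_t) <= caliper]
        if inside:  # treated units with no control inside the caliper are dropped
            gaps.append(y_t - sum(inside) / len(inside))
    return sum(gaps) / len(gaps) if gaps else float("nan")

def kernel_att(treated, controls, bandwidth=0.06):
    """Kernel matching: weight every control by a Gaussian kernel in score distance."""
    gaps = []
    for p_t, y_t in treated:
        weights = [math.exp(-0.5 * ((p_c - p_t) / bandwidth) ** 2) for p_c, _ in controls]
        total = sum(weights)
        if total > 0:
            y_hat = sum(w * y_c for w, (_, y_c) in zip(weights, controls)) / total
            gaps.append(y_t - y_hat)
    return sum(gaps) / len(gaps) if gaps else float("nan")

# Hypothetical (propensity score, change in eff) pairs; the numbers are made up.
treated = [(0.60, 0.50), (0.40, 0.45)]
controls = [(0.58, 0.47), (0.42, 0.43), (0.10, 0.30)]

att_nn = nearest_neighbor_att(treated, controls)
att_radius = radius_att(treated, controls)
att_kernel = kernel_att(treated, controls)
```

Nearest-neighbor matching uses one control per treated city, radius matching uses every control inside the caliper, and kernel matching uses all controls with weights that decay in score distance, which is why the three estimates can differ slightly on the same sample.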
Figure 3 displays, in counterclockwise order, the kernel density plots of treatment and control groups before matching, followed by post-matching densities after nearest neighbor matching, radius matching, and kernel matching, respectively.
Table 7(1) presents the baseline regression results using nearest neighbor matching. The coefficient of posttreat is positive and statistically significant at the 1% level, indicating that the pilot policy increases urban green energy efficiency by 0.021 units on average. These results align with the baseline regression in Table 6(5), confirming robustness and supporting Hypothesis 1.
Table 7(2)–(3) report the results from radius matching and kernel matching. After matching, the posttreat coefficients remain statistically significant at the 5% level, with estimated policy effects of 0.030 units. This effect is marginally higher than the baseline result in Table 6(5), suggesting a more pronounced net policy impact once observable confounding is reduced through matching.
2. Synthetic Control Method
Although propensity score matching can mitigate sample selection bias to some extent, it still conducts causal analysis under quasi-natural experimental conditions. In contrast, the synthetic control method constructs counterfactual outcomes for the treatment group in the absence of intervention by leveraging the pre-treatment characteristics of the control group [85]. This paper employs the synthetic control method to assess the impact of the low-carbon city pilot policy on green energy efficiency by simulating a “virtual pilot city unaffected by the policy”. The synthetic control group is a weighted combination of control cities, with weights chosen on pre-intervention data; eff, Finance, Invest, Pop, and Level serve as matching variables so that its pre-policy characteristics closely resemble those of the treatment group. The treatment group data are actual observations.
The construction process of the synthetic control group is as follows:
For each policy simulation, the goal is to construct a synthetic control group whose characteristics before the policy implementation are as close as possible to those of the treatment group. The formula is as follows:
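In assumed notation (the symbols are illustrative labels, not taken from the original), with $w_{j}$ the weight on control city j, $X_{j,t}$ its characteristic value at time t, and J the number of control cities, the synthetic value can be sketched as:

```latex
\begin{equation}
\hat{X}^{syn}_{t} = \sum_{j=1}^{J} w_{j} X_{j,t}
\end{equation}
```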
Here, each control group city j is assigned a weight, and the synthetic value at time t is the weighted sum of the characteristic values of the control group cities at time t. The weights are determined by minimizing the characteristic differences between the treatment group and the synthetic control group before the policy implementation:
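A sketch of this minimization, in assumed notation ($X_{1,t}$ for the treatment group value, $X_{j,t}$ for control city j, $w_{j}$ for its weight, and $T_{0}$ for the set of pre-policy periods are illustrative labels), is given below. The non-negativity and adding-up constraints are the usual synthetic control convention and are assumed here rather than stated in the text.

```latex
\begin{equation}
\min_{w_{1},\dots,w_{J}} \sum_{t \in T_{0}} \left( X_{1,t} - \sum_{j=1}^{J} w_{j} X_{j,t} \right)^{2}
\quad \text{s.t.} \quad w_{j} \ge 0, \quad \sum_{j=1}^{J} w_{j} = 1
\end{equation}
```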
Here, the target is the characteristic value of the treatment group at time t, and the pre-period refers to the time period before policy implementation. This study extends the sample interval accordingly. Since 2010 was the implementation year of the first batch of low-carbon city pilot policies, the data for the explained variable and control variables from the five years before implementation (2005 to 2009) are used to construct the synthetic control group.
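The weight search can be illustrated with a small numerical sketch. The code below is not the authors' procedure: it assumes a single matching variable, non-negative weights that sum to one, and a plain projected gradient descent; the series are made-up numbers, and the function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum(w) = 1}."""
    ordered = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for i, value in enumerate(ordered, start=1):
        running += value
        candidate = (running - 1.0) / i
        if value - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

def synthetic_weights(treated_pre, donors_pre, steps=5000, lr=0.1):
    """Minimize the pre-period squared gap by projected gradient descent."""
    n = len(donors_pre)
    w = [1.0 / n] * n  # start from equal weights
    for _ in range(steps):
        # Gap between the treated series and the current synthetic series.
        resid = [x1 - sum(w[j] * donors_pre[j][t] for j in range(n))
                 for t, x1 in enumerate(treated_pre)]
        # Gradient of the sum of squared gaps with respect to each weight.
        grad = [-2.0 * sum(r * donors_pre[j][t] for t, r in enumerate(resid))
                for j in range(n)]
        w = project_to_simplex([wj - lr * g for wj, g in zip(w, grad)])
    return w

# Hypothetical pre-period eff series for 2005-2009 (made-up numbers).
treated_pre = [0.40, 0.42, 0.44, 0.46, 0.48]
donors_pre = [
    [0.30, 0.32, 0.34, 0.36, 0.38],  # donor city A
    [0.50, 0.52, 0.54, 0.56, 0.58],  # donor city B
    [0.90, 0.80, 0.70, 0.60, 0.50],  # donor city C
]

weights = synthetic_weights(treated_pre, donors_pre)
synthetic_pre = [sum(w * d[t] for w, d in zip(weights, donors_pre))
                 for t in range(len(treated_pre))]
```

In this toy case the treated series is the midpoint of donors A and B, so the search puts roughly half the weight on each and almost none on donor C, whose pre-period trend runs the other way.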
This study examines the impact of the low-carbon city pilot policy on urban green energy efficiency. China established three batches of low-carbon pilot cities in 2010, 2012, and 2017, respectively. Correspondingly, this paper leverages the quasi-natural experiments formed by the implementation of these three batches of pilot policies, simulating three randomized experiments to explore the policy’s effect on urban green energy efficiency.
Figure 4, Figure 5, and Figure 6 present the implementation effects of the three batches of low-carbon city pilot policies.
Figure 4 illustrates the policy treatment effect for the 2010 cohort. Before 2010, the trends in green energy efficiency between the treatment group cities and the synthetic control group were relatively similar. However, after the implementation of the pilot policy in 2010, the green energy efficiency of the treatment group cities (solid line) became significantly higher than that of the synthetic control group (dashed line), indicating a positive policy effect.
Figure 5 displays the policy treatment effect for the 2012 cohort. Similarly to the 2010 results, the green energy efficiency trends of the treatment and synthetic control groups were closely aligned before the policy implementation in 2012. After the policy took effect, the pilot cities exhibited a significant improvement in green energy efficiency, with the gap gradually widening over time, confirming the effectiveness of the 2012 pilot policy.
Figure 6 presents the policy treatment effect for the 2017 cohort. Before the third batch of pilot policies was implemented in 2017, the treatment group’s green energy efficiency was slightly higher than that of the synthetic control group. This may be because some cities had already adopted low-carbon city policies prior to 2017, leading to their higher baseline efficiency. Additionally, since 2012, China has placed strong emphasis on urban sustainable development, resulting in an upward trend in green energy efficiency for both the treatment and control groups. After the 2017 policy implementation, the treatment group’s green energy efficiency increased markedly relative to the synthetic control group, as evidenced by the steeper slope of the solid line relative to the dashed line in Figure 6.
Through the synthetic control method, this study further validates that all three batches of low-carbon city pilot policies had a significant positive impact on urban green energy efficiency, with the effects strengthening over time. These findings corroborate the baseline regression results, demonstrating that the low-carbon city pilot policy plays an active role in promoting urban green energy transition and improving energy efficiency.
3. Replacement of Core Indicators
In the baseline regression, the urban green energy efficiency indicator is calculated using the Super-SBM model. Column (4) of Table 7 presents the results of replacing the Super-SBM model with the DEA-CCR model [86], using the input and output indicators from Table 1 to calculate urban green energy efficiency. After the explained variable is replaced, the low-carbon city pilot policy still has a positive effect on urban green energy efficiency at the 1% significance level, with the policy raising eff by 0.032 units on average. This value is larger than that in Column (5) of Table 6, which may be due to differences in measurement methods, but it does not affect the conclusions drawn from the baseline regression.
Column (5) of Table 7 uses the Super-SBM model and the GML index to measure urban green total factor productivity, which replaces the green energy efficiency indicator in the baseline regression [87]. After this second replacement of the explained variable, the low-carbon city pilot policy still has a positive effect at the 1% significance level, with the policy raising the explained variable by 0.023 units on average. This value shows no significant deviation from Column (5) of Table 6, further verifying the robustness of the baseline regression results.
4. Exclusion of Outliers
To limit the influence of outliers on the estimated causal relationship, this study applies three separate checks to the baseline regression: two-sided 1% winsorization of continuous variables, two-sided 1% truncation of continuous variables, and exclusion of the samples from the first year of the pandemic (2020).
Columns (1) to (3) of Table 8 present the regression results after winsorization, truncation, and exclusion of the first year of the pandemic, respectively. The estimated policy effects on eff are 0.021, 0.021, and 0.022 units, respectively. These values show no significant deviation from the baseline regression results in Column (5) of Table 6, further validating the robustness of the baseline regression results.
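The difference between winsorization and truncation is easy to lose in prose: the former caps extreme values and keeps every observation, while the latter drops them. The sketch below illustrates both at the two-sided 1% level on made-up data; the linear-interpolation percentile rule and the function names are assumptions, and statistical packages may use slightly different conventions.

```python
def percentile(sorted_values, q):
    """Percentile by linear interpolation; q is in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * (pos - lower)

def winsorize(values, pct=1.0):
    """Cap values below the pct-th and above the (100 - pct)-th percentile."""
    ordered = sorted(values)
    low, high = percentile(ordered, pct), percentile(ordered, 100 - pct)
    return [min(max(v, low), high) for v in values]

def truncate(values, pct=1.0):
    """Drop values below the pct-th and above the (100 - pct)-th percentile."""
    ordered = sorted(values)
    low, high = percentile(ordered, pct), percentile(ordered, 100 - pct)
    return [v for v in values if low <= v <= high]

# Hypothetical continuous variable with 101 observations: 1, 2, ..., 101.
values = list(range(1, 102))

winsorized = winsorize(values)  # same length, tails capped at 2 and 100
truncated = truncate(values)    # the two extreme observations are removed

# Excluding the pandemic year is a plain row filter on a hypothetical panel.
panel = [{"city": "A", "year": 2019}, {"city": "A", "year": 2020}, {"city": "A", "year": 2021}]
without_2020 = [row for row in panel if row["year"] != 2020]
```

Winsorization preserves the sample size, which matters for a balanced panel, whereas truncation and year exclusion shrink the sample.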
5. Exclusion of Policy Interference
The effects of the low-carbon city pilot policy may be confounded with those of other policies. If these interferences are not excluded, the effects of the low-carbon pilot policy may be estimated incorrectly [88]. In particular, if other policies also significantly improve green energy efficiency and are not controlled for, the effects of the low-carbon pilot policy may be overestimated.
To enhance the internal validity of the findings, this study follows Mao Qilin (2024) [82] and selects two policies implemented within the same timeframe as the low-carbon city pilot policy that may affect urban green energy efficiency. These policies are included in Model (12) to verify the net effect of the pilot policy, thereby providing more credible evidence for policymakers.
Specifically, this study selects the air quality control region (AQCR) pilot policy and the Broadband China pilot policy. Both overlapped in time with the low-carbon city pilot policy and could influence urban green energy efficiency. If not controlled for, their effects may be confounded with those of the low-carbon pilot policy, leading to biased conclusions [89]:
① Air quality control region (AQCR) pilot policy: this policy aims to reduce air pollutant emissions and may overlap with the low-carbon pilot policy in terms of environmental protection and energy structure adjustment. Through interaction analysis, the independent effect of the low-carbon pilot policy on green energy efficiency can be isolated.
② Broadband China pilot policy: this policy aims to promote the application of informatization and intelligent technologies, which may affect green energy efficiency by improving energy management efficiency. Through interaction analysis, the independent effect of the low-carbon pilot policy beyond technological applications can be verified.
Column (4) of Table 8 reports the baseline regression results after including the AQCR pilot policy. The coefficient of posttreat is 0.023 and significantly positive at the 1% level, while the coefficient of AQCR is 0.003 and lacks strong significance, verifying the independent effect of the low-carbon city pilot policy on urban green energy efficiency. Column (5) of Table 8 reports the results after including the Broadband China pilot policy. The coefficient of posttreat is 0.023 and significantly positive at the 1% level, while the coefficient of Broadband China is 0.005 and lacks strong significance. Column (6) of Table 8 reports the results after including both policies in Model (12). The coefficient of posttreat is 0.022 and significantly positive at the 1% level, while the coefficients of AQCR and Broadband China are 0.003 and 0.005, respectively, and lack strong significance. Including these two policies thus separates the effect of the low-carbon pilot policy from that of the concurrent policies and supports the robustness of the baseline regression conclusions.
6. Endogeneity Test
Endogeneity refers to correlation between explanatory variables and the error term, which leads to biased and inconsistent ordinary least squares (OLS) estimates. In this study, the selection of low-carbon pilot cities may not be random but based on certain city characteristics, potentially causing endogeneity. Instrumental variables (IV) can address this by providing an exogenous variable that is correlated with the explanatory variable but uncorrelated with the error term, thereby yielding consistent estimates [90]. Additionally, the time-series correlation of variables in the model and the lagged effects of policy impacts may also introduce endogeneity. A lagged model can mitigate this by incorporating lagged terms to capture the dynamic characteristics of time-series data.
(1) Two-stage least squares
To account for the endogeneity of the explanatory variable, this study uses two instrumental variables in turn, namely the interaction term between the low-carbon city pilot policy and urban carbon emission intensity, and the level of government intervention, and re-estimates the baseline regression with the two-stage least squares (2SLS) method. Columns (1)–(4) of Table 9 report the two-stage regression results for the two instrumental variables, both of which pass the weak instrument test.
Columns (1)–(2) of Table 9 present the two-stage regression results for Instrumental Variable 1, the interaction between the low-carbon city pilot policy and urban carbon emission intensity. In column (1), the coefficient of posttreat is 0.136 and significant at the 1% level, indicating that, without considering the interaction term, the pilot policy itself has a positive correlation with the dependent variable. The level of urban carbon emissions is a key criterion for selecting pilot cities. In column (2), the coefficient of posttreat becomes 0.042 and remains significant at the 1% level, suggesting that the policy effect strengthens with higher carbon emission intensity. Cities with higher carbon emission intensity experience more pronounced improvements in urban green energy efficiency due to the pilot policy. Compared with the baseline regression results in Table 6(5), the findings from Instrumental Variable 1 remain consistent, confirming the robustness of the baseline results.
Columns (3)–(4) of Table 9 report the two-stage regression results for Instrumental Variable 2, the degree of government intervention. In column (3), the coefficient of Gov is 0.774 and significant at the 1% level, indicating a positive correlation between government intervention and the dependent variable: cities with higher government intervention exhibit greater green energy efficiency. This may be due to additional support and resources provided during policy implementation. In column (4), the coefficient of posttreat is 0.407 and significant at the 1% level, further supporting the significant positive impact of the low-carbon city pilot policy on urban green energy efficiency. Compared with the baseline regression results in Table 6(5), the coefficient is larger, demonstrating that the policy effect remains robust and significant under different model specifications.
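The mechanics of the two stages can be illustrated on a toy example. The sketch below is not the specification behind Table 9: it assumes one endogenous regressor, one instrument, no control variables or fixed effects, and made-up data in which an unobserved factor u drives both x and y. The function names are hypothetical.

```python
def ols(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def two_stage_least_squares(z, x, y):
    """Stage 1: regress x on the instrument z. Stage 2: regress y on fitted x."""
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    # Point estimates only: standard errors from this manual second stage are not valid.
    return ols(x_hat, y)

# Made-up data: z is the instrument, u is an unobserved confounder unrelated to z.
z = [0, 1, 0, 1, 0, 1, 0, 1]
u = [1, 1, -1, -1, 1, 1, -1, -1]
x = [1 + 2 * zi + ui for zi, ui in zip(z, u)]         # endogenous regressor
y = [0.5 + 0.3 * xi + 2 * ui for xi, ui in zip(x, u)]  # true slope on x is 0.3

_, naive_slope = ols(x, y)                       # contaminated by u
_, iv_slope = two_stage_least_squares(z, x, y)   # uses only the variation in x driven by z
```

Because the confounder raises both x and y, the naive slope overstates the effect, while the second stage recovers the slope used to generate the data. In applied work the same logic is run with controls and fixed effects in both stages, using a routine that also corrects the standard errors.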
(2) Lag model
Furthermore, considering the lagged effects of policy and the persistent time-series correlation of urban green energy efficiency, this study applies varying degrees of lags to the explanatory and dependent variables [91]. Columns (5)–(9) of Table 9 present the regression results of the lagged models, where the system GMM model passes both the AR test and the Hansen test.
To compare with the baseline and benchmark regression results, this study constructs OLS and fixed-effects (FE) lagged models with one- and two-period lags of the explanatory variable. The regression results are shown in columns (5)–(8) of Table 9. With a one-period lag of the low-carbon city pilot policy, the policy remains positively significant at the 1% level for urban green energy efficiency, supporting Hypothesis 1 and aligning with the conclusions from Table 5 and Table 6. With a two-period lag, the policy still shows a significant positive effect, confirming the robustness of the results in Table 5 and Table 6 and further supporting Hypothesis 1.
For the dependent variable, to capture dynamic effects and address endogeneity, this study employs the system GMM method with one- and two-period lags of the dependent variable, using the previously selected control variables as endogenous instruments with two to four lags and Gov as an exogenous instrument. The regression results in column (9) of Table 9 show that, under the SYS-GMM approach, the pilot policy remains positively significant at the 1% level for urban green energy efficiency. The results also pass the AR and Hansen tests, confirming the robustness of the findings in Table 5 and Table 6.
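Lagged regressors in a city panel must be built within each city, so that one city's value is never carried into another city's row. The sketch below shows only this data-preparation step on a made-up panel; it does not implement the OLS, FE, or system GMM estimators, and the helper name add_lag is hypothetical.

```python
def add_lag(panel, column, periods):
    """Attach the value of `column` from `periods` years earlier in the same city."""
    lookup = {(row["city"], row["year"]): row[column] for row in panel}
    name = f"{column}_lag{periods}"
    return [{**row, name: lookup.get((row["city"], row["year"] - periods))}
            for row in panel]

# Made-up panel: city A becomes a pilot in 2011, city B never does.
panel = [
    {"city": "A", "year": 2010, "posttreat": 0, "eff": 0.40},
    {"city": "A", "year": 2011, "posttreat": 1, "eff": 0.43},
    {"city": "A", "year": 2012, "posttreat": 1, "eff": 0.45},
    {"city": "B", "year": 2010, "posttreat": 0, "eff": 0.38},
    {"city": "B", "year": 2011, "posttreat": 0, "eff": 0.39},
    {"city": "B", "year": 2012, "posttreat": 0, "eff": 0.40},
]

with_lags = add_lag(add_lag(panel, "posttreat", 1), "posttreat", 2)

# Rows without a full lag history cannot enter the two-period-lag regression.
estimation_sample = [row for row in with_lags if row["posttreat_lag2"] is not None]
```

Each additional lag costs one year of observations per city, which is one reason the lagged specifications are estimated on a shorter sample than the baseline.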