1. Introduction
Air pollution remains one of the most pressing environmental concerns for residents in China. Over the years, the Chinese government has demonstrated a strong commitment to improving air quality through numerous management measures [1]. A recurring debate in environmental governance revolves around the merits of centralized versus decentralized approaches. Decentralized governance is often argued to enhance local government accountability and foster public participation [2,3,4]. In contrast, centralized governance may improve oversight and regulation [5], curb corruption [6,7], and more effectively address transboundary pollution [8]. Policy instruments such as emission fees and environmental taxes have been adopted to strengthen supervision, with consensus that centralizing authority within higher-level environmental agencies can benefit environmental outcomes [9].
In 2017, China introduced the Environmental Vertical Management Reform (EVMR), a significant policy adjustment piloted in 117 cities across 12 provinces. The reform restructures governance by placing local environmental protection bureaus under direct provincial management rather than under local governments, aiming to mitigate local protectionism and enhance enforcement independence. This institutional shift strengthens monitoring, supervision, and accountability, particularly for controlling major pollutants such as SO2, NO2, and CO.
China’s EVMR represents a significant experiment in re-calibrating central-local authority in environmental governance—a challenge not unique to China. The United States’ Clean Air Act establishes cooperative federalism, where federal standards coexist with state implementation, often yielding varied enforcement outcomes. The European Union employs a supranational approach through directives like the National Emission Ceilings Directive, requiring coordination among sovereign member states. India similarly grapples with strengthening pollution control boards amidst rapid industrialization. These international experiences underscore two points: centralizing oversight is a common response to localism’s pitfalls, and governance effectiveness hinges on institutional details and enforcement capacity. Thus, rigorous evaluation of China’s EVMR contributes not only to domestic policy understanding but also to comparative scholarship on environmental federalism.
A growing body of literature has evaluated EVMR’s multifaceted impacts. Empirical studies confirm that the reform strengthens governance by curbing local protectionism and intensifying regulation, reshaping incentives for firms and officials [10]. Evidence shows that EVMR promotes corporate green innovation [11], ESG performance [12], and regional green transformation [13], reduces industrial COD emissions [14], and discourages land allocation to pollution-intensive industries [15].
A critical line of inquiry has established a direct link between EVMR and air quality. Research indicates EVMR reduces corporate pollution through production adjustments and cleaner processes [16]. County-level evidence shows average reductions in overall air pollution [17], while prefecture-level analyses confirm decreases in firms’ SO2 and soot emissions, attributed to enhanced enforcement [18]. The principal mechanism identified is the reform’s role in mitigating local government interference by shifting administrative authority upward [19], with city-level analyses further confirming its average effectiveness in reducing pollution concentrations [20].
However, while these studies robustly confirm aggregate effectiveness, a significant limitation persists: existing research predominantly relies on single indicators—such as PM2.5 or AQI—or remotely sensed data. This approach obscures EVMR’s differential impact across the complex mixture of air pollutants, which originate from diverse sources and require distinct mitigation strategies. A critical analytical gap remains: comprehensive analysis decomposing the reform’s effects on all six major criteria pollutants—SO2, NO2, CO, O3, PM2.5, and PM10—within a unified framework, using high-frequency ground-level monitoring data, is notably absent.
To address this gap, our study leverages a high-frequency, ground-level dataset from China’s national air quality monitoring network, comprising 1710 stations. We applied spatial interpolation to address missing values, ensuring data completeness and representativeness, and constructed a balanced panel of city-level annual concentrations for all six major pollutants. This comprehensive data foundation allows systematic decomposition of EVMR’s effects within a unified framework. We assess heterogeneous impacts across pollutants, investigate long-term effect dynamics, regional heterogeneity, and underlying mechanisms. By doing so, our research provides granular, actionable understanding of how EVMR shapes air quality, offering critical evidence for designing targeted and sustainable environmental policies.
2. Data Source and Data Description
2.1. Air Quality Data
This study examines six major air pollutants (PM2.5, PM10, SO2, NO2, CO, and O3), using the EVMR policy as a quasi-natural experiment. Station-level air quality data were obtained from the China National Environmental Monitoring Center (CNEMC). While the national monitoring network was established starting in 2013 [21], a comprehensive, city-covering dataset became consistently available from 2015 onward. Our dataset spans from 1 January 2015 to 31 December 2023, with hourly resolution, and incorporates data from 1710 monitoring stations distributed across 288 cities. As illustrated in Figure 1, which shows the geographic distribution of the stations, the coverage provides a representative spatial basis for analyzing regional air quality trends. The nine-year timeframe offers a two-year pre-policy baseline (2015–2016) ahead of the 2017 rollout and six further years after the reform year, a structure well-suited for applying a difference-in-differences (DID) model to assess policy impacts across varied locations.
Following the methodology outlined in the technical report “Technical Regulation for Ambient Air Quality Assessment”, station-level air quality data were aggregated to the city level to evaluate annual air quality against national standards. The processing involved three main steps. First, daily average concentrations were calculated for each pollutant at each station. For PM2.5, PM10, SO2, NO2, and CO, this was based on the 24 h mean, while for O3, it was defined as the daily maximum 8 h moving average. Second, city-level daily concentrations were derived by averaging data from all stations within a city. Third, annual city-level metrics were computed: the annual mean for PM2.5, PM10, SO2, and NO2; and the 95th percentile of daily values for CO and O3. Approximately 1% to 5% of values in the raw data were missing across pollutants [22]. To address this, a matrix completion method [23] was employed for large-scale data imputation.
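The three aggregation steps can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the column names (`city`, `station`, `time`, `value`), the function name, and the rule that an 8 h window needs at least 6 valid hours are all assumptions, and the sketch presumes one row per station-hour with no gaps in the hourly index.

```python
import numpy as np
import pandas as pd

def city_annual_metric(hourly: pd.DataFrame, pollutant: str) -> pd.Series:
    """Aggregate station-hour records to one city-year value.

    `hourly` needs columns city, station, time (datetime64), value.
    Column names and the 6-of-8-hours rule are illustrative assumptions.
    """
    df = hourly.sort_values(["city", "station", "time"]).copy()
    if pollutant == "O3":
        # 8 h moving average per station (window labelled by its last hour)
        df["value"] = (
            df.groupby(["city", "station"])["value"]
              .transform(lambda s: s.rolling(8, min_periods=6).mean())
        )
    df["date"] = df["time"].dt.floor("D")

    # Step 1: station-level daily value (24 h mean, or daily max of the 8 h means)
    grouped = df.groupby(["city", "station", "date"])["value"]
    station_daily = grouped.max() if pollutant == "O3" else grouped.mean()

    # Step 2: city-level daily value = mean over the city's stations
    city_daily = station_daily.groupby(level=["city", "date"]).mean().reset_index()
    city_daily["year"] = city_daily["date"].dt.year

    # Step 3: annual metric (95th percentile for CO and O3, mean otherwise)
    by_year = city_daily.groupby(["city", "year"])["value"]
    if pollutant in ("CO", "O3"):
        return by_year.quantile(0.95)
    return by_year.mean()
```

The result is indexed by (city, year), which is the shape needed for the balanced panel. The imputation step is deliberately left out, since the matrix completion method is only cited, not described, in the text.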
2.2. Independent and Control Variables
The core explanatory variable in this study is a dummy variable indicating the implementation of the Environmental Vertical Management Reform (EVMR). It takes the value of 1 for a treatment-group city in the years after it adopts the reform, and 0 otherwise, which covers the pre-reform years of treated cities and all years of control cities. Accordingly, our sample comprises 117 cities in the treatment group and 171 cities in the control group.
Drawing on the previous literature, we include four city-level control variables. First, GDP per capita controls for the level of economic development. Second, the GDP growth rate accounts for the momentum of economic activity. Third, as the secondary industry is a primary source of air pollution, we include the share of secondary industry in GDP to capture the economic structure. Fourth, local fiscal expenditure is included, as it is a key determinant of environmental regulation capacity and public service provision. All control variable data were obtained from the CEIC database.
2.3. Data Processing and Transformation
The original variables exhibit substantial differences in scale (see Table 1). To ensure the comparability of their regression coefficients, we transform all variables to a common scale. First, we apply min-max normalization to rescale all numerical variables to a positive range of 1 to 100. Subsequently, we apply a logarithmic transformation to these rescaled values.
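As a minimal sketch of this two-step transformation, the function below rescales a variable to [1, 100] and then takes logs. The function name is hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base.

```python
import numpy as np

def rescale_then_log(x, lo=1.0, hi=100.0):
    """Min-max rescale to [lo, hi], then take the natural log (base is assumed)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = np.nanmin(x), np.nanmax(x)
    if x_max == x_min:
        raise ValueError("a constant variable cannot be min-max scaled")
    scaled = lo + (x - x_min) * (hi - lo) / (x_max - x_min)
    return np.log(scaled)
```

Because the rescaled values start at 1, the transformed variable is bounded between 0 and log(100), so the logarithm is always defined, even for variables that contain zeros or negative values such as the GDP growth rate.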
2.4. Descriptive Statistics
Figure 2 illustrates the trends in major air pollutants for the treatment (pilot) and control (non-pilot) groups. Figure 2a shows that in 2015, the average PM2.5 concentration in the treatment group (59 μg/m3) was approximately 12 μg/m3 higher than in the control group. Following the implementation of EVMR in 2017, the treatment group’s PM2.5 level dropped markedly to 42 μg/m3, now measuring 2 μg/m3 lower than the control group. This represents a substantial reduction of about 13 μg/m3 for the treatment group from 2016 to 2017, contrasting with a marginal decrease of only 0.5 μg/m3 in the control group.
A similar pattern is observed for PM10 (Figure 2b). In 2015, the treatment group’s concentration (100 μg/m3) exceeded the control group’s by about 20 μg/m3. By 2017, this relationship reversed, with the treatment group’s level becoming 5 μg/m3 lower.
For SO2 (Figure 2c), concentrations were broadly comparable between the two groups across most years, with a notable exception in 2017 when the treatment group was approximately 5 μg/m3 lower.
The concentrations of other pollutants (Figure 2d–f) did not show significant differential changes between the treatment and control groups around the time of the reform.
3. Empirical Strategy and Identification
3.1. Empirical Framework and Baseline Model
The phased implementation of the Environmental Vertical Management Reform (EVMR) across Chinese cities from 2017 onward created a quasi-natural experiment with staggered treatment timing. This variation allows us to employ a difference-in-differences framework. Our empirical strategy proceeds in two stages to address key challenges in such a setting. First, we estimate a baseline average effect using a standard two-way fixed effects (TWFE) model. Subsequently, and as our primary specification, we implement the robust staggered event-study estimator [24] to account for potential heterogeneous treatment effects and uncover the policy’s dynamic impacts.
The baseline TWFE model is specified as follows:
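$$Y_{itp} = \alpha + \beta\,\mathrm{EVMR}_{it} + \gamma' X_{it} + \mu_i + \lambda_t + \varepsilon_{itp} \tag{1}$$
where the subscripts $i$, $t$, $p$ denote city, year and pollutant type, respectively, with the pollutants being PM2.5, PM10, SO2, NO2, CO and O3. $Y_{itp}$ is the concentration of pollutant $p$ in city $i$ and year $t$. $\mathrm{EVMR}_{it}$ is a binary variable denoting whether city $i$ is under the EVMR policy in year $t$. If city $i$ implemented EVMR in year $T_i$, then $\mathrm{EVMR}_{it} = 1$ for all years $t \geq T_i$, and $\mathrm{EVMR}_{it} = 0$ for all years $t < T_i$. $X_{it}$ represents the vector of control variables, which includes GDP per capita, GDP growth rate, the share of secondary industry in GDP, and fiscal expenditure. The terms $\mu_i$ and $\lambda_t$ capture city and year fixed effects, respectively, and $\varepsilon_{itp}$ is the error term.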
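A two-way fixed effects coefficient of this kind can be obtained by demeaning every variable by city and by year and then running OLS on the residuals. The sketch below assumes a balanced panel (the within transformation used here is exact only in that case) and hypothetical column names; it returns point estimates only and does not compute clustered standard errors.

```python
import numpy as np
import pandas as pd

def twfe_coefficients(panel, outcome, regressors, unit="city", time="year"):
    """Point estimates from a two-way within transformation (balanced panel).

    Column names are illustrative; no standard errors are computed here.
    """
    cols = [outcome] + list(regressors)
    data = panel[cols].astype(float)
    unit_mean = data.groupby(panel[unit]).transform("mean")  # city means
    time_mean = data.groupby(panel[time]).transform("mean")  # year means
    # Subtract both sets of means and add back the grand mean
    within = data - unit_mean - time_mean + data.mean()
    X = within[list(regressors)].to_numpy()
    y = within[outcome].to_numpy()
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return pd.Series(beta, index=list(regressors))
```

In practice one would call this once per pollutant, passing the treatment dummy and the four controls as `regressors`, and obtain city-clustered standard errors from a dedicated panel regression package.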
To estimate dynamic treatment effects and account for heterogeneous impacts across cohorts, our primary specification is the event-study model proposed by Sun and Abraham (2021) [24]. This robust estimator is specified as follows:
$$Y_{itp} = \sum_{k \neq -1} \delta_k D_{it}^{k} + \gamma' X_{it} + \mu_i + \lambda_t + \varepsilon_{itp} \tag{2}$$
where $Y_{itp}$ is the concentration of pollutant $p$ in city $i$ and year $t$. $D_{it}^{k}$ are event-time dummies indicating whether city $i$ is $k$ years from its treatment year. The coefficients $\delta_k$ capture the dynamic treatment effects. $X_{it}$ is a vector of time-varying controls, identical to those included in Equation (1). City fixed effects $\mu_i$ and year fixed effects $\lambda_t$ are included to control for unobserved time-invariant city characteristics and common temporal shocks, respectively. The period immediately before the reform ($k = -1$) is omitted as the baseline. Standard errors are clustered at the city level.
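The event-time dummies in such a specification can be built as sketched below. This covers only the dummy construction, not the interaction-weighted estimator of Sun and Abraham; the column names, the default window of −2 to +2, and the convention that never-treated cities carry a missing adoption year are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def event_time_dummies(panel, window=(-2, 2), baseline=-1,
                       year="year", adopt="adopt_year"):
    """One 0/1 column per event time in `window`, skipping the baseline period.

    Never-treated cities are assumed to have a missing adoption year (NaN),
    so all of their dummies are 0. Treated observations outside the window
    also get all zeros; whether to bin or drop them is left to the analyst.
    """
    rel = panel[year] - panel[adopt]  # years since adoption, NaN if never treated
    out = {}
    for k in range(window[0], window[1] + 1):
        if k == baseline:
            continue  # omitted reference period
        out[f"D_{k}"] = (rel == k).astype(int)
    return pd.DataFrame(out, index=panel.index)
```

The resulting columns can be passed as regressors to any two-way fixed effects routine; each coefficient is then read relative to the omitted period one year before the reform.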
3.2. Identification Assumption and Tests
3.2.1. Testing the Parallel Trends Assumption
The key identifying assumption underlying our staggered DID design is the parallel trends assumption. It requires that, in the absence of the EVMR, the evolution of air pollution levels in treatment and control cities would have followed parallel paths over time. This assumption allows us to attribute any divergent trends observed after the reform to the causal effect of the policy itself.
The plausibility of this assumption is supported by two contextual factors. First, during our sample period, all cities were subject to the same national macroeconomic conditions and overarching environmental policy directives. Second, as the descriptive trends in Figure 2 suggest, although the levels of pollution differed between the two groups, their pre-treatment trends (prior to 2017) did not exhibit obvious divergence. This provides preliminary graphical evidence consistent with parallel trends.
We formally test the parallel trends assumption by examining the pre-treatment coefficients (those on the event-time dummies for years before the reform) from our primary event-study specification (Equation (2)). The absence of statistically significant pre-trends would support the validity of our research design.
In our primary analysis, we focus on the event window from two years before the reform to two years after it. This balanced window is dictated by data availability (the earliest treated cohort has only two pre-treatment years) and provides a clear view of the short-to-medium-term policy dynamics. The results of this formal test, including the estimated pre-treatment coefficients and their statistical significance, are presented in Section 4.1.
3.2.2. Addressing Other Empirical Concerns
To bolster the causal interpretation of our findings and address potential confounding factors, we conduct a set of auxiliary robustness checks. The design and rationale for each check are outlined below, with full results detailed in Section 4.3.
First, to rule out the possibility that our baseline results are driven by spurious correlation or unobserved time-invariant city characteristics, we perform a permutation-based placebo test. This test randomly reassigns the EVMR treatment status across cities while preserving the size of the actual treatment group, re-estimates the baseline model, and records the placebo treatment coefficient. By repeating this procedure 500 times, we generate an empirical distribution of the estimated effect under the null hypothesis of no true policy impact. Comparing our baseline estimate to this distribution allows us to assess the probability that the observed effect occurred by chance.
Second, the COVID-19 pandemic (beginning in 2020) constituted an unprecedented exogenous shock that drastically altered economic activity and air quality patterns, potentially confounding any policy assessment that includes the pandemic and post-pandemic periods. To ensure our estimated effects are not driven by or contingent upon this exceptional shock and its aftermath, we conduct a stringent robustness check by re-estimating our baseline model using only the pre-pandemic sample period (2015–2019). This test examines whether the core findings regarding the EVMR’s impact hold in a stable, pre-crisis environment.
Third, and most importantly, we address the issue of spatial interdependence in air pollution. Emissions in one city can directly affect air quality in neighboring areas through atmospheric transport, violating the assumption of independent observations in standard panel models and potentially biasing both coefficients and standard errors. To directly model this spatial dependence, we augment our baseline specification by incorporating a spatial lag of the dependent variable (SAR). This approach allows us to explicitly estimate the strength of spatial spillovers and, crucially, to re-evaluate the direct effect of the EVMR under a more appropriate spatial econometric framework.
4. Empirical Findings
4.1. Validity Test: Parallel Trends
A prerequisite for the validity of our difference-in-differences design is that the treatment and control cities would have followed parallel paths in air pollution in the absence of the EVMR. We formally test this parallel trend assumption using an event-study analysis. Following recent advances in the staggered DID literature, we estimate dynamic treatment effects for each pollutant, plotting the coefficients for years leading up to and following the reform implementation.
The results of this test are presented in Figure 3. As shown, the estimated coefficients for all pre-treatment event-time periods are statistically indistinguishable from zero for every pollutant. The confidence intervals for these lead coefficients comfortably include zero, and the point estimates are small in magnitude. This pattern provides strong visual and statistical evidence that no systematic differential trends existed between the treatment and control groups prior to the policy intervention.
Therefore, the parallel trends assumption holds for our sample, lending credibility to our identification strategy. Having established the validity of the research design, we now proceed to examine the dynamic causal effects of the EVMR.
4.2. Main Results: Static and Dynamic Effects
Having established the validity of the parallel trend assumption in Section 4.1, we now present the core empirical findings regarding the impact of the EVMR. This section is organized into two parts: we first report the static average treatment effects from the baseline difference-in-differences (DID) model, and then explore the dynamic evolution of policy effects using an event-study framework.
4.2.1. Baseline Difference-in-Differences Estimates
Table 2 presents the baseline difference-in-differences estimates of the EVMR’s impact on six major air pollutants.
Consistent with the parallel pre-trends established in Section 4.1, Table 2 shows a clear pattern of selective effectiveness. The EVMR led to statistically significant reductions in PM2.5, PM10, and SO2 concentrations. In contrast, the coefficients for NO2, CO, and O3 are small and statistically indistinguishable from zero. This finding is noteworthy because it persists despite a general downward national trend in pollutants like NO2 and CO since 2015 (Figure 2). The stability of the treatment-control group divergence post-policy suggests that the observed nationwide decline in these pollutants cannot be attributed to the EVMR under our empirical design.
Beyond this selectivity, the effect size varies substantially among the pollutants that show significant responses. The impact on particulate matter is markedly stronger than that on SO2. As all variables are standardized, the coefficients are directly comparable. The estimated effects for PM2.5 and PM10 are very close in magnitude (−0.25 and −0.26, respectively). In practical terms, these estimates imply that the EVMR is associated with an average reduction of 8.4 μg/m3 (approximately 15.4%) for PM2.5 and 14.6 μg/m3 (approximately 15.5%) for PM10 in treatment cities. The reduction for SO2 is considerably more modest, at 2.2 μg/m3 (approximately 9.7%).
4.2.2. Dynamic Effects: Event-Study Analysis
The detailed event-study coefficients are reported in Table 3.
For particulate matter (PM2.5 and PM10), a consistent and sustained reduction pattern is evident. The coefficients in the reform year (event time 0) are significantly negative (−0.0032 for PM2.5; −0.0034 for PM10) and remain strong and statistically significant in the two subsequent years. This indicates that the EVMR achieved an immediate and persistent reduction in particulate pollution.
The effect on SO2 shows a different temporal profile. While significant in the reform year (−0.0026), it becomes statistically indistinguishable from zero thereafter, suggesting its impact was largely confined to the short run.
Notably, we observe a divergent pattern for other gaseous pollutants. Contrary to the overall goal of emission reduction, NO2 concentrations exhibited a significant increase in the first year post-reform. An even more pronounced and increasing trend is observed for CO, with coefficients turning significantly positive in two post-reform periods. This pattern is consistent with potential unintended consequences of the policy, such as pollution displacement or alterations in emission profiles. For O3, all post-reform coefficients are insignificant, indicating a null effect.
To quantify the overall short-to-medium-term impact, we calculate the average treatment effect on the treated (ATT) over the three-year post-reform window (event time 0 to 2). As shown in Table 3, the aggregate ATT for PM2.5 and PM10 is strongly negative and significant, confirming a substantial reduction. The ATT for SO2 is of similar magnitude but only marginally significant, reflecting its lack of persistence. In contrast, the ATT for CO is positive and significant, corroborating the dynamic increases. The ATT estimates for NO2 and O3 are statistically insignificant.
In summary, the dynamic analysis confirms the EVMR’s immediate and sustained effectiveness in reducing particulate matter, and a short-lived effect on SO2. However, it also uncovers critical trade-offs, evidenced by a significant post-reform increase in CO concentrations.
These findings regarding the selective, dynamic, and multi-pollutant trade-off effects of the EVMR form the core empirical contribution of this study. The following section subjects these baseline results to a series of robustness tests to ensure their credibility.
4.3. Robustness Test
4.3.1. Placebo Test: Falsification of Treatment Group
We conduct a permutation-based placebo test to assess whether our baseline results are driven by unobserved city characteristics or random chance. A placebo treatment group is constructed by randomly selecting 117 cities (matching the actual treatment group size) and artificially assigning them EVMR in 2017. We then estimate Equation (1) using this pseudo treatment indicator. This procedure is repeated 500 times, generating an empirical distribution of placebo coefficients under the null hypothesis of no true policy effect. The test focuses on PM2.5, PM10, and SO2—the pollutants with significant baseline effects.
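The permutation procedure described above can be sketched as follows. For brevity the sketch estimates the DID coefficient without control variables on a balanced city-by-year array, which is a simplification of Equation (1); the function names and the two-sided p-value definition are assumptions rather than the authors' exact implementation.

```python
import numpy as np

def did_coefficient(y, d):
    """TWFE slope of y on d for balanced (n_cities, n_years) arrays, no controls."""
    def demean(a):
        return (a - a.mean(axis=1, keepdims=True)
                  - a.mean(axis=0, keepdims=True) + a.mean())
    yd, dd = demean(y), demean(d)
    return float((yd * dd).sum() / (dd * dd).sum())

def placebo_distribution(y, n_treated, first_year_idx, n_draws=500, seed=0):
    """Reassign treatment to random cities and re-estimate the coefficient."""
    rng = np.random.default_rng(seed)
    n_cities, n_years = y.shape
    post = (np.arange(n_years) >= first_year_idx).astype(float)
    draws = np.empty(n_draws)
    for b in range(n_draws):
        fake = rng.choice(n_cities, size=n_treated, replace=False)
        d = np.zeros((n_cities, n_years))
        d[fake, :] = post  # pseudo-treated cities switch on in the reform year
        draws[b] = did_coefficient(y, d)
    return draws

def permutation_p_value(actual, draws):
    """Share of placebo coefficients at least as extreme as the actual one."""
    return float(np.mean(np.abs(draws) >= abs(actual)))
```

With 288 cities one would call `placebo_distribution(y, n_treated=117, first_year_idx=2)` when the panel starts in 2015 and the pseudo-reform is dated 2017, then compare the actual estimate against the returned draws.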
Figure 4 presents the results. For each pollutant, the 500 placebo coefficients are centered around zero, with means statistically indistinguishable from zero (e.g., PM2.5: mean = −0.0007, s.d. = 0.022). In stark contrast, our actual baseline estimates (e.g., −0.25 for PM2.5) lie far in the tails of their respective placebo distributions, with permutation-based p-values < 0.01. This pattern holds consistently for PM10 and SO2.
Thus, the placebo test confirms that the observed reductions are highly unlikely to occur by random chance or due to unobserved city heterogeneity, providing strong support for a causal interpretation of EVMR’s effects on these three pollutants.
4.3.2. Excluding the COVID-19 Pandemic Period
The COVID-19 pandemic constituted a major exogenous shock that temporarily reduced economic activity and air pollution, potentially confounding our policy assessment if its impact differed between treatment and control cities. To ensure our estimated effects are not driven by this exceptional period, we re-estimate our baseline model using only the pre-pandemic sample (2015–2019), which comprises 1420 city-year observations.
The results, presented in Table 4, robustly support our main findings. The event-study coefficients for PM2.5, PM10, and SO2 remain statistically significant and negative, with magnitudes closely mirroring those from the full-sample analysis in Table 2. For instance, the immediate effect on PM2.5 is −0.24 in this restricted sample, compared to −0.25 in the baseline. The patterns for NO2 and CO are also consistent. This close correspondence indicates that the pandemic’s impact was largely parallel across treatment and control groups, and does not undermine the causal interpretation of the EVMR’s effectiveness. The reform’s pollution-reducing effect is evident even in the absence of the pandemic shock.
4.3.3. Accounting for Spatial Autocorrelation
A potential threat to our identification is spatial dependence, as air pollutants diffuse across city boundaries. To ensure our estimates are robust to such spillovers, we augment our baseline DID model by incorporating a spatial lag of the dependent variable, yielding a Spatial Autoregressive (SAR) specification:
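$$Y_{itp} = \rho \sum_{j \neq i} w_{ij} Y_{jtp} + \beta\,\mathrm{EVMR}_{it} + \gamma' X_{it} + \mu_i + \lambda_t + \varepsilon_{itp} \tag{3}$$
where $W = (w_{ij})$ is a spatial weight matrix constructed using inverse distance between city centroids, and $\rho$ is the spatial autoregressive coefficient capturing cross-city interdependence.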
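An inverse-distance weight matrix and the corresponding spatial lag can be sketched as below. Row-normalization and the use of planar coordinates in kilometres are assumptions: the text only states that the weights are based on inverse distance between city centroids. The function names are hypothetical, and coincident centroids are not handled.

```python
import numpy as np

def inverse_distance_weights(coords_km):
    """Row-normalized inverse-distance weights with a zero diagonal.

    `coords_km` is an (n, 2) array of projected centroid coordinates.
    Row-normalization is an assumption, not stated in the text.
    """
    coords = np.asarray(coords_km, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = np.zeros_like(dist)
    off_diag = ~np.eye(len(coords), dtype=bool)
    w[off_diag] = 1.0 / dist[off_diag]  # a city is not its own neighbour
    return w / w.sum(axis=1, keepdims=True)

def spatial_lag(w, y):
    """Weighted average of the other cities' outcomes for one year."""
    return w @ y
```

Because the spatial lag contains the contemporaneous outcomes of other cities, it is endogenous; a SAR model is therefore normally estimated by maximum likelihood or instrumental variables rather than by adding this term to an OLS regression.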
The estimation results of this spatial econometric model are presented in Table 5. They lead to two key conclusions. First, the spatial autoregressive coefficient is consistently positive and statistically significant across all pollutants, confirming the presence of positive spatial autocorrelation: a city’s air quality is significantly influenced by pollution in neighboring cities.
Second, and most critically, after directly controlling for this spatial dependence, the coefficient on our variable of interest, the EVMR indicator, remains negative, statistically significant, and quantitatively similar to our baseline estimates in Table 2. This robustness strongly indicates that the pollution-reduction effect of the EVMR is a direct local causal impact of the reform, not a spurious finding driven by omitted spatial correlation.
5. Heterogeneous Effects: Exploring Variation in Policy Impact
5.1. Heterogeneity by Economic Development Level
The capacity for environmental governance is often closely tied to a region’s economic resources [25]. We examine whether the effectiveness of the EVMR varied between more and less economically developed cities. Following conventional practice, we split our sample at the national median of GDP per capita (CNY 53,000) into “more developed” and “less developed” subgroups and re-estimate our model for each subgroup separately.
The results, presented in Columns (1)–(4) of Table 6, reveal a striking pattern of selective heterogeneity. For particulate matter (PM2.5 and PM10), the reduction effects are statistically indistinguishable between the two subgroups, both in terms of magnitude and statistical significance. This suggests that the EVMR’s core mechanism, enhancing regulatory independence and enforcement, operated effectively irrespective of local fiscal capacity when it came to curbing the most salient pollutants.
In contrast, a clear divergence emerges for SO2 (Columns (5)–(6) of Table 6). The reform’s effect on SO2 was significantly larger in less developed cities. We posit that this heterogeneity stems from differences in baseline regulatory stringency and abatement potential. SO2 emissions in China are predominantly from the power and heavy industrial sectors, which historically faced relatively lax local oversight, especially in less developed regions where economic growth often took precedence over environmental compliance [26]. The EVMR, by abruptly strengthening top-down oversight and reducing local protectionism, likely created a stronger regulatory shock in these previously under-regulated areas. In more developed cities, where environmental standards and enforcement may have already been closer to the new national norms due to greater prior scrutiny, the marginal impact of the vertical reform on SO2 was consequently smaller. Thus, the larger observed effect in less developed cities is consistent with a “catch-up” effect in regulatory enforcement following the centralization of oversight.
5.2. Heterogeneity by Key Governance Areas
China’s air pollution policy has historically focused on key regions: the Beijing–Tianjin–Hebei region, the Yangtze River Delta, and the Fenwei Plain [27]. These 83 cities received intensified scrutiny, resources, and political pressure even before the EVMR. We test whether the reform had a differential impact in these priority areas compared to other cities.
The subgroup analysis results are presented in Table 7. They demonstrate that the EVMR’s impact was substantially amplified in key governance areas. The estimated reduction in PM2.5, PM10, and SO2 is not only statistically significant but also larger in magnitude than in non-key cities.
This amplified effect is likely the product of a policy synergy. Key areas were already subject to a higher baseline level of monitoring, stricter targets, and greater central government support. The EVMR did not introduce a new policy in isolation but reinforced an existing high-stakes regulatory environment. By superimposing vertical management (which reduces local interference) onto an arena already under the national spotlight, the reform likely closed implementation gaps more effectively. This finding underscores that institutional reforms like the EVMR can have the greatest marginal effect where they complement and strengthen pre-existing, focused policy efforts.
6. Mechanism Analysis
While the EVMR aims to strengthen environmental governance, the precise causal pathway through which it improves air quality remains an empirical question. Drawing on theoretical insights from the literature [28,29,30], we posit and empirically test a primary mechanism: that the reform reduces pollution by enhancing environmental enforcement intensity.
We formally test this mechanism using a causal mediation analysis framework (see Figure 5). In this framework, we utilize the number of environmental penalty cases as a quantitative proxy for enforcement intensity, the mediating variable. The analysis empirically tests two sequential links: (1) whether the EVMR increased penalties, and (2) whether increased penalties, in turn, reduced pollution. The logic is captured by the following two-equation system:
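$$M_{it} = a_0 + a_1\,\mathrm{EVMR}_{it} + \theta' X_{it} + \mu_i + \lambda_t + u_{it} \tag{4}$$
$$Y_{itp} = b_0 + c\,\mathrm{EVMR}_{it} + b_1 M_{it} + \gamma' X_{it} + \mu_i + \lambda_t + \varepsilon_{itp} \tag{5}$$
where $M_{it}$ is our mediator, the log-transformed number of environmental penalty cases, and $Y_{itp}$ is pollutant concentration. The first equation tests link (1) by estimating $a_1$. The second equation tests link (2) by estimating $b_1$, after controlling for the direct effect of the reform ($c$). The product $a_1 b_1$ captures the indirect (mediated) effect.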
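The product-of-coefficients logic can be illustrated with two plain OLS regressions. The sketch below omits fixed effects and control variables for brevity (they could be removed beforehand by two-way demeaning), and all function and variable names are hypothetical; it shows the arithmetic of the indirect effect, not the authors' estimation code.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients with an intercept prepended (index 0)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def indirect_effect(evmr, mediator, outcome):
    """Two-step mediation: a = EVMR -> mediator, b = mediator -> outcome."""
    a = ols(mediator, evmr)[1]                       # link (1)
    coef = ols(outcome, np.column_stack([evmr, mediator]))
    direct, b = coef[1], coef[2]                     # direct effect and link (2)
    return {"a": a, "b": b, "direct": direct, "indirect": a * b}
```

A positive first-stage coefficient combined with a negative mediator coefficient yields a negative indirect effect, which is the pattern the enforcement channel predicts. Inference on the product typically relies on a bootstrap or the Sobel approximation, neither of which is shown here.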
To test this two-step pathway, we estimate Equations (4) and (5). The results are presented in Table 8 and corroborate the mechanism. First, Column (4) shows that the EVMR caused a significant increase in environmental penalties (p < 0.01), confirming that the reform successfully enhanced enforcement intensity.
Second, Columns (5)–(7) show that increased penalties are associated with significantly lower pollution after controlling for the EVMR. Specifically, the coefficient on the mediator is negative and statistically significant for all pollutants. The indirect effect is therefore negative and significant, confirming mediation. Notably, the magnitude of the mediator’s coefficient is largest for SO2, suggesting that enforcement actions may have been particularly effective in curbing emissions from large, stationary sources like power plants, which are primary SO2 emitters.
Collectively, this evidence confirms that strengthened environmental enforcement is a key operational channel through which the EVMR achieved its pollution reduction goals. By insulating local agencies, the reform empowered them to impose stricter penalties on polluters, which in turn led to measurable reductions in emissions.
A remaining concern is potential reverse causality: high pollution could trigger stricter enforcement, biasing our estimates. To address this, we conduct a robustness check using a one-period lag of the penalty variable as the mediator. This specification mitigates reverse causality, as current pollution is unlikely to cause past enforcement actions.
The results, presented in Table 9, robustly support our mechanism. The EVMR significantly increased the lagged penalty measure (link 1), and this lagged measure remains a strong negative predictor of current pollution after controlling for the reform (link 2). The estimated indirect effect remains statistically significant and quantitatively similar to our baseline. This consistency confirms that the enforcement channel is not an artifact of simultaneous causality.
7. Discussion
Our analysis reveals a distinct efficacy hierarchy in China’s Environmental Vertical Management Reform (EVMR). The reform achieved significant reductions in particulate matter (PM2.5, PM10) and sulfur dioxide (SO2), yet it did not reduce nitrogen dioxide (NO2), carbon monoxide (CO), or ground-level ozone (O3). This pattern is not incidental but reflects an asymmetric regulatory prioritization deeply embedded in the nation’s pollution control agenda.
This asymmetry stems from the pronounced public and political salience of PM and SO2, driven by their well-established severe health impacts [31,32] and direct perceptibility [33,34]. Consequently, these pollutants have been the explicit, quantified targets of flagship national campaigns such as the “Blue Sky Defense War.” The EVMR’s primary success lay in functioning as a powerful enforcement catalyst for these pre-existing, high-stakes mandates. By insulating local environmental bureaus from political interference, the reform enhanced their capacity to pursue objectives that were already clear, measurable, and politically non-negotiable. This demonstrates a key principle: institutional reforms yield the greatest marginal returns where they leverage and strengthen pre-defined policy priorities.
In contrast, the absence of reductions in NO2, CO, and O3 underscores the inherent limits of an enforcement-centric reform. During our study period, these pollutants lacked the public salience and, crucially, the specific, binding reduction targets that defined the core policy agenda for PM and SO2. The EVMR is designed to improve the implementation of existing regulations, not to create new regulatory frontiers. In the absence of stringent standards, even a perfectly executed vertical management system has little to enforce.
Furthermore, the analysis reveals a more concerning, unintended consequence: a significant increase in CO concentrations in the post-reform years. This points to a pollution-shifting effect, where stringent controls on particulates and SO2 may have inadvertently altered industrial operations or energy consumption patterns, leading to increased emissions of other, less-regulated pollutants. It highlights a critical risk of partial regulatory strategies—solving one problem may exacerbate another.
Beyond policy and behavioral responses, the differential effectiveness is rooted in fundamental differences in the pollutants themselves. PM2.5, PM10, and SO2 are primary pollutants emitted directly from regulated point sources (e.g., power plants, industry), making their concentrations more directly responsive to enhanced oversight and penalties. In contrast, NO2 and especially O3 are largely secondary pollutants formed through complex, non-linear atmospheric reactions. Their levels are less tied to any single local source and more influenced by regional precursor emissions and meteorological conditions. This chemical complexity makes them less amenable to conventional, localized command-and-control measures.
The observed lagged increase in CO concentrations in the second and third years post-reform warrants particular attention, as it reveals an unintended consequence of asymmetric regulatory prioritization. This phenomenon can be contextualized within the evolutionary trajectory of China’s air pollution governance, which has progressively shifted from addressing “sensory pollutants” (those with high public visibility) to tackling more chemically complex secondary pollutants. The initial phase of this transition prioritized the aggressive reduction of particulate matter and SO2 through mandatory desulfurization equipment installation and coal-fired boiler retrofits. While these interventions proved highly effective for their targeted pollutants, they may have inadvertently created conditions conducive to elevated CO emissions through multiple interconnected pathways.
Combustion engineering research traces this trade-off to interconnected mechanisms inherent in SO2 control. Deep desulfurization typically requires adjusting oxygen-fuel ratios, a modification that inadvertently creates hypoxic zones within furnaces and directly elevates CO emissions [35]. Compounding this combustion effect, the very materials employed to capture SO2 can themselves become CO sources [36]. Beyond these direct chemical and combustion pathways, the integration of downstream equipment such as scrubbers and reheaters introduces aerodynamic disturbances [37] that propagate back to the furnace, destabilizing air-fuel ratios and further compromising combustion completeness. At a broader operational level, fuel switching to meet stringent SO2 limits often creates mismatches between new fuel properties and existing burner design, a misalignment that inherently favors incomplete combustion [38]. These interlinked factors, spanning combustion chemistry, sorbent behavior, system aerodynamics, and fuel compatibility, reveal a fundamental tension: achieving deep SO2 reduction can inadvertently elevate CO through multiple pathways, turning a regulatory success for one pollutant into an unintended environmental challenge for another.
This fundamental tension carries crucial implications for environmental governance. The EVMR’s success in curbing PM and SO2 confirms that enforcement-centric reforms yield the greatest returns when aligned with clear priorities. However, the null effects on NO2 and O3 and, more seriously, the significant increase in CO show that targeting “headline” pollutants while neglecting cross-pollutant interactions is inherently inadequate. As the combustion engineering literature indicates, the very measures that achieved deep SO2 reduction (combustion adjustments, sorbent chemistry, system aerodynamics, and fuel switching) can inadvertently elevate CO through interconnected pathways. The imperative next step is to move beyond single-pollutant silos. The robust institutional backbone established by the EVMR must now be equipped with a holistic, multi-pollutant regulatory framework that addresses the full spectrum of air pollutants, accounts for their interactions and trade-offs, and leverages enhanced enforcement capacity to pursue truly comprehensive air quality improvements, in line with the Sustainable Development Goals [39].
8. Conclusions
This study provides the first comprehensive evaluation of China’s Environmental Vertical Management Reform (EVMR) across all six major air pollutants within a unified staggered difference-in-differences framework. Our analysis reveals a nuanced and policy-relevant picture of the reform’s impact.
Our core finding is a pronounced hierarchy in efficacy: the EVMR induced significant and sustained reductions in the concentrations of particulate matter (PM2.5, PM10) and sulfur dioxide (SO2), pollutants that have long been the explicit targets of national campaigns. In stark contrast, the reform had no statistically discernible effect on nitrogen dioxide (NO2) or ground-level ozone (O3), and, more alarmingly, was associated with a significant lagged increase in carbon monoxide (CO). This disparity is not a failure of enforcement but reflects the reform’s design as an enforcement amplifier for pre-existing, high-priority regulatory targets.
The study makes three key contributions. First, it demonstrates that the success of a major governance restructuring is contingent upon the substantive policy agenda it serves; it strengthens the implementation of clear mandates but does not create them. Second, by identifying a potential pollution-shifting effect—the lagged increase in CO as an unintended consequence of deep SO2 reduction—it highlights the critical risk that targeted regulation of a subset of pollutants may inadvertently exacerbate others through interconnected engineering and operational pathways. Third, through mediation analysis, it corroborates that enhanced environmental enforcement is a primary channel through which the EVMR achieved its measured successes.
These findings carry direct policy implications. The EVMR has successfully built a more robust institutional backbone for environmental governance in China. To translate this institutional capital into comprehensive air quality improvements, policymakers must now equip it with an integrated, multi-pollutant regulatory strategy. Future policy should expand beyond the current focus on “headline” pollutants to establish binding targets and control measures for the full spectrum of air pollutants, explicitly accounting for the potential trade-offs between them. Only by addressing the complex interplay between pollutants can China’s air quality management fully meet its public health and sustainable development objectives.