The Spatial and Heterogeneity Impacts of Population Urbanization on Fine Particulate (PM2.5) in the Yangtze River Economic Belt, China

This paper addresses the effect of population urbanization on Fine Particulate (PM2.5) in the Yangtze River Economic Belt in China from 2006 to 2016 by employing PM2.5 remote sensing data and using the Stochastic Impacts by Regression on Population, Affluence and Technology (STIRPAT) model. The study contributes to the growing empirical literature by addressing heterogeneity, spillover, and dynamic effects in the dynamic spatial panel modeling process simultaneously. The empirical results show that population urbanization has a significant impact on PM2.5 with a positive spillover effect and a dynamic effect being detected and controlled. The heterogeneity effects of population urbanization on PM2.5 due to geographical positions show evidence of an obvious inverted U-shaped curve relationship in the upstream area and an increasing function curve in the midstream and downstream areas. The heterogeneity effects due to population urbanization levels show that an inverted N-shape curve relationship exists in low and medium urbanization level areas, while a U-shape curve relationship exists in high urbanization level areas. It is hoped that this study will inform the local governments about the heterogeneity of population urbanization and spillover effects of air pollution when addressing air pollution control.


Introduction
During the past four decades, urbanization in China has increased rapidly from 17.92% in 1978 to 58.52% in 2017. This rapid urbanization is being accompanied by the agglomeration of the urban population, the utilization of urban land, and severe industrial emissions leading to high ambient air pollution [1,2]. The standardized daily value of PM 2.5 in China is 75 µg/m 3 , meaning when the daily average concentration of PM2.5 higher than 75 µg/m 3 , the air quality reaches the level of pollution, which is three times that of the World Health Organization (WHO) standard (25 µg/m 3 ) [3]. According to the air quality monitoring results of 388 Chinese cities in 2016, only 84 cities reached the air quality standard, accounting for 24.9% of the cities tested. Of the tested cities, 254 cities, that is, approximately 75.1%, did not reach the standard.
The relationship between urbanization and air pollution has become a crucial issue for governments, residents, and academics [4]. Regarding this, many extant studies refer to the Environment Kuznets Curve (EKC) hypothesis analysis framework. In their classic work, Grossman and Krueger (1991) tested the EKC hypothesis and found that per capita income exhibits an inverted U-shaped curve relationship on waste emissions [5]. That is, the effect on the environment may worsen before it gets better as per capita income grows. detailed information, which help to stimulate a more accurate estimation, whereas national or provincial data may show bias [24]. Furthermore, previous literature includes mainly cross-sectional data, ignoring the characteristic of dynamic changes concerning the effect of urbanization on air pollution [25,26].
Fourth, many studies addressing PM 2.5 concerntration in China are based on data from air quality monitoring stations, which have time limitations. As in China, full coverage of monitoring stations in urban area were built to detect PM 2.5 and other atmospheric pollutants after 2013. As a result, the existing studies in China usually have a tendency to focus on PM 2.5 concentrations for short-time series, such as daily, monthly, and single yearly. Wang and Fang used PM 2.5 data obtained from 241 new observation points in 54 cities of the Bohai Rim Urban Agglomeration in 2014 to determine the spatial-temporal distribution of PM 2.5 and its socioeconomic determinants [27]. Guo et al. used the daily PM 2.5 of 35 monitoring sites from April 2013 to March 2015 that were calculated from daily Air Quality Index data and were collected from the official website of the Beijing Municipal Environmental Protection Bureau [28]. Wang et al. employed the data from monitoring stations to analyze the spatial distribution of PM 2.5 in 190 cities in China in 2014 [29]. However, short-time data cannot reflect stage heterogeneity. Additionaly, the monitoring stations are point measurements and it is not clear to represente the air quality of a given area. Fortunately, remote sensing data we used in the study helps to overcome the two drawbacks above.
Consequently, this study focuses on the following research highlights. (1) We use prefecture-level data to discuss the heterogeneity impact of population urbanization on PM 2.5 from two perspectives: regional heterogeneity and urbanization heterogeneity. To address regional heterogeneity, we divide the Yangtze River Economic Belt (YREB) into three parts: upstream, midstream, and downstream. For the urbanization heterogeneity, the data sample is divided into three categories based on the annual growth rate of urbanization from 2006 to 2016: low, medium, and high urbanization levels. (2) To avoid possible estimation bias caused by spatial interaction effects and the dynamic effect, we address these two issues using a combination of a dynamic model and a spatial econometric specification. We believe that this integrated modeling framework provides new insights on the relationship between population urbanization and PM 2.5 in China. (3) Panel data from prefecture-level cities are used for our study. The data concerning PM 2.5 are from a combination of Aerosol Optical Depth retrievals from multiple satellite instruments.
This study is organized as follows. The next section describes the study area, dataset, and the empirical methods, while Section 3 presents the empirical results. The final section discusses the dynamic, spillover, and heterogeneity effects, respectively, and then presents the conclusions.

Study Area
The YREB is the biggest economic belt in China, covering nine provinces (Jiangsu, Zhejiang, Anhui, Hubei, Hunan, Jiangxi, Sichuan, Guizhou, and Yunnan) and two municipalities governed directly by the central government (Shanghai and Chongqing). The YREB stretches across the eastern, central, and western regions of China, accounting for over 20% of China's geographical area ( Figure 1).
In 2011, the construction of an ecological civilization-especially the green development of the YREB-became a national strategy. "The outline of Yangtze River Economic Belt Development Plan" formally became a national development strategy in 2016 aiming to promote new urbanization for the YREB development [30]. This strategy would provide new opportunities for further development of the cities in the Yangtze River basin and promote urbanization in China. In 2016, the YREB was home to more than 40% of China's population and contributed a similar percentage of the gross domestic product (GDP). Then, the proportion of urban population in the YREB increased from 42.4% in 2006 to 56.9% in 2016. However, most cities in the YREB are facing more serious air pollution than other cities in China, especially the cities in the Yangtze River Delta, which have become a focus of public concern [29].

STIRPAT Model
The STIRPAT (Stochastic Impacts by Regression on Population, Affluence and Technology) model is a classic theoretical framework for PM2.5 research developed on the basis of the IPAT model [31]. In the early 1980s, Ehrlich and Holdren pioneered the Influence, Population, Affluence, and Technology (IPAT) model (I = PAT) to analyze the environmental effects of population and economic variables. In the equation, I represents the environmental impact, P represents the population, A refers to affluence, and T refers to technology. The IPAT model is widely recognized and applied for its simple and effective analysis of environmental impact. However, the model has two drawbacks: First, the IPAT model is a purely mathematical model which cannot be tested directly by empirical data, and the hypothesis of various factors affecting the environment cannot be obtained [32]. Second, the IPAT model simply assumes that the elasticity of population, wealth, technology, and environment are unified. Due to these limitations, Dietz and Rosa improved it to the STIRPAT model. It allows other explanatory and control variables to be added, which is more flexible [32,33]. The STIRPAT model is currently widely used to analyze the environmental impact of population and economic factors [34][35][36]. The expression of STIRPAT can be seen in equation (1).
where I, P, and A have the same meanings as in the model;α is the constant term; β , γ and λ are index parameters for each variable and μ is the random error term. STIRPAT is a nonlinear model containing multiple independent variables, and can therefore be converted to logarithmic form: ln =ln ln ln ln ln where β , γ , and λ can be seen as the elasticity coefficients, which means that every 1% change in ln P , ln A , or ln T would lead to a % β , % γ , or % λ change in lnI.

Model Specification
Based on the STIRPAT model, our econometric model adopts a reduced form analysis to estimate the coefficients that reveal the relationships between population urbanization and PM2.5. The classical EKC hypothesis posits that environmental quality tends to at first deteriorate and then

STIRPAT Model
The STIRPAT (Stochastic Impacts by Regression on Population, Affluence and Technology) model is a classic theoretical framework for PM 2.5 research developed on the basis of the IPAT model [31]. In the early 1980s, Ehrlich and Holdren pioneered the Influence, Population, Affluence, and Technology (IPAT) model (I = PAT) to analyze the environmental effects of population and economic variables. In the equation, I represents the environmental impact, P represents the population, A refers to affluence, and T refers to technology. The IPAT model is widely recognized and applied for its simple and effective analysis of environmental impact. However, the model has two drawbacks: First, the IPAT model is a purely mathematical model which cannot be tested directly by empirical data, and the hypothesis of various factors affecting the environment cannot be obtained [32]. Second, the IPAT model simply assumes that the elasticity of population, wealth, technology, and environment are unified. Due to these limitations, Dietz and Rosa improved it to the STIRPAT model. It allows other explanatory and control variables to be added, which is more flexible [32,33]. The STIRPAT model is currently widely used to analyze the environmental impact of population and economic factors [34][35][36]. The expression of STIRPAT can be seen in Equation (1).
where I, P, and A have the same meanings as in the model; α is the constant term; β, γ and λ are index parameters for each variable and µ is the random error term. STIRPAT is a nonlinear model containing multiple independent variables, and can therefore be converted to logarithmic form: where β, γ, and λ can be seen as the elasticity coefficients, which means that every 1% change in ln P, ln A, or ln T would lead to a β%, γ%, or λ% change in lnI.

Model Specification
Based on the STIRPAT model, our econometric model adopts a reduced form analysis to estimate the coefficients that reveal the relationships between population urbanization and PM 2.5 . The classical EKC hypothesis posits that environmental quality tends to at first deteriorate and then improve in line with urbanization in an inverted U-shaped curve. However, existing studies have shown that there may be U, N, and inverted N-shaped curves between environmental variables and urbanization. Therefore, the primary, quadratic, and cubic terms are decomposed to analyze and verify the nonlinear relationship between PM 2.5 and population urbanization [37,38]. The model can be specified as follows where cities are denoted by the subscript i (i = 1, . . . , N) and the subscript t (t = 1, . . . , N) denotes the time period. β 0 is the intercept term for all individuals. β 1 to β 7 represent coefficients of the corresponding variable and ε i,t represents the random error term.  [39]. Data for the YREB region were extracted from the global dataset and transformed into the GCS_WGS_1984 system using ArcGIS software (Esri, Redlands, CA, USA).
Population urbanization (ln Urban) is our explanatory variable. Since urbanization refers to the process of transforming agricultural population into urban population, the percentage of urban population relative to the total population can be used to represent city's urbanization level. The population urbanization data were mainly collected from the statistical yearbooks of the nine provinces and two municipalities provided in Section 2.1. Additional data for several cities that are not available from this source were gathered from The Statistics Communique on National Economy and Social Development.
The other four control variables are as follows. P represents population density. Considering the difference in administrative division areas between the cities, it is more appropriate to apply population density (population per unit area) to represent the impact of population concentration on PM 2.5 , rather than applying the total population [15].
A refers to the GDP per capita, which is a commonly used indicator representing urban economic development [26].
T is divided into two parts: technological progress Tech and industrial structure IS. First, technological progress is the main indicator for knowledge and ability and is also an important factor in controlling environmental pollution and alleviating PM 2.5 . We use the proportion of financial expenditure for science and education to denote this factor [40]. Second, fossil fuel combustion in secondary industries-including mining, manufacturing, electricity supply, and construction-may have an impact on air pollution [6]. Therefore, the proportion of secondary industry in the GDP is commonly used to represent the influence of industrial structure [41].
The data concerning these control variables were collected from Chinese Urban Statistical Yearbook [42]. Table 1 shows the statistical descriptions for all the variables.

Dynamic Spatial Econometric Model
Many existing empirical analyses show that environmental pollution often shows certain path dependence characteristics. Therefore, it is of great importance to investigate the time-lag effect of PM 2.5 changes [21]. The lagged item of PM 2.5 is introduced into the STIRPAT model, by considering the dynamic cumulative effect of PM 2.5 .
where ε it is the error term, i represents the different regions, and t indicates time. X represents a vector of the control variables.
Besides the dynamic cumulative effect, PM 2.5 may have inevitable spatial autocorrelation and may be influenced by the consequent spatial spillover effects. That is, PM 2.5 of city i may also be affected by its surrounding areas. The specification can be seen in the following equation.
where W denotes the spatial weight matrix. In this analysis we use the distance between cities to establish the geographical weight matrix (W ij = 1/d 2 ij ), which is obtained from the China Geographic Database of National Bureau of Measurement. δ, as the coefficient of W * PM 2.5 , represents the spatial dependence of the sample observation.v t is a normally distributed disturbance term with a diagonal covariance matrix.
To perform a comprehensive analysis of the time-lag effect and spatial interaction effect, we adopt a dynamic spatial panel model, which controls the interference of the above factors to model estimation. This helps us avoid any estimation bias [43]. Combining Equations (4) and (5), the dynamic spatial panel model is set as follows where τ represents the first-order lag regression coefficient of PM 2.5 that reflects the influence of previous related factors on this period. δ represents the spatial lag regression coefficient, which in turn reflects the spatial spillover effect of air pollution. The PM 2.5 gradually increases along the Yangtze River from upstream to downstream ( Figure 2). The spatial pattern of PM 2.5 in YREB shows obvious spatial agglomeration. For example, the high concentration level is concentrated in the downstream area of the Yangtze River, especially in Shanghai and Jiangsu Provinces. PM 2.5 in the upstream area of the YREB is lower than that in the downstream area. Generally, the number of hot spots shows an increasing trend during 2006, 2008, and 2010. That is, PM 2.5 in YREB continues to deteriorate. Then, cities in the high concentration level of PM 2.5 show a downward trend in the downstream area. Based on the spatial distribution of PM 2.5 , the regional and the temporal differences of PM 2.5 pollution were found to be significant [44]. Therefore, it is of great importance to understand the regional differences when considering possible trends in the impact of urbanization on PM 2.5 in different regions. Additionally, the economic and the overall urbanization level of the Yangtze River Delta all take precedence over the other areas in the YREB. Simultaneously, high urbanization cities, such as Shanghai, are currently experiencing high PM 2.5 caused by population agglomeration. We therefore consider urbanization levels to explore the effect of different urbanization levels on PM 2.5 .

Spatial Autocorrelation Test
As described above, there might be spatial correlations in PM2.5 concentrations among adjacent cities [20]. We adopted the Global Moran's I index in order to test the spatial autocorrelation of PM2.5 [45,46]. This commonly used index helps to measure the degree of overall spatial autocorrelation of the variables. It can be expressed as follows

Spatial Autocorrelation Test
As described above, there might be spatial correlations in PM 2.5 concentrations among adjacent cities [20]. We adopted the Global Moran's I index in order to test the spatial autocorrelation of PM 2.5 [45,46]. This commonly used index helps to measure the degree of overall spatial autocorrelation of the variables. It can be expressed as follows where n is the number of cities, which equals 108 in this study. x i and x j are the values of PM 2.5 at location i and location j, respectively; x is the mean of PM 2.5 and ω ij is the element of the spatial weight matrix. The global Moran's I value range from −1 to 1. A larger value of indicates a stronger spatial connection, while a smaller value indicates weaker spatial connection; if the Moran's I value equals zero, then random spatial distributions exists, indicating no spatial correlation. To further distinguish the spatial agglomeration patterns, a Moran scatter plot was used to test the average PM 2.5 in the selected years. The Moran scatter plot and the Local Indicators of Spatial Association (LISA) cluster map of PM 2.5 at the 0.05 significance level are presented in Figure 3. According to the analysis results, the sample areas can be divided into four agglomeration pattern types. The first quadrant is the H-H aggregation type, which indicates that the highly polluted areas are adjacent to each other. H-H regions mainly encompass the city of Shanghai and several cities in the Jiangsu, Zhejiang, and Anhui provinces in the downstream area of the Yangtze River. Similarly, L-L agglomeration regions in the third quadrant are concentrated in the upstream area and include the Sichuan and Yunnan provinces, as well as Chongqing. These regions tend to exhibit relatively lower levels of industrial development and low economic outputs per unit of construction [47]. Observably, regions in the L-L type agglomeration show an increasing tendency, extending from the upstream to midstream areas. The cluster map also shows that the range of H-L and L-H aggregations have not changed significantly from 2006 to 2016. L-L agglomeration regions in the third quadrant are concentrated in the upstream area and include the Sichuan and Yunnan provinces, as well as Chongqing. These regions tend to exhibit relatively lower levels of industrial development and low economic outputs per unit of construction [47]. Observably, regions in the L-L type agglomeration show an increasing tendency, extending from the upstream to midstream areas. The cluster map also shows that the range of H-L and L-H aggregations have not changed significantly from 2006 to 2016.

Full Sample Results
The estimation results employing different model specifications are shown in Table 3. The second column shows the results of equation (3). Before estimating a panel data model, a fixed effect model or random effect model should be chosen using the Hausman test [48]. The result of the Hausman test implies that it is reasonable to employ a fixed effect model. Based on this model, the coefficients of lnUrban ,

Full Sample Results
The estimation results employing different model specifications are shown in Table 3. The second column shows the results of Equation (3). Before estimating a panel data model, a fixed effect model or random effect model should be chosen using the Hausman test [48]. The result of the Hausman test implies that it is reasonable to employ a fixed effect model. Based on this model, the coefficients of ln Urban, (ln Urban) 2 and (ln Urban) 3 are 3.992, −1.137, and 0.104, respectively. All the coefficients are insignificantly positive at the 10% level across the second column, which indicates that there is not obvious relationship exists between urbanization and PM 2.5 without considering spatial correlation and the time lag effect of the dependent variable.
With the time-lag effect of the independent variable included, the system GMM result in the third column shows a strong time-lag effect for PM 2.5 . The estimated coefficient of W * lnPM 2.5i,t is significantly positive (p < 0.01), that is, a high PM 2.5 in the current period indicates a probability of the next period's PM 2.5 to continue increasing, pointing to an obvious lag effect. The global Moran value and local scatter plot show that PM 2.5 among cities in the YREB has a stable positive spatial correlation, confirming the suitability of the spatial econometric model. Column 4 in Table 3 shows the spatial panel model result with spatial interaction effects based on Equation (5). The coefficient of W * lnPM 2.5i,t is positive at the 1% significance test, which shows that there are significant spatial spillover effects for PM 2.5 . This means that a local PM 2.5 change affects geographically neighboring regions. The result serves as a reminder that governmental efforts toward reducing environmental pollution should take a global view. As Table 3 shows, the lag effect of PM 2.5 and the spatial interaction effect both exist. The urbanization effect on PM 2.5 shown in columns 3 and 4 appears to be insignificant.
Most importantly, considering the spatial correlation and time-lag effect of PM 2.5 , next we implemented the dynamic spatial panel model. There are two important tests pertaining dynamic spatial panel data. First, the Sargan statistic of overidentification test is used to examine effectiveness and feasibility of instrumental variables. If the Sargan value fails to pass the significance test, the instrumental variables can be regarded as reliable. Second, the Arellano-Bond estimator is used to test the existence of sequence-dependent errors [49,50]. If the p-values of AR(2) are higher than 0.05, the disturbance term of the dynamic model does not exist with respect to the problem of sequence correlation. All the tests confirm that using a dynamic spatial model to detect the relationship between urbanization and PM 2.5 is the most suitable. The estimated results of the dynamic spatial model can be seen in column 5 of Table 3, where the coefficients of ln PM 2.5i,t−1 and W * lnPM 2.5i,t are 0.356 and 0.797, respectively, and statistically significant at the 1% level. Compared with the results of the system GMM and the spatial panel model, a difference occurs in the coefficients of ln Urban, (ln Urban) 2 , and (ln Urban) 3 . In the system GMM model, the primary, quadratic, and cubic terms of ln Urban are significant at the 10% level, while there is not an obvious association between population urbanization and PM 2.5 in the spatial panel model. However, in dynamic spatial model, population urbanization has an obvious N-shaped effect on PM 2.5 with two inflection points: 27% and 70%, respectively. Thus PM 2.5 will continue to decrease as population urbanization increases from 27% to 70% during the second stage of the N-shaped curve. The result can be explained as the improvement of technology and sufficient capital conducted by government, which helps to control PM 2.5 [21]. Additionally, people require healthier living conditions nowadays than before. Thereby PM 2.5 begins to decrease with the agglomeration of population. Note: There are no inflection points, because the functions are monotonically decreasing. *, **, and *** represent significant differences at the 10%, 5%, and 1% levels, respectively. The dynamic spatial model passes the AR(2) test for serial correlation and the Sargan test for overidentification.

The Heterogeneous Effects of Upstream, Midstream, and Downstream Cities
The YREB is a vast area that is affected by numerous different natural, social, and economic. It thus displays significant diversity in urbanization and PM 2.5 . Considering the long-existing socioeconomic gap and watershed division among areas in the YREB, we split the sample according to upstream (31 cities), midstream (52 cities), and downstream (25 cities) areas to account for potential regional heterogeneities. The regression results for upstream, midstream, and downstream cities are shown in Table 4. Figure 4 shows the relationship trajectory between ln Urban and ln PM 2.5 in the different regions.
In both upstream and downstream areas, regressions pass the AR(2) test for serial correlation and the Sargan test for overidentification. In the upstream area, the coefficients of the primary term of urbanization are significantly positive, while the coefficients of the squared term are negative. That is population urbanization shows an inverted U-shaped effect on PM 2.5 . Accordingly, the inflection point shown in Figure 4a is 43%, which demonstrates that urbanization increases PM 2.5 below the inflection point, while increasing population urbanization restrains PM 2.5 when goes over the 43% level. In the midstream area, when we add the primary term, the quadratic term, and the cubic term of to the model, the p-value of the AR(2) test is less than 0.05. That is, the model rejects the null hypothesis and that the two-order autocorrelation coefficient of the disturbance term is not different. Therefore, we dropped (ln Urban) 2 and (ln Urban) 3 , and found that the coefficient of in this region is 0.039, indicating that a 1% increase in ln Urban would lead to a decrease in ln PM 2.5 of 3.9% for the midstream region. A linear relationship exists between PM 2.5 and urbanization, which can be seen in Figure 4b. In the downstream area, the coefficients of the ln Urban and (ln Urban) 3 terms of population urbanization are both positive and the coefficient of (ln Urban) 2 is negative. Thus, the relationship between PM 2.5 and population urbanization is a monotonically increasing function in Figure 4c. This is similar to the full sample result. That is, population urbanization will promote PM 2.5 in the downstream area. In sum, the increase of the population urbanization rate in the downstream and midstream regions will accelerate PM 2.5 in the future [51]. Furthermore, the coefficient of the lagged dependent variable (ln PM 2.5t−1 ) for the upstream area is significantly greater than that of the midstream and downstream areas. This implies that the lag-effect of air quality in the upstream region is more significant than in other areas. Table 4. Heterogeneity results by upstream, midstream, and downstream cities.

Variables
Regional Heterogeneity Note: There are no inflection points, because the functions are monotonically decreasing. *, **, and *** represent significant differences at the 10%, 5%, and 1% levels respectively. Column 2, 5, and 6 pass the AR(2) test for serial correlation (marked in bold) and the Sargan test for overidentification. Note: There are no inflection points, because the functions are monotonically decreasing. *, **, and *** represent significant differences at the 10%, 5%, and 1% levels respectively. Column 2, 5, and 6 pass the AR(2) test for serial correlation (marked in bold) and the Sargan test for overidentification.
(a) (b) (c) The heterogeneity results can be explained from the perspectives of geographical conditions and the current development of the YREB region. First, the downstream and midstream regions have a relatively higher living standard and industrialization than the upstream region, which leads to a large population transfer, not only from local agricultural sectors, but also due to interregional The heterogeneity results can be explained from the perspectives of geographical conditions and the current development of the YREB region. First, the downstream and midstream regions have a relatively higher living standard and industrialization than the upstream region, which leads to a large population transfer, not only from local agricultural sectors, but also due to interregional migration [52]. Because of increasing population urbanization, as well as an increase in urban employment density, city sprawl is accompanied with an increase in building height and density, which is not beneficial to the rapid dispersion of PM 2.5 concentration. Additionally, the population and labor force increase leads to a related increase in the number of vehicles, as well as energy consumption and infrastructure construction. All these factors will directly affect PM 2.5 concentration. Secondly, the decrease of PM 2.5 with the increase of population urbanization in the upstream area can be explained by the economic development and national policy. Economic development of upstream areas in the YREB are relatively lagging behind midstream and downstream, which lead to serious labor outflow and slow development of secondary industry. Furthermore, the national policy of "protecting the Yangtze River" prevents the upstream areas from investing in highly polluting industries. Over time more and more cities' urbanization levels are higher than the inflection point of 43%, which reduces the impact of PM 2.5 .

The Heterogeneous Effects of Cities on Different Urbanization Levels
The PM 2.5 effect of urbanization might also differ according to cities' urbanization levels. This needs to be investigated further. To accomplish this, we divide the panel data of 108 cities into three groups according to the average urbanization level division for the study period of 2006 to 2016. Specifically, (1) the regions with an average urbanization ratio of 29.00% to 40.00% belong to low urbanization group; (2) the regions with average urbanization ratio of 40.01% to 50.00% belong to medium urbanization group; (3) and the regions with average urbanization ratio of 50.01% to 90% belong to high urbanization group.
The results on the heterogeneous effects according to different urbanization levels are listed in Table 5. Figure 5 shows a diagram representing the relationship between ln Urban and ln PM 2.5 for different urbanization levels. In columns 2 and 3, the coefficient of ln Urban and (ln Urban) 3 are negative, the term of (ln Urban) 2 is positive, and the p-values are all less than 5%. Therefore, an inverted N-shaped relationship exists in the low and medium urbanization level groups in Figure 5a,b. For the cities in the low urbanization level group, the first inflection point is 24%, and the second is 38%. For the medium urbanization level group, the two inflection points are 33% and 46% respectively. To further explore the dynamic impacts of urbanization on PM 2.5 , we calculate the number of cities located in the different stages of the three curves. Considering the urbanization ratio, the two groups are nearly between the two inflection points and pass over the second inflection points. Therefore, increased urbanization first raised PM 2.5 in regions with low urbanization and medium urbanization and then reduced the PM 2.5 .
In column 4, the coefficients of the primary, squared, and cubic terms of urbanization are statistically insignificant. We therefore excluded (ln Urban) 3 and reestimated the model (column 5). Judging from the coefficients in column 5, a U-shaped curve exists in the high urbanization group, with an inflection point of 49% in Figure 5c. Generally, the urbanization growth would first reduce PM 2.5 in this region and then increase the PM 2.5 again. Table 6 shows that all the cities in the high urbanization level group have passed over the inflection point of the U-shaped curve since 2010, which indicates that the PM 2.5 is aggravated with the increasing urban population. Note: There are no inflection points, because of the functions are monotonically decreased. *, **, and *** represent significant differences at the 10%, 5%, and 1% levels, respectively. Columns 2, 3, and 5 pass the AR(2) test for serial correlation and the Sargan test for overidentification.
(a) (b) (c)   2006  7  25  0  11  24  0  17  24  2007  4  28  0  2  33  0  13  28  2008  1  31  0  0  34  1  5  36  2009  1  30  1  0  29  6  2  39  2010  1  29  2  0  26  9  0  41  2011  1  24  7  0  22  13  0  41  2012  0  18  14  0  16  19  0  41  2013  0  14  18  0  14 2006  7  25  0  11  24  0  17  24  2007  4  28  0  2  33  0  13  28  2008  1  31  0  0  34  1  5  36  2009  1  30  1  0  29  6  2  39  2010  1  29  2  0  26  9  0  41  2011  1  24  7  0  22  13  0  41  2012  0  18  14  0  16  19  0  41  Section 3.5 (listed above) confirms that for cities with low or medium urbanization level, population urbanization showed a positive impact on PM 2.5 in the past. However, due to the rapid growth of urbanization, population urbanization has passed the second inflection point and will have a long-term negative effect on PM 2.5 . These cities are mostly remote cities with relatively low overall urbanization levels, which will not promote PM 2.5 because of their safe and comfortable development mode. For cities with a high urbanization level, population urbanization has surpassed the inflection point of the U-shape curve and will maintain a positive impact on PM 2.5 for a long time. Many cities in the high urbanization group such as Shanghai, Nanjing, Hangzhou, Wuxi, Suzhou, and other developed areas are eager to pursue economic and industrial development, which will inevitably lead to a long-term increase in population urbanization. This is similar to the results of the downstream area of YREB. This pursuit of development of these regions-that have already reached a certain urbanization level-will lead to a further continuous increase of population urbanization. With the urban population extremely inflated, these cities will face increased traffic congestion, accelerated real estate construction, and industrial development caused by population agglomeration, and PM 2.5 will eventually increase.

Discussion
Despite the rising interest in understanding the impact of urbanization on air quality, few empirical studies have been able to fully explain this impact. This is partly due to the complicated nature of statistical analysis in such a case, given the presence of factors such as heterogeneous effects, dynamic effects, and spatial dependence. Our study aims to fill this gap in existing research by using a more comprehensive econometric model and focusing on the heterogeneous effect. Considering the time-lag effect and spatial interaction of PM 2.5 , the study used a STIRPAT model to analyze the nonlinear relationship between population urbanization and PM 2.5 based on panel data of 108 prefecture-level cities in the YREB from 2006 to 2016. The results indicate that the relationship shows an N-shaped curve for the YREB. The traditional EKC inverted U curve is not applicable to PM 2.5 and population urbanization in different models, which indicates that population urbanization has obvious stage heterogeneity from the perspective of regional heterogeneity and urbanization level heterogeneity on PM 2.5 .
Based on the above statements concerning regional heterogeneity and urbanization level heterogeneity, we find that (1) for cities in the downstream and midstream areas, as well as those with high urbanization levels, there was a significantly positive trend proving that urbanization will increase PM 2.5 . More attention should be paid to the rapid economic development of cities that have exceeded the inflection point or will soon exceed the urbanization inflection point of 49%, and thus have a positive impact on PM 2.5 . (2) Regarding the upstream areas of the YREB and the small and medium-sized cities with low urbanization level: although urbanization has passed through the increasing stage in relation to PM 2.5 , these areas have passed the second inflection point of the N-shaped curve and have begun to reduce PM 2.5 . Thus, when addressing PM 2.5 , the government cannot simply blame air pollution purely on factors related to population agglomeration. For cities with different urbanization levels, the impact of future urbanization development on PM 2.5 also differs. Therefore, solutions should be found that suit specific circumstances, especially for the areas with a low level of economic development, the source of PM 2.5 should be determined and controlled suitably.

Conclusions
In conclusion, PM 2.5 remains a serious environmental challenge in the YREB, especially in most downstream and midstream regions. Considering the entire YREB, PM 2.5 showed a trend of firstly decreasing and then increasing with the increasing of urbanization from 2006 to 2016. In the upstream area, population urbanization shows an inverted U-shaped effect on PM 2.5 , while presenting an increasingly linear relationship with midstream and downstream. From the perspective of urbanization levels, a similar inverted N-shaped curve exists in the medium and low urbanization level groups, and a U-shaped relationship exists in the high level of urbanization.
Author Contributions: All the authors made significant contributions to the paper. W.X., H.D., and Z.C. conceptualized and designed the study; W.X. analyzed the data; and W.X. and Z.C. wrote and revised the manuscript.