Effects of Land Urbanization on Smog Pollution in China: Estimation of Spatial Autoregressive Panel Data Models

Studying the impact of land urbanization on smog pollution has important guiding significance for the sustainable development of cities. This study adds the spatial effect between regions into the research framework of smog pollution control in China. On the basis of a panel dataset of 31 province-level administrative regions in China from 2000 to 2017, we investigate the impact of land urbanization on smog pollution. We construct a spatial weight matrix and use Moran’s I statistic and the spatial autoregressive panel data model. The research results show that land urbanization and smog pollution have an inverted U-shaped relationship. With the advancement of land urbanization, the area’s smog pollution first increases and then decreases. However, in general, China has not passed the inflection point and is still at a stage where increasing land urbanization rate aggravates smog pollution. Moreover, the country’s smog pollution has a significant spatial positive correlation that shows agglomeration. In that context, multiple environmental governance entities, including the government, enterprises, and the public, need to collaborate on measures to reduce smog pollution. Future urban construction in China will need to integrate solutions that address the current nexus between urbanization and smog pollution to achieve green and sustainable development.


Introduction
Since China's reform and opening up, its cities have been expanding along with the continuous growth of total economic output [1,2]. The National Bureau of Statistics reported that China's urbanization rate increased from 17.9% in 1978 to 60.6% in 2019 [3]. However, the rapid growth of the Chinese economy and the expansion of city scale have come at the cost of considerable energy consumption and severe environmental pollution [4][5][6][7], especially smog pollution. Smog is a mixture of smoke and fog, and an important measure of smog pollution is the peak hourly concentration of ambient fine particulate matter (PM2.5) [8]. Industrial sectors, such as automobile manufacturing, coal mining, construction, and cement manufacturing, are known to generate large quantities of air pollutants, especially PM2.5 particles [9]. In recent years, smog pollution has occurred frequently, posing a great health hazard to the public [10,11]. Reports show that ambient particulate matter pollution is the fourth-leading risk factor for deaths in China, behind dietary risks, high blood pressure, and smoking [12]. The worsening smog pollution has attracted public attention, and policymakers have consequently begun to adopt many strategies to control it. In 2013, the Chinese government deployed 10 measures to prevent and control smog pollution and released the Action Plan for Preventing and Controlling Air Pollution [13,14]. The implementation of these policies
Land 2020, 9, 337

Spatial Weights Matrix
The spatial weights matrix is a necessary element in most regression models where a representation of spatial structure is needed [37]. Common spatial weight matrices fall into three types: contiguity-based, distance-based, and economic-based spatial weights [38]. The contiguity-based spatial weights are defined as follows:

$$w_{ij} = \begin{cases} 1, & i \text{ and } j \text{ are neighbors} \\ 0, & i \text{ and } j \text{ are not neighbors} \end{cases} \qquad (1)$$

where i and j are regions. For the distance-based spatial weights, d is the distance between the geographic centers of the two regions, for which we selected the locations of the provincial capitals. For the economic-based spatial weights, pgdp is per capita GDP. We used contiguity-based spatial weights to reflect whether area i and area j are adjacent, distance-based spatial weights to reflect the geographic distance between area i and area j, and economic-based spatial weights to reflect the economic distance between area i and area j. Following previous research, we mainly chose contiguity-based spatial weights for the spatial econometric analysis [39,40]. We also used distance-based and economic-based spatial weights to test the robustness of the results.
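The three weighting schemes can be sketched in code. The snippet below is only an illustration under stated assumptions, not the authors' implementation: the inverse-distance form 1/d and the inverse per capita GDP gap 1/|pgdp_i - pgdp_j| are common choices assumed here because the exact functional forms are not given above, row-standardization is likewise an assumed (though common) step, and the toy neighbor pairs, coordinates, and pgdp values are hypothetical.

```python
import numpy as np

def contiguity_weights(n, neighbor_pairs):
    """Binary contiguity matrix: w[i, j] = 1 if i and j are neighbors, else 0."""
    w = np.zeros((n, n))
    for i, j in neighbor_pairs:
        w[i, j] = w[j, i] = 1.0
    return w

def inverse_distance_weights(coords):
    """Assumed form: w[i, j] = 1 / d_ij for i != j (planar Euclidean distance).

    Real provincial capitals would call for a great-circle distance, and the
    points must be distinct to avoid division by zero.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    w = np.zeros_like(d)
    off_diag = ~np.eye(len(coords), dtype=bool)
    w[off_diag] = 1.0 / d[off_diag]
    return w

def economic_weights(pgdp):
    """Assumed form: w[i, j] = 1 / |pgdp_i - pgdp_j| for i != j (values must differ)."""
    pgdp = np.asarray(pgdp, dtype=float)
    gap = np.abs(pgdp[:, None] - pgdp[None, :])
    w = np.zeros_like(gap)
    off_diag = ~np.eye(len(pgdp), dtype=bool)
    w[off_diag] = 1.0 / gap[off_diag]
    return w

def row_standardize(w):
    """Scale each row to sum to 1; rows without neighbors stay all zero."""
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

# Hypothetical toy data: three regions arranged in a line.
pairs = [(0, 1), (1, 2)]
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
pgdp = [10.0, 20.0, 40.0]

w_contig = row_standardize(contiguity_weights(3, pairs))
w_dist = row_standardize(inverse_distance_weights(coords))
w_econ = row_standardize(economic_weights(pgdp))
```

In this toy setup, the contiguity matrix gives the middle region two equally weighted neighbors, while the distance and economic matrices give every region a nonzero weight on every other region, with closer (or economically more similar) regions weighted more heavily.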

Moran's I Statistic
This study uses Moran's I statistic to examine the spatial autocorrelation of smog pollution. Moran's I is the most popular test for spatial correlation and comes in two forms, global Moran's I and local Moran's I [41,42]. The global statistic is calculated as follows:

$$I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (\rho_i - \bar{\rho})(\rho_j - \bar{\rho})}{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, \sum_{i=1}^{n} (\rho_i - \bar{\rho})^2}$$

where n is the number of areas, \rho_i is the smog pollution of area i, \bar{\rho} is the mean of smog pollution, and w_{ij} is an element of the spatial weights matrix. Moran's I ranges from −1 to 1. A value greater than 0 indicates positive spatial correlation, that is, high values cluster with high values and low values cluster with low values. By contrast, a value less than 0 indicates negative spatial correlation, that is, high values are surrounded by low values or low values by high values. A value equal to 0 indicates no spatial correlation. To investigate the local spatial autocorrelation of smog pollution, we further used Stata to generate local Moran scatter plots of China's smog pollution for each year. We also used GeoDa and ArcGIS to produce local indicators of spatial association (LISA) agglomeration maps of smog pollution for each year.
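As a minimal sketch of how the global statistic behaves, the snippet below computes the standard global Moran's I, n·ΣΣ w_ij z_i z_j / (ΣΣ w_ij · Σ z_i²) with z_i the deviation from the mean, on hypothetical toy data. The function name, the four-region chain, and the values are illustrative assumptions; the study itself reports using Stata for this step.

```python
import numpy as np

def global_morans_i(values, w):
    """Standard global Moran's I for one cross-section of values."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                    # deviations from the mean
    numerator = len(x) * (z @ w @ z)    # n * sum_ij w_ij z_i z_j
    denominator = w.sum() * (z @ z)     # (sum_ij w_ij) * sum_i z_i^2
    return numerator / denominator

# Hypothetical chain of four regions: 0-1, 1-2, 2-3 are neighbors.
w_chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

i_gradient = global_morans_i([1, 2, 3, 4], w_chain)     # similar neighbors
i_alternating = global_morans_i([1, 4, 1, 4], w_chain)  # dissimilar neighbors
```

With a smooth gradient along the chain the statistic is positive (similar values sit next to each other), whereas the alternating pattern yields a negative value, matching the interpretation of the sign given above.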

SAR Panel Data Model
The SAR panel data model is used to estimate the impact of land urbanization on smog pollution. The traditional panel econometric model does not consider the possible spatial correlation of the dependent variable. From the perspective of externality, the SAR panel data model examines how a region is affected by changes in other regions. The SAR panel data model in this study is defined as follows:

$$smog_{it} = \delta \sum_{j} w_{ij} \, smog_{jt} + \beta_1 \, landurban_{it} + \beta_2 \, landurban_{it}^2 + \beta \, control_{it} + u_i + \gamma_t + \varepsilon_{it}$$

where i is the region; t is the time; smog is the dependent variable, that is, smog pollution; landurban is the independent variable that this study focuses on, that is, land urbanization; control refers to all control variables; w_{ij} is an element of the spatial weights matrix; \delta is the spatial autoregressive coefficient; u_i and \gamma_t represent individual effects and time effects, respectively; and \varepsilon_{it} is the error term. According to the EKC hypothesis, we add the quadratic term of land urbanization to the proposed model to investigate the possible non-linear effects of land urbanization on smog pollution. \beta_1, \beta_2, and \beta are the coefficients of each variable [43,44].
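Because the spatial lag of smog appears on the right-hand side, each region's outcome depends on the outcomes of the others through the weights matrix. The sketch below illustrates that mechanism for a single cross-section under assumed notation: the spatial coefficient is called `delta`, `w` is a row-standardized weights matrix, `xb` stands for the combined non-spatial part of the model, and all numbers are hypothetical. Solving y = delta·W·y + xb + eps for y gives the reduced form y = (I − delta·W)⁻¹(xb + eps).

```python
import numpy as np

def sar_reduced_form(w, xb, delta, eps=None):
    """Solve y = delta * W @ y + xb + eps for y (one cross-section)."""
    w = np.asarray(w, dtype=float)
    xb = np.asarray(xb, dtype=float)
    n = w.shape[0]
    if eps is None:
        eps = np.zeros(n)  # no disturbance unless one is supplied
    # Solve (I - delta * W) y = xb + eps rather than forming the inverse.
    return np.linalg.solve(np.eye(n) - delta * w, xb + eps)

# Hypothetical row-standardized contiguity matrix for three regions in a line.
w_line = np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
])

# Only region 0 has a nonzero non-spatial component.
y = sar_reduced_form(w_line, xb=[1.0, 0.0, 0.0], delta=0.5)
```

Although only region 0 receives a direct impulse in this toy example, the solved outcomes of regions 1 and 2 are also positive, which is the spillover that a non-spatial panel model would miss.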

Variable Definitions and Data Description
The eight variables included in the SAR panel data model constructed in this study (Table 1) are as follows.

(1) Dependent variable: Smog pollution is the dependent variable in the proposed model. We selected the PM2.5 concentration, which is the main contributor to smog pollution and the pollutant that most concerns the Chinese public, as the measurement index. The original data come from the American Atmospheric Composition Analysis Group. The data are gridded datasets with a resolution of 0.01° × 0.01° [45]. We used ArcGIS to process the original data and match them with the maps of China's provinces to obtain the average PM2.5 concentration of each province. In this way, we obtained the panel dataset of 31 province-level administrative regions in China from 2000 to 2017. The advantage of these data is the comprehensive integration of satellite monitoring data and ground monitoring data, which can accurately and objectively reflect the true situation of China's smog pollution.

(2) Independent variable: Land urbanization is the independent variable that this article focuses on. With reference to Liu and Wang's approach [46,47], we used the land urbanization rate as the measure, calculated as follows:

$$landurban_{it} = \frac{Uarea_{it}}{Tarea_{it}} \times 100\%$$

where i is the region, t is the time, Uarea is the urban built-up area of each province, and Tarea is the total land area of each province.

(3) Control variables: To improve the accuracy of the model estimation results, we selected some relevant factors that may affect smog pollution as control variables. First, following the traditional stochastic impacts by regression on population, affluence, and technology (STIRPAT) model, we controlled for the impact of population, wealth, and technology on smog pollution [48,49]. We measured the demographic factor by the population per km², the wealth factor by per capita GDP, and the technical factor by the number of patent application authorizations. Second, informed by other studies, we considered the impact of industrial structure, education level, and degree of openness on smog pollution [50][51][52]. We measured industrial structure by the proportion of the added value of tertiary industry to GDP, education level by the number of college students per 10,000 people, and degree of openness by the proportion of total imports and exports to GDP.
After data collection, this study analyzed the panel dataset of 31 provincial administrative regions in China from 2000 to 2017. Due to the lack of data, the research area of this study does not include Hong Kong, Macau, and Taiwan. Table 2 lists the descriptive statistics and multicollinearity test results for all the original data. The variance inflation factors (VIF) of the independent variable and all the control variables are less than 10, indicating that the variables selected in this paper do not suffer from multicollinearity [26,53].

The areas with severe smog pollution are mainly concentrated in four regions: Central, North, East, and Northwest China. Northeast, Southwest, and South China have lower levels of smog pollution. In terms of changes, smog pollution is increasing rapidly in various areas, including Beijing, Tianjin, Shanghai, Shandong, Jiangsu, Anhui, and Jilin. The growth of PM2.5 in each of these areas exceeds 20 µg/m³, and in some areas even exceeds 50 µg/m³. In terms of land urbanization, both the mean and the growth increase from west to east. This finding is also consistent with the regional differences in China's economic development level.

Spatial Autocorrelation of Smog Pollution
We used Stata to measure the global Moran's I of China's PM2.5 concentration from 2000 to 2017. The test results reported in Table 3 show that the global Moran's I values over the years are roughly distributed between 0.4 and 0.5 and all pass the 1% significance test. Thus, China's smog pollution shows a significant positive spatial correlation. The spatial distribution of smog pollution in China presents "club" characteristics, with high values clustering with high values and low values clustering with low values. This result also indicates the necessity of adding spatial factors to the econometric model that examines the impact of land urbanization on smog pollution.

Table 4 reports the estimation results of the SAR panel data model under different conditions. Columns 1 and 2 are the estimation results without control variables. Column 2 differs from Column 1 in that it adds the quadratic term of the independent variable to the model to examine whether land urbanization has a nonlinear impact on smog pollution. Column 3 is the estimated result after adding control variables on the basis of the STIRPAT model, and Column 4 is the estimated result after adding all control variables to the model. From the results in Table 4, we can conclude the following: (1) Smog pollution has a significant positive spatial correlation. This result is consistent with Moran's I test result. The estimated coefficient value shows that an increase of 1 µg/m³ in the PM2.5 concentration in the neighborhood increases the local PM2.5 concentration by more than 0.7 µg/m³. (2) The estimation results in Columns 2-4 show that without the addition of control variables, the estimated coefficient of the first-order term of landurban is significantly positive, whereas the estimated coefficient of the quadratic term is significantly negative. After gradually adding the control variables, the estimated coefficient of the first-order term of landurban is still significantly positive, and the estimated coefficient of the quadratic term is still significantly negative. These results indicate that land urbanization has a nonlinear impact on smog pollution. Specifically, land urbanization and smog pollution have an inverted U-shaped relationship, that is, as the land urbanization rate increases, the level of smog pollution first rises and then falls.

The estimation results of the SAR panel data model show that land urbanization and smog pollution have an inverted U-shaped relationship. We aimed to find the inflection point of this inverted U-shaped relationship and identify the provinces that have crossed it. On the basis of the regression results in Table 4, we calculated the inflection point values of the inverted U-shaped curve in Columns 2-4, which are 17.96, 17.95, and 16.58, respectively. Comparing these values with the original data, we find that as of the end of the investigation period (2017), no area has crossed the inflection point and entered the stage where increasing land urbanization reduces smog pollution. Moreover, most areas are still far from the inflection point. This result indicates that although land urbanization and smog pollution have an inverted U-shaped relationship, China as a whole is still on the left side of this inverted U-shaped curve, that is, an increasing land urbanization rate aggravates smog pollution.
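For a quadratic specification, the inflection point follows from setting the derivative of the fitted curve with respect to landurban to zero: β1 + 2·β2·landurban = 0, so landurban* = −β1 / (2·β2). The short sketch below applies this formula; the function names and coefficient values are hypothetical and chosen for illustration only, and they are not the estimates reported in Table 4.

```python
def turning_point(beta1, beta2):
    """Turning point of beta1 * x + beta2 * x**2 for an inverted U (beta2 < 0)."""
    if beta2 >= 0:
        raise ValueError("an inverted U requires a negative quadratic coefficient")
    return -beta1 / (2.0 * beta2)

def passed_turning_point(landurban, beta1, beta2):
    """True if a region's land urbanization rate lies to the right of the peak."""
    return landurban > turning_point(beta1, beta2)

# Hypothetical coefficients, not taken from the paper's tables.
example_beta1, example_beta2 = 4.0, -0.125
example_peak = turning_point(example_beta1, example_beta2)

# Hypothetical land urbanization rates (in percent) for two regions.
left_of_peak = passed_turning_point(10.0, example_beta1, example_beta2)
right_of_peak = passed_turning_point(20.0, example_beta1, example_beta2)
```

Comparing each province's observed land urbanization rate with the computed peak, as in the last two lines, is the kind of check that identifies which regions, if any, lie on the declining side of the curve.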

Change the Regression Method
To verify the robustness of the conclusions drawn in Section 3.3, we performed a series of tests. First, we changed the regression method to re-estimate the coefficients of the independent variables in the proposed model. Table 5 lists the estimated results. Column 1 uses the ordinary least squares (OLS) method, which does not consider spatial correlation, and Columns 2-4 use three other spatial panel econometric models: the panel spatial error model (PSEM), which contains only the spatial autocorrelation of the error term; the panel spatial autocorrelation model (PSAC), which contains both the spatial lag of the explained variable and the spatial autocorrelation of the error term; and the panel spatial Durbin model (PSDM), which adds the spatial lag of the independent variables to the SAR panel data model. Table 5 shows that no matter which method is used, the estimated coefficient of the first-order term of the independent variable landurban, which is the focus of this study, is significantly positive, whereas the estimated coefficient of the quadratic term is significantly negative. According to the estimation results of OLS, PSEM, and PSAC, only Shanghai has passed the inflection point. The PSDM estimation results are consistent with those of the SAR panel data model, with no province having passed the inflection point.

Change the Spatial Weight Matrix
For a spatial econometric model, the choice of the spatial weight matrix has a great impact on the results. To test this influence, we re-estimated the SAR panel data model with different spatial weight matrices. Columns 1-2 of Table 6 report the estimated results using the distance-based and economic-based spatial weights. Table 6 shows that no matter which spatial weight matrix is used, the estimated coefficient of the first-order term of landurban is significantly positive, and the estimated coefficient of the quadratic term is significantly negative. Land urbanization and smog pollution thus maintain a significant inverted U-shaped relationship. The calculated inflection points show that only Shanghai has passed the inflection point and entered the right half of the inverted U-shaped curve. In general, this series of tests shows that the conclusions drawn in this article are robust.

The Inverted U-Shaped Relationship between Land Urbanization and Smog Pollution
The estimation results of the SAR panel data model show that land urbanization and smog pollution have an inverted U-shaped relationship. The results of this study are similar to previous research that examined other aspects of urbanization, such as population urbanization. For example, a study using a panel threshold model found the same inverted U-shaped relationship between the population agglomeration brought about by urbanization and smog pollution [54]. Based on a structural equation model, Li et al. (2017) also found that the impact of urban population density on smog pollution follows an inverted U-shape [55]. However, the results of this study also differ from some previous research. Wang (2019) found a significant U-shaped relationship between land urbanization and atmospheric ecological environment quality. The reason for this difference may be that he used SO2 emissions to define air pollution [56]. This study expands the research on the impact of urbanization on smog pollution and indicates that the relationship between land urbanization and smog pollution conforms to the EKC hypothesis, an important feature of economic development. Drawing on previous studies, we can discuss the mechanism behind this relationship. In periods when the land urbanization rate is low, the regional economic development level is low, the technical level is not high, the industrial structure is at a low level, and economic development mainly follows an extensive development model [56]. The central government assesses local governments with economic development indicators, such as GDP and taxes. To increase local GDP and tax revenue, local governments allocate their construction land quotas to industrial land in the secondary industry [57]. However, industrial land causes considerable air pollution [58].
In periods of high land urbanization rate, the local economy has developed to an advanced stage, and the development model is mainly characterized by intensive development [59]. The government and urban residents are aware of the importance of environmental protection, and the reality of high pollution forces the central government to issue environmental regulations and policies to restrict the development of industries in various regions. To meet the requirements of the central government's environmental regulations and policies, local governments' land and industrial policies favor the service industry and high-tech industries in the tertiary industry. Local governments also reduce the proportion of industrial land transfers and allocate considerable construction land indicators to tertiary industry land, thereby reducing pollutant emissions and realizing economic transformation and green development [56].

The Current Stage of China
By calculating the inflection point, we find that China is still at a stage where increasing land urbanization rate exacerbates smog pollution, although land urbanization and smog pollution have an inverted U-shaped relationship. The contradiction between urban development and smog pollution is still prominent, and thus resolving this contradiction is an important issue currently in China's urban construction. According to the theory of ecological modernization, urbanization essentially embodies the modernization of individuals and societies. The mitigation of environmental problems, such as air pollution, is the result of the joint efforts of various actors at multiple levels in the process of balancing economic development and environmental protection [60]. These actors include government, enterprises, and the public. To formulate a reasonable urban plan, the government cannot blindly pursue the expansion of the city scale [61]. Instead, policymakers need to focus on the compact and intensive development of cities [22]. In addition, a series of policies is needed to guide the development of urban industrial structure and promote the rational allocation of urban resources. Enterprises need to enhance their sense of social responsibility and continuously reduce the proportion of industries with high energy consumption, high pollution, and high emissions. In addition, they must strengthen the research and development of pollution control technologies to realize clean and sustainable production methods [62]. As far as the public is concerned, they must take on the responsibility of supervising the government and enterprises and promoting environmental protection awareness and knowledge. The public should not simply regard themselves as the victims of smog pollution, but more importantly, realize how to move from a comfortable but high energy consumption lifestyle to a green and sustainable way of life and consumption as citizens [63,64]. 
Through the joint efforts of different governance entities, the positive externalities of regional, social, and economic development brought about by land urbanization on smog pollution can be stimulated, and a win-win situation between urban development and smog governance can be realized.

The Positive Spatial Correlation of Smog Pollution in China
A series of spatial econometric methods indicate that China's smog pollution has a significant positive spatial correlation that shows agglomeration. High-pollution areas tend to be adjacent to other high-pollution areas, and low-pollution areas tend to be adjacent to other low-pollution areas. The results of the global Moran's I show that from 2000 to 2017, the positive spatial correlation of smog pollution fluctuated between 0.4 and 0.5, indicating that this spatial correlation has remained at a high level. The SAR panel data modelling of this study shows that an increase of 1 µg/m³ in the PM2.5 concentration in a neighboring region increases the local PM2.5 concentration by 0.7407 µg/m³. This study demonstrates that in the process of smog pollution control, local governments should not focus only on the remediation of "heavy-hit areas," such as North and Central China. They should also realize that the control of smog is not the responsibility of any single province [65]. The spatial correlation of smog pollution requires provinces to break down administrative barriers; strengthen regional cooperation and information sharing; promote the flow of technology, talent, capital, and other elements of smog governance among different regions; and improve the efficiency of smog governance to comprehensively address China's smog pollution problem [66,67].

Conclusions
This study adds the spatial effect between regions into the research framework of smog pollution control in China. On the basis of a panel dataset of 31 province-level administrative regions in China from 2000 to 2017, we investigated the impact of land urbanization on smog pollution. We constructed a spatial weight matrix and used Moran's I statistic and the SAR panel data model. The modelling of this study demonstrates that land urbanization and smog pollution have an inverted U-shaped relationship. With the advancement of land urbanization, an area's smog pollution first increases and then decreases. However, in general, China has not passed the inflection point and is still at a stage where increasing land urbanization rate aggravates smog pollution. Moreover, the country's smog pollution has a significant spatial positive correlation that shows agglomeration. That evidence of agglomeration means that an increase in smog pollution in one city or region will lead to an increase in smog pollution in neighboring cities and regions.
Although this study advances smog pollution research in China, it has some shortcomings. For example, due to the difficulty of collecting data, this article uses a panel dataset at the provincial level in China. If a city-level panel dataset can be used in future research, it will more objectively reflect the relationship between land urbanization and smog pollution in China. In addition, the existence of this inverted U-shaped relationship between land urbanization and smog pollution does not mean that the smog pollution problem will be automatically solved when the land urbanization rate exceeds a certain level. Instead, multiple environmental governance entities, including the government, enterprises, and the public, need to collaborate on measures to reduce smog pollution. In that context, future urban construction in China will need to integrate solutions that address the current nexus between urbanization and smog pollution to achieve green and sustainable development.
Author Contributions: Conceptualization, X.Y. and M.S.; data collection, X.Y. and W.S.; analyzed data, X.Y.; writing-original draft preparation, X.Y.; writing-review and editing, X.Y. and X.Z. All authors have read and agreed to the published version of the manuscript.

Acknowledgments:
We are grateful to all the anonymous reviewers for their comments and suggestions, which were valuable and helpful for revising and improving our paper and provide important guidance for our research.