Modeling the Determinants of PM2.5 in China Considering the Localized Spatiotemporal Effects: A Multiscale Geographically Weighted Regression Method

Yue, Han; Duan, Lian; Lu, Mingshen; Huang, Hongsheng; Zhang, Xinyin; Liu, Huilin

doi:10.3390/atmos13040627


by Han Yue ¹, Lian Duan ²,³,*, Mingshen Lu ², Hongsheng Huang ², Xinyin Zhang ³,⁴ and Huilin Liu ⁵

¹ Center of GeoInformatics for Public Security, School of Geography and Remote Sensing, Guangzhou University, Guangzhou 510006, China

² School of Natural Resources and Surveying, Nanning Normal University, Nanning 530100, China

³ Joint Research Center for Urban Health and Security Intelligent Data Analytics, Nanning Normal University, Nanning 530100, China

⁴ School of Environment and Life Science, Nanning Normal University, Nanning 530100, China

⁵ Scientific Research Academy of Guangxi Environmental Protection, Nanning 530022, China

* Author to whom correspondence should be addressed.

Atmosphere 2022, 13(4), 627; https://doi.org/10.3390/atmos13040627

Submission received: 8 March 2022 / Revised: 9 April 2022 / Accepted: 12 April 2022 / Published: 14 April 2022

(This article belongs to the Special Issue Spatio-Temporal Analysis of Air Pollution)


Abstract:

Many studies have identified the factors that influence PM2.5. However, very little research has addressed the spatiotemporal dependence and heterogeneity in the relationships between impact factors and PM2.5. This study first uses spatial statistics and time series analysis to investigate the spatial and temporal dependence of PM2.5 at the city level in China using a three-year (2015–2017) dataset. Then, a new local regression model, multiscale geographically weighted regression (MGWR), is introduced and used to measure the influences on PM2.5. A spatiotemporal lag is constructed and included in MGWR to account for spatiotemporal dependence and spatial heterogeneity simultaneously. Results of MGWR are comprehensively compared with those of ordinary least squares (OLS) and geographically weighted regression (GWR). Experimental results show that PM2.5 is autocorrelated in both space and time. Compared with existing approaches, MGWR with a spatiotemporal lag (MGWRL) achieves a higher goodness-of-fit and is more effective at eliminating residual spatial autocorrelation. Parameter estimates from MGWR demonstrate significant spatial heterogeneity, which traditional global models fail to detect. Results also indicate that MGWR can be used to generate local spatiotemporal dependence evaluations that are conditioned on various covariates rather than being simple descriptions of a pattern. This study offers a more accurate method to model geographic events.

Keywords:

PM2.5; spatial dependence; temporal dependence; spatial heterogeneity; multiscale geographically weighted regression

1. Introduction

Since the reform and opening-up in 1978, China has established a booming economy. However, air pollution has emerged alongside economic prosperity, and China’s poor air quality has been a recurring international headline since 2013 [1]. Particulate matter with a diameter smaller than 2.5 μm (referred to as PM2.5) is one of the various types of air pollutants. PM2.5 comes from many sources, such as power plants, agricultural burning, volcanic eruptions, dust storms, and motor vehicle emissions [2]. Numerous studies have documented the impacts of PM2.5 on atmospheric visibility, global climate change, and economic depression [3]. Moreover, PM2.5 exposure is linked with a range of health outcomes, such as respiratory-related mortality, cardiovascular-related diseases, and bladder cancer [4,5,6,7]. Therefore, it is necessary to explore the spatial and temporal distribution patterns of PM2.5 and evaluate the influences of relevant factors, so as to better understand its patterns of occurrence and formulate policies for pollution mitigation.

Spatial and temporal patterns of PM2.5 in China have been widely investigated. Zhou et al. [1] analyzed annual mean PM2.5 concentrations of Chinese cities and found that cities with high PM2.5 levels are mainly located in Hebei province and the Beijing–Tianjin–Hebei region, while cities with low PM2.5 levels are found in south-eastern coastal districts and Yunnan, Xizang, and Inner Mongolia. Their study also demonstrated the existence of two typical spatial effects of PM2.5 that are shared by many geographic events: spatial dependence and spatial heterogeneity. Spatial dependence means similar variable values are commonly observed in nearby geographical locations [8], while spatial heterogeneity means the process generating such observations varies over space [9]. Such spatial dependence and heterogeneity of PM2.5 are also found in other studies [3,10,11,12,13,14]. As to temporal patterns of PM2.5, several studies [1,12,15] found a consistent seasonal variation: the highest PM2.5 level occurs in winter while the lowest level appears in summer. From a monthly point of view, however, PM2.5 presents a U-shaped pattern with a decreasing trend from January to May and a relatively stable period from June to September, followed by an increasing trend from October to December [1,12,13]. In addition, air pollution is dependent over time, which means that cities with severe pollution in the current period tend to stay at high pollution levels in the next period or several future periods [16].

Apart from the spatial and temporal characteristics of PM2.5, many studies have identified its driving factors. Meteorological conditions, such as temperature, wind velocity, and humidity, can produce different effects on PM2.5 [17,18,19,20]. Socioeconomic features, such as total population, population density, urbanization rate, gross domestic product (GDP), per capita GDP, the share of secondary industry in GDP, and energy consumption, have also been shown to increase or decrease PM2.5 levels [1].

Various methods have been used to measure the associations between impact factors and PM2.5. The most commonly used methods are non-spatial models, such as classical multivariate linear regression [19,20,21,22,23], generalized additive models [18,24], econometric analysis [3], and input-output structural decomposition analysis [25]. These models, however, account for neither the spatial dependence nor the spatial heterogeneity of PM2.5, which may generate biased and inconsistent estimation results [13,14,21]. Therefore, various spatial statistical methodologies have been proposed in the last few decades to consider spatial effects when analyzing geographic events [26]. These methods, however, were developed to account for either spatial dependence or spatial heterogeneity, not both. For example, global spatial regression methods, such as the spatial error model (SEM) [1,21] and spatial lag model (SLM) [13,21], are frequently used to tackle spatial dependence. Studies have demonstrated that SEM and SLM are statistically more rigorous than OLS as they have higher explanatory power and can reduce residual spatial autocorrelation [1,13,21]. Global regression models, e.g., OLS, SEM, and SLM, generate spatially fixed parameter estimates, assuming that relationships between PM2.5 and impact factors are constant over space. This assumption may not hold because the influence of a variable on PM2.5 is probably not stationary but changes with spatial context. For example, wind can alleviate air pollution by dispersing and diluting air pollutants [27]. However, in desert areas, it may contribute to air pollution by blowing soil and dust into the air [28]. Without considering local variations, global methods may generate biased results [9]. GWR is a common method to deal with spatial heterogeneity [6,11,27]. By embedding spatial location data into regression parameters, GWR calibrates an individual set of parameter estimates at each location. These local parameter estimates indicate how relationships vary over space [9]. Many studies have used GWR to measure the spatial variability of associations between PM2.5 and relevant socioeconomic variables [6,12].

Although empirical applications have verified the validity of spatial models in explaining spatial dependence or spatial heterogeneity, existing studies still have shortcomings. Spatial dependence and spatial heterogeneity usually co-occur in many spatial processes [29]. At this point, precise parameter estimates cannot be determined by either global or local spatial regression models which account for one effect in isolation from the other [30]. Mixed models tackling both spatial effects have received less attention, particularly in the field of environmental analysis. Considering this deficiency, the primary interest of this study is to account for both spatial effects of PM2.5 by including a spatiotemporal lag into the calibration of a recently proposed local model, namely MGWR.

This research extends previous studies in three aspects. First, it uses classical OLS to explore the driving factors of city-level PM2.5 at a global scale and MGWR to detect potential spatial non-stationarity by providing a separate set of parameter estimates at each location. Second, a spatiotemporal lag is constructed and included in the models to consider spatial and temporal dependence simultaneously. The performances of models with and without the spatiotemporal lag are then comprehensively compared. Third, based on a three-year PM2.5 dataset, the stability of model performance and the spatial and temporal variation of relationships between socioeconomic factors and PM2.5 are examined. By using a new local spatial regression model and accounting for both spatial effects simultaneously, our study is expected to generate more accurate results and provide implications for future research.

The rest of the paper is organized as follows. Section 2 introduces the data sources and methods used in the analysis. Section 3 presents and discusses the results. Section 4 summarizes the strengths and limitations of the study and directions for future research.

2. Materials and Methods

2.1. China’s City-Level PM2.5

This study investigates city-level PM2.5 in China. Based on data collected by air-quality monitoring stations, the Ministry of Ecology and Environment of PRC issues a daily air quality forecast as well as levels of various air pollutants for each city [31]. We obtained the monthly mean PM2.5 level of each city by averaging daily PM2.5 levels in each month. Finally, we obtained a data set from 288 cities (see Figure 1) between 2015 and 2017. The dependent variable is then the monthly mean PM2.5 level in each city, and variation of PM2.5 level across cities is the subject of interest. More specifically, why are PM2.5 levels larger in some cities than others?

2.2. Independent Variables

Eight variables are selected based on previous studies. Descriptive statistics of dependent and independent variables are summarized in Table 1. Meteorological data (wind and rain) are provided by the National Meteorological Information Center [32]. Socio-demographic data (urban, popd, pcgdp, scgdp, dust, and psg) are derived from the China City Statistical Yearbook [33].

2.3. Global and Local Moran’s I

We use Moran’s I to measure the spatial dependence of city-level PM2.5 [34]:

I = \frac{n \sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{i j} (y_{i} - \bar{y}) (y_{j} - \bar{y})}{(\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{i j}) \sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(1)

where y_i and y_j represent the PM2.5 levels of city i and city j, respectively, \bar{y} represents the average of y, and n is the total number of cities. w_ij is the element of the spatial weight matrix that measures the spatial relationship between cities i and j; the queen contiguity method is used to construct this matrix. The value of Moran’s I ranges from −1 to 1: I < 0 indicates negative spatial dependence, i.e., PM2.5 levels of neighboring cities tend to be different; I > 0 means positive spatial dependence, i.e., PM2.5 levels of neighboring cities tend to be similar; I = 0 means complete randomness. A larger absolute value of Moran’s I means a stronger dependence. The statistical significance of Moran’s I is tested by a z-score:

z = \frac{I_{O} - I_{E}}{S D_{I}}

(2)

where I_O, I_E, and SD_I denote the observed value, expected value, and standard deviation of Moran’s I, respectively. Detailed descriptions can be found in [35].
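As an illustration of Equations (1) and (2), the following minimal Python sketch computes global Moran’s I from a value vector and a weight matrix. The function names are hypothetical, and the z-score here is obtained from random permutations rather than from the analytical expected value and standard deviation described in [35]; this is a swapped-in approximation for illustration, not the procedure used in the study.

```python
import numpy as np

def morans_i(y, w):
    """Global Moran's I (Equation (1)) for values y and an n x n weight matrix w."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.size
    dev = y - y.mean()                           # y_i - y_bar
    num = n * (w * np.outer(dev, dev)).sum()     # n * sum_i sum_j w_ij dev_i dev_j
    den = w.sum() * (dev ** 2).sum()
    return num / den

def permutation_z_score(y, w, n_perm=999, seed=0):
    """z = (I_O - I_E) / SD_I with I_E and SD_I estimated by permuting y.

    Assumption: permutation inference replaces the analytical moments.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = morans_i(y, w)
    sims = np.array([morans_i(rng.permutation(y), w) for _ in range(n_perm)])
    return (observed - sims.mean()) / sims.std(ddof=1)
```

For four cities arranged in a chain (each city adjacent only to the next), values that rise steadily along the chain give a positive I, while values that alternate between low and high give I = −1.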

In order to measure local spatial dependence of geographic events, Anselin extended Moran’s I to a local scale [36]. Local Moran’s I is formulated as:

I_{i} = \frac{n (y_{i} - \bar{y}) \sum_{j = 1}^{n} w_{i j} (y_{j} - \bar{y})}{\sum_{j = 1}^{n} {(y_{j} - \bar{y})}^{2}}

(3)

where I_i is the local Moran’s I of city i, and other symbols have the same meanings as those in Equation (1). A local z-score can be determined to examine the statistical significance of the index. In our case, local Moran’s I determines the degree to which a specific city’s PM2.5 level is similar to those of its nearby cities. More specifically, this local index can distinguish between a statistically significant cluster of high values (High-High), a cluster of low values (Low-Low), an outlier in which a high value is surrounded by low values (High-Low), and an outlier in which a low value is surrounded by high values (Low-High).
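Equation (3) and the four cluster types can be sketched as follows. This is a minimal illustration with hypothetical function names; it assigns each city to a quadrant from the signs of its own deviation and its neighbors’ weighted deviation, and it omits the local z-score, so the labels are not filtered by statistical significance as they would be in a full analysis.

```python
import numpy as np

def local_morans_i(y, w):
    """Local Moran's I (Equation (3)) for every city."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    dev = y - y.mean()
    lag = w @ dev                                # sum_j w_ij (y_j - y_bar)
    return y.size * dev * lag / (dev ** 2).sum()

def quadrant_labels(y, w):
    """Label each city High-High, Low-Low, High-Low, or Low-High.

    Assumption: no significance filtering is applied here.
    """
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    lag = np.asarray(w, dtype=float) @ dev
    high = np.where(lag >= 0, "High-High", "High-Low")
    low = np.where(lag >= 0, "Low-High", "Low-Low")
    return np.where(dev >= 0, high, low)
```

With these definitions, the local values sum to the global Moran’s I multiplied by the sum of all weights, which offers a quick consistency check against Equation (1).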

2.4. Rank von Neumann Ratio Test and Sample Autocorrelation Function

Temporal dependence measures the similarity of observations at neighboring points in time. We used two typical time series analysis methods to evaluate the temporal dependence of monthly mean PM2.5 in China, with a comparison of the results from each. The monthly mean PM2.5 level of China in one month was derived by averaging the monthly mean PM2.5 levels of each city in that month.

2.4.1. Rank von Neumann Ratio Test

The rank von Neumann ratio test is a non-parametric test of temporal autocorrelation [37,38]. In our case, it is derived as follows:

(1) Sort PM2.5 levels from lowest to highest and assign a unique rank R_i to the ith observation.

(2) Rearrange PM2.5 levels and corresponding ranks in chronological order. Calculate rank von Neumann ratio using the following formula:

v = \frac{\sum_{i = 2}^{T} {(R_{i} - R_{i - 1})}^{2}}{\frac{T (T^{2} - 1)}{12}}

(4)

where T indicates sample size (36 in our case as there are 36 months in total).

(3) Determine the lower critical value based on the sample size (T) and significance level (α). If the calculated ratio v is smaller than the critical value, the observed sequence is considered to be significantly autocorrelated in time. Otherwise, observations are considered to be independent of each other in time.
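Steps (1) and (2) above can be sketched in a few lines of Python. The function name is hypothetical, the sketch assumes there are no tied values (so each observation receives a unique rank), and the critical value in step (3) still has to be looked up in published tables for the given T and α.

```python
import numpy as np

def rank_von_neumann_ratio(series):
    """Rank von Neumann ratio v (Equation (4)) of a chronologically ordered series."""
    x = np.asarray(series, dtype=float)
    T = x.size
    # argsort of argsort gives each observation's rank (1 = lowest),
    # already arranged in chronological order; assumes no ties.
    ranks = np.argsort(np.argsort(x)) + 1
    numerator = np.sum(np.diff(ranks) ** 2)      # sum_{i=2}^{T} (R_i - R_{i-1})^2
    denominator = T * (T ** 2 - 1) / 12.0
    return numerator / denominator
```

A steadily increasing series yields a small ratio (strong positive temporal autocorrelation), whereas a series that jumps between low and high values yields a large one.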

2.4.2. Sample Autocorrelation Function

The rank von Neumann ratio test offers a single value to measure the similarity of consecutive observations in time. The sample autocorrelation function, in contrast, measures the temporal dependence of observations at different intervals. First, observations are sorted in time order: y₁, y₂, …, y_T. Then, a set of pairs (y_i, y_{i+k}) for i = 1, …, T−k is constructed, where k represents a specified interval. The kth-order sample autocorrelation coefficient is calculated as:

r_{k} = \frac{\sum_{i = 1}^{T - k} (y_{i} - \bar{y}) (y_{i + k} - \bar{y})}{\sum_{i = 1}^{T} {(y_{i} - \bar{y})}^{2}}

(5)

where r_k means kth-order temporal dependence, namely, the correlation between pairs of observations separated in time by k interval(s). The statistical significance of the coefficient at 95% level can be determined by checking whether it is out of the confidence limit [−2/\sqrt{T}, 2/\sqrt{T}]. Coefficients outside this range are considered to be statistically significant [39].
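A minimal sketch of Equation (5) and the approximate 95% confidence limits is given below. The function name and the synthetic 36-month cosine series in the usage note are illustrative assumptions, not the study’s data.

```python
import numpy as np

def sample_acf(series, max_lag):
    """Return (r, bound): r[k-1] is the kth-order coefficient, bound = 2/sqrt(T)."""
    y = np.asarray(series, dtype=float)
    T = y.size
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    # r_k = sum_{i=1}^{T-k} (y_i - y_bar)(y_{i+k} - y_bar) / sum_i (y_i - y_bar)^2
    r = np.array([np.sum(dev[:T - k] * dev[k:]) / denom
                  for k in range(1, max_lag + 1)])
    return r, 2.0 / np.sqrt(T)
```

For a purely seasonal synthetic series such as `np.cos(2 * np.pi * np.arange(36) / 12)`, the coefficient at k = 6 falls below the lower limit and the coefficient at k = 12 rises above the upper limit, which is the kind of cosinusoidal signature that a seasonal cycle produces.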

2.5. Multiscale Geographically Weighted Regression

MGWR is a recently proposed extension of GWR. Before presenting MGWR, it is helpful to first introduce classic OLS and traditional GWR in the context of PM2.5 study. Classic ordinary least squares (OLS) is formulated as:

\log y_{i} = β_{0} + \sum_{j} β_{j} \log x_{i j} + ϵ_{i}

(6)

where i denotes a city, y_{i} denotes the PM2.5 level of city i, x_ij means the jth explanatory variable of city i, β_{*} denotes unknown parameters to be estimated which indicate associations between PM2.5 and covariates, and ϵ_i is an error term. This study utilizes logged forms of variables to make parameter estimates independent of units and make comparison easier.
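The log-log specification in Equation (6) can be illustrated with a short least-squares sketch. The function name and the synthetic data in the usage note are assumptions for illustration only; the sketch also assumes all variables are strictly positive so that logarithms are defined.

```python
import numpy as np

def fit_loglog_ols(y, X):
    """Estimate [beta_0, beta_1, ..., beta_p] in log y = beta_0 + sum_j beta_j log x_j.

    y: (n,) positive responses; X: (n, p) positive covariates.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.log(X)])   # intercept + logged covariates
    beta, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    return beta
```

Because both sides are logged, each slope can be read as an elasticity: a 1% change in a covariate is associated with approximately a β_j% change in the response, regardless of the covariate’s original unit. For example, data generated as y = e · x₁^0.5 · x₂^(−0.2) would return estimates close to [1, 0.5, −0.2].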

Global models generate a single set of parameter estimates across all observations assuming relationships between covariates and PM2.5 are invariant over space, which ignores any potential spatial non-stationarity of these relationships. GWR extends OLS by calibrating a regression equation for each location (termed as regression point) with the aid of data in nearby locations (termed as data points) [9]:

\log y_{i} = β_{0} (u_{i}, v_{i}) + \sum_{j} β_{j} (u_{i}, v_{i}) \log x_{i j} + ϵ_{i}

(7)

where (u_i, v_i) represents coordinates of the centroid of city i. A critical bandwidth b is determined by minimizing the corrected Akaike information criterion (AICc). Data points within b are weighted by distance-decay functions, such that data near the regression point are assigned with larger weights than data farther away. The optimal value of b is termed as the optimal bandwidth or scale at which independent variables affect dependent variables.
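The local calibration in Equation (7) amounts to one weighted least-squares fit per regression point. The sketch below is a simplified illustration under stated assumptions: the function names are hypothetical, a fixed bandwidth b is supplied by the caller instead of being selected by minimizing AICc, a bi-square kernel is assumed as the distance-decay function, and the inputs are expected to be already log-transformed.

```python
import numpy as np

def bisquare(d, b):
    """Bi-square distance-decay weights: positive inside bandwidth b, zero outside."""
    return np.where(d < b, (1.0 - (d / b) ** 2) ** 2, 0.0)

def gwr_fit(coords, X, y, b):
    """Return an (n, p + 1) array of local estimates, one row per regression point.

    Assumptions: fixed bandwidth b, bi-square kernel, Euclidean distances.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), np.asarray(X, dtype=float)])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)   # distances to regression point i
        sw = np.sqrt(bisquare(d, b))                     # sqrt of weights for weighted LS
        betas[i], *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return betas
```

When the underlying relationship is the same everywhere, every row of the result reproduces the global coefficients; spatial heterogeneity shows up as variation between rows. A dedicated package would normally be preferred in practice, since it also handles bandwidth selection and inference.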

GWR still has a limitation because it generates a single optimal bandwidth for all covariates, meaning processes generating associations between the dependent variable and all independent variables operate at the same spatial scale. It is possible that some variables impact the dependent variable at local scales, while other variables impact the dependent variable at regional or global scales. To account for this possibility, MGWR extends GWR by relaxing the “same spatial scale” assumption, and it is formulated as [40]:

\log y_{i} = β_{b w_{0}} (u_{i}, v_{i}) + \sum_{j} β_{b w_{j}} (u_{i}, v_{i}) \log x_{i j} + ϵ_{i}

(8)

where bw* denotes the specific optimal bandwidth used to calibrate the conditional relationship between the *th covariate and the dependent variable. MGWR allows different processes to operate at different spatial scales by deriving an individual bandwidth for each covariate. As in GWR, these optimal bandwidths are determined by minimizing AICc.

By producing local rather than global parameter estimates, GWR and MGWR allow relationships between covariates and PM2.5 to vary across space. The spatial heterogeneity of these associations is thus identified by these location-specific parameter estimates. Additionally, local regression models usually face multiple testing issues, which have been ignored by most previous studies. In this study, we use the da Silva and Fotheringham adjustment for multiple hypothesis testing [41]. This method enhances the reliability of the results when testing the statistical significance of parameter estimates at each location.

2.6. Spatio-Temporal Lag

Spatial and temporal dependence violates the assumption of traditional regression models that observations are independent of each other. A spatiotemporal lag is constructed and included in models as an independent variable to account for these effects. The lag variable is formulated as:

\mathrm{stlag}(i, t) = \frac{\sum_{j} W_{i j} \times y_{j (t - 1)}}{\sum_{j} W_{i j}}

(9)

where stlag(i, t) represents the spatiotemporal lag of city i at time t, while y_j(t−1) means the dependent variable of city j at time t − 1. In line with previous research [30,42], this study limits temporal lag to one time-interval (t − 1). Intuitively, a variable is related most directly to the last observation. W_ij is the weight of city j, constructed by a bi-square kernel function:

W_{i j} = \begin{cases} {[1 - {(\frac{d_{i j}}{b})}^{2}]}^{2} & \text{if } d_{i j} < b \\ 0 & \text{otherwise} \end{cases}

(10)

where d_ij is the distance between city i and j, and b is a critical distance. This study sets b to 100 km, which is an approximation of the average first-neighbor distance (96 km).
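Equations (9) and (10) can be combined into a short sketch that builds the lag variable for one month from the previous month’s values. The function names are hypothetical. Two choices here are assumptions not stated in the text: a city is excluded from its own lag (its diagonal weight is set to zero), and cities with no neighbor inside the critical distance receive NaN.

```python
import numpy as np

def bisquare_weights(dist, b):
    """Equation (10): bi-square weights, zero at or beyond the critical distance b."""
    return np.where(dist < b, (1.0 - (dist / b) ** 2) ** 2, 0.0)

def spatiotemporal_lag(dist, y_prev, b=100.0):
    """Equation (9): weighted mean of neighbors' values at time t - 1.

    dist: (n, n) inter-city distances in km; y_prev: (n,) values at t - 1.
    """
    W = bisquare_weights(np.asarray(dist, dtype=float), b)
    np.fill_diagonal(W, 0.0)                     # assumption: a city is not its own neighbor
    y_prev = np.asarray(y_prev, dtype=float)
    num = W @ y_prev
    den = W.sum(axis=1)
    # Assumption: cities with no neighbor inside b get NaN instead of a lag value.
    return np.divide(num, den, out=np.full_like(num, np.nan), where=den > 0)
```

For three cities located 0, 50, and 200 km along a line with b = 100 km, the first two cities are each other’s only neighbor, so each takes the other’s previous-month value as its lag, while the third city has no neighbor within range.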

3. Results and Discussion

3.1. Spatial Patterns of PM2.5

Spatial patterns of PM2.5 levels are mostly consistent across the three years from 2015 to 2017 (see Figure 2). Cities in the north, especially core cities in the Beijing–Tianjin–Hebei (BTH) region, demonstrate the highest PM2.5 levels. Cities with higher PM2.5 levels are mostly found around the BTH region, as well as in the northeast area. Cities in the southeast coastal areas, northeast corner, and Yunnan Province demonstrate relatively low PM2.5 levels. This agglomeration of PM2.5 has also been reported in previous studies [1,3,43]. The BTH region is the political and cultural center of China and is subject to faster urbanization and industrialization. The prosperous economy in this region attracts a large number of people to work and live there. These people consume an enormous amount of energy and release massive amounts of pollutants, which may lead to high levels of PM2.5.

As listed in Table 2, the global Moran’s I of PM2.5 level in 2015, 2016, and 2017 is 0.736, 0.731, and 0.681, respectively. These values are much larger than the expected values, indicating that PM2.5 is significantly positively autocorrelated in space. Furthermore, the spatial dependencies of PM2.5 at a local scale are presented in Figure 3. A similar local spatial aggregation pattern can be observed across years: a sizeable high-high cluster is located in the center, including the BTH region, East China, Central China, and part of Northwest China. Low-low clusters are distributed mainly in three parts: the southern coastal area, some regions in the northeast, and several cities in the northwest (which disappear in 2016). A few high-low clusters are scattered in the south and northeast, while a few low-high clusters are distributed around high-high clusters.

3.2. Temporal Patterns of PM2.5

Figure 4 presents the monthly average of PM2.5 levels. There is a clear periodicity: peaks appear in winter months (December, January, and February), while troughs appear in summer months (June, July, and August). This pattern accords with the seasonal fluctuation of energy consumption [44]. In winter, people consume more energy (e.g., coal gas and liquefied petroleum gas) for heating, which leads to more air pollutant emissions. Biomass burning is another contributing factor to air pollution in winter [45].

The result of the rank von Neumann ratio test is v = 0.38 (statistically significant at p = 0.005 level), which is much smaller than the critical value 1.18. Therefore, the data sequence is significantly autocorrelated in time, i.e., PM2.5 levels in neighboring months tend to be similar.

Figure 5 presents the distribution of sample autocorrelation coefficients (r_k) against the lag number (k). The approximately cosinusoidal shape reflects a seasonal pattern of monthly mean PM2.5 levels. When k = 1, r_k reaches a maximum of 0.72, which is far beyond the upper confidence limit (95%). This large first-order temporal autocorrelation coefficient demonstrates the existence of a significantly positive autocorrelation between PM2.5 levels of consecutive months. When k = 6, r_k is −0.65, which is beyond the lower confidence limit (95%). This demonstrates the existence of a significantly negative autocorrelation between the PM2.5 levels of two months that are separated by six months. When k = 12, r_k becomes significantly positive again, meaning that the PM2.5 levels of two months that are separated by twelve months are similar. These results are consistent with the monthly mean PM2.5 levels shown in Figure 4.

After inspecting the spatial and temporal patterns of PM2.5, we calibrate a series of regression models to identify the determinants of PM2.5. OLS, GWR, and MGWR models are fitted first. Then, OLS with a spatiotemporal lag (OLSL), GWR with a spatiotemporal lag (GWRL), and MGWR with a spatiotemporal lag (MGWRL) are fitted to take spatial dependence and temporal dependence into consideration. All models are calibrated separately for each month between 2015 and 2017 except January 2015 (the lag variable is not available for this month as it is constructed from the previous month’s data). This yields results for 35 months. In the following sections, we first make a comprehensive comparison between the performances of the six models. Then, we present and compare parameter estimates of two representative global and local models.

3.3. Model Comparison

3.3.1. Goodness-of-Fit

Two common criteria, the coefficient of determination (R²) and the corrected Akaike Information Criterion (AICc), are used to evaluate model goodness-of-fit. As presented in Figure 6a, OLS has the lowest R² in all months. After incorporating the spatiotemporal lag, the R² of OLSL rises substantially. GWR achieves a larger R² than OLSL in most months, while the R² of GWRL is even larger. The R² of MGWR lies between those of GWR and GWRL in each month. Finally, MGWRL achieves the highest R² in all months, ranging between 0.779 and 0.923.

A lower AICc indicates a better fit, and two AICc values with a difference of three or more generally indicate a significant difference between model goodness-of-fit [30]. As presented in Figure 6b, the reduction in AICc of GWR relative to OLS ranges from 51 to 254, and the reduction in AICc of MGWR relative to GWR ranges from 19 to 106. Therefore, GWR consistently outperforms OLS, and MGWR outperforms GWR. Similarly, MGWRL is superior to GWRL, and GWRL is superior to OLSL.

Overall, the comparison of model goodness-of-fit comes to three conclusions. First, local models (e.g., GWR, MGWR, GWRL, MGWRL) outperform global models (e.g., OLS, OLSL). Second, MGWR outperforms GWR. Third, after including a spatiotemporal lag, all the global and local models achieve better performances.

3.3.2. Residual Spatial Autocorrelation

Another important criterion for evaluating the performance of spatial regression models is the presence of residual spatial autocorrelation. The residuals of an ideal model should not present any strong spatial dependence. Moran’s I is used to measure residual spatial autocorrelation. As there are 35 datasets and six models, it is impractical to present all results, so this study only lists the results for 2017, as shown in Table 3. Residuals of OLS and OLSL are significantly positively autocorrelated in space in all months. Residuals of GWR also present significant spatial dependence in all months except November, but to a lesser extent than those of OLS and OLSL. Residuals of GWRL are significantly positively autocorrelated in space in six months. MGWR and MGWRL perform better than the other models in reducing residual spatial autocorrelation, as their residual Moran’s I is not statistically significant in any month.

Figure 7 visualizes the residuals of the six models in December 2017 to give an intuitive view of their spatial distributions. As shown in Figure 7a, the spatial clustering pattern of the OLS residuals is distinct. Positive residuals are clustered in central areas, while negative residuals are mostly found in peripheral areas. The vast positive cluster of residuals in the central area disappears in OLSL, but there are still apparent positive clusters in the west, east, northeast, and south areas and negative clusters located between positive clusters, as shown in Figure 7b. Residuals of GWR and GWRL are scattered in space, but there are still some small-scale clusters. Residuals of MGWR and MGWRL, however, show no apparent clusters, as positive and negative values tend to alternate in space. In general, global models are not effective at reducing residual dependence, GWR and GWRL can reduce but cannot eliminate residual dependence, and MGWR and MGWRL can effectively eliminate residual dependence.

In general, after taking spatial and temporal dependence into consideration, the performances of OLS, GWR, and MGWR improve significantly. MGWRL is the most efficient model because it not only achieves the highest goodness-of-fit, but also eliminates residual spatial dependence. These results demonstrate the superiority of MGWR and the validity of accounting for spatial and temporal dependence by a spatiotemporal lag.

3.4. Interpret Parameter Estimates

This section presents a comparison between parameter estimates of two representative models, OLSL and MGWRL.

3.4.1. Results of OLSL

Table 4 lists parameter estimates of OLSL from January 2017 to December 2017. It is noteworthy that stlag presents a consistently positive effect on PM2.5. When stlag increases by one point, the increase in PM2.5 level ranges from 0.662 points (March 2017) to 1.035 points (January 2017), which is much larger than the effects of other factors. The temporal variation of the parameter estimate of stlag presented in Figure 8a generally conforms to that of the spatial autocorrelation (Moran’s I) of PM2.5 presented in Figure 8b. A large degree of spatial autocorrelation means that the PM2.5 level of a city is largely influenced by those of neighboring cities, which is reflected by a large parameter estimate of stlag. In general, the significant positive effect of stlag demonstrates the existence of spatial and temporal dependence of PM2.5. More specifically, the PM2.5 level of a city depends on the PM2.5 levels of neighboring cities in the previous month.

Scgdp is another factor that demonstrates a significant positive effect on PM2.5 in all months. When scgdp increases by one point, the increase in PM2.5 level ranges from 0.034 points (May 2017) to 0.299 points (November 2017). The contribution of secondary industry to GDP in China in 2016 is 39.8%, which is much larger than that of developed economies, such as America (18.88%), the European Union (24.5%), and Japan (26.8%) [45]. Secondary industry in China includes, but is not limited to, steel production, automobile manufacturing, and chemical production. These are typically heavy polluters of the atmosphere. A higher share of secondary industry has been widely demonstrated to cause more severe air pollution [3,12].

Popd, dust, and psg are positively associated with PM2.5, although these associations are not significant in every month. The positive effect of population density (popd) on air pollution (0.002–0.087) is easy to understand since more people means more energy consumption and more exhaust emissions [1,13]. Various industries, including the construction industry, chemical industry, and metallurgical industry, produce a great deal of dust in the process of production and transportation. The harmful influence of industrial dust on air quality has been confirmed by previous studies [12,46]. A larger volume of highway passenger traffic (psg) means more vehicular fuel consumption and more automobile exhaust emissions. These exhausts are important sources of PM2.5 [14].

The influences of other variables on PM2.5 are inconsistent over time. Wind velocity (wind) is negatively associated with PM2.5 in most months. Fast wind can increase horizontal mixing and is generally expected to help disperse and dilute air pollutants [27,47]. Rainfall (rain) also presents a negative effect on PM2.5 in most months. The reason for this may be that rain can wash accumulated pollutant masses down to the ground [28,47]. Urbanization rate (urban) is significantly positively associated with PM2.5 in a few months. The rapid urbanization process is usually accompanied by a series of environmental problems in the U.S. and Europe [48,49], as well as in China [11,50]. Per capita GDP (pcgdp) demonstrates a significant negative effect on PM2.5 levels in several months. Pcgdp indicates the economic growth and prosperity of a city. Previous studies demonstrated that cities with higher per capita GDP usually have better air quality because a high degree of affluence means greater awareness of and better capability for environmental protection [1,12].

3.4.2. Results of MGWRL

Compared with global models, which generate a single parameter estimate for each covariate across all locations, local models produce different parameter estimates at different locations. Three variables, namely popd, scgdp, and stlag, are selected for further discussion because they present relationships with PM2.5 that are consistent over time, as demonstrated by OLSL. It is worth examining whether these relationships are also consistent over space. One advantage of local models is that parameter estimates can be mapped, so that their spatial variation can be observed directly. However, the abundant information generated by local models poses a challenge for presenting results. Here, the discussion is limited to parameter estimates of popd, scgdp, and stlag in three consecutive months, namely December 2016, January 2017, and February 2017.
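To make the contrast between global and local estimation concrete, the following minimal sketch fits a separate kernel-weighted least-squares regression at each location. It illustrates a basic GWR with one fixed Gaussian bandwidth, not the MGWR calibration used in this study, which allows a different bandwidth for each covariate. The function name `gwr_local_coefs`, the bandwidth value, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def gwr_local_coefs(coords, X, y, bandwidth):
    """Return an (n, k+1) array of local coefficients (intercept first).

    Basic GWR sketch: one Gaussian-kernel weighted least-squares fit per
    location, with a single fixed bandwidth shared by all covariates.
    """
    n = coords.shape[0]
    Xd = np.column_stack([np.ones(n), X])               # add intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)  # distances to location i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)         # Gaussian kernel weights
        XtW = Xd.T * w                                  # X'W without forming W
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)   # weighted normal equations
    return betas

# Synthetic example (hypothetical data): the true slope rises from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(300, 2))
x = rng.normal(size=300)
true_slope = 0.1 + 0.1 * coords[:, 0]
y = 1.0 + true_slope * x + rng.normal(scale=0.05, size=300)

local = gwr_local_coefs(coords, x, y, bandwidth=1.5)
global_slope = np.polyfit(x, y, 1)[0]                   # single global OLS slope
west = local[coords[:, 0] < 3, 1].mean()
east = local[coords[:, 0] > 7, 1].mean()
print(f"global slope: {global_slope:.3f}, west: {west:.3f}, east: {east:.3f}")
```

The global fit returns one slope for the whole study area, whereas the local fits recover the west-to-east gradient. This is the kind of spatial variation that maps of local parameter estimates are meant to reveal.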

Figure 9a,b shows that the association between population density (popd) and PM2.5 level presents a similar spatial pattern in December 2016 and January 2017. Significant positive influences of popd are found in the northeast, north, and southwest. When popd increases by one point, the PM2.5 level in these cities increases by 0.15 to 0.30 points. The effect of popd on PM2.5 exhibits a different spatial distribution in February 2017. The positive impacts in the north and southwest become nonsignificant, while significant positive impacts are found in the northeast and southeast, as presented in Figure 9c. One reason for the difference might be the annual mass migration of people during the Spring Festival, which usually falls in February. Living in the economic center of China, people in the southeast have strong economic capacity for leisure and entertainment. Moreover, with warm weather, human activities such as road trips and visits to friends increase during this long vacation, which may lead to the emission of more air pollutants.

As presented in Figure 9d–f, secondary industry as a percentage of GDP (scgdp) demonstrates a positive impact on PM2.5 in all cities, but significant associations are mostly found in the northeast (0.07–0.12). With a favorable geographic location and rich natural resources, the northeast region contains the heavy industry base of China. The average value of scgdp in this area is more than 40%, which is much higher than that of developed countries. Secondary industries such as mining, metal extraction, and manufacturing depend heavily on the consumption of coal and crude oil. This large energy consumption leads to the emission of air pollutants, such as sulfur dioxide and carbon compounds, which explains why scgdp is positively related to the PM2.5 level in every city, and especially so in northeast China.

Local parameter estimates for the spatiotemporal lag (stlag) are presented in Figure 9g–i. In this case, MGWRL offers a method for capturing localized spatiotemporal dependence or autocorrelation, with the advantage that these localized measurements account for the influences of the other covariates in the model, whereas traditional measures of spatial dependence do not [42]. Results demonstrate that the spatiotemporal lag has a consistent, significant positive effect on PM2.5 in all cities. When stlag increases by one point, the PM2.5 level increases by 0.48 to 1.10 points. These large parameter estimates indicate that the PM2.5 level of a city is heavily influenced by those of neighboring cities in the previous month. However, the strength of the lag effect is not stationary over space. Cities that are most affected by stlag are located mainly in the Beijing–Tianjin–Hebei region, while cities with a comparatively weak impact of stlag are found in the northeast and southeast. The strong effect of spatiotemporal dependence can be understood from two perspectives. First, neighboring cities usually have similar energy and industry structures. For example, the economic growth of cities in Shanxi province (located northwest of the Beijing–Tianjin–Hebei region) is heavily dependent on the coal mining industry, which is an important source of PM2.5. Such similarity may result in similarly high PM2.5 levels in these cities, so PM2.5 levels tend to be strongly autocorrelated in these areas. Second, neighboring cities usually share similar geographical environment characteristics. Cities in North China, for instance, are close to deserts, and the frequent sand and dust storms driven by dry weather and wind in winter months in these cities may explain the strong stlag effect.
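A spatiotemporal lag of this kind can be sketched as an average of the previous month's PM2.5 levels over neighboring cities. The exact weighting scheme is the one defined in Section 2.6; the sketch below assumes binary weights within a critical distance, row-standardized, which is one common choice and not necessarily the specification used here. The function name, coordinates, and PM2.5 values are hypothetical.

```python
import numpy as np

def spatiotemporal_lag(coords, pm25_prev, critical_distance):
    """Mean previous-month PM2.5 over neighbors within critical_distance.

    Assumes binary distance-threshold weights, row-standardized. A city is
    not its own neighbor, and a city with no neighbors gets NaN.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))        # pairwise distances
    W = (dist <= critical_distance).astype(float)  # 1 if within the threshold
    np.fill_diagonal(W, 0.0)                       # exclude the city itself
    row_sums = W.sum(axis=1)
    lag = np.full(len(pm25_prev), np.nan)
    has_nb = row_sums > 0
    lag[has_nb] = (W[has_nb] @ pm25_prev) / row_sums[has_nb]
    return lag

# Four hypothetical cities on a line (coordinates in km); the last is isolated.
coords = np.array([[0.0, 0.0], [100.0, 0.0], [200.0, 0.0], [900.0, 0.0]])
pm25_prev = np.array([90.0, 60.0, 40.0, 20.0])     # previous month's levels
stlag = spatiotemporal_lag(coords, pm25_prev, critical_distance=150.0)
print(stlag)
```

Because the critical distance is an explicit argument, the same function could be rerun with several thresholds to see how sensitive the lag variable is to that choice.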

4. Conclusions

PM2.5 has emerged as a severe problem in China and has attracted widespread attention in recent years. The necessity of considering the spatial dimension in studying environmental issues such as air pollution has been pointed out by previous research [21]. Specifically, two spatial effects of PM2.5 are well known: spatial heterogeneity and spatial dependence. Spatial heterogeneity means that the relationships between PM2.5 and its determinants change with location, which global regression models cannot detect. Local models such as GWR have thus been utilized to account for heterogeneity. Spatial dependence means that the PM2.5 level is autocorrelated in space. This effect violates the assumption of independent observations in conventional regression models. To address this effect, spatial models such as SEM and SLM have been utilized. However, previous studies have often ignored the temporal dependence of PM2.5. The co-occurrence of spatial heterogeneity and spatial dependence has also been ignored.

This study aims to extend previous studies by accounting for spatiotemporal dependence and spatial heterogeneity in modeling the determinants of the monthly PM2.5 level in China. First, spatial and temporal distribution patterns of PM2.5 are explored. Results indicate that PM2.5 is unevenly distributed in space and is significantly spatially autocorrelated. A stable agglomeration of cities with high PM2.5 levels is found in the BTH region. In the temporal dimension, the PM2.5 level is seasonal, with high values in winter and low values in summer. A significant temporal dependence is also detected, which means that current PM2.5 levels tend to be similar to those of the previous month. Considering both spatial and temporal dependence, a spatiotemporal lag is constructed based on PM2.5 levels of neighboring cities in the previous month. The spatiotemporal lag is then included in global and local regression models, i.e., OLS, GWR, and MGWR, leading to OLSL, GWRL, and MGWRL. The performance of these models is compared comprehensively in terms of goodness-of-fit and residual spatial dependence. Results demonstrate that incorporating the spatiotemporal lag into OLS, GWR, and MGWR substantially improves model goodness-of-fit and reduces residual spatial dependence. Local models consistently perform better than global models. MGWRL is the most effective model in terms of replicating observed PM2.5 patterns and eliminating residual spatial autocorrelation. These results illustrate the necessity of accounting for both spatiotemporal dependence and spatial heterogeneity in modeling geographic events.
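The temporal dependence summarized above can be examined with the sample autocorrelation function. The sketch below uses the standard estimator, in which the lag-k autocovariance of the mean-centered series is divided by its lag-0 value. The 36-month series is synthetic, built only to mimic a winter-high, summer-low cycle, and is not the study's data.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag using the standard estimator."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()                      # deviations from the series mean
    denom = (dev ** 2).sum()                # lag-0 sum of squares
    return np.array([(dev[: len(x) - k] * dev[k:]).sum() / denom
                     for k in range(max_lag + 1)])

# Synthetic monthly series with a 12-month cycle (hypothetical values).
months = np.arange(36)
series = 50 + 25 * np.cos(2 * np.pi * months / 12)
acf = sample_acf(series, max_lag=12)
print(np.round(acf[[1, 6, 12]], 3))
```

For a seasonal series of this shape, the autocorrelation is strongly positive at lag 1 (adjacent months resemble each other), negative at lag 6 (winter versus summer), and positive again at lag 12.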

Findings from this study have implications for future research. First, to obtain a more precise understanding of the processes generating the associations between impact factors and dependent variables, local models such as MGWR could be used. Second, when studying geographic events, it is necessary to account for the effects of spatial and temporal autocorrelation. One valid way to consider these effects is to construct a spatiotemporal lag. By incorporating the spatiotemporal lag in MGWR, local spatiotemporal dependence measures can be generated that are conditioned on various covariates rather than being simple descriptions of a pattern. The effect of the spatiotemporal lag is significantly positive, indicating that the PM2.5 level of a city is strongly related to the PM2.5 levels of nearby cities in the previous month. This finding could help future studies to better understand and predict the development patterns of PM2.5. Third, the results of both global and local models demonstrate that two variables, namely population density and secondary industry as a percentage of GDP, have consistent positive effects on PM2.5. China’s economic development is strongly dependent on secondary industry, and rapid urbanization has led to the formation of many dense, large cities. These two factors have been widely regarded as the root causes of the nationwide air pollution crisis in China [13]. Urban designers should fully consider the associations between air pollution, industrial structure, and population density.

This research has some limitations. First, this study specifies a single critical distance when constructing the spatiotemporal lag. The estimation results of OLSL, GWRL, and MGWRL models may be different when different critical distances are specified. Future research should conduct a robustness analysis on this issue. Second, meteorological features, such as wind direction, humidity, dew point, and temperature, have been demonstrated to be important influencing factors of PM2.5 [51]. Accordingly, future research should take these variables into consideration. Third, future studies should consider the influence of the location of cities (northern China versus southern China) and seasons, because cities in different parts of China have different characteristics in terms of their energy structure, and energy consumption changes with the seasons.

Author Contributions

Conceptualization, H.Y. and L.D.; methodology, H.Y., L.D., and M.L.; writing—original draft preparation, H.Y. and L.D.; writing—review and editing, H.H., X.Z., and H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Program of National Natural Science Foundation of China (41961062, 41761099), the Key Research & Development Program of Guangxi Province (2019AB16010), the Program of Natural Science Foundation of Guangxi Province (2018JJA150089), and the National College Students’ innovation and entrepreneurship training program (202110603026).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhou, C.; Chen, J.; Wang, S. Examining the effects of socioeconomic development on fine particulate matter (PM2.5) in China’s cities using spatial regression and the geographical detector technique. Sci. Total Environ. 2018, 619–620, 436–445.
2. Yang, H.-H.; Lee, S.-A.; Hsieh, D.P.H.; Chao, M.-R.; Tung, C.-Y. PM2.5 and Associated Polycyclic Aromatic Hydrocarbon and Mutagenicity Emissions from Motorcycles. Bull. Environ. Contam. Toxicol. 2008, 81, 412–415.
3. Wang, S.; Zhou, C.; Wang, Z.; Feng, K.; Hubacek, K. The characteristics and drivers of fine particulate matter (PM2.5) distribution in China. J. Clean. Prod. 2017, 142, 1800–1809.
4. Adar, S.D.; Filigrana, P.A.; Clements, N.; Peel, J.L. Ambient Coarse Particulate Matter and Human Health: A Systematic Review and Meta-Analysis. Curr. Environ. Health Rep. 2014, 1, 258–274.
5. Thurston, G.D.; Ahn, J.; Cromar, K.R.; Shao, Y.; Reynolds, H.R.; Jerrett, M.; Lim, C.C.; Shanley, R.; Park, Y.; Hayes, R.B. Ambient Particulate Matter Air Pollution Exposure and Mortality in the NIH-AARP Diet and Health Cohort. Environ. Health Perspect. 2016, 124, 484–490.
6. Yeh, H.-L.; Hsu, S.-W.; Chang, Y.-C.; Chan, T.-C.; Tsou, H.-C.; Chang, Y.-C.; Chiang, P.-H. Spatial Analysis of Ambient PM2.5 Exposure and Bladder Cancer Mortality in Taiwan. Int. J. Environ. Res. Public Health 2017, 14, 508.
7. Feng, L.; Ye, B.; Feng, H.; Ren, F.; Huang, S.; Zhang, X.; Zhang, Y.; Du, Q.; Ma, L. Spatiotemporal Changes in Fine Particulate Matter Pollution and the Associated Mortality Burden in China between 2015 and 2016. Int. J. Environ. Res. Public Health 2017, 14, 1321.
8. Anselin, L. Spatial Econometrics: Methods and Models; Kluwer Academic: Dordrecht, The Netherlands, 1988.
9. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; John Wiley & Sons: Hoboken, NJ, USA, 2002.
10. Jin, Q.; Fang, X.; Wen, B.; Shan, A. Spatio-temporal variations of PM2.5 emission in China from 2005 to 2014. Chemosphere 2017, 183, 429–436.
11. Lin, G.; Fu, J.; Jiang, D.; Hu, W.; Dong, D.; Huang, Y.; Zhao, M. Spatio-Temporal Variation of PM2.5 Concentrations and Their Relationship with Geographic and Socioeconomic Factors in China. Int. J. Environ. Res. Public Health 2014, 11, 173.
12. Wang, Z.-B.; Fang, C.-L. Spatial-temporal characteristics and determinants of PM2.5 in the Bohai Rim Urban Agglomeration. Chemosphere 2016, 148, 148–162.
13. Zhang, H.; Wang, Z.; Zhang, W. Exploring spatiotemporal patterns of PM2.5 in China based on ground-level observations for 190 cities. Environ. Pollut. 2016, 216, 559–567.
14. Cheng, Z.; Li, L.; Liu, J. Identifying the spatial effects and driving factors of urban PM2.5 pollution in China. Ecol. Indic. 2017, 82, 61–75.
15. Zhao, X.; Zhang, X.; Xu, X.; Xu, J.; Meng, W.; Pu, W. Seasonal and diurnal variations of ambient PM2.5 concentration in urban and rural environments in Beijing. Atmos. Environ. 2009, 43, 2893–2900.
16. Zheng, X.; Yu, Y.; Wang, J.; Deng, H. Identifying the determinants and spatial nexus of provincial carbon intensity in China: A dynamic spatial panel approach. Reg. Environ. Chang. 2014, 14, 1651–1661.
17. Pateraki, S.; Asimakopoulos, D.N.; Flocas, H.A.; Maggos, T.; Vasilakos, C. The role of meteorology on different sized aerosol fractions (PM10, PM2.5, PM2.5–10). Sci. Total Environ. 2012, 419, 124–135.
18. He, X.; Lin, Z.S. Interactive effects of the influencing factors on the changes of PM2.5 concentration based on gam model. Huanjing Kexue/Environ. Sci. 2017, 38, 22–32.
19. Chu, N.; Kadane, J.B.; Davidson, C.I. Using Statistical Regressions to Identify Factors Influencing PM2.5 Concentrations: The Pittsburgh Supersite as a Case Study. Aerosol Sci. Technol. 2010, 44, 766–774.
20. Akbal, Y.; Unlu, K.D. A deep learning approach to model daily particular matter of Ankara: Key features and forecasting. Int. J. Environ. Sci. Technol. 2021.
21. Hao, Y.; Liu, Y.-M. The influential factors of urban PM2.5 concentrations in China: A spatial econometric analysis. J. Clean. Prod. 2016, 112, 1443–1453.
22. Elbayoumi, M.; Ramli, N.A.; Md Yusof, N.F.F.; Yahaya, A.S.B.; Al Madhoun, W.; Ul-Saufie, A.Z. Multivariate methods for indoor PM10 and PM2.5 modelling in naturally ventilated schools buildings. Atmos. Environ. 2014, 94, 11–21.
23. Staniswalis, J.G.; Yang, H.; Li, W.-W.; Kelly, K.E. Using a Continuous Time Lag to Determine the Associations between Ambient PM2.5 Hourly Levels and Daily Mortality. J. Air Waste Manag. Assoc. 2009, 59, 1173–1185.
24. Liu, Y.; Paciorek, C.J.; Koutrakis, P. Estimating regional spatial and temporal variability of PM2.5 concentrations using satellite data, meteorology, and land use information. Environ. Health Perspect. 2009, 117, 886–892.
25. Guan, D.; Su, X.; Zhang, Q.; Peters, G.P.; Liu, Z.; Lei, Y.; He, K. The socioeconomic drivers of China’s primary PM2.5 emissions. Environ. Res. Lett. 2014, 9, 024010.
26. Anselin, L.; Florax, R.J.; Rey, S.J. Advances in Spatial Econometrics: Methodology, Tools and Applications; Springer: Berlin, Germany, 2004.
27. Hu, X.; Waller, L.A.; Al-Hamdan, M.Z.; Crosson, W.L.; Estes, M.G.; Estes, S.M.; Quattrochi, D.A.; Sarnat, J.A.; Liu, Y. Estimating ground-level PM2.5 concentrations in the southeastern U.S. using geographically weighted regression. Environ. Res. 2013, 121, 1–10.
28. Vardoulakis, S.; Kassomenos, P. Sources and factors affecting PM10 levels in two European cities: Implications for local air quality management. Atmos. Environ. 2008, 42, 3949–3963.
29. Anselin, L. Spatial econometrics. In A Companion to Theoretical Econometrics; Baltagi, B.H., Ed.; Blackwell: Oxford, UK, 1988; pp. 310–330.
30. Fotheringham, A.S.; Yao, J. Local Spatiotemporal Modeling of House Prices: A Mixed Model Approach. Prof. Geogr. 2016, 68, 189–201.
31. The Ministry of Ecology and Environment of PRC. Official Website of National Air Quality Daily. Available online: http://datacenter.mee.gov.cn (accessed on 3 February 2019).
32. The National Meteorological Information Center. China Meteorological Data Service Centre. Available online: http://data.cma.cn (accessed on 3 February 2019).
33. National Bureau of Statistics of China. China City Statistical Yearbook; China Statistics Press: Beijing, China, 2017.
34. Moran, P.A. The interpretation of statistical maps. J. R. Stat. Soc. Ser. B Methodol. 1948, 10, 243–251.
35. Cliff, A.D.; Ord, J.K. Spatial Processes: Models and Applications; Pion: London, UK, 1981.
36. Anselin, L. Local Indicators of Spatial Association—LISA. Geogr. Anal. 1995, 27, 93–115.
37. Bartels, R. The Rank Version of von Neumann’s Ratio Test for Randomness. J. Am. Stat. Assoc. 1982, 77, 40–46.
38. Madansky, A. Prescriptions for Working Statisticians; Springer: New York, NY, USA, 1988.
39. Chatfield, C. The Analysis of Time Series: An Introduction (6th Edition); Chapman and Hall: Boca Raton, FL, USA, 2004.
40. Fotheringham, A.S.; Yang, W.; Kang, W. Multiscale Geographically Weighted Regression (MGWR). Ann. Am. Assoc. Geogr. 2017, 107, 1247–1265.
41. da Silva, A.R.; Fotheringham, A.S. The Multiple Testing Issue in Geographically Weighted Regression. Geogr. Anal. 2016, 48, 233–247.
42. Fotheringham, A.S.; Park, B. Localized Spatiotemporal Effects in the Determinants of Property Prices: A Case Study of Seoul. Appl. Spat. Anal. Policy 2018, 11, 581–598.
43. Lu, D.; Xu, J.; Yang, D.; Zhao, J. Spatio-temporal variation and influence factors of PM2.5 concentrations in China from 1998 to 2014. Atmos. Pollut. Res. 2017, 8, 1151–1159.
44. Okkaoğlu, Y.; Akdi, Y.; Ünlü, K.D. Daily PM10, periodicity and harmonic regression model: The case of London. Atmos. Environ. 2020, 238, 117755.
45. Zhang, Y.-L.; Cao, F. Is it time to tackle PM2.5 air pollutions in China from biomass-burning emissions? Environ. Pollut. 2015, 202, 217–219.
46. Remoundaki, E.; Papayannis, A.; Kassomenos, P.; Mantas, E.; Kokkalis, P.; Tsezos, M. Influence of Saharan Dust Transport Events on PM2.5 Concentrations and Composition over Athens. Water Air Soil Pollut. 2012, 224, 1373.
47. Li, L.; Qian, J.; Ou, C.-Q.; Zhou, Y.-X.; Guo, C.; Guo, Y. Spatial and temporal analysis of Air Pollution Index and its timescale-dependent relationship with meteorological factors in Guangzhou, China, 2001–2011. Environ. Pollut. 2014, 190, 75–81.
48. Stone, B. Urban sprawl and air quality in large US cities. J. Environ. Manag. 2008, 86, 688–698.
49. De Ridder, K.; Lefebre, F.; Adriaensen, S.; Arnold, U.; Beckroege, W.; Bronner, C.; Damsgaard, O.; Dostal, I.; Dufek, J.; Hirsch, J.; et al. Simulating the impact of urban sprawl on air quality and population exposure in the German Ruhr area. Part II: Development and evaluation of an urban growth scenario. Atmos. Environ. 2008, 42, 7070–7077.
50. Wang, S.; Fang, C.; Wang, Y. Spatiotemporal variations of energy-related CO₂ emissions in China and its influencing factors: An empirical analysis based on provincial panel data. Renew. Sustain. Energy Rev. 2016, 55, 505–515.
51. Liang, X.; Zou, T.; Guo, B.; Li, S.; Zhang, H.; Zhang, S.; Huang, H.; Chen, S.X. Assessing Beijing’s PM2.5 pollution: Severity, weather impact, APEC and winter heating. Proc. R. Soc. A Math. Phys. Eng. Sci. 2015, 471, 20150257.

Figure 1. Location of study areas.

Figure 4. Monthly average of PM2.5 levels in China from 2015 to 2017.

Figure 5. Results of the sample autocorrelation function.

Figure 6. Model comparison of goodness-of-fit: (a) R²; (b) AICc.

Figure 7. Spatial distribution of residuals in December 2017: (a) OLS; (b) OLSL; (c) GWR; (d) GWRL; (e) MGWR; (f) MGWRL.

Figure 8. Temporal variations of (a) parameter estimate of stlag, (b) Moran’s I.

Figure 9. Spatial distribution of local parameter estimates: (a) 201612-popd; (b) 201701-popd; (c) 201702-popd; (d) 201612-scgdp; (e) 201701-scgdp; (f) 201702-scgdp; (g) 201612-stlag; (h) 201701-stlag; (i) 201702-stlag.

Table 1. Descriptive statistics of variables.

Variable	Definition	Mean	S.D.	Min	Max
pm25	Monthly mean PM2.5 (μg/m³)	46.62	25.32	5.06	254.1
wind	Monthly mean wind velocity (m/s)	2.17	0.72	0.50	7.64
rain	Cumulative rainfall in a month (mm)	93.95	101.82	0.00	855.05
urban	Proportion of urban population (%)	36.27	23.40	4.67	100
popd	Population density (person/km²)	438.14	339.72	5.73	2648.11
pcgdp	Per capita GDP (10⁴ yuan)	5.11	2.96	1.01	21.54
scgdp	Secondary industry as a percentage of GDP (%)	46.80	9.68	14.95	75.53
dust	Volume of industrial dust emissions (10⁴ tons)	3.90	8.91	0.02	185.98
psg	Highway passenger traffic in a year (10⁸ persons)	0.66	1.20	0.01	15.69
stlag ¹	Spatiotemporal lag variable	47.51	24.84	9.04	215.00

¹ This variable will be explained in detail later.

Table 2. Global Moran’s I of annual mean PM2.5.

	2015	2016	2017
Moran’s Index	0.736	0.731	0.681
Expected Index	−0.004	−0.003	−0.004
z-score	19.921	19.790	18.417
p-value	<0.001	<0.001	<0.001
Pattern	Clustered	Clustered	Clustered

Table 3. Moran’s I of residuals (MI = Moran’s I).

	OLS		OLSL		GWR		GWRL		MGWR		MGWRL
	MI	z-Score	MI	z-Score	MI	z-Score	MI	z-Score	MI	z-Score	MI	z-Score
201701	0.248 ***	30.682	0.066 ***	8.582	0.034 ***	4.679	0.004	0.964	0.011	1.794	−0.011	−0.859
201702	0.255 ***	31.610	0.123 ***	15.472	0.029 ***	4.041	0.006	1.227	−0.001	0.217	−0.010	−0.868
201703	0.156 ***	19.512	0.071 ***	9.202	0.021 **	3.088	0.005	1.128	−0.005	−0.249	−0.009	−0.790
201704	0.171 ***	21.385	0.053 ***	6.982	0.021 **	3.107	0.014 *	2.228	−0.002	0.108	−0.010	−0.845
201705	0.140 ***	17.528	0.107 ***	13.602	0.025 ***	3.568	0.001	0.572	0.002	0.685	−0.011	−0.929
201706	0.230 ***	28.585	0.114 ***	14.422	0.032 ***	4.395	0.024 ***	3.443	−0.001	0.215	−0.003	0.035
201707	0.252 ***	31.231	0.092 ***	11.697	0.036 ***	4.936	0.023 **	3.241	−0.002	0.078	−0.006	−0.389
201708	0.199 ***	24.692	0.111 ***	14.052	0.029 ***	4.004	0.007	1.363	−0.007	−0.458	−0.012	−1.079
201709	0.151 ***	18.875	0.115 ***	14.471	0.039 ***	5.221	0.020 **	2.884	0.003	0.872	−0.010	−0.904
201710	0.198 ***	24.601	0.112 ***	14.186	0.022 **	3.222	0.012 *	1.990	0.002	0.681	−0.005	−0.222
201711	0.167 ***	20.867	0.184 ***	22.877	0.010	1.665	0.014 *	2.190	0.004	0.932	−0.009	−0.763
201712	0.200 ***	19.865	0.199 ***	9.844	0.075 **	2.858	0.011	1.784	0.001	0.186	−0.005	−0.248

* p < 0.05, ** p < 0.01, *** p < 0.001.
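For readers who want to reproduce a residual check of this kind, the following minimal sketch computes global Moran's I from a vector of values and a spatial weights matrix. It covers only the index itself; the z-scores and significance levels reported in Table 3 require a variance formula or a permutation test, which are omitted here. The six-location layout and the values are hypothetical.

```python
import numpy as np

def morans_i(values, W):
    """Global Moran's I for values under a spatial weights matrix W (zero diagonal)."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()                       # deviations from the mean
    n = len(z)
    s0 = W.sum()                           # sum of all weights
    return (n / s0) * (z @ W @ z) / (z @ z)

# Six hypothetical locations on a line; neighbors are adjacent locations.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

trend = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])            # smooth gradient
alternating = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])   # neighbors always differ
expected = -1.0 / (n - 1)                # expectation under no spatial autocorrelation

print(morans_i(trend, W), morans_i(alternating, W), expected)
```

A smooth gradient gives a positive index (clustering), an alternating pattern gives a negative one (dispersion), and residuals with no remaining spatial structure should fall near the expected value of −1/(n − 1).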

Table 4. Parameter estimates for OLSL.

Variables	201701	201702	201703	201704	201705	201706	201707	201708	201709	201710	201711	201712
wind	−0.086 *	−0.051	0.022	−0.040	−0.151 **	−0.073	−0.078	−0.085	−0.072	0.095	−0.270 ***	−0.124 ***
rain	0.017	−0.027 *	−0.036 ***	0.011	−0.125 ***	−0.135 ***	0.001	−0.091 ***	−0.048 ***	−0.108 ***	−0.012	−0.010 *
urban	0.046 *	−0.023	−0.027	−0.018	0.028	−0.048	−0.022	0.068 **	0.021	0.078 *	0.018	0.025
popd	0.043 **	0.087 ***	0.057 ***	0.049 ***	0.002	0.059 **	0.006	0.086 ***	0.065 ***	0.029	0.050 **	0.077 ***
pcgdp	−0.076 **	−0.012	0.045	0.001	−0.084 *	0.017	0.017	−0.008	−0.041	−0.121 **	−0.111 ***	−0.085 ***
scgdp	0.163 *	0.191 **	0.047 *	0.038 *	0.034 **	0.039 *	0.043 *	0.082 *	0.107 *	0.181 *	0.299 **	0.224 **
dust	0.010	0.025 *	0.002	0.038 ***	0.011	0.042 ***	0.063 ***	0.001	0.031 **	0.037 *	0.001	0.021 *
psg	0.027 *	0.007	0.005	0.008	0.014	0.036 *	0.034 *	0.002	0.022	0.039	0.073 ***	0.040 ***
stlag	1.035 ***	0.720 ***	0.662 ***	0.812 ***	0.796 ***	0.993 ***	0.755 ***	0.799 ***	0.851 ***	0.794 ***	0.869 ***	0.841 ***

* p < 0.05, ** p < 0.01, *** p < 0.001.
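The coefficient ranges quoted in the discussion of OLSL can be checked directly against Table 4. The short sketch below transcribes the popd, scgdp, and stlag rows from the table and reports the months with the smallest and largest estimates; the helper name `coef_range` is illustrative.

```python
# Month labels and OLSL estimates transcribed from Table 4.
months = ["201701", "201702", "201703", "201704", "201705", "201706",
          "201707", "201708", "201709", "201710", "201711", "201712"]
olsl = {
    "popd":  [0.043, 0.087, 0.057, 0.049, 0.002, 0.059,
              0.006, 0.086, 0.065, 0.029, 0.050, 0.077],
    "scgdp": [0.163, 0.191, 0.047, 0.038, 0.034, 0.039,
              0.043, 0.082, 0.107, 0.181, 0.299, 0.224],
    "stlag": [1.035, 0.720, 0.662, 0.812, 0.796, 0.993,
              0.755, 0.799, 0.851, 0.794, 0.869, 0.841],
}

def coef_range(name):
    """Return ((month, min estimate), (month, max estimate)) for one variable."""
    vals = olsl[name]
    lo = min(range(len(vals)), key=vals.__getitem__)   # index of smallest estimate
    hi = max(range(len(vals)), key=vals.__getitem__)   # index of largest estimate
    return (months[lo], vals[lo]), (months[hi], vals[hi])

for name in olsl:
    print(name, coef_range(name))
```

The output matches the ranges cited in the text: 0.002 to 0.087 for popd, and 0.034 (201705) to 0.299 (201711) for scgdp.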

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
