Assessing Spatial Heterogeneity of Factor Interactions on PM2.5 Concentrations in Chinese Cities

The identification of fine particulate matter (PM2.5) concentrations and its driving factors are crucial for air pollution prevention and control. The factors that influence PM2.5 in different regions exhibit significant spatial heterogeneity. Current research has quantified the spatial heterogeneity of single factors but fails to discuss the interactions between factors. In this study, we first divided the study area into subregions based on the spatial heterogeneity of factors in a multi-scale geographically weighted regression model. We then investigated the interactions between different factors in the subregions using the geographical detector model. The results indicate that there was significant spatial heterogeneity in the interactions between the driving factors of PM2.5. The interactions between natural factors have significant uncertainty, as do those between the normalized difference vegetation index (NDVI) and socioeconomic factors. The interactions between socioeconomic factors in the subregions were consistent with those in the whole region. Our findings are expected to deepen the understanding of the mechanisms at play among the aforementioned drivers and aid policymakers in adopting unique governance strategies across different regions.


Introduction
Air quality is deteriorating in many places worldwide and has become a challenge to maintaining human health and driving sustainable development [1]. Socioeconomic development in response to rapid urbanisation has caused serious air pollution problems [2,3], especially the accumulation of fine particulate matter (PM 2.5 ) [4][5][6]. Despite its low atmospheric content, PM 2.5 has a crucial impact on air quality and atmospheric visibility [7]. Long-and short-term studies on the relationship between PM 2.5 and health have confirmed that PM 2.5 is related to various human health problems [8]. This is because, unlike coarser particles, PM 2.5 , which has a smaller particle size, can penetrate the human respiratory tract and lungs deeper, increasing the risk of cardiovascular disease and reducing lung function [9]. Prolonged exposure to high PM 2.5 concentrations, therefore, can increase mortality rates [10]. Given these damaging health impacts, the identification of the key factors and quantification of their effects is vital for alleviating and controlling PM 2.5 .
It is now understood that PM 2.5 concentrations are directly affected by human activities and the surrounding environment [11][12][13]. Previous studies have assessed the characteristics and sources of PM 2.5 , such as meteorological conditions, transportation, topography, and economic development [14,15]. For example, Zhou et al. [16] found that population density, industrial structure, industrial smoke emissions, and road density tend to have significant positive effects on PM 2.5 . Similarly, Zhu et al. [17] have stated that there is the surge of PM 2.5 concentration was usually caused by economic activities increase, and there was a unidirectional causality between PM 2.5 concentration and the factors of economic growth, foreign direct investment, and industrial structure in the long-term. In addition, meteorological factors, such as temperature, air pressure, relative humidity, wind direction, and wind force, can directly or indirectly influence the diffusion degree of PM 2.5 [18,19]. In fact, PM 2.5 accumulation is considered to be the synthetic outcome of multivariate natural and social factors, and the interaction between these factors has gradually attracted academic attention. For instance, Yang's study indicated that the interaction between industry and climate is larger than that between ecosystem and climate [20]. Wang et al. [21] also found that the interactions between natural and social factors could enhance the accumulation of PM 2.5 in a bivariate manner.
To date, a large body of literature has documented the independent and interactive effects of factors on PM 2.5 . Researchers have also noted that the impacts of factors in various regions and development stages have obvious temporal and spatial heterogeneity. For example, Wang et al. [22] investigated spatial heterogeneity in both the direction and strength of socioeconomic and landscape factors at a local scale. Yan et al. [23] observed that the impact of industrial structure on PM 2.5 concentrations in the upper 75th quantile cities is larger than that in other cities. Moreover, it has been established that the spatial heterogeneity of driving factors increases the difficulty of understanding the accumulation mechanism of PM 2.5 and poses new challenges for decision-makers to implement interventions. Unfortunately, existing studies have been limited to the evaluation of the spatial heterogeneity of different factors and have ignored the spatial heterogeneity of the interactions between factors. For example, the interaction between temperature and other factors is likely to be different if temperature is found to have an opposite effect on PM 2.5 in various areas.
Among the various models that are able to quantify the impact of factors on PM 2.5 , such as the spatial regression model, Bayesian hierarchical spatial quantile regression [24], and geographically weighted regression (GWR) [25,26], GWR has been adopted in most studies because it can capture the spatial non-stationarity of geographical phenomena by identifying location-related dynamics [27,28]. Although GWR can analyse the heterogeneity of processes and relationships, its single-core size assumes that each response to the predicted relationship runs on the same spatial scale [29,30]. However, the production of PM 2.5 , which often corresponds to different spatial scales, is actually determined by multiple spatial processes of different scales. Therefore, a multi-scale geographically weighted regression (MGWR) model based on the GWR model has been proposed in previous research, which allows relationships between different variables to run at different spatial scales and captures the impact of spatial scales on different factors by searching for different bandwidths [31]. Additionally, other scholars have noticed the nonlinear interactions between factors and introduced another spatial technique known as the geographical detector (GD) model, as a supplement [16]. Nevertheless, all of these studies have mainly concentrated on the interaction between factors from a global perspective, neglecting spatial heterogeneity in factors.
Based on the shortcomings of previous investigations, this study presents a novel method to quantify the spatial heterogeneity of the interaction between multiple driving factors on PM 2.5 concentrations. Using cross-sectional data, we first employed the MGWR model to investigate the spatial heterogeneity of the effects of meteorology, topography, and socioeconomic factors on PM 2.5 concentrations. The study area was then divided into subregions based on the coefficient of the MGWR. Finally, the GD model was used to identify the interactions between driving factors of PM 2.5 in different subregions. To the best of our knowledge, this is the first study to reveal the spatial heterogeneity of the effects For this study, satellite remote sensing and ground monitoring PM 2.5 data for 2016 was obtained [26]. Compared with most existing PM 2.5 estimation datasets, our dataset had a higher spatial resolution of 1 km × 1 km. The high-resolution annual average PM 2.5 concentrations were estimated using the Goddard Earth Observing System chemical (GEOS-Chem) transport model by combining aerosol optical depth (AOD) retrievals from Multi-angle Imaging SpectroRadiometer (MISR), and Sea-viewing Wide Field-of-view Sensor (SeaWIFS) instruments, as well as from the Twin Moderate Resolution Imaging Spectroradiometer (MODIS) instrument [32].
Lu et al. [33] verified satellite data using data from 68 urban monitoring stations in 2013 to determine an R 2 value of 0.81, and a remotely sensed PM 2.5 dataset retrieved by van Donkelaar (2016) was found to have good consistency with the ground monitoring data, and thus can be applied to PM 2.5 research in China [34]. The subset of the global PM 2.5 concentration dataset covering China was used in this study, and the PM 2.5 concentration in Chinese cities was calculated based on the spatial statistical analysis method.

Meteorological Data
Meteorological station data were derived from the China Meteorological Administration (http://data.cma.cn/) (accessed on 23 October 2021), which records the daily average values of meteorological elements at the stations. To calculate the annual average meteorological data in 2016, we first summed the daily values of average temperature, accumulated precipitation, average air pressure, average wind speed, and average relative humidity, and then averaged the daily meteorological values to annual values [35]. The Partial Thin Plate Smoothing Spline method was used to analyse and interpolate the multivariate data to generate meteorological grid data with a resolution of 1 km, which was suitable for surface fitting and interpolation of meteorological data under the condition of reliable meteorological data and limited data density [36]. Smoothing parameter was used to balance the roughness of the surface and the fidelity of the data, as implemented in the ANUSPLIN package [37]. Moreover, the interpolation surface was smooth and continuous.

Topography Data
The topographical factors selected for this research include elevation (DEM) and the normalized difference vegetation index (NDVI). The dataset known as Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) was used to calculate topological factors, which is a 30 m DEM dataset (http://www.gscloud.cn/) (accessed on 23 October 2021). NDVI maintains a significant correlation with PM 2.5 concentrations [38]. The MODIS NDVI product (MOD13Q1) is a 250 m spatial resolution dataset for 2016 obtained online from the Google Earth Engine (GEE) platform [39,40]. GEE is a cloud platform that combines geospatial and remote sensing satellite images [41,42]. The data stored in GEE include nearly 40 years of historical images and daily updated and expanded datasets, and it has the function of immediate data analysis [43]. The NDVI products were computed from atmospherically corrected bi-directional surface reflectance in GEE platform [44,45]. The annual NDVI in China was obtained by averaging the monthly NDVI data on GEE and downloading it.

Socioeconomic Factors
According to previous studies, human social and economic activities consume energy and increase emissions and atmospheric concentrations of PM 2.5 [46][47][48][49]. Urbanisation, industrial pollution, population growth, vehicle emissions, and other human factors have far-reaching impacts on PM 2.5 in different geographical locations [50,51]. Based on existing research results, we selected the following data: areas of built districts (expressed as the proportion of urban built-up area to total area), population density (number of people per unit land area), gross domestic product (GDP), and road density (ratio of total mileage to total area of road network). As shown in Table 1, theaforementioned four socioeconomic factors were derived from the China Statistical Yearbook (CSY) and the China City Statistical Yearbook (CCSY).

Multi-Scale Geographically Weighted Regression
The ordinary least squares (OLS) model is a global modelling method which assumes that the parameter estimates of the area are constant [52]. In this model, an equation is used to reflect the overall statistical correlation between the dependent variables and multiple explanatory variables, as follows: where y j is the dependent variable, β 0 represents the intercept, β t is the estimated parameter of the independent variable x t , p represents the number of impact factors, and ε represents the error term. The classical GWR considers spatial heterogeneity, but its drawback is that the bandwidth involved is constant throughout the study area; hence, it is not possible to analyse the correlation at different scales or the effect of the local weighted neighbourhood types [53]. This is problematic because different processes in this PM 2.5 study inevitably involve different spatial scales. Therefore, it is necessary to analyse the local relations under different scales, and the selection of the bandwidth was of great significance for the accurate estimation of local regression model parameters.
The MGWR model improves the defects of GWR, because it uses different bandwidths instead of a single constant bandwidth over the entire study area to determine the geographical relations at different spatial scales [54]. The MGWR model is more convincing than the GWR model because it allows each variable to have its best bandwidth, which can significantly improve the accuracy of the regression analysis. The optimum bandwidths can be found using iterative optimization processes, that either minimizes the corrected Akaike Information Criterion (AICc) or the cross-validation (CV) statistic [31]. where bwt in β bwt indicates the bandwidth used for calibration of the jth relationship. The calibration of the model results in a set of bandwidths, with each variable corresponding to a bandwidth. The difference in bandwidth represents the difference in spatial scale, thus MGWR describes spatial heterogeneity more accurately by capturing the influence of scale in spatial processes. In order to select the optimal bandwidth for use in the model, the adaptive bi-square kernel function was used. This function eliminates the influence of observations outside the given bandwidth and minimises the Akaike Information Criterion (AIC) and corrected AIC (AICc). Moreover, the performances of the MGWR and OLS models for explaining PM 2.5 across China were compared using adjusted R 2 values, the residual sum of squares (RSS), AIC, and AICc.

Geographical Detector Model
The Geographical detector (GD) model is a statistical method to detect spatially stratified heterogeneity and its determinants [14] and has been widely used in studies of land use, ecology, and urban science, among other fields. The GD model assumes that if an independent variable (X) is associated with a dependent variable (Y), they will exhibit a high degree of consistency in their spatial attributes. There are four types of detectors, i.e., factor detector, interaction detector, ecological detector, and risk detector. In this study, factor detectors and interaction detectors were used to detect the impact of variables and their interactions on the PM 2.5 concentration.
With regard to the factor detector, the GD model employs a q-statistic value to determine the extent to which variable X affects the spatial heterogeneity of variable Y. The q-statistic is calculated as follows: where h = 1, 2, . . . , L is a certain stratum of the explanatory variable X, L is the number of strata, N represents the number of samples in stratum h or the entire study area, σ 2 h and σ 2 are the variances of the dependent variable Y in stratum h or the entire study area, respectively, y h is the risk observation within sub-region h, y h is the mean value of risk observations within sub-region h, and N hj is the number of observations y h in sub-region h.
The q-statistic value is between 0 and 1. The higher the value, the stronger the association between the independent variable X and the dependent variable Y. The interaction detector can be used to quantify the interactive influence of two explanatory variables (X1 and X2) and reveals whether the interaction of variables weakens or enhances the influence on Y or whether they are independent in influencing Y. The q-statistic values (X1 and X2) calculated from the GD model are usually denoted as q(X1) and q(X2), respectively. A new variable layer can be generated by spatially overlaying variable layers X1 and X2, which can be marked as X1∩X2. By comparing the q-statistic values of two variables and the q-statistic value of their interaction, five types of interactions can be identified, namely, nonlinear weakened if q(X1∩X2) < Min(q(X1), q(X2)); Previous studies have focused on the interactions between factors from a global perspective but ignored the possible heterogeneity of the interactions. Therefore, in this study, the study area was divided into several subregions based on the positive and negative local estimates in the MGWR model, and then the GD model was employed to identify the interaction effects in different subregions.  Figure 1 shows the distribution of PM 2.5 at a resolution of 1 × 1 km. The annual average PM 2.5 concentration in China was 29.47 µg/m 3 in 2016. The PM 2.5 concentration in eastern China was much higher than that in the western region, which showed distinct directivity in the low-altitude plain area. The lowest concentration of PM 2.5 was found in Changdu of the Tibet autonomous region, with an average value of 3.14 µg/m 3 . The highest average PM 2.5 concentration was 81.2 µg/m 3 , in Hengshui of Hebei Province. According to the air quality guidelines formulated by the World Health Organization (WHO), the WHO Interim target 1 level (35 µg/m 3 ) is considered the standard limit between low and high PM 2.5 concentrations. Compared to this value, the results show that nearly half of the cities had higher PM 2.5 concentrations and were mainly distributed in the North China Plain, Northeast Plain, Sichuan Basin, and Middle-Lower Yangtze plains. It should be noted that all of these regions are densely populated and have rapid economic development. The relatively high PM 2.5 concentration in the Sichuan Basin is likely closely related to the low terrain and surrounding mountains, which inhibits the diffusion of air pollutants. Likewise, the high concentration in the Northeast Plain is probably due to the presence of heavily polluting industries; however, household heating is also widely conducted in winter.

Spatial Variation Characteristics of PM 2.5 Concentrations
Previous studies have focused on the interactions between factors from a global perspective but ignored the possible heterogeneity of the interactions. Therefore, in this study, the study area was divided into several subregions based on the positive and negative local estimates in the MGWR model, and then the GD model was employed to identify the interaction effects in different subregions. Figure 1 shows the distribution of PM2.5 at a resolution of 1 × 1 km. The annual average PM2.5 concentration in China was 29.47 µg/m³ in 2016. The PM2.5 concentration in eastern China was much higher than that in the western region, which showed distinct directivity in the low-altitude plain area. The lowest concentration of PM2.5 was found in Changdu of the Tibet autonomous region, with an average value of 3.14 µg/m³. The highest average PM2.5 concentration was 81.2 µg/m³, in Hengshui of Hebei Province. According to the air quality guidelines formulated by the World Health Organization (WHO), the WHO Interim target 1 level (35 µg/m³) is considered the standard limit between low and high PM2.5 concentrations. Compared to this value, the results show that nearly half of the cities had higher PM2.5 concentrations and were mainly distributed in the North China Plain, Northeast Plain, Sichuan Basin, and Middle-Lower Yangtze plains. It should be noted that all of these regions are densely populated and have rapid economic development. The relatively high PM2.5 concentration in the Sichuan Basin is likely closely related to the low terrain and surrounding mountains, which inhibits the diffusion of air pollutants. Likewise, the high concentration in the Northeast Plain is probably due to the presence of heavily polluting industries; however, household heating is also widely conducted in winter.  We adopted the local Moran's I test to further investigate the spatial clustering of PM 2.5 concentrations in China. The local spatial autocorrelation index (LISA) considers the spatial distance between cities and tests the spatial clustering of PM 2.5 between neighbouring cities. Figure A1 shows the local spatial autocorrelation results of prefecture-level PM 2.5 concentrations in China. In general, the areas with high PM 2.5 concentration were concentrated in the North China Plain and Middle-Lower Yangtze Plain. The low-low cluster refers to the low-value aggregation of PM 2.5 , and is mainly distributed in the western region, for example, Xinjiang, Tibet, and Qinghai. The lower population density and economic activities in these provinces have resulted in low PM 2.5 levels. In addition, only a few cities are distributed in the high-low and low-high clusters, which may be due to the strong spatial diffusion of PM 2.5 , and the difficulty in isolating high or low values.

Global Influence of Driving Factors on PM 2.5 Concentrations
The driving factors of PM 2.5 include meteorology, topography, and socioeconomic factors. The OLS model was used to determine the impact of nine explanatory variables on prefecture-level PM 2.5 levels, to elucidate the dominant factors that influence PM 2.5 . A multicollinearity problem can arise when there is a high degree of correlation between regression variables and the phenomenon of information overlap. Therefore, the linear relationship among the factors was judged by the variance inflation factor (VIF) before analysing the results of the OLS model. As shown in Table 2, multicollinearity did not cause any problems because all the variables related to VIF were below 10, indicating no serious multicollinearity in the model. Note: ***, **, and * indicate significance at the 1%, 5%, and 10% levels, respectively.
The results of the OLS regression suggest that seven explanatory variables were significant at the 1% level. The significance levels of the effects of ABD and NDVI on PM 2.5 were 5% and 10%, respectively. The coefficient in the OLS model showed that most explanatory variables negatively correlated with PM 2.5 , including TEM, PRE, WS, DEM, NDVI, and ABD. In contrast, we found that PD and RD had a significant positive effect on PM 2.5 . Table 3 shows that the performances of the OLS and MGWR models were significantly different. The AIC and AICc values of the MGWR model were the lowest, indicating a lower difference between the observed value and the fitting value. The highest adjusted R 2 value of the MGWR model also proves its high fitting degree. As shown in Table 4, the influence of cities on PM 2.5 was significantly different based on MGWR. The values of all driving factors were between positive and negative, indicating that the influence of climate, topology, and socioeconomic variables on the PM 2.5 concentration has high spatial heterogeneity in China. The spatial distribution of local coefficients is shown in Figures 2 and 3, revealing the high spatial heterogeneity of effects of the driving factors on PM 2.5 based on the MGWR model.  Shanghai and Shaoxing. On the contrary, western China mainly showed negative correlations, which play a significant role in lowering PM2.5 concentrations.    Figure 4 illustrates the spatial distribution of the local R 2 values in the MGWR model. The MGWR results showed that the nine driving factors explained 78.50% of the PM2.5 concentration distribution. The local R 2 gradually decreased from the north to south, and the MGWR fitting effect was strongest in the northern region. The local R 2 was above 0.75 in Central China and the middle-lower Yangtze River region, which suggests that the influence of climate, topology, and socioeconomic factors on the PM2.5 concentration was more significant in this area. The local R 2 of most cities was more than 65.00%, indicating a high goodness of fit in the MGWR model.  The results show that the temperature rise has an inhibitory effect on the PM 2.5 concentration in most cities in China, whereas the temperature rise in the northeast and northwest intensifies the influence on the PM 2.5 concentration. Specifically, the temperature in Tongling had the greatest moderating effect on the concentration of PM 2.5 . In contrast, the temperature in Harbin had the strongest promoting effect on PM 2.5 . In most cities, the impact of precipitation on PM 2.5 showed a negative correlation. The increase in precipitation along the southeast coast will be accompanied by a decrease in PM 2.5 concentration. However, the increase in precipitation in the Pearl River Delta will be accompanied by an upward trend in PM 2.5 concentration. Moreover, wind speed negatively correlated with the PM 2.5 concentration, as wind action transported pollutants to other areas (Figure 2c). A lower wind speed decreases the diffusion and transportation of fine particles and subsequently increases the concentration of PM 2.5 . Figure 2d shows a negative correlation between DEM and PM 2.5 in almost all cities. This suggests that a decrease in DEM enhances the PM 2.5 concentration. Moreover, the negative correlation between DEM and PM 2.5 gradually strengthened from western to eastern China. Similarly, as shown in Figure 2e, NDVI and PM 2.5 concentrations were mainly negatively correlated in southern and western China, which maintained the strongest negative correlation in northern China.

Spatial Heterogeneity of Influence of Driving Factors
As shown in Figure 3a, ABD negatively correlated with PM 2.5 , and the overall distribution gradually weakened from northwest to southern China. However, a region of positive correlation was observed in the central and coastal areas, but the positive correla-tion coefficient was relatively small. Figure 3b shows a positive correlation between PD and PM 2.5 concentrations, indicating that population agglomeration is a dominant factor that influences PM 2.5 . The positive correlation coefficient between population density and PM 2.5 concentration showed a gradual increase from the south to north, and the northern population density had a greater impact on PM 2.5 .
The regression coefficients of GDP and RD factors are shown in Figure 3c,d. GDP reflects the level of economic output of a region and can therefore be used to measure the degree of regional economic development. Compared with per capita GDP, regional GDP can better reflect regional economic concentration and development levels. The correlation coefficient of GDP in northern China was significantly higher than that in southern China. The GDP in Baoding was found to have the greatest positive effect on PM 2.5 levels. Moreover, the regression coefficient of road density in eastern China was generally positive, indicating a positive correlation between road area and PM 2.5 . Road density in Suzhou had the greatest enhancement effect on PM 2.5 concentrations, followed by those in Shanghai and Shaoxing. On the contrary, western China mainly showed negative correlations, which play a significant role in lowering PM 2.5 concentrations. Figure 4 illustrates the spatial distribution of the local R 2 values in the MGWR model. The MGWR results showed that the nine driving factors explained 78.50% of the PM 2.5 concentration distribution. The local R 2 gradually decreased from the north to south, and the MGWR fitting effect was strongest in the northern region. The local R 2 was above 0.75 in Central China and the middle-lower Yangtze River region, which suggests that the influence of climate, topology, and socioeconomic factors on the PM 2.5 concentration was more significant in this area. The local R 2 of most cities was more than 65.00%, indicating a high goodness of fit in the MGWR model.     The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. For instance, "TEM(+);PRE(+)" is the region of which the coefficients of the TEM and the PRE are both positive.

Spatial Heterogeneity of Interactions between Driving Factors
As shown in Figure 6, although the impact of PRE on PM2.5 in most areas is positive, Figure 5. The q-statistic values of interactions between temperature and other factors derived from the interaction detector. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. For instance, "TEM(+);PRE(+)" is the region of which the coefficients of the TEM and the PRE are both positive.
As shown in Figure 6, although the impact of PRE on PM 2.5 in most areas is positive, there are differences in the interaction between PRE and TEM and NDVI in the PRE(+) region, in which the interaction between PRE and TEM is bivariate and the interaction between PRE and NDVI is nonlinear enhanced.
Remote Sens. 2021, 13, x FOR PEER REVIEW 13 of 26 region, in which the interaction between PRE and TEM is bivariate and the interaction between PRE and NDVI is nonlinear enhanced. Figure 6. The q-statistic values of interactions between precipitation and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure 7 shows that the influence of WS on PM2.5 is consistent in most cases. However, the interaction between WS and ABD is heterogeneous in different subregions. Figure 6. The q-statistic values of interactions between precipitation and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure 7 shows that the influence of WS on PM 2.5 is consistent in most cases. However, the interaction between WS and ABD is heterogeneous in different subregions. Specifically, the interaction between WS and ABD is nonlinear enhanced in the whole and ABD(−) regions, whereas it is bivariate enhanced in the ABD(+) region.
Remote Sens. 2021, 13, x FOR PEER REVIEW 14 of 26 Specifically, the interaction between WS and ABD is nonlinear enhanced in the whole and ABD(−) regions, whereas it is bivariate enhanced in the ABD(+) region. Figure 7. The q-statistic values of interactions between wind speed and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.
As shown in Figure 8, the interactions between NDVI and PRE, ABD, GDP, and RD are heterogeneous. For instance, in the interaction between NDVI and ABD, the effect of Figure 7. The q-statistic values of interactions between wind speed and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.
As shown in Figure 8, the interactions between NDVI and PRE, ABD, GDP, and RD are heterogeneous. For instance, in the interaction between NDVI and ABD, the effect of NDVI on PM 2.5 in the whole region is lower than that of ABD, and both show bivariate enhanced effects. However, the interaction between NDVI and ABD shows nonlinear enhancement in the ABD(+) region, and the q-statistic value of NDVI is higher than that of ABD in the ABD(−) region. In addition, the q-statistic value of NDVI is higher than that of RD in the whole region, but the q-statistic value of RD is higher in the RD(−) region, and the interaction type changes from bivariable enhanced to nonlinear enhanced. NDVI on PM2.5 in the whole region is lower than that of ABD, and both show bivariate enhanced effects. However, the interaction between NDVI and ABD shows nonlinear enhancement in the ABD(+) region, and the q-statistic value of NDVI is higher than that of ABD in the ABD(−) region. In addition, the q-statistic value of NDVI is higher than that of RD in the whole region, but the q-statistic value of RD is higher in the RD(−) region, and the interaction type changes from bivariable enhanced to nonlinear enhanced. Figure 8. The q-statistic values of interactions between NDVI and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure 8. The q-statistic values of interactions between NDVI and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.
In Figure 9, the interactions between GDP and other driving factors indicate that the effect of GDP is higher than that of RD in the whole region but lower in the sub-regions. In addition, the type of interaction between GDP and RD is bivariate enhanced in the RD(−) region, which is also inconsistent with the nonlinear enhancement in the whole region. More interaction results between the driving factors are presented in the Appendix A. Here, we report the interactions with significant heterogeneity, and other results are shown in Appendix A.
Remote Sens. 2021, 13, x FOR PEER REVIEW 16 of 26 In Figure 9, the interactions between GDP and other driving factors indicate that the effect of GDP is higher than that of RD in the whole region but lower in the sub-regions. In addition, the type of interaction between GDP and RD is bivariate enhanced in the RD(−) region, which is also inconsistent with the nonlinear enhancement in the whole region. More interaction results between the driving factors are presented in the Appendix Appendix A. Here, we report the interactions with significant heterogeneity, and other results are shown in Appendix A. Figure 9. The q-statistic values of interactions between GDP and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure 9. The q-statistic values of interactions between GDP and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.

Discussion
A better understanding of the spatial heterogeneity among multiple driving factors of PM 2.5 pollution is beneficial to reveal the different geographical patterns of PM 2.5 concentrations and help decision-makers and planners formulate PM 2.5 pollution control strategies. Using the MGWR and GD models, we observed significant spatial heterogeneity not only with respect to the influence of factors, but also with respect to their interactions.
Our results are consistent with previous reports stating that the coefficients in the southeast cities were negative, whilst in the northwest and the northeast, they were found to be positive [55]. Nevertheless, the impact of temperature on PM 2.5 concentration is still unclear. Generally, the increased temperature can strengthen the atmospheric turbulence and convection that could provide a dynamic field for pollutant transport and spread [56]. However, the temperature can also affect the formation of particles and thus promote the photochemical reaction between precursors [57]. In most cities, the impact of precipitation on PM 2.5 concentrations showed a negative correlation. This is reasonable because precipitation causes the deposition of pollutants from the atmosphere, and most precipitation processes involve strong convective weather, such as thunderstorms and gales, which have a good removal effect on air pollutants, including PM 2.5 [58]. Large areal vegetation coverage inferred by high NDVI values is also known to reduce the atmospheric PM 2.5 , but vegetation typically only plays an auxiliary role in areas dominated by pollution sources and topographical and meteorological factors. However, although farmland typically has a high NDVI value, the PM 2.5 concentration elevated by dust, straw burning, and fertiliser application. It was further noted that significant spatial changes in the socioeconomic driving factors were also observed. Dense cities and industrial areas are often accompanied by high-density populations, which directly lead to rapid increases in gaseous pollutant emissions in production and living areas, far exceeding the level of natural purification. Compared with that in the south, the impact of human activities on PM 2.5 , which in this study was more notable in northern China, where thermal power generation and coal-fired heating in winter are prevalent, especially in northeastern China. In addition, the GDP in most cities was found to have a positive effect on PM 2.5 concentrations, whereas the GDP of only a few cities had a negative impact. This suggests that regional economic development positively impacts PM 2.5 , which further deteriorates the quality of the atmospheric environment. An obvious feature of our results was the tendency for road density to be the highest in eastern China. This may be because the increase in road density represents the expansion of urban traffic in response to an increase in the number of motor vehicles. Therefore, an increase in exhaust emissions will lead to an increase in atmospheric PM 2.5 .
For all samples, most interactions between the natural factors nonlinearly enhanced the influences on PM 2.5 concentrations, except those of the DEM; however, the type of most interactions between socioeconomic factors was bivariate enhanced. This implies that there could be more complicated interactions between natural factors than between socioeconomic factors. For the interactions between natural and socioeconomic factors, most experiments were bivariate enhanced; however, the interactions between WS and PD, GDP, and RD were nonlinearly enhanced. In other words, the impact of socioeconomic development on WS could be vital to explain the PM 2.5 concentration. Moreover, it can be expected that PD and GDP would affect the height of the city, and the RD would reshape the surface roughness. Compared with those of other related studies (Table 5), our results are more consistent with the results of Wang et al. [21] but are inconsistent with the results of Wu et al. [59]. This could be due to the different sizes of the research units since the larger the size of research units can reduce the spatial heterogeneity that could affect the results of the GD model. Given that the socioeconomic data are typically aggregated at the prefecture level, this study and Wang et al. [21] both adopted Chinese cities, while Wu et al. [59] selected grid cells with a spatial scale of 20 × 20 km. In addition to interactions at the global scale, this study goes one step further. Our results indicate that the interactions have obvious spatial heterogeneity, as do the impacts of driving factors, especially TEM and NDVI. The change in temperature not only affects the urban natural systems (e.g., the meteorological system) but also the socio-economic systems, such as the energy assumption (e.g., heating). For NDVI, it is generally believed that vegetation can exert adsorption and dust reduction effects to alleviate PM 2.5 levels. However, the alleviating effect of NDVI could also be related to precipitation and socioeconomic factors. As mentioned before, previous studies employed the MGWR and GD models to detect the spatial heterogeneity of factors and the interactions between factors from a global perspective. Our contribution is to provide a novel analysis process, that is, to partition the samples according to the heterogeneity of factors and then evaluate the spatial interactions in subregions. Our results highlighted that spatial heterogeneity is not only in factors but also in their interactions. Although the effect of a single driving factor is small, the interaction with other factors could present a higher q-statistic value. For instance, in Figure 5F, the interaction between TEM and PD in the "TEM(−);PD(−)" region shows that the q-statistic value of the interaction is higher than 0.6, however, the q-statistic of the TEM is less than 0.1. Therefore, we cannot control one factor and ignore others, especially the interactions between natural and socioeconomic factors [21].
Our approach has some limitations. The analysis data of urban research did not include all cities in China, which was primarily due to the lack of relevant data for most western cities. Most cities in the MGWR model were concentrated in the eastern and middle parts of the country, with fewer data from western cities. The GD model can only detect interactions between two factors. The identification of the influence of multiple factors could require a couple of process models, such as the atmospheric model, land use change model, and socioeconomic model. Our study mainly quantified the various influences of different factors using statistical methods, and the results may help to establish the process model of PM 2.5 concentration in the future.

Conclusions
Identification of the impacts of factors and their interactions on PM 2.5 concentrations is crucial for air pollution control and prevention. However, PM 2.5 concentrations are dependent on numerous factors, and their mechanisms and processes remain unclear. In different regions, the effects of different factors have spatial heterogeneity, which undoubtedly increases the complexity and difficulty of research. In this study, we divided our study area into subregions according to the spatial heterogeneity of factors based on the MGWR model and then employed a GD model to identify the interactions between factors in different subregions. Our findings revealed the spatial heterogeneity of the interactions between the factors for the first time. Specifically, the interactions between natural factors have significant uncertainty, as do the interactions between NDVI and socioeconomic factors. The interactions between socioeconomic factors in the subregions were consistent with those in the entire region. Given the importance to identify factor pairs that exhibit strong interactions between natural factors and socioeconomic factors in managing air pollution [59], the spatial heterogeneity of the interactions suggests that policymakers pay more attention to the subregions with nonlinearly enhanced interactions, while the interaction type is bivariate enhanced in the whole sample [60]. Nonetheless, our study is at the exploratory stage, and future research should focus on the following aspects: (1) for atmospheric science, it is important to couple the nonlinear interactions between natural factors in PM 2.5 modelling and verify them in different regions; (2) for social science, researchers should further consider the influence of different policies on socioeconomic factors and interactions with natural factors; (3) scale problems are still key issues, and this would call for more empirical studies to understand the complexity and uncertainty of the target factors; (4) for the data issue, it was not possible to study some geographical region due to a lack of available data. Future research should attempt to incorporate more extensive geographical data for empirical analysis.   Figure A2. The q-statistic of interaction between DEM and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure A2. The q-statistic of interaction between DEM and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.  Figure A3. The q-statistic of interaction between area of built districts (ABD) and other factors. The circle represents the qstatistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the Figure A3. The q-statistic of interaction between area of built districts (ABD) and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.
Remote Sens. 2021, 13, x FOR PEER REVIEW 22 of 26 red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure A4. The q-statistic of interaction between population density (PD) and other factors. The circle represents the qstatistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the Figure A4. The q-statistic of interaction between population density (PD) and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.
Remote Sens. 2021, 13, x FOR PEER REVIEW 23 of 26 red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure A5. The q-statistic of interaction between road density (RD) and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion. Figure A5. The q-statistic of interaction between road density (RD) and other factors. The circle represents the q-statistic value of a single driving factor, the red triangle represents the q-statistic value of the sum of two driving factors, and the blue triangle is the q-statistic value of the interaction of two driving factors. If the blue triangle is higher than the red one, the interaction type is nonlinearly enhanced; otherwise, it is bivariate enhanced. "All" represents the whole sample. "Factor A(+);Factor B(−)" represents the subregion.