1. Introduction
Air pollution, which spreads rapidly and has a strong diffusive capacity, is of increasing concern worldwide because of its adverse effects on the environment, climate, and public health [
1]. Because of the expansion of urban populations, the rapid development of the regional economy, and associated urbanization, North China has become one of the most polluted regions in the world [
2,
3]. According to statistics, the most extreme levels of PM2.5 in Beijing exceeded the World Health Organization (WHO) health standards 40 times in 2013 [
4]. To characterize air quality more accurately and align with the WHO, the latest Chinese Ambient Air Quality Standard (CAAQS) was released in 2012. It stipulates the use of the air quality index (AQI) to provide the public with timely information about air quality and to help citizens understand how local air quality changes over time. Haze has had a major negative impact on the physical and mental health of the public, to the point of reducing the happiness of urban residents, which implies that controlling atmospheric pollution will remain a major priority for North China. To strengthen the scientific management of air quality in North China, it is necessary to understand the spatiotemporal patterns of the AQI and the factors that influence it. Identifying the main factors influencing air pollution is therefore of great practical value, because it facilitates the application of targeted control measures.
Previous studies have primarily concentrated on variations in air quality between years, between seasons, or on specific timescales, in individual cities, in urban agglomerations, or nationwide, and have thereby identified air pollution hotspots [
5,
6,
7]. Because regional differences in air pollution directly reflect variations in the level of socioeconomic development, air quality research in China has focused on the urbanized areas of the Pearl River Delta [
8], Yangtze River Delta [
9] and the Beijing-Tianjin-Hebei (BTH) Area [
10]. With the increasing focus on the spatiotemporal characteristics of air pollution, the relationship between air quality and natural environmental and socioeconomic factors has attracted increasing research interest. Meteorological factors (e.g., temperature, precipitation, relative humidity, wind speed) have been introduced to explain the variation of the AQI and air pollutants in China [
11,
12]. In addition, it has been demonstrated that the AQI responds to socioeconomic factors such as urbanization, industrial structure, GDP, energy consumption, population, and technological innovation [
13,
14,
15]. Source apportionment can also be conducted using chemical transport models (CTMs), which link the emissions of pollutants to their ambient distributions; however, the results of such models are unstable because they are sensitive to analytical specifications, such as the number of resolved factors and the pollutant input species [
16,
17]. Other research has examined the relationship between public policies and air quality improvement [
18,
19], air quality modelling [
20], and remote sensing inversion of air pollution source intensity [
21].
Although existing studies on air pollutant characteristics on different spatial and temporal scales have made substantial progress, few studies have focused on the patterns of the spatial distribution and seasonal variation of the AQI in North China, where AQI values are extremely high. A relevant study has confirmed that air pollution in Chinese cities exhibits spatial agglomeration: air pollution in a city is influenced not only by local conditions but also, indirectly, by neighboring cities [
22]. A multiple linear regression model therefore ignores the spatial dependence effect and cannot accurately measure the spillover effect of the AQI. Because of the spatial dependence of urban air pollution, an increasing number of scholars have applied spatial regression methods and analyzed the main factors influencing air quality from a global perspective, which provides a basis for formulating joint defense policies against haze pollution [
23,
24]. However, the regression coefficients of global regression models are constant across space (i.e., they are averages over the whole region), which masks the local relationships between variables. Meanwhile, the natural environment and the level of social and economic development differ among cities in North China, and haze often presents obvious regional characteristics [
25].
Therefore, the factors affecting air quality can be expected to differ across space. In other words, the factors affecting haze are often heterogeneous in space [
26]. The geographically weighted regression (GWR) model is a local regression model that introduces spatial location into the regression parameters, and it can thereby compensate for the difficulty of interpreting a global spatial regression model quantitatively at specific locations [
27]. Therefore, the GWR model can help analyze the spatial differences in the factors influencing the AQI, which provides a basis for formulating policies that differentiate air pollution governance by region. Moreover, the temporal resolutions of socioeconomic and meteorological data often differ; combining them in a single comprehensive research framework may therefore reveal details of the relationship between meteorological factors and the AQI at a finer time scale. Consequently, we took North China as a case study and collected public air monitoring data for 97 cities (71 located in the study area and 26 in the surrounding provinces) for January 2014–February 2017, with the aim of conducting a quantitative analysis of spatiotemporal patterns. In the analysis of influencing factors, the relationship between daily meteorological factors and the AQI was examined by correlation analysis, and both socioeconomic and meteorological factors were then incorporated into the GWR framework to identify the degree of influence of each factor on the AQI at the annual scale. This paper is of practical significance for governments seeking to formulate regionally differentiated haze governance policies, improve the effectiveness of regional joint protection, and achieve sustainable economic and social development.
2. Materials and Methods
2.1. Study Area
To improve air quality and alleviate the associated health burden, the Chinese government has enacted several policies to mitigate air pollutant emissions in the North China region, such as air pollution prevention and control (2013–2017) and comprehensive air pollution control in Beijing, Tianjin, Hebei, and surrounding areas (2017–2018). In this study, 58 prefecture-level urban areas in North China, belonging to Beijing Municipality, Tianjin Municipality, Hebei Province, Shanxi Province, Shandong Province, and Henan Province, were selected as the study area, taking into account the urban areas covered by the above air pollution prevention plans as well as the integrity of the administrative regions. In 2016, the total population of the region was 340 million, representing 24.8% of the population of China, and the total regional GDP was 19.73 trillion yuan, accounting for 26.5% of the national total. However, economic development and urban sprawl have generated negative environmental consequences: North China is currently considered the most air-polluted region in China and even in the world.
2.2. Data Sources and Preprocessing
AQI is a dimensionless number used by government agencies to inform the public about levels of air pollution [
28]. Different countries have their own air quality indices, corresponding to different national air quality standards. Here, the AQI was calculated according to the Chinese Ambient Air Quality Standard (CAAQS) (GB 3095-2012), released in 2012 by China's Ministry of Environmental Protection (MEP). The calculation of the AQI involves the concentrations of PM2.5, PM10, SO2, NO2, CO, and O3, as follows:

$$IAQI_P = \frac{IAQI_{Hi} - IAQI_{Lo}}{BP_{Hi} - BP_{Lo}} \left(C_P - BP_{Lo}\right) + IAQI_{Lo} \quad (1)$$

$$AQI = \max\{IAQI_1, IAQI_2, \ldots, IAQI_n\} \quad (2)$$

In Equation (1), $IAQI_P$ is the individual AQI of pollutant item $P$; $C_P$ is the concentration of pollutant $P$; $BP_{Hi}$ and $BP_{Lo}$ respectively represent the upper and lower concentration limits closest to $C_P$; $IAQI_{Hi}$ represents the IAQI corresponding to $BP_{Hi}$; and $IAQI_{Lo}$ represents the IAQI corresponding to $BP_{Lo}$. In Equation (2), n is the number of pollutant items. When the AQI is greater than 50, the primary pollutant is the pollutant with the maximum IAQI. AQI values range from 0 to 500 and are divided into six levels, with higher levels representing greater air pollution (
Table 1).
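The interpolation in Equation (1) and the maximum in Equation (2) can be sketched in Python as follows. This is a minimal illustration rather than the official procedure: the function names are hypothetical, only PM2.5 and PM10 are covered, and the 24-hour breakpoint concentrations are assumed values that should be checked against the technical regulation accompanying GB 3095-2012 before any real use.

```python
from bisect import bisect_left

# IAQI values at the breakpoints, shared by all pollutants.
IAQI_LEVELS = [0, 50, 100, 150, 200, 300, 400, 500]

# ASSUMED 24-hour concentration breakpoints in ug/m3; verify against the standard.
BREAKPOINTS = {
    "PM2.5": [0, 35, 75, 115, 150, 250, 350, 500],
    "PM10": [0, 50, 150, 250, 350, 420, 500, 600],
}


def iaqi(pollutant, conc):
    """Equation (1): linear interpolation between the bracketing breakpoints."""
    bp = BREAKPOINTS[pollutant]
    if conc <= bp[0]:
        return float(IAQI_LEVELS[0])
    if conc >= bp[-1]:
        return float(IAQI_LEVELS[-1])  # cap at the top of the scale
    hi = bisect_left(bp, conc)  # index of the first breakpoint >= conc
    lo = hi - 1
    slope = (IAQI_LEVELS[hi] - IAQI_LEVELS[lo]) / (bp[hi] - bp[lo])
    return slope * (conc - bp[lo]) + IAQI_LEVELS[lo]


def aqi(concs):
    """Equation (2): the AQI is the largest IAQI; also return the pollutant attaining it."""
    scores = {p: iaqi(p, c) for p, c in concs.items()}
    primary = max(scores, key=scores.get)
    return scores[primary], primary
```

For example, `aqi({"PM2.5": 55, "PM10": 200})` interpolates each pollutant separately and reports the larger of the two sub-indices together with the pollutant that produced it.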
To obtain better spatial interpolation results for the AQI, three years of monitoring data were collected from 97 cities, of which 71 are located in the study area and 26 in the surrounding provinces (
Figure 1). The data were downloaded from the website of the China Environmental Monitoring Station (
http://www.cnemc.cn/). The air quality data cover the interval from 1 January 2014 to 28 February 2017, because many monitoring sites had not been established before 2014. A total of 96,023 daily records were used to calculate the arithmetic mean for each city at the monthly, seasonal, and annual scales and to establish a GIS database. Herein, spring refers to March–May, summer to June–August, autumn to September–November, and winter to December–February.
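The seasonal averaging described above can be sketched as follows. The record layout (a list of date and AQI pairs for one city) and the function names are assumptions made for illustration, and winter months are pooled across calendar years, which the text does not specify.

```python
from collections import defaultdict
from datetime import date

# Season definition used in the text: spring Mar-May, summer Jun-Aug,
# autumn Sep-Nov, winter Dec-Feb.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def season_of(day):
    """Return the season label for a calendar date."""
    return SEASON_BY_MONTH[day.month]


def seasonal_means(records):
    """Arithmetic mean of daily AQI per season.

    records: iterable of (date, aqi) pairs for a single city (assumed layout).
    """
    buckets = defaultdict(list)
    for day, value in records:
        buckets[season_of(day)].append(value)
    return {season: sum(vals) / len(vals) for season, vals in buckets.items()}
```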
First, daily precipitation and wind speed data for Beijing, Tianjin, Shijiazhuang, Taiyuan, Jinan, and Zhengzhou from 2014 to 2016 were collected to analyze the impact of meteorological conditions on the AQI of typical cities (provincial capitals) in North China at the daily scale. Second, because the socioeconomic data obtained for this paper are all annual, both socioeconomic and meteorological factors were incorporated into the spatial regression model framework to identify the degree of influence of each factor on the AQI at the annual scale. On the basis of the relevant literature on the factors affecting haze pollution and the availability of data, we selected 13 factors that affect haze pollution: four meteorological factors (temperature, precipitation, wind speed, and atmospheric pressure) and nine socioeconomic factors (annual average population, population density, gross domestic product (GDP), per capita GDP, secondary industry as a percentage of GDP, green covered area as a percentage of completed area, forest coverage, civilian car ownership, and total gas supply). The socioeconomic data for the 58 prefecture-level cities in 2015 were downloaded from the China City Statistical Yearbook 2016. The daily meteorological data from 1 January 2015 to 31 December 2015 were sourced from the China meteorological data website established by the China Meteorological Information Center (
http://data.cma.cn/). To keep the time dimension of the panel data involved in the modeling consistent, inverse distance weighted (IDW) spatial interpolation was used to generate a nationally continuous surface for each meteorological variable (annual temperature, annual average precipitation, annual average wind speed, and annual air pressure), and zonal statistics were subsequently computed in ArcGIS to obtain meteorological data for each of the study cities in North China [
29]. Similarly, we applied zonal statistics using the administrative boundaries of the 58 prefecture-level cities to calculate the annual average AQI for each study city. We used the variance inflation factor (VIF) to test for multicollinearity among the variables, and Table 2 briefly describes the selected explanatory variables and reports the results of the multicollinearity test. Because the VIF of GDP was greater than 10, this variable was discarded. Finally, the 12 × 58 variable matrix was normalized to eliminate the effect of differing dimensions.
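The inverse distance weighted interpolation mentioned in this section can be illustrated with a short sketch. The study itself used ArcGIS; the function below is only a hypothetical stand-in that shows the underlying idea, and the power parameter of 2 and the planar distance are assumptions, since the text does not report the settings used.

```python
import math


def idw(stations, target, power=2.0):
    """Inverse distance weighted estimate at a target location.

    stations: list of ((x, y), value) pairs with known values.
    target:   (x, y) location to estimate.
    power:    distance exponent (2.0 is an assumed, commonly used default).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # The target coincides with a station: return its value directly.
            return float(value)
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total
```

Evaluating `idw` on a regular grid yields the continuous surface from which per-city averages (the zonal statistics) would then be taken.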
2.3. Exploratory Spatial Data Analysis
Exploratory spatial data analysis (ESDA) is based on the principle that entities with similar geographical attributes are related to each other, and that tests of the global spatial autocorrelation and local spatial autocorrelation of geographical data can be used to determine whether there is convergence or heterogeneity [
30]. Spatial autocorrelation analysis comprises global and local spatial autocorrelation. The Global Moran Index (GMI) is calculated as follows:

$$GMI = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left(\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\right) \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Here, $x_i$ and $x_j$ respectively represent the AQI of city i and city j; $\bar{x}$ is the mean of $x$; n is the number of cities; and $w_{ij}$ is the element of the spatial weights matrix for city i and city j. The GMI is in the range of [−1, 1]. Values <0, =0, or >0 respectively indicate a negative correlation, no correlation, or a positive correlation between the spatial units of AQI. The significance of the global spatial autocorrelation can be evaluated using the standardized statistic
Z(GMI), as follows:

$$Z(GMI) = \frac{GMI - E(GMI)}{\sqrt{Var(GMI)}}$$

where E(GMI) is the expected value of GMI and Var(GMI) is the variance of GMI. At the 0.05 significance level, Z(GMI) > 1.96 indicates a positive spatial autocorrelation between the spatial units of AQI, meaning that units with similarly high or similarly low AQI values in North China are spatially clustered; Z(GMI) < −1.96 indicates a negative spatial autocorrelation between AQI spatial units, meaning that the units are spatially dispersed; and −1.96 < Z(GMI) < 1.96 indicates that the spatial autocorrelation between AQI spatial units is not significant.
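A small numerical sketch may help make the global Moran statistic and its standardized form concrete. The code below is illustrative only: the function names and the toy binary adjacency matrix are hypothetical, and the variance is computed under the normality assumption with E(GMI) = −1/(n − 1), a choice the text does not state.

```python
import math


def global_moran(x, w):
    """Global Moran index for values x and an n-by-n spatial weights matrix w."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(v * v for v in z))


def moran_z(x, w):
    """Standardized statistic (GMI - E) / sqrt(Var), variance under normality (assumed)."""
    n = len(x)
    expected = -1.0 / (n - 1)
    s0 = sum(sum(row) for row in w)
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum(
        (sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n)
    )
    var = (n * n * s1 - n * s2 + 3 * s0 * s0) / ((n * n - 1) * s0 * s0) - expected ** 2
    return (global_moran(x, w) - expected) / math.sqrt(var)


# Toy example: four cities in a row, each adjacent to its immediate neighbours.
CHAIN = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
```

With steadily increasing values along the chain, `global_moran([1, 2, 3, 4], CHAIN)` is positive, reflecting that neighbouring cities carry similar values.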
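The Local Moran Index (LMI) can detect the correlation between an AQI spatial unit and its adjacent units in North China, and thus can identify spatial agglomeration or spatial heterogeneity at specific locations [31]. The LMI is defined as follows:

$$LMI_i = \frac{x_i - \bar{x}}{S^2} \sum_{j \neq i} w_{ij} (x_j - \bar{x})$$

where $x_i$ and $x_j$ are the AQI of city i and city j, $\bar{x}$ is the mean AQI, $w_{ij}$ is the spatial weight between city i and city j, and $S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$ is the variance of the AQI across the n cities. If the LMI is positive, the spatial unit belongs to a cluster of similar values: either high AQI surrounded by high AQI (HH) or low AQI surrounded by low AQI (LL). If the LMI is negative, the spatial unit differs from its neighbors: either high AQI surrounded by low AQI (HL) or low AQI surrounded by high AQI (LH).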
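The HH, LL, HL, and LH labels can be derived from the sign of each city's deviation from the mean and the sign of its local statistic. The sketch below assumes a common form of the local Moran statistic, in which the city's deviation is scaled by the variance and multiplied by the weighted sum of its neighbours' deviations; the function name and the label "NS" for a zero statistic are hypothetical, and no significance test is applied.

```python
def local_moran(x, w):
    """Return a list of (lmi, label) pairs, one per spatial unit.

    x: AQI values; w: n-by-n spatial weights matrix (diagonal ignored).
    Labels: HH / LL for positive LMI, HL / LH for negative LMI, NS for zero.
    """
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    variance = sum(v * v for v in z) / n
    results = []
    for i in range(n):
        # Weighted sum of the neighbours' deviations from the mean.
        lag = sum(w[i][j] * z[j] for j in range(n) if j != i)
        lmi = z[i] / variance * lag
        if lmi > 0:
            label = "HH" if z[i] > 0 else "LL"
        elif lmi < 0:
            label = "HL" if z[i] > 0 else "LH"
        else:
            label = "NS"
        results.append((lmi, label))
    return results
```

In practice the labels are usually reported only for units whose local statistic is statistically significant, a step omitted here for brevity.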
2.4. Geographically Weighted Regression
Geographically weighted regression (GWR) is an improvement over the traditional regression model: it considers the autocorrelation of spatial units, embeds the geographical location of the data into the regression parameters, and produces a coefficient of determination ($R^2$) and local regression coefficients that change with geographical location and are estimated from the observations of adjacent spatial units. GWR is thus a simple but useful technique for analyzing spatial non-stationarity; it can be used to estimate parameters at the local scale and to reveal the spatial relationship between the AQI and its influencing factors. GWR is defined as follows:

$$y_j = \beta_0(u_j, v_j) + \sum_{i=1}^{k} \beta_i(u_j, v_j) x_{ij} + \varepsilon_j$$

Here, $j = 1, 2, \ldots, n$ indexes the cities in the study; $y$ represents the 58 × 1 dimensional dependent variable (yearly value of AQI); $x$ represents the explanatory variable matrix of n × k dimensions, with $x_{ij}$ the value of the i-th explanatory variable for the j-th city; $(u_j, v_j)$ represents the spatial coordinates of the j-th city; $\beta_0(u_j, v_j)$ is the intercept for location j, and $\beta_i(u_j, v_j)$ represents the i-th local parameter estimate for the j-th city; and $\varepsilon_j$ is an independently distributed random error.
GWR is calibrated by weighting all observations around a sample point using a distance decay function, on the assumption that observations closer to the sample point have a greater impact on the local parameter estimates for that location. The parameters are estimated as follows:

$$\hat{\beta}(u_j, v_j) = \left(X^T W(u_j, v_j) X\right)^{-1} X^T W(u_j, v_j) Y$$

where $\hat{\beta}(u_j, v_j)$ is the estimate of the location-specific parameters, $X$ is the matrix of explanatory variables, $Y$ is the vector of the dependent variable, and $W(u_j, v_j)$ is a diagonal n by n spatial weight matrix, the off-diagonal elements of which are equal to 0 and the diagonal elements of which represent the geographical weights at city
j. Here, we construct a spatial weight matrix of the geo-weighted regression model based on the Gaussian function, so that the influence of data points near
j can be estimated with a larger weight [
32]. The formula is as follows:

$$w_{ij} = \exp\left[-\left(\frac{d_{ij}}{b}\right)^2\right]$$

where $w_{ij}$ is the weight of observation j for observation i, $d_{ij}$ is the distance between city i and city j, and b is the kernel bandwidth.
When the distance is greater than the kernel bandwidth, the weight rapidly approaches zero. Both fixed and adaptive kernel bandwidths can be chosen for GWR: a fixed kernel has a constant bandwidth over space, whereas an adaptive kernel adjusts its bandwidth to variations in data density, so that bandwidths are larger where data are sparse and smaller where data are dense. We used a fixed kernel bandwidth in this study because the dispersion ranges of air pollution at the sample cities were almost identical. In addition, a fixed kernel bandwidth allows the regression results of different samples to be compared on the same basis. The optimal bandwidth was determined by minimizing the corrected Akaike information criterion (AICc) [
33].
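The local estimation step of GWR with a fixed Gaussian kernel can be sketched with NumPy as below. This is a simplified illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the kernel is taken as exp(−(d/b)²), distances are planar Euclidean, and the AICc-based bandwidth search is omitted, so the bandwidth must be supplied by the caller.

```python
import numpy as np


def gaussian_weights(dist, bandwidth):
    """Fixed Gaussian kernel (assumed form): weights decay with squared distance."""
    return np.exp(-((dist / bandwidth) ** 2))


def gwr_fit(coords, X, y, bandwidth):
    """Estimate local coefficients at every observation location.

    coords: (n, 2) planar coordinates; X: (n,) or (n, k) explanatory variables;
    y: (n,) dependent variable. Returns an (n, k + 1) array whose first
    column holds the local intercepts.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    X1 = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    betas = np.empty((len(y), X1.shape[1]))
    for j, point in enumerate(coords):
        dist = np.linalg.norm(coords - point, axis=1)
        W = np.diag(gaussian_weights(dist, bandwidth))
        # Weighted least squares: solve (X'WX) beta = X'Wy for this location.
        betas[j] = np.linalg.solve(X1.T @ W @ X1, X1.T @ W @ y)
    return betas
```

A quick sanity check is that when the data follow one exact linear relationship everywhere, every location recovers the same intercept and slope regardless of the bandwidth.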
4. Discussion
In the past few years, North China, where cities with high AQI are concentrated in the plains, including BTH and its neighboring regions, has been infamous for its serious air pollution problems [
35]. Previous studies tended to focus on the spatiotemporal variations of a single pollutant, such as PM2.5 or PM10, and concluded that air pollution in China is characterized by significant positive spatial autocorrelation, with higher levels in northern China and lower levels in southern China [
36]. However, the haze pollution in North China cannot be overlooked, and haze governance has become the top priority for the region. Our results show that the average annual urban AQI values for 2014, 2015, and 2016 were 112.6, 103.4, and 98.5, respectively, which exceed or are close to the Grade II standard of the CAAQS; overall air quality thus remains poor. Spatially, the evidence for a strengthened spatial dependence of the AQI among the urban spatial units in this study suggests that air pollution alleviation policies should be based on both maximizing local effort and regional joint prevention and control. Such cooperation would involve joint legislation and coordinated action to monitor, assess, and implement policies. Temporally, this study found that the AQI in North China is periodic and shows a sinusoidal pattern of fluctuations over the three years. Moreover, although the AQI in spring, summer, and autumn decreased between 2014 and 2016, the level of air pollution in winter increased, as demonstrated by the rapid expansion of pollution in southern Hebei and Shanxi. Therefore, the Chinese government should control urban air pollution according to the specific location of each city and the time of year. More effort should be devoted to controlling and reducing air pollution during winter.
Many previous studies have shown that meteorological and socio-economic factors are both closely related to air quality [
37,
38]; however, only a few studies have combined the two [
39]. Meanwhile, these two kinds of data have different time scales: socio-economic data are usually annual statistics, whereas meteorological factors are recorded daily. Analyzing the influence of each factor on the AQI only at the annual scale therefore loses the details of the relationship between meteorological factors and the AQI at finer time scales. In addition, the multiple linear regression model ignores the spatial dependence effect and cannot accurately measure the spillover effect of the AQI [
40]. Although the climate of the North China region is characterized by stagnant weather, which is favorable for the formation and accumulation of pollutants near the ground, we found that air quality improved significantly when rainfall reached a moderate level (>10 mm), and that the levels of air pollutants were significantly negatively correlated with wind speed at the inter-annual scale. Thus, building an efficient urban air passage structure is important in urban planning, and artificial precipitation can be used to accelerate the dilution of air pollutants during haze episodes when conditions permit rainfall. Notably, the relevant departments also need to focus on monitoring O3 pollution when the wind speed increases, because O3 was always positively correlated with wind speed.
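The kind of correlation analysis referred to here can be illustrated with a Pearson coefficient. The text does not state which correlation measure was used, so the choice of Pearson's r is an assumption, and the numbers in the usage note are made-up values for illustration only, not data from the study.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (spread_a * spread_b)
```

For instance, with hypothetical daily wind speeds `[1, 2, 3, 4]` and AQI values `[200, 150, 100, 50]`, the coefficient is strongly negative, which is the pattern described for wind speed and pollutant levels.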
Some studies show that the North China region has a high proportion of secondary industry, a coal-dominated energy structure, increasing traffic intensity, and central heating in winter, all of which aggravate local pollution [
41]. However, compared with global regression models, the local regression model (GWR) can more accurately reflect the spatial differences in the factors influencing the AQI. The results of our GWR model show that vehicle exhaust gases exert a great influence on the AQI, especially in the semi-arid region, which is inconsistent with most existing studies [
42]. Consequently, the administration should implement stricter vehicle emission standards and traffic restrictions for city residents. In addition, vehicles with improved fuel efficiency and vehicles powered by alternative energy sources (such as electric and biomass hybrid vehicles) need to be introduced. Previous research has argued that secondary industry and energy consumption are linked to coal burning and the emission of a range of pollutants [
43]. In this study, the local GWR coefficients of industrial structure show a similar result: industrial structure is positively correlated with the AQI, with clear regional differences. Therefore, optimizing the industrial structure, eliminating outdated energy-intensive industries, and vigorously developing clean energy are effective measures for managing haze pollution in industrial cities. Generally, as wind speed increases, atmospheric turbulence and convection can be expected to strengthen and provide a dynamic field for the transport and dispersion of pollutants. The local coefficients of annual wind speed show that the inhibiting effect of wind speed on the AQI increased from southeast to northwest; thus, according to the spatial differences in the influence of wind speed on the AQI, the relevant departments can specify haze warning ranges, warning times, and warning levels more precisely. The impact of GDP on air pollution has been widely discussed but remains inconclusive. In contrast to some previous studies, which showed a positive association between the intensity of economic activity and the level of air pollution, our study showed that the estimated coefficients of per capita GDP were negative in most cities, especially in the economically well-developed city clusters along the coast of the Bohai Sea. This may indicate that, to some extent, haze in North China is a consequence of rapid economic development that turns into a declining trend as the economy continues to grow. Governments at all levels should increase their environmental protection budgets and improve their monitoring capabilities to achieve green development. Green coverage was shown to be a favorable variable for the AQI level, which is in agreement with some earlier research [
44]. Our results also show that expanding forest coverage is an effective way to improve air quality, suggesting that the development of ecological cities or garden cities should be encouraged.
The potential applications of this study merit discussion. First, a single air pollutant cannot comprehensively reflect the air quality status of a place or represent the impact of air pollution on humans. Therefore, we recommend using comprehensive air pollutant indices to study the state of air quality. Although different countries have their own air quality indices, corresponding to different national air quality standards, public health risks increase as the AQI rises. Second, studies outside China have focused more on the relationship between the AQI and human health, and most analyses are based on traditional statistics [
45,
46]. Our spatial and temporal analysis approach can be applied to study the relationship between the AQI and epidemic diseases, which can help present regional differences more intuitively. Meanwhile, studies of the spatial and temporal variability of air pollution have often covered relatively short periods or small spatial units [
47]. However, we believe that a more robust spatial and temporal analysis should be based on a large number of observations. Inter-annual variation in air quality can demonstrate the effectiveness of control and mitigation strategies. Seasonal or monthly variations can result from meteorological conditions or the contribution of emission-intensive sources. In addition, the spatial heterogeneity of air quality can reflect the emission reduction efforts, urban expansion and form, land use, and other characteristics of a given area. For these reasons, the temporal and spatial characteristics of the AQI should be analyzed in depth so that local governments can cope with air pollution effectively according to their specific location and the time of year. Third, the causes of haze pollution are comprehensive and complex. As far as we know, few studies have explored the driving forces of air pollution from the perspective of a combination of socio-economic and meteorological factors. The GWR-based framework for estimating the contributions and spatial spillovers of different anthropogenic factors on air quality can therefore be applied to other regions as long as enough data are available. In this way, relevant policies can be formulated for joint defense against haze pollution and for achieving sustainable economic and social development.
However, it should be noted that this study has several limitations. Other socioeconomic parameters, for example employment and educational attainment, were excluded from this paper because of a lack of data. Natural factors such as topography, temperature, and barometric pressure also require further consideration. It is therefore necessary to take more variables into account in further studies. Overall, this study provides the general public with information on the spatiotemporal patterns of the AQI in North China and can guide the formulation of urban policy and the improvement of air quality in the region.
5. Conclusions
In this paper, we employed the AQI, a comprehensive index that integrates six pollutants and reflects the overall status of ambient air quality, to investigate the spatial and temporal variation of air quality in North China. Furthermore, the natural and socioeconomic factors influencing air quality were examined using statistical analysis and the ESDA-GWR model, respectively, which can help policy makers formulate pollution control strategies and improve air quality. Our major conclusions are as follows.
In terms of spatial variation, the air quality in North China improved substantially during the study interval, and the area of heavy pollution decreased significantly. The core area of severe air pollution is located to the south of Yanshan Mountains, west of Taihang Mountains, and east of Shandong Peninsula, and it covers the southern parts of the municipal districts of Beijing, Tianjin, central and southern Hebei, western Shandong, and northeastern Henan Province. The Global Moran Index (GMI) increased from 0.5073 to 0.6541 during the study interval, indicating that air pollution in North China became more spatially concentrated and that the contrast in air pollution levels between cities intensified.
In terms of temporal characteristics, there were substantial differences between the levels of air pollution in different seasons. Wavelet analysis revealed that cycles with lengths of 100–400 days were the most significant in North China and that the amplitude of synoptic-scale oscillations varied mainly from late spring to early summer and from late autumn to early winter. The maximum peak of the wavelet variance of the AQI corresponds to a periodicity of 280 d, which represents the major period of AQI oscillations in North China, and the AQI exhibits a sinusoidal pattern of fluctuations that can be described as “heavy winter and light summer.” Although air quality in spring, summer, and autumn improved during the study period, it worsened during winter, and there was a substantial expansion of the polluted area in the HuangHuaiHai Plain and Fen-Wei Basin.
In terms of the principal factors influencing air quality, the correlation analysis of daily meteorological factors and the AQI showed the following. The flushing effect of rainfall was significant when rainfall reached a moderate level; the AQI and air pollutants showed different degrees of negative correlation with wind speed at the seasonal and inter-annual scales, whereas O3 was always positively correlated with wind speed. Furthermore, pollutants are transmitted dynamically under the influence of the prevailing wind direction, causing shifts in their spatial location. Incorporating both socioeconomic and meteorological factors into a uniform GWR framework to identify the degree of influence of each factor on the AQI at the annual scale showed that civilian car ownership and industrial production are positively correlated with air pollution, whereas increases in wind speed, per capita GDP, and forest cover promote improved air quality.