Relevance Analysis on the Variety Characteristics of PM 2.5 Concentrations in Beijing, China

Abstract: Air pollution has become one of the most serious environmental problems in the world. Taking Beijing and six surrounding cities as the main research area, this study uses daily average pollutant concentrations and meteorological factors from 2 December 2013 to 30 June 2017 to examine the spatial and temporal distribution characteristics of concentrations of particulate matter smaller than 2.5 µm (PM 2.5) in Beijing and their relationships with other factors. Based on correlation analysis and geo-statistical techniques, the inter-annual, seasonal, and diurnal variation trends and the spatio-temporal distribution characteristics of PM 2.5 concentrations in Beijing are studied. The results demonstrate that pollutant concentrations in Beijing exhibit obvious seasonal and cyclical fluctuation patterns. Air pollution is more serious in winter and spring and slightly better in summer and autumn, and the spatial distribution of pollutants fluctuates dramatically between seasons. In general, pollution is more serious in southern Beijing and air quality is better in the north. The diurnal variation of air quality shows a typical seasonal difference, and the daily variation of PM 2.5 concentrations presents a “W”-type pattern with twin peaks. Besides the emission and accumulation of local pollutants, air quality is easily affected by transport from the southwest. Among the surrounding pollution factors, the PM 2.5 and PM 10 concentrations measured in the city of Langfang are the most important for PM 2.5 in Beijing. The concentrations of PM 10 and carbon monoxide (CO) in Beijing are the most significant local factors influencing PM 2.5 in Beijing. Extreme wind speeds and maximal wind speeds are considered the most significant meteorological factors affecting the transport of pollutants across the region. When the wind is a weak southwesterly, the probability of air pollution is greater, and when the wind is northerly, air quality is generally better.


Introduction
Ambient fine particulate matter smaller than 2.5 µm (PM 2.5) is a major environmental problem and is harmful to human health [1,2]. Numerous studies have documented that short-term and long-term exposure to PM 2.5 can increase the risks of allergies, respiratory diseases, and cardiovascular diseases [3][4][5][6]. Meanwhile, the haze caused by PM 2.5 reduces visibility [7,8] and affects transportation, causing huge economic losses [9]. Chinese cities suffer heavily from ambient air pollution [10], particularly the capital, Beijing [11]. A Global Burden of Disease (GBD) study ranks ambient particulate matter pollution (PM 2.5) as the 5th leading risk factor for early death and disability in China [12]. Thus, it is necessary to carry out research on PM 2.5 in China. Current research mainly focuses on the physical and chemical properties of pollutants [13][14][15].
Beijing, which extends from 39.4° N to 41.6° N and from 115.7° E to 117.4° E in the north of the North China Plain, is surrounded by Hebei Province and Tianjin. The terrain is generally characterized by high altitude in the west and low altitude in the east. The western mountains belong to the Taihang Mountains and the northern mountains belong to the Yanshan Mountains. The central and southeastern parts between the two mountain ranges are plains. Mountainous areas are extensive, and elevations range from 20 m to 2300 m. The city has 16 districts, 6 of which are located in the downtown area while the remaining 10 are in the suburbs, covering a total area of 16,410.54 square kilometers. The seasonal distribution of precipitation is highly uneven: 80% of the annual precipitation occurs in the three summer months (June, July, and August). The sunshine duration is longest in spring, followed by autumn. In summer, the sunshine duration is slightly shorter because of the plentiful precipitation, and it is shortest in winter.
Taking into account the effects of transport, this study focuses on Beijing and its surrounding cities, including Baoding, Chengde, Langfang, Tianjin, Zhangjiakou, and Tangshan. The study area is illustrated in Figure 1.
Sustainability 2018, 10, x FOR PEER REVIEW

Figure 1. The physical geography of the study areas, with colors denoting altitude above sea level.

Data Collection
Historical site records of major pollutant concentrations and meteorological data from 2 December 2013 to 30 June 2017 were collected from open sources, spanning a total of 1307 days. Ground-measured hourly pollutant concentrations, including those of PM 10, PM 2.5, carbon monoxide (CO), nitrogen dioxide (NO 2), sulfur dioxide (SO 2), and ozone (O 3), together with the Air Quality Index (AQI), were collected from the Beijing Municipal Environment Monitoring Center (BJMEMC, http://zx.bjmemc.com.cn/) and then averaged into daily mean values. The locations of the 35 monitoring stations around the city are shown as dots in Figure 1. Original meteorological data, including air temperature (TEM), daily sunshine duration (SSD), wind direction (WD), and wind speed (WIN), were acquired from the website of the National Meteorological Information Center (NMIC, http://data.cma.cn/). Pollutant concentrations measured in surrounding cities were obtained from the Ministry of Environmental Protection of China data center (MEP, http://www.mep.gov.cn/). The raw parameters selected from open sources are listed in Table 1.
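The aggregation of hourly readings into daily means described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the function name, the sample readings, and the choice to skip missing hours are all assumptions. The last line simply checks the 1307-day span stated in the text.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(hourly_records):
    """Average hourly readings into one value per calendar day.

    hourly_records: iterable of (timestamp, value) pairs. A value of None
    marks a missing reading and is skipped (an assumption of this sketch).
    """
    by_day = defaultdict(list)
    for timestamp, value in hourly_records:
        if value is not None:
            by_day[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Hypothetical hourly PM2.5 readings in µg/m³ (illustrative values only).
records = [
    (datetime(2013, 12, 2, 0), 60.0),
    (datetime(2013, 12, 2, 1), 80.0),
    (datetime(2013, 12, 2, 2), None),
    (datetime(2013, 12, 3, 0), 40.0),
]
result = daily_means(records)

# Inclusive number of days in the observation period given in the text.
n_days = (datetime(2017, 6, 30) - datetime(2013, 12, 2)).days + 1
```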
For meteorological data, NMIC provides an application programming interface (API) for data acquisition. After authentication, the original data can be downloaded directly and extracted by parsing. For pollutant concentrations, no historical archive is available, and the BJMEMC official website only provides a real-time online display. This paper therefore applies a crawler based on the Scrapy framework to crawl pollutant concentration records.
The data flow in Scrapy is controlled by the central engine. The engine opens a website for a crawler (spider), obtains the URLs to request, and schedules them in the scheduler. The engine then sends each URL to the downloader through the downloader middleware, which disguises the crawler as a normal client to counter anti-crawling strategies. The downloader fetches the page and returns the response to the engine. Once received by the engine, the response is sent to the crawler through the crawler (spider) middleware, and the content parsed by the crawler is sent back to the engine. The collected data are saved to a database through the item pipeline, and any new requests are returned to the scheduler. These steps are repeated until all requests have been processed, after which the engine closes the website and all the data have been collected.
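The step in which the middleware presents the crawler as a normal client could look roughly like the sketch below. It is not the crawler used in this study: the class name and the user-agent strings are hypothetical, and a plain stand-in object replaces a real Scrapy request so that the example runs without Scrapy installed. The `process_request(request, spider)` hook is the standard entry point of a Scrapy downloader middleware.

```python
import random
from types import SimpleNamespace

# Hypothetical browser identifiers; real values would be chosen by the operator.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13)",
]


class RotateUserAgentMiddleware:
    """Downloader middleware that sets a browser-like User-Agent header."""

    def process_request(self, request, spider):
        # Scrapy calls this hook for each request on its way from the engine
        # to the downloader; returning None lets processing continue.
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        return None


# Stand-in for a Scrapy request so the sketch runs without Scrapy installed.
fake_request = SimpleNamespace(headers={})
outcome = RotateUserAgentMiddleware().process_request(fake_request, spider=None)
```

In an actual Scrapy project, a class like this would be enabled through the `DOWNLOADER_MIDDLEWARES` setting.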

Methodology
The main data analysis methods adopted in this study are statistical analysis, spatial analysis, and visualization. The statistical analysis techniques principally included variance analysis, correlation analysis, regression analysis, and factor analysis, and were used to analyze the complicated relationships between PM 2.5 concentrations, meteorological factors, and surrounding factors. Spatial analysis, including Spatial Center Statistics (SCS) and Exploratory Spatial Data Analysis (ESDA), was adopted to reveal the temporal and spatial characteristics of pollutant diffusion. SCS focuses on depicting a spatial distribution, mainly by calculating its basic parameters. ESDA emphasizes describing the data, identifying their statistical characteristics, and making a preliminary judgment of their structure through relevant assumptions. Its aims are to reveal spatial data characteristics, identify outliers or outlying regions, explore spatial association patterns, recognize clusters or hotspot areas, implement spatial zoning, and discover spatial heterogeneity through geographical visualization. The data visualization methods used in this paper mainly included scatter plots, wind rose charts, and violin plots, which intuitively illustrate how the atmospheric phenomena behind the data vary with time and space and help to reveal potential development patterns.
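One basic parameter of a spatial distribution is its concentration-weighted mean center. The sketch below assumes that this is among the quantities covered by Spatial Center Statistics; the text does not list the parameters it actually computed, and the station coordinates and concentrations shown are hypothetical.

```python
def weighted_mean_center(stations):
    """Return the (lon, lat) center of stations weighted by concentration.

    stations: iterable of (lon, lat, weight) tuples.
    """
    stations = list(stations)
    total = sum(weight for _, _, weight in stations)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    lon = sum(x * weight for x, _, weight in stations) / total
    lat = sum(y * weight for _, y, weight in stations) / total
    return lon, lat


# Hypothetical stations: (longitude, latitude, PM2.5 in µg/m³).
sample_stations = [
    (116.0, 39.5, 100.0),
    (117.0, 40.5, 300.0),
]
center_lon, center_lat = weighted_mean_center(sample_stations)
```

With these illustrative values the center is pulled toward the station with the higher concentration, which is how a shift of the pollution center between seasons or hours would show up.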

General Statistical Characteristics of PM 2.5 Concentrations and Exploratory Data Analysis
Statistical descriptions of the main indicators measured during the observation period are presented in Table 2. The table shows that the air quality situation in Beijing gives little cause for optimism: the average 24-h PM 2.5 concentration reached 77.40 µg/m³, about three times the WHO guidance value (25 µg/m³), and the average 24-h PM 10 concentration reached 104.90 µg/m³, about twice the WHO guidance value (50 µg/m³). The 24-h maximum concentrations of PM 2.5 and PM 10 reached 477 µg/m³ and 820 µg/m³, respectively. In addition, the variances of the PM 2.5 and PM 10 concentrations were close to their respective mean levels, indicating the volatility of the major pollutant concentrations and the instability of regional air quality. The annual mean values of the other gaseous pollutants did not exceed the national standard, but their peak values were higher, to varying degrees, than the national level-2 standard for the same period. This reveals that all pollutant concentrations reached high levels on heavily polluted days. Long-term exposure to such an environment is very harmful to the human body, and corresponding protective measures should be taken.
It can be seen from Figure 2 that the air pollution caused by fine particles exhibits remarkable volatility and irregularity, while also showing seasonal and periodic fluctuation in its overall trend.
The peak pollutant concentrations are generally concentrated in the heating periods, indicating that the energy structure of central heating has an obvious influence on ambient air quality in Beijing. According to the management measures for central heating in Beijing, the statutory heating period generally runs from 15 November to 15 March of the following year and varies slightly according to conditions in a given year. The annual variation of PM 2.5 concentrations was relatively stable, with annual mean values for 2014 to 2017 of 84.83 µg/m³, 80.25 µg/m³, 73.01 µg/m³, and 57.83 µg/m³, respectively. The annual mean values show a decreasing trend, which indicates that current air pollution control measures have achieved initial success. In terms of periodicity, the short cycle of the pollutant concentration fluctuations ranged from about one week (the left magnified curve in Figure 2) to one month (the right magnified curve in Figure 2), the long cycle was one year, and peaks in pollutant concentrations recurred throughout the year.
Figure 3 shows the temporal and spatial distribution of the seasonal variation of PM 2.5 concentrations in Beijing in 2017. According to the climatological classification, spring in Beijing covers March, April, and May, summer runs from June to August, autumn from September to November, and winter from December to February of the following year. The distribution map uses monitoring data from the 35 monitoring stations in the city and was drawn with the Kriging interpolation method from geo-statistics, taking into account the anisotropy, autocorrelation, and trend of the data distribution [29,30].
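The climatological season classification and the reported annual means lend themselves to a short worked example. The sketch below is illustrative only: the function name is an assumption, and the relative change is computed directly from the four annual means quoted in the text.

```python
def season_of(month):
    """Map a calendar month (1-12) to the season used in the text."""
    if month not in range(1, 13):
        raise ValueError("month must be between 1 and 12")
    if 3 <= month <= 5:
        return "spring"
    if 6 <= month <= 8:
        return "summer"
    if 9 <= month <= 11:
        return "autumn"
    return "winter"  # December, January, February


# Annual mean PM2.5 concentrations (µg/m³) as reported in the text.
annual_means = {2014: 84.83, 2015: 80.25, 2016: 73.01, 2017: 57.83}

# Check that each year is lower than the one before it.
years = sorted(annual_means)
is_decreasing = all(
    annual_means[later] < annual_means[earlier]
    for earlier, later in zip(years, years[1:])
)

# Relative change between the first and last reported years.
relative_change = (annual_means[2017] - annual_means[2014]) / annual_means[2014]
```

On the reported figures, the 2017 mean is roughly 32% below the 2014 mean.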
It can be seen from Figure 3 that air pollution in Beijing is more serious in spring and winter and slightly better in summer and autumn. The distribution of pollutants varied dramatically between seasons. In spring, the most seriously polluted regions were southern and central Beijing. In summer, the overall air pollution situation was better, and only the southeastern part of the city was more polluted. In autumn, the main polluted areas were also concentrated in the southeast and southwest. In winter, the pollution was further aggravated, and some regions in the north also registered high PM 2.5 concentrations. In general, the spatial distribution of regional air quality differed considerably across the four seasons, pollution was more serious in the southern regions, and pollutant concentrations decreased gradually from the southwest to the northeast. The seasonal variation in the spatial and temporal distribution of pollutants may have been caused by differing meteorological conditions and the distribution of pollution sources [31,32]. For example, the combination of dry climate and strong wind in spring is conducive to the formation and development of sandstorms. The humid and hot environment and the increased irradiation intensity in summer favor photochemical reactions, resulting in secondary pollution. In autumn, the spatial distribution of pollutants was mainly due to regional transport under unfavorable weather conditions, which was the main cause of air pollution in this period. In winter, the air quality was closely tied to biomass burning [33] and coal combustion [34].
With the weakening of the wind and the lowering of the boundary layer height, diffusion and convection in the horizontal and vertical directions were gradually restricted. The accumulation of local pollutants aggravated the outbreak of serious pollution events in winter.
It can be observed from Figure 4 that the diurnal variation of air quality differed between seasons and that diurnal concentrations fluctuated. In summer and autumn, the diurnal variation was small, with daily fluctuations of around 10 µg/m³, while in winter and spring the diurnal variation was large, with daytime fluctuations reaching about 20 µg/m³. The diurnal variation of PM 2.5 concentrations was by and large characterized by a "W"-type double wave. The daytime peak occurred between 08:00 and 11:00 in the morning, after which concentrations decreased to a trough. The night-time peak appeared after 19:00, and concentrations then gradually decreased in the early hours of the morning. The daytime peak could be related to the increase in human activity during the morning rush hour. As the temperature rose toward noon, the pollutant concentrations gradually decreased, aided by the weather conditions. Subsequently, with the approach of the evening rush hour, the increase in restaurant emissions, and the lowering of the planetary boundary layer, the concentration of pollutants increased further [34].
In spring and summer, the average pollutant concentrations were higher during the daytime and lower at night, in contrast to the situation in autumn and winter. This difference was mainly due to the diverse sources of pollution and their distinct formation mechanisms in different seasons. The air quality in spring and summer was more affected by human activities. With nightfall and the decrease in human activity, the concentration of pollutants dropped to a lower level in these two seasons. In autumn and winter, the main factors influencing outdoor air quality were the transport and diffusion of pollution from external sources. The impact of human activity was relatively small and was outweighed by meteorological conditions. Therefore, during the night, the lower boundary layer height and stagnant wind conditions aggravated the accumulation of pollutants and increased the PM 2.5 concentrations [20].
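The twin-peak diurnal pattern discussed above can be located programmatically by averaging readings per hour of day and searching for local maxima. The following sketch is a simplified illustration under stated assumptions: the function names are hypothetical, the readings are invented, and they are sampled every four hours only to keep the example short.

```python
from collections import defaultdict
from statistics import mean


def hourly_profile(readings):
    """Average concentration for each hour of day.

    readings: iterable of (hour, value) pairs pooled over many days.
    """
    by_hour = defaultdict(list)
    for hour, value in readings:
        by_hour[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def local_peaks(profile):
    """Hours whose mean exceeds both neighbouring hours (cyclic over the day)."""
    hours = sorted(profile)
    peaks = []
    for i, hour in enumerate(hours):
        prev_value = profile[hours[i - 1]]
        next_value = profile[hours[(i + 1) % len(hours)]]
        if profile[hour] > prev_value and profile[hour] > next_value:
            peaks.append(hour)
    return peaks


# Hypothetical PM2.5 readings (µg/m³) for two days, sampled every four hours.
sample_hours = [0, 4, 8, 12, 16, 20]
day_one = [70.0, 60.0, 80.0, 65.0, 60.0, 85.0]
day_two = [74.0, 64.0, 84.0, 69.0, 64.0, 89.0]
readings = list(zip(sample_hours, day_one)) + list(zip(sample_hours, day_two))

profile = hourly_profile(readings)
peak_hours = local_peaks(profile)
amplitude = max(profile.values()) - min(profile.values())
```

Applied to real hourly data grouped by season, the same two functions would give the peak hours and the diurnal amplitude for each season.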

Diurnal Variation Characteristics of PM 2.5 Concentrations in Beijing
To further reveal the diurnal variation of pollutant concentrations in Beijing, Figure 5 shows the temporal and spatial variation of PM 2.5 concentrations on 25 December 2015, when a serious particulate matter pollution incident occurred in the city. The concentration of PM 2.5 in some areas exceeded 700 µg/m³, causing widespread public and international concern.
As can be seen from Figure 5, at 00:00 the particulate matter in Beijing was mainly concentrated in the southeastern and central areas, and the air quality in the northern and western mountainous areas was better than in other regions. At 04:00 and 08:00, the polluted area expanded to the north and west and the pollution in the southern part of the region worsened, although the northern and western parts of the region still maintained relatively good air quality. By noon, the concentration of particulate matter in the city reached a peak and the polluted area expanded further. From the southwest to the northeast, almost the whole city was immersed in pollution of moderate level or above, and the concentration of PM 2.5 in some areas exceeded 705 µg/m³, setting the record for the highest single-day concentration of the year. At this time, some regions in the northern mountainous areas were still unaffected. At 16:00, the polluted area expanded once again. The core pollution areas were concentrated in the Fengtai, Chaoyang, and Haidian districts in the center of the city, and the northern regions were also affected. After nightfall, the concentration of pollutants gradually decreased but the polluted area did not shrink. The average concentration of pollutants in the city dropped to the level recorded at 08:00 in the morning. In general, the heavily polluted areas in this event were concentrated in the southern and central areas and were clearly affected by transport from the southwest. In the early stage of this air pollution event (before 12:00 noon), the air quality was mainly affected by local emissions and accumulation. Subsequently, under the influence of meteorological conditions, transport from the surrounding pollution sources became the leading factor in the overall air quality of the city, which aggravated the severity of the air pollution and promoted the outbreak of a serious air pollution event.

Relevance Analysis between PM 2.5 and Major Pollutants
The raw data can be divided into two types: numerical variables and categorical variables. For numerical variables, Pearson correlation analysis was used to identify the five raw parameters most strongly linearly related to PM 2.5 concentrations in Beijing, as shown in Figure 6. At the same time, there were linear correlations among the variables themselves, as shown in the middle of the correlation matrix. Figure 6 shows the linear correlation matrix of the 15 variables with the highest linear correlation with PM 2.5 concentration. A significance test on these correlation coefficients gave sig = 0.000, that is, a p value below 0.001, indicating that the correlations are statistically significant. It can be seen from the figure that there was a strong linear correlation between PM 2.5 concentration in Beijing and pollutants in surrounding cities such as Langfang, Chengde, and Baoding.
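Ranking raw parameters by their Pearson coefficient with PM 2.5 can be sketched in a few lines. This is an illustration of the technique, not the analysis code behind Figure 6: the variable names and all sample values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def rank_by_correlation(target, candidates):
    """Sort candidate variable names by |r| with the target, strongest first."""
    scores = {name: pearson_r(values, target) for name, values in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical daily values (illustrative only).
beijing_pm25 = [50.0, 80.0, 120.0, 200.0, 90.0]
candidates = {
    "langfang_pm25": [55.0, 85.0, 130.0, 210.0, 95.0],
    "wind_speed": [5.0, 4.0, 2.5, 1.0, 3.5],
    "temperature": [10.0, 12.0, 9.0, 11.0, 10.0],
}
ranking = rank_by_correlation(beijing_pm25, candidates)
ranked_names = [name for name, _ in ranking]
```

Sorting by the absolute value keeps strong negative relationships, such as the hypothetical wind speed series here, near the top of the list. A p value for each coefficient would additionally require a t distribution, which libraries such as SciPy provide through `scipy.stats.pearsonr`.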
Moreover, obvious nonlinear relationships between several independent variables and the dependent variable were found during the exploratory data analysis, as depicted in Figure 7. The figure shows that the distributions of most numerical variables exhibited different degrees of skewness (the diagonal of the figure) and that there were significant nonlinear relationships between the independent variables and the dependent variable (the upper right and lower left corners). For example, the O 3 concentrations, evaporation capacity, and extreme wind speed exhibited an apparent exponential relationship with PM 2.5, while the PM 10 and PM 2.5 concentrations of Langfang presented a potential logarithmic relationship. Meanwhile, the linear relationships among these parameters indicated a risk of multi-collinearity (lower right).
For categorical variables, there was a linear correlation between the variables and the target values. Most of the variables exhibited typical periodic variation and fluctuation characteristics, as shown in Figure 8. Figure 8a is a violin plot of monthly PM 2.5 concentrations. It shows that PM 2.5 concentrations followed typical seasonal fluctuations and were lower in summer and early autumn, from June to September. The remaining months fluctuated greatly, and the lowest PM 2.5 concentration appeared around August. This is consistent with the conclusions of previous studies: Guo et al. found that the lowest and highest monthly mean PM 2.5 concentrations appeared in August and January, respectively [35]. We encoded the wind direction from 1 to 16 clockwise, with 1 representing a north wind. Figure 8b is a scatter plot of the direction of the extreme wind speed against PM 2.5 concentration.
It shows that the PM 2.5 concentration varied periodically with the direction of the extreme wind speed, and the highest pollutant concentrations occurred when that direction was northeast or southwest (wind direction codes 3 and 11). When the wind direction was west or northwest (wind direction codes 13 and 15), air quality was generally good. Figure 8c,d further illustrate this phenomenon with wind roses for the direction of the maximal and extreme wind speeds against PM 2.5 concentrations. The radius represents the frequency of a given wind direction and the intensity represents the PM 2.5 concentration. The prevailing direction of both Beijing's maximum wind speed and its daily maximum wind speed lies along the northeast-southwest axis, with the daily maximum wind speed shifted slightly to the east. When the wind is a weak southwesterly, the probability of air pollution is greater, and when the wind is northerly, the air quality is generally better. This phenomenon may be related to Beijing's topography, with mountains on three sides, and to the distribution of industrial areas to the south [36,37].
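The 1 to 16 clockwise encoding of wind direction, with 1 for north, implies a step of 22.5° per code. The helper below decodes a code into a compass label and a bearing; the function name is hypothetical, and the 16-point compass labels are the conventional ones rather than anything specified in the text.

```python
COMPASS_16 = [
    "N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
    "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW",
]


def decode_wind_direction(code):
    """Convert a 1-16 clockwise code (1 = north) to (label, bearing in degrees)."""
    if not 1 <= code <= 16:
        raise ValueError("wind direction code must be between 1 and 16")
    index = code - 1
    return COMPASS_16[index], index * 22.5


# Codes mentioned in the text: 3 and 11 (most polluted), 13 and 15 (cleaner air).
decoded = {code: decode_wind_direction(code) for code in (3, 11, 13, 15)}
```

Under this mapping, codes 3 and 11 decode to northeast and southwest, and codes 13 and 15 to west and northwest, which matches the directions named in the text.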
Combined with the analysis of the correlation between surrounding pollutants, meteorological factors, and PM 2.5 in Beijing, this further explains why the air quality in southern Beijing is generally worse than that in the north. The surrounding pollutants have a strong influence on Beijing's air quality and Beijing's prevailing winds are mostly southerly, so the areas to the south of Beijing have a greater impact, and Langfang is geographically closer than Baoding. Therefore, in the correlation analysis, the pollutants in Langfang have a greater impact on Beijing than those in Baoding.

Conclusions
Today, air pollution has become one of the most serious environmental problems in the world. Fine particulate matter (PM 2.5) is harmful to ambient air quality, economic development, and human health. Taking Beijing and six surrounding cities as the main research area, this study used daily average pollutant concentrations and meteorological elements from 2 December 2013 to 30 June 2017 to study the spatial and temporal distribution characteristics, the primary influencing factors, and the forecasting method of PM 2.5 concentrations in Beijing, in order to provide guidance for coping with extreme meteorological disasters and references for improving municipal crisis response and emergency planning.
In this paper, the inter-annual, seasonal, and diurnal variation trends and the spatio-temporal distribution characteristics of PM 2.5 concentrations in Beijing were studied by correlation analysis and geo-statistics. The main conclusions are as follows:
(1) The pollutant concentrations in Beijing exhibit obvious seasonal and cyclical fluctuation patterns. Air pollution is more serious in winter and spring and slightly better in summer and autumn, and the spatial distribution of pollutants fluctuates dramatically between seasons. In general, pollution is more serious in southern Beijing and air quality is better in the north. The diurnal variation of air quality shows a typical seasonal difference, and the daily variation of PM 2.5 concentrations by and large presents a "W"-type pattern with twin peaks. Besides the emission and accumulation of local pollutants, air quality is susceptible to transport from the southwest.
(2) A feature importance analysis reveals that, among the surrounding pollution factors, the PM 10 and PM 2.5 concentrations measured in the city of Langfang are the most important for PM 2.5 in Beijing. The concentrations of PM 10 and CO in Beijing are the most significant local factors for PM 2.5 in Beijing. Extreme wind speeds and maximal wind speeds are considered the meteorological factors with the greatest effect on the cross-regional transport of pollutants. Pollutants from the city of Langfang have a stronger impact on air quality in Beijing than other surrounding factors. Each element affects the air quality of the study areas in a different way.
This study elaborated the spatial and temporal distribution characteristics of PM 2.5 concentrations in Beijing and the influencing modes of various factors on PM 2.5 concentrations in Beijing. It helps to thoroughly recognize and understand the formation mechanisms of serious haze events.
Author Contributions: B.Z. and J.C. conceived the key ideas and the system architecture; B.Z. conducted the research; W.Y. and Z.H. analyzed the data; B.Z. and J.C. wrote the paper; and W.Y. reviewed the process.