Analysis of the Characteristics and Evolution Modes of PM2.5 Pollution Episodes in Beijing, China During 2013

Fine particulate matter (PM2.5) has been recognized as a serious hazard linked to deleterious health effects. In this study, all PM2.5 Pollution Episodes (PPEs) in Beijing during 2013 were investigated with hourly PM2.5 observations from the Olympic Sport Center site, and then their characteristics and evolution modes analysed. Results show that 80 PPEs, covering 209 days, occurred in Beijing during 2013. Average PM2.5 concentrations during PPEs were almost twice (1.86) the annual mean value, although the PPEs showed significant seasonal variations. The most hazardous PPEs tended to occur in winter, whereas PPEs with long duration occurred in autumn. The PPEs could be divided into six clusters based on their compositions of different pollution levels, which were strongly related to meteorological factors. We used series peaks of PM2.5 concentrations to analyse the evolution modes of PPEs and found that the more peaks there were within the evolution mode, the longer the duration, and the higher the average and maximum PM2.5 concentrations. Each peak within a PPE can be identified by “rise” and “fall” patterns. The “rise” patterns are widely related to relative humidity, whereas the “fall” patterns are affected principally by wind speed for one-peak PPEs and boundary layer height for multi-peak PPEs. The peak patterns cannot be explained fully by meteorological factors; however, they might also be closely related to complex and diversified human activities.


Introduction
The combination of urbanisation, industrialisation, and population growth in China has led to a remarkable increase in emissions, and the problem of air pollution has received increasing attention because of its influence on daily life via the climate, environment, visibility, and health.
One of the most harmful air pollutants is particulate matter (PM). Inhalable particles (PM10) can penetrate deep inside the lung, which not only decreases the function of the respiratory and cardiovascular systems, but also increases mortality from pollution-related disease; however, PM2.5 is associated more with adverse health effects than the coarser particles are [1][2][3]. An increasing number of studies have been focused on the variation of PM2.5 concentration. Some studies have considered the chemical composition of PM2.5, including elemental constituents, water-soluble ions, and organic carbon [4][5][6][7]. Other research has attempted to describe the spatiotemporal distributions of PM from site monitoring data, including spatial patterns [8][9][10], diurnal variations [11], and annual periods and trends [12][13][14] of PM concentrations, and to demonstrate their relationships with confounding meteorological factors [15,16].
Beijing, the capital of China, is one of the cities in the world most seriously affected by air pollution, and very high PM2.5 concentrations have been observed there. For example, a value of 101 μg/m³ was found in 2000 in the study by Zheng et al. [17], which is similar to the value of 115-127 μg/m³ observed from 1999 to 2000 in the study by He et al. [4]. These values are much higher than the 12 μg/m³ found in the United States from 2000 to 2007 [18], and the value of 8-25 μg/m³ observed regionally in Switzerland in 1998-2001 [19]. Wang et al. [20] and Zheng et al. [17] have demonstrated that the major sources of PM2.5 in Beijing are coal combustion, traffic exhaust, dust, and industrial activities. In Beijing, research has increasingly been undertaken on the seasonal or diurnal changes [21] of PM2.5 concentration series and their emission sources [22,23]. Furthermore, other investigations have demonstrated the impact of meteorological factors on PM2.5 concentrations [24,25], and the seriously damaging effect PM2.5 pollution can have on health in Beijing [26]. However, few studies have focused on the evolution process of each specific PM2.5 Pollution Episode (PPE). These evolution processes, which pass through several stages such as emergence, stability, and dispersion, provide a comprehensive depiction of each PPE. The resulting evolution modes can be used to retrieve historical PPE records and predict future PPEs. Thus, in this research, we use PM2.5 observations to further our understanding of PPEs.
For this study, hourly PM2.5 observations were collected continuously at an urban site in Beijing for 13 months (1 February 2013-28 February 2014). In this paper, we provide the definition of PPEs and analyse the characteristics of each PPE with these records. Furthermore, we extract the evolution mode of each PPE, and explore its relationship with confounding meteorological factors. To identify the evolution mode of each PPE, we generalise the PM2.5 concentration series in each PPE using a Perceptually Important Points (PIPs) extraction method, and classify each PPE into one of five categories according to its evolution mode. Each category is analysed to compare the influence of variations of the meteorological factors.
The remainder of the paper is organised as follows: the data and methods are described in Section 2. After the basic characteristics of PPEs are defined in Section 3.1, we divide the PPEs into different clusters based on their different compositions in Section 3.2. In Section 3.3, we identify the evolution mode of each PPE and analyse its relationships with meteorological factors. Discussions are also presented in these sections. Finally, conclusions are drawn in Section 4.

Data Source
Two kinds of data sources were used in this study, the details of which, together with an appraisal of their uncertainties, are described in the following.

Ground Observations
We collected hourly PM2.5 concentration observations at the Olympic Sport Centre site (39.982 °N, 116.397 °E) from a continuous particulate monitor (BAM-1020, Supplementary Information, Table S2) during the period from 1 February 2013 to 28 February 2014, which provided a data set of 6650 records. There were 2782 missing hourly values, which included 46 entire days. Where possible, gaps of less than 6 h in the records were filled using linear interpolation. Entire days of missing data were not used in the following analysis.
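The gap-filling step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `fill_short_gaps`, the use of `None` to mark a missing hour, and the choice to leave gaps at the edges of the record untouched are all assumptions made here.

```python
from typing import List, Optional


def fill_short_gaps(series: List[Optional[float]],
                    max_gap: int = 5) -> List[Optional[float]]:
    """Linearly interpolate runs of missing hours no longer than max_gap.

    max_gap = 5 encodes "less than 6 h missing"; longer runs stay missing.
    """
    filled = list(series)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i                          # first missing index of this run
        while i < n and filled[i] is None:
            i += 1
        end = i                            # first valid index after the run, or n
        gap = end - start
        # Interpolate only interior gaps bounded by valid readings on both sides.
        if start > 0 and end < n and gap <= max_gap:
            left, right = filled[start - 1], filled[end]
            step = (right - left) / (gap + 1)
            for k in range(gap):
                filled[start + k] = left + step * (k + 1)
    return filled


print(fill_short_gaps([10.0, None, None, 40.0]))
```

Runs of six or more missing hours are deliberately left as `None`, so that downstream steps can exclude them in the same way that entire missing days are excluded.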

Meteorological Data
ERA-Interim is the global atmospheric reanalysis data produced by the European Centre for Medium-Range Weather Forecasts [27]. It extends back to 1979, and the analysis continues to be extended forward in near-real time. A more detailed description of the ERA-Interim product archive can be found in the paper by Berrisford et al. [28]. Simmons et al. [29] have found that ERA-Interim data agree well with the Climatic Research Unit and Hadley Centre analyses of monthly station temperature data (CRUTEM3), and the correlations between the CRUTEM3 and ERA-Interim data in North America and Asia exceed 99% [14]. In our study, gridded records at 3-hourly intervals including wind speed (WS), dew point temperature (DP), surface temperature (ST), and boundary layer height (BLH) were used. The relative humidity (RH) was calculated from the DP and ST records using the Goff-Gratch equation.
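As an illustration of the RH calculation, the sketch below evaluates the Goff-Gratch saturation vapour pressure over liquid water at the dew point and at the surface temperature and takes their ratio. The function names are hypothetical, temperatures are assumed to be in kelvin, and the use of the liquid-water form at all temperatures is an assumption; the text does not state which form was applied below freezing.

```python
import math

T_STEAM = 373.15    # steam-point temperature, K
E_STEAM = 1013.25   # saturation vapour pressure at the steam point, hPa


def saturation_vapour_pressure(t_kelvin: float) -> float:
    """Goff-Gratch saturation vapour pressure over liquid water, in hPa."""
    ratio = T_STEAM / t_kelvin
    log10_e = (
        -7.90298 * (ratio - 1.0)
        + 5.02808 * math.log10(ratio)
        - 1.3816e-7 * (10.0 ** (11.344 * (1.0 - 1.0 / ratio)) - 1.0)
        + 8.1328e-3 * (10.0 ** (-3.49149 * (ratio - 1.0)) - 1.0)
        + math.log10(E_STEAM)
    )
    return 10.0 ** log10_e


def relative_humidity(dew_point_k: float, temperature_k: float) -> float:
    """RH as a fraction (0-1): e_s(dew point) / e_s(air temperature)."""
    return (saturation_vapour_pressure(dew_point_k)
            / saturation_vapour_pressure(temperature_k))


# Hypothetical example: dew point 10 °C, surface temperature 20 °C.
print(round(relative_humidity(283.15, 293.15), 2))
```

RH is returned as a fraction rather than a percentage, matching the way average RH is reported later in the text (e.g., 0.72).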

Definition of PM2.5 Pollution Episodes (PPEs)
In this study, each PPE is defined with a start hour and an end hour in the PM2.5 concentration series. The start hour of a PPE is the first hour of a span of at least 12 h with PM2.5 concentration > 75 μg/m³, and the end hour of the PPE is the first hour of the first subsequent span of 6 h with PM2.5 concentration < 75 μg/m³. Under this definition, the duration of each PPE is at least 12 h. Six indices were employed in this study to analyse the basic characteristics of each PPE, the details of which are given in Table 1. Here, Tb and Te denote the start and end hours of a PPE, respectively, and Ct represents the PM2.5 concentration record at time t. Basic statistics of these indices were analysed, and different combinations of the indices were used to describe different characteristics of PPEs, including health hazard levels and compositions. Simple classification and time series clustering methods were used in this study.
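The start and end rules can be sketched as a single scan over the hourly series. This is one possible reading of the definition, not the authors' implementation: the function name `find_ppes` is hypothetical, the series is assumed to be gap-free, and an episode that is still open when the record ends is assumed to run to the last hour.

```python
from typing import List, Tuple

THRESHOLD = 75.0     # μg/m³, from the definition in the text
MIN_START_RUN = 12   # consecutive hours above the threshold that open an episode
MIN_END_RUN = 6      # consecutive hours below the threshold that close it


def find_ppes(conc: List[float]) -> List[Tuple[int, int]]:
    """Return (start, end) hour indices of each episode; end is exclusive."""
    episodes = []
    n = len(conc)
    i = 0
    while i < n:
        opens = (i + MIN_START_RUN <= n and
                 all(c > THRESHOLD for c in conc[i:i + MIN_START_RUN]))
        if not opens:
            i += 1
            continue
        start = i
        end = n                      # assumption: open episodes run to the end
        j = i + MIN_START_RUN
        while j + MIN_END_RUN <= n:
            if all(c < THRESHOLD for c in conc[j:j + MIN_END_RUN]):
                end = j              # first hour of the closing clean span
                break
            j += 1
        episodes.append((start, end))
        # Skip the clean span itself; it cannot contain a new start hour.
        i = end + MIN_END_RUN if end < n else n
    return episodes


# Synthetic series: 5 clean hours, 14 polluted hours, 8 clean hours.
print(find_ppes([50.0] * 5 + [100.0] * 14 + [50.0] * 8))
```

A dip below the threshold that lasts fewer than 6 h does not close an episode under this reading, which is why a single PPE can contain several peaks.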

Identification of Evolution Mode for Each Pollution Episode (PPE)
To determine the evolution mode of the PPEs, the PIPs of each PPE were identified. The concept of PIPs describes the general shape of a time series: a data point that has greater influence on the overall shape of the series is considered more important. For a given PPE, which is represented by the PM2.5 concentration sequence P, the first two PIPs are the first and last points of P. The next PIP is the point in P with maximum distance to the first two PIPs. The fourth PIP is then the point in P with maximum distance to its two adjacent PIPs, either between the first and second PIPs or between the second and last PIPs. This process of locating PIPs continues until all the points in P have been added to the list (Supplementary Information Figure S1a). Here, we used Euclidean Distance (PIP-ED, [31]) to evaluate the importance of the PIPs in each PPE (Supplementary Information Figure S1b). Points with PIP-ED larger than a threshold were preserved.
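The PIP procedure described above can be sketched as follows. The sketch assumes that PIP-ED is the sum of the Euclidean distances from a candidate point to its two adjacent PIPs, measured on raw (hour index, concentration) coordinates without rescaling; the function names and the threshold value in the example are hypothetical.

```python
import bisect
import math
from typing import List, Tuple


def pip_order(series: List[float]) -> List[Tuple[int, float]]:
    """Rank every point by the order in which it becomes a PIP.

    Returns (index, pip_ed) pairs. The two end points come first with an
    infinite score because they are always PIPs.
    """
    n = len(series)
    if n <= 2:
        return [(i, math.inf) for i in range(n)]

    def pip_ed(i: int, left: int, right: int) -> float:
        # Sum of Euclidean distances to the two adjacent PIPs.
        return (math.hypot(i - left, series[i] - series[left]) +
                math.hypot(right - i, series[right] - series[i]))

    selected = [0, n - 1]                       # sorted indices of current PIPs
    ranked = [(0, math.inf), (n - 1, math.inf)]
    while len(selected) < n:
        best_i, best_d = -1, -1.0
        for left, right in zip(selected, selected[1:]):
            for i in range(left + 1, right):
                d = pip_ed(i, left, right)
                if d > best_d:
                    best_i, best_d = i, d
        bisect.insort(selected, best_i)
        ranked.append((best_i, best_d))
    return ranked


def important_points(series: List[float], threshold: float) -> List[int]:
    """Indices of points whose PIP-ED exceeds the threshold, in time order."""
    return sorted(i for i, d in pip_order(series) if d > threshold)


# Synthetic five-hour series with one sharp peak; threshold is illustrative.
print(important_points([0.0, 1.0, 10.0, 1.0, 0.0], threshold=15.0))
```

Because hours and concentrations have different units, the result depends on how the two axes are scaled; the raw coordinates used here are only one possible choice.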

Basic Characteristics of PM2.5 Pollution Episodes (PPEs)
We identified 80 PPEs covering 209 days in Beijing during the study period using the method described in Section 2.2, and these PPEs occupied 45% of the hours of the entire year. Most PPEs occur in January, February, and September (Supplementary Information, Figure S2). Supplementary Information Figure S4 shows the PPEs with average PM2.5 concentrations of different levels in each season. PPEs are frequent in winter, but relatively fewer in summer. Light PPEs are widely observed in each season, while medium PPEs are more likely in summer and winter. Moreover, over half the hazardous PPEs occurred in winter (Table 2). The duration of each PPE is displayed in Supplementary Information Figure S5. It can be seen that 27 PPEs last for less than 1 day and 33 last for 1 or 2 days. There are 20 PPEs with durations longer than 2 days (Table 3). Supplementary Information Figure S6 shows PPEs with durations of different levels in each season. Most short PPEs appear in spring, medium PPEs are observed in each season (especially summer and winter), and long PPEs are more likely in autumn and less likely in winter.
We calculated the duration ratio of the different concentration levels for each PPE. According to these ratios, we divided the 80 PPEs into six clusters. Figure 2 shows the different compositions of each PPE and their clusters. Characteristics of these clusters are listed in Table 4. PPEs in the first cluster are represented as yellow dots, and these PPEs fall mainly within the red triangles, meaning that hazardous pollution accounted for a large proportion of their durations. PPEs represented by circles belong to the second cluster, in which over half the duration of each PPE involved hazardous pollution. Squares in the middle triangles represent the third cluster, in which the three levels of pollution share the duration roughly equally. Triangles denote the fourth cluster, comprising PPEs with a large ratio of light pollution. Crosses denote the fifth cluster, in which 60% of the duration was light pollution and 30% medium pollution. Pentagrams denote the sixth cluster, for which over half the duration involved medium pollution.
Table 5 shows the number of PPEs of different composition clusters in each season. PPEs with a large ratio of hazardous or light pollution (clusters 1 and 4) occur mainly in winter. These phenomena relate mainly to stable weather conditions when RH and BLH have fewer fluctuations [15]. However, PPEs for which about half the duration is medium (cluster 6) or light pollution (cluster 5) appear mainly in summer and autumn. PPEs in cluster 3, with equal durations of each pollution level, are mainly distributed in spring and winter. PPEs of cluster 2, with about half the duration involving hazardous pollution, occur less often, which is thought to be related to sudden changes of PM2.5 concentrations (Supplementary Information, Figure S7).
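The duration ratios that define each PPE's composition can be sketched as below. The cut-off concentrations separating light, medium, and hazardous pollution are not stated in this section, so they are passed in as parameters; the values used in the example are purely hypothetical, as is the function name `composition`. The clustering step itself is not reproduced here.

```python
from typing import List, Tuple


def composition(conc: List[float],
                medium_from: float,
                hazardous_from: float) -> Tuple[float, float, float]:
    """Fractions of an episode's hours at light, medium, and hazardous levels.

    Hours below medium_from count as light, hours from medium_from up to
    (but not including) hazardous_from as medium, and the rest as hazardous.
    """
    n = len(conc)
    light = sum(1 for c in conc if c < medium_from)
    hazardous = sum(1 for c in conc if c >= hazardous_from)
    medium = n - light - hazardous
    return light / n, medium / n, hazardous / n


# Hypothetical four-hour episode and hypothetical cut-offs (μg/m³).
print(composition([80.0, 80.0, 120.0, 200.0],
                  medium_from=115.0, hazardous_from=150.0))
```

Each PPE then becomes a point with three coordinates that sum to 1, which is why the compositions can be displayed on a triangular diagram.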

Evolution Mode of PM2.5 Pollution Episodes (PPEs)
The evolution mode of PPEs reflects the dynamical variations of PM2.5 concentration. One of the most significant characteristics is the appearance of peaks that reflect the accumulation and dispersion processes of PM2.5 pollution.
In our study, we define each peak as a "rise-fall" pattern from PIPs, in which the concentration difference between the peak and valley points should be larger than a threshold. Accordingly, we classified the PPEs into five categories based on the identification of peaks. Supplementary Information Figure S8 displays the characteristics of the evolution modes (red lines) of the PPEs in each category. It can be seen that the more peaks in the evolution mode, the longer the duration, and the higher the average and maximum PM2.5 concentrations of the PPEs. These results show clearly the relationships between evolution modes and pollution severity of the PPEs. Table 6 presents the characteristics of the evolution modes of PPEs in each category. The first category has six PPEs with relatively flat fluctuations of the PM2.5 concentration series, for which no peak pattern could be identified. These PPEs seldom occur in winter, and have a short average duration of 18.5 h and an average concentration of 92 μg/m³. The second category displays one-peak patterns for the different PPEs. The average duration in this category is about 27 h, and the average PM2.5 concentration is 143.3 μg/m³. Most of these PPEs happen in winter with the peaks occurring at night. This is attributed mainly to the higher RH and lower BLH at night [33,34]. Double-peak patterns are evident in the third category, with an average duration of over 30 h and an average concentration of 145 μg/m³. PPEs in this evolution mode are often observed in summer or winter. The fourth and fifth categories show triple-peak patterns and multi-peak patterns, respectively. The durations and average PM2.5 concentrations of these two categories are 62 and 84.1 h, and 167.1 and 185 μg/m³, respectively. To establish the relationships between the evolution process of PPEs and meteorological factors, we analyse the correlations between the PM2.5 concentrations of PPEs in each category and meteorological factors.
In PPEs of the first category (no peak), synchronous observations of RH show a weak correlation with PM2.5 concentrations. Although low WS (2.17 m/s) and high average RH (0.72) favour atmospheric condensation, PM2.5 concentrations are unlikely to rise far when the BLH is high (529 m). PPEs of the second category are thought to be affected by meteorological factors in three different ways. The first cluster of one-peak PPEs may be subject primarily to the subsidence inversion effect, which commonly acts on the accumulation process of PM2.5 under low-BLH weather conditions in winter. The second cluster shows positive correlations between RH and PM2.5 concentrations, and negative correlations between WS and PM2.5 concentrations and between BLH and PM2.5 concentrations. The third cluster presents some exceptions in which PM2.5 concentrations are positively correlated with WS (Supplementary Information, Table S3b). This indicates that wind does not always have a blowing-off effect; sometimes, pollutants emitted by surrounding factories could be blown into the downtown area of Beijing. Most double-peak PPEs in the third category appear to be sensitive to meteorological factors: the correlations tend to be higher, and the variations of these factors accord with the changes in PM2.5 concentrations (e.g., Supplementary Information Figure S9b). Other double-peak PPEs display a "small-big peaks" pattern. These PPEs, which show weak correlations between PM2.5 concentrations and meteorological factors, are very likely related to a new source of emission or to enhanced continuous emission (Supplementary Information Table S3c). Considering PPE 3 as an example (8 February 2013), the later, significantly higher peak is attributed mainly to firecrackers on New Year's Eve.
For the triple-peak and multi-peak PPEs in the fourth and fifth categories, the meteorological conditions are relatively stable for the atmospheric condensation process, with average BLH staying at a low level and WS remaining low. Diurnal cycles of PM2.5 concentration can be observed, with synchronous daily variability of RH and BLH. During these PPEs, most PM2.5 concentrations rise to a peak at midnight and fall to a valley at noon (Supplementary Information Table S3d,e). However, RH in the multi-peak PPEs does not show positive correlations with PM2.5 concentrations. This may be attributed to the lag effect of the atmospheric condensation process.
To study the specific evolution processes of accumulation and dispersion in greater depth, we also identified each "rise" and "fall" period in all peaks and compared them with the meteorological factors during the same time. Table 7 shows the correlations between the rate of change of PM2.5 concentrations and meteorological factors (WS, RH, and BLH). We can see that RH affects the accumulation process of PPEs in all categories except the double-peak mode, especially for PPEs with long duration. Negative correlations can be seen between average RH and the rate of rise of PM2.5 concentrations. This result appears to differ from previous studies [15,25], which have demonstrated that pollution accumulates more easily under conditions of higher RH. However, BLH and WS are also important factors affecting the rise pattern of the one-peak and triple-peak processes, respectively. These results are consistent with previous studies [15]. For the dispersion process, a clear negative correlation can be observed between the fall rate of PM2.5 and WS in the one-peak pattern because of the "blowing-off" effect. Weak correlations between the rates of change for the double-peak process and meteorological factors support the interpretation that these PPEs are related to emission sources. Furthermore, the dispersion process of the multi-peak PPEs is also highly correlated with RH and BLH.
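The peak rule and the five categories can be sketched as follows, taking the concentrations at the retained PIPs as input. This is an illustrative reading, not the authors' algorithm: a peak is counted when a rise of more than `min_diff` from the running valley is followed by a fall of more than `min_diff` from the running maximum. The threshold value is not given in the text, and treating "multi-peak" as four or more peaks is an assumption.

```python
from typing import List, Optional


def count_peaks(pip_values: List[float], min_diff: float) -> int:
    """Count rise-fall patterns whose rise and fall both exceed min_diff."""
    if not pip_values:
        return 0
    peaks = 0
    valley = pip_values[0]            # lowest value since the last counted peak
    peak: Optional[float] = None      # running maximum once a rise is confirmed
    for v in pip_values[1:]:
        if peak is None:
            if v < valley:
                valley = v
            elif v - valley > min_diff:
                peak = v              # rise confirmed; start tracking the peak
        elif v > peak:
            peak = v
        elif peak - v > min_diff:
            peaks += 1                # fall confirmed; one full rise-fall pattern
            valley = v
            peak = None
    return peaks


def evolution_category(n_peaks: int) -> str:
    """Map a peak count to one of the five evolution-mode categories."""
    names = {0: "no peak", 1: "one-peak", 2: "double-peak", 3: "triple-peak"}
    return names.get(n_peaks, "multi-peak")


# Synthetic PIP values (μg/m³) and a hypothetical threshold of 50 μg/m³.
n = count_peaks([80.0, 200.0, 120.0, 260.0, 85.0], min_diff=50.0)
print(n, evolution_category(n))
```

Small wiggles on the way up or down (for example 200 to 180 to 210 with a 50 μg/m³ threshold) do not split a peak, which keeps flat episodes in the no-peak category.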
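The comparison between rates of change and meteorological factors can be illustrated with the sketch below. It assumes the rate of change of a "rise" or "fall" period is the concentration difference divided by the period length in hours, and that the correlation is a Pearson coefficient; the text does not name the coefficient used, and all numbers in the example are invented for illustration.

```python
import math
from typing import List


def rate_of_change(start_conc: float, end_conc: float, hours: float) -> float:
    """Mean rate of change over a rise or fall period, in μg/m³ per hour."""
    return (end_conc - start_conc) / hours


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Invented fall periods: (valley-side start, end, hours) and mean WS in m/s.
falls = [(300.0, 100.0, 4.0), (250.0, 150.0, 5.0), (200.0, 170.0, 6.0)]
fall_rates = [rate_of_change(s, e, h) for s, e, h in falls]
mean_ws = [8.0, 4.0, 2.0]
print(round(pearson(fall_rates, mean_ws), 2))
```

Fall rates are negative by construction, so a strongly negative coefficient here means that stronger winds accompany faster decreases in concentration.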

Illustrative Cases
(1) Single peak, wind blowing-off. A one-peak PPE was observed at the end of February, when PM2.5 concentrations increased to a hazardous value of 441 μg/m³ under suitable RH and BLH conditions before 11:00 A.M. However, when WS increased to 8 m/s, PM2.5 concentrations decreased significantly to a moderate level within 3 h (Supplementary Information Figure S9a). This blowing-off effect is probably common in Beijing during winter.
(2) Double peaks, synchronous variations. The double-peak PPE in mid-July shows synchronous variations of PM2.5 concentration and meteorological factors. High correlations between these indices can be observed, and the peak times of PM2.5 almost coincide with the peak (or valley) times of the other three factors (Supplementary Information Figure S9b). This evolution mode of PPE needs relatively stable weather conditions with lower WS and higher RH.
(3) Small-big peak, multi-source emission. The PPE during the Spring Festival shows a typical small-big peak pattern. A significant increase of PM2.5 concentrations on New Year's Eve can be observed after midnight (Supplementary Information Figure S9c). RH is suitable for the condensation process when pollutant emissions from firecrackers are enormous. This PPE pattern is seen regularly in Beijing.

Conclusions
This article documents the characteristics of PM2.5 Pollution Episodes (PPEs) and extracts their evolution modes using hourly PM2.5 observations obtained in Beijing between 1 February 2013 and 28 February 2014. With the aid of a set of descriptive indices, a better understanding of PPEs is gained, and the core conclusions drawn are as follows. PPEs with a large ratio of hazardous or light pollution (clusters 1 and 4) occurred mainly in winter, whereas PPEs for which about half the duration was medium (cluster 6) or light pollution (cluster 5) occurred mainly in summer and autumn. The PPEs that had equal durations of all three pollution levels occurred mainly in spring and winter. These compositions are affected mainly by meteorological factors. The evolution modes of the PPEs were identified based on the peak patterns that reflect the accumulation and dispersion processes of PM2.5 pollution. The greater the number of peaks in the evolution mode, the longer the duration, and the higher the average and maximum PM2.5 concentrations of the PPEs. Each peak in a PPE is identified by "rise" and "fall" patterns that reflect the accumulation and dispersion processes of the PPEs, respectively. The rise patterns in each peak are related to RH. The fall patterns in the one-peak PPEs are affected mainly by WS, whereas those in the multi-peak PPEs are related to BLH.
These results suggest that the peak patterns cannot be fully explained by meteorological factors alone, but that they might also be closely related to complex and diverse human activities. Most importantly, these findings are helpful for furthering our understanding of PM2.5 pollution mechanisms, and they can be used to improve the accuracy of model simulations of air quality.