3.1. Clustering Analysis
The aerosol data are partitioned into 2 to 8 clusters by the K means method, and the DB index is calculated for each cluster number. The values are shown in Figure 3. The DB index is smallest when the cluster number is four, so we take four as the appropriate number of clusters. The sample sizes and centroids of the four clusters are listed in Table 3. For more robust results, we repeat the clustering using an independent technique, SOM, and test different numbers of nodes. Using 3 nodes produces a slightly lower DB index than using 4 nodes, but the sample sizes of the three types are quite unbalanced (one type contains the majority of the data points). With 5 or more nodes, the DB index increases. We therefore consider 4 nodes the optimum number, which also agrees best with the K means results. The SOM results are given in parentheses in Table 3, alongside the K means results. The results of the two techniques are almost identical, which increases the credibility of the clustering.
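As an illustration of how a DB index can be used to compare candidate cluster numbers, the following minimal sketch computes the index for one labeling of the data. It assumes that "DB" denotes the Davies-Bouldin index and that the distance is Euclidean; the function names and the toy points are hypothetical and are not taken from the study.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labeling (lower means better separation).

    For each cluster i, take the largest ratio (s_i + s_j) / d_ij over all
    other clusters j, where s is the mean distance of members to their
    centroid and d_ij is the distance between centroids. The index is the
    mean of these worst-case ratios. Requires at least two clusters.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    keys = sorted(clusters)
    cents = {k: centroid(clusters[k]) for k in keys}
    scatter = {
        k: sum(math.dist(p, cents[k]) for p in clusters[k]) / len(clusters[k])
        for k in keys
    }
    total = 0.0
    for i in keys:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in keys
            if j != i
        )
    return total / len(keys)


# Toy data (invented): two compact groups far apart.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
db_good = davies_bouldin(toy_points, [0, 0, 1, 1])  # groups kept together
db_bad = davies_bouldin(toy_points, [0, 1, 0, 1])   # groups split apart
```

In a selection like the one described above, the index would be evaluated for the labeling produced at each cluster number, and the number with the smallest value retained.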
To analyze the spatial distribution and seasonal variability of aerosol types in the next section, we need aerosol type information from all sites. We therefore use all data in this step, and the data not used in the clustering analysis must be classified separately according to their distance from the cluster centroids. For the observations used in the clustering step, this reclassification does not change the results, because the reclassification criterion is the same as that used in K means: K means assigns each data point to the type whose centroid is closest to it, and the reclassification does the same. A data point classified into one type by K means therefore cannot be assigned to a different type by reclassification. All 39,249 data records are then classified into the four defined types according to the Euclidean distance between each record and the centroids of the four types, i.e., each record is assigned to the type whose centroid is nearest. In addition, we define a distance threshold for each type as the longest distance from its centroid; if the shortest distance between a data point and a centroid is still greater than the distance threshold of that type, the data point is not classified and is discarded. In this way, the 39,249 data records are classified with only 47 records discarded.
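The reclassification rule can be sketched as follows. This is a minimal illustration rather than the code used in the study: it assumes that the threshold of each type is the largest centroid-to-member distance among the points used for clustering, and the function names and toy values are hypothetical.

```python
import math


def type_thresholds(points, labels, centroids):
    """Per-type threshold: largest distance from the centroid to any member.

    Assumption: 'members' are the points used in the clustering step, and
    labels are integer indices into `centroids`.
    """
    thresholds = [0.0] * len(centroids)
    for p, k in zip(points, labels):
        thresholds[k] = max(thresholds[k], math.dist(p, centroids[k]))
    return thresholds


def classify(point, centroids, thresholds):
    """Return the index of the nearest centroid, or None if the point lies
    farther from that centroid than the type's threshold (discarded)."""
    dists = [math.dist(point, c) for c in centroids]
    nearest = min(range(len(centroids)), key=dists.__getitem__)
    return nearest if dists[nearest] <= thresholds[nearest] else None


# Toy example (invented values): two types in a 2-D parameter space.
toy_centroids = [(0.0, 0.0), (10.0, 0.0)]
toy_members = [(0.0, 1.0), (1.0, 0.0), (10.0, 2.0), (9.0, 0.0)]
toy_labels = [0, 0, 1, 1]
toy_thr = type_thresholds(toy_members, toy_labels, toy_centroids)
```

Because the nearest-centroid rule is the same one K means applies at convergence, points that took part in the clustering keep their original type under this scheme.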
Moreover, to rule out the effect of sampling on the clustering results, we perform a Monte Carlo type validation: half of the original data are randomly selected as a subset, the data processing and clustering analysis are repeated on this subset, and the whole procedure is carried out ten times. Each time, we compare the clustering results of the subset with those of all data. If an observation is classified as the same aerosol type in both cases, it is marked as “correct”, and the accuracy rate of each of the four types is obtained by counting the correct classifications. Figure 4 shows the accuracy rates for these ten experiments using the K means method. Most accuracy rates are above 80%, with more than half of them exceeding 90%. We therefore consider the four identified types to be stable.
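The bookkeeping behind such an accuracy rate can be sketched as below. The sketch makes two assumptions that the text does not specify: the type labels of the subset run have already been matched to those of the full run (cluster indices are arbitrary between runs), and the accuracy of a type is computed relative to the observations that the full-data clustering assigns to it. All names are hypothetical, and the clustering itself is omitted.

```python
import random
from collections import Counter


def draw_half(obs_ids, rng):
    """Randomly select half of the observation ids without replacement."""
    return rng.sample(obs_ids, len(obs_ids) // 2)


def accuracy_by_type(full_labels, subset_labels):
    """Fraction of subset observations whose type matches the full-data type.

    full_labels:   dict obs_id -> type from clustering all data
    subset_labels: dict obs_id -> type from clustering the random subset
    Returns a dict type -> accuracy, grouped by the full-data type.
    """
    total, correct = Counter(), Counter()
    for obs_id, subset_type in subset_labels.items():
        reference = full_labels[obs_id]
        total[reference] += 1
        if subset_type == reference:
            correct[reference] += 1
    return {t: correct[t] / total[t] for t in total}


# Toy example (invented labels): one of three subset members changes type.
toy_full = {1: "dust", 2: "dust", 3: "scattering fine", 4: "scattering fine"}
toy_subset = {1: "dust", 2: "scattering fine", 3: "scattering fine"}
toy_accuracy = accuracy_by_type(toy_full, toy_subset)
```

Repeating the draw and the comparison ten times with different random seeds would give one accuracy rate per type and per experiment.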
The four types identified by the analysis are tentatively named dust, scattering mixed type, absorbing mixed type and scattering fine type, respectively, primarily based on their scattering/absorption parameters (single scattering albedo, refractive indices, absorption Angstrom Exponent) and size parameters (effective radius, fine mode fraction, extinction Angstrom Exponent). Because high dimensional data are difficult to visualize, Figure 5a–d shows scatter plots of two representative parameters, single scattering albedo (SSA) and fine mode fraction (FMF). Type 3 has an SSA of about 0.85 at 440 nm, which indicates the strongest absorption. Type 2 and Type 4 have SSA above 0.9 at all four wavelengths, which implies relatively strong scattering. On the other hand, Type 1 has the lowest FMF, while Type 4 has the highest FMF. As indicated by Chen et al. [19], FMF < 0.4 indicates that coarse particles dominate the aerosol model, FMF > 0.6 indicates that fine particles dominate, and values in between indicate a mixed aerosol model. In Figure 5e, we further define a range for each type based on the scatter plots. Although the scattering mixed type overlaps slightly with the absorbing mixed and scattering fine types, overall the ranges of the four types are well separated. Dust has the lowest FMF and moderate SSA; scattering fine has the highest FMF and highest SSA; absorbing mixed has the lowest SSA and moderate FMF, while the properties of scattering mixed lie between those of scattering fine and absorbing mixed. These results support the validity of our clustering results.
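The FMF criterion quoted from Chen et al. [19] amounts to a three-way threshold rule, sketched below. The function name and the returned labels are hypothetical; values exactly equal to 0.4 or 0.6 are treated as mixed here, which is an assumption, since the text only gives the strict inequalities.

```python
def size_regime(fmf):
    """Map a fine mode fraction to a size regime.

    FMF < 0.4 -> coarse particles dominate
    FMF > 0.6 -> fine particles dominate
    otherwise -> mixed (boundary values included by assumption)
    """
    if fmf < 0.4:
        return "coarse"
    if fmf > 0.6:
        return "fine"
    return "mixed"
```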
In addition to single wavelength properties, the spectral dependencies of many parameters are also associated with aerosol composition. For example, as indicated by several previous studies [25], dust SSA increases with wavelength due to its UV absorption, whereas anthropogenic aerosols show the reverse behavior. We therefore examine the spectra of three parameters for the four types in Figure 6: (a) single scattering albedo (SSA), (b) asymmetry parameter (g), (c) aerosol optical depth (AOD). Figure 6a clearly shows that the SSA of desert dust increases with wavelength while that of the other types decreases with wavelength, which is due to the absorption of dust at UV wavelengths and is consistent with previously documented results. Moreover, the SSA spectra of the absorbing and scattering mixed aerosols both exhibit non-monotonic behavior, i.e., SSA first increases with wavelength from 440 to 675 nm and then decreases. This indicates mixing of black carbon with dust or organic carbon and is a typical feature of East Asian aerosols [27]. According to the more quantitative method proposed by Li et al. [27], the absorbing mixed type (with greater SSA spectral curvature) is likely black carbon mixed more with dust, whereas the scattering mixed type likely contains more organic carbon. With respect to the asymmetry parameter, the g factor of desert dust is above 0.7 and varies little with wavelength, whereas that of the other types decreases rapidly with wavelength (Figure 6b). Figure 6c also shows that dust AOD has a much flatter spectrum than the others. These clear differences between dust and the other aerosol types confirm that the former indeed consists primarily of larger particles.
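The qualitative SSA spectral behaviors discussed above (increasing, decreasing, or rising and then falling) can be told apart with a simple check on successive differences. The sketch below is only an illustration: it is not the quantitative curvature method of Li et al. [27], the function name and labels are hypothetical, and the SSA values in the example are invented. The input is assumed to be ordered by increasing wavelength.

```python
def ssa_spectral_shape(ssa):
    """Classify an SSA spectrum given in order of increasing wavelength.

    Returns "increasing", "decreasing", "peaked" (rises to an interior
    maximum, then falls) or "other". A flat spectrum counts as "increasing".
    """
    diffs = [b - a for a, b in zip(ssa, ssa[1:])]
    if all(d >= 0 for d in diffs):
        return "increasing"
    if all(d <= 0 for d in diffs):
        return "decreasing"
    peak = ssa.index(max(ssa))
    rises = all(d >= 0 for d in diffs[:peak])
    falls = all(d <= 0 for d in diffs[peak:])
    if 0 < peak < len(ssa) - 1 and rises and falls:
        return "peaked"
    return "other"


# Invented spectra, ordered from the shortest to the longest wavelength.
dust_like = [0.90, 0.94, 0.95, 0.96]
mixed_like = [0.88, 0.91, 0.90, 0.89]
```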
3.2. Spatial Distribution and Seasonal Variability of the Aerosol Types
With the clustering information in hand, it becomes possible to examine the spatial and temporal variability of the major aerosol types in China. Figure 7 shows the distribution of the frequency of occurrence of each type for each season. Note that although the clustering only uses a selected portion of the AERONET data described in Section 2, Figure 7 is produced using all data. Each data point is classified into one of the four types according to the threshold mentioned in Section 3.1 and the Euclidean distance between the data point and the center of each type. Some sites do not appear on the map during certain seasons because they have data available only for some seasons. In addition, five sites representative of major pollution source regions in China (SACOL, Beijing, Taihu, Taiwan Cheng Kung University and Hong Kong) are selected to examine the seasonal variability of aerosol types in detail (Figure 8). Figure 7 shows that desert dust mainly occurs in northwest China in the spring (MAM), close to the major dust sources of the Taklimakan and Gobi deserts. The North China Plain also has significant amounts of dust in the spring due to transport by the prevailing westerly winds in this season. These results are consistent with previous studies [29]. Figure 8 also reveals that dust has a high frequency of occurrence in the spring at SACOL (Northwest China) and Beijing (North China Plain). Moreover, Taihu is also dominated by dust aerosol in the spring. The scattering mixed type occurs mostly in the spring and summer (JJA) over East China and South China, and it also appears relatively frequently in South China all year round (Figure 7 and Figure 8). The large number of such aerosols in South China is due to emissions from biomass burning, industrial activities, automobile exhaust and so on. Another reason may be that organic carbon aerosols released by extensive biomass burning in Southeast Asia are transported to South China by the monsoon [32]. The total amount of the absorbing mixed type is relatively small; it is mainly found over the North China Plain and East China and occurs most frequently in the fall (SON) and winter (DJF) (Figure 7 and Figure 8). This seasonal feature is consistent with the heating period, when many carbonaceous aerosols are released from coal combustion and vehicle exhaust [33]. The scattering fine type (Figure 7 and Figure 8) is mainly concentrated over the North China Plain and East China during the summer and fall. These two seasons are characterized by high humidity and precipitation, and suspended water droplets can facilitate the conversion of gases to particles, leading to more fine mode aerosols with strong scattering (such as the conversion of sulfur dioxide to sulfate particles [33]). Moreover, hygroscopic growth of aerosols also increases their scattering efficiency. Both factors favor the higher scattering properties retrieved by AERONET. Nonetheless, due to the limitations of remote sensing data, we cannot isolate the relative humidity effect.
Note that here we focus on identifying the optical and microphysical properties of different aerosol types, which are critical in aerosol forcing estimation and satellite aerosol retrievals. The remote sensing measurements that we use cannot give the detailed chemical composition of aerosols. By comparing Figure 7 with existing studies that report aerosol chemical composition in China ([5]), we can make some inferences about the composition of the four aerosol types. The dust type should mainly contain sand dust, urban fugitive dust and coal ash ([7]), which are the main components of aerosol in northwest China. Zhang et al. [7] also found that black carbon, which absorbs strongly, has high mass concentrations in North China, especially during autumn and winter, consistent with the spatial and seasonal distribution of our absorbing mixed type. Many previous studies [5] showed that small, strongly scattering particles such as organic carbon, sulfate, nitrate and ammonium all have high concentrations in China, especially in urban areas, due to various emission sources. They also account for a large proportion of aerosol composition in summer, which may be driven by aqueous processing and gas-phase photochemical production. It is therefore likely that the two scattering types mainly consist of organic carbon, ammonium sulfate and nitrate. Moreover, judging from its SSA spectra, the scattering mixed type should contain more organic carbon. Figure 8 also shows the typical aerosol types in representative regions of China. For example, SACOL (Northwest China) is dominated by dust aerosol with large particles almost throughout the year. Beijing (North China) and Taihu (Southeast China) are regions where all four types occur with appreciable frequency, so there is a significant mixture of coarse and fine particles in these regions. The frequency of dust aerosols is close to 0 at Taiwan Cheng Kung University and Hong Kong (South China), so this region is mainly dominated by fine particles.
Briefly, dust aerosols are mainly found over Northwest China and the North China Plain in the spring, scattering mixed aerosols over Southeast China in the spring and summer, absorbing mixed type over East China in the fall and winter, and scattering fine type over East China in the summer and fall.
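The seasonal frequencies of occurrence summarized above can be tallied with a few lines of code. The sketch below assumes that the frequency of a type is the fraction of classified observations in a given season (at one site) that belong to that type; this definition, the function name and the toy records are assumptions for illustration. Seasons follow the MAM, JJA, SON and DJF labels used in the text.

```python
from collections import Counter, defaultdict

# Month number -> meteorological season label used in the text.
SEASON = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def seasonal_frequency(records):
    """records: iterable of (month, aerosol_type) for one site.

    Returns {season: {type: fraction of that season's observations}}.
    Seasons without observations are simply absent from the result.
    """
    counts = defaultdict(Counter)
    for month, aerosol_type in records:
        counts[SEASON[month]][aerosol_type] += 1
    return {
        season: {t: n / sum(c.values()) for t, n in c.items()}
        for season, c in counts.items()
    }


# Toy records (invented): three spring observations and one in summer.
toy_records = [(3, "dust"), (4, "dust"), (5, "scattering mixed"),
               (7, "scattering fine")]
toy_freq = seasonal_frequency(toy_records)
```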
An advantage of sun photometers is that they measure continuously during the daytime, which allows the diurnal variability of aerosol type to be examined. We conduct a statistical analysis of the diurnal variation of aerosol types by calculating the average frequency of diurnal aerosol type change. Specifically, for each station, we count the number of days with at least four observations and then the number of those days on which more than one type is found; the ratio of the latter to the former represents the frequency of diurnal change of aerosol type. The results are shown in Figure 9. Except for Northwest China, diurnal type change is frequently observed (frequency > 0.5) at most East and South China sites. This is reasonable because aerosol composition in the Northwest is relatively simple (mainly dust, as discussed in the previous section), whereas in East and South China complicated mixtures are often found due to both local emission and remote transport. We also analyzed the frequency of occurrence of the four aerosol types in the morning and afternoon but did not find outstanding patterns. Diurnal variability of aerosol types is therefore a common phenomenon and cannot be ignored. This has important implications for both climate modeling and satellite remote sensing. Current climate models generally use averaged optical parameters, while different aerosol types result in different radiative forcing. In addition, aerosol retrieval from geostationary platforms has become increasingly popular. Such retrievals also need to assume an aerosol model, and failing to account for the diurnal variability of aerosol type will inevitably introduce uncertainties in the results. We will present a detailed analysis of the effect of aerosol diurnal change on radiative forcing and aerosol retrieval in a following study.
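The diurnal change frequency defined above (days with more than one type, divided by days with at least four observations) can be sketched for a single station as follows. The function name, the record layout and the toy observations are hypothetical, and it is assumed here that only days meeting the four-observation minimum are counted in the numerator.

```python
from collections import defaultdict


def diurnal_change_frequency(observations, min_obs=4):
    """observations: iterable of (day, aerosol_type) for one station.

    Returns the fraction of days with at least `min_obs` observations on
    which more than one aerosol type was found, or None if no day
    qualifies.
    """
    by_day = defaultdict(list)
    for day, aerosol_type in observations:
        by_day[day].append(aerosol_type)
    valid = [types for types in by_day.values() if len(types) >= min_obs]
    if not valid:
        return None
    changed = sum(1 for types in valid if len(set(types)) > 1)
    return changed / len(valid)


# Toy station (invented): day 1 is uniform, day 2 changes type, and day 3
# has too few observations to be counted.
toy_obs = (
    [("d1", "dust")] * 4
    + [("d2", "dust")] * 2 + [("d2", "scattering mixed")] * 2
    + [("d3", "dust"), ("d3", "scattering fine"), ("d3", "dust")]
)
toy_frequency = diurnal_change_frequency(toy_obs)
```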
3.3. Aerosol Type Map for Satellite Remote Sensing
Another important practical use of aerosol classification is to inform the aerosol model assumed in passive satellite retrievals, such as those from the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Visible Infrared Imaging Radiometer Suite (VIIRS). For example, the aerosol model used in the MODIS retrieval algorithm is based on the classification study using AERONET measurements by Omar et al. [17]. However, many of these global models are not very representative of East Asian aerosols. In fact, only one station in China, Beijing, is used in the Omar et al. [17] work. On the one hand, East Asia is a global pollution hotspot with complicated emission sources and aerosol composition and deserves special attention in aerosol observation. On the other hand, current satellite aerosol products still have large uncertainties over this region, and the aerosol model assumption has been suggested as a major source of these uncertainties (e.g., [13]). It is thus necessary to refine the representation of aerosol optical and microphysical properties in East Asia in the retrieval algorithms in order to improve the retrieval accuracy over this region.
The SSA and g spectra used by MODIS and VIIRS are shown in Figure 10; the aerosol model information comes from reference [38]. Comparing Figure 6 with Figure 10, our desert dust type clearly corresponds to the dust model in Figure 10, and we can infer that the absorbing mixed type should contain more black carbon. The scattering fine type may be dominated by sulfate and nitrate aerosols. However, because in reality aerosols are almost always mixed and the sun photometers retrieve integrated properties, it is not possible to define their chemical composition accurately. We find that the major difference is the lack of clearly curved SSA spectra in the MODIS and VIIRS fine aerosol models, a characteristic that occurs frequently over East Asia. We further attempt to generate a seasonal map of aerosol type distribution in China based on the spatial and seasonal variability of the four aerosol types revealed by Figure 7, as shown in Figure 11. Although the resolution is coarse and many places are not covered due to the limited number of AERONET sites, this map provides more detailed information about aerosol type variability in China than that used in many satellite aerosol retrieval algorithms such as MODIS and VIIRS. With the ongoing ground observation efforts in China, this map can be expected to become more accurate in the future.