Analyzing the Characteristics of Cloud Condensation Nuclei (CCN) in Hebei, China Using Multi-Year Observation and Reanalysis Data

Abstract: Studying the characteristics and variations of cloud condensation nuclei (CCN) is important for understanding the effects of aerosol-cloud interactions. This paper uses observation data from an urban area of Shijiazhuang in North China from 2005 to 2007, along with the corresponding MERRA-2 and ERA5 data, to analyze the characteristics of CCN, rank the factors affecting the diurnal variation of CCN number concentration (N_CCN) by importance, and establish the relationship between N_CCN and supersaturation (SS) in this heavily polluted region. The results show that the daily time series of N_CCN in Shijiazhuang, China, has a bimodal distribution. By calculating the correlation between CCN and pollutants observed in winter 2007, we identified SO2 as the dominant factor for the morning peak of N_CCN and NO2 as the dominant factor for the evening peak. We also ranked the factors affecting the diurnal variation of N_CCN using observation and reanalysis data and found that pollutant concentration has the greatest impact in summer, whereas atmospheric stability has the greatest impact in winter. Finally, we determined the relationship between N_CCN and SS according to the Twomey formula ($N_{\mathrm{CCN}} = c\,SS^{k}$) and found a reasonable range (0.5 to 0.7) for the parameter k in East and North China. Specifically, k = 0.5 is more reasonable in summer and k = 0.7 in winter.


Introduction
The multiphase system composed of solid and liquid particles suspended in the air is called aerosol. Changes in aerosol composition, particle size, and number concentration have important effects on the radiation budget, clouds, and precipitation in the earth system [1][2][3][4]. Aerosols can cool (or warm) the earth-atmosphere system by scattering (or absorbing) solar radiation directly (i.e., aerosol-radiation interaction). In addition, some aerosols can indirectly affect cloud radiative properties [2] and precipitation processes [5] by influencing the cloud droplet size and number in warm stratus clouds, namely aerosol-cloud interaction. These aerosols are called cloud condensation nuclei (CCN). CCN play a significant role in aerosol and cloud physics because they link aerosol characteristics to the cloud formation process. Due to the complexity of aerosol-cloud interaction and cloud/rain processes, there are large uncertainties in estimating the effect of aerosol on climate using global climate models (GCM) [6][7][8]. To better evaluate climate model simulations of aerosol-cloud interaction and reduce this uncertainty, a more in-depth study of the characteristics of CCN over large spatial scales and long time periods is necessary.
In addition, the parameterization of N_CCN has been a focus of previous research. The Twomey formula, $N_{\mathrm{CCN}} = c\,SS^{k}$ [39], is one of the most widely used parameterizations for determining N_CCN from SS. Determining c and k in this parameterization scheme is therefore an important objective as well. Some studies fitted the parameters for a certain season using single-year observation data. For example, Hudson [40] and Martins et al. [41] determined the autumn parameters in America and Brazil, respectively. More generally, however, researchers could only obtain parameters for a short period (typically days) because of limited data, especially in China [18,24,25,27,29]. The lack of a parameterization scheme derived from long-term CCN data hinders further application in GCM and large-scale studies.
Beginning with the 11th Five-Year Plan (FYP) in 2006-2010, the Chinese government set a goal of controlling air pollution and worked to reduce sulfate emissions from coal combustion by adopting new desulfurization techniques. In September 2013, the "Air Pollution Prevention and Control Action Plan" was released and became the most influential policy for improving China's environment in the subsequent decade. Correspondingly, air quality in North China has improved continuously since 2010, as shown by multiple surface measurements [42,43] and satellite datasets [44,45]. Because our study focuses on CCN in the heavily polluted period, we excluded the CCN data after 2010. Noting that 2006 was a year with severe pollution [46], we selected the observation data from 2005 to 2007 for this study. Specifically, we used multi-season measurements conducted in Shijiazhuang, a polluted megacity, in summer and winter during 2005-2007 to analyze the characteristics of CCN. The concurrent Modern-Era Retrospective Analysis for Research and Applications version 2 (MERRA-2) and ECMWF Reanalysis 5th Generation (ERA5) data are also analyzed to identify the factors affecting the diurnal variation of N_CCN and to establish the relationship between N_CCN and SS in the heavily polluted region. This paper is organized as follows. The next section describes the data and methodology, followed by the results in Section 3. The conclusion and discussion are given in Section 4.

Station Data
Shijiazhuang (38°02′ N, 114°31′ E), the capital of Hebei, China, was selected as the study site; it is a megacity in North China located in one of the most polluted regions in the world [47]. We used observations of CCN concentration, pollutants (PM10, NO2, and SO2), and meteorological variables (temperature and wind speed) to study the characteristics of CCN in this heavily polluted region. The location of the study area and the observation periods are shown in Figure 1. CCN was observed at the Hebei Meteorological Service Station (OBS_S) during summer and winter periods (see Figure 1 or column "Time" in Table 1), where "winter" and "summer" are roughly defined as December-January-February and June-July-August, respectively, with a half-month extension of each period allowed in order to include as much data as possible. Meteorological data were obtained at the Meteorological Observation Station (38°02′ N, 114°25′ E) (OBS_M), approximately 9 km from the OBS_S station, where surface temperature and wind speed at 10 m height were available during all the CCN measurement periods.
At the OBS_S site, a CCN Counter, model CCN-100 from Droplet Measurement Technologies (DMT; Boulder, CO, USA), was used to measure N_CCN at four SS levels (0.1%, 0.3%, 0.5%, and 1.0%) by exploiting the difference between the heat and moisture diffusion coefficients [48]. The theoretical error of the CCN concentration observed by this CCN Counter is no more than 10% [49]. To ensure the accuracy of the CCN data, the CCN Counter was calibrated by the manufacturer every year before observations [14]. The CCN data have a temporal resolution of 1 s and were averaged hourly before analysis. The OBS_E and OBS_M sites provided, respectively, pollutant data (the mass concentrations of PM10 (M_PM10), NO2 (M_NO2), and SO2 (M_SO2)) and meteorological data (the surface temperature (T_O) and 10 m wind speed (WS10_O)). Both datasets have a temporal resolution of 1 h. See Table 1 for more details.
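The hourly averaging of the 1 s CCN record described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `hourly_mean`, the treatment of missing readings as `None`, and the synthetic concentrations are all hypothetical and are not taken from the study's processing chain.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import fmean


def hourly_mean(samples):
    """Average (timestamp, value) samples into hourly bins.

    Returns a dict mapping the start of each hour to the mean of the
    valid (non-None) values recorded within that hour.
    """
    bins = defaultdict(list)
    for stamp, value in samples:
        if value is None:  # assumption: missing readings are stored as None
            continue
        hour_start = stamp.replace(minute=0, second=0, microsecond=0)
        bins[hour_start].append(value)
    return {hour: fmean(values) for hour, values in sorted(bins.items())}


# Synthetic 1 s record (made-up values in cm^-3): two consecutive hours.
start = datetime(2007, 1, 15, 8, 0, 0)
samples = [
    (start + timedelta(seconds=i), 5000.0 if i < 3600 else 7000.0)
    for i in range(7200)
]
hourly = hourly_mean(samples)
```

Binning by the start of the hour keeps the result directly comparable with the 1 h pollutant and meteorological series.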

Reanalysis Data
To analyze the factors affecting the long-term CCN variation more comprehensively, we needed corresponding climatological data. Here, MERRA-2 and ERA5 data for the Shijiazhuang region (38°00′ N-38°03′ N, 114°24′ E-114°33′ E) were extracted. MERRA-2 is the latest atmospheric reanalysis of the modern satellite era produced by NASA's Global Modeling and Assimilation Office (GMAO) [50] and has often been used to analyze aerosol characteristics and aerosol-climate interactions [51][52][53]. MERRA-2 data have proved reliable in representing the long-term temporal and spatial distribution of pollutants (i.e., PM2.5, PM10, sulfate, organic carbon, and black carbon) in China [54][55][56]. We downloaded the pollutant and meteorological data from Giovanni (https://giovanni.gsfc.nasa.gov/giovanni/, accessed on 21 January 2020), including the surface mass concentration of dust (M_Dust), the column mass concentration of SO2 (C_SO2), the total column concentration of ozone (C_O3), the column mass concentration of black carbon (C_BC), the surface temperature (T_M), the 10 m wind speed (WS10_M), and the surface wind speed (WS). ERA5 performs well in characterizing meteorological conditions (i.e., temperature, wind speed, and precipitation) for climate studies [57,58]. Therefore, ERA5 data (https://cds.climate.copernicus.eu/cdsapp#!/home, accessed on 8 January 2020) were used to analyze the effect of meteorological conditions on CCN variation in this study. We extracted the surface temperature (T_E), the relative humidity (RH), the 10 m wind speed (WS10_E), the boundary layer height (BLH), the boundary layer dissipation (BLD), the convective available potential energy (CAPE), and the convective inhibition (CIN). Both MERRA-2 and ERA5 data were processed as hourly data, and more information about them is given in Table 1.

Methodology
We verified the consistency of the MERRA-2 and ERA5 data with the available observations at the OBS_M site, using WS10_O to verify the 10 m wind speeds WS10_M and WS10_E, and T_O to verify the surface temperatures T_M and T_E. The Pearson correlation coefficients (r) of WS10_M and WS10_E with WS10_O are 0.677 and 0.591, respectively. For temperature, the r values are 0.963 and 0.940, respectively. All correlations passed the significance test at the 95% confidence level, indicating reasonable consistency between the reanalysis data and the observations. Therefore, we used both MERRA-2 and ERA5 data to carry out analyses when station data were not available.
To highlight the variation of each variable, we mainly used the normalized change of a variable, defined as
$$V = \frac{V_t - V_{\mathrm{MIN}}}{V_{\mathrm{MAX}} - V_{\mathrm{MIN}}}$$
where V (any individual variable normalized over a specified period in Table 1, e.g., N_CCN in winter 2007) is the time series of the normalized change of V_t, which is the value of the variable at a certain observational time point t (e.g., the measured N_CCN on one day in winter 2007), and V_MAX and V_MIN are the maximum and minimum values of the variable during the whole observational period (e.g., the maximum and minimum values of N_CCN in winter 2007). The normalized time series highlight the relative variations of the variables, reduce the complexity of the graphics, and avoid unit conversion, making the analysis more concise. We used the Pearson correlation coefficient (r) to characterize the linear correlation between different variables [59]. It is worth noting that this normalization is a linear rescaling with a positive scale factor, so it does not change the r values between different variables, which ensures the reliability of our correlation analysis.
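The claim that normalization leaves r unchanged can be checked numerically. The sketch below assumes the standard min-max form of the normalization; the helper names and the two short series are made up for illustration and are not observations from the study.

```python
from math import sqrt


def min_max_normalize(series):
    """Rescale a series to [0, 1] using its own minimum and maximum."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be normalized")
    return [(v - lo) / (hi - lo) for v in series]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up example values (not data from the paper).
n_ccn = [4200.0, 5100.0, 6900.0, 6100.0, 4800.0, 7500.0]
pm10 = [110.0, 150.0, 210.0, 160.0, 140.0, 230.0]

r_raw = pearson_r(n_ccn, pm10)
r_norm = pearson_r(min_max_normalize(n_ccn), min_max_normalize(pm10))
```

Because each normalized series is the raw series shifted by a constant and divided by a positive constant, `r_raw` and `r_norm` agree up to floating-point rounding.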

Diurnal Variation of N_CCN
Because the N_CCN data at SS = 0.3% had the most valid samples, we selected N_CCN at this supersaturation to analyze the characteristics of the diurnal variation. The absolute variations of N_CCN in the five measurement periods are shown in Figure 2a. In addition, to highlight the characteristics of the diurnal variation, the normalized N_CCN in summer and winter are plotted in Figure 2b,c. Overall, the diurnal variations of N_CCN in the different periods are bimodal, but the daily series differ between winter and summer in two respects. First, the relative intensities of the two peaks differ: the morning peak is more prominent in summer, whereas the evening peak is stronger in winter. Second, the timing of the peaks differs between seasons. In summer, the two peaks appear around 09:00 Local Standard Time (LST) in the morning and 22:00 LST in the evening, whereas in winter they appear around 08:00 LST in the morning and 18:00 LST in the evening. These results suggest that the diurnal variations of N_CCN in winter and summer might be influenced by different factors. We analyze the possible influencing factors in more detail in Section 3.3.
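One way to locate the two peaks of a diurnal cycle is to average all hourly values by hour of day and then pick the local maxima. The sketch below is a hypothetical illustration, not the procedure used for Figure 2: the function names are invented, every hour of day is assumed to have at least one value, and the two synthetic days are constructed to peak at 08:00 and 18:00.

```python
from collections import defaultdict


def diurnal_cycle(hourly_records):
    """Mean value per hour of day from (hour_of_day, value) pairs.

    Assumes every hour 0..23 occurs at least once in the input.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for hour, value in hourly_records:
        sums[hour] += value
        counts[hour] += 1
    return [sums[h] / counts[h] for h in range(24)]


def local_peaks(cycle):
    """Hours whose value exceeds both neighbours, wrapping at midnight."""
    n = len(cycle)
    return [
        h for h in range(n)
        if cycle[h] > cycle[h - 1] and cycle[h] > cycle[(h + 1) % n]
    ]


def synthetic_value(hour):
    """Made-up profile (cm^-3) with bumps centred on 08:00 and 18:00."""
    morning = 2000.0 * max(0.0, 1.0 - abs(hour - 8) / 4)
    evening = 3000.0 * max(0.0, 1.0 - abs(hour - 18) / 4)
    return 5000.0 + morning + evening


# Two synthetic days that differ by a constant offset of +/-100 cm^-3.
records = [(h, synthetic_value(h) + 100.0) for h in range(24)]
records += [(h, synthetic_value(h) - 100.0) for h in range(24)]

cycle = diurnal_cycle(records)
peaks = local_peaks(cycle)
```

With real data, some smoothing before peak detection would likely be needed, since hour-to-hour noise produces spurious local maxima.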

Diurnal Variation of N_CCN in Winter
Pollutants in the atmosphere have an important effect on the ability of aerosols to activate and form cloud droplets. Zhang et al. [60] pointed out that aerosol activity is linked not only to particle composition but also to mixing state, aging, and physical processes (such as coating and condensation). Highly soluble pollutants, including gas precursors such as SO2 and NO2, can affect the formation of hygroscopic matter and influence particle hygroscopicity [61]. The more SO2 and NO2 coating there is on the aerosol, the more hygroscopic the particles are and the more easily they serve as CCN. Here, we obtained the hourly-averaged concentrations of pollutants (NO2, PM10, and SO2) from the available station data in winter 2007 and used them to identify possible reasons for the two peaks in the diurnal variation of N_CCN. As shown in Figure 3, N_CCN is highly consistent with the concentrations of pollutants, especially NO2 and PM10. The r values of M_NO2 and M_PM10 with N_CCN are 0.789 and 0.771, and both pass the significance test at the 99% confidence level. Considering the a.m. and p.m. periods separately, the positive relationship is apparent for SO2 (r = 0.797) in the a.m. period and for NO2 (r = 0.933) in the p.m. period, suggesting that the dominant factor affecting the diurnal variation of N_CCN is time-dependent. In the morning, SO2 plays a critical role, whereas NO2 becomes more significant in the afternoon and dominant in the evening. PM10 and N_CCN maintain a high positive correlation throughout the day, which indicates that the total aerosol particle concentration plays a decisive role in the N_CCN variation. Note that the data analyzed in this sub-section reflect winter 2007 only. SO2 is mainly emitted from heating facilities in North China [62,63], and the frequent temperature inversions on winter evenings may allow this pollutant to accumulate and reach a peak in the morning. NO2 comes mainly from daytime traffic emissions [64] and likely accumulates to reach a peak in the evening.
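The a.m./p.m. comparison above can be sketched by splitting each 24-value diurnal series at noon and computing r separately in each half. The split point (00:00-11:59 versus 12:00-23:59), the function names, and all series values below are assumptions made for illustration; the synthetic pollutants are built so that one tracks N_CCN only in the morning and the other only in the afternoon.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def am_pm_correlation(pollutant, target):
    """Return (r_am, r_pm) for two 24-value series indexed by hour of day.

    Assumption: a.m. covers hours 0-11 and p.m. covers hours 12-23.
    """
    if len(pollutant) != 24 or len(target) != 24:
        raise ValueError("expected one value per hour of day")
    r_am = pearson_r(pollutant[:12], target[:12])
    r_pm = pearson_r(pollutant[12:], target[12:])
    return r_am, r_pm


# Synthetic, made-up diurnal series (arbitrary units, not paper data).
n_ccn = [3.0, 3.0, 2.0, 2.0, 3.0, 4.0, 6.0, 8.0, 9.0, 7.0, 5.0, 4.0,
         4.0, 4.0, 5.0, 6.0, 8.0, 9.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0]
so2 = [10.0 * v for v in n_ccn[:12]] + [40.0, 50.0] * 6
no2 = [30.0, 20.0] * 6 + [5.0 * v + 10.0 for v in n_ccn[12:]]

so2_am, so2_pm = am_pm_correlation(so2, n_ccn)
no2_am, no2_pm = am_pm_correlation(no2, n_ccn)
```

Each half contains only 12 hourly values, so in practice the significance of such sub-period correlations deserves a separate check.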

Factors Affecting the Diurnal Variation of N_CCN
The diurnal variations of N_CCN are related to the factors affecting aerosol activation, for instance, the physical and chemical characteristics of the aerosol and the meteorological conditions. To quantify the effects of these factors, we plotted their variations and calculated the correlation r between N_CCN and the following variables: pollutants (C_SO2, C_O3, C_BC, and M_Dust), meteorological variables (T_O, WS10_O, WS, and RH), and atmospheric stability variables (BLH, BLD, CAPE, and CIN), as shown in Figure 4 and Table 2. Here, the pollutant and atmospheric stability data were sourced from MERRA-2 and ERA5, respectively, and the meteorological data were obtained from on-site observations and reanalysis data. The atmospheric stability variables represent the environmental conditions under which pollutants emitted from the Earth's surface diffuse away from their sources. BLH and BLD characterize the atmospheric state under weak turbulence, whereas CAPE and CIN correspond to stronger atmospheric disturbances (i.e., convection). Note that CIN values are available only in summer. In general, pollutants diffuse and their concentration near the source region is diluted under high BLH, BLD, and CAPE and low CIN.
The hourly-averaged time series of the different variables and N_CCN in winter and summer are shown in Figure 4. In summer, the diurnal variations of the pollutants are positively correlated with that of N_CCN. Among them, SO2 has the strongest correlation, followed by BC and dust, and finally O3 (Figure 4a). Among the meteorological variables, temperature and wind speed are negatively correlated with N_CCN, while RH is positively correlated. Wind speed (surface and 10 m wind speed) has the strongest correlation, followed by surface temperature and RH (Figure 4b). All atmospheric stability variables are negatively correlated with N_CCN in the diurnal variation; the strongest correlation is with BLD, followed by CAPE, and finally BLH and CIN (Figure 4c). In winter, the signs of the correlations between the different variables and N_CCN are generally consistent with those in summer, but the strength of the correlation (i.e., r) differs. For the pollutants, the diurnal variation of SO2 still has the strongest impact on N_CCN, followed by O3 and dust, and BC has the least impact (Figure 4d). Wind speed is still the most important factor affecting N_CCN among the meteorological variables (Figure 4e). However, the correlation between surface wind speed and N_CCN is negative in summer but positive in winter, which is discussed below. There are strong negative correlations between N_CCN and the atmospheric stability variables, especially BLH and BLD (Figure 4f).
As mentioned above, WS10_O (wind speed at 10 m height) and N_CCN always show a negative correlation, but the sign of the correlation between WS (surface wind speed) and N_CCN differs between seasons. We explain this as follows. Wind speed is an important factor affecting the sources and sinks of aerosols and pollutants. WS10_O is measured higher above the ground and is mostly determined by the synoptic meteorology or the large-scale air circulation. A larger WS10_O usually indicates stronger horizontal air motion, which favors the diffusion of pollutants and results in a lower N_CCN. In contrast, WS is strongly affected by surface properties such as vegetation type, soil grains, and surface roughness, and is most likely determined by the local topography and small- to middle-scale weather situations. Therefore, the relationship between WS and N_CCN is prone to change with the seasons.
To analyze the main factors influencing N_CCN in winter and summer, we sorted the variables according to the absolute values of r, as shown in Figure 5. In this chord diagram, the absolute value of r denotes the correlation between each variable and N_CCN, which are connected by a belt. A wider belt indicates a stronger correlation between the individual variable and N_CCN. Of all the variables listed in Table 2, only those whose correlations passed the significance test at the 90% confidence level (i.e., p-value less than 0.1) are shown in Figure 5. In summer, the variations of the pollutants C_SO2 and C_BC have the greatest impact on the variation of N_CCN, followed by the meteorological condition of wind speed (WS and WS10_O), and finally the atmospheric stability variables BLD and CAPE. In winter, the factor with the greatest impact on N_CCN variation is atmospheric stability (BLH and BLD), followed by the pollutant C_SO2, and the relatively minor factor is the meteorological condition of wind speed (WS10_O and WS).
Based on the above results, we conclude that high N_CCN in Shijiazhuang is most likely related to high pollutant concentration, high atmospheric stability, and low 10 m wind speed. In summer, the concentration of pollutants is relatively low, so variations of pollutants have a relatively strong impact on N_CCN variation. In winter, however, the concentration of pollutants is relatively high, so it has less impact on the change of N_CCN, and the variation of atmospheric stability plays a more important role.
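The selection and ordering behind such a ranking can be sketched as a filter on the p-value followed by a sort on |r|. The function name, the variable labels, and especially the (r, p) pairs below are placeholders chosen for illustration; they are not the values reported in Table 2 or Figure 5.

```python
def rank_factors(correlations, alpha=0.1):
    """Rank variables by |r|, keeping only those with p-value below alpha.

    `correlations` maps a variable name to a (r, p_value) tuple.
    Returns a list of (name, r) pairs, strongest correlation first.
    """
    kept = [
        (name, r)
        for name, (r, p_value) in correlations.items()
        if p_value < alpha
    ]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)


# Placeholder inputs: NOT results from the paper.
example = {
    "C_SO2": (0.80, 0.001),
    "WS10_O": (-0.60, 0.010),
    "BLD": (-0.50, 0.030),
    "RH": (0.20, 0.350),  # fails the p < 0.1 threshold and is dropped
}
ranking = rank_factors(example)
```

Sorting on the absolute value keeps strongly negative correlations, such as those with wind speed or boundary-layer variables, on the same footing as positive ones.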

N CCN -SS Relationship
An overview of hourly-averaged CCN number concentration is shown in Figure 6. Overall, N CCN keeps a high level, up to 14,026 cm −3 under SS = 1% in winter, and N CCN is a little higher in winter than summer, especially at high SS. N CCN has an increasing tendency with SS in both seasons in Shijiazhuang. Many empirical formulas have been proposed to obtain N from the supers tion (SS). A power-law equation proposed by Twomey [39] is most commonly namely N = cSS . The parameters of c and k contain information concerning th ticle number concentration and the chemical characteristics of aerosols. We fitte Twomey formula and obtained the fitting parameters in different periods of this stu shown in Figure 7. When fitting, we first took the logarithm of both sides of the Tw formula, that is, ln(N ) = kln (SS) + ln (c). Next, we used the QR method [65] to lin fit the above equation (function lm in R language) using the observed data points tain k and ln (c). The R-squared values were obtained when conducting linear fittin high R denoted a better performance of the fitted curve in representing the obs N -SS relationship (Figure 7). The values of k in winter (1.08 and 0.69) are large those in summer (0.52, 0.49 and 0.58). Therefore, the fitting curves for winter are st than the ones for summer (Figure 7). The fitting parameter c can be regarded as a to represent the particle number concentration. Obviously, the values are higher in w (22,510 and 15,766) than those in summer (12,676, 14,116 and 13,428). The paramete Shijiazhuang maintaining a high level (c ≥ 10,000) in both winter and summer show characteristic of a typical continental CCN (c ≥ 2200, k < 1) [66]. However, we n that the fitting parameters in winter 2006 are significantly higher than the results in years in North China [18,30]. This may be attributed to the highly frequent and s pollution events in winter 2006 in Shijiazhuang. Qu et al. 
[46] analyzed the air pol condition in Shijiazhuang based on the Air Quality Index (AQI). The results showe there were 5 days with high-polluted events (AQI > 300) and 21 days with middl luted events (200 < AQI ≤ 300) from 2005 to 2009, and all of the high-polluted e and almost half of middle-polluted events occurred in winter and spring of 2006. T fore, we think the observations in winter 2007 are more representative of the gene pollution condition in the winter of Shijiazhuang, and the fitting parameters in w 2006 are more representative of a state of extremely heavy pollution. At the same the above results also remind us that it is necessary to consider using larger c and ues while using N = cSS in heavily polluted areas. Many empirical formulas have been proposed to obtain N CCN from the supersaturation (SS). A power-law equation proposed by Twomey [39] is most commonly used, namely N CCN = cSS k . The parameters of c and k contain information concerning the particle number concentration and the chemical characteristics of aerosols. We fitted the Twomey formula and obtained the fitting parameters in different periods of this study, as shown in Figure 7. When fitting, we first took the logarithm of both sides of the Twomey formula, that is, ln(N CCN ) = kln(SS) + ln(c). Next, we used the QR method [65] to linearly fit the above equation (function lm in R language) using the observed data points to obtain k and ln(c). The R-squared values were obtained when conducting linear fitting and high R 2 denoted a better performance of the fitted curve in representing the observed N CCN -SS relationship ( Figure 7). The values of k in winter (1.08 and 0.69) are larger than those in summer (0.52, 0.49 and 0.58). Therefore, the fitting curves for winter are steeper than the ones for summer ( Figure 7). The fitting parameter c can be regarded as a proxy to represent the particle number concentration. 
The values are clearly higher in winter (22,510 and 15,766) than in summer (12,676, 14,116, and 13,428). The parameter c in Shijiazhuang remains at a high level (c ≥ 10,000) in both winter and summer, which is characteristic of typical continental CCN (c ≥ 2200, k < 1) [66]. However, the fitting parameters in winter 2006 are significantly higher than the results reported for other years in North China [18,30]. This may be attributed to the frequent and severe pollution events in Shijiazhuang in winter 2006. Qu et al. [46] analyzed air pollution conditions in Shijiazhuang based on the Air Quality Index (AQI). Their results showed 5 days with high-pollution events (AQI > 300) and 21 days with middle-pollution events (200 < AQI ≤ 300) from 2005 to 2009; all of the high-pollution events and almost half of the middle-pollution events occurred in the winter and spring of 2006. We therefore consider the observations in winter 2007 more representative of the general winter air pollution conditions in Shijiazhuang, whereas the fitting parameters in winter 2006 are more representative of extremely heavy pollution. These results also indicate that larger c and k values should be considered when applying $N_{\mathrm{CCN}} = c\,SS^{k}$ in heavily polluted areas.
The parameters c and k derived in this study are compared with those from other regions (continental or transitional between marine and continental) in Table 3. The value of c in Shijiazhuang is about one order of magnitude higher than in the other regions, which may be related to the high pollution level in Shijiazhuang. Although the c parameters in Shijiazhuang differ markedly from those in other studies in China, the values of k are similar, as shown in Table 3. Qi and Yao [30] studied CCN characteristics in Qingdao, China, in winter and found k values of 0.51~0.72 under different weather conditions. Wang et al. [18] reported k values of 0.49~0.69 under different wind conditions in autumn in Tianjin, China.
Based on the above results, we summarize the recommended values of the parameters c and k for constructing the fitting formula ($N_{\mathrm{CCN}} = c\,SS^{k}$) when predicting $N_{\mathrm{CCN}}$ in East and North China in large-scale studies. If observation data are available, c can be obtained from the average value of $N_{\mathrm{CCN}}$ at SS = 1.0%, because with SS expressed in percent $SS^{k} = 1$ there and the formula reduces to $N_{\mathrm{CCN}} = c$. Otherwise, c within the range of 2000 to 15,000 is regarded as the general case, and c > 15,000 should be regarded as the extremely heavy pollution case. The recommended value of k is 0.5~0.7 (0.5 for summer and 0.7 for winter) in East and North China. Under extremely polluted conditions, k needs to be increased further and can be selected between 0.7 and 1.0.
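The parameterization recommended above can be illustrated with a minimal Python sketch. It evaluates $N_{\mathrm{CCN}} = c\,SS^{k}$ and recovers c and k by a least-squares fit of $\ln N_{\mathrm{CCN}} = \ln c + k \ln SS$. The log-log fit is one common choice and is an assumption here, not necessarily the fitting procedure used in this study. The function names, the SS set points, and the value c = 12,000 are hypothetical and serve only as illustration; no measured data are used.

```python
import numpy as np

# Seasonal k values recommended in the text for East and North China.
RECOMMENDED_K = {"summer": 0.5, "winter": 0.7}


def twomey_nccn(ss_percent, c, k):
    """Evaluate N_CCN = c * SS**k, with SS given in percent."""
    return c * np.asarray(ss_percent, dtype=float) ** k


def fit_twomey(ss_percent, n_ccn):
    """Estimate (c, k) by least squares on ln(N_CCN) = ln(c) + k * ln(SS)."""
    log_ss = np.log(np.asarray(ss_percent, dtype=float))
    log_n = np.log(np.asarray(n_ccn, dtype=float))
    k, log_c = np.polyfit(log_ss, log_n, 1)  # slope is k, intercept is ln(c)
    return float(np.exp(log_c)), float(k)


def pollution_case(c):
    """Label c using the thresholds quoted in the text."""
    if c > 15000:
        return "extremely heavy pollution"
    if c >= 2000:
        return "general"
    return "below the general range"  # the text gives no guidance for this case


# Hypothetical SS set points (percent) and illustrative parameters, not measured data.
ss_levels = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0])
c_demo, k_demo = 12000.0, RECOMMENDED_K["summer"]
n_demo = twomey_nccn(ss_levels, c_demo, k_demo)

# Fitting the synthetic values should recover the parameters used to create them.
c_fit, k_fit = fit_twomey(ss_levels, n_demo)
print(f"c = {c_fit:.0f}, k = {k_fit:.2f}, case = {pollution_case(c_fit)}")
```

Because $SS^{k} = 1$ at SS = 1.0%, `twomey_nccn(1.0, c, k)` returns c for any k, which mirrors the suggestion to take c from the mean $N_{\mathrm{CCN}}$ observed at SS = 1.0%.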

Conclusions and Discussion
Studying CCN in the atmosphere is of great significance for understanding aerosol-cloud interactions. This study used multi-year data observed in Shijiazhuang from 2005 to 2007 and reanalysis data (MERRA-2 and ERA5) to investigate the characteristics of CCN, rank the factors affecting the diurnal variation of $N_{\mathrm{CCN}}$ by importance, and build the relationship between $N_{\mathrm{CCN}}$ and SS in this heavily polluted region. Our analysis, which combines long-term CCN observations with climate-scale reanalysis data, can provide a reference for climatological CCN research and model simulation.
The diurnal variations of $N_{\mathrm{CCN}}$ in different periods show a bimodal distribution; the morning peak is more prominent in summer, whereas the evening peak is evident in winter. By analyzing the observation data measured in winter 2007, we found that the dominant factor affecting this pattern is SO$_2$ in the morning and NO$_2$ in the evening, while PM$_{10}$ maintains a high positive correlation with $N_{\mathrm{CCN}}$ throughout the day. Additionally, we ranked the importance of the factors affecting the diurnal variation of $N_{\mathrm{CCN}}$ using observation and reanalysis data, according to the absolute values of the Pearson correlation coefficient. The variation of pollutants ($C_{\mathrm{SO_2}}$ and $C_{\mathrm{BC}}$) has the greatest impact on $N_{\mathrm{CCN}}$ in summer, whereas the variation of atmospheric stability (BLH and BLD) has the greatest impact in winter. To study the relationship between $N_{\mathrm{CCN}}$ and SS, we fitted the Twomey formula ($N_{\mathrm{CCN}} = c\,SS^{k}$) and found that CCN in Shijiazhuang have typical continental characteristics. Based on this study and previous research, we recommend a reasonable range (0.5~0.7) for the parameter k in East and North China. Specifically, k = 0.5 is more reasonable in summer and k = 0.7 in winter. When extreme pollution (c > 15,000) occurs, k can be increased further, and a value between 0.7 and 1.0 is recommended.
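The ranking by absolute Pearson correlation coefficient summarized above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `rank_by_abs_pearson`, the factor names, and all series are hypothetical and synthetic, chosen only to show the mechanics, and they do not reproduce the observations or reanalysis data used in this study.

```python
import numpy as np


def rank_by_abs_pearson(target, factors):
    """Return (name, r) pairs sorted by |r| in descending order."""
    target = np.asarray(target, dtype=float)
    scores = []
    for name, series in factors.items():
        # Off-diagonal element of the 2x2 correlation matrix is Pearson's r.
        r = np.corrcoef(target, np.asarray(series, dtype=float))[0, 1]
        scores.append((name, float(r)))
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Synthetic hourly series over one day (illustrative only, not measured data).
hours = np.arange(24)
phase = 2.0 * np.pi * hours / 24.0
n_ccn_demo = 10000.0 + 3000.0 * np.sin(phase)

factors_demo = {
    "weak_proxy": np.cos(phase),                               # nearly uncorrelated
    "stability_proxy": -n_ccn_demo + 1000.0 * np.cos(phase),   # strong negative link
    "pollutant_proxy": 2.0 * n_ccn_demo + 5.0,                 # perfect positive link
}

ranking = rank_by_abs_pearson(n_ccn_demo, factors_demo)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Sorting on the absolute value keeps strongly negative relationships, such as a stability-type factor that suppresses concentrations, near the top of the ranking, while the sign of r is retained for interpretation.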
Several points are worth mentioning for future studies. Although reanalysis data are convenient to acquire and apply, they still have great limitations in characterizing small-scale regional pollution and meteorological conditions. Therefore, local-scale observations of pollutants and meteorological variables should be carried out together with $N_{\mathrm{CCN}}$ observations in the future, so as to further improve the understanding of the complex relationships among CCN, pollutants, and meteorological conditions. In addition, the $N_{\mathrm{CCN}}$ data used in this study are representative of 2005-2007 only. In recent years, China has achieved remarkable success in pollution control [68], and the impacts of this control on CCN and aerosol-cloud interaction merit future study.