4.5.1. Aerosol Mass Concentration
In this subsection we assess the skill of WRF-CHIMERE in reproducing the aerosol mass concentration observed at the surface and aboard the ATR-42. Figure 3 compares the observed and predicted time series of daily TM of SO4, NO3, NH4 and OM at Cabauw Tower, with the shaded area and red bars denoting the 25th and 75th percentiles. Daily TMs and interquartile ranges are calculated from hourly modelled and observed values. Statistical indices obtained from the comparison are listed in Table 3.
In general, the model reproduces the day-to-day variations of the observations, especially for inorganic aerosols. WRF-CHIMERE captures the marked decrease in concentrations during the scavenging period, the subsequent recovery starting from 20 May, and the sharp increase in aerosol pollution at the end of the month. Model performance is poor on 15–16 May, when all three modelled inorganic aerosol species exhibit peaks 4–5 times larger than the observations. This unrealistic modelled peak is not representative of the model's ability to simulate chemistry and transport; on all other days the model reproduces the observations. Because the overestimation affects all three inorganic species and is confined to a single short episode, the problem most probably does not come from emissions or chemistry but from a poorly reproduced local wind speed or direction that advects a polluted air mass directly over the tower.
Nevertheless, the correlation between modelled and measured TM is larger than 0.5 for all aerosol species analyzed: 0.55, 0.54, 0.52 and 0.64 for SO4, NO3, NH4 and OM, respectively. Despite the overestimation on 15–16 May, the NMB of the WRF-CHIMERE TM is relatively small for aerosol sulfate (+14%) and ammonium (+11%), whereas particulate nitrate is overestimated by 48%. The TM of OM is underestimated over the whole simulated period, with an MB of −1.7 µg/m3 (−37%). The analysis of the IQR reveals that the model spread around the central tendency of the distribution is about 1.6 times larger than observed for inorganic aerosols and, by contrast, 1.7 times smaller for OM.
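As an illustration only, the statistical indices quoted above can be computed as in the following sketch. It assumes the usual definitions of mean bias (MB), normalized mean bias (NMB) and Pearson correlation, which are not spelled out in this section; the function names and the synthetic values are hypothetical and are not taken from the study.

```python
import numpy as np

def evaluation_indices(model, obs):
    """Return MB, NMB (%) and Pearson r for paired model/observation values."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Keep only pairs where both values are available
    ok = np.isfinite(model) & np.isfinite(obs)
    model, obs = model[ok], obs[ok]
    mb = np.mean(model - obs)                          # mean bias, input units
    nmb = 100.0 * np.sum(model - obs) / np.sum(obs)    # normalized mean bias, %
    r = np.corrcoef(model, obs)[0, 1]                  # Pearson correlation
    return mb, nmb, r

def iqr(values):
    """Interquartile range (75th minus 25th percentile), ignoring NaNs."""
    q25, q75 = np.nanpercentile(values, [25, 75])
    return q75 - q25

# Synthetic daily values (µg/m3), for illustration only
obs = np.array([2.0, 4.0, 6.0, 8.0])
mod = np.array([1.0, 3.0, 4.0, 6.0])
mb, nmb, r = evaluation_indices(mod, obs)
spread_ratio = iqr(mod) / iqr(obs)   # model spread relative to observed spread
```

With these synthetic values the model is biased low (negative MB and NMB) while remaining well correlated with the observations, the same kind of combination reported above for OM.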
The model tendency to overestimate surface aerosol nitrate is confirmed by the comparison of WRF-CHIMERE simulations with daily inorganic aerosol mass concentrations observed at EMEP sites. As shown in Table 4, daily NO3 has a large positive bias of 0.6 µg/m3 (+247%). As at the Cesar tower, modelled SO4 shows a relatively small bias of +14%. Daily aerosol ammonium is overpredicted by 0.2 µg/m3 (+44%). SO4, NO3 and NH4 are reproduced with correlations of 0.48, 0.70, and 0.56, respectively. These results are statistically comparable with those obtained in other large-scale modelling studies over Europe. For example, in the multi-year simulations of Lecoeur and Seigneur [82] the correlation coefficient is 0.57, 0.42, and 0.58 for SO4, NO3 and NH4, respectively. Balzarini et al. [83] reported correlations of 0.48, 0.60, and 0.56 for the same variables. For the same case study, Tuccella et al. [76] found that daily SO4, NO3 and NH4 are simulated with correlations of 0.66, 0.74 and 0.82, respectively. All these results were obtained with modeling systems very different from the one used in this study, so it is difficult to attribute the statistical differences to the model set-up, the meteorology or the emissions. The main message is that the model presented in this study produces realistic results, with the same order of accuracy as other existing online modeling systems.
Table 4 also reports the statistical indices for daily PM2.5 and PM10 at EMEP stations within D3. WRF-CHIMERE reproduces the day-to-day variability of PM2.5 and PM10 with correlations of 0.54 and 0.51 and mean positive biases of 2.4 and 5.6 µg/m3, respectively. Time series (not shown) indicate that PM2.5 is systematically overestimated during the whole period, while for PM10 the bias is essentially due to the long-range transport of dust. It should be noted that, as shown by Mailler et al. [36] in a continental-scale evaluation of CHIMERE, the current version of the model tends to overestimate PM2.5 in both winter and summer. Mailler et al. [36] attributed the bias to ammonium nitrate overprediction, which is consistent with the results discussed above.
The vertical profiles have been analysed separately for the planetary boundary layer (PBL) and the free troposphere (FT). According to Crumeyrolle et al. [80], the PBL height was lower than 1600 m during the EUCAARI campaign. Following Tuccella et al. [76], we have considered modelled data below 1600 m as PBL concentrations and those above this altitude as FT values. Figure 4 compares the predicted and observed vertical profiles of SO4, NO3, NH4 and OM concentration during periods 1, 2a, 2b, and 3 (P1, P2a, P2b, and P3). Modelled data have been extracted by interpolating the output point by point along the flight trajectories. In Figure 4, the dots represent the TMs and the bars denote the 25th and 75th percentiles of the observed and modelled distributions. For the comparison, the observed and modelled profiles have been projected onto 16 layers of 250 m depth.
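The projection onto fixed-depth layers can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: the function name is hypothetical, the layers are assumed to start at 0 m, and the median is used as the central statistic purely as an assumption, since the definition of TM is not given in this section.

```python
import numpy as np

def layer_statistics(altitude_m, conc, depth_m=250.0, n_layers=16):
    """25th percentile, median and 75th percentile of conc in fixed-depth layers."""
    altitude_m = np.asarray(altitude_m, dtype=float)
    conc = np.asarray(conc, dtype=float)
    edges = np.arange(n_layers + 1) * depth_m        # 0, 250, ..., 4000 m
    # Layer index of each sample; samples outside the range get -1 or n_layers
    idx = np.digitize(altitude_m, edges) - 1
    stats = np.full((n_layers, 3), np.nan)           # columns: p25, median, p75
    for k in range(n_layers):
        sel = conc[(idx == k) & np.isfinite(conc)]
        if sel.size:
            stats[k] = np.percentile(sel, [25, 50, 75])
    return edges, stats

# Synthetic samples along a flight track (altitude in m, concentration in µg/m3)
alt = np.array([100.0, 200.0, 300.0, 1700.0])
val = np.array([1.0, 3.0, 5.0, 7.0])
edges, stats = layer_statistics(alt, val)
in_pbl = alt < 1600.0    # PBL/FT split at 1600 m, as in the text
```

Applying the same function to the observed samples and to the model values interpolated along the trajectory gives two profiles on identical layers, which can then be compared layer by layer.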
During P1, WRF-CHIMERE captures the profile of observed mass concentration in the marine PBL, except for SO4, which is underestimated by more than a factor of 2 close to the surface. Moreover, during P1 the model exhibits a considerable underestimation above the PBL. The measurements above the PBL during P1 reported in this study are not representative of clean marine air, because they were mainly carried out close to the Dutch coast. The bias above the PBL is most probably due to errors in the simulated wind, since the simulated plume is located further south than observed (not shown).
During P2a and P2b, the model tends to underestimate the observed mass concentrations of SO4, NO3, NH4 and OM, especially close to the surface. Moreover, WRF-CHIMERE does not reproduce the observed variability, especially during P2a. The best agreement between observations and model results is found during P3, when modelled values are within the observed range except for the layers close to the surface. WRF-CHIMERE captures several features of the observed vertical shape: both observed and modelled aerosol concentrations increase in the PBL, reach a maximum at an altitude of about 500 m, and show a homogeneous profile above the PBL.
Figure 5 compares the observed and predicted vertical profiles of PM2.5. During P1, modelled PM2.5 is within the observed range in the PBL but is underestimated by a factor of 2–3 in the FT. The comparison with the flights carried out during the scavenging period (P2a and P2b) shows that modelled PM2.5 in the PBL is at the lower end of the observed distribution, whereas the agreement is satisfactory in the FT. Finally, during P3, WRF-CHIMERE underestimates the PM2.5 mass concentration by less than a factor of 2 close to the surface, while PM2.5 is overestimated over the rest of the vertical profile. The positive bias in the FT is most likely attributable to the fine fraction of desert dust advected during P3. The overestimation of the dust fine fraction is consistent with the results reported by Menut et al. [84].
Table 5 summarizes the statistical indices for the comparison between predicted and observed aerosol mass concentration over the whole campaign in the PBL and FT. WRF-CHIMERE underestimates the TM of SO4 on average by about 1.1 µg/m3 (−50%) in both the PBL and FT. By contrast, NO3 TM is overestimated by 0.5 µg/m3 (+26%) and 0.9 µg/m3 (+450%) in the PBL and FT, respectively. However, Figure 4 shows that this bias is essentially due to the P2a period and therefore does not reflect a systematic error of the model in simulating aerosol nitrate. Particulate ammonium is simulated within a factor of 2 of the observations, with a bias of −39% and −30% in the PBL and FT, respectively. OM is systematically underestimated by a factor of 2–3 in both the PBL and FT. Average PM2.5 is underpredicted by 6 µg/m3 (−32%) in the PBL, but shows a negligible error (+7%) in the FT. The analysis of the IQR reveals a general tendency of the model to underestimate the variability of aerosol mass concentration found in the observations, except for NO3 and NH4 in the PBL and FT, respectively. The underestimation of the aerosol mass concentration variability is a common feature of other modeling systems (e.g., Tuccella et al. [76]). Part of the model underestimation during P2a and P2b could be related to an excess of wet scavenging in the model. Another critical factor for a correct simulation of bulk aerosol mass is a proper simulation of the aerosol size distribution, which is discussed in Section 4.5.2.
4.5.2. Aerosol Size Distribution and CCN
The first step towards reliable simulations of ACI is a satisfactory prediction of the aerosol number size distribution, together with a proper representation of aerosol mass and composition. These three elements control the number of aerosol particles activated as cloud droplets.
Figure 6 shows the TMs of the aerosol size distribution simulated and observed aboard the ATR-42 in all three periods within the PBL and FT. WRF-CHIMERE exhibits a larger number of aerosol particles in the Aitken mode than observed in both the PBL and FT during P1, P2, and P3. CHIMERE particles in the bins of the Aitken mode (aerosol particles with diameters between 0.01 and 0.1 µm) are overestimated from a few hundred up to 2–3 orders of magnitude.
There is also a tendency to underestimate the larger particles of the accumulation mode. These results suggest that the overestimation of ultrafine particles is likely related to an excess of particle formation by nucleation. Moreover, the lack of sufficient particles in the accumulation mode reduces the coagulation of ultrafine aerosols onto larger existing particles, so that the loss of smaller particles by coagulation is not efficient enough. Another probable source of error is the size distribution at emission: the total mass of primary aerosol emissions is distributed into the model bins according to a profile that is fixed for each type of source and emitted species. CHIMERE does not include a prognostic treatment of particle number, which is diagnosed from the mass, density and particle diameter. It should be noted that ACI are unlikely to be affected by ultrafine particles, since the particles most likely to act as CCN are those larger than 100 nm. Therefore, it is interesting to explore how the model reproduces these particles, referred to as condensation nuclei (CN) in this study.
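To make the diagnostic relation between mass and number concrete, the following sketch converts the mass in one size bin into a particle number, assuming spherical particles of a single representative diameter. The function name, the density and the diameters are hypothetical; the representative diameter actually used by CHIMERE for each bin is not stated in this section.

```python
import math

def number_from_mass(mass_ug_m3, density_kg_m3, diameter_um):
    """Number concentration (particles per cm3) of spheres of one diameter."""
    mass_kg_m3 = mass_ug_m3 * 1e-9                    # µg -> kg
    d_m = diameter_um * 1e-6                          # µm -> m
    particle_mass = density_kg_m3 * math.pi / 6.0 * d_m ** 3   # kg per particle
    n_m3 = mass_kg_m3 / particle_mass                 # particles per m3
    return n_m3 * 1e-6                                # particles per cm3

# Same mass (1 µg/m3, assumed density 1500 kg/m3) placed in two different bins
n_accumulation = number_from_mass(1.0, 1500.0, 0.10)
n_aitken = number_from_mass(1.0, 1500.0, 0.05)
```

Because number scales with the inverse cube of the diameter, halving the diameter for the same mass gives eight times as many particles. A modest error in how emitted or nucleated mass is distributed among the smallest bins can therefore produce a large error in particle number.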
Table 6 reports the observed and predicted TMs of CN within the PBL and FT. In the PBL, CN are underpredicted by a factor ranging from 2.8 during P1 to a negligible bias during P2a and P2b. By contrast, WRF-CHIMERE predicts a larger number of CN (by about a factor of 2) in the PBL during the long-range transport days, when a significant contribution to CN is given by the fine fraction of the dust advected from the desert. In the FT, the model exhibits a systematic positive bias ranging from a factor of 1.3 up to 3.3. Looking at Figure 6, it should be noted that the bias in simulated CN is not uniformly distributed with aerosol diameter: in cases of both low and high bias, the model generally overestimates CN with diameters between 0.1 and 0.2 µm and underestimates particles larger than 0.2 µm. This bias could have an important impact on the CCN calculation, leading to larger (lower) concentrations at high (low) levels of supersaturation.
Vertical profiles of CCN at 0.2% supersaturation are shown in Figure 7 and their TMs for each period are reported in Table 6. In the PBL, the CCN concentration is underestimated during the wet scavenging days and overestimated during P1 and P3, but the bias is always lower than a factor of 1.5. In the FT, although CN are biased high in each period, modelled CCN are lower than observed by a factor of 3.4 during P2b and 1.6 during P2a and P3. CCN are overestimated in the FT only during P1. The results found here are comparable with the performance of other state-of-the-art models. For example, Tuccella et al. [76], comparing WRF/Chem with the same data set, reported an overestimation of CCN by a factor of 1.5 in the mixed layer and a small bias in the FT.
In addition to the size distribution, the modelled CCN bias is associated with the critical diameter (CD) and aerosol composition. The CD is defined as the cut-off diameter above which all particles act as CCN at a given level of supersaturation. The observed CD has been calculated from the aerosol number size distribution and CCN measurements: it is the diameter above which the integrated size distribution equals the CCN concentration. TMs of observed and predicted CD are reported in Table 6 and are overlaid on the size distributions shown in Figure 6. To explain the link between aerosol size distribution, CD and composition in reproducing CCN, we focus as an example on the results obtained in the FT during P3. We choose this case because, as shown in Figure 6, the predicted size distribution is very similar to the observations for diameters larger than about 0.15 µm; therefore the CCN bias has a negligible dependence on the size distribution. In this case, the modelled CD is larger than the observed one by a factor of 1.5, which explains the negative bias of CCN despite the reasonable simulation of the size distribution (for diameters larger than 0.15 µm). According to Köhler theory, the CD depends on the aerosol hygroscopicity, which in turn is a function of composition. For a given supersaturation level, the larger the hygroscopicity, the smaller the CD. The larger modelled CD therefore points to an underestimation of hygroscopicity in the FT during P3, most likely due to an excess of dust in the fine fraction of modelled PM2.5 (see Section 4.5.1) combined with a lack of sufficient inorganic (sulphate, ammonium, nitrate) mass (about 50% less than in the AMS data), which lowers the aerosol solubility. A measure of hygroscopicity is given by the CCN efficiency (CCNE), defined as the ratio between CCN and CN. The observed and modelled values are shown in Table 6. As expected, the CCNE simulated by WRF-CHIMERE for the case examined is about 5 times smaller than observed. In the PBL, CCNE is simulated within a factor of 2, but the error is larger in the FT.
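The definition of the CD given above can be illustrated with a short sketch that integrates a binned size distribution from the largest diameter downwards until the cumulative number equals the CCN concentration. The function name and the synthetic bins are hypothetical, and the linear interpolation in log-diameter within a bin is an assumption made here for simplicity, not necessarily the procedure used in the study.

```python
import numpy as np

def critical_diameter(bin_edges_um, number_per_bin, ccn):
    """Diameter above which the cumulative particle number equals ccn."""
    edges = np.asarray(bin_edges_um, dtype=float)    # ascending, length n + 1
    n = np.asarray(number_per_bin, dtype=float)      # length n, per cm3
    # Number of particles larger than each edge (zero above the last edge)
    above = np.concatenate([np.cumsum(n[::-1])[::-1], [0.0]])
    if ccn > above[0]:
        return float("nan")                          # CCN exceeds total number
    # 'above' decreases with diameter; reverse both arrays so that np.interp
    # receives increasing x values, and interpolate in log-diameter
    log_d = np.interp(ccn, above[::-1], np.log(edges)[::-1])
    return float(np.exp(log_d))

# Synthetic size distribution: three bins between 0.05 and 0.4 µm
edges = [0.05, 0.1, 0.2, 0.4]
counts = [1000.0, 300.0, 100.0]                      # particles per cm3 per bin
cd = critical_diameter(edges, counts, ccn=250.0)
cn = sum(counts[1:])                                 # particles larger than 0.1 µm
ccne = 250.0 / cn                                    # CCN efficiency
```

For the same size distribution, a lower CCN concentration yields a larger CD and a lower CCNE, which is the relationship invoked above for the FT during P3.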
4.5.3. Cloud Optical Properties
Figure 8a,b show the TMs of cloud droplet number concentration (CDNC) averaged over the whole period (14–30 May 2008). Satellite CDNC has been calculated from the liquid cloud droplet effective radius (Re) and liquid cloud optical depth (LCOD) following the method used by Georg and Wood [85]. WRF-CHIMERE qualitatively captures several features of the spatial distribution of observed CDNC. The gradient between the North Sea and land is reproduced by the model, as is the distribution of the maximum values of CDNC, with the exception of some regions where the model shows an important bias, for example off the south-east coast of England, in the middle of the border between the Netherlands and Germany, and on the border between France and Belgium. As shown in Table 7, the domain average of calculated CDNC is underestimated by 40%.
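Satellite estimates of CDNC from Re and LCOD are commonly based on an adiabatic-cloud relation of the form CDNC = k · LCOD^(1/2) · Re^(−5/2). The sketch below implements this generic form only as an illustration: whether the method cited above uses exactly this expression is an assumption, and the default constant k is an assumed value of the order used in the literature, not the one adopted in the study. The function name is hypothetical.

```python
import numpy as np

def cdnc_from_retrievals(lcod, re_um, k=1.37e-5):
    """Cloud droplet number (per cm3) from optical depth and effective radius.

    k (m^-0.5) lumps the adiabatic-cloud assumptions; the default is an
    assumed value, not necessarily the one used in the study.
    """
    lcod = np.asarray(lcod, dtype=float)
    re_m = np.asarray(re_um, dtype=float) * 1e-6     # µm -> m
    n_m3 = k * np.sqrt(lcod) * re_m ** -2.5          # droplets per m3
    return n_m3 * 1e-6                               # droplets per cm3

# Same optical depth, two effective radii (µm)
n_small_re = cdnc_from_retrievals(10.0, 8.0)
n_large_re = cdnc_from_retrievals(10.0, 10.0)
```

The strong inverse dependence on Re means that, for a similar LCOD, a lower CDNC corresponds to larger droplets, which is the link between the CDNC and Re biases discussed in this section.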
As is evident from Figure 8a,b, both model and observations show a fine-scale variability that makes a point-by-point comparison between WRF-CHIMERE and MODIS difficult. Modelled and observed clouds exhibit small differences in location and timing of formation. The distribution function (DF) is therefore a useful tool: it provides an overall view of WRF-CHIMERE performance in reproducing CDNC and, at the same time, is independent of spatial and temporal phase errors [86]. Observed and modelled DFs of CDNC are shown in Figure 8c.
In general, WRF-CHIMERE overestimates the occurrence frequency (OF) of CDNC below 100 #/cm3 and underpredicts larger values. Specifically, the cumulated OF of CDNC representative of “clean atmospheres” (less than 100 #/cm3) is overestimated by a factor of 2. By contrast, the OF of modelled CDNC typical of “polluted conditions” (100–600 #/cm3) is about 1.5 times smaller than in the MODIS data, while values representative of “very polluted conditions” (larger than 600 #/cm3) constitute 0.5% and 1.5% of the observed and modelled samples, respectively.
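The cumulated occurrence frequencies per CDNC class can be obtained as in the following sketch, which pools all valid values regardless of location and time. The function name and the synthetic sample are hypothetical; only the class limits (100 and 600 #/cm3) are taken from the text.

```python
import numpy as np

def class_frequencies(cdnc, edges=(0.0, 100.0, 600.0, np.inf)):
    """Percentage of valid samples in each CDNC class defined by edges."""
    cdnc = np.asarray(cdnc, dtype=float).ravel()
    cdnc = cdnc[np.isfinite(cdnc)]                   # drop missing pixels
    counts = np.array([
        np.sum((cdnc >= lo) & (cdnc < hi))           # lower edge included
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    return 100.0 * counts / counts.sum()

# Synthetic CDNC sample (#/cm3) with one missing value
sample = [50.0, 80.0, 150.0, 300.0, 700.0, np.nan]
freq = class_frequencies(sample)   # clean, polluted, very polluted (%)
```

Because the values are pooled before counting, a cloud that is simulated in a slightly different place or at a slightly different time still contributes to the correct class, which is why the DF is insensitive to phase errors.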
The bias in modelled CDNC originates from uncertainties in the predicted aerosol size distribution and hygroscopicity. As already mentioned in Section 4.5, CCN are underestimated at low supersaturations and biased high at large supersaturations, because modelled CN are larger than observed at the smallest diameters of the accumulation mode (0.1–0.2 µm) but are biased low at larger sizes. Therefore, given that the range of supersaturation in our simulations is relatively low, the main uncertainty in the prediction of CDNC is due to the general underestimation of CN at the larger diameters of the aerosol size distribution. Another factor that explains the CDNC bias is the uncertainty in hygroscopicity, which in turn affects the CD. As discussed in the previous section, the modelled CD is larger than the observed one because the CCNE (which is a measure of aerosol hygroscopicity) predicted by WRF-CHIMERE is low relative to the ATR-42 data. A larger CD implies a smaller number of CCN that may be activated as cloud droplets.
Figure 9a,b shows the TMs of the liquid cloud droplet effective radius (Re) calculated over the whole period. As for CDNC, WRF-CHIMERE qualitatively reproduces the features of the observed Re pattern. The gradient between the North Sea and land and the distribution of the maximum values of Re are reproduced by the model, with the exception of the regions where a bias in CDNC was noticed. In these regions, CDNC and Re are anticorrelated. As shown in Table 7, modelled and measured TM and IQR of Re have very similar values, meaning that WRF-CHIMERE captures the central tendency of measured Re.
As displayed in Figure 9c, the peak of the DF of modelled Re is shifted toward larger values with respect to satellite data. The DFs of WRF-CHIMERE and MODIS exhibit their maxima around 9–10 and 7–8 µm, respectively. The cumulated OF around the observed maximum (6–9 µm) represents about 50% of the MODIS sample, while 34% of WRF-CHIMERE data fall in the same range. The shift of the peak toward larger droplets found in WRF-CHIMERE is a consequence of the low CDNC in the model.
Figure 10a,b report the TMs of retrieved and predicted LCOD, respectively. As for the previous variables, WRF-CHIMERE reproduces the main features of the LCOD observed by MODIS: the lowest values in the north of the domain and the band of maximum values extending from northern France to northwest Germany across Belgium and the Netherlands. WRF-CHIMERE overestimates the observed LCOD along this band, but tends to slightly underpredict LCOD elsewhere. LCOD averaged over the domain is reported in Table 7. It is underestimated by a factor of 1.3 (−19%) but exhibits larger variability (IQR) than the observations.
As expected, the bias in the simulation of cloud optical properties depends on the particular aerosol mechanism adopted in the model. For example, Tuccella et al. [76], using WRF/Chem with a modal aerosol model over the same region and period as our study, found the opposite behavior with respect to our results: they reported an overestimation of CCN that produces smaller Re and larger LCOD relative to the observations. However, it should be noted that the magnitude of the bias found here is comparable to the WRF/Chem performance shown by Tuccella et al. [76], where, for example, LCOD is overestimated by a few percent up to 48%.
As described in Section 2, WRF-CHIMERE also includes the glaciation indirect effect through the processes of heterogeneous and homogeneous ice nucleation, starting from the aerosol predicted by the chemical modules. Unfortunately, no in situ measurements are available to evaluate the ice particle concentration. The simulated ice cloud optical depth (ICOD) was compared to that observed by MODIS, including the precipitating hydrometeors and suspended ice cloud.
Figure 10c,d shows the TMs of observed and simulated ICOD. WRF-CHIMERE qualitatively captures the maximum values displayed in the observed pattern in a band between 52 and 53° N and on the border between Belgium and Germany. WRF-CHIMERE tends to underpredict the minimum values of ICOD observed over the sea and northwestern Germany. The average simulated ICOD TM is about half that of the observations, but the IQRs are very similar, meaning a larger relative dispersion of modelled values with respect to MODIS. Figure 11b reports the comparison between observed and modelled ICOD DFs. The comparison suggests that WRF-CHIMERE has about 15% more ICOD values lower than 2. The DFs reach their maximum for values between 3–20 and 20–40 for the observed and predicted samples, respectively. WRF-CHIMERE underestimates the maximum of the observed OF by about 15–20%.
Although in this section we have focused on the relationship among aerosol size distribution, hygroscopicity, CDNC, and COD, the uncertainties in simulating cloud optical properties may also be related to the model setup used. For example, Barò et al. [87] investigated the sensitivity of ACI in WRF/Chem to two different cloud microphysics schemes, and found substantial differences in the CDNC and cloud water mixing ratio patterns predicted by the two parameterizations, which depend on geographical location, pollution level, and season. In addition to the cloud microphysics representation, Otkin and Greenwald [85] showed an important sensitivity of cloud optical properties to PBL schemes.