4.1. Hourly Concentrations of Primary Pollutants
Major gaseous pollutants were also monitored at the AAPMSs. Figure 6 and Figure 7 present the spatial averages of observed and simulated hourly concentrations of NO, NO2, and SO2 from different AAPMSs for d03 and d04, respectively. Table 3 summarizes the ensemble performances of the participant CTMs at each AAPMS for each season.
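The spatial averages and ensemble values referred to above can be illustrated with a minimal sketch. This section does not state how the averaging was implemented, so the sketch assumes an unweighted mean across stations (skipping missing hours) and an unweighted mean across models; the function names, station labels, and numbers are hypothetical.

```python
from statistics import fmean


def spatial_average(station_series):
    """Average each hour across stations, skipping missing (None) values.

    station_series maps a station id to a list of hourly concentrations.
    Returns one value per hour, or None where no station reported.
    """
    n_hours = len(next(iter(station_series.values())))
    averaged = []
    for hour in range(n_hours):
        valid = [series[hour] for series in station_series.values()
                 if series[hour] is not None]
        averaged.append(fmean(valid) if valid else None)
    return averaged


def ensemble_mean(model_series):
    """Unweighted mean across models at each time step (assumed definition)."""
    return [fmean(values) for values in zip(*model_series.values())]


# Hypothetical hourly NO values (ppbv) at two stations; None marks a gap.
observed = {"station_A": [1.0, 2.0, None], "station_B": [3.0, None, None]}
print(spatial_average(observed))

# Hypothetical spatially averaged output from two model settings.
simulated = {"M01": [1.0, 2.0], "M02": [3.0, 6.0]}
print(ensemble_mean(simulated))
```

Skipping missing hours instead of filling them keeps a single station outage from distorting the domain average, at the cost of a varying number of stations per hour.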
In general, most CTMs showed good agreement with the observed concentration levels of NO and NO2 for each season, with regular diurnal patterns of NO in the warmer seasons. However, none of the models fully reproduced the elevated concentrations, e.g., on 19–20 May and 3 November (NO and NO2) and on 30 January (NO), for which the observations and models differed by 50%–200%. In addition, the models tended to overestimate the observed daily maximums of NO (around 10–20 ppbv in spring and around 10–30 ppbv in summer) by a factor of 2.
At midnight on 30 January, none of the participant CTMs simulated the considerably increased level of NO that preceded the rapid NO decrease associated with airmass changes, although all participants reproduced the NO decrease well. This suggests that the CTMs successfully simulated the concentration change owing to synoptic-scale meteorological changes but failed to simulate the increase owing to local-scale meteorological changes such as strong atmospheric stability, especially during colder seasons. All models tended to overestimate the daytime NO reduction. In particular, two WRF-Chem configurations (M30 and M31) and M05 produced strikingly low constant values, 0.001 or 0.000 ppbv, during the daylight hours in summer and autumn. The normalized mean bias (NMB) for both domains indicated a strong underestimation of NO (approximately −40% to −50%), except during the spring. Underestimates of NO at remote stations in Japan have also been observed for regional CTMs, as reported in the MICS–Asia III results [9], and the correlations and index of agreement (IoA) values ranged from 0.18 to 0.43 and 0.41 to 0.51, respectively. The performance levels of each model exhibited substantial differences between both domains and seasons. The differences between seasons are likely related to how well the meteorology is simulated, but the reasons for the differences between domains are unclear at this stage.
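The statistics quoted above (NMB, correlation, and IoA) can be sketched as follows. The formulas are not given in this section, so the sketch assumes the commonly used forms: NMB as the summed model-minus-observation difference divided by the summed observations, the Pearson correlation coefficient, and Willmott's index of agreement. The function names and sample values are hypothetical.

```python
from math import sqrt


def nmb(model, obs):
    """Normalized mean bias in percent (assumed definition)."""
    return 100.0 * sum(m - o for m, o in zip(model, obs)) / sum(obs)


def pearson_r(model, obs):
    """Pearson correlation coefficient between model and observations."""
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / sqrt(var_m * var_o)


def ioa(model, obs):
    """Willmott's index of agreement (assumed to be the IoA form used)."""
    mean_o = sum(obs) / len(obs)
    numerator = sum((m - o) ** 2 for m, o in zip(model, obs))
    denominator = sum((abs(m - mean_o) + abs(o - mean_o)) ** 2
                      for m, o in zip(model, obs))
    return 1.0 - numerator / denominator


# Hypothetical case: the model follows the observed pattern at half the level.
obs = [10.0, 20.0, 30.0, 40.0]
model = [5.0, 10.0, 15.0, 20.0]
print(nmb(model, obs), pearson_r(model, obs), ioa(model, obs))
```

In this hypothetical case the correlation is 1 to within rounding while the NMB is −50% and the IoA is about 0.67, which shows why a model can track temporal changes well and still underestimate concentration levels.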
The NO2 results differed considerably among the models. Among these models, M31, M32, and M30 tended to overestimate elevated NO2 levels. The lower levels of NO2 obtained by M30 were often comparable to the NO2 concentration obtained by M03, which provided considerably lower NO2 concentrations than the other models. These results suggest that differences in meteorological conditions and NOx chemistry produced the NO2 discrepancy between the models. Most of the models produced better results for NO2 than for NO, with ensemble averages of seasonal statistics, e.g., correlation values, of 0.56 (d03) and 0.55 (d04), 0.72 (d03), and 0.71 (d04), particularly in the winter.
Over the year, most models clearly overestimated the observed SO2, with an ensemble bias of 1.7–4.2 ppbv (NMB: 120%–350%) for d03 and 1.5–2.5 ppbv (NMB: 160%–470%) for d04. In addition, relatively high SO2 levels were found for M30, M31, and M32. Meanwhile, M03 and M20 tended to produce lower concentrations than the other models, with negative biases of −1.3 ppbv (M03) and −0.2 ppbv (M20) recorded especially in the spring, and exhibited better performance (IoA: 0.58–0.59) than the other models (IoA: 0.30–0.39), especially in the winter. The SO2 emissions input into the two CMAQ simulations (M03 and M20) differed from the SO2 emissions of J-STREAM. For example, for d03, the SO2 emissions of J-STREAM in both the total and bottom layers were more than twice those of M03. Meanwhile, for d04, which includes active volcanoes, although the total SO2 emissions of J-STREAM were half those of M03, the bottom-layer SO2 emissions of J-STREAM were 1.3 times those of M03. The differences in the SO2 emission amounts allocated to the lower layers possibly affected the simulated atmospheric SO2 concentrations. The second-best model setting, M03, performed slightly better (IoA: 0.41) than other models, which suggests that atmospheric SO2 concentrations were considerably affected by the input emission conditions, including the injection heights. Although modifying the emission conditions helps to produce better SO2 simulations, such modifications alone cannot realistically resolve the overestimation of SO2 (up to 470%).
The differences among models with respect to emissions, chemistries, and meteorological conditions led to major differences in simulated primary pollutant concentrations; moreover, the simulated differences between similar model settings increased in the winter.
4.2. Simulated Daily Concentrations of PM2.5 Components and Total PM2.5 Mass
Figure 8 and Figure 9 present the spatial averages of observed and simulated daily concentrations of PM2.5 components (SO42−, NO3−, NH4+, EC, and OC) and total PM2.5 mass from different AAPMSs in d03 and d04, respectively. The seasonal ensemble performances of the participant CTMs at each AAPMS are also summarized as statistics in Table 4 and Table 5 for each domain. The goal and criteria levels for the CTM performance statistics are listed in Table A1: those for the NMB, normalized mean error (NME), and correlation were recommended by Emery et al. [52], and those for the fractional bias (FB) and fractional error (FE) were recommended by Boylan and Russell [53]. Individual model performance reports for each CTM are shown in Table A2 and Table A3.
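As a rough illustration of how the NME, FB, and FE can be computed and compared against goal or criteria levels, consider the sketch below. It assumes the per-sample mean fractional forms for FB and FE and the summed-observation normalization for NME; the formulas actually used are not given in this section. The limit values in the example are placeholders chosen for illustration only and should be replaced with the levels listed in Table A1; all function names are hypothetical.

```python
def nme(model, obs):
    """Normalized mean error in percent (assumed definition)."""
    return 100.0 * sum(abs(m - o) for m, o in zip(model, obs)) / sum(obs)


def fractional_bias(model, obs):
    """Mean fractional bias in percent (assumed per-sample form)."""
    terms = [2.0 * (m - o) / (m + o) for m, o in zip(model, obs)]
    return 100.0 * sum(terms) / len(terms)


def fractional_error(model, obs):
    """Mean fractional error in percent (assumed per-sample form)."""
    terms = [2.0 * abs(m - o) / (m + o) for m, o in zip(model, obs)]
    return 100.0 * sum(terms) / len(terms)


def within_limits(stats, limits):
    """Return, per statistic, whether its absolute value is within the limit.

    Only suitable for statistics bounded from above (NMB, NME, FB, FE);
    the correlation needs a lower-bound check instead.
    """
    return {name: abs(stats[name]) <= limit for name, limit in limits.items()}


# Hypothetical daily concentrations (ug/m3): errors cancel in FB but not in FE.
obs = [1.0, 3.0]
model = [3.0, 1.0]
stats = {
    "FB": fractional_bias(model, obs),
    "FE": fractional_error(model, obs),
}
placeholder_limits = {"FB": 30.0, "FE": 50.0}  # placeholders, not Table A1 values
print(stats, within_limits(stats, placeholder_limits))
```

With these hypothetical values the FB is 0% and the FE is 100%, so the bias check passes while the error check fails. This is the reason bias-type and error-type statistics are reported together.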
With SO42− as a dominant PM2.5 component, most CTMs showed good agreement with daily concentration levels and day-to-day changes in both domains for each season, with the exception of a few model settings. Overall, the ensemble statistics, including the NMB (−0.85%, 1.65%), NME (30.34%, 29.11%), FB (3.66%, −13.77%), FE (30.41%, 34.28%), and correlation (0.74, 0.86), passed the goal level in d03 for summer and autumn. For d04, the NMB (−7.5%), NME (30.34%), FB (−13.04%), FE (32.83%), and correlation (0.84) passed the goal level for summer. With the exception of d03 in winter and d04 in summer, the correlation and IoA indicated excellent performance, with maximum values of 0.74–0.88 and 0.79–0.87 for d04 in winter. Most CTMs underestimated the observed SO42− in d04 on 29–30 July, with relatively low values for the correlation (0.36) and IoA (0.52). This result may lead to underestimations of the total PM2.5 mass in connection with the NH4+ concentrations. WRF-Chem (M30 and M31) clearly overestimated SO42− concentrations in PM2.5 due to the SO42− mass build-up problem associated with the nucleation calculation in MADE/SORGAM [54]. In addition, the WRF-Chem group employed their own physical parameterizations, such as cumulus convection and microphysics, for their meteorological simulations. Additional sensitivity simulations for meteorological fields are required to quantitatively evaluate the inter-model differences in SO42− and total PM2.5 mass concentrations owing to the differences in meteorological simulations; we will perform these in the next phase. The largest positive biases were found in M31, with 3.0–9.7 µg/m3 (NMB: 52%–177%) for d03 and 3.8–10.2 µg/m3 (NMB: 131%–240%) for d04. These simulated overestimations were slightly higher for CMAQ Version 4.7.1 (M27 and M28), particularly for d04 in spring. This trend indicates that the updated sulfur chemistries in CMAQ Version 5.0 [35,55,56,57,58] enhanced the performance of this model compared to the previous versions. In winter, CAMx (M29) performed better, with biases of −0.32 µg/m3 (NMB: −7.3%) for d04 and 0.39 µg/m3 (NMB: 3.4%) for d04 under the same emission condition. This result is attributed to an underestimation of SO42− by the dominant participant model, CMAQ, which may be caused by an inadequate aqueous-phase SO42− production by Fe- and Mn-catalyzed O2 oxidation [14].
All participant CTMs overestimated NO3− levels in warmer seasons, with ensemble biases of 1.22–1.55 µg/m3 (NMB: 194%–651%) for d03 and 0.85–1.99 µg/m3 (NMB: 145%–588%) for d04. The largest positive biases were found in summer. Above all, M20, M30, and M31 strongly overestimated elevated NO3− levels. Only M11 showed relatively good agreement with observations for d03 in summer, with a minimum bias of 0.12 µg/m3 (NMB: 91%) and improved values for the correlation (0.46) and IoA (0.54). However, M11 also produced low concentrations for SO42− and NH4+. For d04 in autumn, all models exhibited better performance for the daily concentration levels and day-to-day changes in NO3−. For example, M30 had a minimum bias of 0.14 µg/m3 (NMB: 11%), which passed the goal NMB level for 24-h NO3−. Some deviations in NO3− between observations and the models were attributed to NH4+ and potentially NH4NO3. In winter, most models reproduced day-to-day changes in both domains but tended to underestimate elevated NO3− levels, with ensemble mean biases of −0.89 µg/m3 (NMB: −18.9%) and −2.36 µg/m3 (NMB: −42.8%). A previous model inter-comparison study for the Tokyo metropolitan area, UMICS, concluded that the participant models overestimated NO3− levels in both summer and winter [11,12], although the available observations included only one winter and three summer stations. In our validations, most models produced higher NO3− levels in spring and summer, lower NO3− levels in winter, and moderate NO3− levels in autumn, compared with accumulated observation data for d03 and d04. This result is expected to be more accurate than previous reports because a greater number of observations (for 18–22 stations) were included.
As mentioned above, the day-to-day variations in NH4+ were consistent with those of SO42− and NO3−. Therefore, most CTMs showed good agreement with daily concentration levels and day-to-day changes in both domains for each season, with the exception of some elevated peaks. Above all, the ensemble performance indicators, FB and FE, were −27.9% to 8.9% and 34.3% to 41.1%, respectively, thus passing the goal level in both domains for all seasons except winter. Notably, the differences among models increased in summer. Two WRF-Chem models (M32 and M31) predicted higher NH4+ levels, with biases of 1.96–3.03 µg/m3 (NMB: 84%–130%) and 1.72–2.71 µg/m3 (NMB: 85%–61%) for d03 and d04, respectively. The M20 model, which employed EAGrid for emissions and an original configuration for meteorology, also produced relatively high NH4+ levels in d03, with a bias of 1.68 µg/m3 (NMB: 51%). These overpredictions were likely associated with those of SO42− and NO3− in summer. Meanwhile, relatively larger negative biases were found for M11, at −1.18 µg/m3 (NMB: −35%) for d03 and −0.81 µg/m3 (NMB: −33%) for d04.
The EC levels simulated by most CTMs were considerably lower than the observations in both domains for all seasons. The model ensemble biases were −0.90 to −0.20 µg/m3 (NMB: −46% to −22%) and −2.77 to −0.39 µg/m3 (NMB: −58% to −40%) for d03 and d04, respectively, with larger values for Tokyo. Both models employing EAGrid2000-JAPAN (M20 for d03 and M27 for d04) produced higher EC values than other CTMs with different emission settings and obtained relatively better NMB values, at −20% to −3% and −35% to 42%, respectively. This trend suggests that the EC emissions of J-STREAM might be underestimated.
The CTMs reproduced some of the elevated OC levels in the warmer seasons but clearly underestimated the observed OC levels for autumn and winter, with model ensemble biases of −1.78 to −0.01 µg/m3 (NMB: −42% to 7%) and −2.77 to −0.81 µg/m3 (NMB: −59% to −39%) for d03 and d04, respectively, which are similar to the EC values. Additionally, as observed for the EC, the negative biases of OC for the Tokyo area were larger than those for western Japan. However, the negative biases of all participant CTMs were clearly moderated compared with the UMICS cases [11,12]. Among the models, M02, M03, and M11 predicted relatively higher OC levels and overestimated the summer OC concentrations. Full-domain nesting simulations were performed by M02 and M03 using a relatively recent CMAQ model (Version 5.1), which includes updates for some chemical and aerosol mechanisms, such as POA aging, SOA mass yields with new pathways from isoprene, alkanes, and PAHs, and SOA formation reactions in the aqueous-phase chemistry. Continual nesting simulations for the Asian scale (d01) performed by CMAQ Version 5.1 exhibited higher regional-scale OC levels, leading to higher OC levels in urban areas in Japan compared with previous versions. Thus, an empirical SOA yield model can predict the same OC concentration level as the VBS model M11. It should be noted that the effect of the updated SOA yield mechanisms was not clear at the urban scale when using CMAQ Version 5.1 or higher (e.g., M01, M04–05). Additionally, more observational data are needed to evaluate the simulated OC concentrations.
Overall, most CTMs showed good agreement with observed concentration levels of total PM2.5 mass in both domains for each season. These results are likely associated with the reproducibility of some dominant components, e.g., SO42− and NH4+. However, the CTMs tended to fail at reproducing some heavily polluted situations and underestimated the considerably high PM2.5 concentrations (approximately 40–50 µg/m3). A considerable underestimation (≈30 µg/m3) of total PM2.5 associated with PM2.5 components, except for SO42−, was observed for d04 in the winter season, on 25 January and 2 February; during that time, the nighttime simulated surface temperature was clearly higher than that in the observations (Figure 3). This implies that the higher simulated surface temperature produced weaker atmospheric stability, which in turn produced weaker accumulation of particulate pollutants at nighttime, especially during colder seasons. The model ensemble biases were −8.66 to −0.99 µg/m3 (NMB: −43% to −5%) for d03 and −11.98 to −2.91 µg/m3 (NMB: −55% to −19%) for d04. The largest negative biases were found in winter due to underestimations of NH4NO3, particularly for d04. M31 and M32 tended to overpredict the total PM2.5 due to overestimates of inorganic compounds. Of the model ensemble statistics for d03, the NMB (−5%, 13%), NME (22%, 26%), FB (−9%, −17%), FE (26%, 29%), and correlation (0.81, 0.78) passed the goal level for 24-h total PM2.5 mass in spring and summer, respectively. In addition, the majority of the other statistical indicators passed the criteria levels as well.