Model Performance Differences in Fine-Mode Nitrate Aerosol during Wintertime over Japan in the J-STREAM Model Inter-Comparison Study

In this study, the results for nitrate (NO3 − ) aerosol during winter from the first-phase model inter-comparison study of Japan's Study for Reference Air Quality Modeling (J-STREAM) were analyzed. To investigate the models' external and internal settings, the results were limited to Community Multiscale Air Quality (CMAQ) models. All submitted models generally underestimated NO3 − over the urban areas in Japan (e.g., Osaka, Nagoya, and Tokyo); however, some model settings showed distinct behavior. The differences due to the model external settings were larger than those due to the model internal settings. Emissions were an important factor, and emissions configured with lower NOx emissions and higher NH3 emissions led to a higher NO3 − concentration as the NH3 was consumed under NH3-rich conditions. The model internal settings of the chemical mechanisms caused differences over China, and this could affect western Japan; however, the difference over Tokyo was smaller. To obtain a higher NO3 − concentration over the urban areas in Japan, selecting the HONO option for the heterogeneous reaction and the inline calculation of photolysis was desirable. For future studies, the external settings of the boundary condition and the meteorological field require further investigation.


Introduction
Air quality modelling is an approach to improving our understanding of the behavior of air pollutants. Modelling systems are based on numerical representations of processes, such as emissions, transport (e.g., advection and diffusion), chemical reactions, and the deposition of air pollutants, and these processes contain uncertainties [1]. Model inter-comparison studies are valuable for understanding the uncertainties in modelling and for seeking ways to improve the modelling. Model inter-comparison studies have been conducted in Japan, where the air pollution comes from both long-range transport from the Asian continent and local air pollution [2][3][4][5]. A project called Japan's Study for Reference Air Quality Modeling (J-STREAM) has begun. As reported in the introduction and overview of J-STREAM [6], the project aims to establish reference air quality modelling for source apportionment and to formulate a strategy to suppress secondary air pollutants, including particulate matter with diameters of less than 2.5 μm (PM2.5) and photochemical ozone (O3), in Japan through model inter-comparison studies. The first-phase focused on ascertaining the ranges and limitations of PM2.5 and O3 concentrations simulated by participants using common input datasets. The participating models are Community Multiscale Air Quality (CMAQ), Comprehensive Air quality Model with eXtensions (CAMx), and Weather Research and Forecasting-Chemistry (WRF-Chem).
The first-phase of J-STREAM targeted the 15 months from January 2013 to March 2014, or specific periods in each season that corresponded to the government's monitoring of PM2.5. The findings of the first-phase of J-STREAM have been published in several studies [7][8][9][10][11]. A vegetation database covering Japan was introduced, and this resource was used to improve the meteorological fields as well as biogenic emissions for volatile organic compounds [7]. All models overestimated O3 over urban areas in Japan during summertime, and the background O3 concentration had a large effect, even in urban areas in Japan. To reduce this overestimation, halogen chemistry, dry deposition velocity, precursors over the Asian continent, and vertical transport were identified as key factors [8]. A detailed analysis of the effect of photochemical mechanisms on O3 concentration was also performed. The differences in O3 concentration caused by photochemical mechanisms were within 10 ppbv, and this could partly explain the overestimation of O3 concentration over Japan [9]. An overview of the model performances for PM2.5 during four representative seasons has been published [10]. This overview highlighted that the underestimation of PM2.5 was mostly noted during winter in the first-phase of J-STREAM [10]. To overcome this model shortcoming, sulfate aerosol (SO4 2− ), a major component of PM2.5 in Japan, was studied, and SO4 2− production in the current modeling was reviewed and refined via aqueous phase reactions in CMAQ [11].
In this study, we focused on nitrate aerosol (NO3 − ), which is a major component of PM2.5 during the winter. We analyzed the performance of each model in detail and investigated the cause of the underestimation. The remainder of the paper is organized as follows. The methodology of the model inter-comparison in the first-phase of J-STREAM is outlined in Section 2. The results of the first-phase of J-STREAM for NO3 − are described in Section 3. The reasons for the model underestimation of NO3 − are discussed in Section 4. Finally, the conclusions from this study and recommendations for the model settings are reported, and subjects for future research are summarized in Section 5.

Methodology
The setup of the model was fully described in our previous overviews of the first-phase of J-STREAM [6,10]; hence, only a brief summary is provided here. Four nested domains were specified as the target domains and commonly used in J-STREAM. Domain 1 (d01) covered the countries in East and Southeast Asia, domain 2 (d02) covered most of Japan, and domains 3 (d03) and 4 (d04) focused on urban areas in Japan. The horizontal grid resolution was 45 km for d01, 15 km for d02, and 5 km for d03 and d04. The numbers of horizontal grids in the west-to-east and south-to-north directions were 207 × 157 for d01, 141 × 147 for d02, 69 × 48 for d03, and 51 × 57 for d04. All domains were constructed with 30 nonuniform vertical layers from the surface up to 100 hPa, with a first-layer height of approximately 53 m. In Figure 1, the NOx emissions are plotted to illustrate the domains. There were large amounts of NOx emissions over the Asian continent and Japan, especially over urban areas in Japan. Domains d03 and d04 covered the Kansai region (including Osaka and Aichi prefectures, colored in grey in Figure 2 and hereafter called 'Osaka' and 'Nagoya' for simplicity) and the Kanto region (including Tokyo Metropolis, colored in grey in Figure 2 and hereafter called 'Tokyo' for simplicity), respectively; these regions contain the major population and economic centers of Japan and showed intense NOx emissions. In this study, we focused on d03 and d04, whose horizontal resolution we considered reasonable for urban-scale modeling [12,13].
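The nested-domain configuration can be recorded compactly. The following minimal sketch stores the resolutions and grid counts quoted above and derives the approximate horizontal extent of each domain. The dictionary layout and the helper name `extent_km` are illustrative assumptions, as is the interpretation of the quoted counts as numbers of grid cells.

```python
# Grid settings quoted in the text; the layout and names are illustrative only.
DOMAINS = {
    "d01": {"dx_km": 45, "nx": 207, "ny": 157},
    "d02": {"dx_km": 15, "nx": 141, "ny": 147},
    "d03": {"dx_km": 5, "nx": 69, "ny": 48},
    "d04": {"dx_km": 5, "nx": 51, "ny": 57},
}


def extent_km(name):
    """Return the (west-east, south-north) extent of a domain in km.

    Assumes nx and ny count grid cells, so extent = count * resolution.
    """
    domain = DOMAINS[name]
    return domain["nx"] * domain["dx_km"], domain["ny"] * domain["dx_km"]


if __name__ == "__main__":
    for name in DOMAINS:
        west_east, south_north = extent_km(name)
        print(f"{name}: {west_east} km x {south_north} km")
```

Under these assumptions, d03 and d04 each span a few hundred kilometers, which is consistent with their description as urban-scale domains.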
The first-phase of J-STREAM for PM2.5 targeted the periods of governmental monitoring of PM2.5 in each season. The representative 2 weeks of intensive observations in each season recommended by the Ministry of the Environment, Japan, were conducted in the 2013 fiscal year (i.e., April 2013 to March 2014) from 7 to 26 May, 2013; 22 July to 10 August, 2013; 21 October to 9 November, 2013; and 20 January to 8 February, 2014. To focus on the behavior of NO3 − during winter when the conditions for NO3 − production are favorable, observations from 20 January to 8 February, 2014 were used in this study; this period is referred to as 'winter' hereafter. PM2.5 samples were collected daily from 10:00 local time to 10:00 the next day at most sites, and the components of inorganic ions (e.g., SO4 2− , NO3 − , and NH4 + ), carbon (elemental and organic carbon), and trace elements were collected. The inorganic ions were analyzed by ion chromatography, and trace elements were analyzed by inductively coupled plasma-mass spectrometry. Over the Kansai region, PM2.5 monitoring was conducted at 24 ambient air pollution monitoring stations (APMSs) and 14 roadside air pollution monitoring stations (RAPMSs). Over the Kanto region, monitoring was conducted at 25 APMSs (one station was designated as a background monitoring station) and 8 RAPMSs. APMSs are designed to monitor general air quality not affected by specific sources, such as roads; hence, APMSs were used to compare the model performances in this study. The participating models in this NO3 − analysis during winter are listed in Table 1. The model information was updated from an overview of J-STREAM [1], and the models were selected for winter from all models submitted to the first-phase of J-STREAM [6]. To focus on the important factors that caused the performance differences for NO3 − within the same model, we limited the number of CMAQ models. Five versions of CMAQ (versions 4.7.1, 5.0.1, 5.0.2, 5.1, and 5.2) were analyzed. 
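Because the PM2.5 samples were collected from 10:00 local time to 10:00 the next day, hourly model output has to be averaged over the same windows before it is compared with the observations. The following is a minimal sketch of such an aggregation. The function names, the input format (pairs of local timestamp and value), and the convention that each hourly value is stamped at the start of its hour are assumptions made for illustration and are not taken from the J-STREAM processing chain.

```python
from datetime import datetime, timedelta


def sampling_window_start(t, start_hour=10):
    """Return the start of the 10:00-to-10:00 sampling window containing t."""
    start = t.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if t < start:
        # Times before 10:00 belong to the window that began the previous day.
        start -= timedelta(days=1)
    return start


def daily_window_means(hourly):
    """Average (timestamp, value) pairs within each sampling window."""
    sums = {}
    for t, value in hourly:
        key = sampling_window_start(t)
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + value, count + 1)
    return {key: total / count for key, (total, count) in sorted(sums.items())}


if __name__ == "__main__":
    # Hypothetical hourly values: 1.0 for the first window, 3.0 for the second.
    t0 = datetime(2014, 1, 20, 10)
    hourly = [(t0 + timedelta(hours=h), 1.0 if h < 24 else 3.0) for h in range(48)]
    for start, mean in daily_window_means(hourly).items():
        print(start.isoformat(), mean)
```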
In this study, 17 models were used for the analysis of d03, and 20 models were used for d04. The external and internal model settings are listed in Table 1. 'External settings' refers to the model settings that drive the model from the outside: the meteorological field, emissions, initial conditions, and boundary conditions. In the framework of the first-phase of J-STREAM, these external datasets were provided to participants as common data, and participants could voluntarily prepare different external datasets. The domains were unified; however, participants could choose either to simulate from d01 down to d03 and d04 or to simulate only d03 and/or d04. Therefore, if a participant conducted the simulation from d01, the boundary condition over d03 and d04 was different from the common dataset. The simulations with the M02, M03, M07, M08, M09, M12, M13, M14, and M15 models were conducted from d01, and the boundary condition for the outside of d01 was taken from the CHASER global model [14] submitted to the Hemispheric Transport of Air Pollution (HTAP) 2 experiment [15], except for that in M07, which used the MOZART global model [16]. The common boundary condition dataset was based on M15, which is called the 'reference model simulation' hereafter. The M01, M04, M05, M06, M22, M23, M24, M25, M26, M27, and M28 models used this common boundary condition from M15 for the simulation over d03 and/or d04.
The common emission dataset in the first-phase of J-STREAM consisted of the anthropogenic emissions taken from HTAP version 2.2 [17] over Asia, the Japan Auto-Oil Program Emission Inventory Database (JEI-DB) [18] over Japan, the biomass burning emissions estimated by the Global Fire Emission Database (GFED) version 4.1 [19], the biogenic emissions calculated from the Model of Emissions of Gases and Aerosols from Nature (MEGAN) version 2.1 [20], and volcanic SO2 emissions used in Aerosol Comparisons between Observations and Models (AeroCom) [21] with replacements and additions based on the report from the Japan Meteorological Agency [22] for domestic volcanoes. M03 and M27 used different emission datasets for anthropogenic emissions in Japan; M03 combined JEI-DB [18] for automobile sources and EAGrid2010-JAPAN for other sources [23], and M27 used EAGrid2000-JAPAN [24]. These EAGrid emission datasets focused on the years 2010 [23] and 2000 [24] and were therefore outdated relative to the analyzed period in the 2013 fiscal year. However, all bottom-up emissions include uncertainties, and it was beneficial to use these emissions in order to examine the differences that arose from emissions. The meteorological field was simulated by the Weather Research and Forecasting (WRF) model version 3.7.1 [25], and its configurations were provided in the overview for this project [6]. Most models used this common meteorological dataset; however, M07 used its own WRF meteorological simulation. The common dataset of initial conditions was also provided by M15; however, M07, M08, M09, M12, M13, and M15 were simulated with different initial conditions because these models were used to perform a continuous simulation to cover all periods of the 2013 fiscal year.
Models M02, M03, M07, M08, M09, M12, M13, M14, M15, and M27 were configured with at least one external setting that differed from the common dataset, whereas M01, M04, M05, M06, M22, M23, M24, M25, M26, and M28 shared the same external settings.
'Internal settings' refers to the internal options for the model configurations: chemical mechanisms, aerosol modules, and some optional settings in the model configuration. The models used the Carbon Bond (CB) 05 [26], Statewide Air Pollution Research Center (SAPRC) 07 [27] and 99 [28], and Regional Atmospheric Chemistry Mechanism (RACM) 2 [29] chemical mechanisms. Depending on the version of CMAQ, the aerosol module was selected as AERO6 or AERO5. One of the notable differences before and after CMAQ version 5.0 is the update of the thermodynamic equilibrium calculation from ISORROPIA version 1.1 [30] to version 2.1 [31]. Along with this update, detailed aerosol profiles were used to subdivide the emissions of uncategorized particulate matter into primary ammonium (NH4 + ), sodium (Na + ), chloride (Cl − ), non-carbon organic mass, and trace elements (Ca 2+ , K + , Mg 2+ , Al, Si, Ti, Fe, and Mn) [32]. Though both AERO6 and AERO5 are based on the yield model for secondary organic aerosol (SOA) [33], the yield parameterization was updated in AERO6. In addition to the differences in the chemical mechanisms and aerosol modules used in the CMAQ models, some configurations were different. Among the internal settings of CMAQ, we focused on the KZMIN, photolysis, and HONO options. These options were configured differently in the CMAQ models submitted in the first-phase of J-STREAM. If KZMIN was used, the model calculated the minimum vertical diffusion coefficients considering the fraction of the urban area within a given grid. The photolysis rates were calculated inline or based on a look-up table [34]. The HONO option was used for the heterogeneous HONO-producing reactions on the ground surface [35]. For the deposition process, the dry deposition of all CMAQ models was based on the M3DRY scheme [36], with some updates for the explicit treatment of organic nitrate species after version 5.1 [37].
The wet scavenging was common according to the common meteorological field, except for M07. Even if the participants used all the common data, each model could be configured differently in the internal settings. Models M14 and M15 were identical regarding the external and internal settings.
Table 1 (table body not recovered). Notes: 1 Domain: "X" indicates the domain in which the participant conducted their simulation. "-" indicates that the participant did not conduct their simulation over domains 1 and 2, and they used the common dataset for the simulation of domains 3 and/or 4; these models are shaded in grey. 8 Option in CMAQ for the production of HONO from the heterogeneous reaction on the ground surface: "Y" indicates this option was switched on, and "N" indicates this option was switched off.

Overview of Model Performance
The spatial distribution of the observed NO3 − concentration at APMSs averaged during winter is mapped in Figure 2. The observed NO3 − concentration showed lower levels (i.e., less than 1.0 μg/m 3 ) only at remote sites (e.g., near the edge of the domains), and the concentration level increased sharply (i.e., larger than 3.0 μg/m 3 ) over the center of the Kansai and Kanto regions. The concentrations averaged during winter across the APMSs shown in Figure 2 over d03 and d04 are summarized in Figure 3. For the CMAQ model results, the Aitken, accumulation, and coarse modes were treated in the aerosol module, and the simulated hourly averaged concentration and the information about aerosol distributions available in CMAQ after version 5.0 were used to calculate concentrations that corresponded to the PM2.5 observations. In contrast, the sum of the Aitken and accumulation modes was simply used for CMAQ version 4.7.1. Over d03, the averaged observed concentration was 2.8 μg/m 3 , and the concentrations in the models ranged from 2.0 to 2.6 μg/m 3 , indicating a slight underestimation. Over d04, the averaged observed concentration was 4.0 μg/m 3 , and the concentrations in the models ranged from 1.9 to 2.9 μg/m 3 . The models tended to underestimate the observed concentration, and the underestimation was larger over d04 than over d03.
Figure 3 (caption, partially recovered): Whiskers show the standard deviation for the data. White bars are the observations at APMSs, and the black, grey, and red bars are the model results. Black indicates models that used a different external dataset, grey represents models that used the common dataset (see also Table 1), and red represents the ensemble mean.
In Figure 3, the model performance based on the ensemble mean is also shown. The ensemble mean was the simple average of all participating models:
$$\overline{C} = \frac{1}{N}\sum_{i=1}^{N} C_i \qquad (1)$$
where C is the concentration simulated by each model and N is the number of models. N is 17 for d03 and 20 for d04 (see Table 1).
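The ensemble mean is a simple unweighted average across models at each time step. A minimal sketch follows; the function name, the dictionary input format, and the example values are hypothetical and serve only to illustrate the calculation.

```python
def ensemble_mean(series_by_model):
    """Average equally long concentration series across models, step by step."""
    series = list(series_by_model.values())
    if not series:
        raise ValueError("at least one model is required")
    n_models = len(series)
    n_steps = len(series[0])
    if any(len(s) != n_steps for s in series):
        raise ValueError("all model series must have the same length")
    return [sum(s[t] for s in series) / n_models for t in range(n_steps)]


if __name__ == "__main__":
    # Hypothetical daily NO3- concentrations (ug/m3) for three models.
    example = {
        "model_a": [2.0, 4.0, 1.0],
        "model_b": [3.0, 2.0, 1.0],
        "model_c": [1.0, 3.0, 4.0],
    }
    print(ensemble_mean(example))
```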
The performance of each model and the ensemble mean was determined from statistical analysis based on the correlation coefficient (R), normalized mean bias (NMB), normalized mean error (NME), mean fractional bias (MFB), and mean fractional error (MFE) (see Appendix A for their mathematical definitions). Model performance goals and criteria have been proposed for R, NMB, and NME [38] and for MFB and MFE [39]. For NO3 − , R was not used to determine the model performance [38]. The proposed model performance goals were NMB < ±15% and NME < +65% for the best model performance, and the proposed model performance criteria were NMB < ±65% and NME < +115% for acceptable model performance for daily NO3 − concentration levels [38]. The model performance goals were also proposed as MFB ≤ ±30% with MFE ≤ +50% for the best model performance, and the model performance criteria were proposed as MFB ≤ ±60% with MFE ≤ +75% for acceptable model performance [39]. The performance of each model and the model ensemble mean over d03 and d04 is summarized based on these judgements in Table 2. The R values were around 0.2 over d03 and around 0.3 over d04, indicating that the linear correspondence between the models and observations was weak. The NMB of all models showed negative bias over d03 and d04 (Figure 3). NMB met the performance criteria over d03 and d04, and the scores were better over d03 than over d04. Some models (M02, M03, M05, M08, M12, M14, and M15) met the performance goal over d03. Thus, the M15 reference model achieved the NMB performance goal over d03, whereas some other models met only the performance criteria for NMB over d03.
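The five statistics and the goal/criteria judgement can be sketched as follows. Appendix A is not reproduced here, so the formulas below are the definitions commonly used in air quality model evaluation and should be read as an assumption rather than as the authors' exact definitions. The function names and the `rating` helper are hypothetical, and the sample values are invented.

```python
import math


def nmb(model, obs):
    """Normalized mean bias in percent."""
    return 100.0 * sum(m - o for m, o in zip(model, obs)) / sum(obs)


def nme(model, obs):
    """Normalized mean error in percent."""
    return 100.0 * sum(abs(m - o) for m, o in zip(model, obs)) / sum(obs)


def mfb(model, obs):
    """Mean fractional bias in percent."""
    return 200.0 / len(obs) * sum((m - o) / (m + o) for m, o in zip(model, obs))


def mfe(model, obs):
    """Mean fractional error in percent."""
    return 200.0 / len(obs) * sum(abs(m - o) / (m + o) for m, o in zip(model, obs))


def pearson_r(model, obs):
    """Pearson correlation coefficient between model and observation."""
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)


def rating(value, goal, criteria, inclusive=False):
    """Classify |value| (percent) as 'goal', 'criteria', or 'not met'."""
    magnitude = abs(value)

    def within(limit):
        return magnitude <= limit if inclusive else magnitude < limit

    if within(goal):
        return "goal"
    if within(criteria):
        return "criteria"
    return "not met"


if __name__ == "__main__":
    obs = [2.0, 2.0, 4.0]    # invented daily observations (ug/m3)
    model = [1.0, 2.0, 3.0]  # invented daily model values (ug/m3)
    print("NMB", rating(nmb(model, obs), 15.0, 65.0))
    print("NME", rating(nme(model, obs), 65.0, 115.0))
    print("MFB", rating(mfb(model, obs), 30.0, 60.0, inclusive=True))
    print("MFE", rating(mfe(model, obs), 50.0, 75.0, inclusive=True))
```

The thresholds passed to `rating` are those quoted in the text; the `inclusive` flag reflects the distinction between the strict bounds quoted for NMB and NME and the inclusive bounds quoted for MFB and MFE.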
This result suggested that even though the common dataset was used (i.e., M01, M04, M06, M22, M23, M24, and M26), the model performance could be worsened compared to the M15 reference model simulation; hence, the internal model settings in the model configuration should be considered. All models met the performance criteria for NME over d03 and d04, and the scores were slightly better over d04, in contrast to the results for NMB. This could have been related to the better R value over d04 compared with d03. The results for MFB were interesting. All models showed negative NMB values, whereas MFB showed positive or negative values over d03. Models M06, M07, M13, M23, M24, and M26 showed negative values for both NMB and MFB, and the NMB values were larger than for the other models, suggesting that these models tended to substantially underestimate the NO3 − concentration. For all models, the MFE scores were above 80% and did not meet the model performance criteria.
The model performances were examined in terms of the horizontal distribution. The simulated horizontal distribution averaged during winter by the ensemble mean is shown in Figure 4 (top). Each model result is shown in Figures S1 and S2 in the supplementary material for d03 and d04, respectively. The mean bias between the ensemble mean and APMSs is spatially distributed in Figure  4 (center). Over remote areas (i.e., the edge of domain), the models performed well within a mean bias of −1.0 to +1.0 μg/m 3 . However, the model results were underestimated over central urbanized areas with intense emissions, especially in d04. To examine the model performance diversity, the coefficient of variation (CV) was calculated and is shown in Figure 4 (bottom). The CV is defined as the standard deviation among models divided by the model ensemble mean; therefore, a large value of CV indicates inconsistency among the models, whereas a small value indicates consistency. Only one grid over Osaka showed a large (i.e., greater than 0.30) CV over d03, and the value of CV was generally larger over the ocean rather than the land. This was the same over d04; the model differences in the south east in this domain (e.g., the Pacific Ocean) were large, and CV values between 0.12 and 0.18 were found over the Tokyo area. The value of CV was larger over the oceans, but the NO3 − concentration over the ocean was smaller (i.e., below 1.0 μg/m 3 ). For a detailed analysis of the model performances, representative sites in Osaka, Nagoya, and Tokyo (indicated by pink arrows in Figures 2 and 4) were investigated. Over Osaka, the site with the largest mean bias was found over southern Osaka and was selected. Over Nagoya, all models generally reproduced the NO3 − concentration well and the mean bias was smaller; hence, the highest concentration site was chosen. Over Tokyo, the largest mean bias was found at a north-eastern site, which was selected. 
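The coefficient of variation at a single grid cell can be sketched as below. Whether the study used the population or the sample standard deviation is not stated, so the use of the population form (`statistics.pstdev`) is an assumption; the function name and the example values are hypothetical.

```python
import statistics


def coefficient_of_variation(model_values):
    """Standard deviation among models divided by the ensemble mean.

    Uses the population standard deviation (an assumption, see lead-in).
    """
    mean = statistics.fmean(model_values)
    if mean == 0:
        raise ValueError("CV is undefined when the ensemble mean is zero")
    return statistics.pstdev(model_values) / mean


if __name__ == "__main__":
    # Invented NO3- concentrations (ug/m3) from four models at one grid cell.
    grid_cell = [2.0, 2.5, 3.0, 2.5]
    cv = coefficient_of_variation(grid_cell)
    print(f"CV = {cv:.2f}", "(large)" if cv > 0.30 else "(not large)")
```

Because the ensemble mean is the denominator, low-concentration cells such as those over the ocean can show a large CV even when the absolute spread among models is small, which matches the pattern described above.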
The temporal variation of the observations and all models including the ensemble mean is shown in Figure 5 at the three representative sites. At the Osaka site, the high concentration over 8 μg/m 3 on 25 January was not simulated well by any of the models. This high concentration was also observed in Nagoya and Tokyo, and none of the models captured this high concentration either. At Osaka, one model (M03, discussed in detail in Section 4.1.2) showed a distinctly higher concentration and captured a high concentration on 24 January. At the Nagoya site, the model differences were smaller compared with the Osaka and Tokyo sites, and the model performance was better. At the Tokyo site, the high concentrations observed early on (25 January) and late in the analyzed period (2 February) were not well simulated by any of the models. Model performance differences were found in the early-to-middle part of the analyzed period. In summary, the models submitted to the first-phase of J-STREAM generally underestimated NO3 − concentration, especially in d04 of the Kanto region; however, some models showed different behavior.
Figure 5 (caption, partially recovered): (see Figures 2 and 4 for the location of these sites). Black lines represent models using different external datasets, grey lines represent models using the common dataset (see also Table 1), and red lines represent the ensemble mean.

Discussion
In this section, each model setting is examined in detail, and we discuss the reasons for the different model behavior. In these comparisons, the difference between two models is defined as:
$$\mathrm{Difference} = C_{\mathrm{model}} - C_{\mathrm{reference}} \qquad (2)$$
where, unless stated otherwise, C is the modelled NO3 − concentration and M15 was used as a reference model. This calculation of the difference is shown as the time-averaged value over the analyzed period in winter. Before the model external and internal settings were examined, the dependence of the performance on the model version was investigated. The majority of the participating CMAQ versions were 5.1 and 5.0.2, and M05 and M15 were considered appropriate as reference models because these models used a common dataset and internal settings (Table 1). In the first-phase of J-STREAM, CMAQ version 5.0.1 was the main participating model; however, the changes in version 5.0.2 were minor. From CMAQ versions 5.0.2 to 5.1, several changes were introduced [40,41]. These updates included the species treated in the SAPRC07 chemical mechanism [42], the SOA treatment [43], and the description of clouds related to the inline calculation of the photolysis rate [44]. The difference between M05 and M15 is shown in Figure 6. Though the difference between M05 and M15 ranged from −0.2 to +0.3 μg/m 3 in some grids, the spatially averaged difference was within ±0.1 μg/m 3 . Therefore, the model difference due to the model version was small (Figure 3), and the model differences caused by external and internal settings could play an important role. In Sections 4.1 and 4.2, the model external and internal settings, respectively, are analyzed and discussed.
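The difference fields and the spatially averaged differences quoted throughout this section can be sketched as follows, assuming that the difference is taken as the model minus the reference (consistent with positive values denoting a higher concentration than M15) and that the spatial average is an unweighted mean over grid cells. The function names and the small example grids are hypothetical.

```python
def difference_field(model, reference):
    """Cell-by-cell difference (model minus reference) of two 2-D fields."""
    if len(model) != len(reference):
        raise ValueError("fields must have the same number of rows")
    diff = []
    for row_m, row_r in zip(model, reference):
        if len(row_m) != len(row_r):
            raise ValueError("fields must have the same number of columns")
        diff.append([m - r for m, r in zip(row_m, row_r)])
    return diff


def spatial_summary(field):
    """Return (minimum, maximum, unweighted mean) over all grid cells."""
    cells = [value for row in field for value in row]
    return min(cells), max(cells), sum(cells) / len(cells)


if __name__ == "__main__":
    # Invented time-averaged NO3- fields (ug/m3) on a 2 x 2 grid.
    model = [[2.5, 1.0], [0.5, 3.0]]
    reference = [[2.0, 1.5], [0.5, 2.0]]
    low, high, mean = spatial_summary(difference_field(model, reference))
    print(f"range {low:+.1f} to {high:+.1f}, spatial mean {mean:+.2f}")
```

Positive and negative cells partly cancel in the mean, which is why a field with a wide range of local differences can still have a spatially averaged difference close to zero.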

Meteorology and Boundary Conditions
The model performance difference caused by the meteorology and boundary conditions could be seen by comparing M07 and the reference model M15 (see Table 1). This analysis was conducted from d01. In Figure 7, the modelled NO3 − concentration in the M15 reference model is shown over d01 and d02, and the difference between M07 and M15 is also shown. Over Asia (d01), high NO3 − concentrations above 10.0 μg/m 3 (dark red in Figure 7) were limited to over the Asian continent (i.e., near intense emission sources) and low concentrations below 1.0 μg/m 3 (white in Figure 7) were seen over most oceanic regions, except the East China Sea and the Sea of Japan. Over Japan (d02), high NO3 − concentrations above 2.5 μg/m 3 (yellow in Figure 7) were found over the western Japan, Kansai, and Kanto regions. The difference between M07 and M15 showed both effects caused by the meteorological field in different settings of the WRF meteorological model and boundary conditions taken from different global models. The effects near the boundaries of d01 were attributed mainly to the boundary conditions. Thus, the negative and positive differences over the western and northern edges of d01, respectively, stemmed from the effect of different global models. This difference was more than 1.0 μg/m 3 , and considering the low concentration level over the northern part of d01 (e.g., the far-east region of Russia), the selection of the global model will be an important setting for refining the model simulation over northern parts of the Asian continent. This point was also mentioned in the Asian-scale model inter-comparison study (MICS-Asia) [45,46]. Another possibility in addition to the aerosol components themselves was the difference caused by other species. For example, O3 could have contributed to NO3 − production, and M07 simulated higher O3 during winter [8]; this difference might have been partly related to the difference found in NO3 − in Figure 7. 
The impacts of the boundary condition and meteorology were further investigated by examining the differences between M07 and M15 over d03 and d04 (Figure 8). Over the Kansai region, the difference was sparsely found, whereas over the Kanto region, the difference was clearly visible as a higher concentration for M07 compared with M15 over the northern Tokyo area. The difference over the Kansai region was −0.5 to +0.3 μg/m 3 , and the spatially averaged difference was −0.04 μg/m 3 . In contrast, the difference over the Kanto region was mostly positive up to +0.7 μg/m 3 , and the spatially averaged value was +0.13 μg/m 3 . In terms of the meteorological field, the higher concentration found over the northern part of Tokyo was related to the characteristics of the land-sea breeze circulation over the Kanto region that transports high NO3 − concentrations centered over the Tokyo metropolitan area to the northern part of Tokyo. The role of the land-sea breeze is well known in summer [47]; however, its role is also enhanced in winter by the urban heat island effect caused by urbanization [48,49]. The meteorological field of the common dataset was evaluated for temperature, wind field, and precipitation in the overview paper [10], and it showed general agreement. An examination of different meteorological fields themselves is required, and the behavior of NO3 − caused by the meteorological field could be one of the key points for further study. In addition, the differences caused by the meteorological field and boundary conditions should be separately investigated in further studies to distinguish their respective impacts over the urban areas of Japan.

Emission
The model performance difference caused by emissions can be examined in M03 and M27, with M15 as the reference model (see Table 1). Models M03 and M27 performed differently; in particular, the bias (NMB and MFB) performance scores were better than in the other models (Table 2). Therefore, the emission external setting was one of the most important factors in the model differences. M03 and M15 were compared over d03, and M03, M15, and M27 were compared over d04. The differences in NO3 − concentration and relevant emissions and chemical species are presented in Figure 9 for d03 and Figure 10 for d04.
Over d03, the model performance difference for NO3 − concentration was inhomogeneous, but M03 gave higher concentrations over most parts of Osaka and a slightly higher concentration over Nagoya (Figure 9a). The difference ranged from −0.9 to +2.5 μg/m 3 , and the spatially averaged difference was only +0.10 μg/m 3 because of the inhomogeneous difference over d03. Figure 9a also illustrates the temporal variation for M03 and M15 at the two representative sites in Osaka and Nagoya shown in Figure 5. The performance of M03 at the Osaka site was distinct. For example, on 24 January, all models gave concentrations close to zero, except for M03, which simulated a concentration of 6.0 μg/m 3 and thus captured the observation. As in d03, M03 also gave a higher concentration over d04 (Figure 10a). The difference between M03 and M15 ranged from −0.4 to +0.8 μg/m 3 , and the spatially averaged difference was +0.16 μg/m 3 . Furthermore, compared with M03, M27 gave higher concentrations (Figure 10a). The difference between M27 and M15 ranged from −0.5 to +1.8 μg/m 3 , and the spatially averaged difference was +0.49 μg/m 3 . The temporal variations of NO3 − concentration at Tokyo for M03, M15, and M27 are also shown in Figure 10a. Over d04, as indicated by the comparison across APMSs (Table 2 and Figure 3), M27 simulated higher NO3 − concentrations than M03.
Over d03 and d04, we investigated the related species to explain the difference in model performance for NO3 − concentration. NOx emissions were higher in M15 than in both M03 and M27 over d03 and d04 (Figures 9b and 10b), and as a consequence, NOx concentration was higher in M15 than in M03 and M27 (Figures 9c and 10c). A comparison with the daily averaged observed NOx concentration showed that the higher concentration simulated by M15 matched the observations better. The NO3 − precursors, HNO3 and NH3 gas, were examined further. The difference in HNO3 concentration between M03 or M27 and M15 was small (Figures 9d and 10d), mostly within −0.3 to +0.3 ppbv over the whole of d03 and d04. Therefore, the difference in NO3 − concentration could be associated with NH3 behavior. The difference in NH3 emissions (the NH3 emissions of the common dataset are shown in Figure S3 in the supplementary material) was inhomogeneous (Figures 9e and 10e); however, higher NH3 emissions in M03 and M27 were found over Osaka and Tokyo, where higher NO3 − concentrations were also observed. Despite these higher NH3 emissions, NH3 concentrations were mostly lower in M03 and M27 than in M15.
In terms of the configuration of M03 and M27, the model versions were different from that of the M15 reference model (Table 1). In contrast, the differences between M03 and M02 and between M27 and M28 could be attributed solely to emissions, because M03 and M27 used emission datasets that differed from the common dataset. These additional analyses are shown in Figures S4 and S5 in the supplementary material. The spatial distributions of the differences were generally similar to those found relative to M15, and the time series of M02 and M28 were closer to M15 than to M03 and M27, respectively. As discussed in the introduction of Section 4, the difference due to the model version could have been small, and the differences that arose from the model external settings (i.e., emissions in this case) would have led to large model performance differences.
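This behavior was examined in detail based on the gas ratio (GR) analysis [50]. The GR indicates the sensitivity of NO3− to changes in SO4 2− and NH4+ concentration. For the SO4 2− concentration, the SO2 emissions in the common dataset are shown in Figure S6, and the model performances are presented in Figures S7 and S8 in the supplementary material for SO4 2− and SO2 in the same format as in Figures 9 and 10. The GR is defined as the ratio of free ammonia to the total nitrate as:

$$\mathrm{GR} = \frac{[\mathrm{NH_3}] + [\mathrm{NH_4^+}] - 2\,[\mathrm{SO_4^{2-}}]}{[\mathrm{HNO_3}] + [\mathrm{NO_3^-}]}$$

where the brackets denote molar concentrations. The adjusted GR (adjGR) replaces the factor of 2 in the GR with the degree of sulfate neutralization (DSN). The DSN is defined as:

$$\mathrm{DSN} = \frac{[\mathrm{NH_4^+}] - [\mathrm{NO_3^-}]}{[\mathrm{SO_4^{2-}}]}$$

where the DSN is equal to or greater than 2 if there is sufficient NH4+. As a result, the adjGR in Equation (3) is:

$$\mathrm{adjGR} = \frac{[\mathrm{NH_3}] + [\mathrm{NH_4^+}] - \mathrm{DSN}\,[\mathrm{SO_4^{2-}}]}{[\mathrm{HNO_3}] + [\mathrm{NO_3^-}]} = \frac{[\mathrm{NH_3}] + [\mathrm{NO_3^-}]}{[\mathrm{HNO_3}] + [\mathrm{NO_3^-}]} \tag{3}$$

The adjGR value was used to understand the NO3− production conditions; an adjGR value greater than 1 means that NH3 is abundant for neutralizing both SO4 2− and NO3− (NH3-rich).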
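The gas ratio diagnostics described above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis code. It assumes the commonly used definitions (free ammonia = NH3 + NH4+ − 2 SO4 2−, total nitrate = HNO3 + NO3−, DSN = (NH4+ − NO3−)/SO4 2−) and molar concentrations; the function name `gas_ratios` and the input values are hypothetical.

```python
def gas_ratios(nh3, nh4, hno3, no3, so4):
    """Return (GR, DSN, adjGR) from molar concentrations (e.g., umol/m3).

    Assumed definitions (not taken verbatim from the paper):
      GR    = (NH3 + NH4 - 2*SO4) / (HNO3 + NO3)
      DSN   = (NH4 - NO3) / SO4
      adjGR = (NH3 + NH4 - DSN*SO4) / (HNO3 + NO3)
    """
    total_nitrate = hno3 + no3
    gr = (nh3 + nh4 - 2.0 * so4) / total_nitrate
    dsn = (nh4 - no3) / so4
    # Algebraically identical to (nh3 + no3) / total_nitrate
    adj_gr = (nh3 + nh4 - dsn * so4) / total_nitrate
    return gr, dsn, adj_gr


# Hypothetical concentrations for one grid cell (umol/m3)
gr, dsn, adj_gr = gas_ratios(nh3=0.10, nh4=0.12, hno3=0.02, no3=0.04, so4=0.04)

# adjGR > 1 is read as NH3-rich, following the threshold stated in the text
regime = "NH3-rich" if adj_gr > 1.0 else "not NH3-rich"
print(f"GR={gr:.2f}, DSN={dsn:.2f}, adjGR={adj_gr:.2f} ({regime})")
```

With sulfate fully neutralized (DSN = 2), GR and adjGR coincide, which is the case in this hypothetical example.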
The spatial distribution patterns of adjGR are plotted in Figure 11. During the analyzed period, high NO3− concentrations, generally greater than 1.5 μg/m3 (yellow in Figure 4, or Figures S1 and S2), corresponded well to NH3-rich conditions. Therefore, under NH3-rich conditions, the consumption of NH3 was larger in M03 and M27 than in M15, and this led to higher NO3− concentrations. Because of this effective consumption of NH3, the value of adjGR was lower in M03 and M27 than in M15; however, the NH3-rich conditions indicated by the adjGR threshold of unity were similar in M03, M27, and M15. The analysis of the differences in emissions revealed that the differences in NO3− concentration were caused by the NH3 emission settings rather than the NOx emission settings.

Chemical Mechanism
The chemical mechanisms were configured differently among the CMAQ models submitted to the first-phase of J-STREAM. Models M08, M09, M12, and M13 were appropriate for comparison because the model performance difference in O3 concentration caused by the chemical mechanism had previously been analyzed for these four models [9]. Therefore, the model performance difference caused by the chemical mechanism was examined in M08, M09, M12, and M13. M13 was used as the reference model in this case because the SAPRC99 mechanism used in M13 was the oldest, so the differences could be interpreted as the progress made by updating the mechanisms. Because of restrictions on the selection of the aerosol module, M13 (SAPRC99) was configured with AERO5, whereas M08 (SAPRC07), M09 (CB05), and M12 (RACM2) were configured with AERO6 (Table 1). The differences are shown in Figure 12 over all domains. Over d01, SAPRC07 (M08) showed a higher NO3− concentration compared with SAPRC99 (M13) over southern China, but it showed a lower concentration over India. The comparison between CB05 (M09) and SAPRC99 (M13) showed complex differences, with a higher NO3− concentration in CB05 over southern China and a lower concentration over northern China and India. RACM2 (M12) mostly gave higher NO3− concentrations than SAPRC99 (M13). The higher concentration in RACM2 was consistent with the results over Europe and the U.S. [51,52]. The differences in the precursors HNO3 and NH3 were analyzed and are presented in Figure S9 in the supplementary material. The dependencies on HNO3 and NH3 were complex; the higher concentrations in M08 and M09 were related to NH3, whereas those in M12 depended on HNO3. Over d02, the difference over China affected western Japan and could have been related to the NO3− concentrations over d03 of the Kansai region.
The difference caused by chemical mechanisms over western Japan was up to +1.0 μg/m3, and this impact extended over a broad area of western Japan. Over d03, the difference in chemical mechanisms was mostly positive compared with SAPRC99, and the differences between SAPRC07 and SAPRC99 (M08 and M13), CB05 and SAPRC99 (M09 and M13), and RACM2 and SAPRC99 (M12 and M13) were all up to +0.8 μg/m3. The spatially averaged differences were +0.23, +0.22, and +0.26 μg/m3 for SAPRC07, CB05, and RACM2, respectively, compared with SAPRC99. From this analysis, we concluded that the difference due to the chemical mechanisms could have an effect, especially over western Japan and the Kansai region, and this difference may have been mostly related to the long-range transport from outside of Japan. However, the difference related to the long-range transport was not found over d04 of the Kanto region. As seen in d02, the sequential difference from the Asian continent did not reach the Kanto region. The behavior of the difference in NO3− concentration over the Kanto region was different, especially for CB05, which showed a negative difference over northern Tokyo. The difference between SAPRC07 and SAPRC99 (M08 and M13) ranged from −0.2 to +0.3 μg/m3, the difference between CB05 and SAPRC99 (M09 and M13) ranged from −0.3 to +0.4 μg/m3, and the difference between RACM2 and SAPRC99 (M12 and M13) was up to +0.3 μg/m3. The spatially averaged differences were +0.09, +0.08, and +0.12 μg/m3 for SAPRC07, CB05, and RACM2, respectively, compared with SAPRC99. Over the Kanto region, the difference due to the chemical mechanism was limited, and its effects were smaller than those over the Kansai region.
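The range and spatially averaged difference quoted for each model pair can be computed from two gridded concentration fields. The sketch below is illustrative only: the helper `difference_summary` and the small fields are hypothetical, and the unweighted mean assumes equal-area grid cells, which may not match the averaging used in the study.

```python
def difference_summary(model, reference):
    """Return (min, max, mean) of model - reference over all grid cells.

    Both inputs are 2-D lists (rows x columns) on the same grid.
    The mean is unweighted, i.e., it assumes equal-area cells.
    """
    diffs = [
        m - r
        for row_m, row_r in zip(model, reference)
        for m, r in zip(row_m, row_r)
    ]
    return min(diffs), max(diffs), sum(diffs) / len(diffs)


# Hypothetical winter-mean NO3- fields (ug/m3) on a 2 x 3 grid
test_model = [[1.5, 2.0, 0.75], [1.0, 2.5, 1.25]]
ref_model = [[1.0, 2.0, 1.0], [0.5, 2.0, 1.0]]

dmin, dmax, dmean = difference_summary(test_model, ref_model)
print(f"range: {dmin:+.2f} to {dmax:+.2f} ug/m3, mean: {dmean:+.2f} ug/m3")
```

Reporting the range together with the mean, as the text does, distinguishes a small but widespread offset from a localized large difference.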

KZMIN
In CMAQ models, the minimum eddy diffusivity was set as 1.0 m2 s−1; however, when the user used the KZMIN option, the minimum eddy diffusivity was internally calculated below an assigned altitude (500 m) as:

$$K_{z,\min} = K_z(\mathrm{lower}) + \left[K_z(\mathrm{upper}) - K_z(\mathrm{lower})\right] \times \frac{\mathrm{PURB}}{100}$$

where Kz(lower) is 0.01 m2 s−1, Kz(upper) is 1.0 m2 s−1, and PURB is the percentage of the urban fraction derived from the land use data. Therefore, when the KZMIN option was used, the minimum eddy diffusivity for grids categorized as urban areas could be lower than the prescribed value of 1.0 m2 s−1, and the minimum eddy diffusivity was reduced to 0.01 m2 s−1 at the grids for nonurban areas [40]. Before version 5.0 (version 4.7.1 was used in this study), the values used for Kz(lower) and Kz(upper) were different and were 0.5 and 2.0 m2 s−1, respectively. In the model settings submitted to J-STREAM, it was not possible to directly compare the KZMIN option, but M06 and M23 could be used to investigate the difference caused by the KZMIN option despite the differences in the model version. M26 and M28 could also be used; however, because of the changes in the Kz(lower) and Kz(upper) values before and after version 5.0, M26 and M28 were not compared. Before M06 and M23 were compared, the difference between M05 and M22 was checked to investigate the difference caused by updating the model version (see also Figure 6). The difference between M05 and M22 showed a slightly higher NO3− concentration for M05, especially over Nagoya and northern Tokyo (Figure 13). The difference ranged from −0.1 to +0.4 μg/m3 over d03 and from −0.1 to +0.4 μg/m3 over d04. The spatially averaged difference was +0.04 μg/m3 over d03 and +0.08 μg/m3 over d04, indicating that the difference due to the model version was small. In the comparison of M06 and M23, the updated photolysis rate in the model version could be neglected because both models used a look-up table to calculate the photolysis rate.
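The dependence of the minimum eddy diffusivity on the urban fraction can be sketched as follows. This is a minimal illustration that assumes a linear interpolation between the lower and upper values according to PURB, consistent with the description above; the function name `kz_min` and its argument names are hypothetical and do not correspond to CMAQ source code identifiers.

```python
def kz_min(purb_percent, kz_lower=0.01, kz_upper=1.0):
    """Minimum eddy diffusivity (m2/s) for a grid cell below the assigned altitude.

    purb_percent: urban fraction of the grid cell in percent (0-100).
    kz_lower, kz_upper: limits for fully nonurban and fully urban cells.
    Assumes linear interpolation in the urban fraction.
    """
    if not 0.0 <= purb_percent <= 100.0:
        raise ValueError("purb_percent must be between 0 and 100")
    return kz_lower + (kz_upper - kz_lower) * 0.01 * purb_percent


# Values stated in the text for version 5.0 and later
for purb in (0.0, 50.0, 100.0):
    print(f"PURB={purb:5.1f}%  Kz,min={kz_min(purb):.3f} m2/s")

# Values stated in the text for versions before 5.0
old_nonurban = kz_min(0.0, kz_lower=0.5, kz_upper=2.0)
old_urban = kz_min(100.0, kz_lower=0.5, kz_upper=2.0)
```

Under this assumption, a fully nonurban cell receives the lower limit and a fully urban cell the upper limit, so the wider limits used before version 5.0 shift the whole range upward, which is why model pairs that straddle version 5.0 are not directly comparable.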
The PURB mapping used in this study and the difference between M06 and M23 are shown in Figure 14. Higher PURB values were seen over Osaka and Tokyo. Despite the model version difference, this result suggested that the KZMIN option could cause a lower NO3− concentration over urban areas and a higher NO3− concentration over nonurban areas. However, the difference ranged from −0.3 to +0.4 μg/m3 over d03 and from −0.2 to +0.2 μg/m3 over d04, and the spatially averaged differences were below +0.05 μg/m3 over both domains. The effect of the treatment of KZMIN has been reported for nighttime O3 formation [53]. For further study of the effects of KZMIN, daily averaged values are insufficient, and observations with high temporal resolution are required [54]. This analysis could be included in the second-phase of J-STREAM using hourly observations of inorganic aerosols [55]. Moreover, as the land use data in the first-phase of J-STREAM were based on the United States Geological Survey (USGS) dataset, urban areas were underrepresented compared with the observations. Therefore, the second-phase of J-STREAM has established a new land use dataset, and the meteorological field, biogenic emissions, and concentrations of O3 and SOA in this revision have been verified [7].

HONO and Photolysis
The CMAQ models contain the HONO option. In general, HONO is produced in the atmosphere by direct emission, gas-phase oxidation, heterogeneous reactions, and surface photolysis reactions. The HONO option in CMAQ models controls whether the heterogeneous reaction is considered [35]. Among the several heterogeneous reactions involving HONO, the following reaction is the most important:

2NO2 + H2O → HONO + HNO3
This heterogeneous reaction could occur on aerosol and ground surfaces and is implemented in CMAQ models as a first-order reaction in NO2 [35]. This production of HNO3 will lead to the formation of NO3− via the gas-phase reaction with NH3. The HONO reaction is also related to photolysis; however, there have been updates to the inline calculation of photolysis along with model updates (see Section 4.2.2). Therefore, distinguishing the effects of each change is difficult.

The differences in NO3− concentration due to the model external and internal settings are shown in Figure 15. The model performance differences in NO3− concentration due to the model external settings (opaque colors in Figure 15) were greater than those due to the model internal settings (transparent colors in Figure 15 with gray texts). The details of the model external settings were discussed in Section 4.1. In the results for the model internal settings, SAPRC99 tended to simulate lower NO3− concentrations than other chemical mechanisms (see Section 4.2.1). In addition, the domain-averaged value showed a higher NO3− concentration in M06, in which KZMIN was used, than in M23 (see Section 4.2.2). Here, we focus on HONO and the photolysis options. Switching on the HONO option improved model performance in terms of the HONO concentration. The models M04, M06, M23, M24, M26, M27, and M28, in which the HONO option was switched off, simulated a HONO concentration five times lower than the other models, which included the HONO heterogeneous reaction. The spatial distributions of the models are shown in Figures S10 and S11 in the supplementary material over d03 and d04, respectively. Models in which the HONO option was switched off simulated HONO concentrations of 0.01-0.02 ppbv, whereas models including the HONO option simulated concentrations of up to 0.5 ppbv over d03 and d04.
The observed HONO concentration over Nara prefecture, which is close to Osaka, was 0.58-1.26 ppbv during the 1994-1995 winter season [56]. In addition, the HONO concentration observed at Tsukuba, which is a rural area in the Kanto region, was 0.86 ± 0.14 ppbv during the 2004-2005 winter season [57]. Based on this qualitative comparison with the HONO observation, the simulated HONO concentrations without the heterogeneous reaction were an order of magnitude too low, and we concluded that the HONO option should be switched on in the simulation settings over Japan.
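To illustrate why switching on the heterogeneous reaction raises both HONO and HNO3, the first-order conversion of NO2 can be integrated in isolation. This is a simplified sketch under stated assumptions, not the CMAQ implementation: the rate constant and initial NO2 mixing ratio are hypothetical values chosen for illustration, and all other sources and sinks (including HONO photolysis and deposition) are ignored.

```python
import math


def heterogeneous_no2_conversion(no2_0_ppbv, k_per_s, seconds):
    """First-order loss of NO2 via 2NO2 + H2O -> HONO + HNO3.

    Returns (remaining NO2, HONO formed, HNO3 formed) in ppbv.
    k_per_s is an assumed first-order rate constant (1/s); in a real model
    it would depend on the available surface area.
    """
    no2_t = no2_0_ppbv * math.exp(-k_per_s * seconds)
    consumed = no2_0_ppbv - no2_t
    # Stoichiometry: two NO2 molecules yield one HONO and one HNO3
    hono = 0.5 * consumed
    hno3 = 0.5 * consumed
    return no2_t, hono, hno3


# Hypothetical case: 20 ppbv NO2, k = 1e-5 1/s, one hour of reaction
no2_t, hono, hno3 = heterogeneous_no2_conversion(20.0, 1.0e-5, 3600.0)
print(f"NO2 left: {no2_t:.2f} ppbv, HONO: {hono:.2f} ppbv, HNO3: {hno3:.2f} ppbv")
```

The HNO3 formed this way is the precursor that can partition to NO3− in the presence of NH3, which is the link between the HONO option and the simulated NO3− concentration discussed in this section.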
The photolysis calculation is involved in reactions of reactive nitrogen species (e.g., NO2, NO3, HONO, and HNO3) and other important species such as O3. In the comparison of the photolysis methods, the inline calculation (circles in Figure 15) was generally related to higher NO3− concentrations, except for M13, which was configured with the SAPRC99 mechanism. The models M06, M23, M24, M26, M27, and M28 used a lookup table (triangles in Figure 15), and these models also switched off the HONO option. The model performances of M01, M04, M06, M22, M23, M24, and M26 using the common dataset were worse over d03 compared with the M14 reference model. In principle, the table option for photolysis calculates the daily clear-sky photolysis rates from lookup tables of molecular absorption cross-sections and quantum yield data, as well as climatologically derived O3 column and optical depth data. In contrast, the inline option for photolysis accounts for the O3 and ambient aerosol predicted in the CMAQ model and uses these estimates to adjust the actinic flux, rather than relying on a lookup table. This could be a more realistic calculation, but it also includes errors associated with the simulation results. In addition, the cloud distribution and optical properties are described more consistently with the meteorological field following the updates after version 5.1 [44]. Therefore, inline photolysis calculations are theoretically reasonable. The analysis in Figure 15 shows that the HONO option and inline photolysis calculations should be used to simulate higher NO3− concentrations over urban areas in Japan.

In Figure 15, for the external settings, models that used their own data are shown by opaque shading, and models that used the common data are shown by transparent shading with gray texts. The chemical mechanisms are indicated by color: CB05 is orange, RACM2 is green, SAPRC07 is blue, and SAPRC99 is sky blue. Large symbols indicate that KZMIN was used, and small symbols indicate that KZMIN was not used. Inline and table photolysis calculations are indicated by circles and triangles, respectively. The internal settings for HONO are shown by the HONO concentration on the x-axis.

Conclusions
The first-phase of the Japanese model inter-comparison study, J-STREAM, found that NO3− was underestimated during winter. In the present study, the reason for this model underestimation was analyzed by focusing on CMAQ, and the external and internal model settings were investigated over urban areas in Japan, namely, the Kansai (e.g., Osaka and Nagoya) and Kanto (e.g., Tokyo) regions.
• The selection of the chemical mechanism could increase NO3− concentration over western Japan via long-range transport, and the difference over the Kanto region was smaller.
• KZMIN internal setting: The use of the KZMIN option, which calculates lower minimum vertical diffusion coefficients compared with the prescribed value, led to lower concentrations over the grids with a land use category of urban, as well as to higher concentrations over other grids. Though there was a clear relation between the difference and the fraction of urban area, these differences over domains 3 and 4 (+0.05 and +0.04 μg/m3, respectively) were small.
• HONO and photolysis internal settings: The models in which the HONO option, which includes the heterogeneous reaction of HONO, was switched off showed a HONO concentration five times lower than that in models with the option switched on. Based on the comparison with HONO observations in Japan, the HONO concentration simulated by models without the heterogeneous reaction was an order of magnitude too low. Some models also used a lookup table to calculate photolysis. The difference in model performance between these models and the M15 reference model suggested that the HONO option should be switched on and that inline photolysis calculation is required for simulating air quality over urban areas in Japan.
In further studies, the external settings, which are potentially larger sources of differences in model performance than the internal settings, should be investigated. In this study, the observations were based on daily averaged values; however, these are inadequate for understanding the diurnal variation closely related to meteorology, emissions, and other factors. Comparisons with high-temporal-resolution measurements are being examined in the second-phase of J-STREAM [54]. As the KZMIN option showed different behavior on urban and nonurban grids, simultaneous measurements at different locations in the Kanto region could improve our understanding of the behavior of air pollutants over urban and nonurban areas [58].
Supplementary Materials: The following supplementary materials are available online at www.mdpi.com/2073-4433/11/5/511/s1. Figure S1: Spatial distributions of NO3− concentration averaged during winter for each model over domain 3. Figure S2: Spatial distributions of NO3− concentration averaged during winter for each model over domain 4. Figure S3: Mapping of NH3 emissions (common dataset) over domains 3 and 4. Figure S4: (Left) Spatial distributions of the difference caused by emissions between M03 and M02 over domain 3. (Right) Temporal variation of models M03 and M02 at specific sites in Osaka and Nagoya (see Figures 2 and 5). The analysis targets are (a) NO3− concentration, (b) NOx concentration, (c) HNO3 concentration, and (d) NH3 concentration. The value shown on the lower-right side in (a) indicates the spatially averaged difference. See also Figure 9. Figure where N is the total number of paired observations (O) and models (M).