Evaluation of Regional Atmospheric Models for Air Quality Simulations in the Winter Season in China

Meng, Fan; Du, Xiaohui; Tang, Wei; He, Jing; Li, Yang; Wang, Xuesong; Yu, Shaocai; Tang, Xiao; Xing, Jia; Xie, Min; Zeng, Limin; Dong, Huabin

doi:10.3390/atmos17010001

by Fan Meng ¹,², Xiaohui Du ¹, Wei Tang ¹,*, Jing He ³, Yang Li ¹, Xuesong Wang ⁴, Shaocai Yu ⁵,⁶,*, Xiao Tang ⁷, Jia Xing ⁸, Min Xie ⁹, Limin Zeng ⁴ and Huabin Dong ⁴

¹ Chinese Research Academy of Environmental Sciences, Beijing 100012, China

² Asia Center for Air Pollution Research, Nishi-ku, Niigata-shi 950-2144, Japan

³ Xinjiang Institute of Environmental Protection Science, Urumqi 830011, China

⁴ College of Environmental Sciences and Engineering, Peking University, Beijing 100871, China

⁵ Zhejiang Key Laboratory of Solid Waste Pollution Control and Resource Utilization, School of Environmental Sciences and Engineering, Zhejiang Gongshang University, Hangzhou 310018, China

⁶ Zhejiang Key Laboratory of Ecological Environmental Damage Control and Value Transformation, Eco-Environment Science Research & Design Institute of Zhejiang Province, Hangzhou 310007, China

⁷ Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China

⁸ School of Environment, Tsinghua University, Beijing 100084, China

⁹ School of Atmospheric Sciences, Nanjing University, Nanjing 210023, China

* Authors to whom correspondence should be addressed.

Atmosphere 2026, 17(1), 1; https://doi.org/10.3390/atmos17010001

Submission received: 14 October 2025 / Revised: 1 December 2025 / Accepted: 2 December 2025 / Published: 19 December 2025

(This article belongs to the Section Air Quality)


Abstract

This study conducted an intensive air quality model evaluation in response to the urgent need to understand the reliability, consistency, and uncertainty of the air quality models that support the implementation of the PM_2.5 Air Pollution Control Action Plan in China. Five regional air quality models and versions (CMAQ version 5.02, CMAQ version 5.3.2, CAMx version 6.2, CAMx version 7.1, and NAQPMS) were evaluated for CO, SO₂, NO₂, O₃, PM₁₀, and PM_2.5 concentrations and for PM_2.5 components. The evaluation used a unified statistical method and the same 2017 observational data set, comprising 17 air pollution episodes collected from four super monitoring stations in the Beijing–Tianjin–Hebei, Yangtze River Delta, Pearl River Delta, and Chengdu–Chongqing regions of China. All the participating models performed well in simulating the mean PM_2.5 concentrations, with NMBs ranging from −0.29 to −0.04, indicating that they are basically suitable as simulation and evaluation tools for PM_2.5 in regulatory applications. However, the models showed great variability for PM_2.5 components, with NMEs ranging from 0.48 to 0.53. They performed reasonably well in simulating the mean sulfate, nitrate, BC, and NH₄⁺ concentrations in PM_2.5 but diverged in simulating the mean OC concentrations. The participating models also consistently performed well in simulating NO₂, CO, and O₃ concentrations. However, they generally overestimated SO₂ concentrations and, to some extent, underestimated PM₁₀ concentrations, which is likely attributable to uncertainties in emission sources and the rapid implementation of strict SO₂ control policies. The evaluation shows that there remains significant potential for further improvement. Updating and improving the emission inventory should be prioritized to achieve better results, and the uncertainties associated with the meteorological simulations, chemical mechanisms, and physical parameterization options of air quality models should be investigated further in future work.

Keywords:

regional atmospheric models; model evaluation; air quality; PM_2.5; O₃; winter season

1. Introduction

In the past three decades, many air quality models have been developed as tools for air quality simulation and prediction in air quality planning, management, and assessment. Besides the Gaussian models, which mainly focus on transport and turbulence diffusion processes at the local scale, numerical chemistry and transport models (CTMs) have been developed. CTMs can simulate the atmospheric physical and chemical processes of air pollutants from emission to removal in the atmosphere, including advection transport, turbulent diffusion and convection, gas-phase and liquid-phase chemical reactions, aerosol dynamics and heterogeneous chemistry, dry deposition and surface processes, and wet deposition by clouds and precipitation back to the Earth’s surface. They are suitable for the simulation and prediction of both primary air pollution problems and secondary and regional air pollution problems, such as particulate matter, photochemical oxidation, and acid deposition, and can be used with different spatial scales (city, regional, or global scale), different resolutions or grid sizes, and different temporal resolutions or time steps. Based on the development level and the degree of complexity, different models and versions vary in physical and chemical algorithms and options, the demand for emission and meteorological input data, and the computer resource requirements.

Due to the complexity of atmospheric processes and the limitations of scientific knowledge and computing technology, air quality models are only approximations of real physical and chemical processes and mechanisms. Therefore, they may not simulate all the physical and chemical processes well for all species and application scenarios. In addition, there are always uncertainties in the emission inventories and meteorological input data, so CTMs are not always accurate enough and carry various uncertainties. Besides accuracy, consistency is another important requirement when a model is used as a prediction and assessment tool for air quality management or regulatory purposes such as air quality planning and environmental impact assessment. Input data and parameter availability and ease of use are also important issues for air quality modeling in regulatory or management applications. Therefore, the evaluation and validation of air quality models are essential for model development and are a necessary precondition for regulatory applications.

During its long period of development and application, every version of the CMAQ model has been systematically evaluated [1,2,3], especially through the annual CMAS conference [4]. For example, CMAQ v5.3 and v5.3.1 were recently evaluated comprehensively [3,4] using monitoring data from the 1304 O₃ monitors and 2010 PM_2.5 monitors of the Air Quality System (AQS) maintained by the U.S. Environmental Protection Agency (EPA), 190 O₃ monitors and 196 PM_2.5 monitors of the National Air Pollution Surveillance Program (NAPS) in Canada, PM_2.5 components from 242 CSN sites, 149 sites of the Interagency Monitoring of Protected Visual Environments (IMPROVE), and 94 Clean Air Status and Trends Network (CASTNet) sites. CAMx7.1 was also evaluated for daily PM_2.5 concentrations and component species against observations from 2016, including 107 CSN sites and 150 IMPROVE monitoring sites. Many other evaluations have been conducted for specific model developments and specific applications [5,6,7,8,9,10].

For the regulatory application of air quality models, it is crucial to evaluate their overall performance for the main pollutants at the averaging times required by air quality standards. Although many model evaluations for specific air pollution cases have been conducted [11,12,13,14], there have been only a few systematic evaluation and verification studies for regional air quality models at a national level in China. In the project on Long-range Trans-boundary Air Pollutants in Northeast Asia (LTP), the episode and long-term sulfur concentrations simulated by CMAQ (China), RAQM (Japan), and CADM (Korea) were compared [15]. In MICS-Asia phase III, the performances of 14 independent modeling groups for O₃, NO_x, and PM were compared [16,17]. In response to the problems in model applications and to meet the urgent need for regulatory applications of air quality models for the Air Pollution Control Action Plan (2014–2017), which targeted PM_2.5 in China, the Ministry of Ecology and Environment and the Ministry of Science and Technology of China jointly launched the project “Research on the Technical System of Regulatory Air Quality Modeling” in 2017 (hereinafter referred to as “the Project”) [18].

This study was the model evaluation part of the Project. Five regional CTM air quality models and versions (hereafter AQMs) were evaluated for the regional PM_2.5 and O₃ pollution problem using the same set of monitoring data and unified statistical methods: CMAQ version 5.02 (CMAQv5.02 hereafter), CMAQ version 5.3.2 (CMAQv5.3.2 hereafter) [19], CAMx version 6.2 (CAMx6.2 hereafter), CAMx version 7.1 (CAMx7.1 hereafter) [20,21], and the NAQPMS model.

2. Model Description and Specifications

2.1. Model Description

2.1.1. CMAQ Model

The Community Multiscale Air Quality (CMAQ) model is a sophisticated numerical air quality model developed by the U.S. EPA [22]. The CMAQ model has detailed chemical mechanisms which can provide information about the chemical composition of a mixture of pollutants, such as PM and VOCs. The CMAQ model also has a number of analytical tools for the contributions of chemical and physical processes, as well as specific chemical reactions, sensitivity analysis, and source apportionment [23]. The CMAQ model uses the output of mesoscale meteorological models such as the Weather Research and Forecasting model (WRF) [24] and output from its SMOKE emission pre-processor model [25].

The CMAQ model continues to improve its scientific algorithms, fix bugs, and improve the functionality and efficiency of its code as the versions evolve. CMAQv5.0 updated the turbulent mixing under stable conditions, the vertical advection schemes, and the gas-phase chemical mechanisms to SAPRC07 [26,27] and CB05 [28], which updated the representation of organic and inorganic species and reactions; it also updated the online photolysis rates module, which calculates photolysis based on real-time meteorology and aerosol concentrations. For aqueous and heterogeneous chemistry, the rate constants for S(IV) to S(VI) were updated, and the model simulated Fe and Mn for catalysis reactions. The AERO6 aerosol module of CMAQv5.0 improved aerosol chemistry and also improved speciation for the primary PM_2.5 components of NH₄⁺, Na⁺, Cl⁻, and trace elements [29]. The oxidation of primary organic aerosols was included to improve the OM/OC ratio. The incorporation of ISORROPIA 2.0 [30] improved the gas/particle partitioning of inorganic species, including the treatment of Ca²⁺, K⁺, and Mg²⁺ in sea salt and soil dust. CMAQv5.0 also included lightning NO_x emissions and NH₃ bi-directional exchange at surfaces instead of traditional dry deposition [31]. Versions CMAQv5.01 and CMAQv5.02 are incremental updates of version 5.0 that include several changes to the science algorithms in the base model and new diagnostic/scientific modules.

CMAQv5.3 contains numerous science updates. CMAQv5.3 includes a new aerosol module (AERO7), which includes significant updates to the secondary organic aerosol (SOA) chemistry. New options for simulating the exchange of pollutants between the land and the atmosphere were added, including the new Surface Tiled Aerosol and Gaseous Exchange (STAGE) bi-directional deposition model [32]. The Hybrid Vertical Coordinate (HVC) of WRF [24] was added to CMAQv5.3, along with the meteorology pre-processor (MCIP). CMAQv5.3.1 and CMAQv5.3.2 are minor updates to CMAQv5.3, including a substantial update to the gas-phase chemistry apportionment algorithms.

2.1.2. CAMx Model

Very similar to the CMAQ model, the Comprehensive Air Quality Model with extensions (CAMx) is a Eulerian photochemical transport model developed by the ENVIRON company in the late 1990s for tropospheric air pollution (O₃, particulates, and air toxics) over spatial scales ranging from neighborhoods to continents. CAMx has multiple gas-phase chemistry mechanisms, including CB05 and CB6 [28], SAPRC99, and SAPRC07 [26,27]. The particulate chemistry of CAMx includes inorganic gas–aerosol partitioning (ISORROPIA), inorganic aqueous chemistry (RADM-AQ), and organic gas–aerosol partitioning and oxidation (VBS or SOAP), which use the products of gas-phase photochemistry for the production of sulfate, nitrate, and condensable organic gases. CAMx offers high-order, low-numerical-diffusion schemes for horizontal advection [33,34] and a hybrid approach for vertical diffusion and convective transport, using standard “K-theory” and Version 2 of the Asymmetric Convective Model (ACM2) [35]. CAMx offers two dry deposition options: an older approach based on the models of Wesely [36] and Slinn and Slinn [37], using 11 land-use categories, and an updated approach based on the algorithms of Zhang [38], using 26 land-use categories and a simple surface sub-model for exchanges between air and soils and vegetation. The NCAR TUV radiative transfer and photolysis model is used to provide a multi-dimensional lookup table of clear-sky photolysis rates.

CAMx has many extensions and probing tools for tracking emission contributions to predicted O₃ and/or PM species concentrations by source region and/or category, conducting first-order and second-order concentration sensitivities for the emissions, initial conditions and boundary conditions, and process analysis extensions for in-depth analyses of physical and chemical processes.

The main update of Version 6.2 of CAMx (CAMx6.2) includes a Volatility Basis Set (VBS) for organic gases and aerosols, a more efficient hybrid 1.5-D scheme that describes the evolution of OA in the 2-D space of the oxidation state and volatility. This version of CAMx also includes the CB6 Mechanism with halogens and the treatment of top boundary conditions from the global model. CAMx7.00 includes Dimethyl sulfide (DMS) chemistry, and eight new primary PM_2.5 species (Fe, Mg, Mn, Ca, K, Al, Si, Ti) have been added. SOAP treatments were also updated, including updated terpene SOA yields. The new features and major updates in CAMx7.1 include the Carbon Bond Version 6 Update (CB6 revision 5 or CB6r5) for incorporating recent updates to chemical reactions for inorganic and simple organic species that play important roles in O₃ formation.

2.1.3. NAQPMS

The NAQPMS model was developed in response to China’s complex regional air pollution problem and emission sources [39]. It is a multiscale three-dimensional Eulerian numerical air quality model, which includes processes such as advection, diffusion, gas-phase chemistry, aerosol chemistry, and dry and wet deposition. It can simulate the distribution, transport, and deposition characteristics of atmospheric trace gases such as O₃, NO_x, SO₂, and CO and atmospheric aerosol components such as PM_2.5, dust, and carbonaceous aerosol. It also incorporates an atmospheric chemical data assimilation module and a source apportionment and tracking module. NAQPMS can use the output of mesoscale meteorological models such as WRF and output from the SMOKE emission pre-processor model [25].

The NAQPMS model uses a conservative flux advection transport algorithm. The eddy turbulence diffusion module introduces a diffusion operator that reflects the structure of the boundary layer and the characteristics of the surface. There are two gas-phase chemical mechanisms, CBM-Z and CBM-IV, in NAQPMS. The dry deposition process uses a deposition velocity algorithm based on molecular diffusion, turbulent mixing, and gravitational sedimentation processes. The wet deposition process calculates the gravitational drag effect caused by particle hygroscopic growth in addition to the traditional precipitation removal process.

2.2. Model Specifications

To improve model performance, this project also conducted comparisons and evaluations and provided recommendations for the chemical mechanism and physical algorithm options. Based on the comparison and evaluation work of this project (e.g., Figures S6 and S7) and the situations of input data, the individual participating models determined the physical process algorithms and chemical mechanisms of the models (Table 1).

For the gas-phase chemistry, CAMx6.2 and CAMx7.1 used the SAPRC99 and SAPRC07 gas-phase chemical mechanisms, CMAQv5.02 and CMAQv5.3.2 used CB06, and the NAQPMS model used the CBM-Z mechanism. All the participating AQMs were coupled with aerosol chemistry, including inorganic aerosols and secondary organic aerosols. The AERO6 and AERO7 aerosol modules were used for chemical reactions, and different versions of ISORROPIA were used for the aerosol thermodynamic equilibrium. All the models also included secondary organic aerosol modules such as SOAP. The participating models used RADM-AQ or similar aqueous chemistry and wet deposition models.

For horizontal advection, the CAMx and CMAQ models applied the Piecewise Parabolic Method (PPM) [34], while NAQPMS used the scheme developed by Walcek and Aleksic [40]. Most models used a non-local closure PBL scheme, the Asymmetric Convective Model, version 2 (ACM2), for vertical diffusion, except NAQPMS, which used an explicit Crank–Nicolson scheme. For dry deposition, the CMAQ, CAMx, and NAQPMS models all used the WESELY89 dry deposition method [36]. CMAQ and NAQPMS used ACM cloud modules for wet scavenging, while the CAMx models used the SMGA and NEWBWSM modules. All AQMs used WRF as the meteorological driver. The detailed model settings are given in Table 1.

Each model used similar Lambert map projections and similar domains and grids (outer domains) with different parameters (Figure S1). All five models/versions used a similar domain covering the whole or most of China, including the Beijing–Tianjin–Hebei (BTH) region, Yangtze River Delta (YRD) region, Pearl River Delta (PRD) region, and Chengdu in the Chengdu–Chongqing (CC) region, which lie in the northern, central, southern, and southwestern parts of China, respectively.

The models simulated 17 severe PM_2.5 and O₃ pollution cases (Table 2). Each model used a spin-up run of a few days or a continuous simulation of the previous days to provide initial conditions. The participating models used fixed boundary conditions and a large domain covering East Asia or the whole of China to reduce the uncertainties arising from boundary conditions.

2.3. Emission and Meteorology Data

2.3.1. The Meteorology Input Data for CTMs

All participating AQMs used the Weather Research and Forecasting (WRF) model as the meteorological driver, which provides gridded meteorological data for the AQMs. To support the AQMs in this study, four sets of gridded meteorological input data were tested and compared: the FNL Operational Model Global Tropospheric Analyses ds083.2 and ds083.3 from the U.S. National Centers for Environmental Prediction (NCEP), and ERA-Interim and the more recent Reanalysis v5 (ERA5) of the European Centre for Medium-Range Weather Forecasts (ECMWF). The ds083.2 and ds083.3 data have the same 6 h interval and 32 pressure layers, with horizontal resolutions of 1° and 0.25°, respectively. ERA-Interim has a 1 h time interval, a 1.1251° horizontal resolution, and 38 pressure levels. ERA5 has the same 1 h time interval, a 31 km horizontal resolution, and 38 or 138 pressure levels. The comparison results show that ERA5 reproduces the observed humidity, temperature, and wind speed very well and performs better overall than the other data sets. At the same time, the ERA5 138-model-layer data gave no significant improvement over the 38-model-layer data. It is therefore recommended to use the 38-layer ERA5 data to save computing resources [18].

As a model evaluation project, we evaluated the gridded meteorological input data and suggested WRF specifications and schemes (Figures S3–S5). However, each participating model group conducted its WRF simulation independently because the domains, grids, and CTM specifications differed.

2.3.2. Emission Inventory

There is no official emission inventory for China. This project developed an emission inventory for 2005–2018 [18]. The emission inventory used in the study includes the emissions of all the primary air pollutants and precursors of secondary air pollutants for PM_2.5 and O₃ from all major emission sectors, including power plants, industry, households, mobile sources, and biogenic sources. However, each model group also updated the emission data based on their available sources of activity data and emission factors.

Despite a long period of control efforts, China’s total air pollutant emissions are still at a very high level. For example, in 2015, the emissions of SO₂, NO_x, PM₁₀, PM_2.5, BC, OC, NMVOC, and NH₃ were 17,294, 22,591, 11,859, 8217, 581, 1839, 25,629, and 10,432 kt/y, respectively. At the same time, air pollutant emissions in China changed rapidly, especially after the action plan started in 2014. SO₂ emissions showed a decreasing trend from 2005 to 2009, with a rapid decrease after 2013. Shandong province and the southwestern provinces of Sichuan and Guizhou, where the sulfur content of coal is high, have relatively high SO₂ emissions. NO_x emissions are mainly concentrated in the eastern part of China, where they continued to rise from 2005 to around 2013, followed by a rapid decline. The NO_x emissions from power plants and industrial combustion have decreased significantly. However, the emissions from iron and steel, cement, industrial processes, and road transportation have increased, and the emissions from transport have almost doubled. Particulate matter emissions are mainly from industrial processes and residential combustion. NMVOC emissions have been slowly increasing since 2005, with a slight downward trend in recent years. Ammonia emissions are strongly influenced by livestock numbers; they fluctuated slightly between 2005 and 2017, with little change overall.

Biogenic VOC emissions are as important as anthropogenic VOC emissions. In this work, the MEGANv2.10 model was used, based on the leaf area index, emission factors of different vegetation types, and updated vegetation data from 2013.

The emission inventories need to be pre-processed for use in AQMs. In this study, the inventory basically consists of annual data for administrative areas. All the participating models used the SMOKE pre-processing model [25] or a similar approach developed by this project [18]. First, the emissions of the administrative units in the inventory were allocated to the model grids using population and GDP as weights, while point sources and other sources with spatial information were allocated directly to the grids. Then, the total emissions of each source type and sector over a given period (usually a year) were allocated to the model time steps, usually hourly, according to the temporal profiles of the source sectors.
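The two allocation steps described above can be illustrated with a minimal Python sketch. It is not the SMOKE implementation or the Project's tool: the function names, the equal weighting of population and GDP (`pop_share=0.5`), the cell names, and all numbers are hypothetical and chosen only for illustration.

```python
def split_by_profile(total, profile):
    """Split a total in proportion to a profile (spatial weights or temporal factors)."""
    profile_sum = sum(profile.values())
    if profile_sum <= 0:
        raise ValueError("profile must have a positive sum")
    return {key: total * value / profile_sum for key, value in profile.items()}


def combined_weights(population, gdp, pop_share=0.5):
    """Blend population and GDP shares into one spatial weight per grid cell.

    pop_share is a hypothetical blending factor, not a value from the study.
    """
    pop_total = sum(population.values())
    gdp_total = sum(gdp.values())
    return {
        cell: pop_share * population[cell] / pop_total
        + (1.0 - pop_share) * gdp[cell] / gdp_total
        for cell in population
    }


# Hypothetical administrative unit covering two grid cells.
population = {"cell_a": 300.0, "cell_b": 100.0}
gdp = {"cell_a": 50.0, "cell_b": 50.0}

# Step 1: spatial allocation of an area-source total (arbitrary units).
gridded = split_by_profile(1000.0, combined_weights(population, gdp))

# Step 2: temporal allocation of one cell's total with a toy two-step profile.
hourly = split_by_profile(gridded["cell_a"], {"night": 1.0, "day": 3.0})
```

Because both steps only rescale by normalised weights, the gridded and hourly values always sum back to the original total, which is a useful sanity check when pre-processing an inventory.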

As for chemical speciation, the species groups in the emission inventories, such as VOCs, NO_x, sulfur, smoke, and dust, were converted to the chemical species groups or explicit chemical substances of the chemical mechanism in each AQM. For particulate emissions, not only the components but also the size distributions, for example, PM_2.5 and PM₁₀, were considered. In the Project, we provided suggestions for pre-processing. However, each modeling group did the pre-processing based on their own data situation and model specification.

3. Evaluation Methodology and Observation Data

3.1. Model Evaluation Methodology

Recent versions of the CMAQ and CAMx models, including CMAQv5.02 and CMAQv5.32, CAMx6.2 and CAMx7.1, and NAQPMS in this study, use state-of-the-art science and similar algorithms. However, different models and versions differ to some extent in terms of boundary layer processes and vertical turbulent diffusion algorithms, some parts of gas-phase chemistry and aerosol chemistry, especially involving heterogeneous chemistry, VOCs, and secondary organic aerosols (SOAs) and the dry deposition of gaseous species. Therefore, in addition to the science and algorithm comparison, the evaluation of PM components and the variation or correlation of the simulated concentrations are important.

Accuracy is the most important model evaluation indicator. The evaluation compared the measured concentrations at monitoring sites with the simulated concentrations at the corresponding grid cells using statistical metrics. To quantify the model performance for different air pollutants and the capacity for simulating temporal variation, the simulated daily average concentrations of CO, SO₂, NO₂, PM_2.5, PM₁₀, and PM_2.5 components and the Maximum Daily 8 h Average Ozone (MDA8 O₃) were extracted for the study sites and compared against monitoring data. Various statistical measures were used in this evaluation work, including the Normalized Mean Bias (NMB), Normalized Mean Error (NME), root mean square error (RMSE), and Pearson correlation coefficient (R), to evaluate the bias, error, and correlation between the time series of modeling results (M_i) and observations (O_i):

\mathrm{NMB} = \frac{1}{N} \sum_{i=1}^{N} \frac{M_{i} - O_{i}}{O_{i}}, \qquad (1)

\mathrm{NME} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{M_{i} - O_{i}}{O_{i}} \right|, \qquad (2)

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (M_{i} - O_{i})^{2}}, \qquad (3)

R = \frac{\sum_{i=1}^{N} (M_{i} - \bar{M}) (O_{i} - \bar{O})}{\sqrt{\sum_{i=1}^{N} (M_{i} - \bar{M})^{2}} \sqrt{\sum_{i=1}^{N} (O_{i} - \bar{O})^{2}}}, \qquad (4)

where N is the total number of paired observations, M_i and O_i are the modeled and observed concentrations for data pair i, and \bar{M} and \bar{O} are their respective means.
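As a minimal sketch of how Equations (1)–(4) could be computed for one pollutant at one site, the following Python function takes paired lists of modeled and observed concentrations. The function name and the handling of edge cases (non-positive observations, constant series) are assumptions made for this illustration and are not taken from the Project's evaluation code.

```python
import math


def evaluation_metrics(model, obs):
    """Return NMB, NME, RMSE, and Pearson R for paired values, following Eqs. (1)-(4)."""
    if len(model) != len(obs) or len(obs) < 2:
        raise ValueError("need at least two paired values")
    if any(o <= 0 for o in obs):
        # Assumption: each pair is normalised by its own observation, so O_i must be > 0.
        raise ValueError("observations must be positive")
    n = len(obs)
    pairs = list(zip(model, obs))
    nmb = sum((m - o) / o for m, o in pairs) / n
    nme = sum(abs(m - o) / o for m, o in pairs) / n
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in pairs) / n)
    m_bar = sum(model) / n
    o_bar = sum(obs) / n
    cov = sum((m - m_bar) * (o - o_bar) for m, o in pairs)
    var_m = sum((m - m_bar) ** 2 for m in model)
    var_o = sum((o - o_bar) ** 2 for o in obs)
    # R is undefined when either series is constant; return NaN in that case.
    r = cov / math.sqrt(var_m * var_o) if var_m > 0 and var_o > 0 else float("nan")
    return {"NMB": nmb, "NME": nme, "RMSE": rmse, "R": r}


# Toy example: the model is exactly twice the observation on every day.
result = evaluation_metrics([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

In the toy example every modeled value is double the observation, so NMB and NME both equal 1 and R equals 1, while RMSE depends on the absolute concentrations.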

As this evaluation is intended to provide a systematic overview of the general performance of air quality models during the specified period, it does not extend to detailed case-specific analysis. The statistical measures are the average across different cases (cities and seasons) for a single species. The model with larger errors (ME and NME) shows a more significant variation in performance across different cities and seasons.
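The MDA8 O₃ metric used in this section is the largest of the running 8 h mean concentrations within a day. The Python sketch below shows one way to derive it from 24 hourly values. It assumes that windows do not cross midnight and that a window needs at least six valid hours; both choices, like the function name, are assumptions for illustration rather than the rules applied in this study.

```python
def mda8(hourly, min_hours=6):
    """Maximum daily 8-h average from one day of hourly values (None marks missing data).

    Assumptions: windows stay within the given day, and a window is valid
    only if it has at least min_hours non-missing values.
    """
    best = None
    for start in range(len(hourly) - 7):
        window = [v for v in hourly[start:start + 8] if v is not None]
        if len(window) >= min_hours:
            average = sum(window) / len(window)
            if best is None or average > best:
                best = average
    return best


# Toy day: background of 10 with an afternoon plateau of 100 from hour 10 to hour 17.
day = [10.0] * 24
for hour in range(10, 18):
    day[hour] = 100.0
peak = mda8(day)
```

The function returns None when no window has enough valid hours, so days with large data gaps can be excluded before the statistics are calculated.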

3.2. Model Simulation Cases and Observation Data

Due to complex atmospheric chemical transformations, all air pollutants are interconnected to varying degrees. Some pollutants, such as PM_2.5 and PM₁₀, are mixtures of multiple pollutants. Therefore, model evaluations should address the performance for all pollutants and their compositions simultaneously. This study collected and utilized observation data in the Project [18] from 17 severe air pollution cases: 12 PM_2.5 cases and 5 O₃ cases from five research super-stations located in Xianghe and Beijing within the Beijing–Tianjin–Hebei (BTH) region, Taizhou in the Yangtze River Delta (YRD) region, Shenzhen in the Pearl River Delta (PRD) region, and Chengdu in the Chengdu–Chongqing (CC) region. These sites represent regionally the northern, central, southern, and southwestern parts of China (Table 2 and Figure S1).

It is worth noting that China’s air pollution situation has been changing rapidly in recent years. From 2015 to 2020, China’s air quality improved significantly. The concentrations of NO₂, CO, SO₂, PM₁₀, and PM_2.5 all decreased nationwide, with SO₂ showing the greatest decrease, followed by CO and PM_2.5. Over these five years, the concentration of O₃ was on the rise and reached a high point in 2018. At the same time, the PM components also changed significantly. Pollutant concentrations in big cities such as Beijing, Chengdu, Xi’an, Shanghai, and Guangzhou generally improved faster. The rapid improvement of air quality in China has also posed significant challenges to emission inventories and model accuracy.

4. Evaluation Results of AQMs

Simulations of CO, NO₂, SO₂, O₃, PM_2.5, and PM₁₀, the six air pollutants listed in the ambient air quality standards of China [41] which are essential for the regulatory application of AQMs, were evaluated for model performance. Furthermore, to investigate the model performance for PM, in addition to the PM_2.5 and PM₁₀ concentrations, the PM_2.5 component concentrations, including sulfate, nitrate, EC, OC, and ammonium, were also evaluated. The evaluations considered both average cases and the most severe case to understand model behavior during severe air pollution events.

4.1. Evaluations of Model Performance for CO Concentration

While CAMx6.2 underestimated the CO concentration, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 performed reasonably well in simulating the mean CO concentrations (Figure 1a, Table S1). The simulated mean CO concentrations for CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 were 0.53 mg/m³, 1.50 mg/m³, 1.32 mg/m³, and 0.88 mg/m³, while the mean observed concentrations were 1.25 mg/m³, 1.16 mg/m³, 1.46 mg/m³, and 1.17 mg/m³, respectively, showing that the four models have small biases and errors. The NMBs of the four models were −0.41, 0.28, 0.10, and −0.15, and the NMEs were 0.59, 0.52, 0.57, and 0.35, respectively. The correlation coefficients of the four models were 0.59, 0.46, 0.54, and 0.55 (Figure 1b, Table S1). The NAQPMS model had a similarly good performance, with R ranging from 0.03 to 0.84 (average of 0.44), NMB ranging from −0.49 to 1.12 (average of 0.32), and RMSE ranging from 0.20 to 2.50 (average of 1.35) (Figure 1b, Table S1).

For the highest CO concentration case, CAMx7.1 predicted well, with 2.88 mg/m³ against the observed 2.97 mg/m³. However, CAMx6.2, CMAQv5.0.2, and CMAQv5.3.2 simulated 0.60 mg/m³, 1.63 mg/m³, and 1.50 mg/m³ against observed concentrations of 2.60 mg/m³, 2.50 mg/m³, and 2.49 mg/m³, respectively. For the high CO cases, the correlation coefficients of the four models (CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2) were similar to those for the other cases, at 0.54, 0.55, 0.70, and 0.43, respectively. Their biases and errors were also similar to those in the other cases, with NMBs of −0.77, −0.03, −0.37, and −0.34 and NMEs of 0.77, 0.45, 0.47, and 0.46, respectively.

Generally, except for CAMx6.2, which underestimated CO, the performances of all the participating models were acceptable in simulating mean CO concentrations. The participating models also performed well for the highest CO concentration cases.
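In the CO results above, one model has a simulated mean of 1.32 mg/m³ against an observed mean of 1.46 mg/m³ and yet a positive NMB of 0.10. One possible explanation, consistent with Equation (1), is that each pair is normalised by its own observation before averaging, so overestimates at low observed concentrations carry more weight than underestimates at high ones. The Python sketch below illustrates this effect with purely hypothetical numbers; it does not reproduce the study data, and the averaging of statistics across cases described in Section 3.1 may contribute as well.

```python
def nmb_mean_of_ratios(model, obs):
    """NMB as written in Eq. (1): average of per-pair relative differences."""
    return sum((m - o) / o for m, o in zip(model, obs)) / len(obs)


def nmb_ratio_of_sums(model, obs):
    """Alternative definition for comparison: total bias divided by total observation."""
    return (sum(model) - sum(obs)) / sum(obs)


# Hypothetical pairs: overestimate on a clean day, underestimate on a polluted day.
obs = [0.5, 2.5]
model = [0.9, 1.8]

per_pair = nmb_mean_of_ratios(model, obs)   # positive
aggregate = nmb_ratio_of_sums(model, obs)   # negative
```

Here the modeled mean (1.35) is lower than the observed mean (1.50), so the aggregate form is negative, while the per-pair form of Equation (1) is positive. The sign of NMB therefore need not match the sign of the difference between the mean concentrations.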

4.2. Evaluations of Model Performance for SO₂ Concentration

Generally, CAMx6.2 slightly overestimated and CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 significantly overestimated the SO₂ concentration (Figure 1c, Table S1). The simulated mean SO₂ concentrations for all cases from the CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models were 5.81 µg/m³, 27.28 µg/m³, 38.80 µg/m³, and 26.91 µg/m³, while the mean observed concentrations were 4.13 µg/m³, 10.16 µg/m³, 9.80 µg/m³, and 11.46 µg/m³, respectively. The four models have a large positive bias, with NMBs of 3.37, 5.36, 13.35, and 4.94, respectively, and relatively large errors, with NMEs of 3.64, 5.43, 13.39, and 5.02, respectively (Figure 1d, Table S1). Similarly, the NAQPMS model overestimated SO₂ concentrations with a large positive bias, an average NMB of 3.30, and a large RMSE of 15.60. The average correlation coefficients for the five models, CAMx6.2, CAMx7.1, CMAQv5.0.2, CMAQv5.3.2, and NAQPMS, were 0.46, 0.20, 0.46, 0.41, and 0.46, respectively (Figure 1d, Table S1).

In order to investigate the performance of the models for severe SO₂ pollution situations, the highest SO₂ concentration case was specifically evaluated. CAMx6.2 underestimated the SO₂ concentration, while the other three models still overestimated it. Notably, CMAQv5.0.2 and CMAQv5.3.2 performed better than they did on average over all cases. Specifically, the four models predicted 4.96 µg/m³, 61.58 µg/m³, 44.04 µg/m³, and 48.63 µg/m³, whereas the observation concentrations were 12.83 µg/m³, 33.58 µg/m³, 33.74 µg/m³, and 32.52 µg/m³, respectively. The four models also have better correlation coefficients of 0.66, 0.78, 0.73, and 0.50, respectively, for high SO₂ cases. The bias and errors are still high, but the bias is not always positive. The NMBs and NMEs are also improved for the highest concentration cases, with NMBs of −0.61, 0.83, 0.31, and 0.48 and NMEs of 0.61, 0.85, 0.47, and 0.58, respectively.

The model performance varied significantly among regions, with a more considerable discrepancy in the simulation for southern China. For example, both CMAQv5.0.2 and CMAQv5.3.2 overpredicted SO₂ concentrations with a higher bias and error in Shenzhen and Chengdu, pointing to an outdated emission inventory or to a faster-than-expected reduction in coal combustion-related air pollution. A similarly poor performance for SO₂ and an underestimation of sulfate were found in a previous model comparison study by the LTP project, where three modeling results from China, Korea, and Japan showed a large discrepancy.

Compared with the other modeled species, SO₂ was simulated poorly by all models, probably because the emission inventory changed rapidly under the air pollution action plan during that period, especially through the phasing out of small boilers and the switch from coal to natural gas and electricity for household heating in northern China. The better performance of the models for the high SO₂ cases suggests that there is still potential to improve not only the emission amounts but also the temporal variations in emission inventories.
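The all-case NMBs reported for SO₂ are much larger than the relative difference between the quoted mean concentrations (for CAMx6.2, 5.81 µg/m³ against 4.13 µg/m³). This section does not state how the all-case statistics were aggregated. One possible reading, which is an assumption here and is not confirmed by the text, is that NMB was computed per case and then averaged; in that situation, a single low-concentration case with a large relative bias can dominate the average. The Python sketch below illustrates the arithmetic with hypothetical values that are not taken from Table S1.

```python
import numpy as np

# Hypothetical per-case mean SO2 (µg/m³); illustrative only, not from Table S1.
obs_case_means = np.array([20.0, 10.0, 1.0])
mod_case_means = np.array([22.0, 12.0, 8.0])

# Option A (assumed aggregation): NMB per case, then averaged across cases.
per_case_nmb = (mod_case_means - obs_case_means) / obs_case_means
mean_of_case_nmb = per_case_nmb.mean()

# Option B: a single NMB computed from the overall mean concentrations.
nmb_of_means = (mod_case_means.mean() - obs_case_means.mean()) / obs_case_means.mean()
```

In this example, the third case has a low observed concentration and a per-case NMB of 7.0, so Option A gives a much larger value than Option B even though the absolute overprediction is only a few µg/m³ in every case.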

4.3. Evaluations of Model Performance for NO₂ Concentration

CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 simulated the NO₂ concentrations very well (Figure 1e, Table S1). The NO₂ concentrations simulated by the three models were 47.67 µg/m³, 55.95 µg/m³, and 47.07 µg/m³, while the mean observation concentrations were 48.54 µg/m³, 61.58 µg/m³, and 54.94 µg/m³, respectively. On the other hand, the CAMx6.2 model underestimated NO₂ at 19.65 µg/m³ compared to the observed value of 35.66 µg/m³. On average, the bias and errors of the four models are small; the NMBs are only −0.23, 0.27, −0.02, and 0.07, and the NMEs are 0.56, 0.70, 0.42, and 0.55, respectively (Figure 1f, Table S1). The models also have good correlation coefficients of 0.64, 0.54, 0.52, and 0.53. The NAQPMS model has a similarly good performance for NO₂, with an R of 0.03~0.78, an NMB in the range of −0.19~1.88, and an RMSE in the range of 8.70~56.00.

For the highest NO₂ concentration cases, the performances of the four models are not as good as the average of all cases. The four models underestimated the NO₂, at 24.18 µg/m³, 54.28 µg/m³, 66.63 µg/m³, and 64.73 µg/m³, compared to the observations of 57.54 µg/m³, 108.25 µg/m³, 84.84 µg/m³, and 118.15 µg/m³, respectively. Nevertheless, the bias and errors of the four models are still small, with NMBs of −0.58, −0.50, −0.21, and −0.45 and NMEs, which improved slightly from the average cases, of 0.59, 0.50, 0.23, and 0.46, respectively. The four models also have better correlation coefficients of 0.66, 0.69, 0.41, and 0.70, respectively, for the high NO₂ cases.

Overall, these participating models perform well in simulating the NO₂ concentrations. The good performance for NO₂ is similar to the previous study in MICS-Asia III [17].

4.4. Evaluations of Model Performance for O₃ Concentration

The CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models perform reasonably well in simulating the daily maximum 8 h average O₃ concentrations (MDA8 O₃) (Figure 1g). The four models’ simulated mean MDA8 O₃ concentrations for all cases were 30.18 µg/m³, 115.76 µg/m³, 90.94 µg/m³, and 72.29 µg/m³, while the mean observation concentrations were 29.65 µg/m³, 87.45 µg/m³, 78.65 µg/m³, and 60.78 µg/m³, respectively. The simulated results from the four models have good correlation coefficients of 0.75, 0.53, 0.75, and 0.48, respectively, and have small biases and errors (Figure 1h, Table S1). The NMBs for MDA8 O₃ of the four models are 0.06, 0.40, −0.17, and 0.14, and the NMEs for MDA8 O₃ of the four models are 0.54, 0.48, 0.57, and 0.56, respectively. The NAQPMS model has a similarly good performance for O₃, with an R of 0.37~0.86, an NMB in the range of −0.83~4.83, and an RMSE in the range of 16.1~83.6.

For the highest MDA8 O₃ concentration cases, the simulated MDA8 O₃ from the four models was 48.01 µg/m³, 199.42 µg/m³, 189.04 µg/m³, and 132.41 µg/m³, while the observation concentrations were 62.41 µg/m³, 172.53 µg/m³, 190.89 µg/m³, and 111.1 µg/m³, respectively. The four models have similar biases and errors, with NMBs of −0.65, 0.16, −0.04, and 0.18 and NMEs of 0.37, 0.19, 0.23, and 0.20, respectively. The four models also have better correlation coefficients than for the average cases, at 0.64, 0.82, 0.92, and 0.76, respectively.

Overall, the participating models performed well in simulating the MDA8 O₃ concentrations. The good performance of the participating models is similar to the model evaluation results of CMAQ and CAMx in the U.S. [42,43]. However, it is worth noting that most cases for evaluations in this study were in the fall or winter when O₃ levels are typically low. In the previous study of MICS-Asia III, the model performance for O₃ showed a considerable variability, high uncertainties, and usually overestimation (high NMB of 0.25–1.25 for May–September for the Greater BTH region domain) [16]. Therefore, further research is needed to investigate the performances of the participating models during the O₃ pollution season.
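The MDA8 O₃ metric evaluated above is the daily maximum of 8 h running-mean O₃ concentrations. A minimal Python sketch of the calculation is given below. It assumes complete hourly data and restricts the 8 h windows to a single calendar day; regulatory definitions differ in how they treat windows that cross midnight and days with missing hours, so this is an illustration and not the procedure used in the study. The function name `mda8` is hypothetical.

```python
import numpy as np

def mda8(hourly):
    """Daily maximum 8 h running mean for one day of 24 hourly values.

    Simplification (assumption): windows stay within the same calendar day,
    so only the 17 windows starting at hours 0..16 are considered.
    """
    h = np.asarray(hourly, dtype=float)
    if h.size != 24:
        raise ValueError("expected 24 hourly values")
    # A uniform 8-point kernel turns the convolution into a running mean.
    windows = np.convolve(h, np.ones(8) / 8.0, mode="valid")
    return windows.max()

# Hypothetical day: 40 µg/m³ background with an 8 h afternoon plateau of 120 µg/m³.
example_day = [40.0] * 10 + [120.0] * 8 + [40.0] * 6
example_mda8 = mda8(example_day)
```

Because the metric averages over 8 h, a short spike in hourly O₃ raises MDA8 O₃ far less than it raises the daily 1 h maximum.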

4.5. Evaluations of Model Performance for PM₁₀ Concentration

The models generally underestimated PM₁₀ concentrations. CAMx6.2, CAMx7.1, and CMAQv5.0.2 significantly underestimated PM₁₀ concentrations, and CMAQv5.3.2 underestimated them only slightly (Figure 1i, Table S1). The simulated average concentrations were 66.57 µg/m³, 95.17 µg/m³, 87.43 µg/m³, and 121.54 µg/m³, while the mean observation concentrations were 149.62 µg/m³, 155.88 µg/m³, 134.78 µg/m³, and 148.21 µg/m³, respectively. The NMBs for PM₁₀ of the CAMx6.2, CAMx7.1, and CMAQv5.0.2 models were −0.44, −0.30, and −0.20, while CMAQv5.3.2 had −0.01 (Figure 1j, Table S1). The NMEs for PM₁₀ of the four models were 0.65, 0.51, 0.56, and 0.51, respectively. The correlation coefficients for PM₁₀ of the four models were 0.48, 0.45, 0.48, and 0.38, respectively (Figure 1j, Table S1). The NAQPMS model had a similar range of errors and correlation coefficients for PM₁₀, with an R in the range of −0.05~0.88, NME in the range of −0.08~0.69, and RMSE in the range of 27.10~169.90.

For the highest PM₁₀ concentration cases, while CMAQv5.3.2 reproduced peak concentrations very well, CAMx6.2, CAMx7.1, and CMAQv5.0.2 underestimated PM₁₀ concentrations significantly. The four models predicted PM₁₀ concentrations of 79.50 µg/m³, 135.14 µg/m³, 129.02 µg/m³, and 263.5 µg/m³, whereas the observation concentrations were 272.50 µg/m³, 272.50 µg/m³, 272.50 µg/m³, and 272.5 µg/m³, respectively. Accordingly, CMAQv5.3.2 had a very small bias of −0.03, and the other three models had bigger biases of −0.71, −0.50, and −0.53. CMAQv5.3.2 had a small NME of 0.20, while the other three models had NMEs of 0.71, 0.50, and 0.53, respectively. The CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models had good correlation coefficients of 0.88, 0.78, and 0.77, respectively, and CAMx6.2 had a low correlation coefficient of 0.34. Except for the CMAQv5.3.2 model, the other three models still have the potential to improve their performances for PM₁₀.

4.6. Evaluations of Model Performance for PM_2.5 Concentration

During the action plan period of 2014–2017, PM_2.5 was the most severe air pollution problem. The air quality models are urgently needed as prediction and assessment tools for control policy-making. The CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models perform well in simulating the mean PM_2.5 concentrations and have a better performance than in simulating PM₁₀. For most cases, the four models slightly underestimated the PM_2.5 concentrations, and the simulated mean PM_2.5 concentrations were 60.72 µg/m³, 79.85 µg/m³, 80.58 µg/m³, and 75.48 µg/m³, while the mean observation concentrations were 95.19 µg/m³, 95.16 µg/m³, 92.77 µg/m³, and 94.24 µg/m³, respectively (Figure 1k, Table S1). The biases and errors of the four models were similarly small: −0.29, −0.07, −0.04, and −0.11 for NMB and 0.51, 0.48, 0.53, and 0.52 for NME, respectively (Figure 1l, Table S1). The modeled results also have good correlation coefficients with observations of 0.58, 0.55, 0.60, and 0.39, respectively (Figure 1l, Table S1). The model performances for PM_2.5 in this study were similar to the work of MICS-Asia III, in which the participating models also slightly underestimated the PM_2.5 concentration with small biases [16].

For the highest PM_2.5 concentration cases, the models still underestimated the PM_2.5 concentrations. The CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models predicted 97.46 µg/m³, 120.44 µg/m³, 91.46 µg/m³, and 120.14 µg/m³, while the observation concentrations were 155.93 µg/m³, 155.93 µg/m³, 155.93 µg/m³, and 157.54 µg/m³, respectively. The NMBs for the four models were −0.37, −0.23, −0.41, and −0.24, and the NMEs were 0.42, 0.30, 0.44, and 0.25, respectively. The four models had very good correlation coefficients of 0.89, 0.87, 0.90, and 0.93, respectively. Similarly, the NAQPMS model simulated PM_2.5 reasonably well, with an R in the range of 0.01~0.89, NMB in the range of −0.10~1.61, and RMSE in the range of 21.30~155.90.

Overall, all participating models performed reasonably well in simulating the PM_2.5 concentrations.

4.7. Evaluations for Modeled PM_2.5 Components

The AQMs simulate the compositions of particulate matter, such as sulfate, nitrate, elemental carbon (EC), organic carbon (OC), and crustal elements, and then sum each of the independently simulated components to obtain PM_2.5 and PM₁₀ concentrations. Therefore, to evaluate the model performance in simulating PM_2.5 and PM₁₀, it is more important to understand its performance in simulating different PM components.

In this work, the model performances in simulating the PM_2.5 components of sulfate, nitrate, BC, OC, and NH₄⁺ are evaluated.
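The component-summation idea described above can be illustrated with the all-case simulated means for CAMx7.1 quoted in Sections 4.6 and 4.7. The Python sketch below is only indicative: the five evaluated components are not the full set of species a model sums to obtain PM_2.5, and the component and PM_2.5 means may not cover identical samples (an assumption made here for illustration).

```python
# All-case simulated means for CAMx7.1 as quoted in this article (µg/m³).
components = {
    "nitrate": 19.32,
    "sulfate": 11.63,
    "BC": 7.04,
    "OC": 16.15,
    "NH4+": 10.05,
}
pm25_total = 79.85  # simulated mean PM2.5 for CAMx7.1 (µg/m³)

# Mass explained by the five evaluated components.
speciated = sum(components.values())

# Remainder: species not evaluated here (e.g., crustal elements).
unspeciated = pm25_total - speciated

# Share of simulated PM2.5 carried by each evaluated component.
shares = {name: value / pm25_total for name, value in components.items()}
```

A bias in any single component propagates directly into the PM_2.5 total, and opposite-signed component biases can cancel, which is why a good PM_2.5 score does not by itself imply good component scores.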

4.7.1. Evaluation for Nitrate of PM_2.5

The CAMx7.1 and CMAQv5.0.2 models perform very well and slightly underestimated the mean nitrate concentrations, and CAMx6.2 and CMAQv5.3.2 underestimated more significantly. The simulated nitrate concentrations of CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 were 10.36 µg/m³, 19.32 µg/m³, 18.15 µg/m³, and 13.77 µg/m³, while the mean observation concentrations were 19.12 µg/m³, 18.96 µg/m³, 18.86 µg/m³, and 19.37 µg/m³, respectively (Figure 2a, Table S1). The biases and errors of the four models for nitrate were small, with NMBs of −0.04, 0.34, 0.47, and 0.27 and NMEs of 0.95, 0.87, 1.02, and 1.16, respectively (Figure 2b, Table S1). The four models had good correlation coefficients of 0.62, 0.55, 0.63, and 0.40, respectively. The NAQPMS model had a similar good performance for nitrate, with an R in the range of 0.04~0.89 for high concentration cases, and an NMB in the range of −0.87~11.37. Overall, the participating models perform well in simulating the nitrate concentrations of PM_2.5.

4.7.2. Evaluation for Sulfate of PM_2.5

Generally, the CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models perform reasonably well in simulating the mean sulfate concentrations (Figure 2c, Table S1). CAMx6.2 and CMAQv5.0.2 slightly underestimated, while CAMx7.1 and CMAQv5.3.2 slightly overestimated the sulfate concentrations. The simulated mean sulfate concentrations of the four models were 6.53 µg/m³, 11.63 µg/m³, 8.97 µg/m³, and 6.82 µg/m³, respectively, while the mean observation concentrations were 7.71 µg/m³, 7.70 µg/m³, 7.50 µg/m³, and 8.76 µg/m³, respectively. CAMx6.2 and CMAQv5.3.2 had small biases, with NMBs of 0.23 and 0.21, while CAMx7.1 and CMAQv5.0.2 had acceptable NMBs of 1.25 and 0.75, respectively (Figure 2d, Table S1). The four models also had reasonable errors; the NMEs were 0.81, 1.51, 1.16, and 0.82, respectively. The four models had good correlation coefficients of 0.58, 0.53, 0.62, and 0.43, respectively. The NAQPMS model had a similar acceptable performance for sulfate, with an R in the range of −0.35~0.86, NMB in the range of −0.29~3.66, and RMSE in the range of 2.30~23.70 (Figure 2d, Table S1).

Overall, the participating models effectively simulated the sulfate concentrations, which is similar to the good performance for sulfate in previous studies conducted in the U.S.; e.g., the models were reported in 2016 to overestimate the magnitudes of concentrations at CSN and IMPROVE sites in most regions and seasons by 0.1–0.6 µg/m³ [21,42,43]. However, the overestimations of SO₂ concentrations by the participating models suggest that there are still large uncertainties in the emission inventory, and that the aqueous and heterogeneous reactions and meteorology need to be further investigated. In the model evaluation work of MICS-Asia III, a general underestimation of sulfate was also found, and the possible absence of sulfate formation mechanisms such as heterogeneous reactions was suggested.

4.7.3. Evaluation for BC of PM_2.5

Overall, the CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models perform very well in simulating the mean BC concentrations. The simulated mean BC concentrations for the four models were 6.13 µg/m³, 7.04 µg/m³, 5.56 µg/m³, and 8.04 µg/m³, while the mean observation concentrations were 4.80 µg/m³, 6.88 µg/m³, 4.89 µg/m³, and 6.79 µg/m³, respectively (Figure 2e, Table S1). All four models slightly overestimated the BC concentration, with positive NMBs of 0.57, 0.45, 0.49, and 1.05. The errors of the four models in simulating BC were relatively large, with NMEs of 0.98, 0.96, 0.92, and 1.41, respectively (Figure 2f, Table S1). The models show reasonable correlation coefficients of 0.50, 0.60, 0.48, and 0.59. The NAQPMS model had a similar performance for the evaluation cases, with an R in the range of −0.02~0.86 and NMB in the range of −0.53~4.63.

4.7.4. Evaluation for OC of PM_2.5

The performances of CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 were diversified in simulating the mean OC concentrations. CAMx6.2 and CAMx7.1 simulated the mean OC concentrations very well, whereas CMAQv5.0.2 underestimated and CMAQv5.3.2 overestimated the mean OC concentrations significantly. The simulated mean OC concentrations of the four models were 14.03 µg/m³, 16.15 µg/m³, 6.72 µg/m³, and 20.94 µg/m³, while the mean observation concentrations were 14.84 µg/m³, 13.14 µg/m³, 15.27 µg/m³, and 12.27 µg/m³, respectively (Figure 2g, Table S1). The CAMx6.2 and CMAQv5.3.2 models had large NMBs of 1.63 and 2.43, whereas CAMx7.1 and CMAQv5.0.2 had reasonable NMBs of 0.80 and 0.66, respectively (Figure 2h, Table S1). All four models had large errors, with NMEs of 2.14, 1.06, 1.55, and 2.79. The four models also had reasonable correlation coefficients of 0.60, 0.56, 0.57, and 0.53 (Figure 2h, Table S1). The NAQPMS model had a similarly diversified performance for OC, with an R in the range of 0.15~0.81, NMB in the range of −0.42~30.88, and RMSE in the range of 2.60~44.30.

4.7.5. Evaluation for NH₄⁺ of PM_2.5

The CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 models underestimated the mean NH₄⁺ concentrations, but with small biases and errors (Figure 2i,j, Table S1). The simulated mean NH₄⁺ concentrations were 5.41 µg/m³, 10.05 µg/m³, 8.66 µg/m³, and 5.80 µg/m³, while the mean observation concentrations were 13.40 µg/m³, 13.41 µg/m³, 13.17 µg/m³, and 12.96 µg/m³, respectively. The NMBs for NH₄⁺ of CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 were −0.31, 0.16, 0.06, and −0.22, respectively. The four models had consistently small NMEs for NH₄⁺ of 0.69, 0.71, 0.70, and 0.74, respectively. The four models also had good correlation coefficients of 0.68, 0.59, 0.66, and 0.52, respectively. The NAQPMS model had a similarly good performance for NH₄⁺, with a high R of 0.89 for high concentration cases. The model performances for NH₄⁺ of PM_2.5 in this study are similar to the work conducted in MICS-Asia III (Chen et al., 2019, [16]).

All the participating models simulated gaseous NH₃ in the atmosphere, which is the precursor of NH₄⁺ in PM_2.5. Unfortunately, there were few gaseous NH₃ concentration data for model verification in this study. In AQMs, the NH₄⁺ concentrations are affected by the partitioning between gaseous NH₃ and the NH₄⁺ of PM_2.5 and by the equilibrium of cations and anions in both the aerosol and liquid phases. At the same time, most areas in the modeling domain are in NH₃-rich conditions. Discrepancies in sulfate and nitrate, as well as ions and precursors missing from the emissions, such as chloride ions, can lead to uncertainties in simulating NH₄⁺ concentrations for PM_2.5. The performances of the participating models for gaseous ammonia and the ammonium of PM_2.5 still need to be further evaluated.
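The cation and anion balance mentioned above can be illustrated with a simple charge-balance estimate: the NH₄⁺ mass needed to fully neutralize given sulfate and nitrate concentrations. The Python sketch below uses rounded molar masses and the observed all-case means quoted in Sections 4.7.1, 4.7.2, and 4.7.5 for the CAMx6.2 comparison set. It is a simplification that ignores other ions and thermodynamic partitioning, and the function name `ammonium_to_neutralize` is hypothetical.

```python
# Molar masses in g/mol (rounded).
M_NH4, M_SO4, M_NO3 = 18.04, 96.06, 62.00

def ammonium_to_neutralize(sulfate, nitrate):
    """NH4+ mass (µg/m³) needed to balance the charge of sulfate and nitrate.

    Sulfate carries two charges per ion and nitrate one; inputs are in µg/m³.
    """
    equivalents = 2.0 * sulfate / M_SO4 + nitrate / M_NO3  # µmol of charge per m³
    return M_NH4 * equivalents

# Observed all-case means quoted in this article (µg/m³).
required = ammonium_to_neutralize(sulfate=7.71, nitrate=19.12)
observed_nh4 = 13.40
```

With these inputs, the required NH₄⁺ is about 8.5 µg/m³, below the observed mean of 13.40 µg/m³. This would be consistent with anions other than sulfate and nitrate, such as chloride, contributing to the balance, although the comparison is only indicative because the quoted means may not cover identical samples.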

5. Discussion

5.1. Sources of Uncertainty of CTMs and the Limitation of the Evaluation in This Work

In this study, we mainly used daily average concentrations, which are required for PM_2.5 and most air pollutants by the Ambient Air Quality Standard of China [41], as evaluation statistical measures. The participating models demonstrated their basic ability in simulating the daily average concentrations for NO₂, CO, O₃, PM_2.5, and MDA8 O₃ with acceptable errors and biases, and can be used as prediction and evaluation tools in regulatory applications. However, for the purpose of the detailed evaluation of models and improving the specific chemical mechanisms or physical algorithms of models, detailed case-specific evaluations using a higher temporal resolution, even if not defined in the air quality standards (e.g., for PM_2.5), should be considered. We did not evaluate the model performance for annual average concentration due to a lack of monitoring data.

The sources of uncertainty in the air quality models can be roughly categorized as emission data, meteorological data and meteorology models, and chemical mechanisms and physical algorithms. In the Project, the meteorology input data, including global analysis data and land-use data, were tested and provided. An emission inventory was updated and provided; the pre-processing tool and suggestions were also provided. Specifications for the meteorology and air quality models were tested and suggested. However, each model group updated and modified the input emission data, pre-processed the data, and selected the meteorology model, CTMs, modeling specifications, and options based on their own circumstances and considerations [18]. Given the large number of types of data, the various uncertainties in the emission inventory and meteorology input data, and the tremendous number of atmospheric chemical and physical processes, scientific schemes, and specification options, this study, following the common air quality modeling approach, did not require a consistency of input data and model specifications. This study also did not conduct in-depth sensitivity analyses for sources of uncertainty using either the brute-force method or a sensitivity analysis module of the models. Therefore, the evaluation results cannot be used as guidance for specific data, mechanism, and algorithm improvements.
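For readers unfamiliar with the term, a brute-force sensitivity analysis reruns the model with one input perturbed and compares the result with the base run. The Python sketch below shows the idea with a toy linear function standing in for a chemical transport model; `toy_model`, `brute_force_sensitivity`, and all numbers are hypothetical, and no such analysis was performed in this study.

```python
def brute_force_sensitivity(run_model, base_emission, fraction=0.1):
    """Estimate d(concentration)/d(emission) from two model runs.

    run_model: callable mapping an emission rate to a concentration.
    fraction:  relative emission reduction applied in the perturbed run.
    """
    base = run_model(base_emission)
    perturbed = run_model(base_emission * (1.0 - fraction))
    return (base - perturbed) / (base_emission * fraction)

def toy_model(emission):
    """Hypothetical stand-in for a CTM: background plus a linear response."""
    return 5.0 + 0.8 * emission

sensitivity = brute_force_sensitivity(toy_model, base_emission=100.0)
```

In a real application, each call to `run_model` is a full simulation, and secondary species such as sulfate or O₃ respond nonlinearly, so the estimate depends on the size of the perturbation.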

Due to the uncertainties in emissions and chemical characteristics of different air pollutants, it is difficult to define fixed criteria for the air pollutants in different concentration levels and emissions (much higher concentrations and more complicated emission sources in China, with various meteorology). We applied the approach of a comparison of statistical indicators, used in the development of CMAQ and CAMx in the U.S. [1,2,3] and other countries, and made suggestions for criteria for further consideration in the Project [18].

It is also worth noting that, even though there are five cases for high O₃ events, the main purpose of this evaluation is for PM_2.5. Due to the measurement data availability, we did not investigate in-depth the performance of models for O₃ and its precursors, and the products and radicals of photochemical reactions. The evaluation for high O₃ seasons should be investigated further, especially for the species of VOCs, NOx, and radicals.

Since the PM_2.5 concentrations are largely dominated by regional emissions, the large domain covering East Asia or the whole of China in this study can greatly reduce the uncertainties from boundary conditions. However, due to the global background of tropospheric ozone and its high concentration in the upper air, further investigation of the boundary conditions is needed.

5.2. Uncertainty of Emission Data

The uncertainties in the emission data used in the CTM air quality model stem from both the emission inventory and the pre-processing procedures. In this evaluation, we did not observe significant performance differences among the participating models or their versions, despite variations in chemical mechanisms, physical algorithms, model configurations, and spatial resolutions. However, the models showed a consistent performance across different species, with a relatively better performance for PM_2.5, CO, and NO₂, and poorer performance for PM₁₀, SO₂, and certain PM_2.5 components. Compared to the generally better performance of CMAQ and CAMx models in the United States and Europe, it can be concluded that the emission inventory remains the largest source of uncertainty in China.

Even though the participating models simulated the sulfate concentrations relatively well, the overestimation of SO₂ may still suggest the possible underestimation of sulfate by the models. The model evaluation results for SO₂ and sulfate in this study were comparable to the poor performance from previous evaluations conducted in China. In contrast, extensive evaluation efforts in the U.S. and Europe using the CMAQ and CAMx models have generally demonstrated a good performance for both SO₂ and sulfate. The poor performance of the participating models in simulating SO₂ indicated that there was still a large uncertainty in the emission inventory, which was most likely caused by the rapid emission inventory change as a result of the air pollution action plan during that period. The performances of meteorology modeling, such as precipitation and humidity, which are related to aqueous and heterogeneous sulfate formation mechanisms, also need to be further investigated for the possible underestimation of sulfate.

For the highest PM₁₀ concentration cases, three models, CAMx6.2, CAMx7.1, and CMAQv5.0.2, underestimated PM₁₀ concentrations significantly. Since coarse PM mainly comes from sources such as road dust, wind-blown dust and soil, and the long-range transport of dust–sand storms, the difference in the model performances or uncertainty of models for PM₁₀ may be related to the uncertainties in local emission data. A high spatial and temporal resolution emission inventory or emission models, either inline or offline, considering human activity, the conditions of road or loading sites, and meteorology are essential for improvement.

The pre-processing of emission data, including the spatial and temporal allocation of emission sources of administrative units to the modeling grids and time steps, as well as chemical speciation, can contribute further to the uncertainties and inconsistency due to the lack of necessary information and methodology. For example, the current approach using population and GDP for the spatial allocation of the emissions of administrative units to the modeling grids is not appropriate or accurate for all air pollutants and types of sources. The other essential information, such as chemical speciation profiles for the chemical speciation of VOCs, NOx, and PM emissions for model species and temporal profiles of the sources for temporal allocation, is usually not available for emission sources.
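The proxy-based spatial allocation described above can be sketched as follows. This is a minimal Python illustration that assumes a single administrative unit, a single pollutant, and a population grid as the only proxy; the function name `allocate_by_proxy` and the numbers are hypothetical and do not represent the pre-processing tool used in the Project.

```python
import numpy as np

def allocate_by_proxy(total_emission, proxy):
    """Distribute an administrative-unit total over grid cells in proportion to a proxy."""
    w = np.asarray(proxy, dtype=float)
    if w.sum() <= 0:
        raise ValueError("proxy must have a positive sum")
    return total_emission * w / w.sum()

# Hypothetical 2 x 2 grid of population counts inside one administrative unit.
population = [[100.0, 300.0],
              [0.0, 600.0]]
gridded = allocate_by_proxy(50.0, population)  # 50 units of emission in total
```

The allocation conserves the total, but the unpopulated cell receives zero emissions even if it contains, for example, an industrial source. This is the kind of mismatch that makes a population or GDP proxy inappropriate for some pollutants and source types.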

5.3. Evaluation Methods

The primary objective of this study is to provide a general performance evaluation for the key air pollutants of air quality models for regulatory applications. Therefore, we pay more attention to the general model performance for six air pollutants from the air quality standards of China, and five components of PM_2.5. This work used the most common statistical indicators for model performance, which is common practice in quantitative model evaluation (Tables S1 and S2). However, when investigating the performance of specific cases or different species, the model performance showed more uncertainties. For example, the participating models showed reasonable correlation coefficients, indicating that the models reproduced the regional distributions and temporal variations in air pollutant concentrations well. However, upon a closer examination of the time series across different case studies, it can be found that simulated values for various air pollutants exhibit distinct deviations. Deviations are notably greater for PM_2.5, particularly for some PM_2.5 components (Tables S1 and S2). Simulated O₃ usually showed a better performance for diurnal variation, while CO and NO₂ showed a poorer performance (Tables S1 and S2). Therefore, even though the general performance of the models, e.g., for PM_2.5 and O₃, is acceptable, a further evaluation for detailed cases, diurnal variation, and PM_2.5 components and precursors of O₃ is necessary.

This study also did not conduct in-depth sensitivity analyses for sources of uncertainty using either the brute-force method or a sensitivity analysis module of the models. This study did not require a consistency of input data and model specifications. Therefore, the evaluation results do not provide direct guidance for the improvement of specific input data such as emission inventory and pre-processing methods, meteorology input data and modeling, and mechanisms and algorithms of air quality models. For evaluation work for the purpose of model and application improvement, in-depth evaluations for specific species and specific cases (location and meteorology) are needed. In future work, standardized charts and figures for time series and spatial distributions for regulatory model evaluations could be considered.

6. Conclusions

This study inter-compared five regional chemistry and transport models/versions using the same set of observation data and statistical indicators.

All the participating models performed well in simulating the daily average concentrations of NO₂, CO, O₃, and PM_2.5 with an acceptable bias, and can be used as prediction and evaluation tools in regulatory applications. The participating models generally overestimated SO₂, underestimated PM₁₀, and showed a variability among models for the air pollutant and PM_2.5 component concentrations. The participating models also showed reasonable correlation coefficients, indicating that the models reproduced and represented the temporal and regional distributions and temporal variations in air pollutant concentrations well.

Except for CAMx6.2, which underestimated CO, the performances of all participating models were acceptable in simulating the mean CO concentrations. The participating models also performed well for the highest CO concentration cases. Similarly, except for CAMx6.2, which underestimated NO₂, all the models reproduced the average and the range of NO₂ concentrations well. However, for the highest concentration cases, all the models underestimated the NO₂ concentrations. Overall, the participating models perform well in simulating the MDA8 O₃ concentrations. The good performances of the participating models were similar to the model evaluation results of CMAQ and CAMx in the U.S. and better than the results of MICS-Asia, in which a considerable variability of model performance for O₃ was found. It is worth noting that only five O₃ cases were evaluated, while the remaining cases occurred during fall and winter, when O₃ levels are generally lower. Further research is needed to investigate the performance of the participating models during the O₃ pollution season.

The participating models generally underestimated PM₁₀ concentrations and showed a great variability among different models. CAMx6.2, CAMx7.1, and CMAQv5.0.2 significantly underestimated PM₁₀ concentrations, and only CMAQv5.3.2 slightly underestimated. The participating models performed well in simulating PM_2.5 concentrations; for most cases, the participating models slightly underestimated the PM_2.5 concentrations. However, the participating models showed a great variability for PM_2.5 components. The models performed reasonably well in simulating the mean sulfate and nitrate concentration in PM_2.5. The participating models also simulated BC very well in terms of the mean concentration and peak value. However, the performances of CAMx6.2, CAMx7.1, CMAQv5.0.2, and CMAQv5.3.2 were diversified in simulating the mean OC concentrations. The participating models performed reasonably well in simulating the average and peak values of NH₄⁺ concentrations. However, since there was no NH₃ concentration data for verification and most areas in the modeling domain are under NH₃-rich conditions, the performance of NH₄⁺ may be affected by the partitioning between NH₃ air concentrations and aerosols and the equilibrium of cations and anions of aerosols. The performances of the gaseous ammonia and ammonium of PM_2.5 for the participating models still need to be further evaluated.

This evaluation shows that different air quality models and versions, and different model parameterization schemes, still have varying degrees of uncertainty across pollutants and components, as well as across periods and regions. The results suggest that regulatory applications of air quality models require continuous and routine evaluation, together with a regulatory mechanism or framework for air quality model application and improvement.

Based on the evaluation results, it can be concluded that the emission inventory remains the largest source of uncertainty in China. Therefore, updating and improving the emission inventory should be considered a top priority for the regulatory applications and improvement of CTM air quality models.

This study aims to conduct a general performance evaluation of the key air pollutants in air quality models for regulatory applications. Overall, the model performance was found to be acceptable for most air pollutants, and the modeling results can provide valuable information for policy-making. However, significant uncertainties remain, particularly given the limitations of this evaluation. Further regular and systematic evaluations are needed. More in-depth evaluations for specific input data—such as emissions, meteorology, and modeling—as well as chemical mechanisms and physical algorithms in air quality models are essential for improving both models and data. At the same time, air quality model application guidelines, standardized evaluation methodologies, and regulations are necessary to ensure consistent model evaluations and regulatory applications in China.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/atmos17010001/s1, Table S1: The model performance for CO, SO₂, NO₂, O₃, PM_2.5, PM₁₀ concentrations and PM_2.5 components of 17 cases. Table S2: The model performance for CO, SO₂, NO₂, O₃, PM_2.5, PM₁₀ concentrations and PM_2.5 components of the highest concentration case. Table S3: Recommended scheme for the WRF meteorological model in China. Figure S1: Flowchart of air quality model evaluation. Figure S2: Locations of super-stations in China: Xianghe and Beijing within the Beijing-Tianjin-Hebei (BTH) region; Taizhou in the Yangtze River Delta (YRD) region; Shenzhen in the Pearl River Delta (PRD) region; and Chengdu in the Chengdu-Chongqing (CC) region. The BTH, YRD, PRD, and CC regions are highlighted in yellow on the map. The shaded area is the modeling domain of CAMx6.2 and CMAQv5.0.2. Figure S3: Comparison of simulated and observed surface air temperatures for seven major Chinese cities in January and July 2018 using NCEP FNL data under different parameterization schemes. (a) January, (b) July. Figure S4: Comparison of simulated and observed surface air pressures for seven major Chinese cities in January and July 2018 using NCEP FNL data under different parameterization schemes. (a) January, (b) July. Figure S5: Comparison of simulated and observed surface wind speed for seven major Chinese cities in January and July 2018 using NCEP FNL data under different parameterization schemes. (a) January, (b) July. Figure S6: The comparison of hourly concentrations simulated by CAMx7.1 (blue solid lines) with the observation data (red dotted lines) at the Xianghe site for 2018.01.06~2018.01.08 (Case1, PM_2.5 episode). Figure S7: The comparison of O₃ concentrations simulated by CAMx7.1 (blue solid lines) with the observation data (red dotted lines) at the Taizhou site for 2018.05.28~2018.06.09 (Case13, O₃ episode). Figure S8: The comparison of hourly concentrations simulated by CMAQv5.3.2 (red solid lines) with the observation data (black dotted line) at the Xianghe site for 2018.01.06~2018.01.08 (Case1, PM_2.5 episode).

Author Contributions

Methodology, Y.L. and J.H.; validation, X.W., S.Y. and X.T.; data curation, W.T.; writing—original draft preparation, F.M. and X.D.; writing—review and editing, J.X. and M.X.; project administration, L.Z. and H.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the “Leading Goose” R&D Program of Zhejiang (2025C02231), the project “Research on the technical system of regulatory air quality models” (Project No. 2018YFC0213501), the Major Science and Technology Project of the Xinjiang Uygur Autonomous Region, “Research on the Comprehensive Causes and Advanced Control Technologies for Severe Air Pollution in the Urumqi-Changji-Shihezi Region” (No. 2024A03012-2), the Jing-Jin-Ji Regional Integrated Environmental Improvement-National Science and Technology Major Project (No. 2024ZD1200502), and the Basic Scientific Research Business Expense Project of Central-level Public Welfare Scientific Research Institutes (No. 2024YSKY-08).

Institutional Review Board Statement

The study did not require ethical approval.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions to the study are included in the article; further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Appel, K.W.; Bash, J.O.; Fahey, K.M.; Foley, K.M.; Gilliam, R.C.; Hogrefe, C.; Hutzell, W.T.; Kang, D.; Mathur, R.; Wong, D.C. The Community Multiscale Air Quality (CMAQ) model versions 5.3 and 5.3.1: System updates and evaluation. Geosci. Model Dev. Discuss. 2020, 14, 2867–2897.
Appel, K.W.; Napelenok, S.L.; Foley, K.M.; Pye, H.O.; Hogrefe, C.; Luecken, D.J.; Bash, J.O.; Roselle, S.J.; Pleim, J.E.; Young, J.O. Description and evaluation of the Community Multiscale Air Quality (CMAQ) modeling system version 5.1. Geosci. Model Dev. 2017, 10, 1703–1732.
Appel, K.W.; Pouliot, G.A.; Simon, H.; Sarwar, G.; Pye, H.O.T.; Napelenok, S.L.; Akhtar, F.; Roselle, S.J. Evaluation of dust and trace metal estimates from the Community Multiscale Air Quality (CMAQ) model version 5.0. Geosci. Model Dev. 2013, 6, 883–899.
Appel, W. Summary and evaluation of the new Community Multiscale Air Quality (CMAQ) model version 5.5. In Proceedings of the CMAS Conference 2024, Chapel Hill, NC, USA, 21–23 October 2024; Available online: https://www.cmascenter.org/conference/2024/agenda.cfm (accessed on 18 November 2025).
Pennington, E.A.; Wang, Y.; Schulze, B.C.; Seltzer, K.M.; Yang, J.; Zhao, B.; Jiang, Z.; Shi, H.; Venecek, M.; Chau, D.; et al. An Updated Modeling Framework to Simulate Los Angeles Air Quality. Part 1: Model Development, Evaluation, and Source Apportionment. EGUsphere 2023.
Campbell, P.C.; Tang, Y.; Lee, P.; Baker, B.; Tong, D.; Saylor, R.; Stein, A.; Huang, J.; Huang, H.-C.; Strobach, E.; et al. Development and evaluation of an advanced National Air Quality Forecasting Capability using the NOAA Global Forecast System version 16. Geosci. Model Dev. 2022, 15, 3281–3313.
Torres-Vazquez, A.; Pleim, J.; Gilliam, R.; Pouliot, G. Performance evaluation of the meteorology and air quality conditions from multiscale WRF-CMAQ simulations for the Long Island Sound Tropospheric Ozone Study (LISTOS). J. Geophys. Res.-Atmos. 2022, 127, e2021JD035890.
Chen, X.Y.; Zhang, Y.; Wang, K.; Tong, D.; Lee, P.; Tang, Y.H.; Huang, J.P.; Campbell, P.C.; McQueen, J.; Pye, H.O.T.; et al. Evaluation of the offline-coupled GFSv15-FV3-CMAQv5.0.2 in support of the next-generation National Air Quality Forecast Capability over the contiguous United States. Geosci. Model Dev. 2021, 14, 3969–3993.
Harkey, M.; Holloway, T.; Kim, E.J.; Baker, K.R.; Henderson, B. Satellite Formaldehyde to Support Model Evaluation. J. Geophys. Res.-Atmos. 2021, 126, e2020JD032881.
Toro, C.; Foley, K.; Simon, H.; Henderson, B.; Baker, K.R.; Eyth, A.; Timin, B.; Appel, W.; Luecken, D.; Beardsley, M.; et al. Evaluation of 15 years of modeled atmospheric oxidized nitrogen compounds across the contiguous United States. Elem.-Sci. Anthr. 2021, 9, 20.
Shen, J.; Wang, X.; Li, J.; Li, Y.; Zhang, Y. Evaluation and intercomparison of ozone simulations by Models-3/CMAQ and CAMx over the Pearl River Delta. Sci. China Chem. 2011, 54, 1789–1800.
Ma, S.; Zhang, X.; Gao, C.; Tong, D.Q.; Xiu, A.; Wu, G.; Cao, X.; Huang, L.; Zhao, H.; Dan, M. Multimodel simulations of a springtime dust storm over northeastern China: Implications of an evaluation of four commonly used air quality models (CMAQ v5.2.1, CAMx v6.50, CHIMERE v2017r4, and WRF-Chem v3.9.1). Geosci. Model Dev. 2019, 12, 4603–4625.
Gao, Z.; Zhou, X. A review of the CAMx, CMAQ, WRF-Chem and NAQPMS models: Application, evaluation and uncertainty factors. Environ. Pollut. 2024, 343, 123183.
Lou, M.; Wu, Q.; Wang, W.; Chen, H.; Cao, K.; Fan, X.; Liang, D.; Yu, F.; Zhang, J.; Wang, Z. Evaluating the EPICC-Model for Regional Air Quality Simulation: A Comparative Study with CAMx and CMAQ. EGUsphere 2025.
Kim, C.H.; Chang, L.S.; Kim, J.S.; Meng, F.; Kajino, M.; Ueda, H.; Zhang, Y.; Son, H.-Y.; He, Y.; Xu, J.; et al. Long-term simulations of the sulfur concentrations over the China, Japan and Korea: A model comparison study. Asia-Pac. J. Atmos. Sci. 2011, 47, 399–411.
Chen, L.; Gao, Y.; Zhang, M.; Fu, J.S.; Zhu, J.; Liao, H.; Li, J.; Huang, K.; Ge, B.; Wang, X.; et al. MICS-Asia III: Multi-model comparison and evaluation of aerosol over East Asia. Atmos. Chem. Phys. 2019, 19, 11911–11937.
Li, J.; Nagashima, T.; Kong, L.; Ge, B.; Yamaji, K.; Fu, J.S.; Wang, X.; Fan, Q.; Itahashi, S.; Lee, H.-J.; et al. Model evaluation and intercomparison of surface-level ozone and relevant species in East Asia in the context of MICS-Asia Phase III–Part 1: Overview. Atmos. Chem. Phys. 2019, 19, 12993–13015.
Ministry of Science and Technology of the People’s Republic of China. (2018–2021). Research on the Technical System of Regulatory Air Quality Models (Project No. 2018YFC0213501); National Key R&D Program of China, Special Program on Atmospheric Pollution Causes and Control Technologies; Ministry of Science and Technology of the People’s Republic of China: Beijing, China. Available online: https://www.nstrs.cn/kjbg/SearchResult?wd=2018YFC0213501&q=All:2018YFC0213501 (accessed on 1 December 2025).
U.S. EPA, CMAQ/DOCS/Release_Notes/README.md at 5.3.2 (October, 2020 Release). Available online: https://github.com/USEPA/CMAQ/blob/5.3.2/DOCS/Release_Notes/README.md (accessed on 15 September 2023).
ENVIRON. User’s Guide Comprehensive Air Quality Model with Extensions (CAMx); ENVIRON International Corporation: Novato, CA, USA, 2024; Available online: https://www.camx.com (accessed on 13 October 2025).
U.S. EPA. Air Quality Modeling Technical Support Document: 2016 CAMx PM_2.5 Model Evaluation to Support of EGU Benefits Assessments. Available online: https://www.epa.gov/system/files/documents/2023-01/PMmodelperformance_2026fj_NH3Rscale0_Final2.pdf (accessed on 1 January 2024).
Byun, D.; Schere, K.L. Review of the governing equations, computational algorithms, and other components of the models-3 Community Multiscale Air Quality (CMAQ) modeling system. Appl. Mech. Rev. 2006, 59, 51–77.
Kwok, R.H.F.; Baker, K.R.; Napelenok, S.L.; Tonnesen, G.S. Photochemical grid model implementation and application of VOC, NO_x and O₃ source apportionment. Geosci. Model Dev. 2015, 8, 99–114.
Powers, J.G.; Klemp, J.B.; Skamarock, W.C.; Davis, C.A.; Dudhia, J.; Gill, D.O.; Coen, J.L.; Gochis, D.J.; Ahmadov, R.; Peckham, S.E.; et al. The Weather Research and Forecasting Model Overview, System Efforts, and Future Directions. Bull. Am. Meteorol. Soc. 2017, 98, 1717–1737.
Institute for the Environment. SMOKE v5.2 User’s Manual; The University of North Carolina at Chapel Hill: Chapel Hill, NC, USA, 2025. Available online: https://www.cmascenter.org/smoke/documentation/5.2/USER_MANUAL.pdf (accessed on 13 November 2025).
Carter, W.P.L. Development of the SAPRC-07 chemical mechanism. Atmos. Environ. 2010, 44, 5324–5335.
Carter, W.P.L. Development of a condensed SAPRC-07 chemical mechanism. Atmos. Environ. 2010, 44, 5336–5345.
Whitten, G.Z.; Heo, G.; Kimura, Y.; McDonald-Buller, E.; Allen, D.T.; Carter, W.P.L.; Yarwood, G. A new condensed toluene mechanism for Carbon Bond: CB05-TU. Atmos. Environ. 2010, 44, 5346–5355.
Reff, A.; Bhave, P.V.; Simon, H.; Pace, T.G.; Pouliot, G.A.; Mobley, J.D.; Houyoux, M. Emissions Inventory of PM_2.5 Trace Elements across the United States. Environ. Sci. Technol. 2009, 43, 5790–5796.
Fountoukis, C.I.; Nenes, A. Isorropia II: A Computationally Efficient Thermodynamic Equilibrium Model for Multiphase Multicomponent Aerosols. In Proceedings of the 2006 Annual Meeting, Philadelphia, PA, USA, 5–8 January 2006.
Cooter, E.J.; Bash, J.O.; Walker, J.T.; Jones, M.R.; Robarge, W. Estimation of NH3 bi-directional flux from managed agricultural soils. Atmos. Environ. 2010, 44, 2107–2115.
Hogrefe, C.; Bash, J.O.; Pleim, J.E.; Schwede, D.B.; Gilliam, R.C.; Foley, K.M.; Appel, K.W.; Mathur, R. An analysis of CMAQ gas-phase dry deposition over North America through grid-scale and land-use-specific diagnostics in the context of AQMEII4. Atmos. Chem. Phys. 2023, 23, 8119–8147.
Bott, A. A positive definite advection scheme obtained by nonlinear renormalization of the advective fluxes. Mon. Weather Rev. 1989, 117, 1006–1016.
Colella, P.; Woodward, P.R. The piecewise parabolic method (PPM) for gas-dynamical simulations. J. Comput. Phys. 1984, 54, 174–201.
Pleim, J. A combined local and non-local closure model for the atmospheric boundary layer. Part I: Model description and testing. J. Appl. Meteorol. Climatol. 2007, 46, 1383–1395.
Wesely, M.L. Parameterization of Surface Resistances to Gaseous Dry Deposition in Regional-Scale Numerical Models. Atmos. Environ. 1989, 23, 1293–1304.
Slinn, S.A.; Slinn, W.G.N. Predictions for particle deposition on natural waters. Atmos. Environ. 1980, 24, 1013–1016.
Zhang, L.; Wright, L.P.; Asman, W.A.H. Bi-directional air-surface exchange of atmospheric ammonia: A review of measurements and development of a big-leaf model for applications in regional-scale air-quality models. J. Geophys. Res. Atmos. 2010, 115, 1–23.
Wang, Z.; Li, J.; Wang, X.; Pochanart, P.; Akimoto, H. Modeling of regional high ozone episode observed at two mountain sites (Mt. Tai and Huang) in East China. J. Atmos. Chem. 2006, 55, 253–272.
Walcek, C.J.; Aleksic, N.M. A simple but accurate mass conservative, peak-preserving, mixing ratio bounded advection algorithm with FORTRAN code. Atmos. Environ. 1998, 32, 3863–3880.
GB 3095-2012; Ambient Air Quality Standards. Ministry of Ecology and Environment of People’s Republic of China (MEE, China): Beijing, China, 2012. Available online: https://www.mee.gov.cn/ywgz/fgbz/bz/bzwb/dqhjbh/dqhjzlbz/201203/t20120302_224165.shtml (accessed on 13 September 2025).
U.S. EPA. Air Quality Modeling Final Rule Technical Support Document, 2015. Ozone NAAQS Good Neighbor Plan. Available online: https://www.epa.gov/system/files/documents/2023-03/AQ%20Modeling%20Final%20Rule%20TSD.pdf (accessed on 5 February 2024).
Marmur, A.; Liu, W.; Wang, Y.; Russell, A.G.; Edgerton, E.S. Evaluation of model simulated atmospheric constituents with observations in the factor projected space: CMAQ simulations of SEARCH measurements. Atmos. Environ. 2009, 43, 1839–1849.

Figure 1. Model performance in simulating (a,b) CO, (c,d) NO₂, (e,f) SO₂, (g,h) O₃, (i,j) PM₁₀, and (k,l) PM_2.5 concentrations. “Model” in the charts denotes the concentrations simulated by the models, and “Obs.” denotes the observation data used for evaluation. Obs. and Model values are given in µg/m³.

Figure 2. Model performance in simulating PM_2.5 components: (a,b) nitrate, (c,d) sulfate, (e,f) BC, (g,h) OC, and (i,j) NH₄⁺. “Model” in the charts denotes the concentrations simulated by the models, and “Obs.” denotes the observation data used by the individual models, which vary slightly with each model’s simulation time. Obs., Model, MB, ME, and RMSE are given in µg/m³; NMB and NME are given in %; R is unitless.
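As a rough illustration of the statistics named in the Figure 2 caption (MB, ME, RMSE, NMB, NME, and R), the following minimal sketch computes them for paired observed and simulated concentrations. It assumes the definitions commonly used in air quality model evaluation (for example, NMB as the sum of model-minus-observation differences divided by the sum of observations); the function name `evaluation_stats` and the sample values are hypothetical and are not taken from this study. NMB and NME are returned in %, as in the caption.

```python
import math


def evaluation_stats(obs, mod):
    """Return MB, ME, RMSE, NMB (%), NME (%) and Pearson R for paired values.

    obs and mod are equal-length sequences of observed and simulated
    concentrations (e.g. in ug/m3). Commonly used definitions are assumed.
    """
    if len(obs) != len(mod) or len(obs) < 2:
        raise ValueError("need at least two paired values")
    n = len(obs)
    diffs = [m - o for o, m in zip(obs, mod)]
    abs_sum = sum(abs(d) for d in diffs)
    sum_obs = sum(obs)

    mb = sum(diffs) / n                      # mean bias
    me = abs_sum / n                         # mean (gross) error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    nmb = 100.0 * sum(diffs) / sum_obs       # normalized mean bias, %
    nme = 100.0 * abs_sum / sum_obs          # normalized mean error, %

    # Pearson correlation coefficient between observations and model
    mean_o = sum_obs / n
    mean_m = sum(mod) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_m = sum((m - mean_m) ** 2 for m in mod)
    r = cov / math.sqrt(var_o * var_m)

    return {"MB": mb, "ME": me, "RMSE": rmse, "NMB": nmb, "NME": nme, "R": r}


if __name__ == "__main__":
    # Synthetic hourly values for illustration only (not data from the study)
    observed = [10.0, 20.0, 30.0, 40.0]
    simulated = [8.0, 22.0, 27.0, 35.0]
    for name, value in evaluation_stats(observed, simulated).items():
        print(f"{name}: {value:.3f}")
```

Note that the normalized statistics divide by the sum of observations, so they are undefined when all observed values are zero; a production script would also need to screen out missing observations before pairing.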

Table 1. The model settings for CMAQv5.02, CMAQv5.3.2, CAMx6.2, CAMx v7.1, and NAQPMS.

Meteorology	CMAQv5.02	CMAQv5.3.2	CAMx6.2	CAMx v7.1	NAQPMS
Meteorology model	WRF3.2	WRF3.7/WPS3.7	WRFv3.9.1.1/WPS3.9.1.1	WRF3.2/WPS3.9.1	WRF3.7/WPS3.7
Microphysics	WRF Single-Moment 3-class scheme (WSM3)	Lin (Purdue)	WSM3/Kain–Fritsch (new Eta) Cumulus scheme	Lin (Purdue)	Lin (Purdue)
Boundary layer	YSU	YSU	ACM2 (Pleim) PBL (ARW)	YSU	YSU
Long-wave scheme	RRTM	RRTM	RRTM	RRTM	RRTM
Short-wave scheme	Dudhia	Dudhia	Dudhia	Dudhia	Dudhia
Land-surface process	Noah Land-surface	Noah Land-surface	Noah Land-surface	Noah Land-surface	Noah Land-surface
CTM model parameterization	CMAQv5.02	CMAQv5.3.2	CAMx6.2	CAMx v7.1	NAQPMS
Horizontal advection	PPM	PPM	PPM	PPM	Walcek and Aleksic scheme
Vertical advection	WRF_Cons	WRF_Cons	Implicit backward-Euler (time) hybrid centered/upstream (space)	Implicit backward-Euler (time) hybrid centered/upstream (space)	Walcek and Aleksic scheme
Horizontal diffusion	Multiscale	Multiscale	Explicit simultaneous 2-D	Explicit simultaneous 2-D	K-theory
Vertical diffusion	ACM2	ACM2	ACM2	ACM2	K-theory
Gas-phase chemistry	SAPRC99	CB06	SAPRC99	SAPRC07TC	CBM-Z
Photolysis	JPROC	inline-JPROC	TUV	TUV	TUV with consideration of cloud effects
Aqueous chemistry	AQCHEM-KMTI	AQCHEM-KMTBR	RADM-AQ	RADM-AQ	RADM2
Heterogeneous chemistry	ACM	ACM; NO₂ heterogeneous reactions on aerosol	SO₂ heterogeneous reactions on aerosol	SO₂ heterogeneous reactions on aerosol	28 heterogeneous reactions +SO₂ heterogeneous catalytic reactions
Aerosol thermodynamics	Isorropia2.0	Isorropia2.2	Isorropia1.7	Isorropia1.7	Isorropia2.1
Secondary organic aerosol mechanisms	VBS + two products	AE7, VBS + two products	SOAP	SOAP2.2	VBS
Dry deposition	Wesely89	M3dry	ZHANG03 scheme	Wesely89	Wesely89
Wet deposition	RADM-AQ	CLOUD/ACM_AE7	SMGA	SMGA	Henry’s law
Boundary condition	Fixed	Fixed	Fixed	Fixed	Fixed
Modeling domain	Whole country	East of China	Whole country	Whole country	Whole country
Grid	36 km	12 km	36 km	36 km	27 km
Simulation period	17 evaluation cases	17 evaluation cases	17 evaluation cases	17 evaluation cases	17 evaluation cases

Table 2. Monitoring sites of 17 severe air pollution cases.

Case	City	Period of Time	Episode	Longitude (°E)	Latitude (°N)	Representing Region
1	Xianghe	6–8 January 2018	PM_2.5	116.95	39.79	BTH
2	Xianghe	11–15 January 2018	PM_2.5	116.95	39.79	BTH
3	Xianghe	17–19 January 2018	PM_2.5	116.95	39.79	BTH
4	Taizhou	25–30 May 2018	PM_2.5	119.99	32.55	YRD
5	Shenzhen	9–11 October 2018	PM_2.5	113.98	22.59	PRD
6	Shenzhen	25–27 October 2018	PM_2.5	113.98	22.59	PRD
7	Beijing	2–4 November 2018	PM_2.5	116.31	39.99	BTH
8	Beijing	12–15 November 2018	PM_2.5	116.31	39.99	BTH
9	Beijing	23–26 November 2018	PM_2.5	116.31	39.99	BTH
10	Beijing	30 November–2 December 2018	PM_2.5	116.31	39.99	BTH
11	Beijing	10–14 January 2019	PM_2.5	116.31	39.99	BTH
12	Chengdu	31 August–4 September 2019	PM_2.5	103.85	30.39	CC
13	Taizhou	28 May–9 June 2018	O₃	119.99	32.55	YRD
14	Taizhou	11–15 June 2018	O₃	119.99	32.55	YRD
15	Shenzhen	28 September–9 October 2018	O₃	113.98	22.59	PRD
16	Shenzhen	26–28 October 2018	O₃	113.98	22.59	PRD
17	Chengdu	10–19 August 2019	O₃	103.85	30.39	CC

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
