A New Coupling Method for PM 2.5 Concentration Estimation by the Satellite-Based Semiempirical Model and Numerical Model

: Aerosol optical and chemical properties play a major role in the retrieval of PM 2.5 concentrations based on aerosol optical depth (AOD) data from satellites in the conventional semiempirical model (SEM). However, limited observation information hinders the high-resolution estimation of PM 2.5 . Therefore, a new method for evaluating near-surface PM 2.5 at high spatial resolution is developed by coupling the SEM and the chemical transport model (CTM)-based numerical (CSEN) model. The numerical model can provide large-scale information for aerosol properties with high spatial resolution at a large scale based on emissions and meteorology, though it can still be biased in simulating absolute PM 2.5 concentrations. Therefore, the two crucial aerosol characteristic parameters, including the coefﬁcient integrated humidity effect ( γ (cid:48) ) and the comprehensive reference value of aerosol properties ( K ) in SEM, have been redeﬁned using the WRF-Chem numerical model. Improved model performance was observed for these results compared with the original SEM results. The monthly averaged correlation coefﬁcients (R) by CSEN were 0.92, 0.82, 0.84, and 0.83 in January, April, July, and October, respectively, whereas those of the SEM were 0.80, 0.77, 0.72, and 0.72, respectively. All the statistical metrics of the model validation showed signiﬁcant improvements in all seasons. The reduced biases of estimated PM 2.5 by CSEN indicated the effect of hygroscopic growth and aerosol properties affected by the meteorology on the relationship between AOD and estimated PM 2.5 concentrations, especially in winter and summer. The better performance of the CSEN model provides insight for air quality monitoring at different scales, which supplies important information for air pollution control policies and health impact analysis.


Introduction
Fine particulate matter (PM 2.5 ) consists of a complicated mixture of chemical compounds, mainly including organic carbon (OC), elemental carbon (EC), nitrate, sulfate, ammonium salt, sodium salt (Na +), and water [1,2]. Although Chinese government authorities have implemented strict atmospheric pollution control measures in recent years, significant pollution episodes and haze events still occur frequently in several regions in China. In most cases, the concentration of PM 2.5 in various areas can also reach values higher than the WHO standard (5 µg/m 3 ). PM 2.5 is related to various environmental and climate effects and adverse human health impacts [3][4][5][6][7]. Although nationwide ground-based PM 2.5 monitoring networks have already been implemented around China, generating larger-scale high-resolution PM 2.5 estimation is still challenging, hindering the understanding of the PM 2.5 variation at diverse spatial scales. Therefore, to further understand the spatiotemporal distribution, transport paths, and formation mechanisms of PM 2.5 , it

MODIS AOD Data
The MODIS sensors aboard NASA's satellites named Terra and Aqua have a return visit period of 1-2 days, offering daily near-global observations of the Earth. Terra and Aqua pass the equator at local times of approximately 10:30 am and 1:30 pm, respectively, therefore providing sustained daily monitoring of aerosols around the Earth. The 1-kmresolution MODIS AOD retrieved by the aerosol retrieval algorithm proposed by Lin et al. (2015) is used in this study [25]. The retrieved AOD is validated by AERONET version 3.0 level 1.5 at seven observed stations (Table A1) (see Appendix A). The positions of the seven observed stations are shown in Figure A1 (see Appendix A). Here, the observation times of MODIS and AERONET are consistent. Thus, we proposed a collocation method of spatiotemporal consistency. The space configuration standard is a 10 km radius at each AERONET site for a MODIS AOD value spatial average. AERONET AODs are interpolated to 0.55 µm to match the MODIS product wavelength and then temporally averaged within a window of ± 30 min of the satellite transit time. Figure A2

Ground Monitoring Data
The hourly concentrations of PM 2.5 data, measured at Air Quality Monitoring Stations (AQMS), in mainland China, Hong Kong, and Taiwan were obtained from the Ministry of Environmental Protection of China (MEPA), the Hong Kong Environmental Protection Department, and the Taiwan Environmental Protection Administration, respectively. Hourly measurements, including the RH and visibility (L), were obtained from the national China Meteorological Administration (CMA) surface observation network, which includes 297 global telecommunication system (GTS) stations in the study region. The locations of the PM 2.5 AQMS and GTS stations are presented in Figure 1a

The SEM
In the SEM, the aerosol hygroscopic growth effect, and the mass extinction efficiency (MEE) are the main elements linking the ground-level PM2.5 mass concentration and satellite-retrieved AOD [19,[25][26][27]]. To make the paper self-contained, we briefly summarize the key equations of the SEM model in this section. The details of the derivation process can be found in Lin et al. (2015) and   [19,25].
The physical relationship between the PM2.5 concentration and aerosol extinction coefficient at ground level driven by the hygroscopic growth can be described using Equa- The hourly PM 2.5 concentration and hourly meteorological parameters were measured at different stations over the study region. Since PM 2.5 mass concentration and meteorological parameter values generally change smoothly at the regional scale, PM 2.5 monitoring sites and GTS stations in a window of 5 km distance can be matched. For this study, the meteorological data at 11:00 am and 2:00 pm were extracted to match the passage times of Terra and Aqua, respectively. We found that 271 GTS observation sites could be matched with 301 PM 2.5 sites. As a result, each site pair (2 × 365 days) has a maximum of 730 valid data points. The meteorological and PM 2.5 data obtained from the 301 matching stations were administered as a calibration database applied to build the AOD-PM 2.5 model. The ground-measured PM 2.5 data at 11:00 am and 2:00 pm were used to validate the satellitederived hourly PM 2.5 data. The data from the remaining 1695 AQMS monitoring sites were used as a validation database to evaluate the accuracy of the estimated concentrations of PM 2.5 mass. Various monitoring stations are superimposed on a map (Figure 1a). This map shows that the PM 2.5 monitoring sites are concentrated in the main urban areas, whereas the coverage rate in rural areas is much lower.

The SEM
In the SEM, the aerosol hygroscopic growth effect, and the mass extinction efficiency (MEE) are the main elements linking the ground-level PM 2.5 mass concentration and satellite-retrieved AOD [19,[25][26][27]]. To make the paper self-contained, we briefly summarize the key equations of the SEM model in this section. The details of the derivation process can be found in Lin et al. (2015) and   [19,25].
The physical relationship between the PM 2.5 concentration and aerosol extinction coefficient at ground level driven by the hygroscopic growth can be described using Equation (1): where ext is the ground-level aerosol extinction coefficient, which is calculated based on the observed visibility. α ext,10 represents the reference mass extinction efficiency (MEE) of mixed aerosols at 0.55 µm under the condition of RH = RH 0 . RH 0 is the reference RH value, which generally is set at 40% to represent dry conditions. F represents the fine mode fraction (FMF), which is equal to the ratio of PM 2.5 and PM 10 mass concentration and can represent the effect of aerosol size distribution. Similar to the Hänel growth coefficient [24], γ is the coefficient of the integrated humidity effect (IHE).
α ext,10 F , denoted as K, represents the integrated mass extinction efficiency.
By assuming a negatively exponential form for the vertical distribution of the aerosol extinction coefficient, AOD can be derived using the following Equation (2), where H is the scale height, which indicates the effect of the aerosol vertical structure. At the individual GTS monitoring station, the H can be derived by the ratio of the AOD and ext, while the spatial ext can be derived according to the gridded AOD from satellite observations and the spatial interpolation of H.
Following Equation (1), the values of γ and K ( α ext,10 F ) at the monitoring stations can be calculated with the matched observed values of the ext, RH, and PM 2.5 concentrations. Thereafter, by spatial interpolation, the spatial variation in the PM 2.5 concentration can be estimated using Equation (3):

The WRF-Chem Model
The horizontal resolution of 9 km for the parent domain area (90 • E to 140 • E, 10 • N to 45 • N) was set for the WRF-Chem model analysis. Moreover, the domain was set up with grid sizes of 188 × 149. Our current model settings cover most of China, including the surrounding terrestrial and oceanic areas ( contains five sectors (i.e., industry, power plant, transportation, residential combustion, and agricultural activity) and more than 10 major atmospheric pollutants, including SO 2 , NOx, CO, NMVOC, NH 3 , PM 2.5 , PM 10 , BC, OC, and carbon dioxide emissions [37]. For biogenic species, the Model of Emissions of Gas and Aerosols from Nature was used [38]. From the surface to the upper air limit of 50 hPa, there are 38 vertical layers, of which 12 layers are located below 2 km to sufficiently represent the vertical structure of the boundary layer. The model configuration and performance in the study domain can be found in our previous publication [30].
The horizontal resolution of 9 km for the parent domain area (90°E to 140°E, 10°N to 45°N) was set for the WRF-Chem model analysis. Moreover, the domain was set up with grid sizes of 188 × 149. Our current model settings cover most of China, including the surrounding terrestrial and oceanic areas ( Figure 2). We obtained the emission inventory from the Multiresolution Emission Inventory for China (MEIC) and applied it with Final Operational Global Analysis (FNL) to run the WRF-Chem model in this research. The MEIC covers more than 700 anthropogenic sources of emissions in the Chinese mainland and contains five sectors (i.e., industry, power plant, transportation, residential combustion, and agricultural activity) and more than 10 major atmospheric pollutants, including SO2, NOx, CO, NMVOC, NH3, PM2.5, PM10, BC, OC, and carbon dioxide emissions [37]. For biogenic species, the Model of Emissions of Gas and Aerosols from Nature was used [38]. From the surface to the upper air limit of 50 hPa, there are 38 vertical layers, of which 12 layers are located below 2 km to sufficiently represent the vertical structure of the boundary layer. The model configuration and performance in the study domain can be found in our previous publication [30].

The CSEN Model
In the SEM, the parameters of aerosol characteristics (K and γ′) are generated based upon long-term observed data regression and have relatively low spatial and temporal resolutions. Therefore, it is difficult to establish a high-resolution PM2.5 inversion with aerosol parameters based on a large-scale ground-based observation network. We can expect diminished model performance for the sites far from the ground-based observation sites.
To improve the SEM accuracy and evaluate the effect of aerosol characteristic variations, a new method for PM2.5 estimation based upon coupling the SEM and the CTM was developed. As mentioned above, the CTM simulates both meteorological and aerosol

The CSEN Model
In the SEM, the parameters of aerosol characteristics (K and γ ) are generated based upon long-term observed data regression and have relatively low spatial and temporal resolutions. Therefore, it is difficult to establish a high-resolution PM 2.5 inversion with aerosol parameters based on a large-scale ground-based observation network. We can expect diminished model performance for the sites far from the ground-based observation sites.
To improve the SEM accuracy and evaluate the effect of aerosol characteristic variations, a new method for PM 2.5 estimation based upon coupling the SEM and the CTM was developed. As mentioned above, the CTM simulates both meteorological and aerosol chemical and physical effects based on an emission inventory. In this research, the WRF-Chem model was applied to provide simulated aerosol properties. As the WRF-Chem model uses emission information with high temporal and spatial resolutions driven by reasonable meteorology and basic chemistry, it is expected that the simulated overall aerosol property pattern is relatively more reliable compared to interpolation of site-based regression, although the uncertainty in absolute PM 2.5 mass concentration simulation at individual stations could be considerable.
Therefore, we first carried out a four-season simulation with WRF-Chem in this study. The model provides simultaneous data, including hourly ext, RH, and PM 2.5 concentrations at the same grid cells. Linear transformation of Equation (1) by log-transforming both sides at each grid cell, results in Equation (4) as follows: which exactly satisfies a linear regression form as shown in Equation (5). γ (i,j) and K (i,j) can be fitted by the slope and intercept from the linear regression at each model grid (i,j) since X (i,j) and Y (i,j) are the previously known values. Thus, γ (i,j) and K (i,j) can be fitted by simultaneous measurements of ext (i,j) RH (i,j) , and PM 2.5(i,j) mass at the same grid. The aerosol properties were derived once from each grid cell using the simulation of the individual months in this study.
Thus, the gridded γ and K parameters based on CTM data can be derived. Combining the observational ext and RH data, the PM 2.5 mass concentration can be evaluated by Equation (6) for the whole domain.
In summary, the spatial map of ext (i,j) and RH (i,j) can be achieved by the interpolation of hourly ext (i,j) and RH (i,j) at GTS observational stations. Hour-specific AOD gridded data are retrieved from the MODIS satellite, and H is calculated and interpolated based on the observational visibility [24,25], while γ and K are gridded data estimated based on the WRF-Chem simulation for each grid cell and for each month. The temporal and spatial resolutions of the gridded γ and K could be further improved for downscaling issues.

Evaluation of Aerosol Characteristics from WRF-Chem: γ and K
According to Equations (4) and (5), the two integrated parameters of aerosol properties can be derived by using the WRF-Chem data. The IHE of PM 2.5 explains how visibility deteriorates at a given PM 2.5 level if the air humidity is higher than normal, which reflects the dependence of the integrated light extinction ability of PM 2.5 on RH. γ is the coefficient of IHE. As demonstrated by Lin et al. (2015) [19], in addition to the RH dependence of the hygroscopic growth effect of aerosol, we also include the RH dependence on FMF and MEE factors, since the factors are also likely to depend on RH due to the variation in the aerosol characteristics in various meteorological conditions.
The results presented in Figure 3 show that the γ values are affected by both anthropogenic emissions and meteorological conditions. For the transition seasons (represented by October and April), the overall absolute levels and spatial pattern of γ from the SEM and WRF-Chem are similar, and the spatial pattern seems to be closely associated with emissions [39,40]. We observed higher values of γ in the eastern parts of China than in other regions. Guangxi, Jiangxi, Anhui, Shaanxi, Jiangsu, and Zhejiang were associated with significantly larger γ values than other areas, related to the significant industrial development in these areas in recent years. The lower values observed in the northern and central areas and the Tibetan Plateau in China are mainly due to higher dust or biomass burning aerosol loads [41], which are generally less hygroscopic than industrial aerosols [42]. A comparison of the two results shows that, although the spatial patterns are similar, the CTM results provide more details of the distribution of γ values at a much higher spatial resolution. However, regarding the spatial pattern in the winter and summer seasons (represented by January and July), the two γ results show considerable differences. Areas with the highest γ in winter in the CTM results are missing in the SEM results. Remote Sens. 2022, 14, x FOR PEER REVIEW 8 of 17 K represents the integrated mass extinction efficiency, which is determined by the ratio of MEE and FMF [25]. As shown in Figure 4e-h, the MEE value is not sensitive to different scattering aerosol species [49]. Higher K values are observed in the southwest (Yunnan-Guizhou Plateau), northwest (Gansu), and some coastal areas located in eastern parts of China (i.e., Zhejiang), which are probably more affected by higher loading of coarse particles. The desertification in the Yunnan-Guizhou Plateau [50], the dust area in Gansu, and the increased sea salt aerosols in coastal areas may lead to a lower FMF [51] and thereafter a higher K value there. Overall, the spatial distribution patterns of the two models were comparable. However, a more detailed spatial distribution of K values was observed for the WRF-Chem results compared with the SEM in all seasons.

Statistical Results
All the daily PM2.5 measurements from 1695 monitoring stations were used as validation to evaluate the CSEN model performance. Figure 5 shows the scatter plots between the observations and the estimated PM2.5 mass concentrations of the CSEN and the SEM. Regarding the seasonal variation in γ in the whole study domain, we observed generally higher γ values in most of the southern areas in October, generally higher γ values in most of the northern areas in July, the lowest γ values in the whole domain in spring, and a generally low γ background but highest γ values in specific areas, especially in JJJ (Beijing-Tianjin-Hebei) and Shandong Province in winter. Generally, the widely higher γ values in autumn and summer reflect the strong RH dependence of the IHE in the wet season, while a lower background of γ reflects the weaker RH dependence of the IHE in the dry season. Nevertheless, in specific regions, such as JJJ and Shandong Province, the IHE can be significantly influenced by meteorological conditions. More specifically, the dry season (i.e., spring and winter) is usually marked by lower humidity, frequent cold fronts, and high wind speeds, which favor the formation of fugitive dust and its advection [43][44][45]. Because of this fugitive dust or dust storms, the average effective radius of particles in the dry season is larger, and the particles are generally less hygroscopic than the fine particles [44][45][46][47]. As a result, the lowest γ values in the whole domain occurred in spring, and a generally low γ background occurred in winter. The significantly high γ values were observed in the JJJ and Shandong areas in January, which should be related to the widely reported regional extreme haze events that frequently occur in the winter [48]. During these extreme haze events, in addition to the pollution accumulation due to stagnation and the significant decrease in the PBLH, abnormal southerly winds bring airmasses laden with pollutants and moisture. The extreme haze events result in an enhanced FMF, fine-mode effective radius, and volume concentration of PM 2.5 due to the hygroscopic growth effect of the particles and, after that, an increase in aerosol extinction and total AOD [43,44]. However, this significant hygroscopic effect associated with abnormal meteorological conditions, as shown in Figure 3a, cannot be reflected in the SEM results (Figure 3e), which is likely due to the low resolution and poor representation of the observation stations.
In spring (Figure 3b), the lowest γ might be associated with coarser particle generation, mainly due to the influence of more fugitive dust and dust storm events leading to a further decrease in aerosol hygroscopicity. During the wet season (i.e., summer and autumn), humid weather conditions are likely to be associated with a lower surface wind speed and sufficient humidity conditions conducive to the larger hygroscopic effect of aerosols. The higher γ values observed in the northern part than in the southern part of eastern China in summer (and vice versa in autumn) might be due to the variation in the relative abundance of precipitation.
K represents the integrated mass extinction efficiency, which is determined by the ratio of MEE and FMF [25]. As shown in Figure 4e-h, the MEE value is not sensitive to different scattering aerosol species [49]. Higher K values are observed in the southwest (Yunnan-Guizhou Plateau), northwest (Gansu), and some coastal areas located in eastern parts of China (i.e., Zhejiang), which are probably more affected by higher loading of coarse particles. The desertification in the Yunnan-Guizhou Plateau [50], the dust area in Gansu, and the increased sea salt aerosols in coastal areas may lead to a lower FMF [51] and thereafter a higher K value there. Overall, the spatial distribution patterns of the two models were comparable. However, a more detailed spatial distribution of K values was observed for the WRF-Chem results compared with the SEM in all seasons. K represents the integrated mass extinction efficiency, which is determined by the ratio of MEE and FMF [25]. As shown in Figure 4e-h, the MEE value is not sensitive to different scattering aerosol species [49]. Higher K values are observed in the southwest (Yunnan-Guizhou Plateau), northwest (Gansu), and some coastal areas located in eastern parts of China (i.e., Zhejiang), which are probably more affected by higher loading of coarse particles. The desertification in the Yunnan-Guizhou Plateau [50], the dust area in Gansu, and the increased sea salt aerosols in coastal areas may lead to a lower FMF [51] and thereafter a higher K value there. Overall, the spatial distribution patterns of the two models were comparable. However, a more detailed spatial distribution of K values was observed for the WRF-Chem results compared with the SEM in all seasons.

Statistical Results
All the daily PM2.5 measurements from 1695 monitoring stations were used as validation to evaluate the CSEN model performance. Figure 5 shows the scatter plots between the observations and the estimated PM2.5 mass concentrations of the CSEN and the SEM.

Validation of the Estimated PM 2.5 3.2.1. Statistical Results
All the daily PM 2.5 measurements from 1695 monitoring stations were used as validation to evaluate the CSEN model performance. Figure 5 shows the scatter plots between the observations and the estimated PM 2.5 mass concentrations of the CSEN and the SEM.
Based on the SEM method, the correlation coefficients (R) between the observed and estimated values were 0.82, 0.77, 0.75, and 0.78, with moderate biases (root mean square error (RMSE) = 21.01, 11.63, 12.21, and 7.11 µg/m 3 ) for January, April, July, and October, respectively. For the CSEN method, on the other hand, the R values between the observed and estimated values were 0.92, 0.82, 0.84, and 0.83, with a much lower model bias (RMSE = 13.71, 8.19, 5.59, and 6.26 µg/m 3 ) for January, April, July, and October, respectively. All the slopes based upon the CSEN model are closer to 1.
The mean absolute error (MAE) and mean relative error (MRE) between the predictions and observations for the four seasons were also calculated. The comparison of the four seasons of the two methods is shown in Table 1. Significant improvement in the PM 2.5 estimation by CSEN compared to that of SEM is clearly shown. In addition, both results are better than the CTM performance for PM 2.5 (the RMSE of January on average is greater than 21 µg/m 3 and R is lower than 0.6 [30]).
Based on the SEM method, the correlation coefficients (R) between the observed and estimated values were 0.82, 0.77, 0.75, and 0.78, with moderate biases (root mean square error (RMSE) = 21.01, 11.63, 12.21, and 7.11 μg/m 3 ) for January, April, July, and October, respectively. For the CSEN method, on the other hand, the R values between the observed and estimated values were 0.92, 0.82, 0.84, and 0.83, with a much lower model bias (RMSE = 13.71,8.19,5.59, and 6.26 μg/m 3 ) for January, April, July, and October, respectively. All the slopes based upon the CSEN model are closer to 1. The mean absolute error (MAE) and mean relative error (MRE) between the predictions and observations for the four seasons were also calculated. The comparison of the four seasons of the two methods is shown in Table 1. Significant improvement in the PM2.5 estimation by CSEN compared to that of SEM is clearly shown. In addition, both results are better than the CTM performance for PM2.5 (the RMSE of January on average is greater than 21 µ g/m 3 and R is lower than 0.6 [30]).

The Seasonal Variation in Spatial Distribution
The seasonal variations (represented by January, April, July, and October) in satelliteretrieved PM 2.5 with the corresponding near-surface observed data from all the sites in eastern China are shown in Figure 6. Here, the monthly averaged PM 2.5 estimation was calculated based on the result of the daily PM 2.5 estimation. The results show that the PM 2.5 levels are much higher in northern China than in other regions. According to the estimated results in Figure 6a-d, the PM 2.5 pollution still shows regional properties, as reported by previous studies [8,25]. Regional heating and the industrial sectors are mainly responsible for pollution episodes. The highest pollution levels occurred in winter, followed by autumn, spring, and summer. The modeling results in Figure 6a-h illustrate that the highest pollution levels are mainly concentrated in Northern China Plain (NCP) areas, consistent with the results from the ground-based monitoring stations (Figure 6i-l).
The PM 2.5 concentration levels estimated by the SEM method in the NCP areas significantly underestimated the observed concentration levels during winter. We also observed a severe overestimation of the SEM results for the same region during autumn. The CSEN model, on the other hand, provided a more robust result with higher levels of accuracy, especially in the major polluted area. The spatial variation in the PM 2.5 deviation as a percentage between the evaluated PM 2.5 average and ground-based PM 2.5 average, including January, April, July, and October, is shown in Figure 7. PM 2.5 mass concentrations estimated by the CSEN model exhibit much smaller MRE values than those analyzed by SEM. The MRE values calculated for the ground-based monitoring stations were within the range from −50% to +50%, which were on average 30% smaller, especially during winter and summer.

The Seasonal Variation in Spatial Distribution
The seasonal variations (represented by January, April, July, and October) in satelliteretrieved PM2.5 with the corresponding near-surface observed data from all the sites in eastern China are shown in Figure 6. Here, the monthly averaged PM2.5 estimation was calculated based on the result of the daily PM2.5 estimation. The results show that the PM2.5 levels are much higher in northern China than in other regions. According to the estimated results in Figure 6a-d, the PM2.5 pollution still shows regional properties, as reported by previous studies [8,25]. Regional heating and the industrial sectors are mainly responsible for pollution episodes. The highest pollution levels occurred in winter, followed by autumn, spring, and summer. The modeling results in Figure 6a-h illustrate that the highest pollution levels are mainly concentrated in Northern China Plain (NCP) areas, consistent with the results from the ground-based monitoring stations (Figure 6i-l). The PM2.5 concentration levels estimated by the SEM method in the NCP areas significantly underestimated the observed concentration levels during winter. We also observed a severe overestimation of the SEM results for the same region during autumn. The CSEN model, on the other hand, provided a more robust result with higher levels of accuracy, especially in the major polluted area.
The spatial variation in the PM2.5 deviation as a percentage between the evaluated PM2.5 average and ground-based PM2.5 average, including January, April, July, and October, is shown in Figure 7. PM2.5 mass concentrations estimated by the CSEN model exhibit much smaller MRE values than those analyzed by SEM. The MRE values calculated for the ground-based monitoring stations were within the range from −50% to +50%, which were on average 30% smaller, especially during winter and summer.
Overall, significant improvements were observed in PM2.5 estimations by the CSEN model specifically for January and July in terms of statistics and better spatial distribution patterns. This better agreement should be attributed to the significant difference in the estimated γ′ (especially in northern China) between WRF-Chem and the pure SEM

The Seasonal Variation in Major Clusters
To further explain the regional PM2.5 mass concentration variations, we mainly focused on four key clusters in China: Beijing-Tianjin, YRD, Sichuan, and PRD ( Figure 8 and Table 2). According to Figure 8, the PM2.5 concentrations retrieved by the two methods vary significantly among different regions. January was the most polluted month in all four critical urban clusters in China. In the Beijing-Tianjin area, the PM2.5 concentrations Overall, significant improvements were observed in PM 2.5 estimations by the CSEN model specifically for January and July in terms of statistics and better spatial distribution patterns. This better agreement should be attributed to the significant difference in the estimated γ (especially in northern China) between WRF-Chem and the pure SEM method (Figure 3a-c). On the other hand, the better agreement, and the reduced biases in the validation results, especially in winter and summer, showed that the CTM γ in the CSEN method can provide more realistic aerosol properties, significantly improving model performance.

The Seasonal Variation in Major Clusters
To further explain the regional PM 2.5 mass concentration variations, we mainly focused on four key clusters in China: Beijing-Tianjin, YRD, Sichuan, and PRD ( Figure 8 and Table 2). According to Figure 8, the PM 2.5 concentrations retrieved by the two methods vary significantly among different regions. January was the most polluted month in all four critical urban clusters in China. In the Beijing-Tianjin area, the PM 2.5 concentrations dropped from 58.85 µg/m 3 in January to 38.27 µg/m 3 in July, increasing slightly to 40.53 µg/m 3 in October. The mean PM 2.5 concentrations retrieved by the CSEN model performed better than those by the SEM in all four months, especially in January. On the other hand, neither model performed well in October, with MREs of −20.94% and −23.71%, respectively. In the PRD region, the mean PM 2.5 concentration retrieved by the SEM method is seriously overestimated, with MREs of 44.32%, 39.57%, and 38.76% in January, April, and July, respectively. In the Sichuan area, both methods underestimated the PM 2.5 concentration in January, although the CSEN provided a closer value to the ground-based monitoring stations. In the YRD region, the results from the CSEN method are slightly overestimated in January (2.84%) and July (1.66%) and underestimated in April (−2.61%) and October (−6.36%). In contrast, the results of the SEM are underestimated in January (−0.49%) and October (−4.02%) and overestimated in April (12.57%) and July (29.08%). The overall results obtained with the CSEN method provide more realistic values when compared with the ground-based monitoring network data in the four clusters.

Discussion and Conclusions
Large-scale high-resolution measurements of PM 2.5 are fundamental to investigating various environmental, climate, and adverse human health impacts. Although nationwide ground-based PM 2.5 monitoring networks are already implemented around China, limited by the nature of in situ observations, reaching high-resolution PM 2.5 retrievals at a larger scale is still challenging and hinders understanding the variability of PM 2.5 at different spatial scales. State-of-the-art statistical and machine learning models combining ground observations and satellite AOD have been applied to provide PM 2.5 estimation. However, most of these are limited in the physical understanding between the model inputs and outputs.
The physical-based SEM proposed in our previous study showed that the PM 2.5 estimation accuracy could be greatly improved by combining the aerosol characteristics into the physical model. In the SEM, the humidity coefficient (γ ) and an integrated reference value (K) of aerosol characteristics could be obtained based on matched in situ meteorological and air quality data, which are very sparse in spatial resolution and therefore limit the resolution of PM 2.5 estimation.
Therefore, in this research, we developed a new method for estimating surface PM 2.5 by coupling the SEM and a CTM-based numerical model (CSEN). The CSEN takes advantage of the WRF-Chem model in providing high spatial resolution of aerosol properties to optimize the aerosol parameters (γ and K) required for the PM 2.5 estimation. Four months of air quality were simulated with the WRF-Chem model to acquire the variation in aerosol property parameters (γ and K).
Comparison of γ values and K values between the two methods show that their overall spatial patterns are comparable, but the new method exhibits a significantly improved resolution, and there is a significant difference in the γ values in winter and summer. Validation results indicated that the correlation coefficient between the observed and the estimated values increased to 0.92, 0.82, 0.84, and 0.83 with a much lower model bias by CSEN, from 0.82, 0.77, 0.75, and 0.78, with a moderate bias by SEM for January, April, July, and October, respectively. In addition, the RMSE was reduced by between 0.85 and 7.3 µg/m 3 .
The CSEN PM 2.5 estimation is significantly improved in all seasons, especially in winter and summer, which is because the CTM provides a more complete and more realistic spatial distribution of aerosol properties with emission data at high spatial-temporal resolution based on reasonable meteorology. However, the uncertainty in the absolute PM 2.5 concentration simulated at individual stations could be primarily due to the existing model errors and uncertainties. The results prove that CSEN is valuable for PM 2.5 evaluation at urban and regional scales, especially for regions lacking ground measurements. It also provides valuable tactics for air pollution control strategies and health risk assessment tactics.
Limitations remain in CSEN estimation mainly due to the following: (1) In this work, we did not fill pixel gaps of AOD due to the existence of thick clouds, which might reduce the model performance at these pixels. (2) The number of PM 2.5 observed stations was limited, and the PM 2.5 observation stations were irregularly distributed, which might reduce the performance of the model far from the monitoring station area. (3) Although the results show that the CTM could provide a reasonably high spatial-temporal resolution for γ and K values driven by emission and meteorology, the spatial resolution and accuracy of meteorological field and aerosol parameters simulated by the WRF-Chem model are still affected by model errors and uncertainties, which could be further improved by, for example, more reliable parameterization schemes for the formation and the gas-particle conversion of secondary aerosols. (4) The γ and K were fitted by one-month simulations in this study to acquire the local average aerosol properties. The daily and synoptic scale variation in aerosol properties due to the transport impact and abrupt emission variation could be smoothed by using monthly period regression.  Figure A1. Distribution of the AERONET sites. Figure A1. Distribution of the AERONET sites.