Estimating PM 2.5 Concentrations Using the Machine Learning RF-XGBoost Model in Guanzhong Urban Agglomeration, China

: Fine particulate matter (PM 2.5 ) is a major pollutant in Guanzhong Urban Agglomeration (GUA) during the winter, and GUA is one of China’s regions with the highest concentrations of PM 2.5 . Daily surface PM 2.5 maps with a spatial resolution of 1 km × 1 km can aid in the control of PM 2.5 pollution. Thus, the Random Forest and eXtreme Gradient Boosting (RF-XGBoost) model was proposed to ﬁll the missing aerosol optical depth (AOD) at the station scale before accurately estimating ground-level PM 2.5 using the recently released MODIS AOD product derived from Multi-Angle Implementation of Atmospheric Correction (MAIAC), high density meteorological and topographic conditions, land-use, population density, and air pollutions. The RF-XGBoost model was evaluated using an out-of-sample test, revealing excellent performance with a coefﬁcient of determination (R 2 ) of 0.93, root-mean-square error (RMSE) of 12.49 µ g/m 3 , and mean absolution error (MAE) of 8.42 µ g/m 3 . The result derived from the RF-XGBoost model indicates that the GUA had the most severe pollution in the winter of 2018 and 2019, owing to the burning of coal for heating and unfavorable meteorological circumstances. Over 90% of the GUA had an annual average PM 2.5 concentrations decrease of 3 to 7 µ g/m 3 in 2019 compared to the previous year. Nevertheless, the air pollution situation remained grim in the winter of 2019, with more than 65% of the study area meeting the mean PM 2.5 values higher than 35 µ g/m 3 and the maximum reaching 95.57 µ g/m 3 . This research would be valuable for policymakers, environmentalists, and epidemiologists, especially in urban areas.


Introduction
PM 2.5 , or fine particulate matter with aerodynamic diameters of 2.5 µm or less, not only exacerbates photochemical pollution and climate change but also substantially influences human health and quality of life [1].The increase in concentration of PM 2.5 is caused by cases of natural phenomenon, including dust intrusion, volcano eruptions, forest fires, and the everyday life and production activity of mankind [2,3].China has experienced rapid economic growth in recent years that undoubtedly has contributed to the corresponding increase of PM 2.5 concentrations due to intensive human activity and huge industrial emissions [4].Since 2010, China has successively promulgated the "12th Five-Year Plan" for National Environment Protection [5], the Air Pollution Prevention and Control Action Plan [6], and the Blue Sky Protection Campaign [7] to limit the number of days with high levels of air pollution.China completed 1436 monitoring stations in 338 cities [8], and the number of days with adequate air quality grew by 5.8% countrywide by 2020 compared to 2015 [9].Nevertheless, some areas in China, such as Beijing-Tianjin-Heibei and Guanzhong Urban Agglomeration (GUA), have still been suffering from PM 2.5 pollution, especially the GUA, which was subjected to pollution for approximately 29.4% of all days in 2020 [10], showing that PM 2.5 must be monitored continuously and accurately.However, the established PM 2.5 observation network cannot fully capture the spatial distribution over large areas because the monitoring stations are primarily located in urban areas.As a result, various models were investigated to estimate ground-level PM 2.5 concentrations based on the relationship between satellite remote sensing aerosol optical depth (AOD) and PM 2.5 concentrations, frequently in conjunction with meteorological conditions and land use, etc. [11].
Machine learning and deep learning, multiple linear regression and Least Absolute Shrinkage and Selection Operator [12], decision tree [13], Random Forest ( [14,15], Ad-aBoost [16], eXtreme Gradient Boosting (XGBoost) [17,18], Gradient Boost Regression Tree (GBRT) [19], Gradient Boosting Machines (GBM) [20], neural network (Arhami et al., 2013) convolutional neural networks [21][22][23], the Elman Neural Network (ENN) and Back Propagation Neural Network (BPNN) [19], have been increasingly used in recent years to predict PM 2.5 concentrations due to their superior capabilities in data mining and excellent model performance, such as improved machine learning algorithms calculating spatial and temporal information as input variables when the models were trained to incorporate spatiotemporal heterogeneity between AOD and PM 2.5 [24][25][26][27].Comparatively, tree-based machine learning, specifically the ensemble algorithm, which includes bagging, boosting, and stacking and can accurately capture non-linear relationships and interactions between predictors, has demonstrated a significant improvement in model performance with higher R 2 and lower RMSE and MAE [28].In particular, XGBoost, as a typical ensemble algorithm, performs efficiently with far fewer resources [9].
Satellite-based AOD, with several spatial and temporal resolutions generated from various sensors using different algorithms, is an extremely useful and indicative variable to assess PM 2.5 exposure.The latest AOD product derived from the MAIAC algorithm [29] was increasingly employed due to its high spatial resolution of 1 km × 1 km.The drawback of AOD, which severely restricts the further improvement of model performance, is the considerable number of missing values caused by cloud cover and high surface reflectivity [30].The association among outputs from the GEOS-Chem model, land-use, and non-missing AOD were explored to predict the missing AOD values, which eventually improved the model performance [22].In addition, predictions and fusions of MAIAC AOD from Terra and Aqua based on spatial and temporal smoothing [31] and linear regression approaches [15,32] also contributed to the optimization of prediction power.Previous studies have demonstrated that AOD is strongly influenced by natural emissions, meteorological conditions (e.g., temperature, humidity), and anthropogenic pollution [33,34].
In almost all surface PM 2.5 exposure estimating models, the meteorological condition is another important variable affecting air pollution formation, transport, and deposition [35], especially in regards to temperature and precipitation [36,37].However, the spatial resolution of meteorological data was much lower than that of the PM 2.5 concentrations maps, which could limit the effect of modeling [20].
This research made an attempt to integrate the RF and XGBoost model to estimate PM 2.5 concentrations of GUA where the atmospheric pollution is the most serious in China.The objectives of this study are as follows: (1) to forecast the missing AOD on the monitoring station scale using RF base on the relationship between the existing AOD and factors including meteorology, the Digital Elevation Model (DEM), Global Land Cover (GLC), and People Density (PD), land use (LU), CO, SO 2 , NO 2 , and PM 10 , and to examine the feature importance and select variables using XGBoost before modelling; (2) to generate 1 km PM 2.5 concentrations maps and analyze the spatio-temporal variation of PM 2.5 in GUA; (3) to investigate the possible causes of PM 2.5 pollution to assist with environmental management.

Study Area
GUA is the largest group located between 33.10 • and 37.16 • North and 104.58 • and 112.57• East in central China, covering western Shanxi Province, the middle part of Shaanxi Province, and eastern Gansu Province (Figure 1).It is characterized by weak atmospheric dispersion [38] due to the embracing of Qinling mountain in the south and Loess Plateau in the north.According to the seventh National Census, the study area has a population of around 41 million people, with one mega-city, Xi'an, and nine medium-sized cities, including Xianyang, Baoji, Weinan, Tongchuan, Linfen, Yuncheng, Qingyang, Pingliang, and Tianshui.The yearly average PM 2.5 concentrations in cities throughout GUA exceeds the standards II (35 µg/m 3 ) set by China's Ministry of Ecology and Environment based on the Bulletin of Ecological Environment from 2018 to 2020.
generate 1 km PM2.5 concentrations maps and analyze the spatio-temporal variation of PM2.5 in GUA; (3) to investigate the possible causes of PM2.5 pollution to assist with environmental management.

Study Area
GUA is the largest group located between 33.10° and 37.16° North and 104.58° and 112.57°East in central China, covering western Shanxi Province, the middle part of Shaanxi Province, and eastern Gansu Province (Figure 1).It is characterized by weak atmospheric dispersion [38] due to the embracing of Qinling mountain in the south and Loess Plateau in the north.According to the seventh National Census, the study area has a population of around 41 million people, with one mega-city, Xi'an, and nine mediumsized cities, including Xianyang, Baoji, Weinan, Tongchuan, Linfen, Yuncheng, Qingyang, Pingliang, and Tianshui.The yearly average PM2.

Ground Measurements
Hourly data for average PM2.5, PM10, SO2, CO, and NO2 mass concentrations for the years 2018-2020 were downloaded from China National Environmental Monitoring Centre (http://www.cnemc.cn(accessed on 15 October 2021)).56 air pollution stations were unevenly and sparsely distributed in the study area, which were mainly set in urban areas (Figure 1).The missing, abnormal, and repeated values were examined and removed before calculating monthly, seasonal, and yearly mean PM2.5 concentrations.

Data Source 2.2.1. Ground Measurements
Hourly data for average PM 2.5 , PM 10 , SO 2 , CO, and NO 2 mass concentrations for the years 2018-2020 were downloaded from China National Environmental Monitoring Centre (http://www.cnemc.cn(accessed on 15 October 2021)).56 air pollution stations were unevenly and sparsely distributed in the study area, which were mainly set in urban areas (Figure 1).The missing, abnormal, and repeated values were examined and removed before calculating monthly, seasonal, and yearly mean PM 2.5 concentrations.

MODIS AOD
The Terra and Aqua MODIS C6 daily 1-km MCD19A2 products derived from the MAIAC algorithm were employed in this study.The AOD data from 2018 to 2020 were extracted by the PM 2.5 monitoring sites from Google Earth Engine (GEE) (https: //code.earthengine.google.com(accessed on 15 October 2021)), using the mean value of the three or four bands at the wavelength of 550 nm, and extreme values (above 4) were excluded [22] from modeling.Because of bright surfaces, cloud contamination and extreme values, approximately 69% of the AOD data was missing [39], which greatly impacted the model's performance.

Meteorological Conditions
The meteorological data of 1677 meteorological observation stations (Figure 1) were provided by Shaanxi Meteorological Bureau (http://sn.cma.gov.cn/(accessed on 15 October 2021)), which included coordinates, maximum temperature, minimum temperature, average temperature, total precipitation accumulation from 8 pm to 8 pm, total precipitation accumulation from 8 am to 8 am, evaporation, 2-min average wind speed, average relative humidity and sunshine hours (Table 1).Extreme weather related variables, such as 2-min average wind speed greater than 50 m/s and sunshine time greater than 14 h, were excluded from the dataset.The daily meteorological maps were interpolated by inverse distance weight (IDW) method in the python platform with a spatial resolution of 1 km × 1 km and a temporal resolution of 1 day.October 2021)).PD was obtained from GEE with the spatial resolution 0.009 • × 0.009 • .All the auxiliary data were resampled to the spatial resolution of 1 km × 1 km to match the AOD data.

Methodology 2.3.1. Data Integration
In this study, the AOD, meteorological conditions, and auxiliary data were reprojected to a uniform projection of WGS84.The coordinates of PM 2.5 monitoring sites were used to extract all independent variables across temporal-spatial domain in the python platform.Moreover, the date (day of the year (DOY)) and coordinates were included in the dataset due to the spatial and temporal heterogeneity of PM 2.5 .Thus, a dataset with 48,548 samples and 21 variables was constituted.However, among the samples, 33,415 samples had no AOD values due to the high reflectance surfaces and clouds.

Fill Missing AOD with RF
AOD was influenced by meteorology and land use, according to a prior study [30].RF, as a machine learning method with randomly sampling and independently building trees, is more robust to noise [40]; thus, it was used to investigate the relationship between AOD and other variables (meteorology, GLC, DEM, and PD).
The dataset was divided into two distinct groups; a group with AOD (GWA) and a group of missing AOD (GMA).The former had 15,133 samples and was used to build an optimal model using RF to estimate AOD values, while the latter applied the optimal model to predict and assign AOD values for the 33,415 samples.The meteorological data, GLC, DEM, PD and air pollution variables (except for PM 2.5 ) were used in both model building and estimation stages for the two groups.Firstly, GWA was randomly split into the training and testing datasets with proportions of 80% and 20%, respectively.Secondly, an initial model was trained using RF with default hyper parameters for the training dataset, in which the AOD was the label (dependent variable) and meteorology, DEM, GLC, LU, PD, CO, SO 2 , NO 2 , and PM 10 were the features (independent variables).The grid search based on the 10-fold cross-validation approach was then utilized to modify the hyper parameters by comparing the coefficient of determination (R 2 ) of the model.The optimal RF model was obtained when the value of R 2 was high and stable.Thirdly, all the data of the testing dataset was input into the optimal RF model to estimate AOD values, which were compared with the measured AOD values by R 2 , root-mean-square error (RMSE), and intercept and slope of the regression line to evaluate the accuracy of the model.To obtain a model with better generalization, methods such as reducing the number and depth of the trees were adopted to simplify the model when the model was over-fitting.On the contrary, the complexity of the model was increased when the model was under-fitting.Finally, all the data of GMA, including the meteorology, DEM, GLC, LU, PD, CO, SO 2 , NO 2 , and PM 10 were put into the optimal RF model, and all the missing AOD values were estimated.

XGBoost Model and Feature Selection
XGBoost, a tree boosting machine learning system, has been widely used by researchers in a wide range of fields with satisfactory results [9].In particular, during recent years, several studies have applied the XGBoost algorithm for air quality purposes [41][42][43][44].XGBoost could efficiently prevent over-fitting via three methods-regularized objective, shrinkage, and column subsampling-resulting in more accurate prediction.A regularized objective aids in penalizing complexity and selecting a basic and effective model.Shrinkage reduces the influence of each particular tree and leaves room for future trees to enhance the model.Column subsampling, which is less memory consuming when training a model, is more efficient than row subsampling to avoid over-fitting [40].For computation efficiency, the exact greedy algorithm combined with a weighted quantile sketch algorithm and a novel sparsity-aware algorithm based on the block structure aids in reducing computation time and memory consumption when developing a model [45].Nevertheless, the XGBoost model performs well in existing literature compared to other ensemble algorithms, such as RF, GBDT, highly randomized trees (ERT), and GBM.
For all the 48,548 samples, the Pearson correlation coefficients (r) and p_values between the 17 selected variables (AOD, meteorology, DEM, GLC, LU and PD) and PM 2.5 concentrations were calculated and analyzed (Table 2).The air quality variables (CO, SO 2 , NO 2 , and PM 10 ) were not considered in this step due to the sparsely distributed station.The correlation coefficients between PM 2.5 and five variables are positive, most of which are very low particularly for GLC (0.09), LU (0.07), and PD (0.02).The r of satellite-derived AOD was significantly higher than other variables, which is important in estimating PM 2.5 (Figure 2).In contrast, the remaining twelve variables show negative correlation coefficients, especially for TEM_AVG, TEM_Min, TEM_Max, and EVP, with r of −0.48, −0.47, −0.47, and −0.35, respectively.The latitude (Lat) variable, with r and p_values of −0.00 and 0.26, was removed from the dataset due to the extremely low correlation between Lat and surface PM 2.5 concentrations.Four steps were implemented to obtain an optimum model for estimating ground-level PM 2.5 concentrations.First, the XGBoost model was used to calculate feature importance using all variables with potential influence on surface PM 2.5 to preliminarily fit the model and calculate the contribution of each variable, and the feature scores were then re-arranged in descending order (Figure 3).Second, the RMSE and R 2 were employed to examine the accuracy of XGBoost model when the features were recursively eliminated as described by previous research [17].When GLC, PD and LU were removed from the dataset, the model's performance remained almost constant with only a minor decrease in R 2 and RMSE; thus, these three variables were excluded from the XGBoost model and the remaining variables were used to train the model.Third, the dataset was divided into two groups-the first group, with the years 2018-2019, was prepared as the dataset for modeling (DFM), and the second group in 2020 was used as the dataset for predicting (DFP).Fourth, the DFM was split into a sample-based training set and a sample-based testing set with proportions of 80% and 20%, respectively.The training set was used to train and validate the model, while the testing set was used to verify the model.The built-in functions, KFold and CV from the scikit-learn and XGBoost library packages, were used to tune the hyperparameters, which considerably impacted the model's performance and resilience.

Model Evaluation
In this study, the performance of the RF and XGBoost models was evaluated using the statistical metrics R 2 , RMSE, and MAE in conjunction with three universally accepted and independent validation approaches, namely sample, spatial, and temporal-based testing.
The dataset was randomly split into the sample, spatial, and temporal-based training and testing datasets taking the index, site index, and DOY as basic factors, with the proportions of training and testing sets as 80% and 20%, respectively.The dataset was GWA when verifying the ability of the RF model to estimate AOD, whereas the dataset was DFM when verifying the ability of the XGBoost model to estimate PM 2.5 concentrations.The sample, spatial, and temporal-based training datasets were used to train the RF and XGBoost model with the tuned hyper-parameters.In contrast, the testing dataset was used to evaluate the model's performance, which may lead to a comprehensive understanding of the RF and XGBoost model's performance.After that, the measured AOD and PM 2.5 values were compared with the predicted values by the above metrics and scatter plot with a linear regression line.
trees to enhance the model.Column subsampling, which is less memory consuming when training a model, is more efficient than row subsampling to avoid over-fitting [40].For computation efficiency, the exact greedy algorithm combined with a weighted quantile sketch algorithm and a novel sparsity-aware algorithm based on the block structure aids in reducing computation time and memory consumption when developing a model [45].Nevertheless, the XGBoost model performs well in existing literature compared to other ensemble algorithms, such as RF, GBDT, highly randomized trees (ERT), and GBM.
For all the 48,548 samples, the Pearson correlation coefficients (r) and p_values between the 17 selected variables (AOD, meteorology, DEM, GLC, LU and PD) and PM2.5 concentrations were calculated and analyzed (Table 2).The air quality variables (CO, SO2, NO2, and PM10) were not considered in this step due to the sparsely distributed station.The correlation coefficients between PM2.5 and five variables are positive, most of which are very low particularly for GLC (0.09), LU (0.07), and PD (0.02).The r of satellite-derived AOD was significantly higher than other variables, which is important in estimating PM2.5 (Figure 2).In contrast, the remaining twelve variables show negative correlation coefficients, especially for TEM_AVG, TEM_Min, TEM_Max, and EVP, with r of −0.48, −0.47, −0.47, and −0.35, respectively.The latitude (Lat) variable, with r and p_values of −0.00 and 0.26, was removed from the dataset due to the extremely low correlation between Lat and surface PM2.5 concentrations.The trained RF-XGBoost model was then used to predict PM 2.5 concentrations for each grid cell with a spatial resolution of 1 km × 1 km for each day.In addition, the monthly, seasonal, and annual PM 2.5 maps were generated by averaging each variable and then inputting the average values into the model.

Model Evaluation
In this study, the performance of the RF and XGBoost models was evaluated using the statistical metrics R 2 , RMSE, and MAE in conjunction with three universally accepted and independent validation approaches, namely sample, spatial, and temporal-based testing.The dataset was randomly split into the sample, spatial, and temporal-based training and testing datasets taking the index, site index, and DOY as basic factors, with the proportions of training and testing sets as 80% and 20%, respectively.The dataset was GWA when verifying the ability of the RF model to estimate AOD, whereas the dataset was DFM when verifying the ability of the XGBoost model to estimate PM2.5 concentrations.The sample, spatial, and temporal-based training datasets were used to train the RF and XGBoost model with the tuned hyper-parameters.In contrast, the testing dataset was used to evaluate the model's performance, which may lead to a comprehensive understanding of the RF and XGBoost model's performance.After that, the measured AOD and PM2.5 values were compared with the predicted values by the above metrics and scatter plot with a linear regression line.
The assessment of the XGBoost model is composed of two sections.The first section involves XGBoost model training and testing on the sample, temporal and spatial-based scales.The second phase includes forecasting daily PM2.5 concentrations for 2020 using the XGBoost model and then quantitatively evaluating the algorithm's predictive power.
The trained RF-XGBoost model was then used to predict PM2.5 concentrations for each grid cell with a spatial resolution of 1 km × 1 km for each day.In addition, the monthly, seasonal, and annual PM2.5 maps were generated by averaging each variable and then inputting the average values into the model.

Model Performance of RF
GWA containing 15,133 samples was used to build an optimal model with RF to estimate AOD values.The best hyper parameters (n_estimators = 300) of RF were obtained by parameter tuning.The amounts of samples among the sample, spatial, and temporalbased training datasets were approximately equal.According to the RF model fitting

Model Performance of RF
GWA containing 15,133 samples was used to build an optimal model with RF to estimate AOD values.The best hyper parameters (n_estimators = 300) of RF were obtained by parameter tuning.The amounts of samples among the sample, spatial, and temporalbased training datasets were approximately equal.According to the RF model fitting results, the R 2 between measured and estimated AOD is consistent (0.97) for the three datasets, with MAE equal to 0.03 and RMSE values less than 0.05, respectively (Figure 4a-c).Compared to the measured AOD, the estimated AOD has a high regression slope of 0.90 and a small intercept of 0.04.Regarding the RF model testing results, the sample and spatial-based testing datasets also performed well with R 2 greater than 0.79, MAE equal to 0.08, and RMSE less than 0.12, respectively (Figure 4d-e).However, the temporal-based testing dataset had lower R 2 (0.59) and higher MAE (0.13) and RMSE (0.19), respectively (Figure 4f).Therefore, the RF model can estimate the missing AOD at the station scale using GWA.

Statistics of Variables
All the data with 48,548 matched samples for the years 2018-2020 in GUA were plotted in the histograms along with the basic statistics including minimum, maximum, mean, and standard deviation for each variable (Figure 5).The pattern of the daily PM 2.5 concentrations were similar to that of the AOD (Wei et al. 2019a), which was consistent with the result of correlation analysis.The annual mean measured PM 2.5 concentration was 52.55 ± 45.22 µg/m 3 by averaging all ground-based stations in GUA, where the lowest PM 2.5 concentration was 26.01 ± 12.86 µg/m 3 in summer, and the highest was 91.88 ± 60.31 µg/m 3 in winter.The annual AOD after filling the missing values was 0.60.The seasonal mean AODs were 0.54, 0.55, 0.61, and 0.70 in spring, summer, autumn, and winter, respectively (Figure 5), with similar values in spring and summer.c).Compared to the measured AOD, the estimated AOD has a high regression slope of 0.90 and a small intercept of 0.04.Regarding the RF model testing results, the sample and spatial-based testing datasets also performed well with R 2 greater than 0.79, MAE equal to 0.08, and RMSE less than 0.12, respectively (Figure 4d-e).However, the temporal-based testing dataset had lower R 2 (0.59) and higher MAE (0.13) and RMSE (0.19), respectively (Figure 4f).Therefore, the RF model can estimate the missing AOD at the station scale using GWA.

Statistics of Variables
All the data with 48,548 matched samples for the years 2018-2020 in GUA were plotted in the histograms along with the basic statistics including minimum, maximum, mean, and standard deviation for each variable (Figure 5).The pattern of the daily PM2.

Seasonal Model Performance
DFM and GWA were split by season for 2018-2019 to test model performance on the sample-based scale in GUA (Figure 6B).For each of the four seasons (spring, summer, autumn, and winter), DFM was separated into 8989, 8643, 8899, and 8425 samples, whereas GWA was divided into 4405, 2903, 4057, and 3768 samples, respectively.Next, we divided each season's samples into training and testing samples with a 0.2 testing size for each.
The test accuracies of the DFM model had higher R 2 values (Figure 6B (a-d)) than that of the GWA model for all the four seasons (Figure 6B (e-h)).The models of autumn and winter performed equally well in DFM and GWA, with R 2 more than or equal to 0.88, which was higher than models of spring and summer.Because of the heavy pollution of PM 2.5 concentrations in winter, the MAE and RMSE values of DFM (14.00 µg/m 3 and 19.90 µg/m 3 ) and GWA (14.25 µg/m 3 and 21.35 µg/m 3 ,) were significantly higher than the other three seasons.In contrast, the DFM and DFC models had the lowest model performance in summer due to less PM 2.5 concentrations, mainly caused by frequent precipitation and high relative humidity.In general, the DFM model performed better than the GWA model for all the four seasons, especially in spring when there is an obvious gap in overall R 2 , MAE, and RMSE.

Site-Scale Model Performance
We evaluated the test accuracies of the DFM model at each monitoring station using the sample-based test.The spatial patterns showed that the samples of individual station ranged from 91 to 151, except for a station in Xi'an, which yielded only 31 samples, possibly due to random splitting (Figure 7a).The stations with samples between 100 and 140 were almost 82% of the total stations, indicating that the samples' distribution was relatively homogeneous.The mean R 2 was 0.88, and more than 85% of the stations had a R 2 higher than 0.82, particularly in the densely populated hinterland of the Guanzhong basin (GZB).In contrast, R 2 was quite low in a few stations located at higher altitudes around the GZB (Figure 7b), likely due to reduced environmental impact from human activities.The distribution of MAE and RMSE were similar (Figure 7c,d), and the average MAE and RMSE in GUA were 8.47 µg/m 3 and 12.37 µg/m 3 , respectively.More than 85% of the monitoring stations had a low prediction error with MAE of less than 10.00 µg/m 3 and RMSE less than 15.00 µg/m 3 .A few stations in Xi'an, Xianyang, and Linfen reported higher prediction errors due to more intensive human activities and considerable pollution emissions.The results revealed that the PM 2.5 -AOD relationship could be efficiently enhanced by the DFM-trained XGBoost model.

Model Prediction of XGBoost
The XGBoost model's predictability in GUA was evaluated using DFM from 2018 to 2019 and DFP in 2020 for model training and prediction.The daily PM 2.5 concentrations predicted by the XGBoost model and measured by air pollution stations were compared by analyzing the statistical metrics and the R 2 , MAE, and RMSE were 0.61, 17.66 µg/m 3 and 26.02 µg/m 3 , respectively.There were a total of 13,493 samples from 56 monitoring stations in DFP, and most of the stations (91%) had more than 240 samples (Figure 8a).More than 42% of the stations had R 2 larger than 0.61 (Figure 8b), which were mostly distributed across Linfen, Yuncheng, Weinan, Xianyang, Baoji, and parts of Xi'an.However, the XGBoost model underestimates the high PM 2.5 values [19,46,47] and slightly overrates the low PM 2.5 concentrations for the stations with R 2 less than 0.61.Therefore, the MAE and RMSE were higher in highly polluted areas, such as Xi'an, and the lower MAE and RMSE were correlated with the minute pollution sites of PM 2.5 .

Seasonal and Annual Distribution of PM 2.5
Figure 9 shows the seasonal and annual PM 2.5 concentrations distribution with a spatial resolution of 1 km × 1 km, derived from the RF-XGBoost model, demonstrating the temporal and spatial heterogeneity and aggregation.The predicted annual PM 2.5 concentrations were spatially consistent with in situ measurements and previous studies [42].Figure 9a-h depict the spatial distribution of the seasonal mean PM 2.5 concentrations calculated by the RF-XGBoost model, illustrating the spatial difference in PM 2.5 pollution.Winter always experienced the most serious pollution, with an average PM 2.5 value of 42.77 µg/m 3 and 42.08 µg/m 3 in 2018 and 2019, respectively, followed by spring with PM 2.5 values of 32.95 ± 6.62 µg/m 3 and 22.03 ± 5.97 µg/m 3 .Comparatively, the temporal variations of PM 2.5 concentrations in summer (17.66 ± 2.41 µg/m 3 and 12.85 ± 2.47 µg/m 3 ) and autumn (17.14 ± 5.16 µg/m 3 and 16.79 ± 4.28 µg/m 3 ) were relatively low.Compared to 2018, the seasonal average PM 2.5 concentrations decreased significantly in 2019, especially in spring, with a reduction of 10.92 µg/m 3 .However, the air pollution situation was still grim in the winter of 2019, with more than 65% of the study area meeting the mean PM 2.5 values > 35 µg/m 3 and the maximum reaching 95.57µg/m 3 .
ranged from 91 to 151, except for a station in Xi'an, which yielded only 31 samples, possibly due to random splitting (Figure 7a).The stations with samples between 100 and 140 were almost 82% of the total stations, indicating that the samples' distribution was relatively homogeneous.The mean R 2 was 0.88, and more than 85% of the stations had a R 2 higher than 0.82, particularly in the densely populated hinterland of the Guanzhong basin (GZB).In contrast, R 2 was quite low in a few stations located at higher altitudes around the GZB (Figure 7b), likely due to reduced environmental impact from human activities.The distribution of MAE and RMSE were similar (Figure 7c,d), and the average MAE and RMSE in GUA were 8.47 μg/m 3 and 12.37 μg/m 3 , respectively.More than 85% of the monitoring stations had a low prediction error with MAE of less than 10.00 μg/m 3 and RMSE less than 15.00 μg/m 3 .A few stations in Xi'an, Xianyang, and Linfen reported higher prediction errors due to more intensive human activities and considerable pollution emissions.The results revealed that the PM2.5-AOD relationship could be efficiently enhanced by the DFM-trained XGBoost model.

Model Prediction of XGBoost
The XGBoost model's predictability in GUA was evaluated using DFM from 2018 to 2019 and DFP in 2020 for model training and prediction.The daily PM2.5 concentrations predicted by the XGBoost model and measured by air pollution stations were compared by analyzing the statistical metrics and the R 2 , MAE, and RMSE were 0.61, 17.66 μg/m 3 and 26.02 μg/m 3 , respectively.There were a total of 13,493 samples from 56 monitoring stations in DFP, and most of the stations (91%) had more than 240 samples (Figure 8a).The spatial distribution patterns were similar in 2018 and 2019, as shown in Figure 9i-j, with the annual mean PM 2.5 concentrations being 29.62 ± 5.97 µg/m 3 and 24.63 ± 6.01 µg/m 3 , respectively.The highest values were consistently found in the plain area of GUA, where Linfen and Yuncheng in the east were subjected to the most severe pollution, followed by the cities of Weinan, Xianyang, and Xi'an when analyzed spatially.
Figure 9k shows the rate of change in annual PM 2.5 concentrations from 2018 to 2019 in GUA.Over 99% of the study region showed a decrease in PM 2.5 values, with significant reductions (>7 µg/m 3 ) occurring primarily in the east and the greatest decrease being 10.74 µg/m 3 .Meanwhile, almost 93% of the study area exhibited approximately 3~7 µg/m 3 reductions.On the contrary, increasing only occurred in the local areas of Weinan and Xianyang with a growth of less than 2 µg/m 3 .
More than 42% of the stations had R 2 larger than 0.61 (Figure 8b), which were mostly distributed across Linfen, Yuncheng, Weinan, Xianyang, Baoji, and parts of Xi'an.However, the XGBoost model underestimates the high PM2.5 values [19,46,47] and slightly overrates the low PM2.5 concentrations for the stations with R 2 less than 0.61.Therefore, the MAE and RMSE were higher in highly polluted areas, such as Xi'an, and the lower MAE and RMSE were correlated with the minute pollution sites of PM2.5.

Seasonal and Annual Distribution of PM2.5
Figure 9 shows the seasonal and annual PM2.5 concentrations distribution with a spatial resolution of 1 km × 1 km, derived from the RF-XGBoost model, demonstrating the temporal and spatial heterogeneity and aggregation.The predicted annual PM2.5 concentrations were spatially consistent with in situ measurements and previous studies [42].Figure 9a-h   μg/m 3 , respectively.The highest values were consistently found in the plain area of GUA, where Linfen and Yuncheng in the east were subjected to the most severe pollution, followed by the cities of Weinan, Xianyang, and Xi'an when analyzed spatially.
Figure 9k shows the rate of change in annual PM2.5 concentrations from 2018 to 2019 in GUA.Over 99% of the study region showed a decrease in PM2.5 values, with significant reductions (>7 μg/m 3 ) occurring primarily in the east and the greatest decrease being 10.74 μg/m 3 .Meanwhile, almost 93% of the study area exhibited approximately 3~7 μg/m 3 reductions.On the contrary, increasing only occurred in the local areas of Weinan and Xianyang with a growth of less than 2 μg/m 3 .

RF-XGBoost Model Performances
The RF-XGBoost model filled in missing AOD by employing many variables and produced a high sample-based R 2 (0.93) in GUA.Furthermore, we investigated the prediction power of the model based on data from 2020 on a daily scale and obtained an R 2 of 0.61.
Since the MAIAC AOD was released, a series of models have been proposed to explore the relationship between PM 2.5 and MAIAC AOD, along with meteorological conditions and other auxiliary data in China [24,25,27,[48][49][50][51], the Middle East [14,52,53], South Africa [15], and USA [54][55][56][57][58][59], mainly due to the high spatial resolution 1 km × 1 km.The predictive power of the model is essentially important for the usability of the model in the future; however, only a few studies have evaluated the predictive power using the dataset acquired before or after the modeling and testing datasets [25].
There are a series of novelties and advantages of the adopted RF-XGBoost model compared to previous studies (Table 3) that used MAIAC AOD as an explanatory factor for ground-level PM 2.5 concentrations.First, the tentative fill of missing AOD by RF could effectively increase the sample size, and removing the variables having a negligible correlation to PM 2.5 concentrations by XGBoost can simplify the dataset.This approach proposed in our study is versatile and has significantly improved the model performance.In terms of R 2 , the adopted RF-XGBoost model not only surpassed the statistical regression models [57] including the GAM model [60], the GWR model [25], and Multiple Linear Regression (MLR) [54,55], but also performed slightly better than machine and deep learning approaches, e.g., RF models [52], improved random forest [27], space-time random forest [26], GBM [56], Gradient Boosting Descent Tree [48], XGBoost models [53,59], spacetime extremely randomized trees [25], Deep Belief Network [53], and some ensemble models [50,58].Second, the high-density meteorological measurements used for modeling contributed to the high model performance.The atmospheric environment is very complex [24], and PM 2.5 concentrations usually change along with weather conditions.In this research, there are 1677 meteorological stations for the area of 161,526 km 2 .According to research based on statistics between PM 2.5 and each meteorological variable in the study area, pollution decreased sharply when cumulative precipitation values reached 0.30 mm.TEM_Avg between −5~15 • C and RHU_Avg between 50%~80% may lead to high pollution.EVP greater than 5 mm and WIN_Avg larger than 2 m/s contributed to the dispersion of PM 2.5 .Therefore, informative meteorology promoted our RF-XGBoost model to predict the ground-level PM 2.5 concentrations accurately.
Third, the selected RF-XGBoost model was applied to daily PM 2.5 concentrations in 2020 based on 13,493 data samples for model prediction with an R 2 of 0.61, which is superior to the existing models [25].The predicted R 2 of all models was lower than the validation R 2 mainly because the change in meteorological conditions was exceedingly complicated.The models trained based on meteorological conditions could not capture all weather conditions during the study period.Therefore, it's more appropriate to train the model based on meteorological conditions simultaneously to estimate PM 2.5 concentrations.Another contributory factor was the COVID-19 restrictions which greatly reduced the emission of PM 2.5 from the sources of industry, traffic, etc. [61][62][63].The COVID-19, especially the lockdown of the whole country in February 2020, had significantly reduced the primary emissions from the sources, including industrial operations, motor vehicle exhaust, construction operations, and smoke from restaurants, which were the major emission sources that cause air pollution [61].As the petrochemical industry, construction industry, and facility manufacturing were strongly affected and hampered by both the upstream and downstream chains, all other industries were almost brought to a standstill, which led to lowered PM 2.5 emission [62].The large reductions in vehicular emissions due to the sharp decline in vehicle and public transport can effectively mitigate air pollution in megacities [63].Therefore, the PM 2.5 concentrations during COVID-19 restrictions were much lower than previous years in the same time period, which made it difficult to predict.
A comparison between the results from RF-XGBoost and the dataset from the methodology developed by Donkelaar in 2021 [64] showed that the changing tendency of spatial and temporal distribution of PM 2.5 concentrations was consistent.The benefits of the RF-XGBoost model are that (1) the spatial resolution is 1 km × 1 km which showed more detail information of spatial distribution of PM 2.5 concentrations; (2) the values of PM 2.5 concentrations are more accurate compared with observations from ground stations averaged by monthly and annual means.On the contrary, the limitations are that (1) the RF-XGBoost model did not consider the representation of physics, chemistry, and transport within the atmosphere; (2) the RF-XGBoost model did not evaluate the uncertainty of PM 2.5 concentrations.Another limitation of the RF-XGBoost model is underestimating PM 2.5 in high pollution days and overestimating for low values, similar to most of the PM 2.5 estimation models based on machine learning [15,25,27,65,66].Several theories have been presented [25] to explain why the limitation of PM 2.5 models occurred, in which the small number of high-pollution samples was the main factor, limiting the model's learning capacity.For example, only 735 samples, less than 2% of all samples of DFM, have PM 2.5 concentrations greater than 200 µg/m 3 .

Comparision with Other Ensemble Models
As demonstrated in Section 2.3, the combination of RF and XGBoost was used to fill the missing AOD at the station scale and then accurately estimate ground-level PM 2.5 .Moreover, the RF-XGBoost model was compared with other ensemble models after the datasets and hyper parameters were pre-processed for RF-XGBoost and other ensemble models.Firstly, some features of GWA were removed and the remaining features were AOD, SSH, TEM_Avg, TEM_Max, TEM_Min, RHU_Avg, PRE_20, PRE_08, WIN_Avg, EVP, DOY, Lon, and DEM, which were consistent with the features of DFM and DFP that were used to evaluate the model performance of RF-XGBoost.Secondly, GWA was divided into two groups; the first group with the years 2018-2019 was prepared as the dataset for training and testing ensemble models with 12,368 samples, and the second group of 2020 with 2765 samples was used as the dataset for predicting.Third, DFM and DFP obtained by filling missing AOD with RF model were used to evaluate RF-XGBoost, where DFM with the years 2018-2019 with 34,956 samples was used to train and test, and DFP with the year 2020 with 13,493 samples was used to validate the model prediction.Finally, we tuned hyper parameter n_estimators = 300, the number of estimator being 300, for models used to compare and evaluate the model performance.
The RF, ETR, and GBDT were imported from the scikit-learn package, while the XGBoost and LightGBM were imported from independent packages, respectively.Among the five ensemble models, RF and ERT models are derived from "Bagging" that each tree in the ensemble is built from a sample drawn with replacement from the training set, which works best with strong and complex models.On the contrary, GBDT, XGBoost and LightGBM are derived from "Boosting" that synthesizes multiple interdependent and related weak estimators into a strong estimator [67], which usually work best in weak models.Table 4 shows the accuracies and efficiencies of five ensemble models used for estimating ground-level PM 2.5 in GUA with the same dataset and estimators.In terms of time consumption, the training speed of XGBoost and LightGBM is much faster than other models, especially the LightGBM which just took less than 0.005 s.From a memory consumption perspective, models based on the Boosting algorithm are less memory-consuming because they work in serial, which does not consume much memory.The RF-XGBoost model was the most time and memory-consuming because of the filling of missing AOD in the first step and then estimating PM 2.5 .Models in Table 4 performed well due to the combining of 300 weak estimators.Among them, ERT and LightGBM showed high estimation accuracy, whereas RF and ERT showed high prediction accuracy.The RF-XGBoost model showed the highest R 2 for the out-of-sample test and the second highest R 2 for predicting PM 2.5 in 2020, along with RMSE and MAE that were lower than other approaches considered.9i,j).The values for the entire yearly PM 2.5 concentrations in Shaanxi province in 2015 were greater than 35 µg/m 3 [51]; thus, the air quality has been dramatically improved due to the significant factors, especially the Three-Year Blue-Sky Action Plan initiated in 2018 and the air pollution control implemented by various levels of government in recent years.However, the GZB in GUA consistently had the highest levels of PM 2.5 pollution.
As mentioned above, winter always experienced serious pollution mainly due to burning coal for heating, which has been listed as one of the primary causes of air pollution.While the transport of PM 2.5 was from the neighboring provinces of Shanxi and Henan [68], even the North China Plain [69] was another contributor to air pollution caused by easterly winds in winter.Generally, convergence between southeast winds from a subtropical anticyclone and northwest winds from Siberian high forms a stable warm high at the surface layer and further forms an apparent low wind speed area over the GBZ, making it difficult for the pollutants to be evacuated from GBZ. GBZ, on the other hand, always had a lengthy high-pressure system in the winter [70], which readily led to temperature inversion layers and stable temperature stratification, which is not favorable for the diffusion and diluting of pollutants [71].In addition, the inversion limits the planetary boundary layer's diurnal evolution, reducing the vertical transmission of pollutants [72].The descending mountain breeze from the Loess Plateau and Qinling Mountains not only carried pollutants into the basin but also created a convergence zone that accumulated air pollution due to the cooling of mountainous areas at night [73].The frequent and varied degree of dusty weather in spring may have polluted the air in GBZ, which could explain the average PM 2.5 value of 32.95 µg/m 3 in 2018.
The meteorological conditions between summer and winter showed a significant difference that affected the PM 2.5 concentrations, especially temperature, relative humidity, and precipitation [74][75][76].Among them, rainfall was the main factor, with more than 3.2 mm daily in summer and less than 0.15 mm in winter, efficiently scavenging PM 2.5 .Meanwhile, high temperatures in summer (daily mean temperature of 24.95 • C) brought a series of intense thermal activities in the atmosphere, promoting the dispersion of PM 2.5 [73].Moreover, high relative humidity in summer (daily mean relative humidity of 68.6%) affected both the particle diameter [77] and optical properties [76], which led to the aggregation of PM 2.5 particles.PM 2.5 concentrations in particles decreased through dry and wet depositions [78] when aggregation grew with a certain weight.

Conclusions
The MAIAC AOD, with a spatial resolution of 1 km ×1 km, was extensively used to estimate ground-level PM 2.5 concentrations; however, the missing AOD caused by cloud cover and high surface reflectivity severely restricted the further improvement of model performance.Therefore, the RF-XGBoost pipeline is proposed, which employs an RF regression to boost the number of samples by filling the missing AOD on the station scale, improving training capacity.The built-in function of XGBoost regression is used to analyze feature importance, select variables, and then non-linearly investigate the link between PM 2.5 and AOD after filling, together with high-density meteorological fields, spatiotemporal information, and auxiliary variables.The sample-, spatial-, and temporalbased test approaches are applied to examine the model performance.The relevant statistics, machine learning, and deep learning models are also collected for comparison.
The RF-XGBoost model significantly improved the estimation of ground-level PM 2.5 concentrations in GUA with test R 2 , RMSE, and MAE of 0.93, 12.49 µg/m 3 , and 8.42 µg/m 3 , respectively, which surpasses all of the statistics and deep learning models and outperforms most of the machine learning models presented in existing studies.Meanwhile, The RF-XGBoost model has an advanced level of prediction with an R 2 of 0.61.The seasonal and annual PM 2.5 concentrations distribution from the RF-XGBoost model reveal that winter always experiences the most serious pollution, followed by spring, and PM 2.5 concentrations in 2019 significantly decreased compared to 2018.The daily ground-level PM 2.5 concentrations product with a spatial resolution of 1 km × 1 km in GUA can be accurately generated using the RF-XGBoost model, which can assist researchers working in environmental protection and epidemiological domains, particularly in urban areas.
5 concentrations in cities throughout GUA exceeds the standards II (35 μg/m 3 ) set by China's Ministry of Ecology and Environment based on the Bulletin of Ecological Environment from 2018 to 2020.

Figure 1 .
Figure 1.The study area and locations of the air quality observation stations and weather stations.

Figure 1 .
Figure 1.The study area and locations of the air quality observation stations and weather stations.

Figure 2 .
Figure 2. The correlation coefficient matrix between PM2.5 Observation and each variable in GUA.Figure 2. The correlation coefficient matrix between PM 2.5 Observation and each variable in GUA.

Figure 2 .
Figure 2. The correlation coefficient matrix between PM2.5 Observation and each variable in GUA.Figure 2. The correlation coefficient matrix between PM 2.5 Observation and each variable in GUA.The assessment of the XGBoost model is composed of two sections.The first section involves XGBoost model training and testing on the sample, temporal and spatial-based scales.The second phase includes forecasting daily PM 2.5 concentrations for 2020 using the XGBoost model and then quantitatively evaluating the algorithm's predictive power.The trained RF-XGBoost model was then used to predict PM 2.5 concentrations for each grid cell with a spatial resolution of 1 km × 1 km for each day.In addition, the monthly, seasonal, and annual PM 2.5 maps were generated by averaging each variable and then inputting the average values into the model.

Figure 3 .
Figure 3. Feature importance for the XGBoost model.

Figure 3 .
Figure 3. Feature importance for the XGBoost model.

Figure 4 .
Figure 4. Density scatterplots of the RF model with sample, spatial, and temporal-based train examination results (a-c) and test evaluation results (d-f) for GWA.The black dash and solid red lines represent the 1:1 and linear regression lines, respectively.
5 concentrations were similar to that of the AOD (Wei et al. 2019a), which was consistent with the result of correlation analysis.The annual mean measured PM2.5 concentration was 52.55 ± 45.22 μg/m 3 by averaging all ground-based stations in GUA, where the lowest PM2.5 concentration was 26.01 ± 12.86 μg/m 3 in summer, and the highest was 91.88 ± 60.31 μg/m 3 in winter.The annual AOD after filling the missing values was 0.60.The seasonal mean AODs were 0.54, 0.55, 0.61, and 0.70 in spring, summer, autumn, and winter, respectively (Figure 5), with similar values in spring and summer.

Figure 4 .
Figure 4. Density scatterplots of the RF model with sample, spatial, and temporal-based train examination results (a-c) and test evaluation results (d-f) for GWA.The black dash and solid red lines represent the 1:1 and linear regression lines, respectively.

3. 3 .
Model Performance of XGBoost 3.3.1.Reginal-Scale Model Performance The specific values of hyper parameters (num_round = 300, objective = reg:linear, booster = btree, eta = 0.1, max_depth = 9, gamma = 3, colsample_bytree = 0.8, colsam-ple_bylevel = 0.8, alpha = 60, lambda = 2) of the XGBoost model were used in this study by parameter tuning.As shown in Figure 6A, the sample-based training accuracies of the XGBoost model-fitting results have R 2 of 0.99 in both DFM and GWA. Figure 6A depicts the density scatterplots of model performance for DFM (Figure 6A (a-d)) and GWA (Figure 6A (e-h)) for 2018-2019 in GUA, using sample-based train, sample, spatial, and temporal-based tests.Because of the spatial and temporal heterogeneity of PM 2.5 concentrations, the sample-based test outperformed the temporal-based and spatial-based tests in DFM and GWA, with better R 2 and lower MAE and RMSE.On the sample scale, the DFM model showed better performance with higher R 2 of 0.93 but lower performance with higher MAE (8.42 µg/m 3 ) and RMSE (12.49µg/m 3 ) than GWA with R 2 , MAE, and RMSE of 0.91, 7.75 µg/m 3 , and 11.58 µg/m 3 , respectively.On both temporal and spatial scales, DFM performed better with greater R 2 and lower MAE and RMSE (Figure 6A).In general, spatial-based model accuracies for both DFM and GWA are significantly higher than temporal-based model accuracies.On the spatial-based scale, the DFM model has an improvement over the GWA model with better R 2 , MAE, and RMSE of 0.91, 9.55 µg/m 3 , and 13.80 µg/m 3 , respectively.The overall R 2 , MAE, and RMSE values for the DFM model for the temporal-based test were 0.81, 12.57 µg/m 3 , and 19.47 µg/m 3 , which are better than the values for the GWA model (R 2 = 0.75, MAE = 14.25 µg/m 3 , RMSE = 21.35µg/m 3 ), suggesting that DFM with filling the missing AOD has enhanced the model performance (Figure 6A (d, h)).However, according to the researches using machine learning algorithms [26,40], the DFM model tends to slightly underestimate PM 2.5 values on extremely polluted days (slope = 0.90, intercept = 5.42).

Figure 5 .
Figure 5. Histograms and descriptive statistics (minimum, maximum, mean, and standard deviation) for PM2.5 and the independent variables used for modeling.Data are from 2018-2020 over GUA.The number of samples is 48,449.

Figure 5 .
Figure 5. Histograms and descriptive statistics (minimum, maximum, mean, and standard deviation) for PM 2.5 and the independent variables used for modeling.Data are from 2018-2020 over GUA.The number of samples is 48,449.

Figure 6 .
Figure 6.Density scatterplots of regional-scale model (A) with sample_based train, sample_based test, spatial_based test and temporal_based test examination results for DFM (a-c) and DFC (d-f) at daily scale, and seasonal scale (B) with DFM (top row) and DFC (bottom row) based sample_based test for sprint (a,e), summer (b,f), autumn (c,g) and winter (d,h) in GUA.The black dash and solid red lines represent the 1:1 and linear regression lines, respectively.

Figure 6 .
Figure 6.Density scatterplots of regional-scale model (A) with sample_based train, sample_based test, spatial_based test and temporal_based test examination results for DFM (a-c) and DFC (d-f) at daily scale, and seasonal scale (B) with DFM (top row) and DFC (bottom row) based sample_based test for sprint (a,e), summer (b,f), autumn (c,g) and winter (d,h) in GUA.The black dash and solid red lines represent the 1:1 and linear regression lines, respectively.

Figure 7 .
Figure 7. Spatial distributions of (a) the number of examined samples (N) at each monitoring station, (b) R 2 , (c) MAE (µg/m 3 ), and (d) RMSE (µg/m 3 ) between PM 2.5 estimations and measurements from DFM for the years of 2018-2019 in GUA.
Figure9shows the seasonal and annual PM2.5 concentrations distribution with a spatial resolution of 1 km × 1 km, derived from the RF-XGBoost model, demonstrating the temporal and spatial heterogeneity and aggregation.The predicted annual PM2.5 concentrations were spatially consistent with in situ measurements and previous studies[42].Figure9a-hdepict the spatial distribution of the seasonal mean PM2.5 concentrations calculated by the RF-XGBoost model, illustrating the spatial difference in PM2.5 pollution.Winter always experienced the most serious pollution, with an average PM2.5 value of 42.77 μg/m 3 and 42.08 μg/m 3 in 2018 and 2019, respectively, followed by spring with PM2.5 values of 32.95 ± 6.62 μg/m 3 and 22.03 ± 5.97 μg/m 3 .Comparatively, the temporal variations of PM2.5 concentrations in summer (17.66 ± 2.41 μg/m 3 and 12.85 ± 2.47 μg/m 3 ) and autumn (17.14 ± 5.16 μg/m 3 and 16.79 ± 4.28 μg/m 3 ) were relatively low.Compared to 2018, the seasonal average PM2.5 concentrations decreased significantly in 2019, especially in spring, with a reduction of 10.92 μg/m 3 .However, the air pollution situation was still grim in the winter of 2019, with more than 65% of the study area meeting the mean PM2.5 values > 35 μg/m 3 and the maximum reaching 95.57μg/m 3 .The spatial distribution patterns were similar in 2018 and 2019, as shown in Figure 9i-j, with the annual mean PM2.5 concentrations being 29.62 ± 5.97 μg/m 3 and 24.63 ± 6.01

Figure 9 .
Figure 9.The mean predicted seasonal (a-h) and annual (i-j) PM2.5 concentrations distribution based on the XGBoost model and the rate of change of annual PM2.5 concentrations from 2018 to 2019 (k).

Figure 9 .
Figure 9.The mean predicted seasonal (a-h) and annual (i-j) PM 2.5 concentrations distribution based on the XGBoost model and the rate of change of annual PM 2.5 concentrations from 2018 to 2019 (k).

Table 1 .
Variables used in the model development and validation.

Table 2 .
Pearson correlation coefficients and p_values of each variable on PM 2.5 concentrations from the monitoring stations in GUA from 2018-2020.

Table 3 .
A comparison of model performance between RF-XGBoost and other models.

Table 4 .
A comparison of the model performances of different ensemble models and the RF-XGBoost model.TC and MC refer to the time and memory of computer consumption during the model training.Reasons for PM 2.5 Pollution in the GZB PM 2.5 distribution from the RF-XGBoost model demonstrated that the areas exceeding the secondary national air quality standard (35 µg/m 3 ) significantly decreased from ~18% in 2018 to ~9% in 2019 (Figure