1. Introduction
Fine particulate matter (PM2.5) is a major air pollutant with substantial implications for air quality and human health. A growing body of evidence has shown that exposure to PM2.5 is closely associated with cardiovascular and respiratory diseases and with increased premature mortality risk [1,2]. In China, rapid industrialization, urbanization, dense population, and intensive energy consumption have long made the Beijing–Tianjin–Hebei (BTH) region a hotspot of PM2.5 pollution. In addition, the surrounding Taihang and Yanshan Mountains, together with unfavorable regional meteorological conditions, can weaken atmospheric ventilation and promote pollutant accumulation, giving the BTH region pronounced characteristics of regional haze and heavy PM2.5 pollution [3,4].
Annual mean PM2.5 concentrations in the BTH region have declined markedly since the implementation of China’s Air Pollution Prevention and Control Action Plan in 2013, with 2024 levels in Beijing, Tianjin, and Hebei all more than 60% lower than those in 2013 [5]. Nevertheless, seasonal and regional pollution episodes still occur repeatedly. In particular, wintertime PM2.5 pollution in North China and the BTH region is often aggravated by unfavorable meteorological conditions, including atmospheric stagnation, weak winds, stable stratification, high humidity, and suppressed boundary-layer development, which facilitate pollutant accumulation and enhance the effects of combustion-related emissions [6,7]. Recent studies further suggest that, despite the overall decline in PM2.5 concentrations, regional transport, meteorological stagnation, and precursor sensitivity continue to play important roles in determining pollution severity and the evolution of pollution episodes in the BTH region [8,9]. Therefore, clarifying how meteorological factors and gaseous pollutants influence PM2.5 under different pollution levels is essential for developing targeted and pollution-level-specific control strategies in this region.
Previous studies have extensively investigated PM2.5 pollution in the BTH region from the perspectives of spatiotemporal distribution [10,11], influencing factors [6,12], and regional transport and pollution episodes [13,14]. However, most of these studies have focused on long-term average conditions, overall trends, or individual pollution events, while systematic comparisons across different PM2.5 pollution levels remain limited. In addition, existing studies have seldom quantified the relative contribution strength, ranking variation, and influence patterns of meteorological factors and gaseous precursors under mild, moderate, and severe pollution conditions. Therefore, the differentiated driving mechanisms of PM2.5 across pollution levels in the BTH region remain insufficiently understood.
In recent years, data-driven methods, including machine learning [15], deep learning [16,17], and explainable artificial intelligence [18,19,20], have been increasingly applied to air quality prediction and factor analysis. These approaches are capable of capturing complex nonlinear relationships among meteorological variables, precursor pollutants, and air-quality indicators, and often achieve strong predictive performance [15,16,17]. In particular, interpretable frameworks based on SHAP have made it possible to quantify the contribution of individual predictors and improve model transparency [18,19].
To address these gaps, this study uses daily observations from approximately 65 air quality monitoring stations in the BTH region from 1 November 2021 to 31 October 2024 and investigates PM2.5 pollution under different severity levels using a CatBoost–SHAP framework [21,22,23]. Unlike studies focusing on the full range of daily PM2.5 concentrations, this work specifically focuses on PM2.5 pollution days. These pollution days are classified into three levels based on daily PM2.5 concentrations: mild pollution (75 μg/m³ ≤ PM2.5 < 115 μg/m³), moderate pollution (115 μg/m³ ≤ PM2.5 < 150 μg/m³), and severe pollution (PM2.5 ≥ 150 μg/m³). Daily PM2.5 concentrations are analyzed together with four gaseous pollutants (SO2, NO2, CO, and O3) and five meteorological variables, including temperature (T), pressure (P), relative humidity (RH), precipitation (PRE), and wind speed (WS). Specifically, this study aims to: (1) reveal the temporal and spatial distribution characteristics of mild, moderate, and severe PM2.5 pollution days in the BTH region during 2021–2024; (2) quantify the relative importance and ranking changes in meteorological factors and gaseous precursors across pollution levels; and (3) examine the dependence patterns of key drivers under different covariate backgrounds.
2. Methodology
2.1. Study Area and Data Source
The study area encompasses the Beijing–Tianjin–Hebei (BTH) region in North China, located in the northern part of the North China Plain and surrounded by the Taihang Mountains to the west and the Yanshan Mountains to the north. Spanning approximately 218,000 km², this national-level strategic region serves as China’s political, economic, and cultural hub, with Beijing as the national capital and Tianjin as a major international port city. It plays a pivotal role in driving innovation, governance, coordinated regional development, and high-quality urbanization under the Beijing–Tianjin–Hebei Coordinated Development Plan.
Geographically, the BTH region includes the municipality of Beijing, the municipality of Tianjin, and the entirety of Hebei Province, which consists of 11 prefecture-level cities (Figure 1). Its coordinates range from approximately 113°04′ E to 119°53′ E and 36°01′ N to 42°37′ N. The terrain predominantly consists of plains in the east and southeast, with hills and low to medium mountains in the western and northern parts, which significantly influence local meteorological conditions and pollutant dispersion.
The dataset comprises daily mean PM2.5 concentrations and four gaseous pollutants (SO2, NO2, CO, and O3) collected from approximately 65 air quality monitoring stations across the BTH region (Figure 1). All air pollutant data, along with five meteorological variables, namely mean air temperature (T), atmospheric pressure (P), relative humidity (RH), precipitation (PRE), and mean wind speed (WS), were obtained from the Environmental Information and Analysis (EIA) Data Platform available at http://eia-data.com/ (accessed on 31 December 2025), which integrates official observations from the China National Environmental Monitoring Center (CNEMC) network. To generate station-specific meteorological inputs, each air quality station was matched to its nearest meteorological station based on spatial proximity, and the corresponding meteorological time series was assigned as the matched record. The study period spans from 1 November 2021 to 31 October 2024, encompassing three complete seasonal cycles and precisely aligning with the three defined annual periods (1 November 2021–31 October 2022, 1 November 2022–31 October 2023, and 1 November 2023–31 October 2024).
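The nearest-station matching described above can be sketched as follows. This is a minimal illustration only: the text states that matching was based on spatial proximity but does not specify the distance measure, so the great-circle (haversine) distance used here is an assumption, and the station IDs and coordinates are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def match_nearest(aq_stations, met_stations):
    """Map each air quality station ID to the ID of the closest meteorological station."""
    matches = {}
    for aq_id, (lat, lon) in aq_stations.items():
        matches[aq_id] = min(
            met_stations,
            key=lambda m: haversine_km(lat, lon, *met_stations[m]),
        )
    return matches

# Hypothetical (lat, lon) coordinates, for illustration only.
aq_stations = {"AQ_A": (39.90, 116.40), "AQ_B": (38.05, 114.50)}
met_stations = {
    "MET_1": (39.93, 116.28),
    "MET_2": (39.10, 117.20),
    "MET_3": (38.03, 114.42),
}
matches = match_nearest(aq_stations, met_stations)
```

Once the mapping is built, the meteorological time series of the matched station would simply be joined to the air quality record by date.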
The majority of these approximately 65 air quality stations are concentrated in densely populated urban and downtown areas, consistent with national monitoring priorities that emphasize locations with high population exposure and intense emission sources. Consequently, coverage is dense in major urban cores such as central Beijing, Tianjin, and Shijiazhuang, but relatively sparse in remote or mountainous counties, particularly in western and northern Hebei Province.
2.2. Descriptive Statistics
Descriptive statistics for PM2.5, the four gaseous pollutants (SO2, NO2, CO, and O3), and the five meteorological variables (T, P, RH, PRE, and WS), aggregated across all 65 stations, are summarized in Table 1. Statistics include the mean, minimum (Min) and maximum (Max) values, standard deviation (SD), and coefficient of variation (CV), calculated as (SD/mean) × 100%.
Table 1 shows that PM2.5 exhibited substantial variability across the BTH region during the study period, with a mean concentration of 38.93 μg/m³, a standard deviation of 32.78 μg/m³, and a coefficient of variation of 84%, indicating considerable temporal and spatial fluctuations. Among the gaseous precursors, O3 had the highest mean concentration (102.21 μg/m³), whereas SO2 showed the lowest mean level (6.49 μg/m³). NO2 and CO displayed moderate variability, suggesting relatively stable but still fluctuating precursor conditions.
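As a quick consistency check, the coefficient of variation reported for PM2.5 can be recomputed from the stated mean and standard deviation. The short sketch below also shows how the five summary statistics could be obtained for any variable; the `describe` helper is illustrative, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def describe(values):
    """Return mean, min, max, SD, and CV (%) for a list of numbers."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (assumption)
    return {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "sd": sd,
        "cv_percent": sd / mean * 100,
    }

# Recompute the PM2.5 CV from the mean and SD reported in the text.
cv_pm25 = 32.78 / 38.93 * 100  # about 84.2, consistent with the reported 84%

# Hypothetical values, only to exercise the helper.
example = describe([2, 4, 4, 4, 5, 5, 7, 9])
```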
2.3. Temporal Statistics of PM2.5 Pollution Level Days
Figure 2 presents the monthly distribution of PM2.5 pollution days at three levels in the BTH region from 1 November 2021 to 31 October 2024. The figure consists of three subfigures corresponding to Year 1 (1 November 2021–31 October 2022), Year 2 (1 November 2022–31 October 2023), and Year 3 (1 November 2023–31 October 2024). Within each subfigure, three pie charts illustrate the distributions of mild pollution, moderate pollution, and severe pollution.
The PM2.5 pollution levels were classified according to the Ambient Air Quality Standards of China (GB 3095–2012): mild pollution was defined as 75 μg/m³ ≤ PM2.5 < 115 μg/m³, moderate pollution as 115 μg/m³ ≤ PM2.5 < 150 μg/m³, and severe pollution as PM2.5 ≥ 150 μg/m³ [24].
The sectors are labeled by month and colored by season, including winter (November–January, shades of blue), spring (February–April, shades of green), summer (May–July, shades of red), and autumn (August–October, shades of yellow). The classification of pollution days was based on the daily maximum PM2.5 concentration among approximately 65 monitoring stations. Specifically, a day was classified as a mild, moderate, or severe pollution day when this daily maximum PM2.5 concentration fell within the corresponding PM2.5 pollution-level range.
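The classification rule described above (thresholds applied to the daily maximum across stations) can be written compactly. The function names below are illustrative, and returning `None` for days below 75 μg/m³ is a convention chosen for this sketch.

```python
def pollution_level(pm25):
    """Map a PM2.5 concentration (ug/m3) to a pollution level, or None if below 75."""
    if pm25 >= 150:
        return "severe"
    if pm25 >= 115:
        return "moderate"
    if pm25 >= 75:
        return "mild"
    return None

def classify_day(station_pm25):
    """Classify a day from the daily maximum PM2.5 among all stations."""
    return pollution_level(max(station_pm25))

# Hypothetical station values for one day: the maximum (120) falls in the moderate range.
level = classify_day([40.0, 80.0, 120.0])
```

Under this rule a day counts as a pollution day as soon as a single station exceeds the threshold, which is why stations below the assigned level can still appear in that day's data.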
Clear seasonal differences were observed across the three pollution levels. Severe pollution days were concentrated predominantly in winter throughout the three study years, with only limited occurrences in spring and autumn and almost none in summer. Moderate pollution days were also mainly distributed in winter and spring, while autumn contributed a smaller share and summer remained negligible. In contrast, mild pollution days occurred more frequently and were most common in spring, followed by winter, with autumn showing a noticeable contribution and summer consistently recording the fewest pollution days.
We also compiled the total precipitation over the three-year study period, which was highest in summer (88,794.23 mm), followed by autumn (17,676.16 mm), spring (12,922.33 mm), and winter (3997.56 mm). This pattern was the opposite of the seasonal distribution of PM2.5 pollution days: summer had the highest rainfall but the fewest pollution events [25].
At the monthly scale, January was one of the most prominent months for moderate and severe pollution, and November and December also formed a clear winter cluster. For mild pollution, March and April were particularly important in several years, while February became more prominent in the later study period. Autumn pollution was mainly concentrated in September and October, especially for severe events in Year 2 and Year 3.
Overall, the frequency and severity of PM2.5 pollution in the BTH region were dominated by winter, followed by spring.
2.4. Spatial Statistics of PM2.5 Pollution Level Days
Figure 3 presents the spatial distribution of mild, moderate, and severe PM2.5 pollution days in the BTH region during the study period. It summarizes the frequency of pollution-level days in different cities and highlights the spatial differences among the three pollution categories. Overall, PM2.5 pollution days were unevenly distributed across cities in the BTH region, and the highest numbers of pollution days were recorded in Handan, Baoding, and Shijiazhuang, followed by Tianjin and Beijing.
2.5. CatBoost
CatBoost is an ensemble learning algorithm based on gradient boosting decision trees, which constructs a strong predictor by iteratively combining multiple weak learners [21,26]. Given a training dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ denotes the input feature vector and $y_i$ denotes the observed target value of the $i$-th sample, the model starts from an initial prediction defined as:
$$F_0(x) = \arg\min_{c} \sum_{i=1}^{n} L(y_i, c)$$
where $L(\cdot)$ is the loss function. At the $m$-th boosting iteration, the model prediction is updated by adding a new decision tree to the previous prediction:
$$F_m(x) = F_{m-1}(x) + \eta\, h_m(x)$$
where $F_m(x)$ is the predicted value after the $m$-th iteration, $\eta$ is the learning rate, and $h_m(x)$ is the output of the newly added decision tree. Under the gradient boosting framework, the new tree is fitted to the pseudo-residuals, which are defined as the negative gradient of the loss function with respect to the current prediction [26]:
$$r_{im} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F = F_{m-1}}$$
After $M$ iterations, the final prediction of the CatBoost model can be expressed as
$$F_M(x) = F_0(x) + \eta \sum_{m=1}^{M} h_m(x)$$
Compared with conventional gradient boosting methods, CatBoost introduces ordered boosting to reduce prediction shift during training and thereby improve model robustness and generalization performance [21]. In addition, as a tree-based ensemble model, CatBoost is capable of capturing nonlinear relationships and complex interactions among variables, which makes it suitable for identifying the combined effects of meteorological factors and gaseous precursors on PM2.5 pollution levels.
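To make the boosting recursion concrete, the following sketch implements plain gradient boosting with one-split regression stumps under a squared-error loss, for which the initial prediction is the mean of the targets and the pseudo-residuals reduce to observed minus predicted values. It is not CatBoost itself: ordered boosting, symmetric trees, and all other library-specific features are omitted, and the data are hypothetical.

```python
def fit_stump(x, r):
    """Fit a one-split regression stump to residuals r by least squares."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - ml) ** 2 for ri in left) + sum((ri - mr) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda v: ml if v <= t else mr

def boost(x, y, n_rounds=50, eta=0.1):
    """Gradient boosting for squared-error loss: F_m = F_{m-1} + eta * h_m."""
    f0 = sum(y) / len(y)  # constant minimising the squared-error loss
    trees = []
    pred = [f0] * len(y)
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient
        h = fit_stump(x, resid)
        trees.append(h)
        pred = [pi + eta * h(xi) for pi, xi in zip(pred, x)]
    return lambda v: f0 + eta * sum(h(v) for h in trees)

# Hypothetical single-predictor data (e.g., CO versus PM2.5), for illustration only.
x = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
y = [30.0, 35.0, 50.0, 70.0, 95.0, 120.0, 150.0, 185.0]
model = boost(x, y)

y_mean = sum(y) / len(y)
mse_baseline = sum((yi - y_mean) ** 2 for yi in y) / len(y)
mse_boosted = sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
```

Each round fits a tree to what the current ensemble still gets wrong, and the learning rate `eta` controls how much of that correction is accepted.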
2.6. SHapley Additive exPlanations
SHAP (SHapley Additive exPlanations) was introduced to explain how each predictor contributed to the CatBoost-based PM2.5 estimates [22]. For each sample, the model output was represented as the sum of a baseline prediction and the feature-level SHAP contributions. For the $i$-th sample, the prediction can be written as:
$$\hat{y}_i = \phi_0 + \sum_{j=1}^{p} \phi_{ij}$$
where $\phi_0$ is the expected model output, $p$ is the number of explanatory variables, and $\phi_{ij}$ denotes the SHAP value of feature $j$ for sample $i$. Positive and negative SHAP values indicate whether a given feature increases or decreases the predicted PM2.5 concentration relative to the baseline.
The SHAP value of feature $j$ is obtained by averaging its marginal contribution over all possible feature subsets:
$$\phi_j = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|!\,(p - |S| - 1)!}{p!} \left[ f(S \cup \{j\}) - f(S) \right]$$
where $N$ is the complete feature set, $S$ is a subset that does not include feature $j$, $f(S)$ is the model output based on subset $S$, and $f(S \cup \{j\})$ is the output after feature $j$ is added. This formulation provides consistent feature attribution and links each prediction to the contribution of individual variables [22].
Because the prediction model in this study was tree-based, SHAP values were calculated using TreeSHAP, which is designed for efficient interpretation of tree ensemble models [23]. The global contribution of each variable was summarized by the mean absolute SHAP value:
$$I_j = \frac{1}{n} \sum_{i=1}^{n} \left| \phi_{ij} \right|$$
where $I_j$ represents the overall contribution strength of feature $j$ across all $n$ samples. A larger $I_j$ indicates that the variable played a more important role in PM2.5 prediction. In this study, these SHAP values were used to compare variable importance across pollution levels and to examine the dependence patterns of dominant meteorological and gaseous pollutant factors.
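The subset-averaging definition of the Shapley value can be illustrated by brute force on a small example. The sketch below enumerates all subsets for a three-feature toy model; it is not TreeSHAP, which obtains the same attributions far more efficiently for tree ensembles. The toy model, its coefficients, the sample, and the use of fixed baseline values for features outside the subset are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value, features):
    """Exact Shapley values; `value` maps a frozenset of features to a model output."""
    p = len(features)
    phi = {}
    for j in features:
        others = [k for k in features if k != j]
        total = 0.0
        for size in range(p):
            for subset in combinations(others, size):
                s = frozenset(subset)
                weight = factorial(len(s)) * factorial(p - len(s) - 1) / factorial(p)
                total += weight * (value(s | {j}) - value(s))
        phi[j] = total
    return phi

# Hypothetical sample and baseline (reference) values.
x = {"CO": 1.5, "WS": 2.0, "T": -3.0}
baseline = {"CO": 1.0, "WS": 3.0, "T": 0.0}

def toy_model(v):
    """Hypothetical PM2.5 model with a CO-WS interaction term."""
    return 50.0 * v["CO"] - 4.0 * v["WS"] - 0.5 * v["T"] - 2.0 * v["CO"] * v["WS"]

def value(subset):
    """Model output when only features in `subset` take the sample's values."""
    mixed = {k: (x[k] if k in subset else baseline[k]) for k in x}
    return toy_model(mixed)

phi = shapley_values(value, list(x))
phi0 = value(frozenset())  # baseline output with no feature "switched on"
```

The attributions satisfy the additive decomposition: `phi0` plus the sum of the values in `phi` reproduces `toy_model(x)`.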
2.7. Evaluation Metrics
Model performance is evaluated using mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R2). MAE reflects the average magnitude of prediction errors, RMSE penalizes larger errors more heavily, and R2 measures the proportion of variance in observed PM2.5 values explained by the predictions.
2.7.1. Mean Absolute Error (MAE)
MAE calculates the average absolute difference between observed and predicted values, offering a straightforward error metric that is less sensitive to outliers than RMSE. It is defined as
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
where $y_i$ and $\hat{y}_i$ represent the observed and predicted PM2.5 values for sample $i$, respectively, and $n$ is the number of samples. A smaller MAE indicates a lower average prediction error. To compare MAE across pollution levels with different PM2.5 concentration baselines, normalized MAE was also calculated as:
$$\mathrm{MAE\%} = \frac{\mathrm{MAE}}{\bar{y}}$$
where $\bar{y}$ is the mean observed PM2.5 concentration for the corresponding daily model.
2.7.2. Root Mean Square Error (RMSE)
RMSE was used to measure prediction error while assigning greater weight to larger deviations between observed and predicted values. It is expressed as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$
where $y_i$, $\hat{y}_i$, and $n$ are as defined above.
Compared with MAE, RMSE is more sensitive to large errors and therefore provides complementary information on model performance. Similarly, normalized RMSE was calculated to reduce the influence of different PM2.5 concentration scales among pollution levels:
$$\mathrm{RMSE\%} = \frac{\mathrm{RMSE}}{\bar{y}}$$
where $\bar{y}$ is the mean observed PM2.5 concentration for the corresponding daily model.
2.7.3. R-Squared (R2)
$R^2$, or the coefficient of determination, is a unitless measure used to evaluate how much of the variance in the dependent variable is explained by the regression model. It reflects the overall goodness of fit and is defined as
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
where $y_i$ is the observed value, $\hat{y}_i$ is the predicted value, $n$ is the sample size, and $\bar{y}$ is the mean of the observed values. An $R^2$ value close to 1 indicates strong explanatory power, while a value near 0 suggests limited improvement over the mean prediction. Negative values may occur when the model performs worse than the mean-based baseline.
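The evaluation metrics defined in this section can be computed in a few lines. The sketch below uses plain Python and three hypothetical observation-prediction pairs; normalization divides each error by the mean observed value, as described in the text.

```python
import math

def mae(y, yhat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def r2(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ybar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - ybar) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def normalized(error, y):
    """Error divided by the mean observed value."""
    return error / (sum(y) / len(y))

# Hypothetical observed and predicted PM2.5 values (ug/m3) for one daily model.
y_obs = [80.0, 100.0, 120.0]
y_pred = [85.0, 95.0, 125.0]

mae_val = mae(y_obs, y_pred)            # 5.0
rmse_val = rmse(y_obs, y_pred)          # 5.0
r2_val = r2(y_obs, y_pred)              # 1 - 75/800 = 0.90625
nmae_val = normalized(mae_val, y_obs)   # 5 / 100 = 0.05
```

Because every error in this toy example has the same magnitude, MAE and RMSE coincide; with unequal errors RMSE is always at least as large as MAE.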
3. Results
According to the pollution-day definition and temporal statistics presented in Section 2.3, each day was classified as a mild, moderate, or severe PM2.5 pollution day according to the PM2.5 pollution-level range into which the daily maximum PM2.5 concentration among the 65 monitoring stations fell. Based on this definition, a total of 425 pollution days were identified during the three-year study period and retained for subsequent analysis.
For each selected pollution day, observations from all 65 stations were used for modeling, including stations that did not reach the pollution level assigned to that day, with PM2.5 as the response variable and nine explanatory variables (five meteorological factors and four gaseous pollutants). An 8:2 split was applied to the station observations to construct the training and test datasets; that is, the split was applied across stations rather than across the temporal sequence of the full study period. CatBoost was used to perform daily spatial prediction of PM2.5, while TreeSHAP was employed to quantify the contributions of the nine predictors.
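A station-wise 8:2 split of the kind described above might look as follows. This is a sketch under assumptions: the text does not state whether the split was random, whether it was redrawn for each day, or how it was seeded, so the random shuffle, the fixed seed, and the station IDs are illustrative.

```python
import random

def split_stations(station_ids, test_fraction=0.2, seed=0):
    """Randomly split station IDs into training and test sets."""
    ids = list(station_ids)
    rng = random.Random(seed)  # fixed seed for reproducibility (assumption)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

# Hypothetical IDs for 65 stations.
stations = [f"S{i:02d}" for i in range(65)]
train_ids, test_ids = split_stations(stations)
```

With 65 stations this yields 52 training and 13 test stations per daily model, so each daily test score is based on a small sample and is best summarized across days, as the medians in the following section do.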
3.1. Performance of CatBoost for PM2.5 Spatial Prediction
Across the 425 pollution days included in the analysis, the CatBoost model showed generally good predictive skill for station-level PM2.5, although its performance varied with pollution severity. Figure 4 presents the boxplots of the model performance metrics (R2, RMSE, and MAE) over the 425 pollution days.
As shown in Figure 4, across all pollution days, the median values of R2, RMSE, and MAE were 0.866, 7.509, and 5.906, respectively (RMSE and MAE in μg/m³). For mild pollution days, the median R2, RMSE, and MAE were 0.821, 6.639, and 5.156, respectively. For moderate pollution days, the corresponding median values were 0.882, 7.544, and 6.135, representing increases of 7.4%, 13.6%, and 19.0% relative to the mild pollution level. For severe pollution days, the model achieved the highest explanatory power, with a median R2 of 0.919, while the median RMSE and MAE increased to 11.001 and 8.736, respectively, corresponding to increases of 11.9%, 65.7%, and 69.4% compared with the mild pollution level.
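The percentage increases quoted above follow directly from the reported medians, as the short check below shows. Only the median values come from the text; the dictionary layout and helper name are illustrative.

```python
# Median metrics reported in the text for each pollution level.
medians = {
    "mild": {"R2": 0.821, "RMSE": 6.639, "MAE": 5.156},
    "moderate": {"R2": 0.882, "RMSE": 7.544, "MAE": 6.135},
    "severe": {"R2": 0.919, "RMSE": 11.001, "MAE": 8.736},
}

def pct_change(new, ref):
    """Relative change of `new` with respect to `ref`, in percent."""
    return (new - ref) / ref * 100

# Percentage change of each metric relative to the mild pollution level.
changes = {
    level: {
        metric: round(pct_change(val, medians["mild"][metric]), 1)
        for metric, val in vals.items()
    }
    for level, vals in medians.items()
    if level != "mild"
}
# changes["moderate"] -> {"R2": 7.4, "RMSE": 13.6, "MAE": 19.0}
# changes["severe"]   -> {"R2": 11.9, "RMSE": 65.7, "MAE": 69.4}
```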
To account for differences in baseline PM2.5 concentrations among pollution-day scenarios, normalized RMSE (RMSE%) and normalized MAE (MAE%) were additionally calculated for each daily model by dividing the RMSE and MAE by the mean observed PM2.5 concentration across all 65 stations on that day. Across all pollution days, the median values of RMSE% and MAE% were 0.85 and 0.89, respectively. The median RMSE% and MAE% were 0.88 and 0.98 for mild pollution days, 0.87 and 0.94 for moderate pollution days, and 0.89 and 0.88 for severe pollution days, respectively. Overall, the normalized errors showed only minor differences among pollution levels.
3.2. SHAP-Based Interpretation of Meteorological and Gaseous Pollutant Contributions
To further interpret the CatBoost model, the SHAP results of the nine explanatory variables were summarized by mean absolute SHAP values, relative percentage of contributions among the nine variables, rankings, and sign frequencies across pollution levels and study years (Table 2).
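The summary quantities listed above can be derived from a matrix of per-sample SHAP values as sketched below. The SHAP values shown are hypothetical and limited to three variables for brevity, and defining the sign frequency as the fraction of samples with a positive SHAP value is an assumption, since the text does not give the exact definition.

```python
def summarize_shap(shap_rows, names):
    """Mean |SHAP|, relative share (%), rank, and positive-sign frequency per feature."""
    n = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[k]) for row in shap_rows) / n
        for k, name in enumerate(names)
    }
    total = sum(mean_abs.values())
    order = sorted(names, key=lambda nm: mean_abs[nm], reverse=True)
    return {
        name: {
            "mean_abs": mean_abs[name],
            "share_percent": 100 * mean_abs[name] / total,
            "rank": order.index(name) + 1,
            "positive_freq": sum(row[k] > 0 for row in shap_rows) / n,
        }
        for k, name in enumerate(names)
    }

# Hypothetical SHAP values: rows are samples, columns follow `names`.
names = ["CO", "PRE", "WS"]
shap_rows = [
    [6.0, -2.0, 1.0],
    [-4.0, 3.0, -1.0],
    [8.0, -1.0, 2.0],
    [-2.0, 2.0, 0.0],
]
summary = summarize_shap(shap_rows, names)
```

In this toy case CO has a mean absolute SHAP value of 5.0 out of a total of 8.0, giving a relative share of 62.5% and rank 1.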
Overall, the contribution structure showed a clear concentration pattern. Across all year–pollution combinations, CO, PRE, WS, and T consistently accounted for the vast majority of the total SHAP contribution, with a combined relative share of approximately 88–95%. This indicates that PM2.5 prediction mainly relied on a small core group of variables, rather than being evenly explained by all nine predictors. Within this group, CO made the largest and most stable contribution to model prediction, while PRE, WS, and T together formed the leading meteorological controls, although their internal ranking varied slightly across years and pollution levels. By contrast, SO2, RH, and NO2 provided only limited supplementary contributions, and O3 and P remained negligible throughout.
Among the gaseous pollutants, the contribution structure was overwhelmingly dominated by CO, whereas SO2, NO2, and especially O3 played much smaller roles. Among the meteorological variables, the main contribution came from PRE, WS, and T, while RH and P were consistently weak. Therefore, the overall SHAP structure was primarily shaped by one dominant gaseous pollutant indicator (CO) and three major meteorological regulators (PRE, WS, and T).
Across pollution levels, most variables showed increasing mean absolute SHAP values from mild to moderate to severe pollution, suggesting that their contributions generally strengthened as pollution intensified. This pattern was particularly clear for CO, and was also evident for WS, T, SO2, RH, NO2, O3, and P. However, PRE did not show a regular monotonic trend, but instead displayed substantial interannual and inter-level fluctuations. This irregularity likely reflects the event-dependent nature of precipitation, whose contribution depends more on whether rainfall occurs and under what pollution context it occurs than on pollution severity alone.
The annual results further confirmed the robustness of this structure. CO remained the top-ranked variable under mild, moderate, and severe pollution in all three years. PRE, WS, and T consistently occupied the next most important positions, although their internal order varied slightly across years and pollution levels. In contrast, SO2, RH, and NO2 generally remained in the middle positions, while O3 and P consistently ranked at the bottom. Overall, Table 2 indicates that PM2.5 prediction in the BTH region was mainly explained by the combined contribution of CO, PRE, WS, and T, while the remaining variables played secondary or negligible roles.
3.3. SHAP Dependence Patterns of the Dominant Variables Under Different Covariate Backgrounds
Based on the results of Section 3.2, the analysis of SHAP dependence patterns focuses on CO, PRE, WS, and T, which together accounted for the vast majority of the total SHAP contribution. Figures S1–S4 show the SHAP dependence plots of these four variables under mild PM2.5 pollution conditions, Figures S5–S8 show the corresponding plots under moderate pollution conditions, and Figures S9–S12 show the corresponding plots under severe pollution conditions.
Each supplementary figure contains eight subfigures (a–h), with each subfigure showing the SHAP dependence pattern of the focal variable under the background of one of the other eight explanatory variables. In each subfigure, the x-axis represents the value of the focal variable, the y-axis represents its SHAP value for PM2.5 prediction, and the point color indicates the value of one of the other eight covariates. These plots therefore illustrate both the dependence pattern of each focal variable and its variation under different covariate backgrounds.
Among the four focal variables, CO showed the clearest and most stable dependence pattern across the three pollution levels. In Figures S1, S5 and S9, CO-SHAP generally increased with increasing CO concentration, and the SHAP range became wider under heavier pollution conditions. In addition, relatively clear background relationships were observed with RH and NO2, while a weaker relationship was also visible with SO2, especially under mild and moderate pollution conditions.
PRE showed a nonlinear and irregular dependence pattern across all three pollution levels. Although PRE values were concentrated mainly in the low-value range, the corresponding SHAP values still exhibited considerable vertical variation, without a clear monotonic trend with increasing precipitation. Compared with the other focal variables, PRE did not show consistently clear relationships with the other eight covariates.
For WS, the dependence pattern with its own values was also nonlinear and without a clear monotonic trend. However, relatively clear background relationships were observed with CO and NO2. In Figures S3, S7 and S11, higher CO and NO2 values were generally associated with higher WS-SHAP values, and this pattern became more evident under severe pollution conditions.
For T, the dependence pattern with its own values was broad and nonlinear across the three pollution levels. Relatively clear background relationships were observed with CO, NO2, and especially P. In Figures S4, S8 and S12, higher CO, NO2, and P values were generally associated with higher T-SHAP values, and this pattern became slightly clearer under severe pollution conditions. Among these covariates, the relationship with P appeared to be the most distinct.
Overall, the dependence plots showed that CO had the clearest self-dependence, whereas PRE, WS, and T mainly exhibited nonlinear and irregular dependence patterns with their own values. At the same time, WS showed relatively clear background relationships with CO and NO2, and T showed relatively clear background relationships with CO, NO2, and P.
4. Discussion
4.1. Implications of Model Performance Differences Across Pollution Levels
The contrasting behavior of R2, RMSE, and MAE, and the normalized error metrics across pollution levels suggests that changes in pollution conditions affected model performance in different ways. Although R2 decreased from severe to mild pollution, the relative reduction was modest. This indicates that the model’s ability to capture the spatial variation in PM2.5 remained comparatively stable across pollution levels, whereas the magnitude of absolute prediction errors was more strongly influenced by pollution intensity.
The higher R2 under severe pollution indicates that the spatial variation in PM2.5 became more structured, allowing the model to capture relative differences among stations more effectively. This may be because severe pollution episodes are more strongly influenced by regional transport, pollutant accumulation, and stable meteorological conditions, which together can produce a more persistent and organized spatial pattern of PM2.5. By contrast, the lower and broader R2 under mild pollution implies that PM2.5 spatial patterns were less stable or less clearly organized under cleaner conditions.
Meanwhile, the increase in RMSE and MAE from mild to severe pollution indicates that absolute prediction errors became larger as PM2.5 concentrations increased. However, after normalization by the mean observed PM2.5 concentration of each daily model, the relative errors showed only minor differences among pollution levels. This suggests that the larger absolute errors under severe pollution were partly associated with higher PM2.5 concentration levels, rather than a clear decline in relative model performance. Therefore, severe pollution conditions appeared to improve the predictability of relative spatial patterns while simultaneously increasing the magnitude of absolute prediction errors.
4.2. Dominant Variable Contributions Across Pollution Levels
A clear pattern in Table 2 is that the influence of most variables on model prediction increased with pollution severity. For the majority of predictors, mean absolute SHAP values were higher under moderate pollution than under mild pollution, and became highest under severe pollution. In particular, the four dominant variables identified in Section 3.2 (CO, PRE, WS, and T) generally became more influential as pollution intensified. However, an increase in mean absolute SHAP value should be interpreted as a stronger contribution to model prediction, rather than as evidence of a uniformly positive effect on PM2.5.
Within this overall trend, CO remained the top-ranked variable across all years and pollution levels. PRE, WS, and T consistently remained within the top four, although their internal order varied slightly. In the total three-year results, PRE contributed more than WS under mild and moderate pollution, whereas WS exceeded PRE under severe pollution. This suggests that the relative hierarchy among the dominant meteorological variables changed as pollution intensified. The contrast was particularly evident in Year 2, when PRE dropped to the fourth position under severe pollution, while WS remained among the leading contributors. Overall, Table 2 indicates that increasing pollution severity not only strengthened the influence of the dominant variables, but also reshaped their relative hierarchy, especially among the leading meteorological controls.
4.2.1. Stable Dominance of CO Across Years and Pollution Levels
Among all explanatory variables, CO showed the most stable dominance across the three study years and all pollution levels. Its persistent first-place ranking suggests that CO was the largest and most robust predictor of PM2.5 in this study. Unlike the other dominant variables, CO also showed a relatively clear dependence pattern in Section 3.3, with higher CO concentrations generally associated with higher CO-SHAP values, making its role easier to interpret in physical terms.
The stable importance of CO may reflect its close association with combustion-related emissions and pollution accumulation. In the BTH region, CO is commonly linked to traffic, residential heating, and other incomplete combustion sources that also contribute directly or indirectly to PM2.5 formation [27,28]. In addition, CO may serve as an integrated indicator of accumulation-favorable conditions, because both CO and PM2.5 can accumulate under weak atmospheric dispersion and stagnant meteorological conditions [29,30]. However, the high SHAP importance of CO should be interpreted as a strong predictive association rather than direct causal evidence. Because CO is often co-emitted with PM2.5 or its precursors and is also influenced by similar accumulation conditions, its contribution may partly reflect co-pollutant conditions and combustion-related accumulation processes, rather than an independent physical driving effect. Therefore, although the dominance of CO was statistically stable and physically plausible, CO should be interpreted mainly as an indicator of combustion-related pollution accumulation rather than as an isolated physical driver of PM2.5 variation.
4.2.2. Roles of the Leading Meteorological Regulators: PRE, WS, and T
Among the meteorological variables, PRE, WS, and T consistently ranked within the top four across years and pollution levels, indicating that the leading meteorological controls on PM2.5 in this study were associated with scavenging, dispersion, and broader thermal background conditions.
PRE mainly reflects the removal effect of precipitation on airborne particles. Rainfall can reduce PM2.5 concentrations through wet deposition, but its contribution was highly variable across years and pollution levels. This suggests that the role of precipitation was strongly event-dependent and likely controlled by the occurrence, timing, and intensity of rainfall. The sharp reduction in PRE importance under severe pollution in Year 2 may indicate that rainfall was too limited or too infrequent to provide effective scavenging during those heavy-pollution events.
WS mainly represents the role of ventilation and dispersion. Higher wind speed generally promotes pollutant dilution and transport, whereas weak wind conditions favor stagnation and pollutant buildup. Its relatively strong contribution, especially under severe pollution, indicates that dispersion-related conditions became increasingly important in differentiating PM2.5 levels when pollution was already intense. The fact that WS generally exceeded PRE under severe pollution suggests that, during intense pollution episodes, weakly ventilated and stagnant atmospheric conditions may have played a more important role than episodic wet scavenging. Even when precipitation occurred, its removal effect may have been insufficient to offset the stronger control exerted by poor dispersion.
Compared with PRE and WS, T likely reflects a more complex background influence. Temperature may partly act as a proxy for broader seasonal and meteorological conditions associated with PM2.5 pollution. For example, lower temperatures are often associated with more stable atmospheric conditions and, at the same time, with stronger residential heating demand, both of which may favor PM2.5 accumulation. Thus, the role of T in this study may reflect combined thermal, seasonal, and emission-related backgrounds rather than a simple one-way temperature effect.
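As an illustration of how a ranking such as "within the top four" can be derived, the following minimal sketch computes each feature's share of the total mean absolute SHAP value and sorts the features accordingly. The SHAP matrix is synthetic and the helper names (`mean_abs_shap`, `importance_shares`) are hypothetical; this is one common way to summarize SHAP importance and is not necessarily the exact procedure used in this study.

```python
# Synthetic SHAP matrix: rows are samples, columns are features.
# These numbers are illustrative only and are NOT results from this study.
features = ["CO", "PRE", "WS", "T", "RH"]
shap_values = [
    [4.0, -1.0, 0.5, -0.5, 0.2],
    [-3.0, 2.0, -1.5, 1.0, -0.1],
    [5.0, -0.5, 1.0, -1.5, 0.3],
    [-2.0, 1.5, -1.0, 0.5, -0.2],
]


def mean_abs_shap(matrix):
    """Mean absolute SHAP value of each feature (column) over all samples."""
    n_samples = len(matrix)
    n_features = len(matrix[0])
    return [sum(abs(row[j]) for row in matrix) / n_samples for j in range(n_features)]


def importance_shares(names, matrix):
    """Share of each feature in the summed mean |SHAP|, sorted from largest to smallest."""
    importance = mean_abs_shap(matrix)
    total = sum(importance)
    shares = {name: value / total for name, value in zip(names, importance)}
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))


shares = importance_shares(features, shap_values)
# Combined share of the four highest-ranked features.
top4_share = sum(list(shares.values())[:4])

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Top-four combined share: {top4_share:.1%}")
```

Because the shares are normalized within each model, they can be compared across years and pollution levels even when the absolute SHAP magnitudes differ with the concentration scale.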
4.3. Physical Interpretation of the Dependence Patterns of the Dominant Variables
The dependence patterns of the dominant variables provide further insight into how they contributed to PM2.5 prediction under different pollution and meteorological backgrounds. Among the four dominant variables, CO showed the clearest and most stable self-dependence. CO-SHAP generally increased with increasing CO concentration across mild, moderate, and severe pollution conditions, indicating that the value of CO itself had a direct influence on its SHAP response. In addition to this self-dependence, CO-SHAP also showed relatively clear background relationships with RH and NO2, and a weaker relationship with SO2. Physically, higher RH likely reflects a more accumulation-favorable atmospheric background, while higher NO2 and SO2 indicate stronger coexisting combustion-related pollutant loads. Under such conditions, the contribution of CO to PM2.5 prediction became more pronounced.
By contrast, WS did not show a clear monotonic relationship with its own values. Instead, its SHAP values varied more clearly with CO and NO2 backgrounds, especially under severe pollution conditions. This suggests that the effect of wind speed became more important when the surrounding pollution burden was already high. In other words, under heavily polluted conditions, ventilation and dispersion played a larger role in determining PM2.5 levels.
A similar pattern was observed for T. T-SHAP did not show a clear monotonic relationship with temperature itself, but it varied more clearly under different CO, NO2, and especially pressure (P) backgrounds. This suggests that temperature did not act mainly as an isolated direct driver. Instead, T likely reflected a broader seasonal and meteorological background. In particular, lower temperatures may be associated with wintertime conditions characterized by higher combustion-related CO and NO2 levels and weaker atmospheric dispersion under higher-pressure conditions. Therefore, T is better interpreted as a proxy for broader pollution and meteorological environments than as a simple one-way temperature effect.
Compared with the other dominant variables, PRE showed the least regular dependence pattern. Although PRE values were mostly concentrated in the low range, PRE-SHAP still varied substantially, without a clear monotonic relationship with precipitation itself or with the backgrounds of the other covariates. This suggests that the contribution of PRE was highly event-dependent. In physical terms, precipitation does not act as a continuously operating background factor, but as an episodic regulator whose effect depends on whether rainfall occurs and on its timing, duration, and intensity relative to the pollution episode.
Overall, the dependence plots suggest that the dominant variables contributed to PM2.5 prediction in different ways. CO showed the clearest self-dependence, whereas WS and T were more background-dependent and PRE behaved more as an episodic nonlinear regulator. Together, these patterns indicate that PM2.5 prediction in the BTH region depended not only on the dominant variables themselves, but also on the broader covariate backgrounds in which they operated.
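The contrast between a clear self-dependence (CO) and the absence of a monotonic relationship (WS, T, PRE) can also be expressed numerically. The sketch below uses the Spearman rank correlation between a feature's values and its own SHAP values as a simple monotonicity measure. This is an assumed diagnostic added for illustration, not a statistic reported in this study, and all data values are synthetic.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Synthetic examples (not data from this study).
co = [0.4, 0.6, 0.9, 1.2, 1.8, 2.5]
co_shap = [-6.0, -3.5, -1.0, 2.0, 5.5, 9.0]   # rises steadily with CO
ws = [0.8, 1.5, 2.1, 2.9, 3.6, 4.4]
ws_shap = [1.0, -2.0, 2.5, -1.5, 0.5, -0.5]   # no consistent trend with WS

rho_co = spearman(co, co_shap)
rho_ws = spearman(ws, ws_shap)
print(f"CO self-dependence (Spearman rho): {rho_co:.2f}")
print(f"WS self-dependence (Spearman rho): {rho_ws:.2f}")
```

A coefficient near 1 corresponds to the monotonic increase described for CO-SHAP, whereas a coefficient near 0 corresponds to a SHAP response that is better explained by the covariate background than by the feature's own value.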
4.4. Spatial Implications of Pollution Days in BTH
The spatial statistics of pollution-level days showed that the highest numbers of PM2.5 pollution days were concentrated in Handan, Baoding, and Shijiazhuang, followed by Tianjin and Beijing (Figure 3 and Section 2.4). This pattern suggests that the occurrence of pollution days in the BTH region was shaped by the combined influence of anthropogenic emissions, topographic constraints, meteorological conditions, and regional transport. At the regional scale, the BTH area is characterized by the Taihang Mountains to the west and the Yanshan Mountains to the north, a topographic setting that can weaken atmospheric ventilation and favor pollutant accumulation under stagnant weather conditions. This regional background helps explain why several core BTH cities consistently ranked high in the frequency and severity of PM2.5 pollution days.
For Shijiazhuang, Tianjin, and Beijing, their high ranking can generally be understood in the context of their roles as major metropolitan centers in the BTH region, where dense population, intensive traffic activity, high energy consumption, and concentrated urban functions contribute to strong anthropogenic emission backgrounds. The ranking order among these cities, from Shijiazhuang to Tianjin and then to Beijing, may further reflect differences in the strength and duration of air pollution control efforts.
By contrast, the high ranking of Handan and Baoding may reflect more city-specific disadvantages beyond the common metropolitan background. In Handan, the frequent pollution days may be more closely associated with its historically heavy industrial structure and combustion-related emissions, which is broadly consistent with the dominant role of CO identified in this study. In Baoding, the concentration of pollution days may be more strongly linked to unfavorable dispersion conditions and its position within the regional pollution transport pathway.
4.5. Scale-Dependent Interpretation of Meteorological Roles
In this study, each air quality monitoring station was assigned meteorological variables from the nearest meteorological station. This approach has the advantage of using direct observational meteorological records and avoids additional uncertainty introduced by meteorological interpolation or model-based reconstruction. However, because meteorological stations are fewer than air quality stations, several nearby air quality stations, especially within the same city or densely populated urban area, may share the same meteorological record.
As a result, local-scale meteorological differences among these air quality stations cannot be fully represented. For example, several air quality stations may have identical wind speed, temperature, pressure, or precipitation inputs, while their PM2.5 concentrations differ due to local emissions, traffic intensity, urban structure, or other microscale factors. Therefore, the meteorological roles identified in this study should be interpreted mainly at the city or regional scale, rather than at the microscale within urban areas.
This limitation may smooth local meteorological heterogeneity and reduce the model’s ability to distinguish microscale meteorological effects on PM2.5 spatial variability. Nevertheless, the matched meteorological data still provide meaningful city-scale and regional-scale meteorological background information.
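The nearest-station matching described above, and the record sharing it produces, can be sketched as follows. The station identifiers and coordinates are hypothetical, and the great-circle (haversine) distance is an assumption, since the distance metric is not specified in this section.

```python
import math
from collections import Counter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical stations as (latitude, longitude); not the study's station list.
met_stations = {"MET_A": (39.80, 116.47), "MET_B": (38.04, 114.42)}
aq_stations = {
    "AQ_1": (39.93, 116.42),
    "AQ_2": (39.88, 116.36),
    "AQ_3": (38.05, 114.50),
}


def assign_nearest(aq, met):
    """Map each air quality station to the ID of its nearest meteorological station."""
    return {
        aq_id: min(met, key=lambda met_id: haversine_km(lat, lon, *met[met_id]))
        for aq_id, (lat, lon) in aq.items()
    }


matches = assign_nearest(aq_stations, met_stations)
# Number of air quality stations that share each meteorological record.
shared = Counter(matches.values())

for aq_id, met_id in matches.items():
    print(f"{aq_id} -> {met_id}")
print(dict(shared))
```

In this toy case, two air quality stations in the same urban area receive identical meteorological inputs, which is the situation that limits the interpretation to the city or regional scale.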
5. Conclusions
This study investigated the meteorological and gaseous pollutant drivers of PM2.5 across mild, moderate, and severe pollution-day scenarios in the Beijing–Tianjin–Hebei (BTH) region from 1 November 2021 to 31 October 2024 using the CatBoost–SHAP framework. The key findings are as follows.
(1) The CatBoost model showed good station-level PM2.5 spatial prediction ability across pollution days. Severe pollution cases had the highest R2, indicating a clearer spatial structure of PM2.5, while the increase in RMSE and MAE was mainly associated with the higher concentration scale and stronger spatial heterogeneity during severe pollution days, as RMSE% and MAE% differed only slightly among pollution levels.
(2) The SHAP results showed a highly concentrated contribution structure. Across all years and pollution levels, CO, PRE, WS, and T accounted for the vast majority of the total SHAP contribution. CO remained the dominant and most stable predictor, which likely reflects combustion-related emissions and pollutant accumulation processes; its importance should be interpreted as a robust predictive association rather than as direct causal evidence.
(3) The importance of most variables increased with pollution severity. In particular, PRE contributed more than WS under mild and moderate pollution, whereas WS exceeded PRE under severe pollution, indicating a shift in the relative roles of scavenging and dispersion as pollution intensified.
(4) The dependence plots showed different contribution patterns among the dominant variables. CO showed the clearest self-dependence, WS and T were more background-dependent, and PRE behaved more as an episodic nonlinear regulator.
(5) PM2.5 pollution days in the BTH region showed clear temporal and spatial heterogeneity. Pollution days occurred mainly in winter and spring, and were most concentrated in Handan, Baoding, and Shijiazhuang, followed by Tianjin and Beijing.
(6) The meteorological effects identified in this study should be interpreted mainly at the city or regional scale rather than the microscale.
Overall, these findings indicate that mild, moderate, and severe PM2.5 pollution-day scenarios in the BTH region were jointly shaped by combustion-related accumulation, dominant meteorological regulators, and regional spatial–temporal heterogeneity. Because this study focused only on selected pollution days rather than regular, clean, or low-level PM2.5 conditions, the results should be interpreted as pollution-episode-specific and should not be directly generalized to the full PM2.5 concentration range.