High-Resolution Spatial Prediction of Daily Average PM2.5 Concentrations in Jiangxi Province via a Hybrid Model Integrating Random Forest and XGBoost
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Datasets
2.2.1. Ground Station PM2.5 Data
2.2.2. MODIS Data
2.2.3. Meteorological Data
2.2.4. Elevation Data
2.2.5. Land Use Data
2.2.6. Data Integration
2.3. Build RF-XGBoost Model
2.4. Validation
3. Results
3.1. RF-XGBoost Performance
3.2. Estimated PM2.5 Mass Concentrations in Jiangxi Province
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Year | Annual Average | Spring | Summer | Autumn | Winter |
|---|---|---|---|---|---|
| 2020 | 15.64 | 15.34 | 9.83 | 16.49 | 22.61 |
| 2021 | 15.48 | 15.06 | 9.41 | 16.19 | 22.51 |
| 2022 | 13.75 | 14.10 | 8.69 | 15.21 | 21.29 |
| 2023 | 14.33 | 13.74 | 8.30 | 15.17 | 21.79 |
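
As a rough worked reading of the seasonal table above, the following Python sketch computes the winter-to-summer ratio for a given year and the relative change in the annual average between two years. The numbers are copied from the table; the dictionary layout and the function names `winter_summer_ratio` and `relative_change` are illustrative assumptions, not part of the original study.

```python
# Seasonal values copied from the table above (units as reported there).
SEASONAL = {
    2020: {"annual": 15.64, "spring": 15.34, "summer": 9.83, "autumn": 16.49, "winter": 22.61},
    2021: {"annual": 15.48, "spring": 15.06, "summer": 9.41, "autumn": 16.19, "winter": 22.51},
    2022: {"annual": 13.75, "spring": 14.10, "summer": 8.69, "autumn": 15.21, "winter": 21.29},
    2023: {"annual": 14.33, "spring": 13.74, "summer": 8.30, "autumn": 15.17, "winter": 21.79},
}


def winter_summer_ratio(year):
    """Ratio of the winter value to the summer value for one year."""
    row = SEASONAL[year]
    return row["winter"] / row["summer"]


def relative_change(start_year, end_year, key="annual"):
    """Percentage change of one column between two years."""
    start = SEASONAL[start_year][key]
    end = SEASONAL[end_year][key]
    return (end - start) / start * 100.0


if __name__ == "__main__":
    for year in sorted(SEASONAL):
        print(year, f"winter/summer = {winter_summer_ratio(year):.2f}")
    print(f"annual change 2020 to 2023: {relative_change(2020, 2023):.2f}%")
```

In every listed year the winter value is more than twice the summer value, which is the seasonal contrast the table is meant to convey.
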
| City | 2020 (μg/m3) | 2021 (μg/m3) | 2022 (μg/m3) | 2023 (μg/m3) | Avg (μg/m3) | Total GDP (100 Million Yuan) | Total Factory (ton) |
|---|---|---|---|---|---|---|---|
| Nanchang | 27.44 | 28.63 | 26.38 | 29.62 | 28.02 | 104,645.75 | 65 |
| Jiujiang | 36.26 | 29.36 | 27.85 | 29.36 | 30.71 | 68,724.00 | 96 |
| Jingdezhen | 24.47 | 22.64 | 20.47 | 22.37 | 22.48 | 60,671.25 | 25 |
| Pingxiang | 29.47 | 30.46 | 26.24 | 29.34 | 28.87 | 81,194.00 | 35 |
| Xinyu | 30.56 | 29.24 | 26.42 | 28.21 | 28.61 | 97,189.25 | 23 |
| Yingtan | 30.75 | 23.42 | 21.34 | 23.32 | 24.71 | 100,654.75 | 57 |
| Ganzhou | 24.63 | 21.64 | 19.43 | 20.47 | 21.54 | 47,200.75 | 35 |
| Jian | 28.45 | 26.54 | 24.45 | 27.34 | 26.69 | 57,341.25 | 45 |
| Yichun | 30.47 | 29.34 | 25.42 | 28.36 | 28.39 | 64,809.00 | 51 |
| Fuzhou | 27.75 | 24.46 | 22.64 | 24.21 | 24.77 | 51,134.50 | 39 |
| Shangrao | 21.23 | 26.42 | 21.74 | 23.21 | 22.15 | 47,979.25 | 41 |
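
The Avg column in the city table above appears to be the arithmetic mean of the four yearly values. A minimal Python sketch of that check, for three of the listed cities, is given here. The assumption that Avg is a plain four-year mean, the 0.01 tolerance, and the names `four_year_mean` and `matches_reported` are illustrative and are not taken from the original study.

```python
# Yearly city values (2020-2023) copied from the table above, in ug/m3.
YEARLY = {
    "Nanchang": [27.44, 28.63, 26.38, 29.62],
    "Jiujiang": [36.26, 29.36, 27.85, 29.36],
    "Ganzhou": [24.63, 21.64, 19.43, 20.47],
}

# Avg column as reported in the same table.
REPORTED_AVG = {"Nanchang": 28.02, "Jiujiang": 30.71, "Ganzhou": 21.54}


def four_year_mean(city):
    """Arithmetic mean of the four yearly values for one city."""
    values = YEARLY[city]
    return sum(values) / len(values)


def matches_reported(city, tolerance=0.01):
    """True if the recomputed mean agrees with the reported Avg within tolerance."""
    return abs(four_year_mean(city) - REPORTED_AVG[city]) <= tolerance


checks = {city: matches_reported(city) for city in YEARLY}

if __name__ == "__main__":
    for city, ok in checks.items():
        print(f"{city}: recomputed {four_year_mean(city):.4f}, "
              f"reported {REPORTED_AVG[city]:.2f}, consistent={ok}")
```

The same function can be applied to the remaining rows by extending the two dictionaries.
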












| Data Directory | Variable | Spatial Resolution | Temporal Resolution |
|---|---|---|---|
| PM2.5 data | Daily mean PM2.5 mass concentration | - | Hour |
| MODIS data | Top-of-Atmosphere reflectance (Band 1, 3, 7) | 1 km | Day |
| Meteorological data | Planetary boundary layer height (PBLH) | 0.25° | Hour |
| | Total precipitation (TP) | 0.25° | Hour |
| | Relative humidity (RH) | 0.25° | Hour |
| | 2 m air temperature (T2m) | 0.25° | Hour |
| | Surface skin temperature (SKT) | 0.25° | Hour |
| | Surface pressure (SP) | 0.25° | Hour |
| | 10 m northward wind speed (V10) and 10 m eastward wind speed (U10) | 0.25° | Hour |
| Elevation Data | DEM | 1 km | - |
| Land Use Data | Normalized difference vegetation index (NDVI) | 1 km | 30 days |
| Day Data | Number of days in a year | - | 1 day |
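
The table above lists hourly station PM2.5 and hourly meteorological fields alongside daily MODIS reflectance, so the hourly series presumably have to be collapsed to daily values before the sources can be matched. The Python sketch below shows one simple way to do that, assuming a plain arithmetic mean per calendar day. The function name `daily_means`, the input layout, and the sample readings are hypothetical; the study's actual aggregation procedure is not shown in this section.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(hourly):
    """Collapse (timestamp, value) pairs to one mean value per calendar day.

    hourly: iterable of (datetime, float) pairs, in any order.
    Returns a dict mapping date -> mean of that day's values.
    """
    buckets = defaultdict(list)
    for timestamp, value in hourly:
        buckets[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Made-up hourly readings, for illustration only.
sample = [
    (datetime(2020, 1, 1, 0), 20.0),
    (datetime(2020, 1, 1, 12), 30.0),
    (datetime(2020, 1, 2, 6), 10.0),
]
result = daily_means(sample)

if __name__ == "__main__":
    for day, value in result.items():
        print(day.isoformat(), value)
```

A mean is a natural choice for concentration, temperature, and pressure; an accumulated quantity such as total precipitation might instead call for a daily sum, which would only require swapping `mean` for `sum`.
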
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tang, Y.; Deng, J.; Cui, X.; Liu, Z.; Yang, L.; Zhang, S.; Liang, Y. High-Resolution Spatial Prediction of Daily Average PM2.5 Concentrations in Jiangxi Province via a Hybrid Model Integrating Random Forest and XGBoost. Atmosphere 2025, 16, 1317. https://doi.org/10.3390/atmos16121317

