Construction of a Model for Estimating PM2.5 Concentration in the Yangtze River Delta Urban Agglomeration Based on Missing Value Interpolation of Satellite AOD Data and a Machine Learning Algorithm
Abstract
1. Introduction
2. Data and Methods
2.1. Study Area
2.2. Data Sources
2.3. Data Preprocessing
1. The daily Aqua and Terra DT AOD data were matched in time and space, and the linear relationship between the two was analyzed. The daily correlation coefficient R ranged from 0.60 to 0.98, indicating a good linear correlation between the two data products. The missing AOD values on the grid were filled through linear interpolation based on the available data points. The calculation formulas are as follows:

   $$\mathrm{AOD}_{\mathrm{Aqua}} = a_1 + b_1 \, \mathrm{AOD}_{\mathrm{Terra}} \tag{1}$$

   $$\mathrm{AOD}_{\mathrm{Terra}} = a_2 + b_2 \, \mathrm{AOD}_{\mathrm{Aqua}} \tag{2}$$

   where $\mathrm{AOD}_{\mathrm{Aqua}}$ and $\mathrm{AOD}_{\mathrm{Terra}}$ on the left-hand sides represent the missing Aqua and Terra AOD values, respectively; $a_1$ and $a_2$ are intercepts; and $b_1$ and $b_2$ are slopes that are calculated by establishing a linear relationship between the non-missing Aqua and Terra AOD values. When Aqua AOD data are missing, Formula (1) is used to calculate the missing AOD value; when Terra AOD data are missing, Formula (2) is used.
2. The cubic convolution method provided by the ENVI remote sensing processing software was used to resample the Aqua and Terra DB AOD data from a resolution of 10 km to a resolution of 3 km. The sampling rate was 0.3. The resampled DB AOD data were then processed in the same way as in step 1.
3. The linearly interpolated DT AOD was fused with the DB AOD; that is, grid cells where the DT AOD data were missing were filled with the DB AOD.
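The gap-filling and fusion steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`fit_line`, `fill_pair`, `fuse`) are hypothetical, missing pixels are assumed to be stored as NaN, the line is assumed to be fitted once per day over all co-located valid pixels, and the sample values are made up. The cubic convolution resampling performed in ENVI is not reproduced here.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares fit y ~ intercept + slope * x over pixels valid in both arrays."""
    valid = ~np.isnan(x) & ~np.isnan(y)
    slope, intercept = np.polyfit(x[valid], y[valid], 1)
    return intercept, slope


def fill_pair(aqua, terra):
    """Fill each sensor's missing pixels from the other sensor's observed values."""
    a1, b1 = fit_line(terra, aqua)  # Formula (1): Aqua estimated from Terra
    a2, b2 = fit_line(aqua, terra)  # Formula (2): Terra estimated from Aqua
    aqua_filled = np.where(np.isnan(aqua), a1 + b1 * terra, aqua)
    terra_filled = np.where(np.isnan(terra), a2 + b2 * aqua, terra)
    return aqua_filled, terra_filled


def fuse(dt, db):
    """Keep DT AOD where it exists; fall back to DB AOD where DT is missing."""
    return np.where(np.isnan(dt), db, dt)


# Made-up one-day sample: Aqua is missing at index 2, Terra at index 3.
terra_dt = np.array([0.2, 0.4, 0.6, np.nan, 1.0])
aqua_dt = np.array([0.3, 0.5, np.nan, 0.9, 1.1])
aqua_filled, terra_filled = fill_pair(aqua_dt, terra_dt)

# Made-up DT/DB pair on the same grid: DB only fills cells that DT left empty.
dt = np.array([0.7, np.nan, np.nan])
db = np.array([0.5, 0.6, np.nan])
fused = fuse(dt, db)
```

Pixels that are missing in both sensors remain NaN after `fill_pair`, which is why the DB AOD fallback in `fuse` is still needed; cells missing in both DT and DB stay empty.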
2.4. Research Methods
2.4.1. Correlation Analysis
2.4.2. XGBoost Model
2.4.3. LSTM Model
3. Results
3.1. Correlation Analysis Results
3.2. Inversion Results Based on the XGBoost Model and LSTM Model
4. Conclusions
5. Limitations and Perspectives
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Variable | T AOD | A AOD | T | BLH | Pa | U | V | RH |
|---|---|---|---|---|---|---|---|---|
| Correlation Coefficient | 0.532 ** | 0.536 ** | −0.703 ** | −0.415 ** | 0.442 ** | −0.459 ** | 0.628 ** | −0.545 ** |

| Variable | National Roads | Provincial Roads | County Roads | Expressways | Urban Land | Paddy Fields | Dry Land | Urban Land | Rural Settlements | Other Construction Land |
|---|---|---|---|---|---|---|---|---|---|---|
| Correlation Coefficient | 0.190 * | 0.222 * | 0.258 ** | 0.357 ** | 0.207 * | 0.218 * | 0.300 ** | 0.223 * | 0.331 ** | 0.269 ** |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Qiu, J.; Dai, X.; Zhou, L. Construction of a Model for Estimating PM2.5 Concentration in the Yangtze River Delta Urban Agglomeration Based on Missing Value Interpolation of Satellite AOD Data and a Machine Learning Algorithm. Atmosphere 2026, 17, 11. https://doi.org/10.3390/atmos17010011
