Open Access Article

PM2.5 Prediction Based on Random Forest, XGBoost, and Deep Learning Using Multisource Remote Sensing Data

State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100101, China
University of Chinese Academy of Science, Beijing 100049, China
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Atmosphere 2019, 10(7), 373
Received: 23 May 2019 / Revised: 23 June 2019 / Accepted: 2 July 2019 / Published: 4 July 2019
(This article belongs to the Special Issue Ambient Aerosol Measurements in Different Environments)


In recent years, air pollution has become an important public health concern. High concentrations of fine particulate matter with a diameter of less than 2.5 µm (PM2.5) are known to be associated with lung cancer, cardiovascular disease, respiratory disease, and metabolic disease. Predicting PM2.5 concentrations can help governments warn people at high risk, thus mitigating these complications. Although attempts have been made to predict PM2.5 concentrations, the factors influencing PM2.5 prediction have not been investigated. In this work, we study feature importance for PM2.5 prediction in Tehran’s urban area using three machine learning (ML) approaches: random forest, extreme gradient boosting (XGBoost), and deep learning. We use 23 features in the modeling, including satellite and meteorological data, ground-measured PM2.5, and geographical data. The best model performance, R² = 0.81 (R = 0.9), MAE = 9.93 µg/m³, and RMSE = 13.58 µg/m³, was obtained with the XGBoost approach after eliminating unimportant features. However, all three ML methods performed similarly: R² varied from 0.63 to 0.67 when Aerosol Optical Depth (AOD) at 3 km resolution was included, and from 0.77 to 0.81 when AOD at 3 km resolution was excluded. Unlike the PM2.5 lag data, satellite-derived AODs did not improve model performance.
Keywords: PM2.5; prediction; XGBoost; random forest; deep learning; feature importance
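The three scores reported in the abstract (R², MAE, and RMSE) can be illustrated with a minimal sketch. It assumes R² is the coefficient of determination, 1 − SS_res/SS_tot; the abstract's pairing of R² = 0.81 with R = 0.9 suggests the squared correlation may have been used instead, and the two definitions can differ. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the article's data.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, mae, rmse) for paired observed and predicted values.

    r2 is computed as 1 - SS_res / SS_tot (an assumption; see lead-in).
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Mean absolute error, in the same units as the inputs (e.g. µg/m³).
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error penalizes large errors more heavily than MAE.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Total sum of squares around the mean of the observations.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, mae, rmse


# Hypothetical PM2.5 values in µg/m³, for illustration only.
observed = [35.0, 50.0, 42.0, 60.0, 28.0]
predicted = [38.0, 47.0, 45.0, 55.0, 30.0]

r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, which matches the ordering of the reported values (13.58 versus 9.93 µg/m³).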

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Zamani Joharestani, M.; Cao, C.; Ni, X.; Bashir, B.; Talebiesfandarani, S. PM2.5 Prediction Based on Random Forest, XGBoost, and Deep Learning Using Multisource Remote Sensing Data. Atmosphere 2019, 10, 373.

Atmosphere, EISSN 2073-4433. Published by MDPI AG, Basel, Switzerland.