Article

A Comparison of Machine Learning Methods to Forecast Tropospheric Ozone Levels in Delhi

1 V. Sue Cleveland High School, Rio Rancho, NM 87144, USA
2 Computational Physics and Methods Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
* Author to whom correspondence should be addressed.
Academic Editors: Valentine Anantharaj, Forrest M. Hoffman, Udaysankar S. Nair and Samantha Vanessa Adams
Atmosphere 2022, 13(1), 46; https://doi.org/10.3390/atmos13010046
Received: 11 November 2021 / Revised: 19 December 2021 / Accepted: 24 December 2021 / Published: 28 December 2021
(This article belongs to the Special Issue Machine Learning Applications in Earth System Science)
Ground-level ozone is a pollutant that is harmful to urban populations, particularly in developing countries where it is present in significant quantities. It greatly increases the risk of heart and lung diseases and harms agricultural crops. This study hypothesized that, as a secondary pollutant, ground-level ozone is amenable to 24 h forecasting based on measurements of weather conditions and primary pollutants such as nitrogen oxides and volatile organic compounds. We developed software to analyze hourly records of 12 air pollutants and 5 weather variables over the course of one year in Delhi, India. To determine the best predictive model, eight machine learning algorithms were tuned, trained, tested, and compared using cross-validation with hourly data for a full year. The algorithms, ranked by R² values, were XGBoost (0.61), Random Forest (0.61), K-Nearest Neighbor Regression (0.55), Support Vector Regression (0.48), Decision Trees (0.43), AdaBoost (0.39), and linear regression (0.39). When trained by separate seasons across five years, the predictive capabilities of all models increased, with a maximum R² of 0.75 during winter. Bidirectional Long Short-Term Memory was the least accurate model for annual training, but produced some of the best predictions for seasonal training. Out of five air quality index categories, the XGBoost model predicted the correct category 24 h in advance 90% of the time when trained with full-year data. Separated by season, winter was considerably more predictable (97.3%), followed by post-monsoon (92.8%), monsoon (90.3%), and summer (88.9%). These results show the importance of training machine learning methods with season-specific data sets and of comparing a large number of methods for specific applications.
Keywords: ozone prediction; pollutant forecasting; machine learning; atmospheric monitoring; air quality
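
The abstract reports two kinds of scores: R² for the predicted ozone values and the fraction of forecasts that fall in the correct air quality index category. The sketch below shows one way such scores could be computed. It is an illustration only, not the authors' software: the function names, the category breakpoints, and the sample values are all hypothetical, and the paper's actual index thresholds and units are not given in this text.

```python
from bisect import bisect_right


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical boundaries separating five categories (illustrative only;
# these are NOT the breakpoints used in the paper).
BREAKPOINTS = [50.0, 100.0, 150.0, 200.0]


def category(value, breakpoints=BREAKPOINTS):
    """Map a concentration to a category index from 0 to len(breakpoints)."""
    return bisect_right(breakpoints, value)


def category_accuracy(observed, predicted):
    """Fraction of forecasts whose category matches the observed category."""
    hits = sum(category(o) == category(p) for o, p in zip(observed, predicted))
    return hits / len(observed)


# Made-up observed values and 24 h-ahead forecasts, for demonstration only.
observed = [30.0, 60.0, 120.0, 180.0, 220.0]
predicted = [35.0, 55.0, 95.0, 170.0, 230.0]

print("R^2:", r2_score(observed, predicted))
print("Category accuracy:", category_accuracy(observed, predicted))
```

The two scores can diverge: a forecast with a small numerical error can still land in the wrong category when the observed value sits near a boundary, as the third sample pair does here.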
MDPI and ACS Style

Juarez, E.K.; Petersen, M.R. A Comparison of Machine Learning Methods to Forecast Tropospheric Ozone Levels in Delhi. Atmosphere 2022, 13, 46. https://doi.org/10.3390/atmos13010046
