Short-Term Air Pollution Forecasting Using Embeddings in Neural Networks
Abstract
1. Introduction
2. NO2 Emissions and Short-Term Forecasting
Using Machine-Learning to Predict Pollutant Concentrations
3. NO2 Seasonality Analysis in the City of Erfurt
4. On the Use of Embedding Layer in Neural Network: Encoding Traffic
4.1. DNN + Embedding
Embedding Layer for Calendar Features
- Hour: The categorical variable takes 24 distinct values, so its one-hot encoding has dimension 24. For its embedding dimension we chose six, which follows the recommendation to use one quarter of the input dimension for the embedding space [33].
- Weekday: The categorical variable takes N distinct values, where N depends on the representation of holidays outlined below. Its embedding dimension is two. In our study we set N = 10, covering seven weekdays and three types of holiday: partial holiday, public holiday, and bridge day. These three types describe days influenced by public holidays: public is the public holiday itself, partial is a public holiday in only parts of Germany, and bridge is a day between a public holiday and a weekend. A list of the holidays used in this study can be found in Appendix A.
- Month: The categorical variable takes 12 distinct values; its embedding dimension is three.
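To make the dimensions above concrete, the following is a minimal sketch of what an embedding lookup does, written in plain Python rather than in the framework used for the study. The zero-based indices, the random initialisation range, and the names `init_embedding` and `embed` are assumptions for illustration; in a trained network the table entries would be learned together with the other weights.

```python
import random

# (number of categories, embedding dimension) as stated in the list above.
# The weekday cardinality of 10 assumes seven weekdays plus three holiday types.
CALENDAR_FEATURES = {
    "hour": (24, 6),
    "weekday": (10, 2),
    "month": (12, 3),
}


def init_embedding(num_categories, dim, rng):
    """Create a lookup table holding one dense vector per category."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(num_categories)]


def embed(tables, sample):
    """Look up each categorical index and concatenate the dense vectors."""
    vector = []
    for name in CALENDAR_FEATURES:
        vector.extend(tables[name][sample[name]])
    return vector


rng = random.Random(0)
tables = {name: init_embedding(n, d, rng)
          for name, (n, d) in CALENDAR_FEATURES.items()}

# Hypothetical sample: 08:00 on a Tuesday in March (zero-based indices).
sample = {"hour": 8, "weekday": 1, "month": 2}
dense_input = embed(tables, sample)

one_hot_width = sum(n for n, _ in CALENDAR_FEATURES.values())
embedded_width = sum(d for _, d in CALENDAR_FEATURES.values())
print(one_hot_width, embedded_width, len(dense_input))  # 46 11 11
```

With these sizes the three calendar variables occupy 11 dense inputs instead of the 46 binary inputs a one-hot encoding would need.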
5. Experimental Study
5.1. Data
- Wind speed (km/h);
- Wind direction (degree);
- Precipitation (mm/h);
- Temperature (°C);
- Pressure (hPa);
- CAPE (J/kg);
- Radiation (W/m²).
5.2. Parameters, Data Structure and Models Configuration
6. Results
- Forecast the next 24 h;
- Forecast the next 72 h;
- Forecast the next 120 h.
6.1. Interpretability through the Embedding Space
6.2. Our Contribution to the Sustainable Development Goals
- No poverty;
- Zero hunger;
- Good health and well-being;
- Quality education;
- Gender equality;
- Clean water and sanitation;
- Affordable and clean energy;
- Decent work and economic growth;
- Industry, innovation and infrastructure;
- Reduced inequalities;
- Sustainable cities and communities;
- Responsible consumption and production;
- Climate action;
- Life below water;
- Life on land;
- Peace, justice and strong institutions;
- Partnerships for the goals.
7. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Detailed Classification of Public Holidays
- public holidays: Christmas, Day After Christmas, New Year's Day, First of May (International Workers' Day), Day of German Unity, Good Friday, Easter Sunday, Easter Monday, Ascension Day, Pentecost Monday;
- partial holidays: Assumption of Mary, Reformation Day, All Hallows' Day, Day of Prayer and Repentance, Pentecost Sunday, the Christmas week;
- bridge days: all days between public holidays, Fridays if Thursdays are public holidays, and Mondays if Tuesdays are public holidays.
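The classification above can be sketched as a small function that maps a date to one of the ten weekday categories. This is an illustrative reading, not the study's implementation: the index layout (0 to 6 for Monday to Sunday, then 7, 8, 9), the precedence public over partial over bridge, and the interpretation of "all days between public holidays" as a day with a public holiday on both adjacent days are assumptions, as are the function names.

```python
from datetime import date, timedelta

# Assumed index layout: 0-6 = Monday-Sunday, then the three holiday types.
PARTIAL, PUBLIC, BRIDGE = 7, 8, 9
ONE_DAY = timedelta(days=1)


def is_bridge_day(day, public_holidays):
    """Apply the bridge-day rules to a day that is not itself a holiday."""
    prev_day, next_day = day - ONE_DAY, day + ONE_DAY
    # A day enclosed by public holidays on both sides (one possible reading
    # of "all days between public holidays").
    if prev_day in public_holidays and next_day in public_holidays:
        return True
    # Friday following a public holiday on Thursday.
    if day.weekday() == 4 and prev_day in public_holidays:
        return True
    # Monday preceding a public holiday on Tuesday.
    if day.weekday() == 0 and next_day in public_holidays:
        return True
    return False


def weekday_category(day, public_holidays, partial_holidays):
    """Map a date to one of the ten weekday categories."""
    if day in public_holidays:
        return PUBLIC
    if day in partial_holidays:
        return PARTIAL
    if is_bridge_day(day, public_holidays):
        return BRIDGE
    return day.weekday()  # Monday = 0 ... Sunday = 6


# Two dated examples from 2020: Ascension Day (a Thursday) and Reformation Day.
public = {date(2020, 5, 21)}
partial = {date(2020, 10, 31)}

print(weekday_category(date(2020, 5, 20), public, partial))   # ordinary Wednesday
print(weekday_category(date(2020, 5, 21), public, partial))   # public holiday
print(weekday_category(date(2020, 5, 22), public, partial))   # bridge Friday
print(weekday_category(date(2020, 10, 31), public, partial))  # partial holiday
```

The returned index is what an embedding layer for the weekday variable would take as input.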
References
- Saki, H.; Mohammadi, G. Estimation of health effect attributed to NO2 exposure by using of Air Q model in Ahwaz, 2009. Apadana J. Clin. Res. 2013, 2, 5–12.
- Dons, E.; Laeremans, M.; Anaya-Boig, E.; Avila-Palencia, I.; Brand, C.; de Nazelle, A.; Gaupp-Berghausen, M.; Götschi, T.; Nieuwenhuijsen, M.; Orjuela, J.P.; et al. Concern over health effects of air pollution is associated to NO2 in seven European cities. Air Qual. Atmos. Health 2018, 11, 591–599.
- Zhao, S.; Liu, S.; Sun, Y.; Liu, Y.; Beazley, R.; Hou, X. Assessing NO2-related health effects by non-linear and linear methods on a national level. Sci. Total Environ. 2020, 744, 140909.
- Hesterberg, T.W.; Bunn, W.B.; McClellan, R.O.; Hamade, A.K.; Long, C.M.; Valberg, P.A. Critical review of the human data on short-term nitrogen dioxide (NO2) exposures: Evidence for NO2 no-effect levels. Crit. Rev. Toxicol. 2009, 39, 743–781.
- Snowden, J.M.; Mortimer, K.M.; Kang Dufour, M.S.; Tager, I.B. Population intervention models to estimate ambient NO2 health effects in children with asthma. J. Expo. Sci. Environ. Epidemiol. 2015, 25, 567–573.
- Statista. Per capita nitrogen oxide (NOx) emissions in 2020, by select country. 2022. Available online: https://www.statista.com/statistics/478834/leading-countries-based-on-per-capita-nitrogen-oxide-emissions/ (accessed on 1 February 2023).
- Zhou, R.; Wang, S.; Shi, C.; Wang, W.; Zhao, H.; Liu, R.; Chen, L.; Zhou, B. Study on the Traffic Air Pollution inside and outside a Road Tunnel in Shanghai, China. PLoS ONE 2014, 9, e112195.
- Zhang, L.; Guan, Y.; Leaderer, B.; Holford, T. Estimating daily nitrogen dioxide level: Exploring traffic effects. Ann. Appl. Stat. 2013, 7, 1763–1777.
- European Environment Agency. Impact of Selected Policy Measures on Europe’s Air Quality. 2015. Available online: https://www.eea.europa.eu/data-and-maps/daviz/sector-share-of-nitrogen-oxides-emissions/ (accessed on 1 February 2023).
- Flämig, P.D.I.H. Luft- und Klimabelastung Durch Güterverkehr. 2021. Available online: https://www.forschungsinformationssystem.de/servlet/is/39787/ (accessed on 1 February 2023).
- Reddy, V.; Yedavalli, P.; Mohanty, S.; Nakhat, U. Deep air: Forecasting air pollution in Beijing, China. Environ. Sci. 2018, 1564.
- Tao, Q.; Liu, F.; Li, Y.; Sidorov, D. Air Pollution Forecasting Using a Deep Learning Model Based on 1D Convnets and Bidirectional GRU. IEEE Access 2019, 7, 76690–76698.
- Liang, Y.C.; Maimury, Y.; Chen, A.; Juarez, J. Machine Learning-Based Prediction of Air Quality. Appl. Sci. 2020, 10, 9151.
- Kleine Deters, J.; Zalakeviciute, R.; Gonzalez, M.; Rybarczyk, Y. Modeling PM 2.5 Urban Pollution Using Machine Learning and Selected Meteorological Parameters. J. Electr. Comput. Eng. 2017, 2017, 5106045.
- Behm, S.; Haupt, H.; Schmid, A. Spatial detrending revisited: Modelling local trend patterns in NO2 concentration in Belgium and Germany. Spat. Stat. 2018, 28, 331–351.
- Donnelly, A.; Naughton, O.; Broderick, B.; Misstear, B. Short-Term Forecasting of Nitrogen Dioxide (NO2) Levels Using a Hybrid Statistical and Air Mass History Modelling Approach. Environ. Model. Assess. 2017, 22, 231–241.
- Samal, K.K.R.; Babu, K.S.; Das, S.K.; Acharaya, A. Time series based air pollution forecasting using SARIMA and prophet model. In Proceedings of the 2019 International Conference on Information Technology and Computer Communications, Singapore, 16–18 August 2019; pp. 80–85.
- Qadeer, K.; Jeon, M. Prediction of PM10 Concentration in South Korea Using Gradient Tree Boosting Models. In Proceedings of the 3rd International Conference on Vision, Image and Signal Processing, Vancouver, BC, Canada, 26–28 August 2019; Association for Computing Machinery: New York, NY, USA, 2019.
- Qadeer, K.; Rehman, W.U.; Sheri, A.; Park, I.; Kim, H.; Jeon, M. A Long Short-Term Memory (LSTM) Network for Hourly Estimation of PM2.5 Concentration in Two Cities of South Korea. Appl. Sci. 2020, 10, 3984.
- Li, Z.; Yim, S.H.L.; Ho, K.F. High temporal resolution prediction of street-level PM2.5 and NOx concentrations using machine learning approach. J. Clean. Prod. 2020, 268, 121975.
- Iskandaryan, D.; Ramos, F.; Trilles, S. Bidirectional convolutional LSTM for the prediction of nitrogen dioxide in the city of Madrid. PLoS ONE 2022, 17, e0269295.
- Dairi, A.; Harrou, F.; Khadraoui, S.; Sun, Y. Integrated multiple directed attention-based deep learning for improved air pollution forecasting. IEEE Trans. Instrum. Meas. 2021, 70, 1–15.
- Al-Janabi, S.; Alkaim, A.; Al-Janabi, E.; Aljeboree, A.; Mustafa, M. Intelligent forecaster of concentrations (PM2.5, PM10, NO2, CO, O3, SO2) caused air pollution (IFCsAP). Neural Comput. Appl. 2021, 33, 14199–14229.
- Casquero-Vera, J.A.; Lyamani, H.; Titos, G.; Borrás, E.; Olmo, F.; Alados-Arboledas, L. Impact of primary NO2 emissions at different urban sites exceeding the European NO2 standard limit. Sci. Total Environ. 2019, 646, 1117–1125.
- Kurtenbach, R.; Kleffmann, J.; Niedojadlo, A.; Wiesen, P. Primary NO2 emissions and their impact on air quality in traffic environments in Germany. Environ. Sci. Eur. 2012, 24, 21.
- Kamińska, J.A. A random forest partition model for predicting NO2 concentrations from traffic flow and meteorological conditions. Sci. Total Environ. 2019, 651, 475–483.
- Jiménez-Hornero, F.; Jimenez-Hornero, J.; Gutiérrez de Ravé, E.; Pavón-Domínguez, P. Exploring the relationship between nitrogen dioxide and ground-level ozone by applying the joint multifractal analysis. Environ. Monit. Assess. 2009, 167, 675–684.
- McCulloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
- Hopfield, J. Neural Networks and Physical Systems with Emergent Collective Computational Abilities. Proc. Natl. Acad. Sci. USA 1982, 79, 2554–2558.
- Hinton, G.E.; Osindero, S.; Teh, Y.W. A Fast Learning Algorithm for Deep Belief Nets. Neural Comput. 2006, 18, 1527–1554.
- Bengio, Y.; Ducharme, R.; Vincent, P.; Jauvin, C. A Neural Probabilistic Language Model. J. Mach. Learn. Res. 2003, 3, 1137–1155.
- Cartuyvels, R.; Spinks, G.; Moens, M.F. Discrete and continuous representations and processing in deep learning: Looking forward. AI Open 2021, 2, 143–159.
- Wagner, A.; Ramentol, E.; Schirra, F.; Michaeli, H. Short- and long-term forecasting of electricity prices using embedding of calendar information in neural networks. J. Commod. Mark. 2022, 28, 100246.
- Liang, S.; Huang, C.; Khalafbeigi, T. OGC SensorThings API Part 1: Sensing, Version 1.0; Open Geospatial Consortium: Wayland, MA, USA, 2016.
- The Sustainable Development Goals. Available online: https://www.undp.org/sustainable-development-goals (accessed on 19 January 2023).
Paper | Algorithm | Horizon | Compared | Data |
---|---|---|---|---|
[16] | Hybrid statistical | 24 h | - | Irish EPA |
[11] | LSTM encoder-decoder | 5 h, 10 h, 120 h | LSTM, sequence-to-scalar | Beijing |
[12] | 1D CNN-GRU | 1 h | SVR, DTR, LSTM, BGRU | UCI-repo |
[17] | Prophet | 1 h | Box-Jenkins | Bhubaneshwar |
[14] | ANN | - | BT, LSVM | Belisario, Cotocollao |
[13] | Adaboost | 1 h, 8 h, 24 h | SVM, ANN, Random Forest | Taiwan EPA |
[18] | LGBM | - | XGB, LGBM | South Korea |
[19] | LSTM | 1 h | XGB, LGBM, LSTM | South Korea |
[20] | Random Forest | - | BRT, SVM, XGB, GAM, Cubist | Hong Kong |
[21] | BC-LSTM | 6 h | C-LSTM, LSTM | Madrid |
[22] | IMDA-VAE | - | GRU, BGRU, LSTM, VAE, C-LSTM, B-LSTM | Arizona, California, Pennsylvania, Texas |
[23] | DLSTM | 48 h | LSTM | China |
Sensor | Data Points | Training | Testing |
---|---|---|---|
DETH043 | 16959 | 2018, 2019, 2020 until August | September, October and November 2020 |
DETH020 | 25127 | 2018, 2019, 2020 until August | September, October and November 2020 |
DETH117 | 15967 | 2019, 2020 until August | September, October and November 2020 |
DETH081 | 15010 | 2019, 2020 until August | September, October and November 2020 |
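The split in the table above is chronological: everything recorded before September 2020 is used for training and the following three months for testing. A minimal sketch of such a split, assuming hourly records stored as `(timestamp, value)` pairs; the function name and the synthetic gap-free series with placeholder values are hypothetical and only serve to make the example runnable.

```python
from datetime import datetime, timedelta

# Boundaries taken from the table above: test on September-November 2020,
# train on everything recorded before that.
TEST_START = datetime(2020, 9, 1)
TEST_END = datetime(2020, 12, 1)  # exclusive upper bound


def chronological_split(records):
    """Split (timestamp, value) pairs into training and test sets by time."""
    train = [r for r in records if r[0] < TEST_START]
    test = [r for r in records if TEST_START <= r[0] < TEST_END]
    return train, test


# Hypothetical gap-free hourly series for 2020 with placeholder values.
start = datetime(2020, 1, 1)
records = [(start + timedelta(hours=h), 0.0) for h in range(366 * 24)]

train, test = chronological_split(records)
print(len(train), len(test))  # 5856 2184
```

Splitting by time rather than at random keeps future observations out of the training set, which matters for a forecasting task.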
Model | Parameters | Input Data |
---|---|---|
DNN | Table 4 | model 1: calendar; model 2: cal+met |
LSTM | hidden layer = 2; neurons/layer = 64; epochs = 30; dropout = 0.4; optimizer = Adam; loss = MSE | cal+met |
LGBM_Bosch parameter | max_depth: -1; learning_rate: 0.005; num_iterations: 4837; feature_fraction: 0.6; bagging_fraction: 0.9; bagging_freq: 5 | cal+met |
LGBM_bayesian_opt | max_depth: 15; min_split_gain: 0.1; num_iterations: 100; feature_fraction: 1.0; bagging_fraction: 1.0; num_leaves: 5 | cal+met |
LGBM_Qadeer | max_depth: -1; learning_rate: 0.005; num_iterations: 4837; feature_fraction: 0.6; bagging_fraction: 0.9; bagging_freq: 5 | cal+met |
Adaboost | base_estimator = Decision_tree; n_estimator = 50; learning_rate = 1.0 | cal+met |
LSTM-encoder-decoder | dropout = 0.4; loss = MSE; epochs = 30; optimizer = Adam; encoder_LSTM: embedding layer = 1, embedding layer neurons = 16, encoder_rnn_hidden = 224; decoder_LSTM: embedding layer = 1, embedding layer neurons = 16, decoder_rnn_hidden = 224 | cal+met |
Parameters | DNN |
---|---|
Model | Sequential |
Hidden layers | 2 |
Neurons per layer | 60/60/1 |
Loss function | mse |
Type of layer | dense |
Activation output | linear |
Activation hidden layers | relu/relu |
Epoch | 100 |
Optimizer | RMSprop(0.001) |
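As a rough check on the size of this network, the sketch below counts trainable parameters for the layer widths in the table above combined with the embedding sizes from Section 4.1. It assumes that the three embedding outputs are concatenated and fed directly into the first dense layer, and that model 2 adds the seven meteorological variables of Section 5.1 as plain numeric inputs; the source does not spell out the exact input wiring, so the totals are illustrative only.

```python
# (number of categories, embedding dimension) from Section 4.1.
EMBEDDINGS = {"hour": (24, 6), "weekday": (10, 2), "month": (12, 3)}
LAYER_WIDTHS = [60, 60, 1]  # "Neurons per layer" in the table above
N_MET_FEATURES = 7          # variables listed in Section 5.1


def count_parameters(n_numeric_inputs):
    """Count trainable weights for the embeddings plus the dense stack."""
    embedding_params = sum(n * d for n, d in EMBEDDINGS.values())
    # Assumed input width: concatenated embeddings plus raw numeric inputs.
    width = sum(d for _, d in EMBEDDINGS.values()) + n_numeric_inputs
    dense_params = 0
    for units in LAYER_WIDTHS:
        dense_params += width * units + units  # weights plus biases
        width = units
    return embedding_params + dense_params


print(count_parameters(0))               # model 1, calendar only: 4641
print(count_parameters(N_MET_FEATURES))  # model 2, calendar + meteorology: 5061
```

Under these assumptions the embeddings account for only 200 of the parameters; most of the capacity sits in the 60-by-60 hidden layer.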
Sensor | Method | Data/Encode | 24 h | 72 h | 120 h |
---|---|---|---|---|---|
DE_DETH043 | DNN+embedding | model 1 | 9.1929 | 9.1248 | 9.3957 |
DE_DETH043 | DNN+embedding | model 2 | 7.1700 | 7.3858 | 7.4489 |
DE_DETH043 | LGBM | ordinal | 7.2325 | 7.3216 | 7.4517 |
DE_DETH043 | LGBM | one_hot | 7.2789 | 7.4297 | 7.4600 |
DE_DETH043 | LGBM-BOSCH | ordinal | 6.9420 | 7.0837 | 7.1495 |
DE_DETH043 | LSTM | input = 10 | 8.7743 | 8.4007 | 8.1686 |
DE_DETH043 | LSTM | input = 24 | 10.7109 | 11.2327 | 10.8840 |
DE_DETH043 | Adaboost | one_hot | 13.2796 | 13.3503 | 13.1704 |
DE_DETH043 | Adaboost | ordinal | 12.5644 | 12.3406 | 12.5564 |
DE_DETH043 | LSTM-AE | seq len 5 | 11.0249 | 14.0768 | 14.3268 |
DE_DETH020 | DNN+embedding | model 1 | 6.5689 | 6.7591 | 6.8499 |
DE_DETH020 | DNN+embedding | model 2 | 4.9559 | 5.1095 | 4.9550 |
DE_DETH020 | LGBM | ordinal | 5.2947 | 5.2884 | 5.3362 |
DE_DETH020 | LGBM | one_hot | 5.3856 | 5.4101 | 5.3858 |
DE_DETH020 | LGBM-BOSCH | ordinal | 5.1039 | 5.1641 | 5.2147 |
DE_DETH020 | LSTM | input = 10 | 5.3348 | 5.3718 | 5.4127 |
DE_DETH020 | LSTM | input = 24 | 6.4097 | 6.5072 | 6.2651 |
DE_DETH020 | Adaboost | one_hot | 8.2344 | 8.0955 | 7.9516 |
DE_DETH020 | Adaboost | ordinal | 9.3091 | 9.1417 | 9.3261 |
DE_DETH020 | LSTM-AE | seq len 5 | 7.9955 | 9.3003 | 9.4265 |
DE_DETH117 | DNN+embedding | model 1 | 6.6708 | 6.6076 | 6.7113 |
DE_DETH117 | DNN+embedding | model 2 | 4.3912 | 4.5071 | 4.4123 |
DE_DETH117 | LGBM | ordinal | 4.3028 | 4.3722 | 4.4029 |
DE_DETH117 | LGBM | one_hot | 4.3391 | 4.4280 | 4.4166 |
DE_DETH117 | LGBM-BOSCH | ordinal | 4.1809 | 4.2906 | 4.2971 |
DE_DETH117 | LSTM | input = 10 | 4.6209 | 4.7725 | 4.8600 |
DE_DETH117 | LSTM | input = 24 | 5.9599 | 5.8841 | 6.0775 |
DE_DETH117 | Adaboost | one_hot | 7.4650 | 7.4633 | 7.5334 |
DE_DETH117 | Adaboost | ordinal | 7.4651 | 7.3091 | 7.2455 |
DE_DETH117 | LSTM-AE | seq len 5 | 7.4113 | 7.7973 | 7.7239 |
DE_DETH081 | DNN+embedding | model 1 | 7.5889 | 8.0022 | 8.1991 |
DE_DETH081 | DNN+embedding | model 2 | 6.4381 | 6.3969 | 6.8792 |
DE_DETH081 | LGBM | ordinal | 7.4838 | 7.7298 | 7.9176 |
DE_DETH081 | LGBM | one_hot | 7.3926 | 7.5320 | 7.7081 |
DE_DETH081 | LGBM-BOSCH | ordinal | 7.2872 | 7.5083 | 7.5210 |
DE_DETH081 | LSTM | input = 10 | 8.2826 | 7.9086 | 7.7629 |
DE_DETH081 | LSTM | input = 24 | 11.3915 | 11.5506 | 11.3960 |
DE_DETH081 | Adaboost | one_hot | 13.3968 | 13.5412 | 13.4224 |
DE_DETH081 | Adaboost | ordinal | 13.5186 | 13.6836 | 13.1394 |
DE_DETH081 | LSTM-AE | seq len 5 | 10.8096 | 15.0223 | 14.6538 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ramentol, E.; Grimm, S.; Stinzendörfer, M.; Wagner, A. Short-Term Air Pollution Forecasting Using Embeddings in Neural Networks. Atmosphere 2023, 14, 298. https://doi.org/10.3390/atmos14020298