Prediction of Ambient PM2.5 Concentrations Using a Correlation Filtered Spatial-Temporal Long Short-Term Memory Model
Abstract
1. Introduction
2. Research Framework
3. Data and Methods
3.1. Data Collection
3.2. Data Preprocessing
3.3. Methods
3.3.1. Long Short-Term Memory (LSTM)
3.3.2. CFST-LSTM
4. Results and Discussions
4.1. LSTM Structure Optimization
4.2. Comparison of Different Correlation Thresholds
4.3. Model Comparison
4.4. Comparison of Ordinary LSTM and CFST-LSTM
4.5. Improvements Interpolation
5. Conclusions
- The proposed CFST-LSTM model outperforms the other common machine learning and deep learning models tested, with a better fit and higher prediction accuracy. Its R² reaches 0.9583.
- Compared with an ordinary LSTM, our method not only considers the influence of nearby stations but also filters out weakly related time-series inputs, which improved performance by 2.88% in our tests on 30 sites. By contrast, simply adding the time-series inputs from other stations without filtering lowered performance by 3.32% because of the higher noise level.
- In the experiment on improvements across the California sites, the proposed method showed a larger improvement over the ordinary LSTM in areas with denser sites and a smaller improvement in sparser districts. This indicates that our method performs better where spatial inputs are denser.
- Parameter optimization of the newly designed STF layer is important to the proposed method. The experiments showed that the difference between proper and improper parameters can reach around 5.39% of the overall performance.
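The filtering idea summarized in the conclusions above can be illustrated with a minimal sketch. It assumes the filter is a Pearson correlation threshold applied between the target station's series and each neighbouring station's series; the function names, the threshold value of 0.8, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def select_correlated_stations(target, neighbours, threshold=0.8):
    """Return names of neighbour stations whose series has a Pearson
    correlation with the target series at or above `threshold`.

    target     : 1-D array of hourly readings at the target station
    neighbours : dict mapping station name -> 1-D array of equal length
    threshold  : hypothetical cut-off (an assumption, not the paper's value)
    """
    target = np.asarray(target, dtype=float)
    kept = []
    for name, series in neighbours.items():
        series = np.asarray(series, dtype=float)
        # Pearson correlation coefficient between the two series.
        r = np.corrcoef(target, series)[0, 1]
        if r >= threshold:
            kept.append(name)
    return kept


def build_input_matrix(target, neighbours, kept):
    """Stack the target series with the retained neighbour series into
    an array of shape (time_steps, 1 + len(kept))."""
    columns = [np.asarray(target, dtype=float)]
    columns += [np.asarray(neighbours[name], dtype=float) for name in kept]
    return np.column_stack(columns)


# Synthetic demonstration data (not from the paper).
rng = np.random.default_rng(0)
base = rng.normal(size=500)
target = base + 0.1 * rng.normal(size=500)
neighbours = {
    "near": base + 0.2 * rng.normal(size=500),  # shares the target's signal
    "far": rng.normal(size=500),                # independent noise
}
kept = select_correlated_stations(target, neighbours, threshold=0.8)
features = build_input_matrix(target, neighbours, kept)
```

Here only the station that shares the target's signal passes the filter, so the model input has two columns instead of three. Passing every neighbour without the threshold corresponds to the unfiltered setting that the conclusions report as noisier.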
Author Contributions
Funding
Conflicts of Interest
Attribute | Value |
---|---|
Data content | PM2.5 concentrations |
Temporal resolution | 1 h |
Location | California, U.S. |
Duration | 2016/01-2017/07 |
Number of stations | 30 |
Average 1st quartile | 5.3933 |
Average mean | 11.3668 |
Average median | 9.3133 |
Average 3rd quartile | 14.7833 |
Average standard deviation | 10.0275 |
Unit | µg/m³ |
Layers | Nodes | RMSE | MAE | R² |
---|---|---|---|---|
2 | 32 | 2.0122 | 1.5385 | 0.9534 |
2 | 64 | 1.9926 | 1.5161 | 0.9583 |
2 | 128 | 2.1270 | 1.5687 | 0.9477 |
3 | 32 | 2.1156 | 1.6306 | 0.9483 |
3 | 64 | 2.0934 | 1.5553 | 0.9494 |
3 | 128 | 2.1378 | 1.6026 | 0.9471 |
4 | 32 | 2.1632 | 1.6269 | 0.9458 |
4 | 64 | 2.1998 | 1.6904 | 0.9439 |
4 | 128 | 2.1744 | 1.7027 | 0.9453 |
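The error measures reported in the table above can be computed as in the following sketch. It assumes the standard definitions of RMSE, MAE, and the coefficient of determination; the exact formulas used by the authors are not given in this extract, and the function names and sample values are illustrative only.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Toy values for illustration (not from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
scores = (rmse(observed, predicted),
          mae(observed, predicted),
          r_squared(observed, predicted))
```

Lower RMSE and MAE and a higher coefficient of determination indicate a better fit, which is how the rows of the table are compared.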
Model | RMSE | MAE | R² |
---|---|---|---|
LASSO | 2.93213 | 1.951243 | 0.915126 |
Ridge | 2.89453 | 1.932423 | 0.918705 |
SVR | 2.79312 | 1.892831 | 0.925028 |
ANN | 2.78484 | 1.882643 | 0.925734 |
RNN | 2.70294 | 1.853425 | 0.929810 |
CFST-LSTM | 1.99257 | 1.516072 | 0.958348 |
CFST-ANN | 2.62591 | 1.79819 | 0.930665 |
CFST-RNN | 2.53594 | 1.70958 | 0.939712 |
Models | CFST-LSTM | O-LSTM | F-LSTM |
---|---|---|---|
Average R Square | 0.9155 | 0.8899 | 0.8613 |
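The percentage changes quoted in the conclusions appear to follow from the average scores in the table above. The sketch below is an observation on the arithmetic rather than a statement from the authors: it assumes O-LSTM is the ordinary LSTM, that F-LSTM is the variant fed with all neighbouring stations unfiltered, and that the 3.32% drop is expressed relative to the F-LSTM score.

```python
def relative_change_percent(value, reference):
    """Change of `value` with respect to `reference`, in percent."""
    return (value - reference) / reference * 100.0


# Average R Square values copied from the table above.
cfst_lstm = 0.9155
o_lstm = 0.8899
f_lstm = 0.8613

# Gain of CFST-LSTM over the ordinary LSTM, relative to the ordinary LSTM.
gain = relative_change_percent(cfst_lstm, o_lstm)

# Gap between the ordinary LSTM and F-LSTM, relative to F-LSTM
# (assumed baseline; the extract does not state which one was used).
drop = relative_change_percent(o_lstm, f_lstm)

print(f"gain: {gain:.2f}%  drop: {drop:.2f}%")
```

Under these assumptions the two values round to 2.88 and 3.32, matching the figures in the conclusions.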
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Ding, Y.; Li, Z.; Zhang, C.; Ma, J. Prediction of Ambient PM2.5 Concentrations Using a Correlation Filtered Spatial-Temporal Long Short-Term Memory Model. Appl. Sci. 2020, 10, 14. https://doi.org/10.3390/app10010014