# Forecasting Fine Particulate Matter Concentrations by In-Depth Learning Model According to Random Forest and Bilateral Long- and Short-Term Memory Neural Networks

## Abstract

… PM_{2.5}. However, the source information is limited, and the dynamic process is uncertain. A method for predicting both short-term (3 h) and long-term trends has not yet been achieved. To address this issue, this research employed a novel mixed forecasting model that couples random forest (RF) variable selection with a bidirectional long short-term memory (BiLSTM) neural network to forecast PM_{2.5} concentrations 0–12 h ahead. The mean absolute errors (MAE) of the PM_{2.5} concentration predictions at 1, 6, and 12 h are 3.73, 9.33, and 12.68 μg/m^{3} for Beijing, 1.33, 3.38, and 4.60 μg/m^{3} for Guangzhou, 1.37, 4.19, and 6.35 μg/m^{3} for Xi’an, and 2.20, 7.75, and 10.07 μg/m^{3} for Shenyang, respectively. Moreover, the results show that the suggested mixed model is an advanced method that can offer high accuracy for PM_{2.5} concentrations from 1 to 12 h ahead.

## 1. Introduction

… PM_{2.5} (particles with a diameter less than or equal to 2.5 μm) concentration conditions, which would pose serious threats to public health and respiratory filtration systems. PM_{2.5} is known as “pulmonary particulate matter” and is a key index for assessing health harm [7,8]. Therefore, an accurate understanding of PM_{2.5} concentration is of great significance for early warning of air quality, which helps to reduce health damage and economic loss.

… PM_{2.5} concentration prediction process [3,9,10]. For example, Lu et al. [11] showed that a coupled model of a back-propagation artificial neural network (BPANN) and support vector regression (SVR) has a significant advantage over partial least squares regression (PLSR) in capturing the nonlinear relationship between the input parameters and the dependent variable under the same input parameters. Thus, more and more research has employed machine learning methods to handle nonlinear problems. By using the artificial neural network (ANN), support vector machine (SVM), and other machine learning algorithms, Zhu and Lu [12] obtained a higher correlation, with an R^{2} value of 0.8, than linear methods in PM_{2.5} and PM_{10} (diameter less than or equal to 10 μm) concentration forecasts. Moreover, in order to capture the hourly variation of the PM_{2.5} concentration, Shang et al. [13] employed a mixed model of the extreme learning machine (ELM) and the classification and regression tree (CART).

… PM_{2.5} concentration. Without considering the input parameters of forward and backward information, the normal time series features are extracted by the forward LSTM layer, and the future change information is obtained from the reverse LSTM layer, which further improves the prediction results.
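The forward and reverse layers described above are commonly written as follows. This is the standard bidirectional LSTM formulation rather than notation taken from this paper; the symbols $x_{t}$ (input at time $t$), $\overrightarrow{h}_{t}$ and $\overleftarrow{h}_{t}$ (forward and reverse hidden states), and $W$, $b$ (output weights and bias) are assumptions introduced here for illustration.

$$
\begin{aligned}
\overrightarrow{h}_{t} &= \mathrm{LSTM}_{f}\left(x_{t},\ \overrightarrow{h}_{t-1}\right),\\
\overleftarrow{h}_{t} &= \mathrm{LSTM}_{b}\left(x_{t},\ \overleftarrow{h}_{t+1}\right),\\
\widehat{y}_{t} &= W\left[\overrightarrow{h}_{t};\ \overleftarrow{h}_{t}\right] + b.
\end{aligned}
$$

The forward state summarizes the input window up to time $t$, the reverse state summarizes it from the end of the window back to $t$, and the two are concatenated before the output layer.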

… PM_{2.5} concentration. For example, Dai et al. [18] showed that the RNN (recurrent neural network) model could produce obvious deviations owing to gradient explosion and gradient vanishing, a problem shared with the LSTM model [19], and could hardly reflect spatial information. Therefore, more and more studies have introduced mixed models for prediction, which outperform a single model [9,20,21]. With this approach, each step exploits the strengths of its component, such as maximizing input parameter information [22], spatiotemporal data [23], and deviation correction data [24], to produce more precise estimates. As Zhang et al. [25] demonstrated, PM_{2.5} concentration prediction can be treated as a statistical problem of capturing the historical trend and assessing future periods. Both univariate and multivariate parameters can serve as forecasting inputs. Taking the autoregressive integrated moving average (ARIMA) model as an example [26], accurate results can be obtained in the short term using only the PM_{2.5} series data. A much better prediction can be obtained if studies apply many variables as input parameters to capture the variation of influencing factors and forecast targets [27,28].

… PM_{10}, SO_{2}, VOCs, and NOx, etc.) for changes in PM_{2.5} concentrations. An adaptive decomposition method based on the RF (random forest) algorithm has been widely used and has advantages in handling complicated nonlinear relations between variables. Bai et al. [4] used an RF model that incorporated different spatial–temporal variable sources for PM_{2.5} predictions in New York state and achieved good results. Owing to this capability, the RF model has the advantage of using time series data and reflecting changing features, which the Fourier transform, wavelet decomposition, and other methods do not achieve as well.

… PM_{2.5} concentration prediction methods have been studied in depth but still present challenges. Most existing time series prediction focuses on improving the forecast performance of the original sequence without making full use of the effective information implicit in the prediction error sequence. For instance, both the precision of peak forecasts [31] and the reduction of long-term forecast error [29] need improvement. Considering these issues, a novel mixed model (RF-BiLSTM), combining the RF approach with the BiLSTM model, is proposed to forecast the concentration of PM_{2.5} in the short term (T + 1 to T + 3 moments) as well as the long term (T + 12 moment).

… PM_{2.5} concentration forecast mixed model was proposed to significantly improve forecast precision for both the short term and the long term. (2) The RF model is introduced to decompose the test set independently, and the BiLSTM model is then coupled with it. (3) The model was compared with other algorithms, such as LSTM, SVM, RF, and Tree. The parameters of the different models were adjusted according to model performance, and data with different lead times were selected to observe the experimental results. (4) The new mixed model performs well in spatiotemporal generalization and in reflecting the contextual relationship between time points.

## 2. Methods and Materials

#### 2.1. Study Area and Materials

… PM_{2.5} concentration is needed to reduce risk exposure.

… PM_{2.5}, PM_{10} (particulate matter 10), SO_{2} (sulfur dioxide), NO_{2} (nitrogen dioxide), O_{3} (ozone), and CO_{2} (carbon dioxide)) were acquired from the Chinese ministry of environment (http://www.cnemc.cn/, accessed on 6 June 2022). The position of the research area is illustrated in Figure 1. Since transmission errors and sensor failures occur at the observation points [32], abnormal values and irregular gaps in the monitoring data need to be checked and tested. Meanwhile, an unmodified, objective, and relatively complete basic time series data set is the cornerstone of prediction. Following previous studies, mean imputation was used to fill in pollutant concentration data when the missing rate was between 0% and 3%, and linear interpolation was applied when the missing rate was between 3.01% and 10%.
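The missing-value rule above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' code: the function names (`missing_rate`, `fill_mean`, `fill_linear`, `impute`), the use of `None` to mark gaps, and the decision to raise an error above 10% are all assumptions made here.

```python
from statistics import mean


def missing_rate(series):
    """Fraction of entries that are missing (marked as None)."""
    return sum(v is None for v in series) / len(series)


def fill_mean(series):
    """Replace every gap with the mean of the observed values."""
    observed_mean = mean(v for v in series if v is not None)
    return [observed_mean if v is None else v for v in series]


def fill_linear(series):
    """Linearly interpolate interior gaps; edge gaps copy the nearest observation."""
    known = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def impute(series):
    """Apply the rule from the text: mean fill up to 3%, linear interpolation up to 10%."""
    rate = missing_rate(series)
    if rate == 0:
        return list(series)
    if rate <= 0.03:
        return fill_mean(series)
    if rate <= 0.10:
        return fill_linear(series)
    # Assumption: the text gives no rule above 10%, so this sketch refuses to guess.
    raise ValueError(f"missing rate {rate:.1%} exceeds 10%")
```

For example, a 20-point series with a single gap (5% missing) is filled by interpolation, while a 50-point series with a single gap (2% missing) is filled with the mean of the observed values.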

#### 2.2. Random Forest

… PM_{2.5} concentration, sorted these features, and screened out the most important feature.
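A minimal sketch of ranking and screening features with a random forest is shown below, using scikit-learn's `RandomForestRegressor`. Several things here are assumptions rather than details from the paper: the data are synthetic, the feature names `DEWP`, `TEMP`, and `Iws` are used only as labels, the 0.05 cut-off is hypothetical, and scikit-learn's `feature_importances_` is impurity-based, which may differ from the OOB-based score the authors report. The `oob_score_` attribute is included only to show the out-of-bag fit of the forest.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption): three meteorological-style features,
# with the target driven mostly by the first one.
rng = np.random.default_rng(0)
feature_names = ["DEWP", "TEMP", "Iws"]
X = rng.normal(size=(400, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)

# oob_score=True makes the forest report R^2 on the out-of-bag samples.
forest = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
forest.fit(X, y)

# Sort the features by importance, highest first.
ranking = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

# Keep only features above a hypothetical importance threshold.
IMPORTANCE_THRESHOLD = 0.05
selected = [name for name, score in ranking if score >= IMPORTANCE_THRESHOLD]

for name, score in ranking:
    print(f"{name}: {score:.3f}")
print("OOB R^2:", round(forest.oob_score_, 3))
print("Selected:", selected)
```

The selected columns would then be the only inputs passed on to the downstream forecasting model.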

#### 2.3. Bi-Directional Long Short-Term Memory

#### 2.4. Modeling Process

… PM_{2.5} sample in the time series, linear interpolation was employed to fill the missing value.

… PM_{2.5} concentration.
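Forecasting at a lead time such as T + 1 or T + 12 requires turning the hourly series into input windows paired with a future target. The sketch below shows one common way to do this; the function name `make_samples` and the parameters `lookback` and `lead` are hypothetical, and the window length used by the authors is not stated in this text.

```python
def make_samples(series, lookback, lead):
    """Pair each window of `lookback` past values with the value `lead` steps ahead.

    The window covers times T - lookback + 1 .. T, and the target is the value at T + lead.
    """
    inputs, targets = [], []
    last_start = len(series) - lookback - lead
    for start in range(last_start + 1):
        window_end = start + lookback  # index just past the window; T is window_end - 1
        inputs.append(series[start:window_end])
        targets.append(series[window_end - 1 + lead])
    return inputs, targets


# Usage: hourly values 0..9, four past hours as input, forecasting T + 3.
hourly = list(range(10))
X_windows, y_targets = make_samples(hourly, lookback=4, lead=3)
print(X_windows[0], "->", y_targets[0])
```

One set of samples would be built per lead time, so that a separate target column exists for each of the T + 1 to T + 12 moments.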

#### 2.5. Evaluation Indicators

… coefficient of determination (R^{2}), MAE, and RMSE, which have been widely used as accuracy indicators in previous studies [32,37]. The definitions of those indicators are as follows:
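For reference, the standard definitions of these three indicators are given below. They are a reconstruction under the usual conventions rather than a copy of the authors' equations; $n$ (the number of samples) is an assumed symbol, and $\overline{y}$ denotes the mean of the observed concentrations.

$$
R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_{i} - {\widehat{y}}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(y_{i} - \overline{y}\right)^{2}},
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_{i} - {\widehat{y}}_{i}\right|,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - {\widehat{y}}_{i}\right)^{2}}
$$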

Here, $y_{i}$ represents the observed PM_{2.5} concentration at time i, ${\widehat{y}}_{i}$ represents the forecast PM_{2.5} concentration at time i, and $\overline{y}$ represents the mean of the observed concentrations. R^{2} denotes the degree of fit between the predicted and actual concentrations at the corresponding times; the closer its value is to 1, the better the prediction. The remaining indicators, RMSE and MAE, are error assessment indicators that measure the deviation between the predicted and actual values.
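The three indicators can be computed directly from paired observed and predicted values. The sketch below uses the standard definitions in plain Python; the function names and the toy numbers are illustrative assumptions, not data from the study.

```python
from math import sqrt


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy example (made-up values): every prediction is off by exactly 1.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 5.0, 5.0, 9.0]
print(mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

Because RMSE squares the errors before averaging, it penalizes large misses more heavily than MAE, which is why the two can rank models differently around pollution peaks.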

## 3. Results and Discussion

#### 3.1. Prediction Result of RF-BiLSTM

#### 3.1.1. Prediction Results of RF Model

… PM_{2.5} are usually non-stationary. This is mainly caused by the influence of meteorology and pollution emissions on atmospheric PM_{2.5}. RF can decompose non-stationary PM_{2.5} time sequences into major and non-major factors. To avoid data leakage, the meteorological parameters were decomposed using the RF method, and the contribution of each meteorological parameter to the forecast results was evaluated by its out-of-bag (OOB) value. The model was trained on randomly selected data; the meteorological data were then classified, and the classified data were learned, thereby increasing the precision of the model.

… PM_{2.5} concentration, followed by DEWP, while the OOB value of Iws is the lowest at T-4 and has little influence on PM_{2.5} concentration.

… PM_{2.5} concentration (such as BJ, SY, and XA) in winter is relatively small. This readily causes fine particles to adhere and condense into nuclei, while the increase in temperature and sunlight in summer promotes atmospheric oxidation and the conversion of secondary pollutants.

… PM_{2.5} concentration. However, Guangzhou is close to the ocean, and its humidity is relatively high, which is strongly related to PM_{2.5} concentration. In addition, whether for dew point, temperature, or other factors, the RF model performs a preliminary screening that removes invalid information before the inputs are passed to the BiLSTM model, which can significantly improve the accuracy of prediction [1,26].

#### 3.1.2. The Comparison of Prediction Results between LSTM and RF-BiLSTM for Short-Term

… R^{2} increased from 0.99 (LSTM model) to 0.995 (RF-BiLSTM) at the T + 1 moment. Without RF model classification, the forecast at the T + 1 moment overestimates the PM_{2.5} concentration. After RF classification filters out useless information from the previous data and passes the remaining parameters to the BiLSTM model, this overestimation is noticeably corrected. RMSE decreased by ~26.4% (from 9.87 μg/m^{3} to 7.26 μg/m^{3}), and MAE decreased by 27.6% (from 5.15 μg/m^{3} to 3.73 μg/m^{3}). These results illustrate that RF classification is a necessary step that significantly improves the forecast precision of the mixed model.
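The percentage decreases quoted above follow from the usual relative-change formula, (before − after) / before. The short check below reproduces them from the four error values stated in the text; the helper name `percent_decrease` is an assumption introduced for illustration.

```python
def percent_decrease(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Error values at T + 1 as quoted in the text (LSTM -> RF-BiLSTM), in μg/m^3.
rmse_drop = percent_decrease(9.87, 7.26)
mae_drop = percent_decrease(5.15, 3.73)

print(f"RMSE decrease: {rmse_drop:.1f}%")  # 26.4%
print(f"MAE decrease: {mae_drop:.1f}%")    # 27.6%
```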

… R^{2} value ranged from 0.989 (T + 3) to 0.995 (T + 1). The reason may be that the prediction error of the components increases rapidly as the prediction horizon grows. The same pattern for SY, XA, and GZ is shown in Supplementary Materials Table S1 and Figures S1–S3.

#### 3.2. Contrast of Forecast Consequence of Different Models

#### 3.2.1. Contrast Short-Term Forecast Consequences with SVM, TREE, and RF Models

… PM_{2.5} prediction results and the actual PM_{2.5} concentration at the different lead times. In general, RF-BiLSTM has the best performance, with small MAE and RMSE. The main reason is that, owing to the feature selection of the RF model, the RF-BiLSTM coupled model has fewer interfering factors and more strongly related influence factors than the traditional models.

… μg/m^{3} higher than other models, and the accuracy in terms of RMSE is 0.98–5.49 μg/m^{3} higher than other models. The accuracy in terms of MAE and RMSE was 1.52–3.71 μg/m^{3} and 3.19–6.95 μg/m^{3} higher than that of other models. The RF-BiLSTM model was the best choice for BJ to forecast the concentration of PM_{2.5} 3 hours ahead. Its accuracy in terms of MAE is 0.28–2.73 μg/m^{3} higher than other models, and in terms of RMSE it is 0.31–7.53 μg/m^{3} higher than other models.

… PM_{2.5} concentration.

#### 3.2.2. Contrasting the Long-Term Forecast Consequences of Models Applied in Some Existing Researches

… PM_{2.5} concentration; however, none of them make long-term predictions, such as the GRU, SVM, and LSTM models suggested by [4,9,38], as well as the RF and Tree models referred to in the previous section. The long-term prediction results of the RF-BiLSTM method are shown in Figure 7 and Table 2.

… R^{2} remained above 0.9, the MAE and RMSE showed much larger errors than at the T + 1 moment. Evidently, long-term PM_{2.5} concentration forecasting remains challenging and requires further exploration. Meanwhile, the mixed model suggested in this research maintains its best results from the T + 1 to the T + 6 moment (R^{2} 0.97–1.00, RMSE 7.26–16.93 μg/m^{3}, MAE 3.73–9.33 μg/m^{3}). In contrast, in traditional machine learning models, the R^{2} value is usually lower than 0.8 at T + 6. All these results indicate that the RF-BiLSTM model is more suited to the integration of RF than other deep learning models, and that mixed models could offer an efficient reference for policy execution.

#### 3.3. The Evaluation of the Model Robustness and Spatiotemporal Generalization

#### 3.3.1. Validation of Spatial Generalization of Mixed Model

… PM_{2.5} concentration. The prediction results of the RF-BiLSTM mixed model for the next six moments and up to twelve moments are shown in Table 3, Table 4 and Table 5 and Supplementary Materials Figures S7–S9, and the input parameters were all classified using the RF model.

… R^{2} value at the T + 1 moment was 0.99, 1.00, and 1.00, respectively), proving that the mixed model has great spatial generalization. GZ and XA have the best results among the three cities. On the other hand, the suspension of training group, continuous training group, and experimental group have significant influence on the prediction results. Compared with the LSTM, SVM, RF, and Tree models, the same patterns can be found in these three cities. Although R^{2} can sometimes keep the same value, the MAE and RMSE of those models have bigger errors than RF-BiLSTM. Taking XA for example, the MAE and RMSE of the Tree model at T + 3 are about 5 and 3 times those of RF-BiLSTM (8.83 vs. 1.78 μg/m^{3} and 13.2 vs. 4.16 μg/m^{3}). The results demonstrate that the mixed model has great generalization and robustness in the short term.
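The size of the gap between the Tree model and RF-BiLSTM in XA can be checked directly from the XA rows of the short-term comparison table. The sketch below simply divides the tabulated errors; the dictionary layout and variable names are assumptions made for illustration.

```python
# XA errors at T + 1, T + 2, T + 3 (μg/m^3), copied from the XA comparison table.
xa_errors = {
    "Tree": {"MAE": [3.58, 6.36, 8.83], "RMSE": [5.44, 9.61, 13.2]},
    "RF-BiLSTM": {"MAE": [1.37, 1.65, 1.78], "RMSE": [2.75, 3.52, 4.16]},
}

# Ratio of Tree error to RF-BiLSTM error at each lead time.
ratios = {
    metric: [
        tree / hybrid
        for tree, hybrid in zip(xa_errors["Tree"][metric], xa_errors["RF-BiLSTM"][metric])
    ]
    for metric in ("MAE", "RMSE")
}

for metric, values in ratios.items():
    print(metric, [round(v, 2) for v in values])
```

The ratio grows with lead time for both metrics, with the largest gap in MAE at T + 3.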

#### 3.3.2. Validation of Temporal Generalization of Mixed Model

… R^{2} performs well, with values around 0.9, which demonstrates that the mixed model can reproduce the variation trend of the PM_{2.5} concentration.

… GZ (MAE: 1.33–4.60 μg/m^{3}, RMSE: 1.77–6.52 μg/m^{3}) and XA (MAE: 1.37–6.35 μg/m^{3}, RMSE: 2.75–10.38 μg/m^{3}). The deviation of the model prediction results was larger in SY (MAE: 2.20–10.07 μg/m^{3}, RMSE: 4.51–25.37 μg/m^{3}). This result indicates that the mixed model can predict the concentration trend over time well, whether in BJ or in other regions, but the accuracy decreases gradually from the T + 6 moment.

## 4. Conclusions

… PM_{2.5} is not the only product, but it is a crucial one. Accurately predicting PM_{2.5} concentrations can help to issue air quality alerts, allow people to avoid long-term exposure to high pollution levels, and ease the burden of respiratory diseases. The PM_{2.5} concentration curves of four typical cities with regional characteristics, Shenyang, Beijing, Xi’an, and Guangzhou, were analyzed using the random forest model. The conclusions are as follows:

… PM_{2.5} in China is related to lifestyle and meteorological factors. Xi’an is located inland, so the accumulation of pollutants is mainly due to the more stagnant wind. Meanwhile, Guangzhou is located in the south, adjacent to the Pearl River, and its air humidity is higher than in other areas, so the pollutant accumulation level is low. By comparing weekend and weekday PM_{2.5} concentrations, it was found that human activities also have an impact on pollutant levels in the area.

… PM_{2.5} concentration variation characteristics have been captured well by the RF-BiLSTM model. On this basis, joint prevention and control and targeted emission-reduction policies could be established and implemented, and human health can be significantly improved.

## Supplementary Materials

… Figure S1 Comparison of PM_{2.5} concentration prediction results before and after RF model optimization in GZ; Figure S2 Comparison of PM_{2.5} concentration prediction results before and after RF model optimization in XA; Figure S3 Comparison of PM_{2.5} concentration prediction results before and after RF model optimization in SY; Figure S4 Fitting diagrams of the RF-BiLSTM, LSTM, SVM, RF, and Tree models at T + 1 to T + 3 moments in GZ; Figure S5 Fitting diagrams of the RF-BiLSTM, LSTM, SVM, RF, and Tree models at T + 1 to T + 3 moments in XA; Figure S6 Fitting diagrams of the RF-BiLSTM, LSTM, SVM, RF, and Tree models at T + 1 to T + 3 moments in SY; Figure S7 Comparison of RF-BiLSTM results at T + 1 to T + 12 moments in GZ; Figure S8 Comparison of RF-BiLSTM results at T + 1 to T + 12 moments in XA; Figure S9 Comparison of RF-BiLSTM results at T + 1 to T + 12 moments in SY.

## References

- Guan, P.; Zhou, Y.; Cheng, S.; Duan, W.; Yao, S.; Li, J.; Yue, L. Characteristics of heavy pollution process and source appointment in typical heavy industry cities. China Environ. Sci. **2020**, 40, 31–40.
- Liu, H.; Long, Z.; Duan, Z.; Shi, H. A New Model Using Multiple Feature Clustering and Neural Networks for Forecasting Hourly PM_{2.5} Concentrations, and Its Applications in China. Engineering **2020**, 6, 944–956.
- Wang, J.; Niu, T.; Wang, R. Research and Application of an Air Quality Early Warning System Based on a Modified Least Squares Support Vector Machine and a Cloud Model. Int. J. Environ. Res. Public Health **2017**, 14, 249.
- Bai, Y.; Li, Y.; Zeng, B.; Li, C.; Zhang, J. Hourly PM_{2.5} concentration forecast using stacked autoencoder model with emphasis on seasonality. J. Clean. Prod. **2019**, 224, 739–750.
- Deng, Y.; Wang, B.; Lu, Z. A hybrid model based on data preprocessing strategy and error correction system for wind speed forecasting. Energy Convers. Manag. **2020**, 212, 112779.
- O’Donnell, M.J.; Fang, J.; Mittleman, M.A.; Kapral, M.K.; Wellenius, G.A.; Investigators of the Registry of Canadian Stroke Network. Fine Particulate Air Pollution (PM_{2.5}) and the Risk of Acute Ischemic Stroke. Epidemiology **2011**, 22, 422–431.
- Guan, P.; Wang, X.; Cheng, S.; Zhang, H. Temporal and spatial characteristics of PM_{2.5} transport fluxes of typical inland and coastal cities in China. J. Environ. Sci. **2021**, 103, 229–245.
- Guan, P.; Zhang, H.; Zhang, Z.; Chen, H.; Bai, W.; Yao, S.; Li, Y. Assessment of Emission Reduction and Meteorological Change in PM_{2.5} and Transport Flux in Typical Cities Cluster during 2013–2017. Sustainability **2021**, 13, 5685.
- Wang, P.; Zhang, G.; Chen, F.; He, Y. A hybrid-wavelet model applied for forecasting PM_{2.5} concentrations in Taiyuan city, China. Atmos. Pollut. Res. **2019**, 10, 1884–1894.
- Wang, T.; Han, Y.; Hua, W.; Tang, J.; Huang, J.; Zhou, T.; Huang, Z.; Bi, J.; Xie, H. Profiling Dust Mass Concentration in Northwest China Using a Joint Lidar and Sun-Photometer Setting. Remote Sens. **2021**, 13, 1099.
- Lu, X.; Sha, Y.H.; Li, Z.; Huang, Y.; Chen, W.; Chen, D.; Shen, J.; Chen, Y.; Fung, J.C.H. Development and application of a hybrid long-short term memory – Three dimensional variational technique for the improvement of PM_{2.5} forecasting. Sci. Total Environ. **2021**, 770, 144221.
- Zhu, H.; Lu, X. The Prediction of PM_{2.5} Value Based on ARMA and Improved BP Neural Network Model. In Proceedings of the 8th International Conference on Intelligent Networking and Collaborative Systems (INCoS), Ostrava, Czech Republic, 7–9 September 2016; pp. 515–517.
- Shang, Z.; Deng, T.; He, J.; Duan, X. A novel model for hourly PM_{2.5} concentration prediction based on CART and EELM. Sci. Total Environ. **2019**, 651, 3043–3052.
- Xing, H.; Wang, G.; Liu, C.; Suo, M. PM_{2.5} concentration modeling and prediction by using temperature-based deep belief network. Neural Netw. **2021**, 133, 157–165.
- Zhao, J.; Deng, F.; Cai, Y.; Chen, J. Long short-term memory – Fully connected (LSTM-FC) neural network for PM_{2.5} concentration prediction. Chemosphere **2019**, 220, 486–492.
- Ding, A.; Huang, X.; Nie, W.; Chi, X.; Xu, Z.; Zheng, L.; Xu, Z.; Xie, Y.; Qi, X.; Shen, Y.; et al. Significant reduction of PM_{2.5} in eastern China due to regional-scale emission control: Evidence from SORPES in 2011–2018. Atmos. Chem. Phys. **2019**, 19, 11791–11801.
- Du, L.; Wang, Y.; Wu, Z.; Hou, C.; Mao, H.; Li, T.; Nie, X. PM_{2.5}-Bound Toxic Elements in an Urban City in East China: Concentrations, Sources, and Health Risks. Int. J. Environ. Res. Public Health **2019**, 16, 164.
- Dai, X.; Liu, J.; Li, Y. A recurrent neural network using historical data to predict time series indoor PM_{2.5} concentrations for residential buildings. Indoor Air **2021**, 31, 1228–1237.
- Li, T.; Hua, M.; Wu, X. A Hybrid CNN-LSTM Model for Forecasting Particulate Matter (PM_{2.5}). IEEE Access **2020**, 8, 26933–26940.
- Niu, Y.; Cheng, S.-Y.; Ou, S.-J.; Yao, S.-Y.; Shen, Z.-Y.; Guan, P.-B. Applying Photochemical Indicators to Analyze Ozone Sensitivity in Handan. Huanjing Kexue **2021**, 42, 2691–2698.
- Wu, Q.; Lin, H. A novel optimal-hybrid model for daily air quality index prediction considering air pollutant factors. Sci. Total Environ. **2019**, 683, 808–821.
- Du, S.; Li, T.; Yang, Y.; Horng, S.-J. Deep Air Quality Forecasting Using Hybrid Deep Learning Framework. IEEE Trans. Knowl. Data Eng. **2021**, 33, 2412–2424.
- Zhu, J.; Deng, F.; Zhao, J.; Zheng, H. Attention-based parallel networks (APNet) for PM_{2.5} spatiotemporal prediction. Sci. Total Environ. **2021**, 769, 145082.
- Sun, W.; Li, Z. Hourly PM_{2.5} concentration forecasting based on mode decomposition-recombination technique and ensemble learning approach in severe haze episodes of China. J. Clean. Prod. **2020**, 263, 121442.
- Zhang, L.; Lin, J.; Qiu, R.; Hu, X.; Zhang, H.; Chen, Q.; Tan, H.; Lin, D.; Wang, J. Trend analysis and forecast of PM_{2.5} in Fuzhou, China using the ARIMA model. Ecol. Indic. **2018**, 95, 702–710.
- Huang, G.; Li, X.; Zhang, B.; Ren, J. PM_{2.5} concentration forecasting at surface monitoring sites using GRU neural network based on empirical mode decomposition. Sci. Total Environ. **2021**, 768, 144516.
- Chang-Hoi, H.; Park, I.; Oh, H.-R.; Gim, H.-J.; Hur, S.-K.; Kim, J.; Choi, D.-R. Development of a PM_{2.5} prediction model using a recurrent neural network algorithm for the Seoul metropolitan area, Republic of Korea. Atmos. Environ. **2021**, 245, 118021.
- Zhang, B.; Zhang, H.; Zhao, G.; Lian, J. Constructing a PM_{2.5} concentration prediction model by combining auto-encoder with Bi-LSTM neural networks. Environ. Model. Softw. **2020**, 124, 104600.
- Cheng, Y.; Zhang, H.; Liu, Z.; Chen, L.; Wang, P. Hybrid algorithm for short-term forecasting of PM_{2.5} in China. Atmos. Environ. **2019**, 200, 264–279.
- Sawlani, R.; Agnihotri, R.; Sharma, C. Chemical and isotopic characteristics of PM_{2.5} over New Delhi from September 2014 to May 2015: Evidences for synergy between air-pollution and meteorological changes. Sci. Total Environ. **2021**, 763, 142966.
- Qi, Y.; Li, Q.; Karimian, H.; Liu, D. A hybrid model for spatiotemporal forecasting of PM_{2.5} based on graph convolutional neural network and long short-term memory. Sci. Total Environ. **2019**, 664, 1–10.
- Yang, K.; Teng, M.; Luo, Y.; Zhou, X.; Zhang, M.; Sun, W.; Li, Q. Human activities and the natural environment have induced changes in the PM_{2.5} concentrations in Yunnan Province, China, over the past 19 years. Environ. Pollut. **2020**, 265, 114878.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. **1997**, 9, 1735–1780.
- Choi, S.W.; Kim, B.H.S. Applying PCA to Deep Learning Forecasting Models for Predicting PM_{2.5}. Sustainability **2021**, 13, 3726.
- Shi, L.; Zhang, H.; Xu, X.; Han, M.; Zuo, P. A balanced social LSTM for PM_{2.5} concentration prediction based on local spatiotemporal correlation. Chemosphere **2022**, 291, 133124.
- Wei, J.; Yang, F.; Ren, X.-C.; Zou, S. A Short-Term Prediction Model of PM_{2.5} Concentration Based on Deep Learning and Mode Decomposition Methods. Appl. Sci. **2021**, 11, 6915.
- Yu, Z.; Yang, K.; Luo, Y.; Shang, C. Spatial-temporal process simulation and prediction of chlorophyll-a concentration in Dianchi Lake based on wavelet analysis and long-short term memory network. J. Hydrol. **2020**, 582, 124488.
- Niu, M.; Wang, Y.; Sun, S.; Li, Y. A novel hybrid decomposition-and-ensemble model based on CEEMD and GWO for short-term PM_{2.5} concentration forecasting. Atmos. Environ. **2016**, 134, 168–180.

**Figure 1.** Spatial distribution of the Beijing (BJ), Guangdong (GD), Xi’an (XA), and Shenyang (SY) study areas.

**Figure 5.** Comparison of PM_{2.5} concentration forecast results before and after RF model optimization at the BJ site.

**Figure 6.** Fitting diagrams of the RF-BiLSTM, LSTM, SVM, RF, and Tree models at the T + 1, T + 2, and T + 3 moments in BJ.

| Model | BJ(T + 1) R^{2} | BJ(T + 1) MAE (μg/m^{3}) | BJ(T + 1) RMSE (μg/m^{3}) | BJ(T + 2) R^{2} | BJ(T + 2) MAE (μg/m^{3}) | BJ(T + 2) RMSE (μg/m^{3}) | BJ(T + 3) R^{2} | BJ(T + 3) MAE (μg/m^{3}) | BJ(T + 3) RMSE (μg/m^{3}) |
|---|---|---|---|---|---|---|---|---|---|
| LSTM | 0.99 | 5.15 | 9.86 | 0.99 | 6.45 | 12.21 | 0.98 | 6.96 | 13.41 |
| RF-BiLSTM | 1.00 | 3.73 | 7.26 | 0.99 | 5.23 | 10.2 | 0.99 | 5.36 | 10.47 |
| SVM | 0.97 | 9.87 | 18.6 | 0.91 | 16.82 | 31.15 | 0.84 | 22.28 | 40.39 |
| RF | 0.94 | 11.48 | 23.96 | 0.87 | 19.09 | 36.69 | 0.76 | 25.37 | 49.49 |
| Tree | 0.95 | 11.29 | 21.82 | 0.88 | 18.66 | 34.57 | 0.80 | 24.47 | 45.23 |

| Moment (BJ) | T + 1 | T + 2 | T + 3 | T + 4 | T + 5 | T + 6 | T + 7 | T + 8 | T + 9 | T + 10 | T + 11 | T + 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R^{2} | 1.00 | 0.99 | 0.99 | 0.98 | 0.98 | 0.97 | 0.96 | 0.97 | 0.97 | 0.96 | 0.94 | 0.95 |
| MAE (μg/m^{3}) | 3.73 | 5.23 | 5.36 | 7.29 | 7.35 | 9.33 | 11.89 | 10.42 | 9.85 | 10.84 | 12.89 | 12.68 |
| RMSE (μg/m^{3}) | 7.26 | 10.20 | 10.47 | 13.69 | 12.81 | 16.93 | 19.03 | 17.61 | 17.11 | 19.01 | 24.14 | 23.33 |

| Model | GZ(T + 1) R^{2} | GZ(T + 1) MAE (μg/m^{3}) | GZ(T + 1) RMSE (μg/m^{3}) | GZ(T + 2) R^{2} | GZ(T + 2) MAE (μg/m^{3}) | GZ(T + 2) RMSE (μg/m^{3}) | GZ(T + 3) R^{2} | GZ(T + 3) MAE (μg/m^{3}) | GZ(T + 3) RMSE (μg/m^{3}) |
|---|---|---|---|---|---|---|---|---|---|
| LSTM | 0.98 | 1.98 | 2.68 | 0.98 | 2.39 | 3.3 | 0.96 | 2.85 | 4.03 |
| RF-BiLSTM | 0.99 | 1.33 | 1.77 | 0.98 | 2.06 | 2.82 | 0.98 | 2.26 | 3.13 |
| SVM | 0.89 | 4.64 | 6.93 | 0.79 | 6.58 | 9.77 | 0.70 | 7.91 | 11.66 |
| RF | 0.89 | 4.89 | 7.12 | 0.78 | 6.91 | 10.09 | 0.70 | 8.31 | 11.78 |
| Tree | 0.89 | 4.96 | 4.96 | 0.78 | 6.86 | 10.01 | 0.69 | 8.4 | 11.93 |

| Model | XA(T + 1) R^{2} | XA(T + 1) MAE (μg/m^{3}) | XA(T + 1) RMSE (μg/m^{3}) | XA(T + 2) R^{2} | XA(T + 2) MAE (μg/m^{3}) | XA(T + 2) RMSE (μg/m^{3}) | XA(T + 3) R^{2} | XA(T + 3) MAE (μg/m^{3}) | XA(T + 3) RMSE (μg/m^{3}) |
|---|---|---|---|---|---|---|---|---|---|
| LSTM | 1.00 | 1.65 | 3.07 | 0.99 | 2.09 | 3.92 | 0.99 | 2.08 | 4.43 |
| RF-BiLSTM | 1.00 | 1.37 | 2.75 | 0.99 | 1.65 | 3.52 | 0.99 | 1.78 | 4.16 |
| SVM | 0.99 | 3.03 | 4.72 | 0.96 | 5.75 | 8.85 | 0.92 | 8.07 | 12.19 |
| RF | 0.99 | 3.37 | 5.3 | 0.96 | 6.04 | 9.25 | 0.91 | 8.51 | 12.91 |
| Tree | 0.99 | 3.58 | 5.44 | 0.95 | 6.36 | 9.61 | 0.91 | 8.83 | 13.2 |

| Model | SY(T + 1) R^{2} | SY(T + 1) MAE (μg/m^{3}) | SY(T + 1) RMSE (μg/m^{3}) | SY(T + 2) R^{2} | SY(T + 2) MAE (μg/m^{3}) | SY(T + 2) RMSE (μg/m^{3}) | SY(T + 3) R^{2} | SY(T + 3) MAE (μg/m^{3}) | SY(T + 3) RMSE (μg/m^{3}) |
|---|---|---|---|---|---|---|---|---|---|
| LSTM | 0.99 | 3.85 | 9.22 | 0.98 | 5.11 | 11.24 | 0.97 | 6.11 | 15.07 |
| RF-BiLSTM | 1.00 | 2.2 | 4.51 | 0.99 | 3.17 | 6.71 | 0.99 | 4.79 | 10.35 |
| SVM | 0.97 | 8.31 | 15.3 | 0.91 | 14.07 | 25.58 | 0.84 | 18.36 | 33.17 |
| RF | 0.93 | 9.85 | 21.58 | 0.84 | 16.68 | 33.59 | 0.72 | 22.17 | 44.08 |
| Tree | 0.94 | 10.58 | 20.67 | 0.83 | 16.82 | 34.66 | 0.76 | 21.54 | 41.05 |

| Moment (GZ) | T + 1 | T + 2 | T + 3 | T + 4 | T + 5 | T + 6 | T + 7 | T + 8 | T + 9 | T + 10 | T + 11 | T + 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R^{2} | 0.99 | 0.98 | 0.98 | 0.96 | 0.95 | 0.95 | 0.94 | 0.93 | 0.94 | 0.93 | 0.92 | 0.91 |
| MAE (μg/m^{3}) | 1.33 | 2.06 | 2.26 | 2.91 | 3.19 | 3.38 | 3.57 | 3.84 | 3.75 | 3.97 | 4.13 | 4.6 |
| RMSE (μg/m^{3}) | 1.77 | 2.82 | 3.13 | 4.20 | 4.60 | 4.87 | 5.14 | 5.48 | 5.36 | 5.67 | 5.91 | 6.52 |

| Moment (XA) | T + 1 | T + 2 | T + 3 | T + 4 | T + 5 | T + 6 | T + 7 | T + 8 | T + 9 | T + 10 | T + 11 | T + 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R^{2} | 1.00 | 0.99 | 0.99 | 0.99 | 0.98 | 0.97 | 0.97 | 0.97 | 0.98 | 0.97 | 0.95 | 0.94 |
| MAE (μg/m^{3}) | 1.37 | 1.65 | 1.78 | 2.16 | 2.69 | 4.19 | 3.98 | 4.09 | 3.61 | 4.67 | 5.93 | 6.35 |
| RMSE (μg/m^{3}) | 2.75 | 3.52 | 4.16 | 4.82 | 5.82 | 7.12 | 7.26 | 7.32 | 7.01 | 8.19 | 9.57 | 10.38 |

| Moment (SY) | T + 1 | T + 2 | T + 3 | T + 4 | T + 5 | T + 6 | T + 7 | T + 8 | T + 9 | T + 10 | T + 11 | T + 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R^{2} | 1.00 | 0.99 | 0.99 | 0.97 | 0.71 | 0.94 | 0.94 | 0.93 | 0.94 | 0.90 | 0.87 | 0.91 |
| MAE (μg/m^{3}) | 2.2 | 3.17 | 4.79 | 5.9 | 6.2 | 7.75 | 7.88 | 8.89 | 8.45 | 10.6 | 10.65 | 10.07 |
| RMSE (μg/m^{3}) | 4.51 | 6.71 | 10.35 | 14.14 | 14.17 | 20.27 | 19.90 | 22.20 | 20.73 | 26.52 | 29.59 | 25.37 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhao, J.; Yuan, L.; Sun, K.; Huang, H.; Guan, P.; Jia, C.
Forecasting Fine Particulate Matter Concentrations by In-Depth Learning Model According to Random Forest and Bilateral Long- and Short-Term Memory Neural Networks. *Sustainability* **2022**, *14*, 9430.
https://doi.org/10.3390/su14159430
