A Hybrid Model for PM 2.5 Concentration Forecasting Based on Neighbor Structural Information, a Case in North China

Abstract: PM 2.5 concentration prediction is an important task in atmospheric environment research, and many prediction models have been established for it, including machine learning algorithms, which show remarkable generalization ability. Time series of PM 2.5 concentration have implicit structural characteristics, such as the sequential characteristic in the time dimension and the high-dimensional characteristic in the dynamic-mode space, which distinguish them from other research data. However, when a machine learning algorithm is applied to PM 2.5 time series prediction, these structural characteristics cannot be fully reflected because of the way the input data are composed. In this study, a neighbor structural information extraction algorithm based on dynamic decomposition is proposed to represent the structural characteristics of time series, and a new hybrid prediction system is established that uses the extracted neighbor structural information to improve the accuracy of PM 2.5 concentration prediction. To extract the neighbor structural information, the original PM 2.5 concentration series is decomposed into a finite number of dynamic modes according to the neighborhood data, which reflects the structural characteristics of the time series. The hybrid model integrates the neighbor structural information as an element of the input vector, which ensures the applicability of the neighbor structural information and retains the composition of the original prediction system. The experimental results for six cities show that the hybrid prediction systems integrating neighbor structural information are significantly superior to the traditional models, and confirm that the neighbor structural information extraction algorithm can capture effective time series structural information.


Introduction
In 2015, PM 2.5 was considered the fifth leading risk factor of death, causing 4.2 million deaths worldwide [1,2]. PM 2.5 concentration is an essential index of atmospheric environment quality: the higher the PM 2.5 concentration, the more serious the air pollution [3]. PM 2.5 mostly stems from the burning of fossil fuels and from industrial production processes, and often carries toxic organic ingredients and heavy metals, posing serious health problems to human beings [4–6]. Many studies and investigations have confirmed that PM 2.5 is an important inducer of cardiovascular diseases, lung cancer, respiratory system diseases and so on [5,7,8]. It is noteworthy that particulate matter induces oxidative stress, leading to potential cell damage [9]. With its high PM 2.5 concentration, North China is a representative city cluster with poor air quality, which has aroused great concern among the public and the government [10,11]. Therefore, accurate and effective prediction of PM 2.5 concentration is an important means of mitigating its impacts on human health and welfare [12].
At present, Air Quality Forecast (AQF) systems are mainly divided into deterministic and empirical models in terms of technique [7,8,13,14]. Relevant research shows that deterministic models, which simulate the highly complex transport and diffusion of air pollutants on the basis of physicochemical processes, cannot fully explain the high-dimensional nonlinearity of the correlated substances forming air pollutants, because the pollution sources and model parameters are uncertain; as a result, the prediction accuracy of deterministic models is lower than that of well-developed data-driven empirical models [7,15,16]. In contrast, empirical models, especially machine learning models that use large amounts of research data as modeling elements, can simplify the modeling process, accurately represent the complicated nonlinear relationships between the predicted pollutant concentration and potential influencing factors, and have good generalization ability [7,15,17]. The artificial neural network (ANN), with its self-learning mechanism, and the support vector machine (SVM), which aims at structural risk minimization, can accurately simulate the nonlinear characteristics of atmospheric motion and are widely applied to single-step and multi-step prediction of atmospheric pollutant concentration [3,7,18–21]. Qi et al. [12] put forward a hybrid prediction model for PM 2.5 concentration that combines deep learning methods with long short-term memory (LSTM), using historical pollutant concentration, meteorological data, and spatial and temporal terms as system inputs. Ma et al. [22] constructed a new interpolation/extrapolation algorithm using an LSTM neural network, which was successfully used in PM 2.5 prediction. Suleiman et al. [18] analyzed the emission reduction effect for traffic-related PM 10 and PM 2.5 at 19 stations in London using an evaluation system built from ANN, BRT (boosted regression tree) and SVM. Zheng et al. [2] extracted the dynamic variation characteristics of daily satellite images with a convolutional neural network, and then applied random forest regression to estimate ground-level PM 2.5. Zhou et al. [23] integrated a copula function into a hybrid model composed of multiple deterministic ANN and Bayesian models, effectively eliminating the data conversion and error correction processes in order to obtain accurate ensemble probability predictions of PM 2.5. Zhou et al. [15] investigated a new multi-objective SVM that could solve the error accumulation problem and enhance the forecasting accuracy of PM 2.5 in Taipei City. Yang et al. [24] proposed a space-time SVM that could handle spatial heterogeneity, with performance superior to the benchmark model on Beijing's PM 2.5 concentration prediction task. Biancofiore et al. [25] compared PM 10 and PM 2.5 simulation experiments with a recursive neural network, multiple linear regression and a traditional ANN on the Adriatic coast over three years, and showed that the improved model has superior generalization ability. Niu et al. [3] constructed a mixed system in which the daily PM 2.5 concentration, decomposed by empirical mode decomposition, serves as the input of a traditional SVM, significantly improving the precision of single-step prediction and the ability to judge direction. Neto et al. [26] extracted the "deterministic" component of the time series by decomposition and combined it with four ANNs to improve the prediction accuracy of PM 2.5 and PM 10.
The above research shows that machine learning algorithms are well suited to PM 2.5 time series analysis, and that combining them with other intelligent algorithms can significantly improve the applicability of a general model to specific data. Whether ANN or SVM is used for time series prediction, the principle is to determine the nonlinear relationship between the independent variable x, composed of lag information, and the dependent variable y, representing the future value [27]. In other words, the information needed for time series prediction comes from the observations close to the prediction point. To a large extent, fully mining the information carried by the neighbor data is the key to better performance in prediction tasks. In many prediction systems based on machine learning models, the neighbor information is expressed as an input matrix consisting of the lags in the past domain, which can be identified by the partial autocorrelation function (PACF) [28–30]. The input structure of such a prediction model depends only on the lag terms and cannot reflect the structural information of the sequence formed by the neighbor data used for prediction. For example, the temporal correlation between adjacent data at different time points, a very important feature of time series data, is lost. This disadvantage limits the generalization ability of the prediction model. We believe that time series data constitute an implicit function, so that the observation at a prediction time point is determined by the implicit function of its neighbor sequence. Because of the dynamics and complexity of time series, the function of the neighbor sequence is likely to differ for prediction targets at different times. Therefore, it is desirable to construct the best approximation function from the neighbor sequence of each prediction time point, so as to reflect the internal structure of the data as far as possible and fully represent the neighbor structural information.
In our study, we put forward a new hybrid prediction system combining the neighbor structural information, which is characterized by time series structure based on the principle of dynamic decomposition. It solves the issues such as the difficulty to determine the lag order and the unexpressed sequence characteristics of the time series data in the time dimension when the traditional machine learning algorithm is applied to time series prediction. The innovations of this paper can be described as follows: (1) a novel neighbor structural information extraction model is put forward by means of dynamic decomposition, which embodies the time structural characteristics in the dynamic-mode space and uses the optimal combination to construct the neighbor structural information series; (2) a new hybrid prediction system is constructed, which is based on machine learning algorithms and integrates neighbor structural information in a simple way to obtain higher PM 2.5 concentration prediction accuracy.

Study Area and Available Data
North China is embedded in an area bounded by Yan Mountain in the north and Taihang Mountain in the west, a setting that contributes to the accumulation of aerosols and leads to frequent pollution incidents [31]. Because North China, which includes the political and economic center of China, is one of the most polluted areas, six of its representative cities, Beijing, Tianjin, Shijiazhuang, Taiyuan, Zhengzhou and Jinan, are selected as the sampling sites for atmospheric pollution analysis [31,32]. Figure 1 shows the distribution of these sampling sites. The research data include six pollutants monitored at hourly frequency [33]. The observation period spans from 1 January 2019 to 31 January 2019 and comprises 744 observation samples, each of which is a vector composed of PM 2.5, PM 10, SO 2, NO 2, O 3 and CO. Compared with daily or monthly data, hourly concentration data display stronger dependence in the time dimension; that is, the pollutant concentration to be predicted is closely related to the neighboring samples. Such a data set is therefore a reasonable and meaningful choice for studying the neighbor structural information of a series.

Methods
When machine learning algorithms are applied to time series prediction, the functional relationship y = f(X, Y) between the independent variable X, which is composed of the past time series, and the dependent variable Y (the predicted values) must be determined, where f represents the prediction model and y is the prediction result. The normalized input variables of the prediction system are observed at equal intervals and can include both the historical values of the prediction target and the related influence factors. In essence, the task of the machine learning model is to build the prediction model $y_{t+1} = f(X_t, \ldots, X_{t-n})$, which uses the lagged variables as features, where each input variable $X_i = \{x_{i,1}, \ldots, x_{i,m}\}$ is a vector of m observed quantities. This analysis shows that the modeling process of a machine learning model pays insufficient attention to structural characteristics, such as the sequential feature of time series data in the time dimension and the noise interference in the series. The value at a given time point in a time series is closely related to the adjacent data in its neighborhood and can be regarded as determined by the fitting function of the historical data in that neighborhood. Accordingly, mining the neighbor structural information is an effective means of achieving higher prediction accuracy.
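The lag-based input construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `make_lagged_samples`, its parameter `n_lags` (the number of stacked observation vectors) and the toy values are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def make_lagged_samples(
    series: Sequence[Sequence[float]], target_col: int, n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (x, y) pairs: x stacks the n_lags most recent observation
    vectors X_t, X_{t-1}, ... and y is the target value at time t + 1."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags - 1, len(series) - 1):
        window: List[float] = []
        for lag in range(n_lags):  # most recent observation first
            window.extend(series[t - lag])
        inputs.append(window)
        targets.append(series[t + 1][target_col])
    return inputs, targets


# Toy example: five hourly records of (PM2.5, PM10); the values are made up.
records = [(10.0, 20.0), (12.0, 22.0), (15.0, 27.0), (14.0, 25.0), (11.0, 21.0)]
X, y = make_lagged_samples(records, target_col=0, n_lags=2)
print(X[0], y[0])
```

Each row of `X` is a flat vector with no notion of ordering between its entries, which is the limitation the section points out: the learner sees the lags as independent features rather than as a sequence.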

Neighbor Structural Information Extraction Algorithm Based on Time Series Dynamic Decomposition
This paper proposes a method of extracting neighbor structural information based on dynamic decomposition, which can make full use of the unrevealed dynamics of time series. We first divide the time series into several parts to separate the dynamic modes that are helpful to forecasting, and then carry out an optimal combination of the decomposed series to extract the neighbor structural information. For any given series r(t), the prediction at a time point $t_f$ mainly depends on the time series information before $t_f$. Furthermore, the value at $t_f$ is more sensitive to the data closer to it; that is, data closer to $t_f$ in the time dimension show stronger correlation, and the dependence decays asymptotically to zero as the distance $\mathrm{dist}(t, t_f) \to \infty$. This characteristic is reflected in the newly developed method.
We give an example to show the main idea of the dynamic decomposition of a time series. For clarity, we consider a continuous time series. Let r be a sinusoidal signal with frequency ω and amplitude α, i.e., r(t) = α sin ωt. Injecting r into two stable first-order systems with zero initial state, we have

$$\dot{x}_j(t) = -\lambda_j x_j(t) + r(t), \quad x_j(0) = 0, \quad \lambda_j > 0, \quad j = 1, 2. \tag{1}$$

A simple computation shows that

$$x_j(t) = \int_0^t e^{-\lambda_j (t-s)} r(s)\, ds = \alpha \int_0^t e^{-\lambda_j (t-s)} \sin \omega s \, ds, \tag{2}$$

and thus

$$x_j(t) = \frac{\alpha}{\sqrt{\lambda_j^2 + \omega^2}} \sin(\omega t - \varphi_j) + \frac{\alpha \omega}{\lambda_j^2 + \omega^2}\, e^{-\lambda_j t}, \tag{3}$$

where

$$\varphi_j = \arctan\frac{\omega}{\lambda_j}, \quad j = 1, 2. \tag{4}$$

As a consequence of Formula (3), for $\lambda_1 \neq \lambda_2$ there exist two constants $k_1$ and $k_2$ such that

$$r(t) = k_1 x_1(t) + k_2 x_2(t) - \sum_{j=1}^{2} \frac{k_j \alpha \omega}{\lambda_j^2 + \omega^2}\, e^{-\lambda_j t}, \tag{5}$$

which implies that, when t or $\lambda_j$ is large enough,

$$r(t) \approx k_1 x_1(t) + k_2 x_2(t) = \sum_{j=1}^{2} k_j \int_0^t e^{-\lambda_j (t-s)} r(s)\, ds. \tag{6}$$

Formula (6) shows that the projection of r on the space spanned by the $x_j(t)$ is a weighted mean value of r with respect to the weight function $e^{-\lambda_j t}$. Since the weight function decays exponentially, the time series decomposition mainly uses the neighbor structural information before t. Moreover, since system Equation (1) is stable, the high-frequency noise added to r can be filtered. In other words, such an approximation is robust to high-frequency noise. To sum up, there are four highlights of the dynamic series decomposition:

• It separates the dynamic modes hidden in the time series itself, so that the neighbor structural information can be used sufficiently through an optimized combination;
• The neighbor structural information can be chosen by tuning the parameter $\lambda_j$, j = 1, 2, which reflects the characteristic of the time series mentioned above;
• Compared with machine learning models, the structural characteristics of time series data in the time dimension are preserved;
• The white noise in the time series is filtered, so the decomposition is robust to white noise.

Now, we return to the general case. Suppose that (A, B) is controllable with state space $\mathbb{R}^n$ and input space $\mathbb{R}$. Let r(t) be a general time series. We divide the series r by the following system:

$$\dot{x}(t) = A x(t) + B r(t). \tag{8}$$

We solve Equation (8) to get

$$x(t) = e^{At} x(0) + \int_0^t e^{A(t-s)} B\, r(s)\, ds, \tag{9}$$

which implies that the information of the injection r(t) is decomposed into n parts and is contained in the components $x_j(t)$, j = 1, 2, ..., n. Inspired by the aforementioned example, the projection $\hat{r}$ of r on span{$x_1, x_2, \ldots, x_n$} can approximate r effectively. Moreover, such an approximation is robust to white noise in some sense, because the high-frequency noise can be filtered by the integral in Equation (9) provided A is Hurwitz. The choice of A, B and the order n depends on the prior information about the time series r. The tuning parameter λ(A) is defined through the spectrum σ(A) of the matrix A: −λ(A) is the largest real part of the eigenvalues of A, i.e., $-\lambda(A) = \max\{\operatorname{Re}\mu : \mu \in \sigma(A)\}$. Roughly speaking, the order n should be larger than the number of modes contained in r. The choice of λ(A) depends on the sampling frequency of the time series: the higher the frequency, the larger the parameter λ(A) that is required.
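The sinusoidal example can be checked numerically with a short Python sketch. It rests on several assumptions that are not taken from the paper: the first-order systems are discretized with an exact zero-order hold, the "optimal combination" is taken to be an ordinary least-squares fit of r on the modes, and the names `first_order_mode`, `solve_linear`, `neighbor_information` and `burn_in` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def first_order_mode(r: Sequence[float], lam: float, dt: float) -> List[float]:
    """Simulate x' = -lam * x + r with x(0) = 0, holding r constant over
    each sampling interval (exact zero-order-hold discretization)."""
    decay = math.exp(-lam * dt)
    gain = (1.0 - decay) / lam
    x = [0.0]
    for k in range(len(r) - 1):
        x.append(decay * x[-1] + gain * r[k])  # x[k+1] only sees r[0..k]
    return x


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    sol = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (m[i][n] - s) / m[i][i]
    return sol


def neighbor_information(
    r: Sequence[float], lams: Sequence[float], dt: float, burn_in: int = 0
) -> Tuple[List[float], List[List[float]], List[float]]:
    """Assumed 'optimal combination': least-squares fit r[k] ~ sum_j k_j x_j[k],
    estimated on the samples after burn_in (once the transients have decayed)."""
    modes = [first_order_mode(r, lam, dt) for lam in lams]
    idx = range(burn_in, len(r))
    gram = [[sum(xi[k] * xj[k] for k in idx) for xj in modes] for xi in modes]
    rhs = [sum(xi[k] * r[k] for k in idx) for xi in modes]
    coeffs = solve_linear(gram, rhs)
    r_hat = [sum(c * x[k] for c, x in zip(coeffs, modes)) for k in range(len(r))]
    return coeffs, modes, r_hat


# Sinusoidal test signal r(t) = sin(omega * t), two modes with lambda = 1 and 2.
dt, omega = 0.05, 1.0
r = [math.sin(omega * k * dt) for k in range(1000)]
coeffs, modes, r_hat = neighbor_information(r, lams=[1.0, 2.0], dt=dt, burn_in=400)
max_err = max(abs(a - b) for a, b in zip(r[400:], r_hat[400:]))
print(coeffs, max_err)
```

Because each mode at step k depends only on samples of r before k, the combined series `r_hat` is built purely from neighbor data that precede the point being approximated, which is the property the hybrid model relies on.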

The Hybrid Model Based on Neighbor Structural Information for PM 2.5 Concentration Prediction
In order to utilize the neighbor structural information effectively, a hybrid model which combines the neighbor structural information with the traditional machine learning algorithm is proposed. The basic idea of the hybrid model is to integrate the neighbor structural information into the modeling data set in the form of input vector elements. Compared with the traditional modeling process, although the steps of neighbor structural information extraction will increase, the efficiency of the prediction system will not be reduced considering the simple implementation of the neighbor structural information extraction algorithm. It can be seen that the effectiveness of neighbor structural information is the key to improving the generalization ability of hybrid prediction system. The modeling flow of the hybrid system is illustrated by Figure 2, and the specific process based on neighbor structural information is shown via the following Algorithm 1.

Algorithm 1 PM 2.5 concentration prediction hybrid model based on neighbor structural information
Require: [...] $x_i$ is the input vector composed of influence factors at time i, and $y_i$ is the output formed by the concentration value of PM 2.5 at the corresponding time.
Ensure: the prediction value $f(x_{t+1})$ of PM 2.5 at time t + 1.
1–2: [...] $x_i$ entails the neighbor structural information $\hat{r}_{i-1}$, and $y_i$ is the PM 2.5 concentration at time i.
3: According to the principle of the previous step, the input vector $x_{t+1} = (\hat{r}_t, \mathrm{PM\,2.5}_t, \mathrm{PM\,10}_t, \ldots)$ of the prediction model is constructed at time t + 1.
4: Train the traditional machine learning algorithm on the previously constructed data set $\{(x_i, y_i)\}_{i=1}^{t}$; the optimal parameters are selected by 10-fold cross validation.
5: Input the vector $x_{t+1}$ at time t + 1 into the trained prediction system to obtain the prediction result $f(x_{t+1})$.
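The data flow of Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the pluggable `fit` callable and the placeholder learner `fit_persistence` are hypothetical, the column order of the pollutant vector is assumed, and all numbers are made up. In practice `fit` would wrap an ANN or SVM tuned by 10-fold cross validation.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Model = Callable[[Vector], float]


def build_hybrid_dataset(
    ni: Sequence[float], pollutants: Sequence[Sequence[float]], pm25_col: int = 0
) -> Tuple[List[Vector], List[float]]:
    """Pair x_i = (ni[i-1], pollutants[i-1]) with y_i = PM2.5 at time i."""
    if len(ni) != len(pollutants):
        raise ValueError("ni and pollutants must have the same length")
    xs = [[ni[i - 1], *pollutants[i - 1]] for i in range(1, len(ni))]
    ys = [pollutants[i][pm25_col] for i in range(1, len(ni))]
    return xs, ys


def predict_next(
    ni: Sequence[float],
    pollutants: Sequence[Sequence[float]],
    fit: Callable[[List[Vector], List[float]], Model],
) -> float:
    """Train on all available pairs, then evaluate f(x_{t+1}) on the newest record."""
    xs, ys = build_hybrid_dataset(ni, pollutants)
    model = fit(xs, ys)
    x_next = [ni[-1], *pollutants[-1]]
    return model(x_next)


def fit_persistence(xs: List[Vector], ys: List[float]) -> Model:
    """Placeholder learner (assumption): returns the latest PM2.5 value, x[1]."""
    return lambda x: x[1]


# Made-up records ordered as (PM2.5, PM10, SO2, NO2, O3, CO) plus a made-up NI series.
pollutants = [
    (50.0, 80.0, 10.0, 40.0, 20.0, 1.0),
    (55.0, 85.0, 11.0, 42.0, 18.0, 1.1),
    (60.0, 90.0, 12.0, 45.0, 15.0, 1.2),
]
ni = [48.0, 53.0, 58.0]
xs, ys = build_hybrid_dataset(ni, pollutants)
forecast = predict_next(ni, pollutants, fit_persistence)
print(len(xs[0]), ys, forecast)
```

Keeping the learner behind a callable mirrors the point made in the text: the neighbor structural information enters only as one extra input element, so the underlying prediction system keeps its original form.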

Performance Evaluation Index of Prediction Model
To prove the generalization ability of the hybrid prediction system with neighbor structural information, the measures describing system performance from different perspectives are applied in this study.
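For orientation, the six measures can be sketched in Python using commonly used definitions. These definitions are assumptions: IA is taken as Willmott's index of agreement, DA as the share of steps whose predicted direction of change matches the observed one, and MFB and MFE as the mean fractional bias and error; the paper's exact formulas may differ in detail.

```python
import math
from typing import Sequence


def mae(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y, y_hat)) / len(y))


def ia(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Index of agreement (Willmott form, assumed)."""
    mean_y = sum(y) / len(y)
    num = sum((p - o) ** 2 for o, p in zip(y, y_hat))
    den = sum((abs(p - mean_y) + abs(o - mean_y)) ** 2 for o, p in zip(y, y_hat))
    return 1.0 - num / den


def da(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Direction accuracy: predicted and observed changes share the same sign."""
    hits = sum(
        1 for i in range(1, len(y)) if (y[i] - y[i - 1]) * (y_hat[i] - y[i - 1]) > 0
    )
    return hits / (len(y) - 1)


def mfb(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean fractional bias (assumed form, as a fraction rather than percent)."""
    return 2.0 / len(y) * sum((p - o) / (p + o) for o, p in zip(y, y_hat))


def mfe(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean fractional error (assumed form, as a fraction rather than percent)."""
    return 2.0 / len(y) * sum(abs(p - o) / (p + o) for o, p in zip(y, y_hat))


# Made-up observations and predictions.
obs = [10.0, 12.0, 11.0, 15.0]
pred = [11.0, 11.0, 12.0, 14.0]
print(mae(obs, pred), rmse(obs, pred), ia(obs, pred), da(obs, pred))
print(mfb(obs, pred), mfe(obs, pred))
```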
The measures used are MAE, RMSE, IA, DA, MFB and MFE, in which n is the sample size, $y_i$ denotes the observations and $\hat{y}_i$ represents the prediction results obtained by the forecasting system. MAE and RMSE measure the error between the predicted and observed values: the closer they are to zero, the higher the forecasting accuracy of the algorithm. IA represents the correlation between the forecasts and the observations, and DA indicates how accurately the forecasts capture the trend of the time series; larger values of both mean better prediction performance. MFB and MFE values close to zero represent better generalization ability of the prediction model.

Data Statistics and Analysis
Table 1 lists the statistics of the monitored concentrations of the six atmospheric pollutants used for atmospheric environment assessment, computed with EViews software. The mean PM 2.5 concentration in Shijiazhuang is the largest among the six research sites, reaching 136.7097 µg/m³, far beyond the 24-hour average limit of 75 µg/m³ set by the National Ambient Air Quality Standard of China (GB 3095-2012) [34]. It is followed by Taiyuan, with a PM 2.5 concentration of 130.3347 µg/m³. The main reason why the PM 2.5 concentration in these two cities seriously exceeds the standard is that more coal is burned for power and indoor heating in winter, accompanied by higher levels of SO₄²⁻ and of coal-related ions such as NH₄⁺ and Cl⁻ than in other seasons [35–37]. In addition, the PM 2.5 concentration is further aggravated during the heating period. Notably, Beijing has the lowest mean PM 2.5 concentration but shows the maximum monitored value, 428.0000 µg/m³, on 12 January 2019, during a haze event. Overall, the high PM 2.5 concentrations of the six cities show that aerosol pollution in North China is an urgent problem. Further, the largest mean values of PM 10, NO 2 and CO appear in Shijiazhuang, the largest mean value of SO 2 in Taiyuan, and the largest mean value of O 3 in Beijing. According to the national standard, PM 2.5 is the primary air pollutant currently monitored in most cases. Furthermore, Std.Dev. (standard deviation) indicates the degree to which the monitored values deviate from the mean value, and the maximum Std.Dev. is obtained for the Shijiazhuang data set. Analyzing the statistics of the monitored pollutant concentrations reveals the distribution attributes at the different stations. PM 2.5, which has the properties of a mixed pollutant, is in theory clearly correlated with the other monitored pollutants. To quantify more accurately how strongly PM 2.5 is influenced by the other air pollutants, the mutual information is calculated with Equation (16):

$$I(X_{\mathrm{PM}_{2.5}}; Y_i) = \sum_{x \in X_{\mathrm{PM}_{2.5}}} \sum_{y \in Y_i} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}, \tag{16}$$

where $X_{\mathrm{PM}_{2.5}}$ and $Y_i$ denote PM 2.5 and a certain air pollutant, respectively, p(x, y) is their joint probability distribution, and p(x) and p(y) are the marginal distributions. Table 2 shows the results calculated according to Formula (16). The results clearly show that, at all sites, PM 2.5 has the largest mutual information with PM 10, which verifies that these two variables have the strongest correlation. This is mainly because PM 2.5 and PM 10 are both particulate pollutants that differ only in particle diameter. In terms of monitoring, the PM 10 concentration includes the PM 2.5 concentration, so a very strong coherence between them is to be expected. Because of the shared pollution sources, PM 2.5 makes up the majority of PM 10, typically ranging from 60% to 80% of PM 10 [38]. Among the single pollutants, the mutual information values of NO 2 are greater than 3.8 in all data sets, which means that NO 2 has a profound impact on PM 2.5. Cities in North China face serious NO 2 pollution, so treating NO 2 is beneficial for controlling PM 2.5 concentration. The pollutant with the next strongest correlation with PM 2.5 is SO 2 in Tianjin, Shijiazhuang, Taiyuan and Jinan, and O 3 in Beijing and Zhengzhou. Although CO has the weakest influence on PM 2.5 among all monitored pollutants, it is still an indispensable factor for PM 2.5 forecasting. In addition, the scatter plot of the features (Figure 3) shows that PM 2.5 has a strong linear correlation with PM 10 and CO, that the relationship between PM 2.5 and both SO 2 and NO 2 is also obviously linear, and that PM 2.5 and O 3 show a logarithmic correlation. It can be concluded that the other pollutants have a great impact on PM 2.5. The high correlation between the air pollutants arises because they are affected by the same sources and experience the same meteorological influences (mainly transport and dispersal). It is therefore reasonable to select the above air pollutants as the input data of the prediction system.
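A simple way to estimate the mutual information between PM 2.5 and another pollutant from samples is a histogram (plug-in) estimator, sketched below. The estimator, the equal-width binning, the bin count `n_bins` and the base-2 logarithm are all assumptions; the paper does not state how its mutual information values were computed, so absolute values from this sketch are not comparable with those in Table 2.

```python
import math
from collections import Counter
from typing import List, Sequence


def discretize(values: Sequence[float], n_bins: int) -> List[int]:
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # constant series fall into a single bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(
    x: Sequence[float], y: Sequence[float], n_bins: int = 10
) -> float:
    """Plug-in estimate of I(X; Y) in bits from binned joint frequencies."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j])) for (i, j), c in joint.items()
    )


# Sanity checks on synthetic data: a series shares all its information with
# itself and none with a constant series.
ramp = [float(v) for v in range(100)]
flat = [5.0] * 100
print(mutual_information(ramp, ramp), mutual_information(ramp, flat))
```

The estimate depends strongly on the number of bins, so it is best used to rank candidate inputs against each other under one fixed binning, as the section does when comparing pollutants.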

Neighbor Structural Information Extraction
For extracting the neighbor structural information and mining the structural characteristics of the original time series, this paper exploits neighbor structural information extraction algorithm based on dynamic decomposition that decomposes the original PM 2.5 series into the dynamic modes, and then forms the neighbor structural information series by optimal combination. Figure 4 demonstrates that the original PM 2.5 sequences are decomposed into three dynamic model subseries DF i , i = 1, 2, 3 with very similar frequencies and gradually increasing amplitudes, which are values in different feature directions in the dynamic modes space. NI is the series representing the neighbor structural information obtained by the optimal combination of the dynamic model subseries. Table 3 lists the mutual information values between the extracted neighbor structural information and PM 2.5 , which are all greater than 4.2, indicating that the neighbor structural information has a strong correlation with PM 2.5 . It can be concluded that the neighbor structural information effectively covers the structural information in the fields adjacent to the prediction points and has potential ability to improve the prediction accuracy.

Results of the Hybrid Model Based on Neighbor Structural Information
In order to prove that the extracted neighbor structural information can be effectively integrated into the traditional machine learning algorithms, the representative single model ANN and SVM are selected as the basic models to construct the hybrid prediction system. The first 480 data in the experimental data set are allocated as training-validation data set, while the rest of data (264 samples) are used for testing.
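The split described above is chronological, which a short sketch makes explicit. The helper name `chronological_split` is hypothetical; the sizes follow the text (31 days of hourly data, the first 480 samples for training and validation).

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(samples: Sequence[T], n_train: int) -> Tuple[List[T], List[T]]:
    """Keep time order: the first n_train samples train, the rest test."""
    if not 0 < n_train < len(samples):
        raise ValueError("n_train must lie strictly between 0 and len(samples)")
    return list(samples[:n_train]), list(samples[n_train:])


# Hour indices for 1 to 31 January 2019 (31 days of hourly observations).
hours = list(range(31 * 24))
train, test = chronological_split(hours, 480)
print(len(hours), len(train), len(test))
```

Splitting by position rather than at random keeps every test sample later in time than every training sample, so the evaluation does not leak future observations into training.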

Prediction Results Comparison between ANN and ANN N I
In establishing the ANN N I model, the input layer comprises 7 nodes, the hidden layer contains 4 nodes, and the output layer gives the PM 2.5 concentration. The logistic sigmoid function, a well-behaved function, is selected as the activation function of the hidden layer nodes. In addition, the model parameters are optimized by cross validation to obtain the best prediction results. The prediction performance of ANN and ANN N I is assessed in Table 4, which lists the prediction errors for the six data sets during the test period. Because the ANN N I model integrates the neighbor structural information, and therefore contains the implicit structural characteristics of the PM 2.5 series in the neighbor domain of the prediction points, it obtains better results under the indicators given in this paper. The MAEs of the hybrid ANN N I models for the different test sets are 4.2307, 4.3846, 8.5755, 11.8033, 6.8320 and 5.4676, respectively, clearly lower than the 6.5432, 5.3457, 9.3995, 12.6996, 8.5289 and 7.0559 obtained by the basic ANN models. The RMSE values lead to the same conclusion. In addition, compared with the ANNs, the ANN N I models have larger IA and DA, which means that the ANN N I predictions correlate more strongly with the observed values and judge the trend more accurately. Figure 5 shows the time series plots of the test set predictions and observations. The predicted values of both ANN and ANN N I are very close to the observed values, indicating that the predictions are of high accuracy. However, at the extreme points and their adjacent points, the ANN N I models respond more sensitively, which benefits from their incorporation of the structural information of the neighbor domain. It is particularly noteworthy that the ANN N I models need to determine the span of the neighbor domain when extracting the neighbor structural information, which is to a certain extent an application of the lags. By combining ANN with neighbor structural information, the prediction model not only gains the ability to express the spatial characteristics of the high-dimensional dynamic modes provided by the neighbor structural information, but also helps to solve the problem that the lag order of the ANN model is difficult to determine.
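The size of the improvement can be made concrete by computing the relative MAE reduction from the values quoted above. The helper `relative_reduction` is hypothetical, and the data sets are indexed in the order the values are listed, because the text does not state which city each value belongs to.

```python
from typing import List

# Test-set MAEs quoted in the text, in the order they are listed.
ann_mae: List[float] = [6.5432, 5.3457, 9.3995, 12.6996, 8.5289, 7.0559]
ann_ni_mae: List[float] = [4.2307, 4.3846, 8.5755, 11.8033, 6.8320, 5.4676]


def relative_reduction(base: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `base`."""
    return 100.0 * (base - improved) / base


reductions = [relative_reduction(b, h) for b, h in zip(ann_mae, ann_ni_mae)]
for index, value in enumerate(reductions, start=1):
    print(f"data set {index}: MAE reduced by {value:.2f}%")
```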

Prediction Results Comparison between SVM and SVM N I
SVM N I is a hybrid system based on SVM that contains the neighbor structural information. SVM, which takes structural risk minimization as its goal, has a global optimal solution. However, when SVM is applied to time series prediction, the input vector is usually composed of historical data and influence factors and is then mapped to a high-dimensional space by a kernel function to obtain the optimal regression function. Therefore, the structural information of time series data cannot be reflected in the traditional SVM model. In the modeling process, the grid search method is used to optimize the parameters. With the optimal parameters, the prediction results of SVM and SVM N I are given in Table 4. The MAE of SVM N I on the Tianjin data set is 27.58% lower than that of SVM, the best improvement among the six data sets, followed by a 23.72% decline for Beijing. For the Shijiazhuang data, SVM N I reduces the MAE by only 1.93%, which is less impressive. For most data sets, SVM N I significantly improves the model accuracy; the effect on individual data sets may not be obvious, but at the least the generalization ability of the original model is not reduced. Compared with SVM, the improvement in the IA index of SVM N I is modest, but the DA index is greatly improved, indicating that SVM N I achieves more accurate trend prediction. Furthermore, Figure 6 shows the frequency distribution of the prediction errors over different error ranges for the test set of each site, from which it can be observed that more of the prediction errors of the SVM N I model lie around 0 than those of the SVM model. In general, the SVM N I predictions are closer to the observed values, showing higher prediction accuracy. Moreover, the applicability of the neighbor structural information to the SVM model further demonstrates the robustness of the neighbor structural information extraction algorithm proposed in this paper.

Prediction Results Comparison between ANN N I and SVM N I
In this paper, the PM 2.5 hybrid models ANN N I and SVM N I are based on the traditional ANN and SVM, respectively, and greatly improve their generalization ability. According to the performance criteria given in Table 4, the proposed hybrid approach improves the forecast system in several respects, such as prediction accuracy, direction discrimination and correlation. The results therefore indicate that incorporating the neighbor structural information helps to recover the structural characteristics that are missing when a machine learning algorithm is applied to a time series forecasting task. They also show that the proposed neighbor structural information extraction algorithm based on dynamic decomposition is an effective technique for analyzing structural characteristics in the time dimension. However, because the basic models differ, the generalization ability of hybrid prediction systems based on different machine learning algorithms also differs. As shown in Table 4, the SVM N I predictions are superior to those of the ANN N I model on the evaluation indexes. Because of the dynamics and timeliness of time series prediction, the training data set will not be large, so the SVM model, which minimizes structural risk, is more appropriate for small data sets. In Figure 7, the range of the prediction errors of the models is shown by boxplots based on quartile values. According to the distribution of the prediction errors of the different models, the forecasting errors of the hybrid systems are closer to 0, and the SVM N I model is more accurate than the ANN N I model. In particular, the outliers of SVM N I are closer to 0, which means that SVM N I can effectively correct outlying predictions. Figure 8 is the soccer plot composed of the MFB and MFE indexes. The MFB and MFE values of the models fall into the continuous box, indicating that the prediction results are acceptable. Furthermore, they lie in the dashed box, which means that the prediction models have good generalization ability. Figure 8 also clearly shows that the statistical index results of the SVM N I model are closer to 0 than those of ANN N I. In particular, for the MFB index of the Beijing site, the result obtained by the SVM N I model moves into the area marking accurate prediction results. Table 5 gives the mean test of the model residuals with and without neighbor structural information. We notice that there is a significant difference between ANN and SVM, which may lead to the difference in prediction performance between ANN N I and SVM N I.
Note (Table 5): The number in brackets is the p-value; *** indicates the 1% significance level and ** indicates the 5% significance level.
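The reading of a soccer plot can be expressed as a small classification rule. The thresholds below are an assumption: they are the widely used goal and criteria boxes for particulate matter model performance (MFB within ±30% and MFE at most 50% for the inner box; MFB within ±60% and MFE at most 75% for the outer box), and the paper does not state which limits its Figure 8 uses. The function name `soccer_zone` and its labels are hypothetical.

```python
from typing import Tuple


def soccer_zone(
    mfb: float,
    mfe: float,
    goal: Tuple[float, float] = (0.30, 0.50),      # assumed inner-box limits
    criteria: Tuple[float, float] = (0.60, 0.75),  # assumed outer-box limits
) -> str:
    """Classify an (MFB, MFE) pair, both given as fractions, by soccer-plot box."""
    if mfe < 0:
        raise ValueError("MFE is non-negative by definition")
    if abs(mfb) <= goal[0] and mfe <= goal[1]:
        return "goal"
    if abs(mfb) <= criteria[0] and mfe <= criteria[1]:
        return "criteria"
    return "outside"


# Made-up (MFB, MFE) pairs for illustration.
for pair in [(0.10, 0.20), (-0.50, 0.60), (0.70, 0.90)]:
    print(pair, soccer_zone(*pair))
```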

Conclusions
Time series prediction of PM 2.5 concentration is an essential and practical topic in the field of atmospheric research. Time series data has distinct characteristics in data structure, and its effective use is very helpful for the prediction system to obtain higher prediction accuracy. According to the modeling principle of machine learning algorithm, this paper proposes an algorithm of extracting neighbor structural information based on dynamic decomposition, and integrates it into machine learning model to construct a hybrid prediction model based on neighbor structural information. We use dynamic decomposition to decompose the original series into multiple dynamic modes to realize the structural information represented by the neighbor data, and then construct the neighbor structural information series through optimal combination. The simulation results of six groups of experimental data all show that the prediction model combined with neighbor structural information can obtain more accurate prediction results.
Therefore, the following conclusions can be drawn: (1) The method of time dynamic decomposition is suitable for extracting neighbor structural information that represents the structural characteristics of time series. (2) The hybrid prediction model integrates the neighbor structural information to make up for the lack of time series structural characteristics when machine learning models perform time series prediction. (3) The neighbor structural information extraction algorithm based on dynamic decomposition is generally applicable to traditional machine learning models. However, different basic machine learning algorithms lead to different prediction abilities of the hybrid models. (4) Structural characteristics are inherent features of time series. Therefore, the algorithm proposed in this paper is also applicable to other time series forecasting tasks in addition to PM 2.5 concentration prediction.