A Deep Belief Network Combined with Modified Grey Wolf Optimization Algorithm for PM2.5 Concentration Prediction

Abstract: Accurate PM2.5 concentration prediction is crucial for protecting public health and improving air quality. As a popular deep learning model, the deep belief network (DBN) has received increasing attention for PM2.5 concentration prediction because of its effectiveness. However, the DBN structure parameters, which have a significant impact on prediction accuracy and computation time, are hard to determine. To address this issue, a modified grey wolf optimization (MGWO) algorithm is proposed to optimize the DBN structure parameters, namely the number of hidden nodes, the learning rate, and the momentum coefficient. The methodology modifies the basic grey wolf optimization (GWO) algorithm with nonlinear convergence and position update strategies, and then uses the training error of the DBN to calculate the fitness function of the MGWO algorithm. Over multiple iterations, the optimal structure parameters are obtained, and a suitable predictor is finally generated. The proposed prediction model is validated on a real application case. Compared with the other prediction models, experimental results show that the proposed model has a simpler structure but higher prediction accuracy.


Introduction
With the continuous advancement of industrialization and urbanization, man-made pollution from sources such as industrial production and automobile exhaust is increasing. As a consequence, urban air quality has gradually deteriorated, which seriously affects people's living environment. PM2.5, one of the main pollutants in the urban atmosphere, has a small particle size, which allows it to stay in the atmosphere for a long time. It can enter the human body through breathing and then deposit in the alveoli, causing great harm to the human body [1]. Besides, PM2.5 has a negative impact on atmospheric visibility and climate change [2]. Therefore, accurate PM2.5 concentration prediction is necessary for controlling air pollution and protecting human health.
Early prediction methods for PM2.5 concentration were mostly simple regression analyses, in which a regression equation between the influencing factors and the pollutant is established [3][4][5]. Subsequently, the back-propagation (BP) neural network [6,7], the support vector machine (SVM) [8,9], and other machine learning methods were developed for concentration prediction. However, because these traditional machine learning methods are shallow, they struggle to learn the intrinsic relationship between the influencing factors and the pollutant, and their prediction accuracy is not satisfactory. In 2006, Hinton et al. proposed a deep learning model called the deep belief network (DBN) [10]. It consists of multiple restricted Boltzmann machines (RBMs) and a back-propagation (BP) neural network, which allows it to process large amounts of data. In recent years, the DBN has been successfully applied to fault diagnosis, wind speed forecasting, breast cancer classification, and so on [11][12][13][14][15][16]. Because of its advantage in prediction accuracy, this paper introduces the DBN for PM2.5 concentration prediction. However, determining the structure parameters is an important issue when building a DBN for PM2.5 concentration. Unsatisfactory structure parameters (the number of hidden nodes, the learning rate, and the momentum coefficient) reduce the prediction accuracy and increase the calculation time.
This paper aims to establish a prediction model with high accuracy for PM2.5 concentration. The main contributions are as follows:
1. An advanced deep learning model, the DBN, is introduced to predict the PM2.5 concentration; it establishes a close relationship between the influencing factors and the pollutant.
2. A modified grey wolf optimization (MGWO) algorithm is proposed to determine the DBN structure parameters, which improves the prediction accuracy of PM2.5 concentration and reduces the computation time.
3. The proposed model is successfully applied to PM2.5 concentration prediction in Baoding city, China, where air pollution is particularly serious.
The rest of this paper is organized as follows. Section 2 describes the DBN and the MGWO algorithm, including how they are combined. Section 3 presents the data source and the process of model establishment, together with several experiments that validate the effectiveness and superiority of the proposed prediction model. Section 4 gives the concluding remarks and future work.

Deep Belief Network
The deep belief network (DBN) is a probabilistic generative model with multiple hidden layers. It maximizes the generation probability of the entire model by training the weights between nodes. Figure 1 shows the basic structure of the DBN. It consists of several restricted Boltzmann machines (RBMs) and a back-propagation (BP) neural network. An RBM, whose output is fed into the input of the next RBM, is a two-layer neural network with bidirectional connections. Thus, a structure with multiple hidden layers can be built by stacking RBMs.
Appl. Sci. 2019, 9, x
Figure 2 shows the basic structure of the RBM, where $v_i$ and $h_j$ are the visible and hidden nodes of the input and output layers, respectively. The visible and hidden nodes are fully connected in both directions, while there are no connections between nodes within the same layer. $b$ and $c$ denote the biases of the output and input layers, respectively, while $w$ denotes the connection weight between a visible node and a hidden node. Thus, the model parameter set $\theta$ consists of $b$, $c$, and $w$.
The energy function of the RBM is defined as
$$E(v, h \mid \theta) = -\sum_{i} c_i v_i - \sum_{j} b_j h_j - \sum_{i}\sum_{j} v_i w_{ij} h_j \tag{1}$$
Then, the joint probability distribution of the visible and hidden nodes can be obtained from the energy function:
$$P(v, h \mid \theta) = \frac{\exp\left(-E(v, h \mid \theta)\right)}{Z(\theta)} \tag{2}$$
$$Z(\theta) = \sum_{v}\sum_{h} \exp\left(-E(v, h \mid \theta)\right) \tag{3}$$
where $Z(\theta)$ is a partition function for normalization.
In the RBM, given the input vector $v$, the activation probability of the hidden node $h_j$ of the output layer can be expressed as
$$P(h_j = 1 \mid v, \theta) = \frac{1}{1 + \exp\left(-b_j - \sum_{i} v_i w_{ij}\right)} \tag{4}$$
Given the output vector $h$, the activation probability of the visible node $v_i$ of the input layer can be expressed as
$$P(v_i = 1 \mid h, \theta) = \frac{1}{1 + \exp\left(-c_i - \sum_{j} w_{ij} h_j\right)} \tag{5}$$
To obtain the optimal solution of the model, the negative log-likelihood of the training set $D$ is taken as the loss function and is given by
$$L(\theta, D) = -\frac{1}{N}\sum_{n=1}^{N} \log P\left(v^{(n)} \mid \theta\right) \tag{6}$$
where $N$ is the size of the training set. After that, each parameter is updated using the partial derivatives of the loss function with respect to the parameter set $\theta$, as follows.
$$-\frac{\partial L}{\partial w_{ij}} = \langle v_i h_j \rangle_d - \langle v_i h_j \rangle_m, \qquad -\frac{\partial L}{\partial c_i} = \langle v_i \rangle_d - \langle v_i \rangle_m, \qquad -\frac{\partial L}{\partial b_j} = \langle h_j \rangle_d - \langle h_j \rangle_m \tag{7}$$
where $\langle\cdot\rangle_d$ denotes the statistical probability of the samples, and $\langle\cdot\rangle_m$ denotes the generation probability of the model. By adjusting the weights of each node, the DBN makes the statistical probability of the samples match the generation probability of the model as closely as possible.
The training process of the DBN is divided into two stages. The first stage trains each RBM from the bottom up. The second stage fine-tunes the network parameters from the top down. In the RBM, the unbiased statistical probability of the samples can be calculated using Equations (4) and (5), while the unbiased generation probability of the model is difficult to obtain. To address this issue, Hinton proposed the contrastive divergence algorithm [17], which approximates the RBM distribution with a single step of Gibbs sampling. The process is described by Equation (8), where σ and λ denote the learning rate and the momentum coefficient, respectively.
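To make the contrastive divergence step concrete, the following is a minimal NumPy sketch of one CD-1 update with momentum for a binary RBM. It is an illustration under stated assumptions, not the exact update rule of this paper: the function name `cd1_step`, the use of probabilities rather than samples in the negative phase, and the choice to apply momentum only to the weights are all assumptions. Following the text, `b` is the hidden (output) bias and `c` is the visible (input) bias; `lr` plays the role of σ and `momentum` the role of λ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, w, b, c, dw, lr=0.1, momentum=0.5, rng=None):
    """One CD-1 update (illustrative sketch, not the paper's exact rule).

    v0: (batch, n_visible) inputs in [0, 1]
    w:  (n_visible, n_hidden) weights
    b:  (n_hidden,) hidden biases; c: (n_visible,) visible biases
    dw: previous weight increment, used by the momentum term
    """
    rng = np.random.default_rng() if rng is None else rng
    # Positive phase: hidden activation probabilities given the data.
    ph0 = sigmoid(v0 @ w + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: a single Gibbs step (reconstruct v, then h).
    pv1 = sigmoid(h0 @ w.T + c)
    ph1 = sigmoid(pv1 @ w + b)
    # Data statistics minus one-step model statistics.
    grad_w = (v0.T @ ph0 - pv1.T @ ph1) / v0.shape[0]
    dw = momentum * dw + lr * grad_w      # momentum on the weights only (assumption)
    w = w + dw
    b = b + lr * (ph0 - ph1).mean(axis=0)
    c = c + lr * (v0 - pv1).mean(axis=0)
    return w, b, c, dw
```

Calling `cd1_step` repeatedly over random mini-batches corresponds to the bottom-up pre-training stage of a single RBM; stacking RBMs would feed `sigmoid(v @ w + b)` of one layer into the next.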

Modified Grey Wolf Optimization Algorithm
The grey wolf optimization (GWO) algorithm [18] simulates the intelligent hunting behavior of grey wolves. According to the hierarchical mechanism, the grey wolf population can be divided, from high to low, into the chief wolf α, the deputy chief wolf β, the ordinary wolf δ, and the bottom wolf ω. When the wolves capture prey, the other individuals are organized to besiege the prey under the leadership of wolf α. In the GWO algorithm, α, β, and δ denote the individuals with the best, second best, and third best fitness, respectively, while the remaining individuals are denoted ω. The position of the prey is defined as the global optimal solution of the optimization problem. The GWO algorithm can be briefly described as follows.
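In a D-dimensional search space, suppose that the position of the ith grey wolf is $X_i = (X_i^1, X_i^2, \ldots, X_i^D)$, where $X_i^d$ denotes the position of the ith grey wolf in the dth dimension. First, the grey wolf population surrounds the prey. The mathematical description of the process is
$$D = \left|C \cdot X_p(t) - X(t)\right|, \qquad X(t+1) = X_p(t) - A \cdot D \tag{9}$$
where $t$ is the current number of iterations, $X_p(t)$ and $X(t)$ are the positions of the prey and of a grey wolf, respectively, and $A$ and $C$ are coefficients given by
$$A = 2a \cdot r_1 - a \tag{10}$$
$$C = 2 r_2 \tag{11}$$
where $r_1$ and $r_2$ denote random variables in [0, 1], and $a$ is the convergence factor. During the evolution of the algorithm, the convergence factor $a$ changes with the current number of iterations and is given by
$$a = 2 - \frac{2t}{T} \tag{12}$$
where $T$ is the maximum number of iterations. The convergence factor $a$ thus decreases from 2 to 0 as $t$ increases, which shifts the algorithm from global search to local search.
Next, the grey wolf population hunts. The process, which is guided by α, β, and δ, updates the positions of all individuals. The mathematical description is
$$D_\alpha = \left|C_1 \cdot X_\alpha(t) - X(t)\right|, \quad D_\beta = \left|C_2 \cdot X_\beta(t) - X(t)\right|, \quad D_\delta = \left|C_3 \cdot X_\delta(t) - X(t)\right|,$$
$$X_1 = X_\alpha(t) - A_1 \cdot D_\alpha, \quad X_2 = X_\beta(t) - A_2 \cdot D_\beta, \quad X_3 = X_\delta(t) - A_3 \cdot D_\delta \tag{13}$$
$$X(t+1) = \frac{X_1 + X_2 + X_3}{3} \tag{14}$$
Finally, the grey wolves attack and capture the prey. The attack behavior is mainly achieved by linearly decreasing the convergence factor $a$ from 2 to 0. When $|A| \le 1$, the grey wolves concentrate on attacking the prey, which corresponds to the local search of the algorithm; when $|A| > 1$, the grey wolves disperse for the global search.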
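The surrounding and hunting steps of the basic GWO algorithm can be sketched in a few lines of NumPy. This is a minimal sketch of the standard update for a minimisation problem (linear convergence factor, random coefficients A and C, and a plain average of the three leader-guided candidates); the function name `gwo_step` and the array layout are assumptions made for illustration.

```python
import numpy as np

def gwo_step(positions, fitness, t, T, rng):
    """One position update of the basic GWO algorithm (minimisation).

    positions: (n_wolves, dim) current positions
    fitness:   (n_wolves,) fitness of each wolf (lower is better)
    t, T:      current and maximum iteration numbers
    """
    order = np.argsort(fitness)
    leaders = positions[order[:3]]            # alpha, beta, delta
    a = 2.0 - 2.0 * t / T                     # linear convergence factor
    n, dim = positions.shape
    candidates = np.zeros((3, n, dim))
    for j, leader in enumerate(leaders):
        A = 2.0 * a * rng.random((n, dim)) - a
        C = 2.0 * rng.random((n, dim))
        D = np.abs(C * leader - positions)    # distance to this leader
        candidates[j] = leader - A * D
    # Basic GWO: unweighted average of the three candidates.
    return candidates.mean(axis=0)
```

At `t = T` the factor `a` is zero, so every wolf moves to the mean of the three leaders, which is the purely local end of the search.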
Balancing local search and global search is an important problem often encountered in swarm intelligence algorithms [19]. Local search speeds up the convergence of the algorithm, while global search maintains the diversity of the population. For the GWO algorithm, we find that the convergence process in practice is not linear, so a nonlinear convergence factor may be more appropriate for balancing local and global search. Thus, a new nonlinear convergence factor is proposed to replace the original linear convergence factor (Equation (12)); it is given by Equation (15), where k denotes the attenuation order of the nonlinear convergence factor and takes an integer value between 0 and 5. A nonlinear convergence factor with larger k decreases more sharply. Figure 3 shows the convergence factors for different values of k. At the beginning of the iterations, a decays slowly, which favors global search. In the later stage of the iterations, a decays faster, which favors accurate local search. In this paper, k is set to 3.
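The behaviour described above can be illustrated with a small Python function. The exact expression of the nonlinear convergence factor is not reproduced here, so the form `a = 2 * (1 - (t / T) ** k)` used below is an assumption chosen only because it matches the stated properties: it starts at 2, ends at 0, reduces to the linear factor for k = 1, and for larger k decays slowly at first and sharply at the end. The function name is likewise hypothetical.

```python
def convergence_factor(t, T, k=3):
    """Assumed nonlinear convergence factor (illustrative form only).

    t: current iteration, T: maximum number of iterations,
    k: attenuation order; k = 1 recovers the linear factor 2 - 2t/T.
    """
    return 2.0 * (1.0 - (t / T) ** k)
```

Under this assumed form with T = 50, the midpoint value at t = 25 is 1.0 for k = 1 but 1.75 for k = 3, so the cubic factor keeps the search exploratory for longer before dropping quickly towards zero.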
The original GWO algorithm updates the grey wolf positions by averaging the three best grey wolf positions. However, this update strategy ignores the differences between the solutions, which may lower the accuracy of the final solution. In this paper, we design a weighting factor that accounts for the contribution of each solution. Thus, Equation (14) is replaced by the weighted update of Equation (16), with the weighting factors defined in Equation (17), where $f_\alpha$, $f_\beta$, and $f_\delta$ denote the fitness values of α, β, and δ, respectively, and $w_\alpha$, $w_\beta$, and $w_\delta$ denote the weighting factors of α, β, and δ, respectively.
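As a rough illustration of a fitness-weighted position update, the sketch below combines the three leader-guided candidates with weights derived from the leaders' fitness values. The weighting rule used here (weights proportional to inverse fitness, so that a lower error gives a larger weight) is purely an assumption for illustration and should not be read as the definition of the weighting factors in this paper; the function name and the `eps` guard are also hypothetical.

```python
import numpy as np

def weighted_update(x1, x2, x3, f_alpha, f_beta, f_delta, eps=1e-12):
    """Weighted combination of the three candidates (assumed weighting rule).

    x1, x2, x3: candidate positions guided by alpha, beta, delta
    f_*:        non-negative fitness values (lower is better)
    Returns the new position and the weights (w_alpha, w_beta, w_delta).
    """
    inv = 1.0 / (np.array([f_alpha, f_beta, f_delta], dtype=float) + eps)
    w = inv / inv.sum()                       # weights sum to 1
    new_x = w[0] * np.asarray(x1) + w[1] * np.asarray(x2) + w[2] * np.asarray(x3)
    return new_x, w
```

With equal fitness values the rule falls back to the plain average of the basic GWO algorithm, which makes the difference between the two update strategies easy to see.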

DBN Structure Parameters Determined by MGWO Algorithm
Because there is no effective training algorithm for the DBN structure parameters (the number of hidden layers, the number of hidden nodes, the learning rate, and the momentum coefficient), their selection mainly relies on manual experience or repeated experiments. To address this issue, the modified grey wolf optimization (MGWO) algorithm is proposed to optimize the number of hidden layers, the number of hidden nodes, the learning rate, and the momentum coefficient. Figure 4 shows how the MGWO algorithm and the DBN are combined: N grey wolves are selected, and the DBN structure parameters are searched in parallel.
The parameter optimization process of the DBN based on the MGWO algorithm is as follows.
Step 1: Initialize the grey wolf population. Each individual position consists of the number of hidden layers l, the number of hidden nodes n, the learning rate σ, and the momentum coefficient λ.
Step 2: Learn the training samples and take the mean square error of the DBN prediction results as the individual fitness function of the MGWO algorithm.
Step 3: Calculate a according to Equation (15) and update A and C.
Step 4: Calculate w according to Equation (17) and update the individual positions according to Equations (13) and (16).
Step 5: Return the optimal individual position if the termination condition is reached; otherwise, repeat Steps 3 to 5.
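The five steps above can be assembled into a short search loop. The following is a sketch under several loudly stated assumptions rather than the authors' implementation: `fitness_fn` is a hypothetical stand-in for "train a DBN with these parameters and return its mean square error", the nonlinear convergence factor uses the assumed form `2 * (1 - (t / T) ** k)`, the weights are assumed to be proportional to inverse fitness, and fitness is re-evaluated at every iteration so that the leaders stay up to date.

```python
import numpy as np

def mgwo_search(fitness_fn, lower, upper, n_wolves=20, max_iter=50,
                k=3, tol=0.02, seed=0):
    """Sketch of an MGWO-style search loop (minimisation, assumed formulas)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    # Step 1: random initial population inside the search bounds.
    pos = lower + rng.random((n_wolves, dim)) * (upper - lower)
    best_pos, best_fit = None, np.inf
    for t in range(max_iter):
        # Step 2: evaluate fitness (stand-in for the DBN training error).
        fit = np.array([fitness_fn(p) for p in pos])
        order = np.argsort(fit)
        if fit[order[0]] < best_fit:
            best_fit, best_pos = float(fit[order[0]]), pos[order[0]].copy()
        # Step 5: stop once the error threshold is reached.
        if best_fit <= tol:
            break
        # Step 3: assumed nonlinear convergence factor.
        a = 2.0 * (1.0 - (t / max_iter) ** k)
        leaders = pos[order[:3]]
        # Step 4: assumed inverse-fitness weights, then weighted update.
        inv = 1.0 / (fit[order[:3]] + 1e-12)
        weights = inv / inv.sum()
        new_pos = np.zeros_like(pos)
        for w_j, leader in zip(weights, leaders):
            A = 2.0 * a * rng.random((n_wolves, dim)) - a
            C = 2.0 * rng.random((n_wolves, dim))
            new_pos += w_j * (leader - A * np.abs(C * leader - pos))
        pos = np.clip(new_pos, lower, upper)
    return best_pos, best_fit
```

In practice `fitness_fn` would be expensive, since each call trains a network, so the population size and iteration budget dominate the total running time.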
The key to finding the global optimal solution in the MGWO algorithm is to determine the termination condition and the fitness function [20,21]. In this paper, the training error of the DBN is used to calculate the fitness function of the MGWO algorithm, and the training error threshold of the DBN is taken as the termination condition of the MGWO algorithm. The training error is calculated as follows.
Step 1: If the fitness of a grey wolf of the tth generation is $f(l_t, n_t, \sigma_t, \lambda_t)$, then the number of hidden layers, the number of hidden nodes, the learning rate, and the momentum coefficient are denoted $l_t$, $n_t$, $\sigma_t$, and $\lambda_t$. Then, initialize the DBN parameter set θ consisting of weights and biases.
Step 2: Let $v_0$ be the input sample vector, q the number of DBN iterations, and e the training error of the DBN.
Step 3: Calculate the feature vectors $h_0, v_0, h_1, v_1, \ldots, h_{l_t}$ of the visible and hidden layers of the RBMs according to Equations (4) and (5).
Step 4: Obtain the joint probability distributions of the initial state and the updated state of the RBM according to Equation (7), and then substitute them into Equation (8) to modify the parameter set θ.
Step 5: Iterate over the training set q times with random batches, repeating Steps 3 and 4.
Step 6: Fine-tune θ using the BP algorithm.
Step 7: Calculate $h_{l_t}$ using θ and obtain the training error $e = h_{l_t} - v_0$.
Thus, the MGWO algorithm is combined with the DBN through the fitness function. The fitness value reflects the quality of the DBN structure parameters and is used to generate a suitable predictor.

Data Source
Baoding city is located in the central part of Hebei Province, China. The city has a land area of 22,190 square kilometers and four distinct seasons. In recent years, air pollution in the city has become more and more serious, with severe haze especially in winter. In this paper, we take Baoding city as the research area and aim to establish a long-term and effective prediction model for PM2.5 concentration.
For the research area, we obtained three kinds of data: PM2.5, aerosol optical depth (AOD), and meteorological parameters. Table 1 reports the related parameters of the data source. A brief description follows.

1. PM2.5 data: The PM2.5 data come from the air pollution particle monitoring station in Baoding city, in units of µg/m³. The data are sourced from the China meteorological website (http://www.tianqihoubao.com/lishi/), and the selected period is 2014-2016.

2. Aerosol optical depth: Aerosol is the general term for solid and liquid particulate matter suspended in the atmosphere. AOD, one of the optical properties of atmospheric aerosols, is equal to the integral of the aerosol extinction coefficient from the ground to the top of the atmosphere. It characterizes the degree of extinction caused by aerosol scattering in cloudless vertical atmospheric columns. AOD data are derived from MODIS aerosol products. MODIS offers two AOD products, with resolutions of 10 km and 3 km. Considering the small ground coverage of Baoding city, the MOD04_3K product with a resolution of 3 km is chosen. The data are sourced from the official website of MODIS products, and the selected period is 2014-2016.

3. Meteorological parameters: Monitoring stations in Baoding city provide 9 meteorological parameters: average temperature, maximum temperature, minimum temperature, air pressure, average relative humidity, total precipitation, average visibility, average wind speed, and maximum continuous wind speed. The data are sourced from the global weather data website (https://en.tutiempo.net/climate), and the selected period is 2014-2016.

Model Establishment and Verification
First, 1058 sets of usable data are obtained. To give the data the same range and promote network convergence, the original sample data are normalized by $(x - x_{\min})/(x_{\max} - x_{\min})$, where $x$, $x_{\min}$, and $x_{\max}$ denote the original data and the minimum and maximum values of the original data, respectively.
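The normalization formula above can be applied column by column to a feature matrix. The helper below is a minimal sketch; the function name and the handling of constant columns (left at zero instead of dividing by zero) are assumptions added for robustness, not part of the described procedure.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column of x to [0, 1] via (x - x_min) / (x_max - x_min)."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Guard against constant columns (assumption: map them to 0).
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (x - x_min) / span
```

When the model is later applied to unseen data, the minimum and maximum from the training data would normally be reused so that both sets share the same scale.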
Next, the MGWO algorithm is used to search the number of hidden nodes, the learning rate, and the momentum coefficient of the DBN in parallel. In the MGWO algorithm, the population size is set to 20 and the maximum number of iterations is set to 50. The DBN adopts a classic four-layer structure consisting of an input layer, a first hidden layer (H1), a second hidden layer (H2), and an output layer. In the DBN, the maximum number of RBM iterations is set to 50, the maximum number of BP neural network iterations is set to 100, and the training error threshold is set to 0.02. The search range for the number of hidden layer nodes is set between 0 and 500, and the search ranges for the learning rate and the momentum coefficient are set between 0 and 1.
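Because a grey wolf position is a real-valued vector while node counts must be integers, some decoding step is needed before a DBN can be built from it. The sketch below shows one plausible way to do this for the four-layer setting described above; the function name, the dictionary keys, the rounding rule, and the lower bound of 1 node (instead of 0, which would give an empty layer) are all assumptions.

```python
import numpy as np

# Search bounds: [H1 nodes, H2 nodes, learning rate, momentum].
# A lower bound of 1 node is an assumption to avoid empty layers.
LOWER = np.array([1.0, 1.0, 0.0, 0.0])
UPPER = np.array([500.0, 500.0, 1.0, 1.0])

def decode_position(position):
    """Map a real-valued wolf position to DBN structure parameters."""
    p = np.clip(np.asarray(position, dtype=float), LOWER, UPPER)
    return {
        "h1_nodes": int(round(p[0])),
        "h2_nodes": int(round(p[1])),
        "learning_rate": float(p[2]),
        "momentum": float(p[3]),
    }
```

For example, a position of `[6.2, 4.7, 0.077, 0.807]` decodes to 6 nodes in H1, 5 nodes in H2, a learning rate of 0.077, and a momentum coefficient of 0.807.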
Figure 5 shows the distribution of the grey wolf population during the MGWO optimization of the hidden nodes. The grey wolf population obtains information about the solution during the search process. Through the surrounding, hunting, and attacking operations, the population gradually gathers in the region of the optimal solution. In the experiment, the initial population of the MGWO algorithm is randomly distributed in the search space. As the iterations proceed, the grey wolves approach the optimal solution step by step. After eight iterations, the MGWO algorithm finds the optimal solution for the DBN: the number of H1 nodes is 6, the number of H2 nodes is 5, the learning rate is 0.077, and the momentum coefficient is 0.807.
The DBN with these parameters is used to learn the training samples. Building the DBN can be divided into two steps. The first step trains each RBM separately. This is an unsupervised process, which ensures that feature information is preserved as much as possible when feature vectors are mapped to different feature spaces. The second step fine-tunes the weights and biases of the network. The BP neural network takes the output feature vector of the RBM as its input feature vector and trains the whole network in a supervised manner.
After the MGWO-optimized DBN (MGWODBN) model is established, it is verified in two ways:
1. The MGWODBN model predicts all of the training data, and the linear fitting equation of the observed and predicted values is y = 1.117x − 7.947, where y is the actual observed value and x is the predicted value of the model. The root mean square error (RMSE) is 18.532 µg/m³, and the coefficient of determination (R²) is 0.713. Figure 6a shows the verification results. The sample points are roughly distributed on both sides of the diagonal line and are fairly concentrated, indicating that the model fits well.
2. Cross-validation: 90% of the data are randomly selected to train the model, and the remaining 10% are used as verification points. Ten repeated experiments show that the linear fitting equation of the observed and predicted values is y = 1.200x − 11.710. The RMSE is 19.815 µg/m³, and the R² is 0.677. Figure 6b shows the verification results. The sample points are roughly distributed on both sides of the diagonal line, and few sample points deviate from it, which is consistent with the law of error distribution. These results show that the verification results are good.
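The verification quantities reported above (linear fit of observed against predicted values, RMSE, and R²) can be computed as in the following sketch. The function name and return format are hypothetical, and R² is taken here as the squared Pearson correlation of the linear fit, which is an assumption because the exact definition used for the reported values is not stated.

```python
import numpy as np

def verification_metrics(observed, predicted):
    """Linear fit y = slope * x + intercept (y observed, x predicted), RMSE, R^2."""
    y = np.asarray(observed, dtype=float)
    x = np.asarray(predicted, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)        # least-squares straight line
    rmse = float(np.sqrt(np.mean((y - x) ** 2)))  # error of the raw predictions
    r2 = float(np.corrcoef(x, y)[0, 1] ** 2)      # assumed definition of R^2
    return {"slope": float(slope), "intercept": float(intercept),
            "rmse": rmse, "r2": r2}
```

For the cross-validation variant, the same function would be applied to the held-out 10% of each random split and the results summarized over the ten repetitions.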

Compared with Other Prediction Models
The mean absolute error (MAE), the mean square error (MSE), and the R² are popular indicators for evaluating prediction results [22][23][24][25]. Thus, we use the three indicators to evaluate the

Figure 3. Comparison of convergence factors with different values of k.


Figure 5. Distribution of grey wolf population in the MGWO optimization process. (a) Initial population. (b) The fourth-generation population. (c) The eighth-generation population.


Table 1. Related parameters of data source.