A Combination Forecasting Strategy for Precipitation, Temperature and Wind Speed in the Southeastern Margin of the Tengger Desert

Abstract: Global warming is inevitably the cause of local climate change, which will have a profound impact on regional ecology, especially in the desertified steppe and steppefied desert transition zones with fragile ecological environments. To investigate the trends of precipitation, temperature and wind speed, and thereby support the restoration and protection of desert ecosystems, a combination forecasting strategy comprising data pre-processing, sub-model selection and parameter optimization was proposed. Three numerical simulation experiments, based on a combination model whose weights are optimized by the particle swarm optimization algorithm, were designed to forecast the precipitation, temperature and wind speed in the southeastern margin of the Tengger Desert in China. Numerical results showed that the proposed combination prediction method has higher forecasting accuracy and better robustness than single neural network models and hybrid models. The proposed method is useful for analyzing climate change in arid regions.


Introduction
Global warming and changes in precipitation [1] will inevitably affect the composition, structure and function of biological soil crust (BSC) [2,3]. As a pioneer of degraded vegetation restoration [4,5], BSC has become an important component of the surface of arid and semiarid areas through its microbial community metabolism [3,6]. It is widespread in arid and semiarid regions as one of the major components of desert ecosystems, because BSC has developed strong resistance to drought, extreme temperatures and UV-B radiation [7,8], which allows it to adapt to extreme environments. Its existence and development are an important indicator of the reversal of the ecological environment [9], as well as an important indicator for evaluating the health of desert ecosystems [10].
Studies have shown that the main abiotic factors affecting BSC are water, temperature, wind speed, light, etc., among which water is considered to be the most important factor affecting the ecological and physiological functions of BSC [2,8,11]. Almost all studies on the ecological and physiological functions of BSC involve the influence of water [10]. Scarce rain and high temperature are the two major constraints during the growing season in the desert [3]. Temperature and water often determine the distribution limit of plants, restricting their germination, growth rate and all physiological changes. Temperature is also one of the important factors affecting plant growth and development [3,6,12]. The physiological functions and metabolic rate of BSC are closely related to temperature changes. [...] Particle swarm optimization (PSO), which is an optimization algorithm based on swarm intelligence with better performance than the genetic algorithm (GA) [30], is adopted to optimize the weights of the traditional combination model.

Study Area and Research Data
China is seriously affected by sandstorms and desertification, especially in the desertified steppe and steppefied desert transition zones to the west of the Helan Mountains, where annual precipitation is less than 200 mm. This area is also key to the construction of non-irrigated vegetation restoration and of the national ecological barrier in the north of China [4,5]. Meteorological factors such as temperature, precipitation and wind speed were collected at the Shapotou Desert Research and Experiment Station (37.27°N, 104.57°E), operated by the Chinese Academy of Sciences in the southeastern margin of the Tengger Desert [9,10]. This region is a transition zone from desertified grassland to steppe desert, with an altitude of 1339 m. It is covered with tall, dense, trellised chains of crescent-shaped dunes, and the soil matrix is loose, barren shifting sand. Precipitation is the only source of water recharge in this region and plays an important role in maintaining the stable and sustainable development of the desert ecosystem. The change of soil moisture is divided into three periods: the water loss period from April until the summer and autumn rainy seasons, the replenishment period during the summer and autumn rainy seasons, and the stable period in winter and spring. The influence of precipitation is seen only in the 0~40 cm soil layer, and the stable water content of the sand layer is only 2%~3% [11]. From 1991 to 2018, the average temperature was 10.78 °C, the low-temperature extreme was −26.2 °C and the high-temperature extreme was 40 °C. The average sunshine duration was 2649.725 h, the average annual precipitation was 180.58 mm, whereas the average evaporation was 2520.4 mm, and the average wind speed was 2.8 m/s.
Figure 1 shows the curves and histograms of the monthly precipitation (MP), monthly mean temperature (MMT) and monthly average wind speed (MAWS) time series, and Table 1 lists the numerical characteristics of these meteorological factors for the study area from 1991 to 2018. The Kolmogorov-Smirnov test shows that the MP, MMT and MAWS time series do not follow a normal distribution, which can also be seen intuitively from the histograms in Figure 1 or from the skewness and kurtosis in Table 1.
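As a minimal sketch of how skewness and kurtosis indicate a departure from normality, the following Python snippet computes both from population moments. The sample values are hypothetical and are not taken from Table 1; a normal distribution would give a skewness and an excess kurtosis close to zero.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) computed from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical monthly precipitation (mm): many dry months and a few wet ones.
sample = [0.0, 0.0, 1.2, 3.5, 0.0, 12.4, 45.0, 60.3, 22.1, 4.0, 0.5, 0.0]
skew, excess_kurt = skewness_and_excess_kurtosis(sample)
print(f"skewness={skew:.3f}, excess kurtosis={excess_kurt:.3f}")
```

A strongly positive skewness, as produced by series with many zero or near-zero months, is one of the simplest signs that a normality assumption is inappropriate.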

Data Pre-Processing Techniques
In reality, precipitation, temperature and wind speed are closely related to other meteorological parameters such as barometric pressure, airflow and humidity, and their time series are also easily influenced by landform and geomorphology parameters. Predicting precipitation, temperature and wind speed directly from the raw time series therefore often produces large errors. To overcome this deficiency, data pre-processing techniques such as DWT, EEMD and VMD are used to reduce noise.

Daubechies Wavelet Transform
Given a meteorological factor time series $x(t)$ of length $N$, the DWT consists of at most $\log_2 N$ steps and decomposes the time series into a low-pass component A and a high-pass component D. The low-pass component A reflects the main features, and the high-pass component D represents random factors, often called the noise of the signal [36]. The DWT can be performed with the Matlab wavelet toolbox, and the decomposition process is shown in Figure 4.
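The splitting into A and D can be illustrated with a short pure-Python sketch. It is not the Matlab toolbox routine used in the paper: for brevity it uses the Daubechies-2 (D4) filter, whose coefficients have a closed form, rather than DB3, and it assumes periodic extension at the boundaries. The input series is hypothetical.

```python
import math

# Daubechies-2 (D4) low-pass filter; a simpler stand-in for the DB3 filter.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# Quadrature-mirror high-pass filter derived from the low-pass filter.
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]

def dwt_step(x):
    """One decomposition level with periodic extension; len(x) must be even."""
    n = len(x)
    approx, detail = [], []
    for k in range(0, n, 2):
        approx.append(sum(LOW[i] * x[(k + i) % n] for i in range(4)))
        detail.append(sum(HIGH[i] * x[(k + i) % n] for i in range(4)))
    return approx, detail

def dwt(x, levels):
    """Return (A, [D_1, ..., D_levels]) by repeatedly splitting the approximation."""
    approx, details = list(x), []
    for _ in range(levels):
        if len(approx) < 4 or len(approx) % 2:
            raise ValueError("signal too short or of odd length for another level")
        approx, d = dwt_step(approx)
        details.append(d)
    return approx, details

# Hypothetical series: a 12-month cycle plus a small alternating disturbance.
series = [10 * math.sin(2 * math.pi * t / 12) + (0.5 if t % 2 else -0.5)
          for t in range(48)]
A, D = dwt(series, levels=2)
print(len(A), [len(d) for d in D])
```

Each level halves the length of the approximation, which is why a series of length N admits at most about log2(N) levels. With this orthogonal filter and periodic extension, the energy of the series equals the combined energy of A and all D components.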

The Ensemble Empirical Mode Decomposition Method
The EEMD, which extends the empirical mode decomposition (EMD) to overcome the drawback of frequency mixing, is widely used to decompose non-linear and non-stationary signals [37][38][39]. It defines the true intrinsic mode function (IMF) components as the mean of an ensemble of trials, where each trial consists of the decomposition results of the signal plus white noise of finite amplitude.
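The ensemble definition can be written compactly as follows. The notation is an assumption introduced here for illustration: $M$ is the number of trials, $\varepsilon$ the noise amplitude, $n_m(t)$ the white noise added in trial $m$, and $c_{j,m}(t)$ the $j$-th IMF obtained by applying EMD to the noisy signal $x_m(t)$.

```latex
x_m(t) = x(t) + \varepsilon\, n_m(t), \qquad m = 1, \dots, M,
\qquad\qquad
\mathrm{IMF}_j(t) = \frac{1}{M} \sum_{m=1}^{M} c_{j,m}(t).
```

Because the added noise is independent across trials, its contribution tends to cancel in the average as $M$ grows, while the components of the signal persist.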

The Variational Mode Decomposition Method
VMD is a recent signal decomposition method that decomposes complex signals into amplitude-modulated and frequency-modulated signals [40]. It is a non-stationary signal processing method with a preset scale and can be used in wind time series analysis. To obtain the signal components, VMD abandons the loop filtering used by EMD; instead, it searches iteratively for the optimal solution of a variational model to determine the center frequency and bandwidth of each mode. The frequency band of the signal is thereby adaptively decomposed into a preset number of band-limited intrinsic mode functions. In contrast to the recursive filtering of EMD and EEMD, VMD is therefore a completely non-recursive decomposition method. Its overall framework is a variational problem, and the method consists of constructing this problem and solving it. The principle and mathematical derivation of VMD can be found in Dragomiretskiy et al. [40].
In this paper, the precipitation, temperature and wind speed time series are decomposed by DWT, EEMD and VMD into a series of band-limited sub-sequences to reduce the complexity and instability of the original series. Because different time series follow their own nonlinear variation rules, and the collected data are contaminated by different forms of noise for different reasons, no single de-noising method can de-noise all time series. Different de-noising methods should therefore be applied to the data set, and the best one selected according to one or more evaluation criteria.
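For reference, the variational problem behind VMD is usually stated as below, following the formulation in [40]. The symbols are those of that formulation rather than of this paper: $f(t)$ is the input signal, $u_k(t)$ the $k$-th mode, $\omega_k$ its center frequency, $K$ the number of modes, $\alpha$ the balancing parameter of the data-fidelity constraint, $\lambda(t)$ the Lagrange multiplier and $\tau$ the step of the dual ascent.

```latex
\min_{\{u_k\},\{\omega_k\}} \; \sum_{k=1}^{K}
\left\| \partial_t \!\left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2
\quad \text{subject to} \quad \sum_{k=1}^{K} u_k(t) = f(t),

\mathcal{L}\bigl(\{u_k\},\{\omega_k\},\lambda\bigr)
= \alpha \sum_{k=1}^{K} \left\| \partial_t \!\left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j \omega_k t} \right\|_2^2
+ \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
+ \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle,

\lambda^{n+1} = \lambda^{n} + \tau \left( f - \sum_{k=1}^{K} u_k^{n+1} \right).
```

The first expression minimizes the total bandwidth of the modes, the augmented Lagrangian $\mathcal{L}$ turns the reconstruction constraint into penalty terms, and the last line is the dual-ascent update; setting $\tau = 0$ switches that update off, so the reconstruction constraint is not enforced exactly.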

Back-Propagation Neural Network
As a mature and robust neural network model, the BPNN was proposed by Rumelhart and McClelland in 1986 and has been widely used in non-linear curve fitting, where it can uncover nonlinearity even without prior information about the relationship between inputs and outputs. The BPNN usually consists of an input layer, one or more hidden layers and an output layer. The nodes of adjacent layers are interconnected by weights. Each node in the network is a neuron that computes the inner product of its input vector and weight vector and passes it through a nonlinear transfer function to obtain a scalar result. BPNN is a feed-forward neural network trained by the back-propagation algorithm [25].
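A minimal sketch of the forward pass and one back-propagation update is given below. The architecture (three inputs, two tanh hidden neurons, one linear output), the squared-error loss, the learning rate and all numeric values are assumptions chosen for illustration; they are not the settings used in the paper.

```python
import math

def forward(x, params):
    """Return (hidden activations, output) of a one-hidden-layer network."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return hidden, sum(w * h for w, h in zip(w_out, hidden)) + b_out

def train_step(x, target, params, lr=0.05):
    """One gradient step on the loss 0.5 * (y - target)^2; returns (params, loss)."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden, y = forward(x, params)
    err = y - target  # derivative of the loss with respect to the output
    # Propagate the error back through w_out and tanh'(z) = 1 - h^2.
    deltas = [err * w * (1.0 - h * h) for w, h in zip(w_out, hidden)]
    new_params = (
        [[w - lr * d * xi for w, xi in zip(row, x)]
         for row, d in zip(w_hidden, deltas)],
        [b - lr * d for b, d in zip(b_hidden, deltas)],
        [w - lr * err * h for w, h in zip(w_out, hidden)],
        b_out - lr * err,
    )
    return new_params, 0.5 * err * err

# Hypothetical initial weights and a single hypothetical training sample.
params = ([[0.1, -0.2, 0.3], [0.05, 0.1, -0.1]], [0.0, 0.0], [0.3, -0.2], 0.0)
x, target = [0.2, 0.4, 0.6], 0.8
_, initial_loss = train_step(x, target, params)
for _ in range(200):
    params, loss = train_step(x, target, params)
print(initial_loss, loss)
```

In practice the update would be averaged over many training samples and monitored on held-out data; the single-sample loop here only shows how the error signal flows from the output back to the hidden-layer weights.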

Support Vector Machine
SVM was proposed by Cortes and Vapnik in 1995 to solve classification and prediction problems with limited sample sizes [41]. The main idea of SVM for forecasting is to map the data into a high-dimensional feature space via a kernel function and to construct a linear regression model in that space based on the structural risk minimization principle.
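One common way to make this idea concrete is the $\varepsilon$-insensitive support vector regression shown below. The paper does not state which regression variant was used, so this formulation is an assumption; $\varphi$ is the feature map, $C$ the regularization constant, $\xi_i, \xi_i^{*}$ the slack variables and $K$ the kernel.

```latex
f(x) = w^{\top} \varphi(x) + b,
\qquad
\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right)

\text{subject to} \quad
y_i - f(x_i) \le \varepsilon + \xi_i, \quad
f(x_i) - y_i \le \varepsilon + \xi_i^{*}, \quad
\xi_i,\ \xi_i^{*} \ge 0,
\qquad
K(x_i, x_j) = \varphi(x_i)^{\top} \varphi(x_j).
```

The term $\frac{1}{2}\|w\|^{2}$ controls model complexity and the slack terms control the training error, which is how the structural risk minimization principle enters; the kernel allows the regression to be computed without forming $\varphi(x)$ explicitly.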

Extreme Learning Machine
ELM is a single-hidden-layer feed-forward neural network [42,43]. Its input weights and hidden biases are initialized with random numbers, and the output weights are then obtained by an inverse operation on the hidden-layer output matrix. Because the input weights and hidden biases are randomly initialized, the feature mapping of the ELM is also random.
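In the usual notation for ELM, which is assumed here because the paper gives no formulas, the training step reduces to a linear least-squares problem. $H$ is the hidden-layer output matrix, $g$ the activation function, $w_j$ and $b_j$ the random input weights and bias of hidden neuron $j$, $T$ the matrix of training targets and $H^{\dagger}$ the Moore-Penrose generalized inverse of $H$.

```latex
H_{ij} = g\!\left( w_j^{\top} x_i + b_j \right),
\qquad
H \beta = T,
\qquad
\hat{\beta} = H^{\dagger} T .
```

Since only $\hat{\beta}$ is fitted, training is fast, but two runs with different random $w_j$ and $b_j$ generally give different forecasts.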

Nonlinear Auto-Regressive models (NAR)
A typical NAR neural network consists of an input layer, a hidden layer, an output layer and an input delay function [34,44]. The output of NAR is denoted as $y(t) = f\left(y(t-1), y(t-2), \cdots, y(t-d)\right)$, where $d$ is the delay order. The difference between the NAR model and the BPNN model is that a delay function is added in the first hidden layer to record previous data, and the delay order determines the number of neural network inputs. The optimal network model can be selected by adjusting the number of delays, neurons and hidden layers.
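The role of the delay order can be sketched in a few lines of Python: the series is cut into windows of d past values, each paired with the next value, and multi-step forecasts are produced by feeding predictions back as inputs. The function names and the averaging predictor are hypothetical placeholders standing in for a trained network.

```python
def make_lagged_samples(series, d):
    """Pair each window of d consecutive values with the value that follows it."""
    inputs = [series[i:i + d] for i in range(len(series) - d)]
    targets = [series[i + d] for i in range(len(series) - d)]
    return inputs, targets

def recursive_forecast(history, d, steps, predict):
    """Forecast `steps` values ahead, feeding each prediction back as an input."""
    window = list(history[-d:])
    forecasts = []
    for _ in range(steps):
        y = predict(window)
        forecasts.append(y)
        window = window[1:] + [y]  # drop the oldest value, append the forecast
    return forecasts

def mean_predictor(window):
    """Hypothetical stand-in for a trained NAR network: the window average."""
    return sum(window) / len(window)

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, y = make_lagged_samples(series, d=3)
forecast = recursive_forecast(series, d=3, steps=2, predict=mean_predictor)
print(X[0], y[0], forecast)
```

Because later forecasts depend on earlier ones, errors can accumulate over the forecast horizon, which is one reason the choice of d matters.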

The Hybrid Forecasting Model
In this paper, the neural networks BPNN, SVM, ELM and NAR were selected as the basic forecasting models to forecast the precipitation, temperature and wind speed of the Tengger Desert in China. Because the original precipitation time series is full of noise, predicting precipitation directly from it often produces large errors. DWT, EEMD and VMD were therefore each applied to eliminate the noise in the original data. It is widely known that different input dimensions produce forecasts of differing accuracy, so determining the input samples is crucial for the forecasting performance of BPNN, SVM, ELM and NAR. In this study, the longitudinal data selection (LS) method was adopted to determine the input dimensions.
Step 1. VMD, EEMD and DWT are applied to decompose the original time series and extract the basic characteristics of the non-stationary precipitation, temperature and wind speed time series, respectively.
Step 3. The longitudinal data selection method is used to determine the input dimensions. It divides the original data set into subsets according to a particular date and uses the input dimension (d) determined for BPNN, SVM, ELM and NAR to select training and testing samples. In the iterative process, let i denote the starting point of the sample selection. The training input is the vector of values from i + 1 to i + d of the subset, and the corresponding output is the value at i + d + 1. The testing input is the vector from i + 2 to i + d + 1. The input dimension here is three, and previous forecasting results are also used to calculate the future-step results.
Step 4. The four models BPNN, SVM, ELM and NAR were used to forecast the original and the de-noised time series of precipitation, temperature and wind speed from the historical data, respectively.
Step 5. The evaluation criteria NSCE, MAE, RMSE, MAPE and NMSE, which are defined in Table 2, are used to evaluate the forecasting performance.

Table 2. The model evaluation criteria: Nash-Sutcliffe coefficient of efficiency (NSCE), mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE) and normalized mean squared error (NMSE).

In Table 2, $x_i$ and $\hat{x}_i$ represent the $i$-th actual and forecasted (or de-noised) values, respectively, and $n$ is the sample size.
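A minimal Python sketch of four of these criteria is given below, using their commonly used definitions. These are assumptions and may differ in detail from the formulas in Table 2. NMSE is omitted because several incompatible definitions are in use and the one adopted in Table 2 cannot be determined from the text. The sample values are hypothetical.

```python
import math

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(x - f) for x, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((x - f) ** 2 for x, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error in percent.

    Returns infinity when any actual value is zero, since the ratio is undefined.
    """
    if any(x == 0 for x in actual):
        return math.inf
    return 100.0 * sum(abs((x - f) / x) for x, f in zip(actual, forecast)) / len(actual)

def nsce(actual, forecast):
    """Nash-Sutcliffe coefficient of efficiency; 1 is a perfect forecast."""
    mean = sum(actual) / len(actual)
    residual = sum((x - f) ** 2 for x, f in zip(actual, forecast))
    spread = sum((x - mean) ** 2 for x in actual)
    return 1.0 - residual / spread

actual = [2.0, 4.0]
forecast = [1.0, 5.0]
print(mae(actual, forecast), rmse(actual, forecast),
      mape(actual, forecast), nsce(actual, forecast))
```

MAE and RMSE keep the units of the data, MAPE is unit-free but breaks down when actual values are zero, and NSCE compares the forecast against simply predicting the mean of the observations, so it becomes negative when the forecast is worse than that baseline.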

The Traditional Combination Model Optimized by the Particle Swarm Optimization Algorithm
The traditional combination model (TCM) takes the form $y_t = \sum_{j=1}^{m} w_j\, y_{j,t}$, where $y_t$ denotes the output of the TCM at time $t$, $y_{j,t}$ denotes the forecast of the $j$-th model for the time series $x_t$, and $w_j$ is the weight of the $j$-th model. The weights are constrained to $w_j \in [0, 1]$ and satisfy the normalization constraint $\sum_{j=1}^{m} w_j = 1$. The best weights $w_j$, $j = 1, 2, \cdots, m$, are obtained by using particle swarm optimization (PSO) to solve an optimization problem expressed in terms of $e_{jt} = x_t - y_{j,t}$, $t = 1, 2, \cdots, T$, the residual of the $j$-th individual model at time $t$.
To obtain the weights of the traditional combination model quickly and effectively, and thereby enhance the forecasting performance, the evolutionary algorithm PSO is applied to optimize the weights of the TCM. PSO has been widely applied to non-linear optimization problems [28]. It can solve both continuous and discrete optimization problems, because it needs only function evaluations rather than initial values. It can also escape local optima. In the standard PSO algorithm, the basic idea is that the particles adjust their speed based on their experience. The main idea of PSO is as follows. Let $X_i = (x_{i1}, x_{i2}, \cdots, x_{iN})$ represent the position of the $i$-th particle, $P_i = (P_{i1}, P_{i2}, \cdots, P_{iN})$ represent its best position, that is $P_{best}$, $V_i = (V_{i1}, V_{i2}, \cdots, V_{iN})$ denote its speed, and $g_{best}$ represent the index of the best particle among all the particles in the group. The speed and position of the particles are updated using the following formulas: $V_i^{k+1} = w V_i^{k} + c_1 U_1 (P_i - X_i^{k}) + c_2 U_2 (P_{g_{best}} - X_i^{k})$ and $X_i^{k+1} = X_i^{k} + V_i^{k+1}$, where $k$ is the iteration index, $c_1$ and $c_2$ are acceleration coefficients, $w$ is the inertia factor, $U_1$ and $U_2$ are two independent random variables uniformly distributed in the range [0, 1] and $N$ represents the number of particles. A proper value of the inertia weight $w$ balances global and local exploration and reduces the number of iterations needed to find an adequately optimal solution.
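The weight search can be sketched in pure Python as follows. Several choices here are assumptions made for illustration and are not stated in the paper: the objective is taken to be the sum of squared errors of the combined forecast, the constraints are enforced by clipping and rescaling each position, the swarm is seeded with the single-model solutions so the result is never worse than the best sub-model on the calibration data, and all parameter values and data are hypothetical.

```python
import random

def combined_sse(weights, forecasts, actual):
    """Sum of squared errors of the weighted combination (assumed objective)."""
    total = 0.0
    for t, x in enumerate(actual):
        y = sum(w * f[t] for w, f in zip(weights, forecasts))
        total += (x - y) ** 2
    return total

def to_simplex(position):
    """Clip each weight to [0, 1] and rescale so that the weights sum to 1."""
    clipped = [min(1.0, max(0.0, p)) for p in position]
    s = sum(clipped)
    if s == 0.0:
        return [1.0 / len(position)] * len(position)
    return [c / s for c in clipped]

def pso_weights(forecasts, actual, particles=20, iterations=200,
                inertia=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    m = len(forecasts)
    # Seed the swarm with each single-model solution, then add random points.
    swarm = [[1.0 if j == k else 0.0 for j in range(m)] for k in range(m)]
    swarm += [to_simplex([rng.random() for _ in range(m)]) for _ in range(particles)]
    velocity = [[0.0] * m for _ in swarm]
    best_pos = [p[:] for p in swarm]
    best_val = [combined_sse(p, forecasts, actual) for p in swarm]
    g = min(range(len(swarm)), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i, pos in enumerate(swarm):
            for j in range(m):
                u1, u2 = rng.random(), rng.random()
                velocity[i][j] = (inertia * velocity[i][j]
                                  + c1 * u1 * (best_pos[i][j] - pos[j])
                                  + c2 * u2 * (g_pos[j] - pos[j]))
            swarm[i] = to_simplex([p + v for p, v in zip(pos, velocity[i])])
            val = combined_sse(swarm[i], forecasts, actual)
            if val < best_val[i]:
                best_pos[i], best_val[i] = swarm[i][:], val
                if val < g_val:
                    g_pos, g_val = swarm[i][:], val
    return g_pos, g_val

# Hypothetical observations and two sub-model forecasts with opposite errors.
actual = [3.0, 5.0, 4.0, 6.0, 5.5]
model_a = [2.5, 5.5, 3.5, 6.5, 5.0]
model_b = [3.5, 4.5, 4.5, 5.5, 6.0]
weights, sse = pso_weights([model_a, model_b], actual)
print(weights, sse)
```

In this toy case the two sub-models err in opposite directions, so a combination can cancel much of the error; this is the situation in which a weighted combination is expected to outperform its best member.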

Results and Discussion
In this section, three simulation experiments are designed to forecast the MP, MMT and MAWS time series, respectively. The experimental results and related analysis are presented to demonstrate the forecasting performance of the four single neural network forecasting models (BPNN, SVM, ELM and NAR) and the twelve hybrid models.

Experiment I: The Monthly Precipitation Forecasted by the Ensemble Model
Precipitation plays an important role in maintaining the stable and sustainable development of the desert ecosystem [11,12], and its variation is expected to influence the functioning of desert ecosystems by altering the community species richness, abundance, coverage and biomass of BSC. Because BSC is metabolically active only in wet conditions, the drying rate of soil surfaces in deserts has a significant impact on the physiological functioning of these communities [16], and precipitation intensity and intermittency play an important role in the dynamics of vegetation cover and deep soil moisture [17]. Therefore, forecasting the long-term changes of precipitation is of great significance for the restoration and protection of desert ecosystems in the desertified steppe and steppefied desert transition zones with a fragile ecological environment.
Experiment I was designed to predict the monthly precipitation. The dataset from January 1991 to December 2010 is used for calibration of the proposed method, and the dataset from January 2011 to December 2018 is used for validation. The experimental process is shown in Figure 2. Figure 2 shows that DWT and VMD have better de-noising performance than EEMD: the NSCE and RMSE of DWT and VMD are significantly smaller than those of EEMD. In the data pre-processing, the main experimental parameters of VMD are α = 0.05, τ = 0 and K = 7, where α is the balancing parameter of the data-fidelity constraint, τ is the time step of the dual ascent and K is the number of modes to be recovered. The level of the DB3 wavelet transform is 5. The input dimension is 3 and the output dimension is 1 for BPNN and SVM; the input dimension is 4 and the output dimension is 1 for ELM; and the number of feedback delays is 12 and the number of hidden-layer nodes is 10 for NAR. Each model is trained and tested separately on the training and test sets.
The evaluation criteria NSCE, MAE, RMSE, MAPE and NMSE were computed and are listed in Table 3. Because the monthly precipitation is zero in some months, the value of NSCE was negative and the values of MAPE and NMSE were infinite. To obtain robust forecasting performance, the sub-models are selected in the order of MAE, RMSE and NSCE. If the selected sub-models have the same neural network structure, the minimum MAE is used as the final benchmark. According to this selection criterion, the hybrid models VMD-LS-BPNN and VMD-LS-NAR were selected as the sub-models and combined by PSO-TCM; the resulting MAE, RMSE and NMSE are also listed in Table 3.
As shown in Table 3, the best single predictor is NAR and the worst is ELM. The MAE and RMSE of the single neural network models are all larger than those of the corresponding hybrid models, while NSCE changes in the opposite direction. This means that, under the same input and output dimensions, the hybrid models have higher prediction accuracy than the single neural network models, and that data pre-processing can effectively improve the forecasting performance. The most obvious improvement in prediction is the hybrid VMD-LS-[...]

As shown in Figures 2 and 3 and Table 3, the following main conclusions can be obtained:
(1) The MP is the accumulation of daily precipitation, which is related to other meteorological parameters and unknown factors. The collected time series is therefore inevitably noisy, and predicting the monthly precipitation directly from it inevitably causes larger errors. Selecting an appropriate de-noising method is an effective way to overcome this defect.
(2) In the decomposition of the precipitation time series, DWT and VMD have better de-noising performance than EEMD. There are some negative values in the EEMD series, because the monthly precipitation is zero in some months and the standard deviation of the monthly precipitation time series is 20.1. Since the IMF components are defined as the mean of an ensemble of trials, each consisting of the decomposition results of the signal plus white noise of finite amplitude, negative values inevitably appear in the decomposed sequence.
(3) The forecasting performance of NAR is better than that of BPNN, SVM and ELM under the evaluation criteria NSCE, MAE and RMSE. The forecasting models based on data pre-processing and the longitudinal data selection method significantly improve the prediction accuracy, which means that the hybrid models have better forecasting performance than the corresponding single machine learning models.
(4) The established PSO-TCM model significantly improves the prediction accuracy, and the sub-model selection process, which relies on the forecasting performance of the single neural network and hybrid models, is also important. Numerical results show that the combined model has good robustness and high precision.
Precipitation is considered to be the most important abiotic factor affecting the ecological and physiological functions of BSC. As shown in Figure 3, the precipitation in the Shapotou area of the Tengger Desert differs significantly and is obviously intermittent between years and months. The intermittent precipitation and the instantaneous high evaporation of the soil surface cause the shallow layer of dune soil to alternate frequently between wet and dry states during the rainy season, which inevitably affects the functioning of desert ecosystems by altering biotic components such as the species composition of BSC. In fact, the vegetation restoration processes and the variation of vegetation cover in arid and semiarid regions are complicated by the uncertain precipitation intensity and intermittency. It remains unclear how the components of BSC will respond to the prolonged warming and reduced precipitation that are predicted to occur with climate change [2].

Experiment II: The Monthly Mean Temperature Forecasted by the Ensemble Model
Temperature and precipitation determine the distribution limit of the biotic community, restricting the germination and growth rate of plants and all physiological changes in plants [12]. The physiological functions and metabolic rate of the biotic community are closely related to temperature changes. To investigate how climate change, and temperature in particular, affects the hydrological functioning of the biotic community [10], constructing an accurate and robust temperature prediction procedure became an important part of the experiment.
Experiment II was designed to forecast the MMT from January 2011 to December 2018, with the dataset from January 1991 to December 2010 used as the training set. The main experimental steps, including the data pre-processing methods and results, the main parameter settings of the hybrid models and the sub-model selection strategy, are shown in Figure 4.
Because the MMT is the average of the daily average temperature, some of the noise in the MMT time series has already been eliminated by averaging, as shown in part A of Figure 4. The monthly mean temperature forecasts of the single neural network models, the hybrid forecasting models and the traditional combination model (TCM) optimized by the PSO algorithm are plotted in Figure 5, and the values of the model evaluation criteria for the monthly mean temperature are listed in Table 4. As shown in Table 4, the best single predictor is BPNN and the best hybrid predictor is EEMD-LS-SVM. The MAE and RMSE of the hybrid models are all smaller than those of the corresponding single BPNN, SVM, ELM and NAR models. The hybrid EEMD-LS-SVM model is selected because its forecasting performance is the best among the hybrid models and single neural network models.
The hybrid VMD-LS-BPNN and VMD-LS-NAR models are selected because the forecasting performance of VMD-LS-BPNN is the second-best among the hybrid models and the main prediction model of VMD-LS-NAR differs from BPNN and SVM. The hybrid EEMD-LS-BPNN is not selected because it has the same main prediction model as VMD-LS-BPNN. The ELM model and the DWT-LS-ELM, EEMD-LS-ELM and VMD-LS-ELM models were not selected because they would increase the forecasting risk. Based on this strategy, three hybrid models, EEMD-LS-SVM, VMD-LS-BPNN and VMD-LS-NAR, were selected as the sub-models for PSO-TCM. Numerical results show that the optimal combination weights of EEMD-LS-SVM, VMD-LS-BPNN and VMD-LS-NAR are 0.455, 0.311 and 0.250, respectively. Table 4 shows that the hybrid models forecast markedly better than the single neural network models, and that the ensemble PSO-TCM model has higher forecasting accuracy than the selected optimal sub-models. The MAE and RMSE of PSO-TCM are 1.1793 and 1.4846, and its NSCE is 0.9812, which is very close to 1. This suggests that the PSO-TCM model has high robustness and prediction accuracy.

As shown in Figures 4 and 5 and Table 4, the following main conclusions can be obtained:
(1) The MMT time series varies relatively stably compared with the monthly precipitation; the data pre-processing methods DWT, EEMD and VMD have almost the same effect, and their NSCE and RMSE differ little in this case. In the decomposition of the MMT time series, EEMD and VMD have better de-noising performance than DWT. Accordingly, the best data pre-processing method is selected based only on the forecasting performance of the hybrid models. Selecting an appropriate de-noising method and combining it with the best neural network model is an effective way to improve the forecasting accuracy.
(2) Carefully selecting the sub-models for PSO-TCM according to the forecasting performance of the single neural network and hybrid models is crucial for improving the prediction accuracy. The PSO-TCM model can effectively improve the forecasting accuracy and decrease the prediction risk.
An increase in temperature leads to an increase in surface evaporation, which influences the physiological functions and metabolic rate of the biotic community in the desertified steppe and steppefied desert transition zones. A recent ten-year observational study shows that moss cover is more sensitive to temperature rise: as temperature increased, the proportion of mosses in moss- and lichen-dominated crusts decreased, which means that moss biomass was negatively correlated with warming intensity [10,12].
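The sub-model selection strategy described for this experiment can be sketched as a simple ranking rule. The interpretation is an assumption: candidates are ranked by MAE, then RMSE, then NSCE (higher is better), at most one candidate is kept per base learner, and ELM-based models are excluded. The scores below are hypothetical and are not the values in Table 4.

```python
def select_sub_models(candidates, excluded_learners=()):
    """Keep the best-ranked candidate for each base learner.

    Candidates are ranked by MAE, then RMSE, then NSCE (higher NSCE is better).
    """
    ranked = sorted(candidates, key=lambda c: (c["mae"], c["rmse"], -c["nsce"]))
    chosen, seen = [], set()
    for c in ranked:
        if c["learner"] in excluded_learners or c["learner"] in seen:
            continue  # skip excluded learners and duplicates of a chosen learner
        chosen.append(c["name"])
        seen.add(c["learner"])
    return chosen

# Hypothetical scores for illustration only.
candidates = [
    {"name": "VMD-LS-BPNN", "learner": "BPNN", "mae": 1.30, "rmse": 1.60, "nsce": 0.97},
    {"name": "EEMD-LS-BPNN", "learner": "BPNN", "mae": 1.35, "rmse": 1.70, "nsce": 0.96},
    {"name": "EEMD-LS-SVM", "learner": "SVM", "mae": 1.20, "rmse": 1.50, "nsce": 0.98},
    {"name": "VMD-LS-NAR", "learner": "NAR", "mae": 1.40, "rmse": 1.75, "nsce": 0.96},
    {"name": "VMD-LS-ELM", "learner": "ELM", "mae": 2.10, "rmse": 2.60, "nsce": 0.90},
]
selected = select_sub_models(candidates, excluded_learners=("ELM",))
print(selected)
```

Keeping one candidate per base learner favors diversity among the sub-models, which is what gives the weighted combination room to improve on its best member.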

Experiment III: The Monthly Average Wind Speed Forecasted by the Ensemble Model
Wind is the movement of air. Carbon dioxide and oxygen in the air are the main raw materials and material conditions for plant photosynthesis, and the concentration of these two gases directly affects the healthy growth and flowering of plants. The ecological benefits of wind for plants are reflected in pollination and seed dispersal. Wind speed is an important abiotic factor for the variation of vegetation cover: it affects the physiology, biochemistry, material metabolism and ecological adaptability of BSC in arid and semi-arid ecosystems. Long-term observations have shown that higher wind speed is often accompanied by a lower degree of BSC development [15]. Atmospheric dust removal is one of the important nutrient input sources in the desert ecosystem, and the higher the wind speed, the more serious the surface wind erosion and soil nutrient loss. On the other hand, sand burial caused by strong winds can kill the photosynthetic components of BSC and thereby change the stability of the BSC subsoil and the structure of the BSC community [2].
Experiment III was designed to forecast the MAWS from January 2011 to December 2018, with the MAWS dataset from January 1991 to December 2010 used as the training set. The data pre-processing methods, the longitudinal data selection method, the parameter settings of the hybrid models and the sub-model selection are shown in Figure 6. The MAWS is the average of the daily average wind speed and shows periodic changes, but noise still exists in the wind speed time series. Part A of Figure 6 shows the de-noising effect of DWT, EEMD and VMD together with the values of NSCE and RMSE. The NSCE of DWT is the smallest and the RMSE of DWT is the largest among the three methods. Although the values of NSCE and RMSE differ little, the forecasting performance of the sub-models differs considerably, as shown in Figure 5; therefore, the data pre-processing techniques EEMD and VMD are both selected to decompose the MAWS time series. The main parameters of the sub-models, such as the number of input dimensions, the hidden layers and the feedback delays, are listed in Part B of Figure 6. Part C of Figure 6 shows the detailed sub-model selection and combination processes.
The evaluation criteria MAE, RMSE, NSCE, MAPE and NMSE for the single neural network models and hybrid models are computed and shown in Table 5. The single BPNN, SVM and NAR models have almost the same forecasting performance, and their MAE, RMSE, NSCE, MAPE and NMSE values differ only slightly. The predicted result of ELM is not satisfactory because ELM randomly initializes the input weights. Table 5 shows that the MAE, RMSE and MAPE of the single BPNN, SVM, ELM and NAR models are, in general, all larger than those of the corresponding hybrid models. In contrast to MAE, RMSE and MAPE, the NSCE and NMSE of the single neural network models vary in the opposite direction, which means that under the same parameter settings the hybrid models have better forecasting performance than the single BPNN, SVM, ELM and NAR models, and that an appropriate choice of data preprocessing method can effectively improve the forecasting accuracy. Among the hybrid models, if MAE, RMSE and NSCE are used as the prediction standards, the hybrid VMD-LS-BPNN, VMD-LS-SVM and VMD-LS-NAR models are selected as the optimal sub-models of PSO-TCM. In addition, MAPE is a unit-free evaluation criterion that is sensitive to small changes in the data but offers very little protection against outliers; if MAPE is used as the sub-model selection criterion, the hybrid VMD-LS-BPNN, VMD-LS-SVM and DWT-LS-NAR models are selected as the optimal sub-models of PSO-TCM. In order to reduce the prediction risk of the PSO-TCM model, both sub-model selection strategies are adopted. The final forecasting performance is determined by the evaluation criteria MAE, RMSE, NSCE, MAPE and NMSE. If the results are not consistent with each other, MAE, RMSE, NSCE and MAPE are used as the final benchmark.
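For reference, four of the evaluation criteria can be sketched with their commonly used definitions. This is an illustration under assumptions rather than the exact formulas behind Table 5: NSCE is taken to be the Nash-Sutcliffe coefficient of efficiency, MAPE is expressed in percent, and NMSE is omitted because its normalization is not stated in this section. The sample values are invented for the example.

```python
import math


def mae(obs, pred):
    """Mean absolute error (smaller is better)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error (smaller is better)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nsce(obs, pred):
    """Nash-Sutcliffe coefficient of efficiency (closer to 1 is better)."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def mape(obs, pred):
    """Mean absolute percentage error in percent; observations must be nonzero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


# Invented sample values, used only to exercise the functions.
obs = [2.0, 4.0, 6.0]
pred = [2.0, 5.0, 6.0]
print(mae(obs, pred), rmse(obs, pred), mape(obs, pred))
print(nsce(obs, pred))  # 0.875
```

Because MAPE divides each error by the observed value, a few months with very small observations can dominate it, which is one reason the MAPE-based ranking of sub-models can differ from the ranking by MAE, RMSE and NSCE.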
The experimental results show that the weights of the sub-models VMD-LS-BPNN, VMD-LS-SVM and VMD-LS-NAR are 0.681, 0.019 and 0.300, respectively, while the weights of the sub-models VMD-LS-BPNN, VMD-LS-SVM and DWT-LS-NAR are 0, 1 and 0, which means that the sub-model selection strategy based on the prediction standards MAE, RMSE and NSCE is robust and reliable. The MAWS prediction results of the sub-models and PSO-TCM are plotted in Figure 7, and the values of MAE, RMSE, NSCE, MAPE and NMSE of PSO-TCM are also listed in Table 5.
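How a particle swarm search can assign weights to three sub-model forecasts may be illustrated as follows. This is a simplified sketch and not the PSO-TCM implementation: it assumes the weights are non-negative and sum to one, uses the mean squared error as the objective, and picks the swarm size, inertia and acceleration constants arbitrarily. All function names and the toy forecasts are hypothetical.

```python
import random


def combine(weights, forecasts):
    """Weighted sum of the sub-model forecasts at every time step."""
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]


def mse(obs, pred):
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def to_simplex(x):
    """Clip negative entries to zero and rescale so the weights sum to one."""
    clipped = [max(v, 0.0) for v in x]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(x)] * len(x)
    return [v / total for v in clipped]


def pso_weights(obs, forecasts, n_random=20, n_iter=100,
                inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for combination weights with a basic global-best PSO."""
    rng = random.Random(seed)
    k = len(forecasts)

    def cost(w):
        return mse(obs, combine(w, forecasts))

    # Start from every pure sub-model (unit weight vectors) plus random points,
    # so the result is never worse than the best single sub-model.
    pos = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    pos += [to_simplex([rng.random() for _ in range(k)]) for _ in range(n_random)]
    vel = [[0.0] * k for _ in pos]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(len(pos)), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iter):
        for i in range(len(pos)):
            for d in range(k):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i] = to_simplex([p + v for p, v in zip(pos[i], vel[i])])
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


# Toy data: two sub-models err in opposite directions, the third is biased.
obs = [1.0, 2.0, 3.0, 4.0]
forecasts = [[1.2, 2.2, 3.2, 4.2], [0.8, 1.8, 2.8, 3.8], [1.5, 2.5, 3.5, 4.5]]
weights, error = pso_weights(obs, forecasts)
print(weights, error)
```

In this sketch a weight vector such as (0, 1, 0) simply means that the search found no mixture that beats one sub-model on its own, so the combination collapses to that sub-model.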
As shown in Figures 6 and 7 and Table 5, the following main conclusions can be drawn: (1) DWT, EEMD and VMD are all popular and effective data preprocessing methods, and selecting an appropriate de-noising method according to its de-noising effect can effectively improve the prediction accuracy. In Experiment III, EEMD and VMD have better de-noising performance than DWT in the decomposition of the MAWS time series, and combining the MAWS forecast with an appropriate de-noising method is an effective way to improve the forecasting performance. In Table 5, the MAE and RMSE of the hybrid models are significantly smaller than those of the corresponding single machine learning models.
(2) The forecasting performance of the hybrid VMD-LS-NAR model is the best among the hybrid models and single neural network models. Compared with the single NAR model, the MAE of VMD-LS-NAR decreased by about 0.1932, which is the largest decline among all hybrid models.
(3) SVM has the best forecasting performance among the single neural network models under the evaluation criteria MAE and RMSE, which means that no single model is suitable for all cases; the forecasting performance of the models should be compared under the specific conditions to determine the most suitable prediction model. Accurate wind speed prediction can provide a necessary reference for the effective use of grass checkerboard sand barriers and the planting of arid shrubs in desert control. It is also worth noting that accurate wind speed prediction is very important for wind farm construction and for tasks such as power grid operation scheduling, control, maintenance and resource planning of wind energy conversion systems [2,10].

Conclusions
Global warming speeds up the water cycle and increases the spatial heterogeneity of precipitation. Changes in precipitation inevitably affect the species diversity, structure and function of desert communities. Against the background of global climate warming, it is particularly important to investigate the changing trends of precipitation, temperature and wind speed in order to effectively restore and protect desert ecosystems.
In this paper, the temperature, precipitation and wind speed time series were collected at the Shapotou Desert Research and Experiment Station, operated by the Chinese Academy of Sciences, in the southeastern margin of the Tengger Desert. In order to investigate the change trends of precipitation, temperature and wind speed, three numerical simulation experiments, including four single neural network models, twelve hybrid models and one combination model optimized by the particle swarm optimization algorithm, were used to forecast the monthly precipitation, the monthly mean temperature and the monthly average wind speed time series, respectively. The experimental results show that selecting an appropriate de-noising method according to its de-noising effect can effectively improve the prediction accuracy. The hybrid models based on the data preprocessing technology and the longitudinal data selection method have better forecasting performance than the single machine learning models. However, forecasting the monthly precipitation, the monthly mean temperature and the monthly average wind speed with a single artificial intelligence model, or with a hybrid model that combines a data preprocessing technique with a suitable single model, could not achieve favorable performance. Carefully selecting the sub-models of the traditional combination model, whose weights are optimized by the particle swarm optimization algorithm, according to the forecasting performance of the single neural network and hybrid models is crucial to improving the prediction accuracy. Optimizing the weights of the traditional combination model with the particle swarm optimization algorithm can effectively improve the forecasting accuracy and provides better adaptability and robustness. The proposed method is beneficial for analyzing the relationship between sustainable development and the severe natural conditions in arid regions under the impacts of climate change.