A Short-Term Forecast Model of foF2 Based on Elman Neural Network

Abstract: The critical frequency foF2 of the ionosphere F2 layer is one of the most important parameters of the ionosphere. Based on the Elman neural network (ENN), this paper constructs a single-station forecasting model to predict foF2 one hour ahead. To prevent the network from falling into a local minimum, the model is optimized by an improved particle swarm optimization (IPSO) algorithm. The input parameters of the model include local time, seasonal information, solar cycle information and magnetic activity information. Data of Wuhan Station from 2008 to 2016 were used to train and test the model. The prediction results show that the root mean square error (RMSE) of the Elman neural network model is 4.30% lower than that of the back-propagation neural network (BPNN) model, and the RMSE is reduced by a further 8.92% after the model is optimized with the IPSO. This indicates that, for the one-hour-ahead forecast of foF2 at Wuhan Station, the IPSO-optimized Elman neural network model is superior to both the BP neural network and the unoptimized Elman neural network.


Introduction
The critical frequency foF2 of the ionosphere F2 layer is of great significance for the prediction of high-frequency (HF) communication frequencies and has long been a focus of ionosphere research. Influenced by solar, geomagnetic and other factors, the variation of foF2 is a nonlinear time-series process. The back-propagation neural network has been widely used to predict foF2 over the past few decades because it can fit nonlinear processes well: for instance, predicting the noon value of foF2 at Grahamstown [1]; predicting the trend of foF2 at 19 ionosphere stations in the Asia/Pacific sector [2]; comparing neural-network-based predictions of foF2 with the IRI-2012 model in Southeast Asia [3]; and using the data of 13 stations to establish a global foF2 prediction model [4].
The performance of the back-propagation neural network (BPNN) model is better than that of traditional empirical models such as the IRI model [5]. However, the BPNN also suffers from slow convergence and easily falls into local minima. Thus, other algorithms, such as the genetic algorithm [6] and the AdaBoost-BP algorithm [7], have been used to train neural network models to overcome these shortcomings, with good results.
However, none of the above models can accurately reflect the temporal character of foF2, because they are all feed-forward neural networks. Meanwhile, the recurrent neural network (RNN), which has a memory function, has been widely used in time-series prediction with good results, for example in visual tracking [8], load dispatch of microgrids [9] and total electron content forecasting [10]. Therefore, in this paper we develop a model based on the Elman neural network, a kind of RNN, to predict foF2 one hour ahead. The improved particle swarm optimization is used to train the model.

Elman Neural Network (ENN)
The Elman neural network was first proposed by J. L. Elman in 1990 for speech processing. Compared with the BP neural network, the ENN adds a context layer that stores the output of the hidden layer and feeds it back as part of the hidden layer's input, which enables the network to remember previous states and gives it dynamic properties. The ENN is therefore better able to capture information in sequential data than the BP neural network. Since foF2 changes smoothly over short intervals and the correlation of foF2 at adjacent times is high, we chose the Elman neural network to build the model. The structure of the Elman neural network is shown in Figure 1. It consists of four layers: an input layer, a hidden layer, a context layer and an output layer [11]. The input layer transmits external signals to the hidden layer. The context layer stores the output signal of the hidden layer and serves as part of the hidden layer's input at the next step. The hidden layer applies its activation function to the signals it receives and passes the result to the next layer. Finally, the signal is output by the output layer.
Suppose m is the length of the input; the input set at step t is

X(t) = (x_1(t), x_2(t), ..., x_m(t)),

where n is the total number of hidden layers and t is the input set number. The output of hidden layer i at step t is

H_i(t) = f(W_i H_{i-1}(t) + Z_i(t)),  i = 1, ..., n,  with H_0(t) = X(t),

where H_i(t) represents the output of hidden layer i, W_i represents the weight matrix from layer (i-1) to hidden layer i, and Z_i(t) represents the output of context layer i, which is given by

Z_i(t) = H_i(t-1).

f represents the activation function of the hidden layer, the sigmoid function

f(x) = 1 / (1 + e^(-x)),

and y(t), the output of the network, is given by

y(t) = W_{n+1} H_n(t),

where W_{n+1} represents the weight matrix from hidden layer n to the output layer.
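The forward step above can be sketched in code. This is a minimal illustration under the reconstructed equations, not the authors' implementation; the layer sizes are the ones adopted later in the paper (9 inputs; hidden layers of 11, 9 and 7 neurons; 1 output).

```python
import numpy as np

def sigmoid(x):
    # activation f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def elman_forward(x, weights, context):
    """One forward step of a multi-hidden-layer Elman network.

    weights: matrices W_1..W_{n+1}; context: vectors Z_1..Z_n holding each
    hidden layer's output from the previous step (Z_i(t) = H_i(t-1)).
    Returns the network output y(t) and the updated context.
    """
    h = x
    new_context = []
    for W, z in zip(weights[:-1], context):
        # H_i(t) = f(W_i H_{i-1}(t) + Z_i(t))
        h = sigmoid(W @ h + z)
        new_context.append(h)
    y = weights[-1] @ h          # linear output layer: y(t) = W_{n+1} H_n(t)
    return y, new_context

# toy dimensions: 9 inputs, hidden layers of 11, 9, 7 neurons, 1 output
rng = np.random.default_rng(0)
sizes = [9, 11, 9, 7, 1]
weights = [rng.standard_normal((sizes[i + 1], sizes[i])) * 0.1
           for i in range(len(sizes) - 1)]
context = [np.zeros(s) for s in sizes[1:-1]]   # Z_i(1) = 0

y, context = elman_forward(rng.standard_normal(9), weights, context)
```

Repeated calls with successive hourly input vectors carry the context forward, which is how the network remembers the recent state of foF2.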

Improved Particle Optimization Algorithm
The classic training algorithm for neural networks is back-propagation, which iteratively updates the model parameters using the derivative of the error function with respect to those parameters. However, when the error function is non-convex, the algorithm tends to converge to a local minimum. Therefore, this paper adopts the improved particle swarm optimization algorithm. Particle swarm optimization is a population-based stochastic optimization technique inspired by the foraging behavior of bird flocks. By pursuing the optimal particle instead of following derivatives, the movement of the whole swarm evolves from disorder to order in the solution space, ultimately yielding the optimal solution.

Particle Swarm Optimization for ENN
The particle swarm optimization algorithm randomly initializes a group of particles, each of which has three characteristics, position, velocity and fitness value, and represents a potential solution to the problem. The position of the particle with the best fitness is marked as the optimal particle position. The position of each particle is updated according to its own historical optimal position and the global optimal position [12], and the particles eventually converge to the global optimal solution. The process of particle swarm optimization for the ENN is as follows:

1. To optimize the weight matrices of the ENN, the elements of all weight matrices are flattened into a vector, which is the position of a particle. Suppose k is the length of the particle and m is the iteration number. The position of a particle is expressed as P(m) = (p_1(m), p_2(m), ..., p_k(m)), and its velocity, which determines the change of its position in the next iteration, as V(m) = (v_1(m), v_2(m), ..., v_k(m)). P(1) and V(1) are initialized randomly in [0,1].

2. The fitness of each particle is evaluated by the root mean square error (RMSE):

RMSE = sqrt( (1/N) Σ_{t=1}^{N} (y(t) − f_o(t))^2 ),

where y(t) represents the output of the network, N represents the number of input sets and f_o(t) represents the observed value. P_best = (p_1, p_2, ..., p_k) denotes the best position that a particle has found so far, and G_best = (g_1, g_2, ..., g_k) the best position found by all particles. The initial P_best and G_best take the position of the first particle.

3. If the current fitness of a particle is better than that of P_best or G_best, replace P_best or G_best with the particle's position. The velocity and position of each particle are then updated as

v_j(m+1) = ω v_j(m) + c_1 r_1 (P_best,j − p_j(m)) + c_2 r_2 (G_best,j − p_j(m)),
p_j(m+1) = p_j(m) + v_j(m+1),

where ω represents the inertia weight, c_1 and c_2 are non-negative acceleration constants and r_1, r_2 are random numbers in [0,1]. In this paper, ω = 0.8 and c_1 = c_2 = 2.1.

4. When G_best satisfies the convergence condition, or the number of iterations reaches the maximum, training stops and G_best is output. Otherwise, training returns to step 2 and continues until a termination condition is satisfied.
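The four steps can be condensed into a short sketch. This is a generic PSO minimizer with a toy fitness function standing in for the network RMSE (evaluating the real fitness would require running the ENN forward over the training set); the velocity clamp is an assumption added for numerical stability and is not described in the text.

```python
import numpy as np

def pso_minimize(fitness, k, n_particles=30, iters=200,
                 w=0.8, c1=2.1, c2=2.1, seed=0):
    """Plain PSO as in steps 1-4: positions and velocities initialized in
    [0, 1]; velocity update v <- w*v + c1*r1*(pbest - p) + c2*r2*(gbest - p)."""
    rng = np.random.default_rng(seed)
    P = rng.random((n_particles, k))            # particle positions
    V = rng.random((n_particles, k))            # particle velocities
    pbest = P.copy()
    pbest_fit = np.array([fitness(p) for p in P])
    g_idx = np.argmin(pbest_fit)
    gbest, gbest_fit = pbest[g_idx].copy(), pbest_fit[g_idx]
    for _ in range(iters):
        r1 = rng.random((n_particles, k))
        r2 = rng.random((n_particles, k))
        V = w * V + c1 * r1 * (pbest - P) + c2 * r2 * (gbest - P)
        V = np.clip(V, -1.0, 1.0)               # clamp (stability assumption)
        P = P + V
        fit = np.array([fitness(p) for p in P])
        better = fit < pbest_fit                # update personal bests
        pbest[better] = P[better]
        pbest_fit[better] = fit[better]
        if fit.min() < gbest_fit:               # update global best
            gbest, gbest_fit = P[np.argmin(fit)].copy(), fit.min()
    return gbest, gbest_fit

# toy fitness: RMSE-like distance to a known target vector
target = np.array([0.3, -0.2, 0.7])
best, best_fit = pso_minimize(lambda p: np.sqrt(np.mean((p - target) ** 2)), k=3)
```

In the actual model, `k` would be the total number of weights in the ENN and `fitness` would decode a particle into the weight matrices and compute the RMSE over the training set.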

Improvement of Particle Swarm Optimization
Particle swarm optimization converges quickly. However, the similarity between particles is high in the later stage of training, which makes the model prone to falling into a local optimum. Therefore, the following improvements are made to the particle swarm optimization algorithm in this paper:

1. Using local averages instead of the global optimum makes training less likely to fall into a local minimum, but takes more time [13]. This paper therefore combines the two approaches. Suppose each particle has s neighbors, and the s + 1 particles are denoted P_1, P_2, ..., P_{s+1}. The average position of these s + 1 particles is

P_ia(m) = (1/(s+1)) Σ_{j=1}^{s+1} P_j(m).

The velocity update formula of the particle swarm algorithm then becomes

v_j(m+1) = ω v_j(m) + c_1 r_1 (P_best,j − p_j(m)) + c_2 r_2 (G_best,j − p_j(m)) + c_3 r_3 ω_1 (P_ia,j(m) − p_j(m)),

where c_3 is a non-negative acceleration constant and ω_1 is the weight coefficient of P_ia. Here we set c_3 = 0.2 and

ω_1 = m / m_max,

where m_max is the maximum number of iterations, set to 300 in this paper.

2. In the later stage of training, if the global optimum particle has not changed, or has changed little, after many iterations, all particles are sorted by their fitness values and the lower-ranking particles are re-initialized. In this paper, the particles are sorted if the fitness of G_best decreases by less than 0.1 MHz over 40 iterations; the last 25% of the particles are then re-initialized in the range [−3, 3] to maintain the diversity of the swarm.
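The two modifications can be sketched as follows. The schedule ω_1 = m / m_max is an assumption (the exact formula is not recoverable from the text), and both helpers are illustrative rather than the authors' code.

```python
import numpy as np

rng = np.random.default_rng(1)

def improved_velocity(v, p, pbest, gbest, p_avg, m, m_max,
                      w=0.8, c1=2.1, c2=2.1, c3=0.2):
    """Velocity update with the extra neighbourhood-average term.
    p_avg is the mean position of the particle and its s neighbours;
    omega1 = m / m_max is one plausible (assumed) schedule."""
    w1 = m / m_max
    r1, r2, r3 = rng.random(3)
    return (w * v
            + c1 * r1 * (pbest - p)
            + c2 * r2 * (gbest - p)
            + c3 * r3 * w1 * (p_avg - p))

def reinit_worst(positions, fitness, frac=0.25, lo=-3.0, hi=3.0):
    """Sort particles by fitness (ascending; smaller RMSE is better) and
    re-initialise the worst-ranked quarter uniformly in [-3, 3]."""
    order = np.argsort(fitness)               # best first, worst last
    n_bad = int(len(positions) * frac)
    worst = order[-n_bad:]
    positions[worst] = rng.uniform(lo, hi, size=positions[worst].shape)
    return positions
```

In a full training loop, `reinit_worst` would be triggered only when the fitness of G_best improves by less than 0.1 MHz over 40 iterations, as described above.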

Setting of Training Data
For effective prediction of foF2, ionosphere data from Wuhan Station from 2008 to 2013 are used to train the model. The data inevitably contain gaps due to ionosphere disturbances. If the values for both the previous hour and the next hour are present, linear interpolation is used to fill a missing value; otherwise, the sample is discarded. Details of the input parameters are described below.
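The gap-handling rule can be illustrated with a short sketch (an illustrative helper, not the authors' code): a one-hour gap is filled from its two neighbours, which for hourly data is the same as linear interpolation, while longer gaps are left for discarding.

```python
import numpy as np

def fill_single_gaps(fof2):
    """Fill an hourly foF2 series: a NaN is replaced by the mean of its
    neighbours only when both the previous and next hour are present;
    longer gaps are left as NaN (those samples are later discarded)."""
    out = fof2.copy()
    for i in range(1, len(out) - 1):
        if np.isnan(out[i]) and not np.isnan(fof2[i - 1]) and not np.isnan(fof2[i + 1]):
            out[i] = 0.5 * (fof2[i - 1] + fof2[i + 1])
    return out

series = np.array([5.0, np.nan, 7.0, np.nan, np.nan, 6.0])
filled = fill_single_gaps(series)
# only the single-hour gap at index 1 is filled; the two-hour gap remains
```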

Diurnal and Seasonal Variation
Due to the Earth's rotation and revolution, the ionosphere has obvious diurnal and seasonal variations, which foF2 follows. The hour number (HR), from 0 to 23, and the day number (DN), from 1 to 365, are used to describe the diurnal and seasonal variations, respectively. To avoid unrealistic numerical discontinuities (for example, between hour 23 and hour 0), HR and DN are converted to quadrature components:

HRC = cos(2π × HR / 24),  HRS = sin(2π × HR / 24),   (12)

DNC = cos(2π × DN / 365),  DNS = sin(2π × DN / 365).
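The conversion can be written as a small helper (illustrative):

```python
import math

def encode_time(hr, dn):
    """Quadrature encoding of hour number HR and day number DN:
    the HRC/HRS pair uses a 24 h period, the DNC/DNS pair a 365-day period."""
    hrc = math.cos(2 * math.pi * hr / 24)
    hrs = math.sin(2 * math.pi * hr / 24)
    dnc = math.cos(2 * math.pi * dn / 365)
    dns = math.sin(2 * math.pi * dn / 365)
    return hrc, hrs, dnc, dns

# hour 23 and hour 0 are now adjacent points on the unit circle
# instead of being 23 units apart
late = encode_time(23, 1)
early = encode_time(0, 1)
```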

Solar and Magnetic Activities
Solar variation causes periodic changes in the ionosphere. As measures of solar activity, we chose the sunspot number (SSN) and the 10.7 cm solar flux F10.7 as input parameters. Sporadic changes in the ionosphere are mostly related to geomagnetic disturbances, so we also chose the Kp index and the Dst index to describe magnetic activity [14].
Forecasting foF2 during geomagnetically disturbed periods is more difficult, and specialized studies have been devoted to it [15]. In this paper, however, we did not filter out data from geomagnetically disturbed periods, but used all the data to train the model.

Present Value of foF2
Analysis of a large amount of data indicates that the value to be predicted correlates well with the value at the present time. Thus, the present value of foF2 is added to the input set.
In total, nine inputs are used to predict foF2, as shown in Table 1.

Architecture of the ENN
To build the ENN model, the number of hidden layers and the number of neurons in each hidden layer need to be determined. First, we fixed the number of hidden layer neurons and varied the number of hidden layers; after determining the optimal number of hidden layers, we determined the optimal number of neurons in each layer.
However, there is no method to exactly calculate the number of hidden layer neurons in a neural network; only an approximate range can be obtained, using the empirical formula

N_h = sqrt(N_i + N_o) + α,   (15)

where N_h is the number of hidden layer neurons, N_i is the number of input layer neurons, N_o is the number of output layer neurons and α is an integer in [1, 10]. Too few hidden neurons increase the model error, while too many make training more complicated. According to formula (15), we set the number of hidden layer neurons to range from 6 to 12, and let the number of hidden layers vary from 1 to 5. These models were trained with the BP algorithm and evaluated by RMSE; the results are shown in Figure 2. The RMSE takes its minimum value when the number of hidden layers is 3, regardless of the number of neurons per layer. Thus, the number of hidden layers is set to 3.
Then the number of neurons in each hidden layer was varied from 6 to 12. The numbers of neurons in the first to third hidden layers are expressed as a combination {h_1, h_2, h_3}, h ∈ [6, 12], giving 343 combinations in total. The combinations are numbered in the order {6, 6, 6}, {6, 6, 7}, ..., {12, 12, 12}, and the results are shown in Figure 3. As the number of hidden layer neurons increases, the learning ability of the model increases and the RMSE decreases. The five combinations with the smallest RMSE are numbered 242, 268, 294, 327 and 333. Since increasing the number of hidden layer neurons increases the training complexity of the IPSO, this paper uses combination number 268, {11, 9, 7}, as the hidden layer configuration.
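The enumeration of hidden-layer combinations, and the correspondence between combination number 268 and {11, 9, 7}, can be checked directly:

```python
from itertools import product

# Enumerate the 343 combinations {h1, h2, h3}, h in [6, 12], in the order
# {6,6,6}, {6,6,7}, ..., {12,12,12}, and number them from 1 upward.
combos = list(product(range(6, 13), repeat=3))
numbered = {i + 1: c for i, c in enumerate(combos)}
```

With this ordering, `numbered[268]` is indeed `(11, 9, 7)`, consistent with the combination chosen in the text.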

Discussion
The IPSO-ENN model constructed in this paper has five layers: one input layer, three hidden layers and one output layer. The input layer has nine neurons, corresponding to the nine inputs. The three hidden layers contain eleven, nine and seven neurons, respectively. The output layer has one neuron, which represents the predicted value of foF2. The ENN model trained by the BP algorithm and the BPNN model are chosen as comparisons for the IPSO-ENN model. Both the ENN model and the BPNN model are generated by the Matlab neural network toolbox and have the same configuration as the IPSO-ENN model. The data of Wuhan Station in 2014 and 2016 are used to test the forecasting ability of all models; 2014 is a year of high solar activity, while 2016 is a year of low solar activity. The RMSE and the relative error (RE) are used to evaluate the prediction results of foF2. The RE is defined as

RE = (1/N) Σ_{t=1}^{N} |f_o(t) − y(t)| / f_o(t) × 100%,

where N is the number of input samples, f_o is the observed value and y is the predicted value.
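The two evaluation metrics can be sketched as follows (the RE definition is the one reconstructed from the text; the values below are toy data, not the paper's results):

```python
import numpy as np

def rmse(y, fo):
    """Root mean square error between predicted y and observed fo (MHz)."""
    return np.sqrt(np.mean((y - fo) ** 2))

def relative_error(y, fo):
    """Mean relative error in percent: (1/N) * sum(|fo - y| / fo) * 100."""
    return 100.0 * np.mean(np.abs(fo - y) / fo)

fo = np.array([6.0, 8.0, 10.0])   # toy observed foF2 values (MHz)
y = np.array([6.3, 7.6, 10.5])    # toy predictions (MHz)
```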
The prediction results are shown in Table 2. Figure 4 shows the statistical histogram of the absolute error (AE) between measured foF2 and the foF2 obtained from the IPSO-ENN model, the ENN model and the BPNN model. The AE is defined as

AE(t) = y(t) − f_o(t).

Figure 5 shows the RMSE of each month in 2014 and 2016. The trend of the monthly RMSE of the three models is basically the same, and the IPSO-ENN model has the smallest RMSE in every month, which indicates that the IPSO-ENN model is also more accurate for the prediction of foF2 in individual months.

A further analysis was performed by investigating the predictability of the diurnal variability of foF2. Two days of data from 2014 and 2016 were selected and plotted together with the predictions of the IPSO-ENN model, the ENN model and the BPNN model, as shown in Figure 6. All three models fit the rising and falling segments of the foF2 curve well and reflect its diurnal trend; however, the IPSO-ENN model fits significantly better at the peaks and troughs of the curve. We also found that the diurnal variation of foF2 in 2016 was smaller than in 2014, which may explain why the RMSE in 2016 is smaller than in 2014.

To evaluate the forecasting ability of the model during an ionosphere disturbance, data from a disturbance that occurred on December 20, 2015 were selected for testing. This disturbance was caused by a geomagnetic storm, and the change of the Kp index during the period is shown in Figure 7. Comparing Figures 7a and 7b, it can be seen that the ionosphere disturbance follows the geomagnetic storm. The change of foF2 during the disturbance period is very intense, which leads to a large prediction error. The RMSE of the IPSO-ENN model during this period is 1.23 MHz, which is less than the 1.36 MHz of the ENN model and the 1.25 MHz of the BPNN model.
Details of the ionosphere disturbance should be incorporated into the IPSO-ENN model to improve the prediction ability of the model, which requires further investigation.

Conclusions
In this paper, a new IPSO-ENN model for foF2 forecasting is developed. Data of Wuhan Station from 2008 to 2016 are used to train the model and test its forecasting ability. Experiments show that the optimal structure of the IPSO-ENN model has three hidden layers containing 11, 9 and 7 neurons, respectively. Comparison with the ENN model and the BPNN model shows that the prediction ability of the IPSO-ENN model is the best at all time scales. During an ionosphere disturbance, the prediction RMSE of the IPSO-ENN model is also the smallest of the three models. Further research is needed to improve the predictive ability of the IPSO-ENN model during ionosphere disturbance periods.