Short-Term Electricity Price Forecasting Based on BP Neural Network Optimized by SAPSO

Abstract: In the electricity market environment, the market clearing price has strong volatility, periodicity and randomness, which makes it more difficult to select the input features for artificial neural network forecasting. Although the traditional back propagation (BP) neural network was applied early in electricity price forecasting, it suffers from low forecasting accuracy. For this reason, this paper uses the maximum information coefficient and Pearson correlation analysis to determine the main factors affecting electricity price fluctuation as the input factors of the forecasting model. An improved particle swarm optimization algorithm, called simulated annealing particle swarm optimization (SAPSO), is used to optimize the BP neural network to establish the SAPSO-BP short-term electricity price forecasting model, and actual sample data are used for simulation. The results show that the SAPSO-BP price forecasting model has a high degree of fit, that its average relative error and mean square error are lower than those of the BP network model and the PSO-BP model, and that it outperforms the PSO-BP model in convergence speed and accuracy, which provides an effective method for improving the accuracy of short-term electricity price forecasting.


Introduction
In the electricity market, electricity price forecasting is one of the basic conditions for optimal decision-making. The prediction accuracy of the electricity price directly affects the income and risk of transactions, and it is information that market participants pay close attention to [1][2][3][4]. Power generation companies need to determine their bidding curve according to the predicted electricity price in order to maximize profits while reducing risk [5,6]. However, with the advance of power system development, the fluctuation of the electricity price is not only increasing, but is also affected by many uncertain factors, such as changes in supply and demand, network congestion, the market and the psychology of bidders [7]. These random factors, which are difficult to quantify and screen, make it more difficult to forecast the electricity price.
At present, the mainstream short-term electricity price forecasting methods can be divided into traditional methods and intelligent methods. Traditional methods take time series as the basis of modeling; examples include the auto-regressive method, the auto-regressive moving average method and the multiple linear regression method [8][9][10][11][12]. These models are simple and fast, but their prediction accuracy is low for data with nonlinear relationships. Intelligent methods mainly include the neural network, support vector machine, random forest, etc. [13][14][15]. Deep learning methods, such as long short-term memory (LSTM) and the gated recurrent unit (GRU), are also popular intelligent prediction methods [16][17][18][19][20]. Among intelligent methods, the neural network has a strong nonlinear function fitting ability. The main reason is that the method collects abstract features between massive data in the

Feature Selection
There are many factors affecting the price of electricity, including some random factors that are difficult to determine and quantify. Considering all of them at once is obviously impractical. Based on the analysis of actual data, it is necessary to conduct a correlation analysis of the factors that may affect the price of electricity and to select the main influencing factors as the input factors of the neural network prediction. The following factors were taken into consideration:
• Market supply and demand index (SDI). System load demand generally has a direct impact on the price of electricity; usually, the greater the electricity consumption, the higher the price. At the same time, the electricity supplied by the system also affects the price; generally, the larger the maximum available capacity of the system, the lower the price. Therefore, it is not reasonable to consider demand and supply separately; instead, they are combined and SDI is used as the impact factor. In its calculation formula, D(t) is the maximum load in the forecast period t and S_max(t) is the maximum available capacity in period t;
• Previous system marginal electricity price (SMP). Because the bidding mode in the system does not change greatly in a short time, the previous SMP can be used as a strong influence factor in the short-term electricity price forecast;
• The installed capacity ratio, that is, the proportion of the installed capacity of the power plant in the total system capacity; generally speaking, the larger the proportion, the greater the influence of the power generation company on the quotation;
• System reserve demand and reactive power demand, which are also factors that affect the system electricity price.
In this paper, the maximum information coefficient (MIC) and Pearson correlation coefficient are used to comprehensively evaluate the influence of various factors on electricity price fluctuation.
The Pearson coefficient is used to measure the linear correlation between two variables and its value is between −1 and 1. A value greater than 0 means positive correlation and a value less than 0 means negative correlation. It is calculated as the covariance of the two variables divided by the product of their standard deviations.
The MIC measures the degree of linear or nonlinear correlation between two variables with high accuracy. The idea of the MIC is to draw the two random variables as a scatter plot, repeatedly partition the plot into small cells and calculate the probability of points falling in each cell, so as to estimate the joint probability density distribution. In the MIC formulas, |X| and |Y| represent the number of cells into which the x and y axes are divided and P(X, Y) is the joint probability between the two variables.
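The two measures can be illustrated with a short sketch. The Pearson function follows the standard definition. The second function is only a simplified stand-in for the MIC: it assumes equal-frequency grids and the commonly used cell budget |X||Y| < n^0.6, whereas the full MIC procedure also optimizes the grid boundaries. All function names here are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson coefficient: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def grid_mutual_information(x, y, nx, ny):
    """Mutual information (bits) of x and y on an nx-by-ny grid."""
    n = len(x)
    bx, by = equal_frequency_bins(x, nx), equal_frequency_bins(y, ny)
    joint = {}
    for cell in zip(bx, by):
        joint[cell] = joint.get(cell, 0) + 1
    px = [bx.count(a) / n for a in range(nx)]
    py = [by.count(b) / n for b in range(ny)]
    return sum((c / n) * math.log2((c / n) / (px[a] * py[b]))
               for (a, b), c in joint.items())


def approx_mic(x, y, alpha=0.6):
    """Simplified MIC: best normalized grid MI with nx * ny <= n ** alpha.

    Assumption: equal-frequency grids only, so this is a lower bound on
    the true MIC. Returns 0.0 if the sample is too small for a 2x2 grid.
    """
    budget = int(len(x) ** alpha)
    best = 0.0
    for nx in range(2, budget // 2 + 1):
        for ny in range(2, budget // nx + 1):
            mi = grid_mutual_information(x, y, nx, ny)
            best = max(best, mi / math.log2(min(nx, ny)))
    return best
```

On a symmetric parabola, for example, `pearson` is close to zero while `approx_mic` stays high, which mirrors the weak linear but strong nonlinear correlation described above.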
The dataset contains 50 samples and each sample contains values of the supply and demand index, previous SMP, system reserve rate, installed capacity and actual electricity price [39]. The previous SMP and the actual electricity price are between 0.1 and 0.6. The supply and demand index, system reserve rate and installed capacity are all in the form of ratios and their values are between 0 and 1.
First, we calculated the MIC value and Pearson coefficient between each of the above four factors and the actual electricity price over all samples, with X = [x_1, x_2, ..., x_n], n = 50, being the values of the influencing factor and Y = [y_1, y_2, ..., y_n], n = 50, being the electricity prices of the 50 samples. The calculation results are shown in Table 1. According to Table 1, the MIC and Pearson coefficient values of SDI and previous SMP are both high, indicating that they are the main factors affecting the electricity price. The Pearson coefficient values of the system reserve rate and installed capacity ratio are low but their MIC values are not, indicating that their linear correlation with the electricity price is weak but their nonlinear correlation is strong. Combined with the above analysis of the influencing factors, the four factors, namely the supply and demand index, previous SMP, system reserve rate and installed capacity ratio, were selected as the input factors for the neural network prediction of the electricity price.

SAPSO Algorithm
The particle swarm optimization algorithm is derived from a model of bird flocking. At the beginning, each bird in the flock has a flying speed and direction and each bird strives to keep flying in the population space without colliding with the others. When a bird finds food, it leaves the population and flies to the food. Then, other birds move closer to it and eventually fly to the food. Finally, all birds land where the food is. Figure 1 is a schematic diagram of particle motion. Mathematically, the particle swarm optimization algorithm randomly generates a specified number of particles, takes the optimal value experienced by each particle during the iterative process as its individual extreme value and records its position as p_i. The optimal value over all particles is taken as the global extreme value and its position is recorded as p_g. In each iteration, each particle updates its position and velocity by tracking these two extreme values according to Equations (5) and (6).
In the formulas, v is the velocity of the particle, x is the position of the particle, i is the index of the particle, p_i is the individual extreme value, p_g is the global extreme value, k is the iteration number, c_1 and c_2 are learning factors, w is the inertia weight and r_1 and r_2 are random numbers between 0 and 1.
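As an illustration, one particle update can be sketched as follows. Equations (5) and (6) are not reproduced in this text, so the sketch assumes the standard PSO form built from the symbols defined above: the new velocity is w·v + c_1·r_1·(p_i − x) + c_2·r_2·(p_g − x), and the new position is x plus the new velocity. The function name and default values are hypothetical.

```python
import random


def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """Return (new_position, new_velocity) for one particle.

    Assumes the standard PSO update; r1 and r2 are drawn afresh in [0, 1)
    for every dimension.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        v_next = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity
```

A particle that already sits at both extreme values with zero velocity does not move, which is one way to see why plain PSO can stagnate.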
To address the shortcomings of the particle swarm algorithm, namely that it easily converges prematurely and that its later convergence is too slow, SAPSO is obtained by improving PSO with the simulated annealing (SA) algorithm. Inspired by the cooling process of solid annealing, the idea of the SA algorithm is that the system reaches an equilibrium state at each temperature during cooling and can reach a minimum at room temperature. In the process of searching for the optimal solution, the SA rule for accepting inferior solutions at each temperature makes it possible for any individual extreme value to be accepted, and each is given a sudden jump probability of replacing the current global optimal solution, which helps prevent the swarm from falling into a local extreme value.
After an inferior solution is accepted, the updating formula of particle velocity changes accordingly. In the modified formula, p_g' is the replacement for the global extreme value p_g, selected from the individual extreme values p_i.
According to the rules of SA, each p_i is treated as a special solution that is worse than p_g, and a better-quality p_i should be given a higher probability of replacement. At temperature T, the sudden jump probability of p_i relative to p_g is calculated from the fitness values, where f represents the fitness value and M is the population size.
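The replacement rule can be sketched as below. The jump probability formula is not reproduced in this text, so the sketch assumes the Boltzmann form often used with SA, exp(−(f(p_i) − f(p_g))/T) normalized over the M individual extreme values, followed by roulette-wheel selection. Both function names are hypothetical and the paper's exact expression may differ.

```python
import math
import random


def jump_probabilities(personal_best_fitness, global_best_fitness, temperature):
    """Assumed Boltzmann-style jump probability of each p_i relative to p_g.

    Lower (better) fitness gives a higher probability; the values sum to 1.
    """
    weights = [math.exp(-(f - global_best_fitness) / temperature)
               for f in personal_best_fitness]
    total = sum(weights)
    return [wgt / total for wgt in weights]


def pick_replacement(personal_best_positions, probabilities, rng=random):
    """Roulette-wheel choice of the position that stands in for p_g."""
    r = rng.random()
    cumulative = 0.0
    for position, p in zip(personal_best_positions, probabilities):
        cumulative += p
        if r < cumulative:
            return position
    return personal_best_positions[-1]  # guard against rounding error
```

As the temperature falls, the probabilities concentrate on the best individual extreme value, so the search gradually behaves like ordinary PSO.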

Function Testing
Using the same initial population, simulation experiments were carried out to compare the performance of the improved particle swarm optimization algorithm with that of the basic particle swarm optimization algorithm. Different test functions were used for both algorithms and the results were analyzed through the mean and standard deviation over repeated runs, which reduces the influence of random error and allows the results to be compared under otherwise identical conditions.
In order to test the optimization effect of the SAPSO algorithm, three commonly used test functions were selected. The expression, decision variable constraints and optimal solution of each function are as follows:
• Sphere function (f_1): min(f_1(x)) = f_1(0, 0, ..., 0) = 0;
• Rastrigin function (f_2);
• Ackley function (f_3).
Considering the influence of the random factors in the algorithms, PSO and SAPSO were each run 20 times on each test function. The dimension of every test function was set to 10, the number of particles was 100 and the maximum number of iterations was 200. The optimization results are shown in Table 2. It can be seen from Table 2 that the results of SAPSO on the test functions f_1 to f_3 are better than those of PSO. For the function f_1, both PSO and SAPSO can converge to the optimal value quickly. For the function f_3, PSO cannot converge to the global optimal value, while SAPSO can obtain better optimization results quickly. Figure 2 shows the convergence curves of PSO and SAPSO in solving the Sphere, Rastrigin and Ackley functions. It can be seen from the figure that, compared with PSO, SAPSO mitigates premature convergence, converges faster and has a markedly better optimization ability.
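For reference, the three benchmark functions can be written as below. The expressions are the standard textbook definitions, each with a global minimum of 0 at the origin; the variable bounds used in the paper are not given in this text and are therefore left out.

```python
import math


def sphere(x):
    """Sphere function: sum of squares."""
    return sum(v * v for v in x)


def rastrigin(x):
    """Rastrigin function: highly multimodal, regular grid of local minima."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


def ackley(x):
    """Ackley function: nearly flat outer region with a deep central basin."""
    n = len(x)
    mean_square = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return (-20.0 * math.exp(-0.2 * math.sqrt(mean_square))
            - math.exp(mean_cos) + 20.0 + math.e)
```

The Sphere function is unimodal, which is consistent with both algorithms converging quickly on f_1, whereas Rastrigin and Ackley have many local minima that can trap plain PSO.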

BP Neural Network Optimization Based on SAPSO
In 1985, Rumelhart and McClelland proposed the BP neural network. After decades of development, the BP neural network has been widely recognized by the academic community and has been applied in various fields, such as information, medicine, psychology, engineering, control and transportation. The learning process of the BP network is consistent with the cognitive process of human learning. It simulates the human brain through three steps: first, the neural network absorbs knowledge from the outside through the learning process; second, the internal neurons store the acquired knowledge; third, the acquired knowledge is transferred to solve similar problems encountered later. These three steps are cyclic and interrelated, forming a complete intelligent system.
The model shown in Figure 3 is a BP network with one hidden layer. The BP network learns through the error back propagation algorithm, which is divided into two stages, namely, forward propagation and back propagation. In forward propagation, information is transferred from the input layer to the middle layer (which can be designed as a single hidden layer or multiple hidden layers), is processed by the middle layer and is then transferred to the output layer, completing one forward pass. When the actual output is inconsistent with the expected output, the network enters the error back propagation stage. The error is propagated from the output layer back through the hidden layer to the input layer, and the weights of each layer are modified by gradient descent on the error. The characteristic function of a neuron node is the S-type (sigmoid) function, as shown in Equation (12).
To address the shortcomings of the BP neural network, we used the intelligent optimization algorithm to optimize the initial weights and thresholds of the BP network before training, thus greatly shortening the training time of the BP network and, to a certain extent, preventing the algorithm from falling into a local optimum. The SAPSO algorithm can overcome the shortcomings of the PSO algorithm and optimize the BP network. It can optimize the weights and thresholds of the network more quickly and effectively, so that the model has higher stability and accuracy. In the SAPSO algorithm, each element of the particle vector X_i = (X_i1, X_i2, ..., X_id) represents a weight or threshold of a neuron in the BP neural network, where d is the total number of weights and thresholds in the BP neural network.
The fitness of each particle in the particle swarm optimization algorithm is calculated by Formula (14). In the formula, n is the number of samples, S is the number of particles, Y_{i,j} represents the jth actual value of the ith sample and y_{i,j} represents the jth calculated value of the ith sample.
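The mapping from a particle to a network, and from a network to a fitness value, can be sketched as follows. Formula (14) is not reproduced in this text, so the sketch assumes the fitness is the mean squared error over the training samples. It also assumes a tanh hidden layer and a linear output layer, mirroring the tansig and purelin transfer functions reported in the parameter settings. The helper names and the ordering of weights inside the particle vector are hypothetical.

```python
import math


def particle_dimension(n_in, n_hidden, n_out):
    """d: total number of weights and thresholds of a one-hidden-layer net."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def decode_particle(position, n_in, n_hidden, n_out):
    """Split a flat particle vector into (w1, b1, w2, b2)."""
    it = iter(position)
    w1 = [[next(it) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [next(it) for _ in range(n_hidden)]
    w2 = [[next(it) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [next(it) for _ in range(n_out)]
    return w1, b1, w2, b2


def forward(sample, w1, b1, w2, b2):
    """Forward pass: tanh hidden layer, linear output layer."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, sample)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def fitness(position, inputs, targets, n_hidden):
    """Assumed fitness: mean squared prediction error on the training set."""
    n_in, n_out = len(inputs[0]), len(targets[0])
    params = decode_particle(position, n_in, n_hidden, n_out)
    total = 0.0
    for x, t in zip(inputs, targets):
        y = forward(x, *params)
        total += sum((ti - yi) ** 2 for ti, yi in zip(t, y))
    return total / len(inputs)
```

With the four input factors, three hidden neurons and one output, `particle_dimension(4, 3, 1)` gives d = 19, so each particle is a 19-element vector.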
The steps of the SAPSO algorithm to optimize the BP network are as follows:
1. Initialize the BP neural network parameters, including the network structure, the number of neurons in each layer and the transfer function. Set the particle swarm parameters, including population size, particle dimension, maximum number of iterations, inertia weight and learning factors;
2. Initialize the population position and speed. The initial position of the particles is randomly set within the value range of the weights and thresholds of the BP network;
3. Calculate the fitness value of each particle. The particle position is assigned to the neural network as its weights and thresholds, the model is used to calculate the prediction error on the training set and the particle fitness value is calculated according to Formula (14);
4. Determine the individual and global extrema. The initial individual extremum is the fitness value of each particle and the best of these is the global extremum;
5. Give the optimal position of each individual a jump probability, calculated according to Equation (8). According to these probabilities, an individual optimal position is randomly selected to replace the global optimal position in the speed update formula;
6. Judge whether the maximum number of iterations has been reached. If so, proceed to step 7. Otherwise, anneal, update the speed and position of the particles and then return to step 3 to continue the cycle;
7. Take the optimal particle position as the initial weights and thresholds of the BP neural network;
8. Train the BP network and output the results.
The algorithm flow chart is pictured in Figure 4 below.
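Steps 2 to 6 can be sketched as a generic minimizer. This is a minimal illustration rather than the authors' implementation: it assumes the standard PSO update, a Boltzmann-form jump probability, a constant inertia weight and an arbitrary starting temperature. The function `sapso` and all of its parameter names are hypothetical.

```python
import math
import random


def _clamp(value, limit):
    """Restrict value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def sapso(objective, dim, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5,
          x_lim=10.0, v_lim=5.0, temperature=1.0, cooling=0.998, seed=0):
    """Minimize objective; returns (best_position, best_value, history)."""
    rng = random.Random(seed)
    # Step 2: random initial positions and velocities within the limits.
    xs = [[rng.uniform(-x_lim, x_lim) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[rng.uniform(-v_lim, v_lim) for _ in range(dim)]
          for _ in range(n_particles)]
    # Steps 3-4: initial fitness, individual and global extrema.
    pbest = [x[:] for x in xs]
    pbest_f = [objective(x) for x in xs]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]
    for _ in range(n_iter):
        # Step 5: jump weights (assumed Boltzmann form) and roulette pick.
        weights = [math.exp(-(f - gbest_f) / temperature) for f in pbest_f]
        pick = rng.choices(range(n_particles), weights=weights, k=1)[0]
        guide = pbest[pick]  # stands in for p_g in the velocity update
        # Step 6: update velocity and position, then re-evaluate.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v_next = (w * vs[i][d]
                          + c1 * r1 * (pbest[i][d] - xs[i][d])
                          + c2 * r2 * (guide[d] - xs[i][d]))
                vs[i][d] = _clamp(v_next, v_lim)
                xs[i][d] = _clamp(xs[i][d] + vs[i][d], x_lim)
            f = objective(xs[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = xs[i][:], f
        temperature *= cooling  # anneal
        history.append(gbest_f)
    return gbest, gbest_f, history
```

For steps 7 and 8, the returned position would be decoded into the initial weights and thresholds of the BP network before ordinary back propagation training. The `history` list records the best fitness per iteration, which is the kind of curve used to compare convergence.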

Results
Considering the neural network activation function is sensitive to the data between −1 and 1, all sample data were normalized before network training to reduce training time and improve training effect. The electricity price, supply and demand index, previous SMP, system reserve rate and installed capacity were normalized according to Equation (15).
In the formula, X is the normalized input variable and max(X) and min(X) are the maximum and minimum values of the variable to be normalized, respectively.
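A minimal min-max normalization sketch is given below. Equation (15) is not reproduced in this text, so the target range is left as a parameter: the defaults map to [0, 1], while `low=-1.0, high=1.0` would map to [−1, 1]. Which range the paper uses is an assumption to be checked against the original equation, and the function name is hypothetical.

```python
def min_max_normalize(values, low=0.0, high=1.0):
    """Linearly rescale values so that min -> low and max -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series carries no information; map it to the lower bound.
        return [low for _ in values]
    scale = (high - low) / (v_max - v_min)
    return [low + (v - v_min) * scale for v in values]
```

Each variable (electricity price and the four input factors) would be normalized separately, using the minimum and maximum of that variable.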

Algorithm Parameter Setting
The parameter selection of the intelligent algorithm has a great impact on the accuracy and stability of the prediction results. Through many simulation experiments, the parameters of the SAPSO-optimized BP network algorithm were set as follows: learning factors c_1 = 1.5 and c_2 = 1.5; particle position limits x_max = 10 and x_min = −10; speed limits v_max = 5 and v_min = −5; cooling rate q = 0.998. The number of particles was 250 and the maximum number of iterations was 200. The PSO-optimized BP network adopts the same parameters to ensure the comparability of the experiment.
The structure of the BP network also has a certain impact on the prediction effect. The neural network structure includes an input layer, hidden layers and an output layer. The network was established with the function newff in Matlab R2017a, using Neural Network Toolbox Version 10.0. Because the number of input variables and samples is small, the network structure did not need to be complicated, so a single hidden layer was selected. The transfer functions of the hidden and output layers were tansig and purelin, respectively, and the back propagation training function was trainlm; all of these are the default settings, which offer faster speed and higher accuracy. Tansig is a hyperbolic tangent sigmoid transfer function whose input can take any value and whose output lies between −1 and +1; purelin is a linear transfer function whose input and output values can be arbitrary. By changing the number of hidden layer neurons to 3, 9 and

Evaluating Indicator
In order to verify the accuracy and feasibility of the electricity price prediction model proposed in this paper, the mean absolute percentage error (MAPE) and root mean square error (RMSE) were used as the quantitative criteria for evaluating prediction accuracy. In their expressions, n is the number of predicted data points, Y(t) is the actual value of the electricity price at the predicted time and Y*(t) is the predicted electricity price at the predicted time.
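Both indicators can be computed as in the sketch below, which uses the standard definitions: MAPE is the mean of |Y(t) − Y*(t)| / Y(t) expressed as a percentage, and RMSE is the square root of the mean squared difference. The function names are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
```

Because MAPE divides by the actual price, a few samples with low actual prices and large misses can dominate it, while RMSE weights all samples on the same absolute scale.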

Results Analysis
Sample data were collected according to the four impact factors identified above. Tables 3-5 show the predicted values and error calculation results of the electricity price for the testing set. It can be seen that, regardless of the number of neurons in the hidden layer of the neural network, the prediction effect of SAPSO-BP is better than that of PSO-BP and the prediction effect of the BP network is the worst. Although individual samples have large error values, generally speaking, the predicted electricity price of SAPSO-BP is the closest to the real electricity price and its prediction error is the smallest.
Figure 5 shows the comparison between the predicted electricity price and the real electricity price of the nine models for the testing set. The trend of the electricity price curves is consistent with the data in the tables, which shows that the SAPSO-BP model has higher accuracy. The longitudinal comparison in Figure 5 shows that the order of prediction effect, from good to bad, is SAPSO-BP, PSO-BP and BP. The horizontal comparison in Figure 5 shows that the number of neurons in the hidden layer has little effect on the prediction results. The overall prediction accuracy of the SAPSO-BP method is relatively high and most of the sample prediction errors were less than 5%, but the prediction accuracy of the 1st, 11th and 17th samples was very poor, with errors exceeding 20% and reaching as high as 60%. This should not be a problem with the network model, because the training accuracy on the samples meets the preset requirements. The problem may come from the data themselves and there are two possibilities. First, the training sample data may contain noise and abnormal samples, so that the network fits a wrong pattern while accommodating them during training; the training accuracy then appears to meet the requirements even though the learned pattern is wrong, which leads to large errors in the prediction results.
Second, there may be noise and abnormal samples in the test sample data, which leads to large errors in the prediction results. In view of the above problems, methods such as wavelet analysis and rough set theory can be used in advance to correct the sample data, eliminating inherent noise and anomalies and improving the accuracy of prediction.
Table 6 shows the calculated values of the evaluation indexes of each model for the testing set. The mean absolute percentage errors of the BP model with 3, 6 and 9 hidden layer neurons were 32.89%, 27.90% and 38.00%, respectively, and the root mean square errors were 0.0858, 0.0885 and 0.0902, respectively. The mean absolute percentage errors of the PSO-BP model with 3, 6 and 9 hidden layer neurons were 28.29%, 21.83% and 17.06%, respectively, and the root mean square errors were 0.0688, 0.0780 and 0.0733, respectively. The mean absolute percentage errors of the SAPSO-BP model with 3, 6 and 9 hidden layer neurons were 9.03%, 7.63% and 9.71%, respectively, and the root mean square errors were 0.0639, 0.0548 and 0.0577, respectively. According to the evaluation indexes, the results of the SAPSO-BP model are clearly better than those of the PSO-BP and BP models. The number of hidden neurons has little effect on the results. For the SAPSO-BP model, the prediction result with 6 hidden neurons is slightly better than in the other two cases, as both its MAPE and its RMSE are the lowest.
Figure 6 shows the regression curves of the predicted and actual electricity prices of the nine models for the testing set. The R value represents the correlation between the predicted value and the actual value, which indicates the degree of fit between the network output and the actual value. The closer the R value is to 1, the better the model fits, although a value very close to 1 may also indicate overfitting. It can be seen from the figure that the fit of the SAPSO-BP model is the best and its R value is very close to 1. The R value of BP-15N is also close to 1.
However, according to the predicted electricity price data, its prediction result is not good, which may be due to too many hidden layer neurons and overfitting. Therefore, the neural network model should be chosen to suit the problem, not on the assumption that a more complex model is always better.
Figure 7 shows the comparison of the convergence curves of the different algorithms. Comparing the convergence processes of SAPSO-BP and PSO-BP, it can be seen that PSO-BP stagnated several times during convergence, while SAPSO-BP mitigated the tendency of the PSO algorithm toward local convergence stagnation. In addition, the SAPSO-BP algorithm converged at around 80 generations, while PSO-BP converged to the optimal value at around 100 generations. Therefore, the BP neural network optimized by the SAPSO algorithm has a better convergence effect and a faster convergence speed.

Conclusions
In view of the randomness of the electricity price in the electricity market environment and the low accuracy of traditional BP neural network electricity price prediction, this paper proposes a short-term electricity price prediction model based on the SAPSO-BP neural network and draws the following conclusions:
(1) The electricity price has great randomness and uncertainty and there are many factors affecting electricity price fluctuation. By analyzing the main factors affecting the electricity price, MIC and Pearson correlation analysis were used to determine the supply and demand index, previous system marginal electricity price, system reserve rate and installed capacity as the input factors of the electricity price prediction model.
(2) Combining PSO with the SA algorithm enhances the ability of PSO to jump out of local optima and avoids premature convergence. The linearly increased inertia weight was used to counter the decline in the later convergence speed of PSO and to improve the convergence speed of the algorithm. The results show that the improved algorithm has better performance.