Annual Electric Load Forecasting by a Least Squares Support Vector Machine with a Fruit Fly Optimization Algorithm

The accuracy of annual electric load forecasting plays an important role in the economic and social benefits of electric power systems. The least squares support vector machine (LSSVM) has been proven to offer strong potential in forecasting problems, particularly when an appropriate meta-heuristic algorithm is employed to determine the values of its two parameters. However, these meta-heuristic algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. As a novel meta-heuristic and evolutionary algorithm, the fruit fly optimization algorithm (FOA) has the advantages of being easy to understand and converging quickly to the global optimal solution. Therefore, to improve the forecasting performance, this paper proposes an LSSVM-based annual electric load forecasting model that uses FOA to automatically determine the appropriate values of the two parameters for the LSSVM model. Taking the annual electricity consumption of China as an example, the computational results show that the LSSVM combined with FOA (LSSVM-FOA) outperforms the alternative methods, namely the single LSSVM, the LSSVM combined with a coupled simulated annealing algorithm (LSSVM-CSA), the generalized regression neural network (GRNN) and the regression model.


Introduction
With the rapid development of China's electric power industry, electric load forecasting technology has attracted widespread attention among practitioners and academics. An effective and accurate electric load forecast can provide the basis for the decision-making of electric power system planners. To a certain extent, annual electric load forecasting can affect the development trends of the electric power industry. With the construction and development of the "Strong Smart Grid" in China, renewable distributed energy generation capacity is growing rapidly, which may influence the stability of power system operation. In view of this, more accurate annual electric load forecasting is needed to maintain the secure and stable operation of the electric power grid. However, annual electric loads have complex and non-linear relationships with factors such as the political environment, human activities, and economic policy [1], making it quite difficult to forecast annual electric loads accurately.
To improve the accuracy of annual electric load forecasting, many approaches have been proposed by scholars and practitioners in the past decades, such as time series technology and regression models [2][3][4][5][6]. However, it is difficult to achieve significant improvements in forecasting accuracy with these methods due to their poor non-linear fitting capability. In recent years, many artificial intelligence forecasting techniques have been applied to annual power load forecasting to improve the forecasting accuracy. Niu et al. [7] proposed a combined forecasting method based on a particle swarm optimization method, which can improve the forecasting stability and reliability. Wang et al. [1] proposed a hybrid model combining support vector regression and a differential evolution algorithm to forecast the annual power load, which was proven to outperform the SVR model with default parameters, the regression forecasting model and the back propagation artificial neural network (BPNN). Xia et al. [8] developed a medium and long term load forecasting model by using a radial basis function neural network (RBFNN), and the computational results indicated that the proposed model has a higher forecasting accuracy and stability. Hsu and Chen [9] formulated an artificial neural network model by collecting empirical data to forecast the regional peak load of Taiwan. Abou El-Ela et al. [10] proposed the artificial neural network (ANN) technique for long-term peak load forecasting, which was applied to the Egyptian electrical network based on its historical data. Meng et al. [11] applied the partial least squares method, which can simulate the relationship between electricity consumption and its influencing factors, to forecast electricity load, and the empirical results revealed that this method is effective. Chen [12] proposed a collaborative fuzzy-neural approach for forecasting Taiwan's annual electricity load, and this approach could improve the forecasting accuracy. Kandil et al. [13] implemented a knowledge-based expert system to support the choice of the most suitable load forecasting model, and the usefulness of this method was demonstrated by a practical application. Hong [14] proposed an electric load forecasting model which combined the seasonal recurrent support vector regression model with a chaotic artificial bee colony algorithm, and this method could provide a more accurate forecasting result than the TF-ε-SVR-SA and ARIMA models. Pai et al. [15] used support vector machines with a simulated annealing algorithm to forecast Taiwan's electricity load, and the empirical results revealed that this model outperforms the general regression neural network model and the autoregressive integrated moving average model. These methods all improve, to a certain extent, the annual electric load forecasting accuracy.
The least squares support vector machine (LSSVM) is a reformulation of the support vector machine (SVM) which leads to solving a linear KKT system [16,17]. The LSSVM can approximate a non-linear system with high precision, making it a powerful tool for modeling and forecasting non-linear systems [18]. The LSSVM model has been successfully used to solve forecasting problems in many fields, such as CO concentration [19], gas [20,21], short term electric load [22][23][24], revenue [25], precipitation [26], wind speed [27], hydropower consumption forecasting [28], and so on. However, the LSSVM model has rarely been applied to annual electric load forecasting. This paper examines the feasibility of using the LSSVM model to forecast annual electric loads. The forecasting performance of the LSSVM model largely depends on the values of its two parameters. Currently, several meta-heuristic algorithms have been used to determine the appropriate values of these two parameters, including particle swarm optimization [20], genetic algorithm [22], chaotic differential evolution approach [29], artificial bee colony algorithm [30], and simulated annealing algorithm [31]. However, these optimization algorithms have the drawbacks of being hard to understand and reaching the global optimal solution slowly. The fruit fly optimization algorithm (FOA), proposed by Pan in 2011 [32], is a novel evolutionary computation and optimization technique. This new optimization algorithm has the advantages of being easy to understand, owing to its shorter program code compared with other optimization algorithms, and of reaching the global optimal solution quickly. Therefore, this paper attempts to use the FOA to automatically determine the appropriate values of the two necessary parameters in order to improve the performance of the LSSVM model in annual electric load forecasting.
The rest of this paper is organized as follows: Section 2 introduces the LSSVM model and FOA, then discusses in detail a hybrid annual electric load forecasting model (LSSVM-FOA) that combines the LSSVM model and FOA. Section 3 introduces the sample data processing procedure used in this paper and presents the computation, comparison and discussion of a numerical example. Section 4 concludes this paper.

Least Squares Support Vector Machine (LSSVM) Model
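The LSSVM is an extension of the SVM that applies a linear least squares criterion to the loss function and replaces the inequality constraints with equality constraints [33]. The basic principle is as follows [34]: given a set of samples $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in R^n$ is the input vector and $y_i \in R$ is the corresponding output value for sample $i$, a nonlinear function $\varphi(\cdot)$ maps the data from the original feature space to a higher dimensional one, in which the relationship is approximated linearly:

$$y(x) = w^T \varphi(x) + b \tag{1}$$

where $w$ denotes the weight vector and $b$ denotes the bias term.

In the primal space, the LSSVM formulation with equality constraints can be described as:

$$\min_{w,b,\xi} J(w,\xi) = \frac{1}{2} w^T w + \frac{1}{2} C \sum_{i=1}^{N} \xi_i^2 \quad \text{s.t.} \quad y_i = w^T \varphi(x_i) + b + \xi_i, \quad i = 1, \ldots, N \tag{2}$$

where $C$ is the regularization parameter and $\xi_i$ is the slack variable. The Lagrangian function $L$ can be constructed as:

$$L(w,b,\xi,a) = \frac{1}{2} w^T w + \frac{1}{2} C \sum_{i=1}^{N} \xi_i^2 - \sum_{i=1}^{N} a_i \left[ w^T \varphi(x_i) + b + \xi_i - y_i \right] \tag{3}$$

where $a_i$ is the Lagrange multiplier. The Karush-Kuhn-Tucker (KKT) conditions for optimality are given by:

$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} a_i \varphi(x_i), \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} a_i = 0, \quad \frac{\partial L}{\partial \xi_i} = 0 \Rightarrow a_i = C \xi_i, \quad \frac{\partial L}{\partial a_i} = 0 \Rightarrow w^T \varphi(x_i) + b + \xi_i - y_i = 0 \tag{4}$$

Eliminating the variables $w$ and $\xi_i$, the optimization problem can be transformed into the following linear system:

$$\begin{bmatrix} 0 & 1_N^T \\ 1_N & \Omega + C^{-1} I \end{bmatrix} \begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{5}$$

where $y = [y_1, \ldots, y_N]^T$, $1_N = [1, \ldots, 1]^T$, $a = [a_1, \ldots, a_N]^T$, $I$ is the identity matrix, and $\Omega_{ij} = \varphi(x_i)^T \varphi(x_j)$. According to Mercer's condition, the kernel function can be set as:

$$K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j) \tag{6}$$

Then, the LSSVM model for regression becomes:

$$y(x) = \sum_{i=1}^{N} a_i K(x, x_i) + b \tag{7}$$

There are several different types of Mercer kernel function $K(x, x_i)$, such as sigmoid, polynomial and radial basis function (RBF). The RBF is a common choice for the kernel function because it has fewer parameters to set and an excellent overall performance [35]. Therefore, this paper selected the RBF [as shown in Equation (8)] as the kernel function:

$$K(x, x_i) = \exp\left( -\frac{\lVert x - x_i \rVert^2}{2 \sigma^2} \right) \tag{8}$$

Consequently, there are two parameters that need to be chosen in the LSSVM model: the bandwidth of the Gaussian RBF kernel "σ" and the regularization parameter "C". In this paper, the FOA is used to determine the optimal values of these two parameters.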
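The derivation above ends in a linear system rather than a quadratic program, which can be illustrated with a minimal sketch. It assumes the standard LSSVM regression system [[0, 1^T], [1, Ω + I/C]] [b; a] = [0; y] with the Gaussian RBF kernel; the function names (`rbf_kernel`, `solve_linear`, `lssvm_fit`, `lssvm_predict`) are hypothetical and are not taken from the authors' program.

```python
import math

def rbf_kernel(x, z, sigma2):
    """Gaussian RBF kernel K(x, z) = exp(-||x - z||^2 / (2 * sigma2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma2))

def solve_linear(A, rhs):
    """Solve A u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [rhs[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (M[r][n] - tail) / M[r][r]
    return u

def lssvm_fit(X, y, C, sigma2):
    """Return (b, a): bias and Lagrange multipliers of the assumed KKT system."""
    n = len(X)
    A = [[0.0] + [1.0] * n]  # first row encodes sum(a_i) = 0
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / C  # adds I / C to the kernel matrix diagonal
        A.append(row)
    solution = solve_linear(A, [0.0] + list(y))
    return solution[0], solution[1:]

def lssvm_predict(x, X, a, b, sigma2):
    """Evaluate y(x) = sum_i a_i K(x, x_i) + b."""
    return sum(a_i * rbf_kernel(x, x_i, sigma2) for a_i, x_i in zip(a, X)) + b
```

For example, `b, a = lssvm_fit(X, y, C, sigma2)` followed by `lssvm_predict(x_new, X, a, b, sigma2)` gives a forecast for a new input. A dense solver is used here only to keep the sketch dependency-free; a numerical library would be the usual choice for larger sample sets.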

Fruit Fly Optimization Algorithm (FOA)
The fruit fly optimization algorithm (FOA) is a new swarm intelligence algorithm, which was proposed by Pan [32] in 2011. It is a kind of interactive evolutionary computation method. By imitating the food finding behavior of the fruit fly swarm, the FOA can reach the global optimum.
Fruit flies are insects that live in temperate and tropical climate zones and eat rotten fruit. The fruit fly is superior to other species in vision and osphresis (sense of smell). Its food finding process is as follows: it first smells the food source with its osphresis organ and flies towards that location; once it is close to the food location, it also uses its sensitive vision to find the food and the flocking location of other fruit flies, and then flies in that direction. The FOA has been applied to several fields including traffic incidents [36], export trade forecasting [37], and the design of analog filters [38]. Figure 1 shows the food finding iterative process of a fruit fly swarm. According to the food finding characteristics of the fruit fly swarm, the FOA can be divided into several steps, as follows:

Step 1: Parameter Initialization
The main parameters of FOA are the maximum iteration number maxgen, the population size sizepop, the initial fruit fly swarm location (X_axis, Y_axis), and the random flight distance range FR.

Step 2: Population Initialization
Give each individual fruit fly a random flight direction and distance for food finding by osphresis:

$$X_i = X\_axis + RandomValue \tag{9}$$

$$Y_i = Y\_axis + RandomValue \tag{10}$$

where RandomValue is drawn at random within the flight distance range FR.

Step 3: Population Evaluation
First, calculate the distance (Dist) of the fruit fly to the origin. Second, calculate the smell concentration judgment value (S), which is taken to be the reciprocal of Dist:

$$Dist_i = \sqrt{X_i^2 + Y_i^2} \tag{11}$$

$$S_i = 1 / Dist_i \tag{12}$$

Then, calculate the smell concentration ($Smell_i$) of the individual fruit fly location by substituting the smell concentration judgment value ($S_i$) into the smell concentration judgment function (also called the fitness function). Finally, find the individual fruit fly with the maximal smell concentration (the maximal value of $Smell_i$) among the fruit fly swarm:

$$Smell_i = Function(S_i) \tag{13}$$

$$[bestSmell \;\; bestIndex] = \max(Smell_i) \tag{14}$$

Step 4: Selection Operation
Keep the maximal smell concentration value and the corresponding x, y coordinates:

$$Smellbest = bestSmell \tag{15}$$

$$X\_axis = X(bestIndex) \tag{16}$$

$$Y\_axis = Y(bestIndex) \tag{17}$$

Then, the fruit flies fly towards the location with the maximal smell concentration value by using vision. Enter the iterative optimization and repeat Steps 2 and 3. The loop stops when the smell concentration is no longer superior to that of the previous iteration, or when the iteration number reaches the maximum iteration number.
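The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the random flight is drawn uniformly from [-flight_range, flight_range] on each axis, the loop always runs for `maxgen` generations instead of stopping early, and the function name `foa_maximize` and its default arguments are hypothetical.

```python
import math
import random

def foa_maximize(smell_fn, maxgen=100, sizepop=20, init_location=(5.0, 5.0),
                 flight_range=1.0, seed=0):
    """Return (best_s, best_smell) found by a basic fruit fly search."""
    rng = random.Random(seed)
    x_axis, y_axis = init_location  # Step 1: initial swarm location
    best_s, best_smell = None, -math.inf
    for _ in range(maxgen):
        swarm = []
        for _ in range(sizepop):
            # Step 2: random flight around the current swarm location
            x = x_axis + rng.uniform(-flight_range, flight_range)
            y = y_axis + rng.uniform(-flight_range, flight_range)
            # Step 3: distance to the origin, judgment value, smell concentration
            dist = max(math.hypot(x, y), 1e-12)  # guard against division by zero
            s = 1.0 / dist
            swarm.append((smell_fn(s), s, x, y))
        smell, s, x, y = max(swarm)  # fly with the maximal smell concentration
        # Step 4: keep the best smell and move the swarm only on improvement
        if smell > best_smell:
            best_s, best_smell = s, smell
            x_axis, y_axis = x, y
    return best_s, best_smell
```

For instance, `foa_maximize(lambda s: -(s - 0.5) ** 2)` searches for the judgment value S closest to 0.5. Note that the search variable is S = 1/Dist, so the reachable values of S depend on how far the swarm can travel from the origin.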

LSSVM-FOA Forecasting Model
The procedure structure of the LSSVM-FOA forecasting model is illustrated in Figure 2.

Step 3: Preliminary Calculations
Calculate the distance $Dist_i$ of fruit fly $i$ to the origin, and then calculate the smell concentration judgment value $S_i$. In the LSSVM-FOA program, we employ (D(i,1), D(i,2)) to represent $Dist_i$, and set D(i,1) = (X(i,1)^2 + Y(i,1)^2)^0.5 and D(i,2) = (X(i,2)^2 + Y(i,2)^2)^0.5. Similarly, we use (S(i,1), S(i,2)) to represent $S_i$ in the LSSVM-FOA program, and set S(i,1) = 1/D(i,1) and S(i,2) = 1/D(i,2). Then, input $S_i$ into the LSSVM model for annual electric load forecasting. In the LSSVM-FOA program, the parameters [C, σ] of the LSSVM model are represented by [S(i,1), S(i,2)], with C = 20 * S(i,1) and σ^2 = S(i,2). According to the electric load forecasting result, the smell concentration $Smell_i$ (also called the fitness function value) can be calculated. $Smell_i$ is taken to be the root-mean-square error (RMSE), shown in Equation (18), which measures the deviation between the forecast values and the actual values:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( f_i - \hat{f}_i \right)^2} \tag{18}$$

where $n$ is the number of forecasting periods, $f_i$ is the actual value at period $i$, and $\hat{f}_i$ denotes the forecast value at period $i$.
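The mapping from a fruit fly's position to the LSSVM parameters, and the RMSE fitness, can be written out as a short sketch. The scaling C = 20 * S(i,1) and σ^2 = S(i,2) follows the text; the function names `decode_parameters` and `rmse` are hypothetical, and a lower RMSE is assumed to indicate a better fly.

```python
import math

def decode_parameters(x1, y1, x2, y2):
    """Map two fly coordinates (X(i,1), Y(i,1)) and (X(i,2), Y(i,2)) to (C, sigma^2)."""
    d1 = math.hypot(x1, y1)      # D(i,1)
    d2 = math.hypot(x2, y2)      # D(i,2)
    s1, s2 = 1.0 / d1, 1.0 / d2  # S(i,1), S(i,2)
    return 20.0 * s1, s2         # C = 20 * S(i,1), sigma^2 = S(i,2)

def rmse(actual, forecast):
    """Root-mean-square error between actual and forecast values."""
    n = len(actual)
    return math.sqrt(sum((f - fh) ** 2 for f, fh in zip(actual, forecast)) / n)
```

For example, `decode_parameters(3.0, 4.0, 0.6, 0.8)` gives a distance pair of (5, 1) and therefore C = 4 and σ^2 = 1, up to floating-point rounding.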

Step 5: Circulation Stops
When gen reaches the maximum iteration number, the stop criterion is satisfied and the optimal parameters of the LSSVM model are obtained. Otherwise, go back to Step 2.

The Preprocessing of Sample Data
The sample data were selected from the annual electricity consumption of China between 1978 and 2011, shown in Table 1. Before the calculation, the sample data were normalized to the range from 0 to 1 using the following formula:

$$x_i' = \frac{x_i - x_{i\min}}{x_{i\max} - x_{i\min}}$$

where $x_{i\min}$ and $x_{i\max}$ denote the minimal and maximal value of each input factor, respectively. The sample data were divided into training data and testing data. Unlike short term electric load forecasting, annual electric load forecasting is not suited to input factors such as temperature, moderate [1]. Therefore, this paper selected the last three load data ($L_{n-3}$, $L_{n-2}$, $L_{n-1}$) as the input variables of the LSSVM-FOA model, and the output variable is $L_n$. Because the last three electric load data are used as the input variables, the training data started in 1981 and ended in 2005, and the testing data were from 2006 to 2011.
In the training stage, a roll-based data processing procedure was used. First, the first three load data (from 1978 to 1980) of the sample data were fed into the LSSVM-FOA model, and the electric load forecasting value of 1981 was obtained. Second, the next three load data (from 1979 to 1981) were fed into the LSSVM-FOA model, and the forecasting value of 1982 was produced. In this step, the electric load value of 1981 fed into the proposed LSSVM-FOA model is the actual electric load value of 1981, not the forecast one. The process was repeated in the same way until all the electric load forecasting values (from 1981 to 2005) were obtained.
Because of the roll-based data processing procedure, the value of n in Equation (18) equals 25.

(Note to Table 1) … [39]; the data for 2011 come from reference [40].
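The preprocessing described in this section, min-max normalization followed by rolling three-year input windows, can be sketched as follows. The function names are hypothetical, and the example uses the calendar years themselves as a stand-in series, since the load values of Table 1 are not reproduced here.

```python
def min_max_normalize(values):
    """Scale a sequence to the range [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def rolling_windows(series, width=3):
    """Pair each run of `width` consecutive values with the value that follows it."""
    count = len(series) - width
    inputs = [series[i:i + width] for i in range(count)]
    targets = [series[i + width] for i in range(count)]
    return inputs, targets

# Stand-in series: the years 1978..2011 mark which sample each window uses.
years = list(range(1978, 2012))
inputs, targets = rolling_windows(years, width=3)
train_targets, test_targets = targets[:25], targets[25:]
print(inputs[0], "->", targets[0])                    # first training window
print(len(train_targets), train_targets[-1])          # 25 training targets, ending in 2005
print(test_targets)                                   # testing targets 2006..2011
```

With 34 annual samples and a window of three, 31 input-output pairs result; the first 25 targets (1981 to 2005) correspond to the training set and the last six (2006 to 2011) to the testing set, which matches n = 25 in the RMSE used for training.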

The Selection of Comparison Models
To compare the annual electric load forecasting results, several other electric load forecasting models were selected. From Table 1, we can see that the annual electric load series shows an increasing, approximately linear trend. Therefore, the regression forecasting model was employed. In addition, the single LSSVM model, the LSSVM model combined with a coupled simulated annealing algorithm (LSSVM-CSA) [41], and the generalized regression neural network (GRNN) model were employed for comparison. The GRNN is a kind of radial basis function (RBF) network based on a standard statistical technique called kernel regression, and it performs well in terms of approximation ability and learning speed [42,43]. In the GRNN model, only one parameter, σ, needs to be determined.
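To make the single-parameter nature of the GRNN concrete, the following sketch implements the usual kernel-regression form of a GRNN prediction, a Gaussian-weighted average of the training targets. It is an assumption that the comparison model in this paper follows this textbook form; the function name `grnn_predict` and the argument name `spread` are hypothetical.

```python
import math

def grnn_predict(x, X_train, y_train, spread):
    """Gaussian-weighted average of training targets (kernel regression)."""
    weights = []
    for x_i in X_train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_i))
        weights.append(math.exp(-sq_dist / (2.0 * spread ** 2)))
    total = sum(weights)  # can underflow to 0 if x is far from every sample
    return sum(w * y for w, y in zip(weights, y_train)) / total
```

A small spread makes the prediction follow the nearest training sample closely, while a very large spread drives it towards the mean of all training targets, which is why this one parameter controls the smoothness of the fit.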

FOA Result for Parameter Determination of the LSSVM Model
In the LSSVM-FOA model, the values of the two parameters of the LSSVM model were dynamically tuned by the FOA. Figure 3a shows the fruit fly swarm flying route for parameter optimization. It can be seen that the flying route is relatively stable and that the swarm moves straight to the food location. The fruit fly swarm locates the food accurately and quickly. The iterative RMSE trend of the LSSVM-FOA model when searching for the optimal parameters is shown in Figure 3b.
Over 100 evolution iterations, convergence is reached in generation 17 at the coordinate (441,362), and the optimal values of the parameters σ and C are 0.7051 and 17.3571, respectively.

Forecasting Result and Discussion
According to the result of the FOA tuning the parameters of the LSSVM model, the values of σ and C were chosen as 0.7051 and 17.3571, respectively. In the single LSSVM model, the values of σ and C were chosen as 5 and 10, respectively. In the LSSVM-CSA model, the radial basis function was chosen as the kernel function. According to the result of CSA optimizing the parameters of the LSSVM model, the optimal values of σ and C were 10.8494 and 12185.8, respectively. In the GRNN model, the spread parameter value was chosen as 0.2.
The training times of the LSSVM-FOA, single LSSVM, LSSVM-CSA, GRNN and regression models are 17, 13, 36, 14 and 8 s, respectively. The LSSVM-FOA and LSSVM-CSA take longer than the single LSSVM, GRNN and regression models because they need to determine the parameters in each generation. However, the LSSVM-FOA takes 19 s less than the LSSVM-CSA.
Table 2 lists the annual electric load forecasting results of the LSSVM-FOA, LSSVM, LSSVM-CSA, GRNN, and regression models. Figure 4 shows the relative errors of the forecasting results of these five models. Table 2 and Figure 4 together show the deviations between the forecasting results of the five models and the actual values. The relative error ranges [−3%,+3%] and [−1%,+1%] are commonly used as standards to assess the performance of a forecasting model [46]. First, the relative errors of the annual electric load forecasting points of the LSSVM-FOA model are all in the range [−3%,+3%], and the maximum and minimum relative errors (in absolute value) are 2.265% in 2008 and −0.603% in 2009, respectively. In addition, two of the six forecasting points (33%) fall within [−1%,+1%]: −0.603% in 2009 and −0.811% in 2011. Second, the single LSSVM model has two forecasting points that exceed the relative error range [−3%,+3%], which are 3.139% in 2008 and 4.412% in 2009. Moreover, all of its forecasting points fall outside [−1%,+1%], and the maximum and minimum relative errors (in absolute value) are 4.412% in 2009 and −1.863% in 2011, respectively. Third, the LSSVM-CSA model has one forecasting point that exceeds the relative error range [−3%,+3%], which is 3.529% in 2008. The LSSVM-CSA model has one forecasting point within [−1%,+1%], which is −0.632% in 2007, and its maximum and minimum relative errors (in absolute value) are 3.529% in 2008 and −0.632% in 2007, respectively. Fourth, the GRNN model has three forecasting points that exceed the relative error range [−3%,+3%], which are 3…

The mean absolute percentage error (MAPE), mean square error (MSE), and average absolute error (AAE) were also used to assess the performances of the different forecasting models in this paper. The values of MAPE, MSE, and AAE can be calculated by:

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{A(i) - F(i)}{A(i)} \right| \times 100\%$$

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( A(i) - F(i) \right)^2$$

$$AAE = \frac{\frac{1}{n} \sum_{i=1}^{n} \left| A(i) - F(i) \right|}{\frac{1}{n} \sum_{i=1}^{n} A(i)}$$

where $A(i)$ is the actual electric load value at time $i$, $F(i)$ is the forecasting value at time $i$, and $n$ is the number of forecasting points.
The values of MAPE, MSE, and AAE for the LSSVM-FOA, LSSVM, LSSVM-CSA, GRNN and regression models are compared in Table 3. The MAPE value of the LSSVM-FOA model is 1.305%, which is much smaller than those obtained by the single LSSVM, LSSVM-CSA, GRNN and regression models (2.682%, 1.959%, 2.692%, and 3.273%, respectively). The MSE value of the LSSVM-FOA model is 2,476, which is dramatically smaller than those of the other four models (10,695, 6,308, 10,210, and 20,853, respectively). The AAE value of the LSSVM-FOA model is 0.0126, which is much smaller than those obtained by the single LSSVM, LSSVM-CSA, GRNN and regression models (0.0265, 0.0196, 0.0261, and 0.0333, respectively). Meanwhile, the values of MAPE, MSE, and AAE of the LSSVM-CSA model are smaller than those of the single LSSVM, GRNN and regression models. These results indicate that meta-heuristic algorithms for parameter selection have the potential to improve the forecasting accuracy of the LSSVM-based annual electric load forecasting model. In this paper, the LSSVM-FOA model has better forecasting performance than the LSSVM-CSA model. Furthermore, the regression model has the largest values of MAPE, MSE, and AAE and therefore the lowest forecasting accuracy, which reveals its poor non-linear fitting capability. The MAPE value of the single LSSVM model is slightly smaller than that of the GRNN model (2.682% versus 2.692%), but its MSE and AAE values are larger (10,695 versus 10,210 and 0.0265 versus 0.0261). So, it remains unclear in this paper whether the single LSSVM model performs better than the GRNN model in annual electric load forecasting. In conclusion, the proposed LSSVM-FOA model greatly narrows the deviations between the forecasting values and actual values, and outperforms the single LSSVM, LSSVM-CSA, GRNN, and regression models in annual electric load forecasting.
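The three accuracy measures compared in this section can be computed as in the sketch below. The MAPE and MSE follow their usual definitions; the AAE is assumed here to be the mean absolute error divided by the mean actual value, an assumption chosen because the reported AAE values are dimensionless and close to MAPE/100. The function names are hypothetical and the numbers in the example are illustrative, not taken from Table 2.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / n

def mse(actual, forecast):
    """Mean square error."""
    n = len(actual)
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n

def aae(actual, forecast):
    """Mean absolute error relative to the mean actual value (assumed definition)."""
    n = len(actual)
    mean_abs_err = sum(abs(a - f) for a, f in zip(actual, forecast)) / n
    mean_actual = sum(actual) / n
    return mean_abs_err / mean_actual

# Illustrative values only
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mape(actual, forecast), mse(actual, forecast), aae(actual, forecast))
```

Because MSE squares the deviations, it penalizes a few large errors more heavily than MAPE or AAE do, which is one reason two models can rank differently under the different measures.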

Conclusions
With the construction of the "Strong Smart Grid" and the increasing generation capacity of renewable distributed energy, accurate electric load forecasting is of great importance as a guide for the effective implementation of energy policies in China. However, the non-linear relationship of annual electric load with its influencing factors makes electric load forecasting very complicated. Thus, how to improve the annual electric load forecasting accuracy is worthy of study. The least squares support vector machine has been widely applied in a variety of fields, but it has rarely been applied to the problem of annual electric load forecasting. The fruit fly optimization algorithm (FOA) is a new swarm intelligence algorithm which has the advantages of being easy to understand, owing to its shorter program code compared with other meta-heuristic algorithms, and of reaching the global optimal solution quickly. In this paper, we hybridized the LSSVM and FOA into the so-called LSSVM-FOA model to examine its potential for annual electric load forecasting. To validate the proposed method, four alternative models (single LSSVM, LSSVM-CSA, GRNN, and regression model) were employed to compare the forecasting performances. The example computation results show that the relative errors of the annual electric load forecasting points of the LSSVM-FOA model are all in the range [−3%,+3%], and that its values of MAPE, MSE and AAE are much smaller than those obtained by the single LSSVM, LSSVM-CSA, GRNN, and regression models. These results indicate that the proposed LSSVM-FOA model is clearly superior to the alternative forecasting models in terms of annual electric load forecasting accuracy. The hybridization of the least squares support vector machine and the fruit fly optimization algorithm is feasible. The LSSVM-FOA model takes 19 s less than the LSSVM-CSA model, which supports the FOA's advantage of reaching the global optimal solution quickly compared with other meta-heuristic algorithms. Although the LSSVM-FOA model is a little more time consuming than the single LSSVM, this new hybrid forecasting model deserves attention. The proposed LSSVM-FOA model, which uses the FOA to automatically determine the appropriate values of the two parameters for the LSSVM model, can effectively improve the annual electric load forecasting accuracy. We also conclude that the artificial intelligence forecasting models perform much better than the regression model, which reveals that artificial intelligence forecasting models have good non-linear fitting capacity. Meanwhile, meta-heuristic algorithms for parameter selection have the potential to improve the forecasting accuracy of the LSSVM-based annual electric load forecasting model.

Figure 1. Food finding iterative process of a fruit fly swarm.

Figure 2. Diagram of the procedure structure of the LSSVM-FOA forecasting model.

Figure 3. (a) The fruit fly swarm flying route for parameter optimization; (b) The iterative RMSE trend of the LSSVM-FOA model searching for optimal parameters.

Figure 4. The relative errors of the forecasting results of the different forecasting models.