Research and Application of a Hybrid Forecasting Model Based on Data Decomposition for Electrical Load Forecasting

Accurate short-term electrical load forecasting plays a pivotal role in the national economy and people’s livelihood through providing effective future plans and ensuring a reliable supply of sustainable electricity. Although considerable work has been done to select suitable models and optimize the model parameters to forecast the short-term electrical load, few models are built based on the characteristics of time series, which will have a great impact on the forecasting accuracy. For that reason, this paper proposes a hybrid model based on data decomposition considering periodicity, trend and randomness of the original electrical load time series data. Through preprocessing and analyzing the original time series, the generalized regression neural network optimized by genetic algorithm is used to forecast the short-term electrical load. The experimental results demonstrate that the proposed hybrid model can not only achieve a good fitting ability, but it can also approximate the actual values when dealing with non-linear time series data with periodicity, trend and randomness.


Introduction
The electric power industry plays a pivotal role in national security, social stability and all aspects of people's lives. Electricity, one of the most important energy resources, is difficult to store. A great variety of instability factors can affect the electric system, such as emergencies, holidays, population changes, the weather and more [1]. Therefore, strict requirements are placed on the generation, transmission and sale of electricity, because excess supply wastes energy resources, while excess demand means the need for electricity cannot be satisfied. Performing load forecasting based on historical data has thus been a basic task in the operation of electric systems [2]. With the rapid development of society and the continuous improvement of economic levels, the demand for electricity has grown steadily, which poses a huge challenge to forecasting accuracy. Higher accuracy can improve electric energy usage, enhance the safety and reliability of the power grid and have a big impact on all sections of the electric power system. Accurate forecasting of electrical load plays a significant role, which is reflected in the following aspects:
(1) Improve the social and economic benefits. The electrical power sector is supposed to ensure a good social benefit by providing safe and reliable electricity, and to improve economic benefits by taking costs into account. Thus, electrical load forecasting helps the electrical power system achieve economically rational power dispatching.
(2) Ensure the reliability of electricity supply. Whether for power generation or supply, equipment needs periodic overhauls to ensure the safety and reliability of electricity. However, when to overhaul or replace the equipment should be decided based on accurate electrical load forecasting results.
(3) Plan for electrical power construction. The construction of electrical power production sites cannot stay unchanged, and should be adjusted and perfected to satisfy the demands of a constantly changing future with the progress of society and development of the economy.
There are a great number of methods to forecast the electrical load. In general, electrical load forecasting can be divided into three types according to the applied field and the forecasting horizon:
(1) Long-term electrical load forecasting. This covers a time interval above five years and is usually conducted during the planning and building stage of the electrical system; it considers the characteristics of the electrical system and the development tendencies of the national economy and society.
(2) Middle-term electrical load forecasting. It is mainly applied in the operation stage of the electrical power system, to direct the scientific dispatch of power, the arrangement of overhauls and so on.
(3) Short-term electrical load forecasting. It plays a pivotal role in the whole electrical system and is the most important part, for it is the basis of long- and middle-term electrical load forecasting. Besides, it can ensure the stable and safe operation of the electrical power system based on the forecasting data.
Electrical load forecasting is a very complicated task. On the one hand, the electrical power system itself is complex and large. On the other hand, the electrical market closely ties the electrical power system to the whole of society. Therefore, properly monitoring changes in the electrical load has become increasingly crucial for utilities, so as to secure a steady power supply and make suitable plans for investing in power facilities [3]. Inaccurate electrical load forecasting, by contrast, is counterproductive. An overestimated future load results in unnecessary generation of electrical power, while an underestimated load leads to trouble in offering sufficient electrical power, resulting in high losses per peaking unit [4,5]. In addition, inaccurate electrical load forecasting directly increases operating costs. Therefore, developing a better forecasting method and improving forecasting ability has become more and more imperative, and it is both a significant and a challenging task [6].
In recent years, the study of short-term electrical load time series forecasting has mainly included four aspects, which are classic forecasting methods, modern forecasting methods, combined forecasting methods and hybrid forecasting methods [7].
The classic forecasting models refer to regression analysis, time series analysis and so on. Regression analysis models regard the influencing factors of the time series as independent variables and the historical data as the dependent variable, and determine the relationship between the series and its influencing factors. These methods are based on the analysis of historical data, so they model the history well; however, as time goes by, their forecasting effect becomes weaker and weaker. The regression analysis process is simple and the parameter estimation methods are mature; however, when dealing with non-linear time series data, the forecasting accuracy is low. Another drawback is that it is difficult to select the influencing factors owing to the complexity of the objective data [8]. Time series forecasting constructs mathematical models based on the statistics of historical data; it requires relatively small datasets, runs fast and can capture the variation trends of recent data. However, it has a high requirement for the stability of the series, so when the influence of random factors is strong, the model achieves low forecasting accuracy.
The modern forecasting methods include artificial intelligence neural networks [9,10], chaotic time series methods [11], expert system forecasting methods [12], grey models [13,14], support vector machines [15,16], fuzzy systems [17], self-adaptable models [18], optimization algorithms and so on. Artificial neural networks (ANNs) simulate the human brain to process information intelligently, and they obtain a good forecasting performance on non-structural and non-linear time series data owing to their self-adaptability, self-learning and memory. In 1991, Park [19] first applied ANNs to electrical load forecasting, proving the good performance of the model and concluding that ANNs were applicable to electrical load forecasting. Since then a large number of researchers have utilized many types of ANNs to forecast time series [20][21][22]; however, ANNs also have their own limitations and disadvantages: (1) It is difficult to determine scientifically the number of layers and neurons of a network structure; (2) ANNs have a relatively slow self-learning convergence rate and easily fall into a local minimum; (3) The ability to express the fuzzy awareness of the human brain is not strong. Therefore, other methods, such as the support vector machine (SVM) and evolutionary algorithms (EA), are used to overcome the dependence of ANNs on the samples, enhance the extrapolation power and reduce the learning time. Pandian [23] and Pai [24] applied ANNs in electrical load forecasting systems. Optimization algorithms are inspired by biological evolution and are effective in dealing with complicated problems. They are usually combined with other forecasting methods, with the aim of selecting and identifying parameters. For example, when applied to ANNs, optimization algorithms do not depend on subjective experience to determine parameters; instead, they select more reasonable parameters through objective algorithms.
Single algorithms have limitations and accuracy errors and cannot be adapted to all situations; therefore, combined models have gradually become the current development tendency [25]. Combined forecasting models were initially proposed by Bates and Granger, who proved that a linear combination of two forecasting models could obtain better forecasting results than either single model alone. Xiao et al. [26] and Wang et al. [27] also proved that the forecasting accuracy of the combined model was higher than that of a single model. The basic principle of the combined forecasting methods is to integrate the forecasting outputs of different single models based on certain weights, narrowing the value range of the forecast down to a smaller scale. A problem should be studied from different angles instead of a single angle, and this is why the combined forecasting model is needed. The information obtained from each single forecasting method is not the same, and a weight is necessary to express the outputs of each single model more comprehensively in order to retain the original valuable information. Recently, combined forecasting models have been commonly used to solve forecasting issues, but how to select the single models properly and distribute the weights reasonably is a challenging task.
The theory of hybrid algorithms can overcome the shortcomings of the single forecasting model by integrating two or more single models. As discussed above, the single models have their own advantages and disadvantages when dealing with different forecasting problems. In comparison, the hybrid forecasting methods can increase the forecasting accuracy by determining an optimal combination and putting the advantages of the single models into full play. In other words, the hybrid algorithms can integrate many different forecasting techniques to solve practical problems. For example, the blind number theory can be applied in middle- and long-term electrical load forecasting to build a hybrid model, which can enhance the forecasting effects well given the irregular nature of electrical load time series.
Energies 2016, 9, 1050
Affected by many factors, the complexity of time series keeps increasing, and several techniques are utilized to solve the forecasting problems of time series. Azimi et al. [28] built a novel hybrid model to forecast the short-term electrical load, because a single model cannot capture the characteristics of the time series data. Khashei and Bijari [29] considered that there was no single model that could identify the real process of the data generation. Shukur and Lee [30] proposed a hybrid model, including ANN and auto regressive integrated moving average (ARIMA), taking full advantage of the linear and non-linear strengths of the two models. Considerable experimental results demonstrate that the forecasting accuracy of the hybrid model represents a great improvement when compared with other single models. Aiming to improve the forecasting quality, Niu [31] built a new hybrid ANN model and combined some statistical methods to conduct forecasting. Lu and Wang [32] developed a growing hierarchical self-organizing map (SOM) with support vector machine (SVM) to forecast the product demand. Okumus and Dinler [33] integrated the adaptive neuro-fuzzy inference system and ANNs to forecast the wind power, and their experimental results proved that the proposed hybrid model was better than applying the single model. Che and Wang [34] put forward the SVMARIMA hybrid model with SVM and ARIMA to forecast both the linear and non-linear trends more accurately. Meng et al. [35] developed a hybrid model for short-term wind speed forecasting by applying wavelet packet decomposition, crisscross optimization algorithm and artificial neural networks, and their experimental results showed that the proposed hybrid model had the minimum mean absolute percentage error, regardless of whether one-step, three-step or five-step prediction was used. Elvira [36] selected five forecasting methods to forecast the electrical load in summer and winter, respectively, in the southeastern region of Oklahoma. The empirical results showed that no one model always performed best in all conditions, and that differences in the original time series data and in the evaluation metrics used to measure errors both had an impact on the selection of the optimal model. Wu et al. [37] proposed a hybrid forecasting method based on seasonal index adjustment, and applied it to the forecasting of short-term wind speed and electrical load. The experimental results indicated that, compared with the method without seasonal index adjustment, the proposed hybrid model could achieve a better forecasting result.
As discussed above, single models cannot satisfy the requirements for forecasting accuracy in practice, and there is no one model applicable in every situation. The actual data are affected by various factors that are difficult to recognize and measure, and it is not possible to take every related factor into consideration; the model should therefore be built on the key factors that can be extracted. The establishment of hybrid models has become the current mainstream. Therefore, this paper proposes a hybrid forecasting model considering periodicity, trend and randomness for electrical load time series. The contributions of the model are summarized as follows:
(1) Time series data have the characteristics of continuity, periodicity, trend and randomness, and considerable work has been done to select suitable models and optimize the model parameters; however, few studies focus on building forecasting models based on the characteristics of the time series data. Therefore, the first contribution of this paper is to decompose the time series data. Based on the traditional additive model, the layer-upon-layer decomposition and reconstitution method is applied to improve the forecasting accuracy. Then, according to the data features after decomposition, suitable models can be found to perform the forecasting. Through effective decomposition of the data and selection of reasonable models, the forecasting quality and accuracy can be improved to a great degree.
(2) This paper uses the generalized regression neural network (GRNN) to improve the forecasting performance. The data after decomposition contain noise, so the empirical mode decomposition (EMD) is applied to reduce it. Then the genetic algorithm (GA) is utilized to optimize the GRNN, which enhances the forecasting accuracy of the single model.
(3) The practical application of the proposed hybrid model in this paper is to forecast the short-term electrical load in New South Wales, Australia, and to compare it with the single models and with models without decomposition. The forecasting results demonstrate that the proposed model has a strong non-linear fitting ability and good forecasting quality for electrical load time series. Both the simulation results and the forecasting process show that the hybrid model based on data decomposition has small errors and fast speed. The algorithm is not only applicable to the electrical power system, but also effective.
The rest of this paper is organized as follows: Section 2 describes the methods and Section 3 introduces the detailed steps of the hybrid model. The experimental results are shown in Section 4. Section 5 presents the conclusions.

Methods
Accurate electrical load forecasting requires well-developed forecasting methods with improved forecasting ability. This paper proposes a hybrid model to perform short-term electrical load forecasting, and this part introduces the fundamental methods, including the additive model of time series, the moving average model, the cycle adjustment model, empirical mode decomposition and the generalized regression neural network.

Additive and Multiplicative Model of Time Series
In general, a time series can be decomposed through data transformation by one of two types of models, the additive model and the multiplicative model, as shown in Equations (1) and (2):

$$Y_t = S_t + T_t + C_t + R_t \tag{1}$$

$$Y_t = S_t \times T_t \times C_t \times R_t \tag{2}$$

where $Y_t$ is the original time series and $S_t$ is a seasonal item, indicating the law of transformation of the time series with the season, which exists objectively. Actually, the electrical load time series always shows a seasonal cycle fluctuation; that is to say, the sequence changes repeatedly and continuously with time, showing a periodicity rule. Therefore, this paper classifies the seasonal item into the periodic item for clarity of expression. $T_t$ is a trend item, denoting the law of transformation of the time series with the trend. It mainly represents a long-term changing rule, because the time series will keep increasing, keep decreasing or remain stable. $C_t$ is a periodic item and it indicates a periodic and non-seasonal law of transformation of the time series with time. The number of periods in one fluctuation cycle is expressed as $h$. $R_t$ is a random item, which indicates the random change. Through decomposition, the original time series can be transformed into a stationary time series, which can achieve a good fitting and forecasting result.

Moving Average Model
The original time series shows the features of continuity, periodicity, trend and randomness. In order to smooth out these features and obtain a smoother time series, the moving average model is applied. The principle of the algorithm is to calculate the average of the most recent historical data and to regard this average as the next forecasting value, repeating until the final forecasting goal is reached. In other words, each new value replaces the oldest value, so that the number of items in the moving average stays fixed. The detailed calculation equation is as follows:

$$M_t = \frac{y_t + y_{t-1} + \cdots + y_{t-N+1}}{N} \tag{3}$$

where $X = \{y_1, y_2, \cdots, y_t\}$ is the original time series, $M_t$ is the moving average in the $t$-th period, $y_t$ is the observed value in the $t$-th period and $N$ is the fixed number of items averaged. The forecasting equation is:

$$\hat{y}_{t+1} = M_t \tag{4}$$
The average of each group can be used to approximate the periodic average [38].The s-th average period is: The average of all data is: The periodic value after adjustment is: Equations ( 5)-( 7) represent the periodic variation law.
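The moving average of Equations (3) and (4) can be illustrated with a minimal Python sketch. The function names `moving_average` and `forecast_next` and the sample values are hypothetical and chosen only for illustration; the periodic adjustment is not included.

```python
def moving_average(series, n):
    """Return M_t for every t that has n observations available."""
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and len(series)")
    window_sum = sum(series[:n])
    averages = [window_sum / n]
    for t in range(n, len(series)):
        # The newest value replaces the oldest one; the window size stays n.
        window_sum += series[t] - series[t - n]
        averages.append(window_sum / n)
    return averages


def forecast_next(series, n):
    """One-step forecast: the latest moving average is used as the next value."""
    return moving_average(series, n)[-1]


load = [10.0, 12.0, 14.0, 16.0, 18.0]   # hypothetical load values
print(moving_average(load, 3))          # [12.0, 14.0, 16.0]
print(forecast_next(load, 3))           # 16.0
```

A larger N gives a smoother series but reacts more slowly to recent changes, so N is normally chosen relative to the cycle length of the data.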

Empirical Mode Decomposition
The empirical mode decomposition (EMD), initially proposed in 1998, belongs to the data mining methods, which play a crucial role in dealing with non-linear data. Currently, it has been applied in many fields, such as geography [39], economics [40] and so on. EMD is a relatively new method that divides a non-stationary signal into components of different frequencies. Each component at a given signal scale is called an intrinsic mode function (IMF), which is a non-linear and stationary signal. An obvious feature of an IMF is that its wave amplitude changes with time. For a given signal $x(t) \in R^t$, the detailed steps of EMD are described as follows (as shown in Figure 1I):
Step 1. Find all the local extreme points of $x(t)$.
Step 2. From the local maxima and the local minima of $x(t)$, build the upper and lower envelope functions of the signal, respectively, denoted as $e_{\max}(t)$ and $e_{\min}(t)$.
Step 3. Calculate the average of the envelope functions:

$$m(t) = \frac{e_{\max}(t) + e_{\min}(t)}{2}$$

Step 4. Calculate the difference between the signal $x(t)$ and the envelope average function:

$$h(t) = x(t) - m(t)$$

Step 5. Replace the original signal $x(t)$ with $h(t)$, and repeat Step 2 to Step 4 until the average of the envelope functions tends to zero. In this way an IMF $c_1(t)$ is obtained.
Step 6. $c_1(t)$ represents the component with the highest frequency, so the low-frequency part of the original signal is $r_1(t)$:

$$r_1(t) = x(t) - c_1(t)$$

Step 7. For $r_1(t)$, repeat Step 2, Step 3 and Step 4 to obtain the second IMF $c_2(t)$, and continue until the residual function $r_n(t)$ is a constant function or a monotone function. Finally, the original signal $x(t)$ can be represented by the IMFs $c_j(t)$, $j = 1, 2, \cdots, n$, and $r_n(t)$, as shown in Equation (13):

$$x(t) = \sum_{j=1}^{n} c_j(t) + r_n(t) \tag{13}$$

The EMD steps of the time series are shown in Figure 1I, and the pseudo code of EMD is described in Algorithm 1 below.

Parameters:
δ-a random number in the algorithm with a value between 0.2 and 0.3.
T-a parameter describing the length of the original electrical load time series data.
5: Calculate the upper envelope U_i(t) and the lower envelope L_i(t) via cubic spline interpolation.
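The sifting procedure of Steps 1 to 7 can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' Algorithm 1: the envelopes are built by piecewise-linear interpolation instead of cubic splines to keep the example dependency-free, the stopping tolerance `tol` and the limits `max_iter` and `max_imfs` are hypothetical, and the test signal is synthetic.

```python
import math


def local_extrema(x):
    """Step 1: indices of interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Step 2 (simplified): piecewise-linear envelope through the knots and both end points."""
    knots = [0] + knots + [len(x) - 1]
    env = []
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b):
            w = (i - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    env.append(x[-1])
    return env


def sift_imf(x, max_iter=50, tol=1e-3):
    """Steps 2-5: subtract the envelope mean until it is close to zero."""
    h = list(x)
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper, lower = envelope(h, maxima), envelope(h, minima)
        mean = [(u + l) / 2 for u, l in zip(upper, lower)]   # Step 3
        h = [hi - mi for hi, mi in zip(h, mean)]             # Step 4
        if max(abs(m) for m in mean) < tol:
            break
    return h


def emd(x, max_imfs=5):
    """Steps 6-7: peel off IMFs until the residue has too few extrema to continue."""
    residue, imfs = list(x), []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 2:
            break
        c = sift_imf(residue)
        imfs.append(c)
        residue = [r - ci for r, ci in zip(residue, c)]
    return imfs, residue


signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) + 0.01 * t for t in range(200)]
imfs, residue = emd(signal)
rebuilt = [sum(c[t] for c in imfs) + residue[t] for t in range(len(signal))]
print(len(imfs), max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

By construction the IMFs and the final residue add back to the input signal, which mirrors Equation (13). A production implementation would use cubic spline envelopes and a more careful treatment of the end points.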

Generalized Regression Neural Network (GRNN)
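The generalized regression neural network (GRNN), first proposed by Specht in 1991, is a type of radial basis function (RBF) neural network. The theory of GRNN is based on non-linear regression analysis; in essence, the purpose of GRNN is to calculate the value of $y$ with the highest probability, based on the regression analysis of the dependent variable $y$ on the independent variable $x$. Assume that the joint probability density function of the random variables $x$ and $y$ is $f(x, y)$ and that the observed value of $x$ is $X$, so the regression of $y$ on $x$ is:

$$\hat{Y}(X) = E[y \mid X] = \frac{\int_{-\infty}^{\infty} y\, f(X, y)\, dy}{\int_{-\infty}^{\infty} f(X, y)\, dy}$$

The density function $f(X, y)$ can be estimated from the sample data set $\{x_i, y_i\}_{i=1}^{n}$ by applying Parzen non-parametric estimation:

$$\hat{f}(X, y) = \frac{1}{n (2\pi)^{(p+1)/2} \sigma^{p+1}} \sum_{i=1}^{n} \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right] \exp\left[-\frac{(y - Y_i)^2}{2\sigma^2}\right]$$

where $X_i$ and $Y_i$ are the sample observed values of $x$ and $y$, $n$ is the sample size, $p$ is the dimension of the random variable $x$ and $\sigma$ is the smoothing factor. $\hat{f}(X, y)$ can replace $f(X, y)$ of Equation (15), so the function after transformation is:

$$\hat{Y}(X) = \frac{\sum_{i=1}^{n} \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right] \int_{-\infty}^{\infty} y \exp\left[-\frac{(y - Y_i)^2}{2\sigma^2}\right] dy}{\sum_{i=1}^{n} \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right] \int_{-\infty}^{\infty} \exp\left[-\frac{(y - Y_i)^2}{2\sigma^2}\right] dy}$$

Since $\int_{-\infty}^{\infty} z e^{-z^2} dz = 0$, after calculating the two integrals, the output of GRNN $\hat{Y}(X)$ is obtained as follows:

$$\hat{Y}(X) = \frac{\sum_{i=1}^{n} Y_i \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right]}{\sum_{i=1}^{n} \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right]}$$

After the training samples of GRNN are obtained, the training process of the network amounts to optimizing the smoothing parameter $\sigma$, which determines the fitting ability of GRNN.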
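The GRNN output can be read as a kernel-weighted average of the training targets, which the following minimal Python sketch illustrates. The function name `grnn_predict` and the toy data are hypothetical; a Gaussian kernel with a single smoothing parameter `sigma` is assumed, as in the text.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Kernel-weighted average of the training targets for one input vector x."""
    weights = []
    for xi in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))         # squared distance to sample
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))    # Gaussian kernel value
    denom = sum(weights)
    numer = sum(w * y for w, y in zip(weights, train_y))
    # If every kernel value underflows to zero, fall back to the plain mean.
    return numer / denom if denom > 0.0 else sum(train_y) / len(train_y)


train_x = [[0.0], [1.0], [2.0], [3.0]]   # hypothetical training inputs
train_y = [0.0, 1.0, 4.0, 9.0]           # hypothetical training targets
print(grnn_predict([1.0], train_x, train_y, sigma=0.1))   # close to 1.0
print(grnn_predict([1.5], train_x, train_y, sigma=0.5))   # between 1.0 and 4.0
```

A small `sigma` makes the prediction follow the nearest training sample, while a large `sigma` pulls it towards the mean of all targets, which is why the choice of the smoothing parameter matters.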
As for the structure of GRNN, it is similar to that of RBF, including an input layer, a pattern layer, a summation layer and an output layer. The corresponding network input is $X = [x_1, x_2, \ldots, x_n]$, and its output is $Y = [y_1, y_2, \ldots, y_n]^T$. The layers are described below.

(1) Input layer
The number of neurons in the input layer is the same as the dimension of the input variable; this layer transfers the signals.

(2) Pattern layer
The number of neurons in the pattern layer is the same as the number of learning samples, and the transfer function is

$$p_i = \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right], \quad i = 1, 2, \ldots, n$$

where $X$ is the input variable of the network, and $X_i$ is the learning sample of the $i$-th neuron.

(3) Summation layer
Two methods can be applied to calculate the neurons. One is:

$$\sum_{i=1}^{n} \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right]$$

where the arithmetic sum of each neuron is calculated, the link weight is 1, and the transfer function is:

$$S_D = \sum_{i=1}^{n} p_i$$

The other method is:

$$\sum_{i=1}^{n} Y_i \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right]$$

where the weighted arithmetic sum of each neuron is calculated, and the link weight between the $i$-th pattern neuron and the $j$-th numerator summation neuron is the $j$-th element $y_{ij}$ of the $i$-th output sample $Y_i$. The transfer function is:

$$S_{Nj} = \sum_{i=1}^{n} y_{ij}\, p_i, \quad j = 1, 2, \ldots, k$$

(4) Output layer
The number of neurons in the output layer is the same as the dimension $k$ of the output variable. Each neuron divides the outputs of the summation layer, as shown in Equation (23):

$$y_j = \frac{S_{Nj}}{S_D}, \quad j = 1, 2, \ldots, k \tag{23}$$

There are also weights in GRNN that connect the different layers, and the least mean squares method and the differential chain rule are applied to adjust them. First, we define the least mean square error of each neuron in the output layer:

$$E_k = \frac{1}{2}\left[d_k(X) - F_k(W, X)\right]^2$$

where $d_k(X)$ is the expected output and $F_k(W, X)$ is the actual output. $E_k$ can reach its smallest value by adjusting the weights according to Equation (25), using the least mean squares method:

$$\Delta w_{ki} = \eta_k \left(-\frac{\partial E_k}{\partial w_{ki}}\right) \tag{25}$$

where $\eta_k$ is the learning rate. Therefore, the key to realizing the least mean square is to solve $(-\partial E_k / \partial w_{ki})$, so by using the differential chain rule, we can get:

$$-\frac{\partial E_k}{\partial w_{ki}} = -\frac{\partial E_k}{\partial F_k(W, X)} \cdot \frac{\partial F_k(W, X)}{\partial w_{ki}}$$

where $-\partial E_k / \partial F_k(W, X) = d_k(X) - F_k(W, X)$, which can be denoted as $\delta_k$. Then we can get $(-\partial E_k / \partial w_{ki}) = \delta_k y_i$ according to Equation (27):

$$\frac{\partial F_k(W, X)}{\partial w_{ki}} = y_i \tag{27}$$

where $y_i$ is the output of the $i$-th neuron in the hidden layer, and the input of the $k$-th neuron in the output layer. The detailed structure of GRNN is described in Figure 1IV.

The Proposed Hybrid Model
In the proposed data decomposition hybrid model (DDH), we first remove the periodicity in the original series, and then EMD-GA-GRNN is applied to forecast the electrical load time series without periodicity. After that, the periodicity is added back to the forecasted time series by using the additive model. This part introduces the basic ideas of both DDH and EMD-GA-GRNN.

Genetic Algorithm
The genetic algorithm (GA) is based on the natural selection rule and the biological evolution principle, and its basic idea is to generate a set of initial solutions (the population) in the problem space. Each solution is regarded as an individual in the population and is encoded as a chromosome. In the searching process, the fitness value of the chromosomes is the standard used to evaluate and select individuals. In the next generation, new individuals are generated through crossover and mutation operations, forming a new generation of the population [41]. The above steps are repeated so that the chromosomes converge to a desired optimum solution. GA is applied in this paper to optimize GRNN, and the detailed steps are described as follows (as shown in the pseudo code of Algorithm 2 and Figure 1II):
Step 1. Initialize the population. Each individual in the population is coded with real numbers; with a known network structure, an individual determines a neural network with its structure, weight values and threshold values.
Step 2. Determine the fitness function. The fitness value $F$ is the sum of the absolute errors between the forecasting output and the expected output, calculated by Equation (28):

$$F = k \sum_{i=1}^{n} \left| y_i - o_i \right| \tag{28}$$

where $n$ is the number of output nodes of the network, $y_i$ is the expected output of the $i$-th node, $o_i$ is the forecasting output of the $i$-th node, and $k$ is a coefficient.
Step 3. Selection operation. This operation is based on the proportion of the fitness, and the selection probability of each individual $i$ is $p_i$:

$$f_i = \frac{k}{F_i}, \qquad p_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$$

where $F_i$ is the fitness of individual $i$. Because a smaller fitness is better, the reciprocal of the fitness is calculated before the selection operation. $k$ is a coefficient and $N$ is the number of individuals in the population.
Step 4. Crossover operation. The individuals are coded with real numbers, and the crossover operation between the $k$-th chromosome $a_k$ and the $l$-th chromosome $a_l$ at the $j$-th position is:

$$a_{kj} = a_{kj}(1 - b) + a_{lj}\, b, \qquad a_{lj} = a_{lj}(1 - b) + a_{kj}\, b$$

where $b$ is a random number in [0,1].
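Step 5. Mutation operation. Select the $j$-th gene of the $i$-th individual to conduct the mutation operation, and the method is:

$$a_{ij} = \begin{cases} a_{ij} + (a_{\max} - a_{ij})\, f(g), & r > 0.5 \\ a_{ij} - (a_{ij} - a_{\min})\, f(g), & r \le 0.5 \end{cases} \qquad f(g) = r_2 \left(1 - \frac{g}{G_{\max}}\right)^2$$

where $a_{\max}$ is the upper bound of gene $a_{ij}$, $a_{\min}$ is the lower bound of gene $a_{ij}$, $r_2$ is a random number, $g$ is the current iteration number, $G_{\max}$ is the maximum iteration number and $r$ is a random number in [0,1].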
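The selection, crossover and mutation operations of Steps 3 to 5 can be sketched in Python as below. This is an illustrative sketch rather than the authors' Algorithm 2: all function names are hypothetical, roulette-wheel selection on the reciprocal fitness is assumed, and the mutation is written in the bounded non-uniform form, which is an assumption about the intended operator.

```python
import random


def selection_probabilities(errors, k=1.0):
    """Step 3: smaller error is better, so the reciprocal k / F_i is normalised."""
    inv = [k / e for e in errors]
    total = sum(inv)
    return [v / total for v in inv]


def roulette_select(population, probs, rng):
    """Draw one individual with probability equal to its share."""
    return rng.choices(population, weights=probs, k=1)[0]


def crossover(a_k, a_l, j, rng):
    """Step 4: blend gene j of two real-coded chromosomes with a random b in [0, 1]."""
    b = rng.random()
    child_k, child_l = list(a_k), list(a_l)
    child_k[j] = a_k[j] * (1 - b) + a_l[j] * b
    child_l[j] = a_l[j] * (1 - b) + a_k[j] * b
    return child_k, child_l


def mutate(a, j, a_min, a_max, g, g_max, rng):
    """Step 5: non-uniform mutation of gene j; the step shrinks as g approaches g_max."""
    f_g = rng.random() * (1 - g / g_max) ** 2
    child = list(a)
    if rng.random() > 0.5:
        child[j] = a[j] + (a_max - a[j]) * f_g   # move towards the upper bound
    else:
        child[j] = a[j] - (a[j] - a_min) * f_g   # move towards the lower bound
    return child


rng = random.Random(0)
population = [[0.2, 0.9], [0.5, 0.4], [0.8, 0.1]]   # hypothetical chromosomes
errors = [0.5, 1.0, 2.0]                            # hypothetical fitness values F_i
probs = selection_probabilities(errors)
parent = roulette_select(population, probs, rng)
child_k, child_l = crossover(population[0], population[1], 1, rng)
mutant = mutate(child_k, 0, 0.0, 1.0, g=3, g_max=10, rng=rng)
print(probs, parent, child_k, child_l, mutant)
```

The crossover keeps the sum of the two blended genes unchanged, and the mutation never leaves the interval between the lower and upper bounds, so no separate clipping step is needed in this form.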
Algorithm 2: Pseudo Code of the genetic algorithm

Data Decomposition Hybrid (DDH) Model
A time series always changes as time goes by, and such change has the features of continuity, periodicity, trend and a certain randomness. In previous research, whichever type of model was used (single, combined or hybrid), it was applied to forecast the time series as a whole. Unlike the previous research, this paper proposes a data decomposition hybrid model (DDH) based on the periodicity, trend and randomness in the time series. The basic idea of DDH is to decompose the time series based on the main influencing factors. On the basis of the decomposition and recombination of the traditional additive model, the layer-upon-layer decreasing method is applied to improve the forecasting accuracy. Then suitable models are selected to conduct the forecasting according to the characteristics of the data. The effective decomposition of the data and the use of proper forecasting models for each part can enhance the fitting performance of the model and decrease the forecasting errors to a great degree compared with conventional single forecasting methods. The detailed steps of DDH are described below (as shown in Figure 1III):
Step 1. Observe whether the time series $Y_t$ contains trend, periodicity and randomness, and judge the applicability of the additive model and the multiplicative model. In general, compared with the additive model, the multiplicative model is more suitable for time series with large fluctuations [42]. The electrical load time series has a relatively stable fluctuation range; therefore, the additive model is chosen, and the following discussion is based on it.
Step 2. Apply the moving average method or other methods to extract the periodicity $C_t$.
Step 3. Without the periodicity $C_t$, the rest of the data can be defined as the trend $T_t$. If $T_t$ is far larger than $C_t$, a periodic adjustment of $C_t$ should be conducted to obtain the estimated periodicity $\hat{C}_t$; this is because if the larger component is forecast first, much noise is left in the remaining data, which affects the forecasting accuracy. Then the new trend $T_t$ can be obtained ($T_t = Y_t - \hat{C}_t$). Finally, EMD-GA-GRNN can be utilized to forecast $T_t$, and the forecasting value is $\hat{T}_t$. On the contrary, if $C_t$ is far larger than $T_t$, EMD-GA-GRNN is used to forecast the trend $T_t$ first and get the forecasting value $\hat{T}_t$. Then the periodicity data $C_t$ can be obtained. Finally, the estimated value $\hat{C}_t$ is obtained through the periodic adjustment.
Step 4. The original randomness $R_t$ is calculated ($R_t = Y_t - \hat{C}_t - \hat{T}_t$). We forecast the randomness after decomposition by applying GA-GRNN to get the forecasting value $\hat{R}_t$. The randomness after decomposition is nearly stable, so EMD is unnecessary.
Step 5. Utilize the additive model to get the final forecasting values of the time series: $\hat{Y}_t = \hat{C}_t + \hat{T}_t + \hat{R}_t$.
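The layer-upon-layer flow of Steps 2 to 5 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the periodicity is estimated as the mean of each phase of an h-step cycle minus the overall mean, and a plain moving-average forecaster (`one_step_forecasts`, a hypothetical helper) stands in for both EMD-GA-GRNN and GA-GRNN.

```python
def periodic_component(y, h):
    """Estimate C_t: mean of each phase of an h-step cycle minus the overall mean."""
    overall = sum(y) / len(y)
    phase_mean = []
    for s in range(h):
        values = y[s::h]
        phase_mean.append(sum(values) / len(values) - overall)
    return [phase_mean[t % h] for t in range(len(y))]


def one_step_forecasts(series, n):
    """Placeholder forecaster: mean of the previous n values (first n copied as-is)."""
    out = list(series[:n])
    for t in range(n, len(series)):
        out.append(sum(series[t - n:t]) / n)
    return out


def ddh_fit(y, h, n=3):
    c_hat = periodic_component(y, h)                         # Step 2: periodicity
    trend = [yt - ct for yt, ct in zip(y, c_hat)]            # Step 3: T_t = Y_t - C_t
    t_hat = one_step_forecasts(trend, n)                     # stand-in for EMD-GA-GRNN
    resid = [yt - ct - tt for yt, ct, tt in zip(y, c_hat, t_hat)]       # Step 4
    r_hat = one_step_forecasts(resid, n)                     # stand-in for GA-GRNN
    y_hat = [ct + tt + rt for ct, tt, rt in zip(c_hat, t_hat, r_hat)]   # Step 5
    return c_hat, t_hat, r_hat, y_hat


pattern = [1.0, 2.0, 3.0, 6.0]                     # hypothetical 4-step cycle
y = [10.0 + pattern[t % 4] for t in range(12)]     # noise-free synthetic series
c_hat, t_hat, r_hat, y_hat = ddh_fit(y, h=4)
print(c_hat[:4])    # [-2.0, -1.0, 0.0, 3.0]
print(y_hat == y)   # True for this noise-free series
```

Each stage only sees what the previous stage left behind, so a better forecaster can be swapped in for any component without changing the rest of the flow.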

The EMD-GA-GRNN Forecasting Model
Within DDH, the EMD-GA-GRNN model is proposed to handle the data produced by the layer-upon-layer decreasing method. However, these data may include some noise caused by the forecasting errors of the preceding forecasting steps. Thus, it is pivotal to apply a proper method to remove the noise in the decomposed data. This paper chooses the empirical mode decomposition method considering its advantages in dealing with non-linear time series data. Then the GRNN is utilized to forecast the processed data, because it performs well in fitting non-stationary data. The training process of GRNN is actually the search for the optimum smoothing parameter σ, and the specific steps of the hybrid model EMD-GA-GRNN are listed as follows (Pseudo code of Algorithm 3):

Step 2. Standardize and code the time series after the denoising.
Step 3. Generate the initial population P(t), and the evolutionary generation is t = 0.
Step 4. Code the chromosome, and get the parameters of GRNN, which can be used to train the network structure.
Step 5. Set the individual evaluation standard according to the fitness function in Equation (34): where Y j (i) is the output of GRNN and Y j (i) is the output.
Step 6. Apply the optimum strategy based on the values of fitness function.
Step 7. Judge whether the fitness value meets the accuracy requirement. If so, the process ends; otherwise, move to the next step.
Step 8. Judge whether the current iteration t has reached the maximum iteration. If so, the process ends; otherwise, go to the next step.
Step 9. Perform the selection, crossover and mutation operations on the current population.
Step 10. Generate the new generation of the population, set the iteration t to t + 1, and return to Step 3.
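The GA loop in the steps above can be sketched in Python for the case of a single real-valued parameter such as the GRNN smoothing factor. This is a simplified illustration under stated assumptions, not the implementation used in this paper: it uses binary tournament selection, arithmetic crossover, Gaussian mutation with a shrinking scale, and elitism, and it stops only at the maximum number of generations. The function name `ga_minimize` and all default values are hypothetical.

```python
import numpy as np

def ga_minimize(fitness, low, high, n=30, generations=60, pc=0.8, pm=0.2, seed=0):
    """Minimise fitness(s) over a scalar s in [low, high] with a simple real-coded GA."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(low, high, size=n)                  # initial population
    scores = np.array([fitness(s) for s in pop])
    for g in range(generations):
        best = pop[np.argmin(scores)]
        # Selection: binary tournament (an assumption; the text does not fix the operator).
        i, j = rng.integers(0, n, size=(2, n))
        children = np.where(scores[i] < scores[j], pop[i], pop[j])
        # Crossover: arithmetic blend of neighbouring individuals with probability pc.
        for k in range(0, n - 1, 2):
            if rng.random() < pc:
                b = rng.random()
                a_k, a_l = children[k], children[k + 1]
                children[k] = a_k * (1 - b) + a_l * b
                children[k + 1] = a_l * (1 - b) + a_k * b
        # Mutation: Gaussian perturbation whose scale shrinks with the generation.
        scale = 0.1 * (high - low) * (1 - g / generations)
        mask = rng.random(n) < pm
        children[mask] += rng.normal(0.0, scale, size=mask.sum())
        children = np.clip(children, low, high)
        children[0] = best                                # elitism keeps the best individual
        pop = children
        scores = np.array([fitness(s) for s in pop])
    return pop[np.argmin(scores)]
```

In the setting of this paper, `fitness` would be the forecasting error of a GRNN trained with smoothing factor `s` and evaluated on verifying data; any scalar error function can be passed in its place.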

Experiments
With the rapid development of science and technology, the electrical power system in each country is developing fast as well, and power grid management has become more complicated. Forecasting is the premise and basis of decision and control; therefore, the most vital step of electrical load management is to conduct electrical load forecasting. Accurate forecasting can not only help the electrical power system operate safely based on reasonable maintenance schedules, but can also decrease the grid costs and maximize the profits.

Model Evaluation
Model evaluation gives a clear and direct understanding of the forecasting accuracy, and it helps to analyze the reasons for errors so that the forecasting performance can be enhanced. The main reasons are listed below: (1) Selection of influencing factors when constructing mathematical models. In practice, the time series is affected by various factors, and it is difficult to capture all of them. Therefore, errors between forecast values and actual values cannot be avoided. (2) Improper algorithms. For forecasting, we only build a relatively appropriate model, so if the algorithms are chosen wrongly, the errors become larger. (3) Inaccurate or incomplete data. The forecasting is based on the historical data, so inaccurate or incomplete data can result in forecasting errors.
When there are abnormal values, we are supposed to find the reasons causing the errors and correct each step of the model. The forecasting accuracy plays a crucial role in assessing a forecasting algorithm, and two types of evaluation metrics are chosen to evaluate it: the accuracy of forecasting a single point and the overall accuracy of forecasting multiple points. Two evaluation metrics are applied to examine the single-point forecasting accuracy, namely absolute error (AE) and relative error (RE). Then we select four evaluation metrics, including mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE) and mean error (ME), to evaluate the model performance more comprehensively. MAPE is a generally accepted metric for forecasting accuracy, and MAE and RMSE measure the average magnitude of the forecast errors; however, RMSE imposes a greater penalty on one large error than on several small errors [43].
For a time series x_t (t = 1, 2, ..., T), the corresponding forecasting output is x̂_t, and a detailed description of the evaluation metrics is shown in Table 1.
Table 1. The evaluation metrics (columns: Name of Metrics, Equation, No.).
The smaller the values of the six metrics are, the higher the forecasting accuracy is. The evaluation metrics therefore reflect the forecasting results and their accuracy clearly and directly, and they provide a reference base for decisions, which is beneficial to improving the model and conducting the analysis.
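As an illustration, the six metrics can be computed as in the following Python sketch. The formulas are the common textbook definitions and are an assumption here, since sign and scaling conventions vary between sources; the function names `point_errors` and `summary_metrics` are hypothetical.

```python
import numpy as np

def point_errors(actual, forecast):
    """Single-point metrics: absolute error (AE) and relative error (RE)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    ae = np.abs(forecast - actual)
    re = ae / np.abs(actual)
    return ae, re

def summary_metrics(actual, forecast):
    """Multi-point metrics: MAE, RMSE, MAPE (in percent) and ME."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    e = forecast - actual
    return {
        "MAE": np.mean(np.abs(e)),
        "RMSE": np.sqrt(np.mean(e ** 2)),
        "MAPE": 100.0 * np.mean(np.abs(e / actual)),
        "ME": np.mean(e),
    }
```

Because ME averages signed errors, positive and negative errors can cancel, so it is better read as a measure of bias than of error magnitude.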

Experimental Setup
This paper uses the 30-min interval data of New South Wales, Australia in April 2011 to verify the effectiveness of the proposed hybrid DDH model based on data decomposition. In the first experiment, the data size is 1440 (30 days with 48 observations per day); the data of the first 29 days form the training set, and the testing set includes the data of the 30th day. The detailed ideas of the proposed electrical load hybrid model are summarized as follows (as shown in Figure 2):
(1) The original electrical load time series data Y_t has an obvious trend and periodicity. Initially, the moving average method is used to extract the periodicity C_t, and the periodic adjustment of C_t gives Ĉ_t.
(2) Subtract the periodicity from the original time series data to get the original trend T_t (T_t = Y_t − Ĉ_t). For the original data without periodicity, EMD is first applied to eliminate the noise and improve the forecasting accuracy. Then the genetic algorithm is used to optimize GRNN to obtain the forecasting trend item T̂_t.
(3) The randomness is obtained through R_t = Y_t − Ĉ_t − T̂_t; then the GRNN optimized by the genetic algorithm is used to forecast the randomness, and the forecasting value R̂_t is obtained. The randomness tends to be steady; therefore, there is no need to eliminate noise.
(4) The final forecasting is performed by the additive model of time series Ŷ_t = Ĉ_t + T̂_t + R̂_t.

Empirical Results
The model performance is evaluated on the data described above, and the results are obtained using MATLAB ® (2015a) under Windows 8.1 on a 2.5 GHz Intel Core i5-3210M 64-bit CPU with 4 GB RAM. Figure 3 shows the data decomposition process.

(1) Figure 3A shows the results after decomposition by moving average. The original electrical load data contain a certain periodicity, and the variation within each period is roughly equal, so the additive model is more suitable. The length of the period, h = 48, can be determined from the data distribution. Thus the moving average method is used to decompose the electrical load data into two parts, periodicity and trend. Besides, the decomposed results show that the level of the trend is nearly ten times that of the periodicity. This is because the moving average method reveals the large trend of the development while eliminating fluctuation factors such as season. Therefore, the periodic adjustment should be conducted through extracting the periodicity.
(2) Figure 3B shows the electrical load data after the periodic adjustment, from which it can be seen that the adjusted data have periodic sequence and basic trend characteristics.
(3) Figure 3C demonstrates the output results of the trend data after EMD. Nine components are obtained after EMD data decomposition, namely IMF_1, IMF_2, ..., IMF_8 and R_n. The highest-frequency component is removed, and the remaining data are regarded as the new trend time series data.
(4) Figure 3D shows the trend data after EMD decomposition with the high-frequency component removed; the data denoised by EMD are clearly smoother than the original data.
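The denoising step in items (3) and (4) amounts to discarding the highest-frequency component and summing the rest. A minimal Python sketch is given below, assuming that the components have already been produced by some EMD implementation and are stored row by row from the highest to the lowest frequency, with the residue last. The function name `denoise_from_imfs` and this storage layout are assumptions made for illustration.

```python
import numpy as np

def denoise_from_imfs(imfs, drop=1):
    """Rebuild a smoother series by discarding the `drop` highest-frequency components.

    imfs: array of shape (k, n); rows are IMF_1, IMF_2, ..., followed by the residue.
    """
    imfs = np.asarray(imfs, dtype=float)
    return imfs[drop:].sum(axis=0)
```

Since the components of an empirical mode decomposition sum to the original series, `drop=0` returns the original data unchanged and `drop=1` corresponds to the removal of the highest-frequency component described above.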
Next, the data obtained after removing the high-frequency component by EMD are fitted and forecast by GRNN. The genetic algorithm is applied to optimize the smoothing factor σ in GRNN. The hybrid electrical load forecasting model EMD-GA-GRNN constructed in this paper forecasts the trend value at the next time point from the historical data at past time points. In this experiment, the trend values of the former four time points are used to forecast the trend value of the 5th time point. The given data first need to be divided into the training sample and the testing sample. Taking the training sample as an example, x_1, x_2, x_3, x_4, x_5 is the first sample group, in which x_1, x_2, x_3, x_4 are the independent variables and x_5 is the objective function value. Similarly, x_2, x_3, x_4, x_5, x_6 is the second sample group, in which x_2, x_3, x_4, x_5 are the independent variables and x_6 is the objective function value. By analogy, for a training series x_1, ..., x_N the final training matrix is:

$$\begin{bmatrix} x_1 & x_2 & \cdots & x_{N-4} \\ x_2 & x_3 & \cdots & x_{N-3} \\ x_3 & x_4 & \cdots & x_{N-2} \\ x_4 & x_5 & \cdots & x_{N-1} \\ x_5 & x_6 & \cdots & x_N \end{bmatrix}$$

where each column is a sub-sample sequence, and the last row is the expected output. The training sample is used to train GA-GRNN, after which the trained network is obtained. Figure 3D shows that EMD-GA-GRNN has a good fitting effect, and the MAPE between the network output and the real value is 2.11%. The network model is shown in Figure 4.

A great variety of forecasting approaches can achieve good performance in dealing with non-linear time series; therefore, in this paper we compare the proposed GRNN with three other well-known and commonly used methods, including the wavelet neural network (WNN), the secondary exponential smoothing method (SES) and the auto regressive integrated moving average (ARIMA). The forecasting results are compared in Figure 5, from which it can be seen that:
(1) WNN forecasts the nonlinear time series data fast, with a better generalization ability and a higher accuracy; however, its stability is weak.
(2) The advantages of SES are its simple calculation, strong adaptability and stable forecasting results, but its ability to address nonlinear time series data is weak.
(3) ARIMA performs well, with a relatively high accuracy, when forecasting the electrical load data. However, as time goes by, the forecasting errors gradually become larger, so it is only suitable for short-term forecasting.
(4) On the whole, compared with the other methods, GRNN obtains a better and more stable forecasting result, as it deals well with non-linear data and can fit and forecast the electrical load data well.
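The sample construction and the GRNN forecast described above can be sketched in Python as follows. The sketch uses the standard GRNN formulation, in which the output is a Gaussian-kernel weighted average of the training targets controlled by the smoothing factor σ; the function names `make_windows` and `grnn_predict` are hypothetical, and samples are stored as rows here, whereas the training matrix in the text stores them as columns.

```python
import numpy as np

def make_windows(x, lag=4):
    """Turn a series into (inputs, targets): `lag` past values predict the next one."""
    x = np.asarray(x, dtype=float)
    inputs = np.stack([x[i:i + lag] for i in range(len(x) - lag)])
    targets = x[lag:]
    return inputs, targets

def grnn_predict(train_inputs, train_targets, queries, sigma):
    """GRNN output: kernel-weighted mean of the training targets for each query."""
    train_inputs = np.asarray(train_inputs, dtype=float)
    train_targets = np.asarray(train_targets, dtype=float)
    queries = np.asarray(queries, dtype=float)
    # Squared Euclidean distance between every query and every training sample.
    d2 = ((queries[:, None, :] - train_inputs[None, :, :]) ** 2).sum(axis=2)
    # Subtracting the row minimum leaves the weighted mean unchanged but avoids
    # a 0/0 division when sigma is very small.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    return (weights @ train_targets) / weights.sum(axis=1)
```

A small σ makes the network reproduce the nearest training targets, while a large σ pulls every forecast towards the mean of the targets, which is why the choice of σ is left to the genetic algorithm.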
Energies 2016, 9, 1050

Next, the randomness is obtained by R_t = Y_t − Ĉ_t − T̂_t. Because it tends to be stationary, we only need to apply GA-GRNN to get the forecasting value R̂_t. The forecasting results of DDH are calculated by the additive model Ŷ_t = Ĉ_t + T̂_t + R̂_t, and the results are shown in Figure 6. Figure 6II demonstrates that the forecasting error at the 11th time point is the largest, with an MAPE within 5%, and this result is satisfactory.

Comparative Analysis
In order to prove the good performance of the proposed DDH model, three other hybrid models are compared with it, namely EMD-GA-WNN, GA-GRNN and EMD-GA-GRNN. The comparison results are shown in Table 2.
(1) From Figure 7, it can be seen that EMD-GA-WNN does not perform well when forecasting the electrical load data, and the relative errors of some parts even exceed 5%. This may be caused by the weak forecasting stability of WNN; although GA can optimize its parameters, the effect on its stability is weak.
(2) As for GA-GRNN and EMD-GA-GRNN, the MAPEs are all within 5%, which indicates that the two forecasting models have better performance. In detail, the forecasting effect of EMD-GA-GRNN is much better than that of GA-GRNN, proving the function of EMD in improving the forecasting accuracy.
(3) The DDH model based on the data decomposition put forward in this paper can control the MAPE at 4%; thus, it has a very strong fitting ability for non-linear data and forecasting ability for the electrical load time series. Both the simulation results and the forecasting process demonstrate that the proposed model performs well when forecasting non-linear time series data with periodicity, trend and randomness.
(4) From the evaluation metrics in Figure 7, it can be seen that the forecasting ability of GRNN is better than that of WNN, because GRNN deals well with data such as electrical load time series; therefore, this paper also establishes the model based on GRNN. The forecasting models EMD-GA-WNN and EMD-GA-GRNN, based on WNN and GRNN respectively, both improve the forecasting accuracy well. In comparison, however, GRNN is more suitable for nonlinear time series data: the MAPEs of EMD-GA-WNN and EMD-GA-GRNN are 2.22% and 1.53%, respectively. EMD also reduces the forecasting errors to some degree, as the MAPE decreases from 1.62% for GA-GRNN to 1.53% for EMD-GA-GRNN. The DDH model, however, can reduce the MAPE to within 1%.
The summary is concluded in Remark 1.

Remark 1. It can be concluded that, compared to the single forecasting model, the DDH model is more suitable for forecasting the electrical load time series data, with a higher fitting ability and better forecasting capacity.
The analysis above only shows the results of three models in one experiment, which cannot comprehensively demonstrate the model performance. Therefore, each model is trained 10 times with the same number of iterations to make the forecasting results more stable. The obtained forecasting quality and results are shown in Figure 8 and Table 3. Both indicate that the DDH model based on data decomposition performs well when measured by different evaluation metrics. A smaller MAE means a higher forecasting accuracy, a lower RMSE indicates a better fitting degree of the electrical load, and MAPE is an index for assessing the forecasting ability of the model. At present, for the data of New South Wales, the best standard is about 1%. In terms of the average MAE over the ten experiments, DDH has the smallest value, indicating the best forecasting accuracy. Moreover, the smallest RMSE not only means that DDH fits the electrical load time series well, but also shows that the forecasting results of the model are stable.
In order to further prove the effectiveness of the proposed DDH hybrid model, we expand the sample size by using the data of 89 days to forecast the data of the 90th day. That is to say, the first 89 days are the training set, and the testing set includes the data of the 90th day. The experimental results for both working days and weekends are shown in Table 4. Besides, experiments on days in different seasons are also conducted to examine the effectiveness and robustness of the proposed hybrid model; the results are listed in Table 5, and the detailed analysis is as follows: (1) As for the weekly analysis, the average MAPE of DDH in one week is 1.01%, which is lower than those of EMD-GA-WNN and EMD-GA-GRNN.
In addition, we also compare the forecasting performance of the proposed DDH model with the models in the literature, including [1,4,44,45]. As shown in Table 6, the model in this paper improves the forecasting accuracy by 0.089% compared to the HS-ARTMAP network. The MAPEs of the combined model based on BPNN, ANFIS and diff-SARIMA and of the hybrid model based on WT, ANN and ANFIS are 1.654% and 1.603%, respectively. Among the compared models, the combined model based on BPNN, RBFNN, GRNN and GA-BPNN has the lowest MAPE, which is 1.236%. For the proposed DDH model, Table 6 lists the data from April to June 2011 and a MAPE of 1.010%. Therefore, in summary, the DDH model outperforms the other compared models in the literature. The superior performance of DDH arises because the model can deal with both the trend and the periodicity in the original time series, which greatly enhances the forecasting accuracy. Besides, compared to the conventional BPNN and ARIMA, GRNN has strong generalization, robustness, fault tolerance and convergence abilities.

Discussion on Model Features
As discussed above, the major model in the DDH model is GRNN optimized by GA. The experimental results also demonstrate their effectiveness in forecasting the short-term electrical load time series. This part discusses the advantages of GRNN and GA in more depth. As shown in Table 7, GRNN has four obvious features:
1. It has a relatively low requirement for the sample size during the model building process, which can reduce the computing complexity;
2. The human error is small. Unlike in the back propagation neural network (BPNN), the historical samples in GRNN directly control the learning process without adjusting the connection weights of neurons. What is more, parameters such as the learning rate, training time and type of transfer function do not need to be adjusted. Accordingly, only one parameter in GRNN needs to be set artificially, which is the smoothing factor;
3. Strong self-learning ability and perfect nonlinear mapping ability. GRNN is a branch of RBF neural networks with a strong nonlinear mapping function, so applying GRNN to electrical load forecasting can better reflect the nonlinear mapping relationship;
4. Fast learning rate. GRNN does not need the BP algorithm to modify the connection weights of the network; it applies the Gaussian function to realize the internal approximation function, which helps achieve an efficient learning rate.
The above features of GRNN play a pivotal role in electrical load forecasting when the original data are fluctuating and non-linear.
The genetic algorithm is used to optimize the single parameter of GRNN. It is a type of algorithm that works without limiting the field or type of the problem; that is to say, it does not depend on the details of the problem and provides a universal framework for solving problems. Compared to traditional optimization, it has the following advantages:
• Self-adaptability. When solving problems, GA deals with the chromosome individuals through coding. During the process of evolution, GA searches for the optimal individuals based on the fitness function.
• Parallelism. On the one hand, it can search multiple individuals in the solution space; on the other hand, multiple computers can be applied to perform the evolution calculation and choose the best individuals until the computation ends.
The above advantages make GA widely used in many fields, such as function optimization, production dispatching, data mining and electrical load forecasting.

Conclusions
Electrical load forecasting can not only provide electricity supply plans for regions in a timely and reliable way, but can also help maintain normal social production and life. Thus, improving the forecasting accuracy of the electrical load can lower risks, improve the economic benefits, decrease the costs of generating electricity, enhance the safety of electrical power systems and help policy makers make better action plans. Therefore, how to forecast the changing trends and features of electrical loads in the power grid accurately and effectively has become both a significant and a challenging problem. This paper proposes a Data Decomposition Hybrid (DDH) model that can deal well with this task, and it mainly contains two key ideas. The first one is to decompose the data based on the main factors of the electrical load time series data. On the basis of the decomposition and reconstitution of the traditional time series additive model, the layer-upon-layer decreasing decomposition is applied for the reconstitution to enhance the forecasting accuracy. Then, according to the characteristics of the decomposed data, suitable forecasting models are found to fit and forecast each sub-sequence. Through the effective decomposition of the electrical load time series data and the selection of proper forecasting models, the fitting ability and forecasting capacity can be well improved.
The second idea is to improve the forecasting accuracy of the generalized regression neural network (GRNN). The major forecasting model in this paper is GRNN, and the genetic algorithm is used to optimize the parameter of GRNN. Before that, EMD is applied to eliminate the noise in the data. Thus, with the help of EMD and GA, the forecasting performance of GRNN can be greatly enhanced.
The experimental results show that, compared with EMD-GA-WNN, GA-GRNN and EMD-GA-GRNN, the proposed hybrid model has a good forecasting effect for electrical load time series data with periodicity, trend and randomness. In practice, the DDH model based on data decomposition can reach a high forecasting accuracy, which makes it a promising method for the future. Besides, if a time series shows an obvious periodicity, trend and randomness, the hybrid model can also be applied effectively in other forecasting fields, such as product sales forecasting, tourism demand forecasting, flood warning and forecasting, wind speed forecasting and traffic flow forecasting.
However, with the development of technology and information, many problems still exist in the forecasting field. This paper mainly focuses on a hybrid forecasting model based on time series decomposition and on how to improve the forecasting accuracy, and further analysis can be conducted in the following aspects: (1) This paper ignores the influences of other factors on the electrical load time series owing to the limitations of data collection; therefore, how to design a forecasting model and algorithm with multiple variables is a problem worth studying; (2) Forecasting techniques continue to improve, and there is no perfect forecasting model that can deal well with all time series forecasting problems. Thus, it is necessary to develop new algorithms for future forecasting work; (3) Denoising of time series. The EMD method applied in this paper is just one type of denoising method, and other algorithms, such as Kalman filtering and wavelet packet decomposition, should be compared with EMD to select a better one.

Figure 1. Steps of the main methods and proposed hybrid model in this paper.
Pseudo code of the genetic algorithm (GA):
Input: a sequence of denoising data; a sequence of verifying data
Output: fitness_value x_b: the value with the best fitness value in the population
Parameters:
Gen_max: the maximum number of iterations; n: the number of individuals
F_i: the fitness function of the individual i; x_i: the population i
g: the current iteration number of GA; d: the number of dimension
1: /* Initialize the population of n individuals x_i (i = 1, 2, ..., n) randomly. */
2: /* Initialize the parameters of GA: initial probabilities of crossover p_c and mutation p_m. */
3: FOR EACH (i: 1 ≤ i ≤ n) DO
4:   Evaluate the corresponding fitness function F_i = fitness_popu(best(idx, 1), 1)
5: END FOR
6: WHILE (g < Gen_max) DO
     FOR EACH (i = 1:n) DO
7:     IF (p_c > rand) THEN
8:       /* Conduct the crossover operation */ a_kj = a_kj (1 − b) + a_lj b and a_lj = a_lj (1 − b) + a_kj b
9:     END IF
10:    IF (p_m > rand) THEN
11:      /* Conduct the mutate operation */ a_ij = a_ij + (a_ij − a_max) * f(g) if r > 0.5; a_ij = a_ij + (a_min − a_ij) * f(g) if r ≤ 0.5
12:    END IF
     END FOR
13:  FOR EACH (i: 1 ≤ i ≤ n) DO
14:    Evaluate the corresponding fitness function F_i = fitness_popu(best(idx, 1), 1)
15:  END FOR
16:  /* Update the best nest x_p of the d generation in the genetic algorithm. */
17:  FOR EACH (i: 1 ≤ i ≤ n) DO
       IF (F_p < F_b) THEN
18:      /* The global best solution can be obtained to replace the local optimal */ x_b ← x_p
19:    END IF
     END FOR
   END WHILE
20: RETURN x_b /* The optimal solution in the global space has been obtained. */

Algorithm 3: EMD-GA-GRNN
/* The objective fitness function */
Parameters:
Gen_max: the maximum number of iterations; n: the number of individuals
F_i: the fitness function of individual i; x_i: the total population i
g: the current iteration number; d: the number of dimension
1: /* Process original electrical load time series data with the noise reduction method EMD */
2: /* Initialize the population of n individuals x_i (i = 1, 2, ..., n) randomly. */
3: /* Initialize the original parameters: initial probabilities of crossover p_c and mutation p_m. */
4: FOR EACH (i: 1 ≤ i ≤ n) DO
5:   Evaluate the corresponding fitness function
6: END FOR
7: WHILE (g < Gen_max) DO
8:   FOR EACH (i = 1:n) DO
       IF (p_c > rand) THEN
9:       Conduct the crossover operation of GA to optimize the smoothing factor of GRNN
10:    END IF
11:    IF (p_m > rand) THEN
12:      Conduct the mutate operation of GA to optimize the smoothing factor of GRNN
13:    END IF
     END FOR
14:  FOR EACH (i: 1 ≤ i ≤ n) DO
15:    Evaluate the corresponding fitness function
16:  END FOR
17:  /* Update best nest x_p of the d generation to replace the former local optimal solution. */
18:  FOR EACH (i: 1 ≤ i ≤ n) DO
       IF (F_p < F_b) THEN x_b ← x_p
19:    END IF
     END FOR
   END WHILE
20: RETURN x_b /* Set the weight and threshold of the GRNN according to x_b. */
21: Use x_t to train the GRNN, update the weight and threshold of the GRNN, and input the historical data into GRNN to obtain the forecasting value ŷ.

Figure 2. The process of electrical load forecasting for New South Wales.

Figure 3. The forecasting effects. (A) The original electrical load time series; (B) Electrical load time series data after adjustment; (C) EMD decomposition results; (D) EMD trend series, effect of EMD and results of EMD-GA-GRNN forecasting.

Figure 4. The generalized regression neural network model.

Figure 5. Forecasting results for trend of each model after removing the periodicity.

Figure 6 .
Figure 6.Forecasting results and MAPE of DDH model.

Figure 7. Forecasting results of each model.

Figure 8. Model evaluation of three forecasting models.

Table 2. The forecasting output of each model.

Table 3. Forecasting performance evaluation results.
On the other indexes, including MAE, RMSE and ME, DDH likewise obtains the best forecasting results. For both working days and weekends, the proposed hybrid model achieves a high forecasting accuracy, which demonstrates the effectiveness of the model.
(2) Table 5 shows the forecasting results for days in different seasons. Based on the comparison, it can be concluded that DDH is superior to the other two models, with MAPE values of 0.96%, 1.18%, 1.18% and 1.13% in spring, summer, autumn and winter, respectively. These results indicate that the proposed hybrid DDH model has a high degree of robustness and forecasting accuracy.
Remark 2. The DDH model performs stably and accurately when forecasting the electrical load data over one week and across different seasons.
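For reference, the four evaluation indexes named above can be computed as in the following minimal sketch. It uses the common textbook definitions; the sign convention for ME (actual minus forecast) and the function name `evaluate_forecast` are assumptions, since the exact formulas are not restated in this part of the text, and the sample values are illustrative only.

```python
import math


def evaluate_forecast(actual, forecast):
    """Return MAPE (%), MAE, RMSE and ME for two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]  # assumed sign: actual - forecast
    return {
        "MAPE": sum(abs(e / a) for e, a in zip(errors, actual)) / n * 100.0,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "ME": sum(errors) / n,
    }


# Illustrative values only (assumed, not from the paper).
metrics = evaluate_forecast([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

Because ME keeps the sign of each error, positive and negative deviations can cancel, so it indicates bias rather than overall accuracy; MAE, RMSE and MAPE do not cancel in this way.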

Table 4. Forecasting performance evaluation results of one week with larger training set.

Table 5. Forecasting performance evaluation results of different seasons with a larger training set.

Table 6. Comparison of MAPE with models in the literature.
If the fitness value of a chromosome is large, it indicates a stronger adaptability. It obeys the rule of survival of the fittest; meanwhile, it can keep the best state in a changing environment;
• Population search. Conventional methods usually search from a single point and are easily trapped in a local optimum if the search space is multimodal. GA, however, searches from multiple starting points and evaluates several individuals at the same time, which gives it a better global search ability;
• Need for a small amount of information. GA only uses the fitness function to evaluate individuals, without referring to other information. It imposes few requirements or restrictions on the problem, so it has wider applicability;
• Heuristic random search. GA relies on probabilistic transition rules rather than deterministic ones.
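The properties listed above (population search, reliance on the fitness function alone, and probabilistic transitions) can be seen in a minimal genetic algorithm sketch. This is not the GA-GRNN configuration used in the paper: the multimodal test function, roulette-wheel selection, arithmetic crossover, Gaussian mutation, elitism and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def fitness(x):
    # Hypothetical multimodal test function with many local maxima.
    return x * math.sin(10 * math.pi * x) + 2.0


def run_ga(fitness, low, high, pop_size=40, generations=60,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Population search: start from many random points instead of one.
    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Only fitness values are used; no gradients or other information.
        scores = [fitness(x) for x in pop]
        best_score = max(scores)
        elite = pop[scores.index(best_score)]
        history.append(best_score)
        # Probabilistic (roulette-wheel) selection; shift weights to be positive.
        floor = min(scores)
        weights = [s - floor + 1e-9 for s in scores]
        parents = rng.choices(pop, weights=weights, k=pop_size - 1)
        children = []
        for i, parent in enumerate(parents):
            child = parent
            if rng.random() < crossover_rate:
                mate = parents[(i + 1) % len(parents)]
                alpha = rng.random()
                child = alpha * parent + (1 - alpha) * mate  # arithmetic crossover
            if rng.random() < mutation_rate:
                child += rng.gauss(0.0, 0.1 * (high - low))  # Gaussian mutation
            children.append(min(max(child, low), high))      # clamp to bounds
        pop = [elite] + children  # elitism: the fittest individual survives
    scores = [fitness(x) for x in pop]
    best_score = max(scores)
    return pop[scores.index(best_score)], best_score, history


best_x, best_f, history = run_ga(fitness, -1.0, 2.0, seed=1)
```

Because the fittest individual is carried over unchanged in every generation, the best fitness recorded in `history` never decreases, which is one simple way to realize survival of the fittest.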