Forecasting Monthly Electric Energy Consumption Using Feature Extraction

Monthly forecasting of electric energy consumption is important for planning the generation and distribution of power utilities. However, the features of this time series are so complex that direct modeling is difficult. When a discrete wavelet transform is used to extract the raw features, three kinds of relatively simple series can be derived, namely, the rising trend, periodic waves, and a stochastic series. After the elimination of the stochastic series, the rising trend and periodic waves are modeled separately by a grey model and radial basis function neural networks. Adding the forecasting values of each model yields the forecasting results for monthly electricity consumption. The grey model has a good capability for simulating any smooth convex trend. In addition, this model can mitigate minor stochastic effects on the rising trend. The extracted periodic wave series, which contain relatively less information and comprise simple regular waves, can improve the generalization capability of neural networks. The case study on electric energy consumption in China shows that the proposed method is better than those traditionally used in terms of both forecasting precision and expected risk.


Introduction
Monthly forecasting of electric energy consumption plays an important role in the operation of thermal power plants and is one of the most important bases for coal dispatch, electricity trading, and so on. In addition, monthly forecasting of electric energy consumption is vital for the planning and maintenance of the grid. However, a number of difficulties are associated with forecasting. First, the steady monthly consumption trends of electric energy often change every few years as a result of macroeconomic conditions and social development. Therefore, the data used for modeling can only be derived from a number of consecutive years, during which macroeconomic conditions may have varied slightly. Second, the monthly consumption trend data are extremely complex because of the effects of people's living habits, weather conditions, and other unexpected factors. Thus, the monthly consumption trends often comprise at least three kinds of sub-trends, namely, a long-term rising trend, numerous periodical waves, and a stochastic series.
Classical techniques, such as regression [1] and expert systems [2], are incapable of generating precise forecasting results because of low adaptability. On the other hand, time series methods [3][4][5] use a moving average to simplify the raw trends. These methods remove tiny waves to smooth the data curves. As a result, time series methods have been proven to be better than classical techniques. However, the primary limitation of these methods is the inability to distinguish stochastic waves from a number of other useful waves. As a result, useful information may be mistakenly eliminated, and the forecasting results of these methods are often unsatisfactory because not all the useful information in the raw data is considered.
In the past several years, neural networks (NNs) have been widely applied in short-term load forecasting (STLF) [6][7][8][9][10] and short-term price forecasting [11,12]. The primary advantage of an NN is its capability for modeling non-linear relations without the supervision of human experts [13,14]. When used for load forecasting, NNs require significant volumes of data for training [15], while requiring relatively less information from each sample (the relationship between input and output vectors is relatively simple). These factors are important for forecasting precision; however, for the aforementioned reasons, collecting adequate steady monthly data, even for the short term, is difficult. Furthermore, compared with short-term load data (e.g., 96-point daily ones), monthly data often contain more information. To date, although NNs have been widely used for monthly data forecasting, their applications are limited to either special trends [16] or special points (e.g., peak load prediction) [17].
The key to further improvement of the forecasting precision of NNs in monthly forecasting is the reduction of the amount of information in each sample. In fact, sample simplification is also a feasible method for STLF [6]. For monthly forecasting, feature extraction is an effective method for reducing the amount of information in each sample. Zhao and Wei [18] have summarized a number of methods for extracting the series features. González-Romera et al. [19][20][21] adopted a moving average algorithm to extract the rising trend from a monthly electric energy demand series. The width of the data window in the moving average is selected by measuring the fitting accuracy and the smoothness of the obtained rising curve. The periodic wave is forecasted using the Fourier series, whereas the rising trend is predicted by NNs. The extracting techniques, combined with other methods, were proven to be capable of yielding better forecasting results compared with the time series and single NN methods.
However, the previously designed models were still observed to have certain limitations. First, the Fourier series can only simulate waves with invariable amplitude. However, the amplitude of periodic waves extracted from the raw trends will inevitably increase because of rapid increases in total monthly electric energy consumption, especially in developing countries. Second, the moving average method is only capable of separating waves from the rising trend, but is incapable of eliminating the stochastic effect. Third, the use of NNs to simulate the extracted rising trend cannot easily mitigate minor unexpected factors (e.g., the continuous occurrence of freak weather), thereby affecting generalizability.
The present paper adopts a discrete wavelet transform (DWT) to decompose the raw series to be forecasted. DWT is not only capable of extracting the rising trend and periodic waves, but it can also distinguish stochastic behavior. Periodic waves are forecasted by NNs, which can simulate their increasing amplitude. As a feasible method for simulating convex functions, the grey model (GM) is adopted to forecast the rising trend.
The paper is structured as follows: Section 2 briefly introduces the principles and primary algorithms used in the present work. Section 3 presents the case study on the monthly electric energy consumption in China as an example to test the model. Aside from the forecasting method discussed in Section 2, two other transitional methods are designed to test the contribution of each innovation, and both methods are compared with the traditional algorithm through forecasting error analysis. Section 4 concludes the present paper.

Feature Extraction Based on DWT
As a time-frequency decomposition method for discrete signals, DWT is used in the present work to decompose the monthly time series. The DWT function $\psi_{j,k}(t)$ can be written as:

$$\psi_{j,k}(t) = a^{-j/2}\,\psi\!\left(\frac{t - k a^{j} b}{a^{j}}\right) \tag{1}$$

where $a^{j}$ is the scale and $k a^{j} b$ is the translation, both of which are derived by discretizing the continuous wavelet transform (CWT).
For any discrete function $f(t)$, the discrete wavelet coefficient is

$$W_{j,k} = \int f(t)\,\psi^{*}_{j,k}(t)\,dt \tag{2}$$

where * denotes the complex conjugation of $\psi$. The discrete wavelet coefficient can be used to reconstruct the function $f(t)$ as follows:

$$f(t) = C \sum_{j} \sum_{k} W_{j,k}\,\psi_{j,k}(t) \tag{3}$$

where C is a constant. A schematic diagram of the DWT process is shown in Figure 1. The DWT decomposes a signal into low-frequency coefficients ($z_L$) and high-frequency coefficients ($y_0$ to $y_L$). The low-frequency and high-frequency coefficients comprise the approximate features and the details of $f(t)$, respectively, so that $f(t)$ is recovered by combining the approximation with all the details.
For the wavelet function, Daubechies wavelets were selected for extracting the features of the monthly electric energy consumption series. These wavelets are an orthogonal family defining a discrete wavelet transform and are characterized by a maximal number of vanishing moments for a given support [22].
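The decompose-then-reconstruct-by-frequency idea can be illustrated with a minimal sketch. It uses the Haar wavelet (the simplest member of the Daubechies family) in pure Python rather than the higher-order Daubechies wavelet used in the case study, and all function names are illustrative assumptions. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One analysis level: split a signal into approximation and detail coefficients."""
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]
    return approx, detail

def haar_inverse_step(approx, detail):
    """One synthesis level: merge approximation and detail coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def decompose(signal, levels):
    """Return the final approximation and the list of details (finest first)."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def reconstruct_band(approx, details, keep):
    """Rebuild a full-length series from a single frequency band.

    keep = -1 keeps only the low-frequency approximation;
    keep = j keeps only details[j]. Everything else is zeroed.
    """
    current = list(approx) if keep == -1 else [0.0] * len(approx)
    for j in range(len(details) - 1, -1, -1):
        detail = details[j] if keep == j else [0.0] * len(details[j])
        current = haar_inverse_step(current, detail)
    return current

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
z_L, y = decompose(series, levels=2)
bands = [reconstruct_band(z_L, y, keep) for keep in (-1, 0, 1)]
total = [sum(values) for values in zip(*bands)]  # bands add back up to the series
```

Because the transform is linear, the separately reconstructed bands sum to the original series, which is what allows individual bands to be modeled or discarded independently.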

GM for Exponential Trend Forecasting
If the socio-economy runs smoothly, both the economic growth rate and the electricity elasticity coefficient tend to be constant. As a result, the low-frequency result reconstructed from the monthly electric energy consumption series using DWT has a steadily increasing trend, which can be simulated using the following equation:

$$x(t) = b_1 e^{b_2 t} + b_3 \tag{4}$$

The optimum parameters $b_1$, $b_2$, and $b_3$ of Equation (4) cannot be obtained by minimizing the residual sum of squares (RSS) because of the location of $b_3$. Deng [23,24] proposed a method for solving this problem.
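The following differential form of Equation (4) can be processed to obtain the optimum parameters a and u by minimizing the RSS:

$$\frac{dx^{(1)}}{dt} + a\,x^{(1)} = u \tag{5}$$

To process the discrete series, Equation (5) should be discretized as follows:

$$x^{(0)}(k+1) + \frac{a}{2}\left[x^{(1)}(k) + x^{(1)}(k+1)\right] = u \tag{6}$$

where $x^{(0)}(k)$ (k = 1, 2, …) refers to the raw series, and $x^{(1)}(k)$ (k = 1, 2, …) stands for the accumulated generating series and is written as

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$$

Integrating the raw and the accumulated generating series with the iterative Equation (6) will yield the following:

$$N = B \cdot A \tag{7}$$

where, for a raw series of n values,

$$B = \begin{bmatrix} -\tfrac{1}{2}\left[x^{(1)}(1) + x^{(1)}(2)\right] & 1 \\ -\tfrac{1}{2}\left[x^{(1)}(2) + x^{(1)}(3)\right] & 1 \\ \vdots & \vdots \\ -\tfrac{1}{2}\left[x^{(1)}(n-1) + x^{(1)}(n)\right] & 1 \end{bmatrix}, \quad N = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}, \quad A = \begin{bmatrix} a \\ u \end{bmatrix}$$

The RSS can then be minimized to obtain the optimum parameters of Equation (5):

$$\min_{A}\; (N - BA)^{T}(N - BA) \tag{8}$$

Using a matrix derivative will yield:

$$\hat{A} = (B^{T}B)^{-1}B^{T}N \tag{9}$$

Using the optimum parameters obtained in Equation (9) and adopting $x^{(1)}(1)$ as a particular solution will yield the following solution for Equation (5):

$$\hat{x}^{(1)}(k+1) = \left[x^{(1)}(1) - \frac{u}{a}\right]e^{-ak} + \frac{u}{a} \tag{10}$$

Restoring the raw series by differencing gives

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k) = \left(1 - e^{a}\right)\left[x^{(1)}(1) - \frac{u}{a}\right]e^{-ak} \tag{11}$$

In Equation (11), k is the only variable. The forecasting value is obtained simply by entering an appropriate k. Considering the characteristics of Equation (11), k should equal i − 1 when forecasting the ith value.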
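The least-squares estimate and the forecasting step can be sketched as follows. This is a minimal pure-Python illustration of the standard GM(1,1) procedure, assuming the usual mean-of-neighbors discretization; the function names `gm11_fit` and `gm11_predict` and the synthetic 5% growth series are assumptions made for illustration, not part of the original study.

```python
import math

def gm11_fit(x0):
    """Least-squares estimate of (a, u) from the raw series x0."""
    # Accumulated generating series: x1(k) = x0(1) + ... + x0(k).
    x1, running = [], 0.0
    for value in x0:
        running += value
        x1.append(running)
    # First column of B: -(x1(k) + x1(k+1)) / 2; second column is all ones.
    z = [-(x1[k] + x1[k + 1]) / 2.0 for k in range(len(x0) - 1)]
    n_vec = x0[1:]  # N = [x0(2), ..., x0(n)]
    # Solve the 2x2 normal equations (B^T B) A = B^T N with Cramer's rule.
    m = len(z)
    s_zz = sum(v * v for v in z)
    s_z = sum(z)
    s_zn = sum(v * y for v, y in zip(z, n_vec))
    s_n = sum(n_vec)
    det = s_zz * m - s_z * s_z
    a = (s_zn * m - s_z * s_n) / det
    u = (s_zz * s_n - s_z * s_zn) / det
    return a, u

def gm11_predict(x0, a, u, k):
    """Restored raw-series value at position k + 1 (1-based)."""
    return (1.0 - math.exp(a)) * (x0[0] - u / a) * math.exp(-a * k)

# Synthetic 12-value window growing 5% per step (illustrative data only).
window = [100.0 * 1.05 ** k for k in range(12)]
a, u = gm11_fit(window)
next_value = gm11_predict(window, a, u, k=12)  # forecast of the 13th value
```

With a 12-value window, entering k = 12 forecasts the 13th value, which matches the rule that k equals i − 1 when forecasting the ith value. A rising trend gives a negative a.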

RBF NNs for Periodical Waves
In 1988, Broomhead and Lowe introduced the radial basis function (RBF) to the NN field. As a classic multilayer feedforward NN, the RBF NN usually consists of three layers, namely, the input, the hidden, and the output layers. The input layer gathers the input vector x, whereas the output layer yields the output vector y. According to the Kolmogorov theorem, any continuous n-dimensional mapping f: U × U × … × U → R × R × … × R, where f(x) = y and U = [0, 1], can be simulated using a three-layer feedforward NN. In theory, therefore, a three-layer feedforward NN can simulate any nonlinear fluctuation at any precision.
Unlike other feedforward NNs, the kernel function of an RBF NN is a Gauss function that is usually written as:

$$u_j = \exp\!\left(-\frac{\lVert x - w_j \rVert^{2}}{2\sigma_j^{2}}\right), \quad j = 1, 2, \ldots, N_1 \tag{12}$$

where $u_j$ is the output value of the jth node in the hidden layer, x is the input vector of the hidden layer, $w_j$ is the center of the Gauss function of the jth node, $\sigma_j^2$ is the spread of the Gauss function, and $N_1$ is the number of nodes in the hidden layer. Based on Equation (12), the output of each Gauss function will be between 0 and 1. Furthermore, each Gauss function responds (with an output significantly different from 0) only when the input vector is very close to its center ($w_j$).
Figure 2 shows the output distribution of the RBF NN with two Gauss function nodes. The output of the RBF NN is clearly seen to rapidly approach 0 when the inputs deviate from the centers. For high-dimension input vectors, the RBF NN has better learning efficiency compared with the traditional error back propagation NN, which adopts an S-shaped kernel function that responds over an unbounded region of the input space. That is, the RBF NN often has a rapid convergence rate and enhanced training speed, which are very important because NNs will be frequently utilized in the proposed forecasting model.
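The local response of the hidden layer can be checked numerically with a small sketch. It assumes the common Gaussian form exp(−‖x − w_j‖² / (2σ_j²)) and two hypothetical centers, loosely mirroring the two-node case of Figure 2; the function names and numeric values are illustrative, and a given toolbox may define its spread parameter differently.

```python
import math

def gaussian_rbf(x, center, spread_sq):
    """Gaussian kernel: exp(-||x - w_j||^2 / (2 * sigma_j^2))."""
    dist_sq = sum((xi - wi) ** 2 for xi, wi in zip(x, center))
    return math.exp(-dist_sq / (2.0 * spread_sq))

def hidden_layer(x, centers, spread_sq):
    """Outputs u_j of every hidden node for one input vector."""
    return [gaussian_rbf(x, w, spread_sq) for w in centers]

# Two hypothetical hidden nodes in a two-dimensional input space.
centers = [(0.0, 0.0), (3.0, 3.0)]
at_center = hidden_layer((0.0, 0.0), centers, spread_sq=1.0)
far_away = hidden_layer((10.0, -10.0), centers, spread_sq=1.0)
```

An input that coincides with a center drives that node to its maximum output of 1, while an input far from every center leaves all hidden outputs close to 0.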

Forecasting Model Design
The operation of the monthly electric energy consumption forecasting model involves the following steps. First, the historical data are decomposed by DWT and then reconstructed by each frequency. Reconstructed results with a small amplitude and an indistinct wave period (considered as stochastic waves) are eliminated. Second, the rising trend (the reconstructed result of low frequency) is modeled using GM (1,1), and other periodical waves (the remaining reconstructed results from different high frequencies) are modeled using the RBF NNs. Third, the forecasting results of monthly electric energy consumption are obtained by adding the forecasting values of the previous models. The next section will demonstrate the forecasting steps in detail.

Experimental Setup and Forecasting Results
The actual values of the monthly electric energy consumption ($10^8$ kWh) in China from January 1990 to December 2006 were selected to validate the aforementioned methods. Data were obtained from the website of the Chinese Economic and Financial Database of the China Center for Economic Research (CCER) [25]. The monthly electric energy consumption curve (Figure 3) exhibits a basic exponential rising trend and is characterized by numerous waves. The reconstructed low-frequency Result (1) reflects the primary monthly electric energy consumption trend, which exhibits an exponential rise. Results (2)-(4) reflect periodical waves of different frequencies. Amplitudes were observed to gradually increase with the development of the social economy. Results (5) and (6) have no distinct wave periods, and their amplitudes were generally very small. Thus, these results were classified as stochastic waves and consequently eliminated. In fact, the summation of Results (1)-(4) is 98.4% similar to the actual data.
For the extracted rising trend, the reconstructed low-frequency result has been modeled using GM (1,1). For each forecasting point, 12 consecutive previously obtained values have been adopted to model the exponential forecasting equation. The trend value of forecasting was derived assuming k = 12 in Equation (11).
For the reserved reconstructed high-frequency results (Results (2)-(4) in Figure 4), RBF NN has been adopted to simulate the relationship between the current value and a number of its previous values. For each forecasting point, 60 of its consecutive previous values were selected for RBF NN training. The 13th to 60th values were taken as outputs. For each output value, 12 previous values were selected as the inputs. After training, the NN is used for forecasting. Based on the same rule, when forecasting the jth point, the 12 consecutive previous values are entered and the output of the NN is considered the forecasting result. For example, if the value of a certain remaining reconstructed result for January 2005 were to be forecasted, the values from January 2000 to December 2004 from the same time series were selected to establish the NN structure. When constructing training samples, the values from January 2001 to December 2004 were successively selected as outputs, and the 12 previous values of each of these outputs were selected as the inputs.
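The sample construction described above can be sketched as a sliding window. The helper name `make_training_samples` and the integer stand-in data are assumptions for illustration; the window sizes (60 history values, 12 inputs per sample) follow the text.

```python
def make_training_samples(history, n_inputs=12):
    """Pair each value from position n_inputs onward with its n_inputs predecessors."""
    inputs, outputs = [], []
    for t in range(n_inputs, len(history)):
        inputs.append(history[t - n_inputs:t])
        outputs.append(history[t])
    return inputs, outputs

# Stand-in for 60 consecutive monthly values of one reconstructed series.
history = list(range(1, 61))
inputs, outputs = make_training_samples(history)

# After training, the last 12 values form the input for the next forecast.
forecast_input = history[-12:]
```

With 60 history values this yields 48 training pairs (the 13th to 60th values as outputs), and the final 12 values are then fed to the trained network to forecast the next point.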
In the current paper, all RBF NNs were implemented using the Matlab Neural Network Toolbox and were designed to have similar parameters, i.e., the neuron number of the RBF layer is determined by the dimension of the input vector and the training goal defaults to 0. Furthermore, the value of spread greatly affects the behavior of the RBF: the larger the spread, the smoother the function approximation. Here it is set at 50. After the NN structures were established, the values from January to December of 2004 were used as the inputs for the network. The consequent output was the forecasting result of January 2005. The maximal absolute percentage error (MaxAPE), mean absolute percentage error (MAPE), median absolute percentage error (MdAPE), and geometric mean relative absolute error (GMARE) were used as indicators of forecasting precision:

$$\mathrm{MaxAPE} = \max_{i}\left(\frac{\lvert x(i) - \hat{x}(i) \rvert}{x(i)}\right) \times 100, \quad i = 1, 2, \ldots, N$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\frac{\lvert x(i) - \hat{x}(i) \rvert}{x(i)} \times 100$$

$$\mathrm{MdAPE} = \operatorname*{median}_{i}\left(\frac{\lvert x(i) - \hat{x}(i) \rvert}{x(i)}\right) \times 100$$

where $x(i)$ is the electric energy consumption value in the ith month; $\hat{x}(i)$ represents its forecasting result; and N is the number of data used for the calculation, and

$$\mathrm{GMARE} = \left(\prod_{i=1}^{N}\frac{\lvert x(i) - \hat{x}(i) \rvert}{\lvert x(i) - \hat{x}^{*}(i) \rvert}\right)^{1/N}$$

where $\hat{x}^{*}(i)$ is the forecasting result obtained from the benchmark method. The results of the above indicators are listed in Table 2.
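The four indicators can be computed with a few lines of standard-library Python. This is a minimal sketch using the usual definitions of these error measures; the function names and the three-point example data are illustrative assumptions and are not taken from the case study.

```python
import math
import statistics

def absolute_percentage_errors(actual, forecast):
    """Per-point absolute percentage errors, in percent."""
    return [abs(x - f) / abs(x) * 100.0 for x, f in zip(actual, forecast)]

def max_ape(actual, forecast):
    return max(absolute_percentage_errors(actual, forecast))

def mape(actual, forecast):
    apes = absolute_percentage_errors(actual, forecast)
    return sum(apes) / len(apes)

def mdape(actual, forecast):
    return statistics.median(absolute_percentage_errors(actual, forecast))

def gmare(actual, forecast, benchmark):
    """Geometric mean of |error| / |benchmark error|.

    Assumes no forecast or benchmark error is exactly zero.
    """
    ratios = [abs(x - f) / abs(x - b) for x, f, b in zip(actual, forecast, benchmark)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Illustrative numbers only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 420.0]
benchmark = [120.0, 180.0, 440.0]
scores = (max_ape(actual, forecast), mape(actual, forecast),
          mdape(actual, forecast), gmare(actual, forecast, benchmark))
```

A GMARE below 1 means the evaluated method has smaller errors than the benchmark in the geometric-mean sense; here every error is half the benchmark error, so the ratio is 0.5.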
Compared with Method 2, the application of GM (Method 3) has similar forecasting precision (MAPE) with less forecasting risk (MaxAPE). This finding indicates that Method 3 performs better than Method 2 in terms of mitigating the stochastic effect on the primary trend. The application of DWT (Method 4) improves both the forecasting precision and forecasting risk. This result indicates the positive effects of DWT in simplifying the periodic waves and eliminating the stochastic effects. When Method 1 is used, the MAPE and maximum error are both further reduced. This finding indicates that Method 1 performs better than Method 2 in terms of both forecasting precision and risk. The variances of the absolute percentage error of Methods 1, 2, 3, and 4 were 2.72, 8.18, 5.65, and 2.82, respectively. The largest variance, that of Method 2, was the root cause of the smallest GMARE and made it possible to obtain the smallest MAPE. Furthermore, the wave amplitude of the actual data increases with time, as shown in Figures 3-5. The performance of Method 1 is acceptable for simulating this feature. The wave amplitude of Method 2 tends to be invariable, thereby resulting in poor forecasting performance.

Conclusions
The difficulties in forecasting monthly electric energy consumption are attributable to the excessive information in the values relative to the limited number of samples. In the current paper, the DWT is adopted to extract the features of monthly data into several relatively simple series. After the elimination of the stochastic series, the rising trend and other periodic waves are retained. To mitigate the stochastic effect on the primary trend, the GM is selected to model the rising trend of the retained components. Considering the increasing amplitude of retained periodic waves, the RBF NN is selected for use in simulation. Summing up the forecasting values of the GM and RBF NNs will yield the forecasting result of monthly electric energy consumption. The values for 24 consecutive months of electric energy consumption in China are forecasted to test the effect of the proposed method. Compared with the traditional method, the following conclusions are confirmed: first, the primary contribution of GM is the reduction of the maximum forecasting error; second, the application of DWT in forecasting may result in a simultaneous reduction of the MAPE and maximum error; and third, when GM and DWT are used simultaneously, the MAPE and maximum error are both further reduced. In summary, the method proposed in the present study performs better in terms of forecasting precision and expected risk.
In Equation (4), $b_1$, $b_2$, and $b_3$ are the shape parameters. In fact, by adjusting $b_1$, $b_2$, and $b_3$, Equation (4) can simulate any smooth convex trend.

Figure 2. Outputs of RBF NN with two nodes in the hidden layer.

Figure 3. Monthly electricity consumption in China from January 1990 to December 2006.

Figure 3 was decomposed using the db(5) wavelet and was further reconstructed by each frequency. The reconstructed results of each frequency are shown in Figure 4.

Figure 5. Curves of actual data based on the forecasting results of Methods 1 and 2.

Table 2. Results of forecasting precision indicators.