A Novel Hybrid Model Based on Extreme Learning Machine, k-Nearest Neighbor Regression and Wavelet Denoising Applied to Short-Term Electric Load Forecasting

Abstract: Electric load forecasting plays an important role in electricity markets and power systems. Because electric load time series are complicated and nonlinear, it is very difficult to achieve satisfactory forecasting accuracy. In this paper, a hybrid model, Wavelet Denoising-Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EWKM), which combines k-Nearest Neighbor (KNN) regression and Extreme Learning Machine (ELM) on the basis of a wavelet denoising technique, is proposed for short-term load forecasting. The proposed hybrid model first decomposes the time series into a main signal associated with low frequencies and several detail signals associated with high frequencies, then uses KNN to determine the independent and dependent variables from the low-frequency signal. Finally, the ELM is used to capture the nonlinear relationship between these variables and to produce the final prediction of the electric load. Compared with three other models, Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EKM), Wavelet Denoising-Extreme Learning Machine (WKM) and Wavelet Denoising-Back Propagation Neural Network optimized by k-Nearest Neighbor Regression (WNNM), the proposed model improves the forecasting accuracy. New South Wales is the economic powerhouse of Australia, so the proposed model is used to predict electric demand for that region, where accurate prediction is of practical significance.


• A novel hybrid model named ELM-WA-KNN is proposed for electric load forecasting in New South Wales, Australia.
• Wavelet analysis (WA) is introduced to eliminate the noise of the electric load time series.
• k-Nearest Neighbor regression (KNN) is used to obtain the input-output relationship in the hybrid model.
• The kernel function of KNN is established by extreme learning machine (ELM).
• The proposed ELM-WA-KNN model has the best performance among all the considered models.

Introduction
Electric load prediction is one of the major tasks of power management departments, which carry out electric power dispatching, usage and planning. Improving the accuracy of load forecasting helps power management planning and benefits the arrangement of power grid operations. It also helps to reduce power generation costs and to draw up power construction plans. Therefore, electric load forecasting has become an important part of the modernization of power system management.
In recent years, a large number of time series forecasting techniques have been used in electric load forecasting. Among traditional predictive methods, the linear regression model and the grey model have been widely used. Goia et al. proposed a linear regression model using heating demand data to forecast the short-term peak load in a district-heating system [1]. Zhou et al. presented a trigonometric grey prediction approach for forecasting electricity demand by combining the traditional GM (1,1) with the trigonometric residual modification technique [2]. Akay et al. proposed GPRM to predict Turkey's total and industrial electricity consumption [3]. Meanwhile, artificial neural networks and support vector machines have a variety of applications in electric load forecasting. Chang et al. proposed the so-called EEuNN framework, which adopts a weighted factor to calculate the importance of each factor among the different rules, to predict monthly electricity demand in Taiwan [4]. Kavaklioglu used SVR to model and predict Turkey's electricity consumption [5]. Wang et al. presented a combined ε-SVR model that considers seasonal proportions based on development trends in historical data to forecast short-term electricity demand [6]. Other predictive techniques have also been proposed to deal with the electric load forecasting problem. Kucukali et al. attempted to forecast Turkey's short-term gross annual electric demand by applying fuzzy logic methodology based on the economic, political and electricity market conditions of the country [7]. Dash et al. presented the development of a hybrid neural network to model a fuzzy expert system for time series forecasting of electric loads [8]. Taylor showed that, for predictions up to a day ahead, triple seasonal methods outperform double seasonal methods in predicting electricity demand [9].
However, no matter which technique is used to predict electric load, it is difficult to achieve high precision with a single model because each model has its own shortcomings [10]. Therefore, hybrid models are built from different combinations of data mining technologies to improve prediction accuracy in load forecasting research. Wu et al. proposed a Particle Swarm Optimization-Support Vector Machine (PSO-SVM) model based on cluster analysis techniques and data accumulation pretreatment for short-term electric load forecasting [11]. Chen et al. proposed a new electric load forecasting model by hybridizing FTS and GHSA with LSSVM [12]. Huang presented an SVR-based load forecasting model that hybridized the chaotic mapping function and QPSO with SVM [13]. Zhang et al. established a novel model for electric load forecasting by combining SSA, SVM and CS [14]. Hence, novel hybrid models that combine different data mining techniques can improve prediction precision in load forecasting research.
In this paper we also propose a hybrid model to predict electric loads. In non-parametric model research, KNN regression has been widely used in time series prediction. KNN regression is one of the historical approximation methods in machine learning. The main idea of the algorithm is to obtain the output by calculating the degree of similarity between the independent variables [15]. Poloczek proposed KNN regression as a geo-imputation preprocessing step for pattern-label-based short-term wind prediction of spatio-temporal wind data sets [16]. Ban proposed a new multivariate regression approach for financial time series based on knowledge shared from referential nearest neighbors [17]. Hu proposed a conjunction model named EMD-KNN, based on EMD and a KNN regressive model, for forecasting annual average rainfall [18]. Li established a novel hybrid of BAMO with KNN to predict apoptosis protein sequences using statistical factors and dipeptide composition [19]. Wang selected k-nearest neighbors to correct the commonly used precipitation data on the Qinghai-Tibetan plateau by establishing the relationship between daily precipitation and environmental as well as other meteorological factors [20].
Meanwhile, in the neural network research field, Huang established an innovative algorithm called extreme learning machine (ELM), which is based on the traditional single-hidden-layer feed-forward neural network [21]. Generally speaking, ELM not only reduces the training time of neural networks but also has good predictive performance, and it is faster than traditional machine learning algorithms. Ma built an adaptive prediction model based on ELM to predict traffic flows [22]. Masri targeted predicting the functional properties of soil samples by establishing a hybrid model based on a Savitzky-Golay filter for preprocessing and ELM for obtaining the nonlinear relationship [23]. Liao used ELM and economic theory to study stock price forecasting, and the results showed that ELM had the highest prediction accuracy [24]. Zhang researched the short-term prediction of wind with a model based on wavelet decomposition and ELM [25]. Shamshirband predicted horizontal global solar radiation using an extreme learning machine method, and the comparative results clearly showed that ELM can provide reliable predictions with better precision than traditional techniques [26].
In the field of signal processing, compared to traditional methods such as EMD and EEMD, the wavelet transform (WT) has a great ability to obtain the characteristics of data from the time and frequency domains [27]. Hence we can eliminate the noise in signals and capture the main information using WT. Wang targeted forecasting future stock prices by using wavelet analysis to denoise the time series and applying a neural network to obtain the nonlinear relationships [28]. Wang proposed a novel approach for short-term load forecasting by applying wavelet denoising in a combined model that is a hybrid of SARIMA and a neural network [29]. Qin predicted chaotic time series based on wavelet denoising, phase space reconstruction and LSSVM, and the proposed model performed better than traditional models [30]. Abbaszadeh proposed a new hybrid model that uses a wavelet denoising technique to denoise hydrological time series and applies ANN to obtain the best prediction of hydrological data [31].
The Australian state of New South Wales (abbreviated as NSW) is an active economic region in the Asia-Pacific with a large population. Its economic development and people's lives are inextricably tied to electricity, so the electricity department is concerned with ensuring sufficient power production. Accurate electric load forecasting is particularly important for this, because surplus electricity production leads to environmental pollution and wasted resources. Thus, effective electric load forecasting can be a useful indicator for decision-makers, and effective early warning of an increase in electricity demand is important to ensure the supply-demand balance. Based on these facts, we propose a new hybrid model based on the wavelet denoising technique, k-nearest neighbor regression and extreme learning machine to forecast the short-term electricity load in New South Wales.
This paper mainly consists of three parts: (1) the first part introduces the wavelet denoising technique, k-nearest neighbor regression (KNN), extreme learning machine (ELM) and the construction of the proposed hybrid model; (2) the second part presents the data set of the experiment, the evaluation criteria of the models, the predicted values of the electric load and the analysis of the comparative models; (3) the third part summarizes the predictive performance achieved for electric loads by the proposed ELM-WA-KNN hybrid model (EWKM).

Materials and Methods
The hybrid model proposed in this paper for short-term electric load forecasting consists of three basic data mining technologies: the wavelet denoising technique, k-nearest neighbor regression (KNN) and extreme learning machine (ELM). Detailed descriptions are given below.

Wavelet Denoising Technique
The wavelet transform proposed by Mallat has become one of the strongest mathematical tools for providing a time-frequency representation of an analyzed signal. Detail coefficients are produced by high-pass filters and approximation series are produced by low-pass filters [32]. With the development of wavelet analysis theory, the dyadic wavelet transform (DWT) of a discrete time series $y_t$ is given by Equation (1):

$$W_{m,n} = 2^{-m/2} \sum_{t=0}^{N-1} y_t \, \psi\!\left(\frac{t - n \cdot 2^{m}}{2^{m}}\right) \qquad (1)$$

where $s = 2^m$ and $\tau = 2^m \cdot n$ are the scale and location of the discrete wavelet, respectively, $N$ is the length of the series, which is an integer power of 2, and $\psi(\cdot)$ is the wavelet function. Through this transform, the DWT can eliminate the white noise of the time series and extract its useful information at different scales.

Extreme Learning Machine (ELM)
Extreme learning machine (ELM) is a type of single-hidden-layer feed-forward neural network (SLFN) in which the hidden-layer parameters are not tuned iteratively: the thresholds of the hidden layer and the weights between the input layer and the hidden layer are assigned randomly, and only the weights between the hidden layer and the output layer are computed. The structure of the ELM is shown in Figure 1.


Given a data set $\{(x_i, y_i)\}_{i=1}^{N}$, the output of an ELM with $L$ hidden nodes is:

$$\sum_{j=1}^{L} \beta_j \, g(w_j \cdot x_i + b_j) = y_i$$

where $i = 1, 2, \ldots, N$; $b_j$ stands for the learning parameter (threshold) of the $j$-th hidden node and $w_j$ represents the weight vector between the input layer and the $j$-th hidden node. Meanwhile, $\beta_j$ is the weight between the $j$-th hidden node and the output layer and $g(x)$ is the activation function of the hidden layer. Specifically, the radial basis function is selected as the activation function of the hidden nodes in the experiment. The ELM can therefore be written in matrix form as:

$$H\beta = Y$$

where $\beta = [\beta_1, \beta_2, \ldots, \beta_L]^T$ is the weight matrix between the hidden layer and the output layer, $Y = [y_1, y_2, \ldots, y_N]^T$ stands for the ELM output, and $H$ is the output matrix of the hidden layer, with entries $H_{ij} = g(w_j \cdot x_i + b_j)$. Two important ELM theorems by Liang should be mentioned here [33]:
Theorem 1. Given an SLFN with $L$ additive hidden nodes and an activation function $g(x)$ that is differentiable in any interval of $\mathbb{R}$, the output matrix of the hidden layer $H$ is invertible and $\|H\beta - T\| = 0$, where $T$ denotes the target matrix ($Y$ in the notation above).
Theorem 2. Given any small positive value $\varepsilon > 0$ and an activation function $g(x): \mathbb{R} \to \mathbb{R}$ that is differentiable in any interval, there exists $L \le N$ such that, for $N$ arbitrary distinct input vectors and hidden-node parameters randomly produced from any continuous probability distribution, $\|H\beta - T\| < \varepsilon$ with probability one.
Based on Theorems 1 and 2, the equation can be solved by the least squares method, and the result is:

$$\hat{\beta} = H^{-1} Y$$

where $H^{-1}$ denotes the Moore-Penrose pseudo-inverse of the hidden-layer output matrix $H$.
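The training procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a sigmoid activation instead of the radial basis function used in the experiment, and it replaces the Moore-Penrose pseudo-inverse with the normal equations plus a tiny ridge term (`ridge`, a hypothetical parameter) so that the example needs only the standard library. The names `train_elm` and `predict_elm` are illustrative.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_layer(X, W, b):
    """Return H with H[i][j] = g(w_j . x_i + b_j)."""
    return [[sigmoid(sum(wk * xk for wk, xk in zip(w, x)) + bj)
             for w, bj in zip(W, b)] for x in X]


def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def train_elm(X, y, n_hidden=10, ridge=1e-8, seed=0):
    rng = random.Random(seed)
    d = len(X[0])
    # Hidden-layer weights and thresholds are drawn at random and never tuned.
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_layer(X, W, b)
    # Least-squares output weights via (H^T H + ridge * I) beta = H^T y.
    A = [[sum(h[p] * h[q] for h in H) + (ridge if p == q else 0.0)
          for q in range(n_hidden)] for p in range(n_hidden)]
    rhs = [sum(h[p] * yi for h, yi in zip(H, y)) for p in range(n_hidden)]
    return W, b, solve(A, rhs)


def predict_elm(model, X):
    W, b, beta = model
    return [sum(bj * hj for bj, hj in zip(beta, h))
            for h in hidden_layer(X, W, b)]


# Toy usage: fit y = x^2 on a small grid.
X = [[i / 20.0] for i in range(41)]
y = [x[0] ** 2 for x in X]
model = train_elm(X, y)
pred = predict_elm(model, X)
```

Only the output weights are solved for, which is why training reduces to a single linear least-squares problem; with a numerical library the solve step would typically be a pseudo-inverse call instead.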

k-Nearest Neighbor Regression (KNN)
KNN is a non-parametric technique that derives from pattern recognition studies [34]. With the development of the study of nonlinear dynamics, many researchers have frequently used KNN to predict time series because the algorithm is able to capture the nonlinear properties of a time series. The main idea of KNN is that the similarity (neighborhood) between the independent variables of the predictor and those of the historical observations is calculated to obtain the best estimate for the predictor [15].
KNN applies a metric on the predictors to find the set of k nearest past neighbors of the current condition in the historical data set. To deal with the regression problem, Lall and Sharma proposed a kernel function, and the result of the KNN regression is obtained as follows [35]:

$$Y_i = \sum_{j=1}^{k} \frac{1/j}{\sum_{l=1}^{k} 1/l} \, Y_{i(j)}$$

where $Y_i$ represents the predicted value and $Y_{i(j)}$ stands for the magnitude of nearest neighbor $j$. Note that $j$ is the rank of the nearest neighbor, ordered by distance from the current condition $i$ ($j = 1$ to $k$). The similarity between the predictor and a historical observation depends on the distance given by the following formula:

$$D_{ij} = \sqrt{\sum_{t=1}^{q} \left(d_{it} - d_{jt}\right)^2}$$

where $d_{jt}$ is the $t$-th independent variable of $Y_j$, $d_{it}$ is the $t$-th independent variable of $Y_i$ and $q$ is the number of independent variables.
Establishing the kernel function of KNN is therefore the main concern in predicting the time series. Many researchers have tried different methods to build better-fitting kernel functions for regression. However, almost all of these kernel functions are based on a linear relationship and fail to capture nonlinear properties, so in this paper the kernel function is obtained by ELM, a popular data mining technique, in order to capture the nonlinear properties.
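For comparison with the ELM-based kernel, the traditional KNN regression can be sketched as below. The sketch assumes the rank-based kernel commonly attributed to Lall and Sharma, in which the j-th nearest neighbor receives weight (1/j) / Σ(1/l), together with Euclidean distance; the function names and the toy data are illustrative only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_regress(query, history, k):
    """Predict a target for `query` from `history`, a list of (features, target).

    The k nearest neighbours are ranked by distance and combined with
    rank-based weights (1/j) / sum_{l=1..k}(1/l).
    """
    ranked = sorted(history, key=lambda item: euclidean(query, item[0]))[:k]
    norm = sum(1.0 / j for j in range(1, k + 1))
    return sum((1.0 / j) / norm * target
               for j, (_, target) in enumerate(ranked, start=1))


# Toy usage with one-dimensional features.
history = [([0.0], 10.0), ([1.0], 20.0), ([5.0], 50.0)]
one_neighbour = knn_regress([0.2], history, k=1)
two_neighbours = knn_regress([0.2], history, k=2)
```

Because the weights depend only on the neighbor's rank, the prediction is always a fixed linear combination of the neighbor targets, which is the limitation the ELM-based kernel is meant to remove.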
Firstly, the electric load time series, including the historical data set, the training data set and the testing data set, is decomposed by DWT. The low-frequency signal and the high-frequency signal are regarded as the usable time series and white noise, respectively. This can be expressed as follows:

$$y_t = a_1(t) + d_1(t)$$

where $a_1$ is the low-frequency (approximation) signal and $d_1$ is the high-frequency (detail) signal.
Secondly, we select the last six electric load data as the input variables and the following one as the output variable of the hybrid model. The formula of the system is as follows:

$$x_i = f(x_{i-6}, x_{i-5}, x_{i-4}, x_{i-3}, x_{i-2}, x_{i-1})$$

Thirdly, the distance between a training (or testing) target and one of the historical targets is denoted by $d_{i,j}$. Suppose $x_i$ stands for any one of the training (or testing) targets and $\{x_{i-6}, x_{i-5}, x_{i-4}, x_{i-3}, x_{i-2}, x_{i-1}\}$ are the corresponding characteristic indicators of the training (or testing) data. In addition, $h_j$ is a historical target and $\{h_{j-6}, h_{j-5}, h_{j-4}, h_{j-3}, h_{j-2}, h_{j-1}\}$ are the corresponding characteristic indicators of the historical data. In this paper, the distance is calculated as the Euclidean distance:

$$d_{i,j} = \sqrt{\sum_{l=1}^{6} \left(x_{i-l} - h_{j-l}\right)^2}$$

Furthermore, the historical targets for $x_i$ and their corresponding distances can be expressed as $\{(d_{i,7}, h_7), (d_{i,8}, h_8), (d_{i,9}, h_9), \ldots, (d_{i,p}, h_p)\}$. Next, the distances are sorted in ascending order and the first $k$ historical targets are obtained as $h_{i(1)}, h_{i(2)}, \ldots, h_{i(k)}$.
Then, the kernel function can be obtained by ELM:

$$\hat{x}_i = f\left(h_{i(1)}, h_{i(2)}, \ldots, h_{i(k)}\right)$$

where the kernel function $f(\cdot)$ is trained by ELM. Traditionally, a simple linear relationship is used to build the kernel function, but that approach cannot deal with nonlinear relationships. Hence, ELM is selected to establish the kernel function because of its accuracy and high speed. Finally, the prediction value of the EWKM hybrid model is obtained. The basic structure of the proposed model is shown in Figure 2.
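The neighbor-selection step of the procedure above can be sketched as follows. The function name `neighbour_targets` and the toy series are hypothetical; the code assumes a window of six lags and Euclidean distance as stated in the text, uses 0-based indices where the text uses 1-based ones, and stops before the regression step, where the returned vector would be passed to the trained kernel function f (the ELM in the proposed model).

```python
import math

LAGS = 6  # number of preceding load values used as characteristic indicators


def neighbour_targets(window, history, k):
    """Return the k historical targets h_i(1), ..., h_i(k) for one query window.

    Each historical target history[j] (j >= LAGS) is scored by the Euclidean
    distance between `window` and the LAGS values that precede it.
    """
    scored = []
    for j in range(LAGS, len(history)):
        past = history[j - LAGS:j]
        dist = math.sqrt(sum((w - p) ** 2 for w, p in zip(window, past)))
        scored.append((dist, history[j]))
    scored.sort(key=lambda pair: pair[0])  # ascending distance, stable for ties
    return [target for _, target in scored[:k]]


# Toy usage: a linear ramp as "historical" data.
history = list(range(1, 13))          # 1, 2, ..., 12
window = [3, 4, 5, 6, 7, 8]           # matches history[2:8] exactly
nearest = neighbour_targets(window, history, k=3)
```

In the toy example the closest window is the exact match, so its successor (9) is ranked first, followed by the successors of the two windows shifted by one step.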

Study Area Description
New South Wales is a state on the east coast of Australia (Figure 3). The estimated population of NSW at the end of March 2016 was 7.7 million, making it Australia's most populous state. NSW is Australia's economic powerhouse and one of the most active economic regions in the Asia-Pacific region. Among its industrial sectors, the iron and steel industries are the most prominent. According to data from the Australian immigration information network, the state's GDP accounts for one third of Australia's gross domestic product, and more than 35% of the country's products and services are produced in NSW. Its steel production accounts for about 85% of Australia's total output, centered on the ports of Newcastle and Port Kembla. Coal and related products are the state's biggest exports; with a value of A$5 billion, they account for about 19% of all exports. Electricity is inseparable from both industry and people's lives, so the electricity department must ensure adequate power production.

Data Description
To verify the effectiveness of the proposed model, electric load data (MW) from NSW (Australia) are used as the experimental data. They were obtained from the Australian Energy Market Operator (http://www.aemo.com.au/) [36]. Since the proposed hybrid model EWKM contains KNN and ELM, the sample data are divided into three groups: the first group is the historical subset of KNN, which contains 17,568 data points (from 1 January 2016 00:30 to 1 January 2017 00:00); the second is the training subset of ELM, which contains 2832 data points (from 1 January 2017 00:30 to 1 March 2017 00:00); the remaining 2158 data points (from 1 March 2017 00:30 to 14 April 2017 23:30) form the testing subset of ELM. The three subsets are described in Table 1 and Figure 4.

ELM-KNN as a Simulation Tool
The purpose of this section is to examine the fitting capacity of the combination of ELM and KNN for solving complex simulation problems. Complex nonlinear systems frequently appear in engineering applications and make it difficult to establish a suitable model, so more research is needed on reasonable and efficient methods for establishing the nonlinear relationship between input and output. This section proposes a hybrid method that combines ELM and KNN. Three functions with different characteristics (F1, F2 and F3) are employed as benchmarks to test its fitting capacity; F3 is the Ackley function (d = 2). The variables x1 and x2 both range from −2 to 2 with a step size of 0.04, which means that each variable takes 101 possible values.
To demonstrate the fitting capacity of the proposed ELM-KNN method, three other methods, KNN, BPNN-KNN and SVM-KNN, are employed for comparison. All algorithms were implemented in Matlab 2014(b), and each algorithm was run 100 times so that average statistics could be computed. Table 2 shows the results, with the average sum error and the average run time taken as the criteria. The repeated experiments show that ELM-KNN can outperform KNN and the other hybrid algorithms in solving nonlinear simulation problems. The traditional KNN algorithm is suited to linear problems, so although its run time is short, its sum error is large. Therefore, this section combines KNN with other algorithms that are suitable for nonlinear fitting, giving the hybrid algorithms ELM-KNN, BPNN-KNN and SVM-KNN. Table 2 shows that the accuracy of these hybrid algorithms is similar, with ELM-KNN slightly better than BPNN-KNN and SVM-KNN. More importantly, a marked improvement can be seen in the run speed of ELM-KNN. Each function evaluation is virtually instantaneous on a modern PC. For example, the computation time of ELM-KNN on a 2.13 GHz desktop is between 4 and 5 s, which is much shorter than that of BPNN-KNN and SVM-KNN. These results show the high accuracy and efficiency of ELM-KNN, which make it a powerful tool for fitting.
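As an illustration of how such benchmark data can be generated, the sketch below evaluates the Ackley function on the grid described above and splits the shuffled samples into historical, training and testing subsets of 8201, 1500 and 500 points. It assumes the standard Ackley parameters a = 20, b = 0.2 and c = 2π, which the text does not state, and an arbitrary random seed.

```python
import math
import random


def ackley(x1, x2, a=20.0, b=0.2, c=2.0 * math.pi):
    """Two-dimensional Ackley function with the commonly used default constants."""
    d = 2
    mean_sq = (x1 * x1 + x2 * x2) / d
    mean_cos = (math.cos(c * x1) + math.cos(c * x2)) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


# Grid from -2 to 2 with step 0.04: 101 values per variable.
axis = [-2.0 + 0.04 * i for i in range(101)]
samples = [((x1, x2), ackley(x1, x2)) for x1 in axis for x2 in axis]

# Random ordering, then a three-way split for KNN history, ELM training and testing.
random.Random(0).shuffle(samples)
history, train, test = samples[:8201], samples[8201:9701], samples[9701:]
```

The global minimum of this function is 0 at the origin, which gives a quick sanity check of the implementation.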

Evaluation Criteria
The mean absolute error (MAE), the mean relative error (MRE) and the correlation coefficient (R) are used to evaluate the reliability of the EWKM model. MAE and MRE measure residual errors, which give a global idea of the difference between the observed and forecasted values. MAE evaluates the absolute error of the predicted values, while MRE reflects their average error relative to the observed values. The lower the values of MAE and MRE, the better the model. The correlation coefficient (R) describes the proportion of the total variance in the observed data that is captured by the forecasts, and the closer R is to one, the better.
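MAE, MRE and R are calculated as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

$$\mathrm{MRE} = \frac{1}{n}\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

$$R = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 \, \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$$

where $y_i$ is the observed value, $\hat{y}_i$ is the predicted value of $y_i$, $\bar{y}$ is the average of the observed values, $\bar{\hat{y}}$ is the average of the predicted values and $n$ is the number of observations in the validation set.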
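The three criteria can be computed as in the following sketch, which assumes the usual definitions (mean absolute error, mean absolute error relative to the observed value, and the Pearson correlation coefficient). The function names and the sample values are illustrative and are not taken from the experiment.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mre(obs, pred):
    """Mean relative error; observed values must be non-zero."""
    return sum(abs(o - p) / abs(o) for o, p in zip(obs, pred)) / len(obs)


def corr(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


# Toy usage with made-up load values in MW.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
scores = (mae(observed, predicted), mre(observed, predicted),
          corr(observed, predicted))
```

MAE is expressed in the units of the load (MW), whereas MRE and R are dimensionless, which is why the reported MAE is several orders of magnitude larger than the other two indicators.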
The Process of the Proposed Hybrid Model

Wavelet De-noising
The electrical demand is affected by a variety of factors, so electric load time series are usually accompanied by high noise. Directly forecasting the electric load from noisy data usually results in large errors. In this section, the DWT is applied to remove the noise from the observed data. In general, no normative procedure exists for selecting the decomposition level, so the selection is based on multiple experiments. In this study, the Daubechies wavelet of order 3 (db3) performs better, so db3 is adopted in the wavelet denoising process [29,37]. Considering the characteristics of the experimental data, and after testing different levels with db3, level 1 works best. The decomposition consists of the approximation coefficients at level 1 (a1) and the detail coefficients at level 1 (d1). The decomposition process of the experimental data is shown in Figure 5. The a1 series is a smoothed version of the original series and represents the low-frequency signal. Thus, a1 is selected to build the forecasting model.
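The idea of keeping only the level-1 approximation can be illustrated with a short sketch. To stay self-contained it substitutes the Haar wavelet for db3, so it is a simplified stand-in rather than the decomposition used in the experiment; the function names are illustrative and the input is assumed to have even length.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """Level-1 Haar transform: approximation (a1) and detail (d1) coefficients."""
    a = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    d = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    return a, d


def haar_idwt(a, d):
    """Inverse level-1 Haar transform."""
    out = []
    for ai, di in zip(a, d):
        out.extend([(ai + di) / SQRT2, (ai - di) / SQRT2])
    return out


def denoise(x):
    """Discard the detail coefficients and rebuild the low-frequency signal."""
    a, d = haar_dwt(x)
    return haar_idwt(a, [0.0] * len(d))


# Toy usage on a short series.
load = [1.0, 3.0, 5.0, 7.0]
smooth = denoise(load)
```

With the Haar wavelet the reconstructed low-frequency signal is simply the pairwise mean of the input; a longer filter such as db3 produces a smoother approximation, but the principle of zeroing d1 before reconstruction is the same.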

The Process of ELM-KNN
In Section 2.4, the EWKM hybrid model is established. The parameter of the hybrid model, k = 12, is selected through a large number of experimental tests. According to the division of the electric load time series, there are 2832 training data points from 1 January 2017 00:30 to 1 March 2017 00:00 (n = 2832) and 2158 testing data points from 1 March 2017 00:30 to 14 April 2017 23:30 (m = 2158). Meanwhile, the half-hourly electric load of the year 2016 (p = 17,568) is selected as the historical data set in the experiment. Finally, the prediction of the electric load can be calculated.

Results and Analysis

Results of the Proposed Model
The results obtained from the hybrid model EWKM agree well with the original values. As shown in Figure 6, the forecasting curve of EWKM closely follows the original one, which confirms that the EWKM model performs well in predicting the electric load series. As Table 3 shows, the MRE, R and MAE of the proposed EWKM model are 0.0262, 0.9660 and 196.7408, respectively. The mean relative error (MRE) is close to zero and the correlation coefficient (R) is close to one. This confirms that the EWKM model can capture the non-stationary and highly noisy features of the electric load series and that the new model can effectively improve the forecasting accuracy.

Model Comparisons
This section provides a comparison between the proposed EWKM model and three other benchmark models: EKM, WKM and WNNM. Note that the random selection of parameters in ELM and BPNN may cause different results from run to run. To account for this uncertainty, each method is run fifty times and the results of every run are recorded in Table 4; the resulting indicator values are summarized in Table 5. It should be noted that WKM is a non-parametric method whose result is a fixed value, so repeated experiments with WKM are meaningless. The mark "/" in Table 5 stands for non-existent statistical values.
To verify the EWKM model, this experiment compares it with EKM, WKM and WNNM using the same electric load data. The analysis is based on the mean values of the three indicators. As listed in Table 5, EWKM has the lowest MRE and MAE and the highest R among the four models. Comparing EKM with EWKM, the MRE and MAE decrease clearly after wavelet denoising is introduced into the model. Unlike the EKM model, which is constructed directly from the original data, EWKM decomposes the original data into low-frequency and high-frequency parts, and the results indicate that EWKM can capture the highly noisy and non-stationary features of electric load data after the wavelet denoising. It can therefore be concluded that the denoising process is crucial to predicting electric loads. Comparing WKM with EWKM, the three indicators MRE, R and MAE of WKM are 0.0290, 0.9607 and 219.6566, respectively, which are all inferior to those of EWKM. This indicates that optimizing the kernel function with the ELM is effective, so the ELM is necessary for predicting the electric load series. As for WNNM, the BPNN is an artificial neural network that is widely used to predict time series. When the BPNN is used to forecast electric loads, the two statistical errors MAE and MRE are clearly larger than those of the EWKM model. The comparison of the results of the four models shows that the proposed model EWKM can effectively improve forecasting accuracy. Therefore, it can be concluded that every part of the proposed model is suitable and reasonable. The proposed model EWKM, which includes denoising, KNN and ELM, constitutes a significant improvement in electric load forecasting.

Conclusions and Future Work
In electricity demand forecasting, noise signals caused by various unstable factors often corrupt electric load series. The contribution of this paper is a hybrid forecasting model based on wavelet denoising. Previous studies usually built models on the original data, whereas this paper models the low-frequency signal, which reduces the errors caused by noise. Moreover, the kernel function is traditionally constructed as a linear relationship, whereas this paper introduces the ELM into the construction of the kernel function and thereby optimizes the KNN algorithm. The experimental results indicate that every part of the new hybrid model is necessary for predicting future electric loads.
However, this paper only takes electric load data as the research subject, without taking other related variables into consideration. To address this limitation, future research should include other factors that may influence electric demand, such as population and GDP.

Figure 1. The structure of ELM.

Figure 2. The basic structure of the proposed model.

Figure 3. The study area location.

For the benchmark functions, x1 and x2 each have 101 possible values, so 10,201 groups of experimental data are produced. After random ordering, all data are divided into three groups: the first group is the historical subset of KNN, which contains 8201 data points (No. 1 to No. 8201); the second group is the training subset of ELM, which contains 1500 data points (No. 8202 to No. 9701); the remaining 500 data points (No. 9702 to No. 10,201) form the testing subset of ELM.

Figure 4. The electric load series of New South Wales: (a) historical data set; (b) training data set; (c) testing data set.

Figure 5. The process of wavelet denoising: (a) the decomposition of the historical data set; (b) the decomposition of the training and testing data sets. Note: a1, approximation signal; d1, detail signal.

Energies 2017, 10, x FOR PEER REVIEW

Figure 6. Plot of the original data and predictive values.

Table 1. Statistical parameters in each data set. Min is the minimum; Max is the maximum; Std is the standard deviation; S is the skewness; K is the kurtosis.

Table 2. The fitting performance of the four methods.

Table 3. Indicators of the EWKM.

Table 4. The performance of the EWKM model compared to EKM and WNNM.

The estimation performance of EWKM, EKM and WNNM is assessed by the statistical indicators MRE, MAE and R; the values are presented in Table 5.

Table 5. Comparison of the three criteria for the four models.