Article

A New Input Selection Algorithm Using the Group Method of Data Handling and Bootstrap Method for Support Vector Regression Based Hourly Load Forecasting

Department of Electrical and Computer Engineering, Pusan National University, Busan 46241, Korea
*
Author to whom correspondence should be addressed.
Energies 2018, 11(11), 2870; https://doi.org/10.3390/en11112870
Submission received: 10 September 2018 / Revised: 17 October 2018 / Accepted: 21 October 2018 / Published: 23 October 2018
(This article belongs to the Special Issue Machine Learning and Optimization with Applications of Power System)

Abstract

Electric load forecasting is indispensable for the effective planning and operation of power systems, because various decisions related to power systems depend on the future behavior of loads. In this paper, we propose a new input selection procedure that combines the group method of data handling (GMDH) and the bootstrap method for support vector regression (SVR) based hourly load forecasting. To construct a GMDH network, the learning dataset is divided into training and test datasets by bootstrapping. After constructing GMDH networks several times, the inputs that appear frequently in the input layers of the completed networks are selected as the significant inputs. Filter methods based on linear correlation and mutual information (MI) are employed as comparison methods, and the performance of hybrids of the filter methods and the proposed method is also examined, giving five input selection methods in total. To verify the performance of the proposed method, hourly load data from South Korea is used, and the results of one-hour, one-day and one-week-ahead forecasts are investigated. The experimental results demonstrate that the proposed method achieves higher prediction accuracy than the filter methods, and that, among the five methods, a hybrid of an MI-based filter with the proposed method shows the best prediction performance.

1. Introduction

Electric load forecasting (ELF) is essential for the effective and stable planning and operation of power systems [1]. Forecasting models are constructed based on historical load series and exogenous variables (e.g., weather, economic, and social factors), and the models are used to predict future loads for a specified period of time ahead. Various decisions related to power systems depend on the future behavior of loads, such as unit commitment, spinning reserve reduction, economic dispatch, automatic generation control, reliability analysis, maintenance scheduling, and energy commercialization [2,3]. ELF, especially, has major effects on deregulated electricity markets (e.g., demand response management [4,5,6]) and their participants because the prices on the markets are determined by the predicted future loads. Therefore, accurate and robust ELF can contribute greatly to decision making related to power systems. However, it is difficult to precisely forecast time-series loads because they exhibit a high degree of seasonality and nonlinear characteristics.
In recent decades, a wide range of methods has been proposed for ELF. In particular, artificial intelligence-based methods such as artificial neural networks (ANNs) [7,8,9,10] and support vector regression (SVR) [11,12,13,14,15] have been applied successfully to ELF due to their excellent learning capacity and nonlinear mapping ability without requiring any prior domain knowledge. However, ANNs based on the empirical risk minimization principle often suffer from over-fitting problems. Furthermore, when derivative-based optimization techniques are used for training ANNs, they are likely to be trapped in a local minimum. Compared to ANNs, over-fitting problems can be overcome in SVR because it is based on the structural risk minimization principle [16,17], i.e., the learning error and generalization capability are considered simultaneously.
The importance of ELF and its difficulty have motivated extensive studies, but the problem of input selection for ELF models remains an open question. In ELF, a major issue is selecting significant inputs (SIs) from many initial input candidates (IICs). Three main reasons necessitate input selection procedures [18,19]. First, properly selected inputs decrease the complexity of the model, and thus facilitate more efficient learning. Second, the forecasting performance can be improved by removing redundant inputs that are irrelevant to the outputs and dependent on other inputs. Finally, we can gain valuable insights into the fundamental features of loads and their prediction mechanism. In the following, we summarize several previous studies on input selection methods for load forecasting.
Ghofrani et al. [20] proposed a new input selection framework by combining correlation analysis and the l2-norm for short-term load forecasting (STLF). Koprinska et al. [21] used mutual information (MI), RReliefF, and a correlation-based method for feature selection of the load forecasting model. Sheikhan and Mohammadi [22] developed a feature selection method by combining a genetic algorithm with ant colony optimization for STLF. Tikka and Hollmén [23] proposed sequential backward input selection for ELF, which is based on a linear model and a cross-validation resampling procedure. Sorjamaa et al. [24] combined a direct prediction strategy with three input selection criteria, i.e., k-nearest neighbors approximation method, MI, and nonparametric noise estimation, for ELF. Da Silva et al. [25] used filter methods based on phase-space embedding and MI and Bayesian wrapper methods for ANN-based ELF. Hu et al. [19] proposed a hybrid filter-wrapper approach with a partial MI and firefly algorithm for STLF feature selection.
In general, the methods mentioned above can be categorized as filter (model-free) or wrapper (model-based) methods [26,27,28,29]. In filter methods, after statistical analyses between the potential inputs and the output, the inputs with strong relationships to the output are selected as SIs. In wrapper methods, the accuracy of learning machines is used as the selection criterion, and the best input combinations are explored by sequential search procedures such as backward elimination, forward selection, stepwise selection, and metaheuristics. Filter methods can select SIs without learning machines, but they do not consider prediction accuracy, and thus they often perform worse than wrapper methods. Furthermore, linear correlation (LC) analysis, frequently used in filter methods, can degrade the forecasting performance of nonlinear predictors because it cannot identify nonlinear relationships between inputs and the output. In wrapper approaches, the best input combinations are determined by the sequential search procedures, so the computational complexity grows exponentially with the number of IICs. In addition, depending on the structure and parameter identification methods used for the wrappers, biased input selection results may be obtained.
In summary, instead of only considering the one-to-one statistical correlations, input selection methods for data-driven load forecasting should be able to tackle both the many-to-one nonlinear relationships between dependent and independent variables and the accuracy of the forecasting machines.
In this paper, we propose a new input selection procedure based on the group method of data handling (GMDH) and the bootstrap method for ELF. Although there are several previous studies in which GMDH is employed for ELF [30,31], this paper is the first attempt to use GMDH solely for input selection. Generally, GMDH networks have been widely used for modeling the nonlinear relationships between variables. Compared with the previous approaches, this paper focuses on the fact that, based only on a given dataset, the GMDH algorithm can not only determine the network structure but also select the inputs with significant explanatory power for predicting the outputs. In the GMDH algorithm, networks in which polynomial neurons are hierarchically connected with each other are constructed automatically using learning datasets collected from target systems. Among a large number of IICs, the elements remaining in the input layer of the finally constructed GMDH network are considered the relevant inputs. The learning dataset must be divided into training and test datasets to construct the GMDH network; in this study, the bootstrap method is applied for this data division. Because the learning dataset is divided randomly by bootstrap sampling, the relevant inputs differ each time. Therefore, in the proposed method, after constructing GMDH networks several times using bootstrapping, the inputs that appear frequently in the input layers of the completed networks are finally selected as the SIs.
The main advantages of the proposed input selection method combining the GMDH and bootstrap method can be briefly summarized as follows. First, based on the GMDH network structures in which the polynomial neurons are hierarchically connected with each other (see Figure 1), the method can select significant inputs by taking many-to-one nonlinear relationships between explanatory and target variables into consideration. Second, since prediction accuracy is used as the input selection criterion (see Section 2.2), it is expected that the method achieves improved forecasting performance compared to filter methods based only on the statistical analyses.
To verify the performance of the proposed method, we use hourly load data from South Korea, which exhibits seasonality as well as weekly and daily periodicities. LC- and MI-based filter methods are employed as comparison methods. In addition, hybrid approaches that combine the filter methods with the proposed method are examined, so five input selection methods are investigated in total. Prediction models are constructed via ν-SVR, and we compare the one-hour, one-day, and one-week-ahead forecasting performance of the five input selection methods.
The remainder of this paper is organized as follows. Section 2 explains the procedure for GMDH network building. In Section 3, the new input selection method based on GMDH and bootstrap sampling, LC- and MI-based filter methods, and hybrid approaches are described. Section 4 briefly summarizes ν-SVR. In Section 5, we present the experimental results, and discuss the results in detail in Section 6. Finally, we give our conclusions in Section 7.

2. Group Method of Data Handling

Ivakhnenko first proposed the GMDH algorithm, a self-organizing modeling technique, in 1968 [32,33,34]. In the GMDH method, as shown in Figure 1, complex and nonlinear modeling is performed by a GMDH network in which polynomial neurons are hierarchically connected in a forward direction.
A quadratic two-variable polynomial is commonly used as a transfer function for each neuron and their coefficients can be estimated by the least-squares method. When the number of entering inputs and/or the order of the polynomials become larger, the number of parameters to be estimated rapidly increases. To build a GMDH network, only the learning dataset collected from target systems is needed, and relevant inputs as well as the network structure can be determined automatically.
Let $D = \{\mathbf{x}_k; y_k\}_{k=1}^{n}$ be a learning dataset for GMDH network building, where $\mathbf{x} = [x_1, x_2, \ldots, x_M]^T$ is an input vector composed of IICs and M is the number of IICs. In layer 1, q = M(M − 1)/2 neurons covering all pairwise combinations of the initial inputs are generated (e.g., with M = 168 IICs, q = 14,028 candidate neurons per layer), and only the outputs of the M neurons that satisfy a selection criterion are used as inputs for the next layer. In the same manner, q neurons are generated in layer 2 from the M neurons selected in the previous layer. This layer extension process continues in a forward direction until a stopping criterion is satisfied. After stopping the extension process, the output layer and output neuron are fixed, and all of the elements connected with the output neuron are found sequentially by backward search.

2.1. Parameter Estimation for Polynomial Neurons

In the GMDH network, the output, z, of an arbitrary neuron is calculated as
$$z = f(x_u, x_l; \boldsymbol{\theta}) = \theta_0 + \theta_1 x_u + \theta_2 x_l + \theta_3 x_u^2 + \theta_4 x_l^2 + \theta_5 x_u x_l, \tag{1}$$
where $x_u$ and $x_l$ are the two inputs of the neuron and $\boldsymbol{\theta} = [\theta_0, \ldots, \theta_5]^T$ is composed of its polynomial coefficients. The same learning dataset D is employed repeatedly to estimate the parameters of each neuron. After substituting D into (1), the n linear equations are formulated in concise matrix form, i.e., $X\boldsymbol{\theta} = \mathbf{z}$. Using the least-squares method, the parameter vector $\boldsymbol{\theta}$ can be optimized as
$$\hat{\boldsymbol{\theta}} = (X^T X)^{-1} X^T \mathbf{z}. \tag{2}$$
In (2), if the determinant of $X^T X$ is close to zero, the estimator $\hat{\boldsymbol{\theta}}$ is susceptible to round-off errors and its performance deteriorates. In this study, the polynomial coefficients are therefore estimated using singular value decomposition, as follows [35,36]:
$$\hat{\boldsymbol{\theta}}^{+} = X^{+}\mathbf{z} = V S^{+} U^T \mathbf{z}, \tag{3}$$
where $X^{+}$ is the pseudo-inverse of X, the columns of the matrices V and U correspond to the eigenvectors of $X^T X$ and $X X^T$, respectively, and the main diagonal of $S^{+}$ is composed of the reciprocals of the r nonzero singular values.
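As an illustration, the following sketch (the function names are ours, not from the paper) fits the six coefficients of (1) with an SVD-based pseudo-inverse, which is how numpy.linalg.pinv computes it:

```python
import numpy as np

def fit_neuron(xu, xl, y):
    """Estimate the six coefficients of the quadratic neuron in Eq. (1)
    via the SVD-based pseudo-inverse of Eq. (3)."""
    X = np.column_stack([np.ones_like(xu), xu, xl, xu**2, xl**2, xu * xl])
    return np.linalg.pinv(X) @ y  # pinv uses SVD and truncates tiny singular values

def neuron_output(xu, xl, theta):
    """Evaluate Eq. (1) for two input signals and fitted coefficients."""
    X = np.column_stack([np.ones_like(xu), xu, xl, xu**2, xl**2, xu * xl])
    return X @ theta
```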

2.2. Construction of the GMDH Network

To build the GMDH network, layer extension is performed in the forward direction until the stopping criterion is satisfied. Next, the output layer and neuron are fixed and a search for all of the elements connected to the output neuron is conducted by backward tracing, i.e., from the output to input layer. After finishing the search in the backward direction, we can obtain not only the GMDH network structure, but also relevant inputs in the input layer.
In the proposed method, the checking error criterion (CEC) [37] is employed: (1) to select the neurons whose outputs will be used as inputs for the next layer; and (2) to determine whether the layer extension process should be stopped. The number of layers in a GMDH network is closely related to the model's complexity. The deeper the network, the more complex its structure; in this case, many unnecessary inputs can be selected together with significant inputs. On the other hand, if the network is too shallow, input variables with great explanatory power can be missed. To calculate the CEC, the learning dataset D is separated into dataset A, $D_A = \{\mathbf{x}_k^A; y_k^A\}_{k=1}^{n_A}$, and dataset B, $D_B = \{\mathbf{x}_k^B; y_k^B\}_{k=1}^{n_B}$, using the bootstrap method [38]. In the bootstrap method, n samplings with replacement are applied to D, yielding n data pairs; each data pair in D has the same probability of being selected during the sampling process. Among the n sampled data pairs, only the unique pairs comprise $D_A$ for training, and $D_B$ consists of the remaining pairs, i.e., $D_B = D \setminus D_A$, for testing. The ratio of $D_A$ to $D_B$ is approximately 6:4.
Using $D_A$, the parameter vector of a neuron is estimated by (3), and its outputs are calculated on $D_B$, $\hat{y}_k^B = f\left((x_u)_k^B, (x_l)_k^B; (\hat{\boldsymbol{\theta}}^{+})^A\right)$, $k = 1, \ldots, n_B$. The CEC of the neuron is computed by
$$\mathrm{CEC} = \frac{1}{n_B} \sum_{k=1}^{n_B} \left( y_k^B - \hat{y}_k^B \right)^2. \tag{4}$$
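A minimal sketch of the bootstrap division and the CEC computation (the helper names are ours):

```python
import numpy as np

def _design(u, l):
    """Design matrix of the quadratic neuron in Eq. (1)."""
    return np.column_stack([np.ones_like(u), u, l, u**2, l**2, u * l])

def bootstrap_split(n, rng):
    """Divide {0, ..., n-1} into DA (unique bootstrap draws, ~63% of the
    indices on average, i.e., roughly a 6:4 split) and DB (the remainder)."""
    drawn = rng.integers(0, n, size=n)                 # n samplings with replacement
    idx_a = np.unique(drawn)
    return idx_a, np.setdiff1d(np.arange(n), idx_a)

def neuron_cec(xu, xl, y, idx_a, idx_b):
    """Fit on DA via the pseudo-inverse (Eq. (3)); return the CEC (Eq. (4)) on DB."""
    theta = np.linalg.pinv(_design(xu[idx_a], xl[idx_a])) @ y[idx_a]
    resid = y[idx_b] - _design(xu[idx_b], xl[idx_b]) @ theta
    return np.mean(resid ** 2)
```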
After calculating the CEC for the q neurons in the ith layer, i.e., CEC(i, j), j = 1, …, q, the values are arranged in ascending order and only the M neurons with the smallest CEC values are selected. If the early stopping condition CEC(i − 1) ≤ CEC(i), i ≥ 2, where CEC(i) = minj{CEC(i, j)}, j = 1, …, q, is satisfied, or the current layer is equal to the predefined maximum layer, the layer extension is halted. Then, the output layer p* and output neuron q* are selected as follows:
$$p^* = \arg\min_i \{\mathrm{CEC}(i)\}, \tag{5}$$
$$q^* = \arg\min_j \{\mathrm{CEC}(p^*, j)\}, \quad j = 1, \ldots, q. \tag{6}$$
After selecting the output neuron, all of the elements connected with the neuron are found by a sequential search in the backward direction. In each layer, neurons without any connections to the output neuron are removed and the remaining neurons and their connections are preserved. The GMDH network is finally constructed after finishing the backward search from the output to input layer. Algorithm 1 describes the procedure for constructing the GMDH network.
Algorithm 1. Constructing the GMDH network.
Input: learning dataset D
Divide D into DA and DB using the bootstrap method
Imax ← maximum layer
BN ← {x1, …, xM} // BN denotes the set of 'best neurons'
for i from 1 to Imax
 Create q = M(M − 1)/2 neurons in the ith layer based on BN
 for j from 1 to q
  Estimate $(\hat{\boldsymbol{\theta}}^{+})^A$ of the jth neuron using DA
  Calculate CEC(i, j) for the jth neuron using DB
 end
 CEC(i) = minj{CEC(i, j)}, j = 1, …, q
 if i ≥ 2 then
  if CEC(i − 1) ≤ CEC(i) then
   break this loop
  end
 end
 BN ← M neurons in the ith layer with the smallest CEC(i, j), j = 1, …, q
end
p* ← argmini{CEC(i)}
q* ← argminj{CEC(p*, j)}
Remove all neurons in the p*th layer except for the q*th neuron
p ← p*
repeat
 p ← p − 1
 Find the neurons in the pth layer connected with neurons in the (p + 1)th layer
 Remove all neurons and their connections in the pth layer except for the found neurons
until p == 0
return GMDH network structure and a set of relevant inputs, $X' = \{x'_1, \ldots, x'_m\} \subseteq \{x_1, \ldots, x_M\}$
From the finally constructed GMDH network, the inputs in the input layer that remain connected with the output neuron are regarded as relevant inputs; a condensed sketch of this construction is given below.
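The following condensed sketch of Algorithm 1 (reusing fit_neuron and neuron_output from the Section 2.1 sketch) grows layers forward, keeps the M best neurons per layer by CEC, stops when the CEC no longer improves, and returns only the relevant input set; tracking each layer's best neuron and its input dependencies is sufficient for this purpose, so the full network structure is not stored:

```python
import numpy as np
from itertools import combinations

def build_gmdh(XA, yA, XB, yB, M, I_max=5):
    """Return the indices of the original inputs on which the output
    neuron of the best layer depends (Eqs. (5)-(6)). XA/yA: training
    data (DA); XB/yB: checking data (DB); M >= 2 neurons kept per layer."""
    ZA, ZB = XA, XB                               # signals entering the current layer
    deps = [{j} for j in range(XA.shape[1])]      # original inputs behind each signal
    layer_best = []                               # (CEC(i), deps of best neuron) per layer
    for i in range(I_max):
        cands = []
        for u, l in combinations(range(ZA.shape[1]), 2):
            theta = fit_neuron(ZA[:, u], ZA[:, l], yA)
            err = yB - neuron_output(ZB[:, u], ZB[:, l], theta)
            cands.append((np.mean(err ** 2), u, l, theta))   # CEC, Eq. (4)
        cands.sort(key=lambda c: c[0])
        if i >= 1 and layer_best[-1][0] <= cands[0][0]:
            break                                 # CEC(i-1) <= CEC(i): stop extension
        keep = cands[:M]                          # M neurons with smallest CEC
        layer_best.append((keep[0][0], deps[keep[0][1]] | deps[keep[0][2]]))
        ZA = np.column_stack([neuron_output(ZA[:, u], ZA[:, l], th) for _, u, l, th in keep])
        ZB = np.column_stack([neuron_output(ZB[:, u], ZB[:, l], th) for _, u, l, th in keep])
        deps = [deps[u] | deps[l] for _, u, l, _ in keep]
    _, best_deps = min(layer_best, key=lambda t: t[0])        # output layer and neuron
    return sorted(best_deps)
```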

3. Input Selection Method

In this study, to select the inputs for ELF models, we employed: (1) LC- and MI-based filter methods, (2) GMDH-based and bootstrapping-based input selection (i.e., the proposed method), and (3) hybrid methods that combine filter methods with GMDH-based input selection.

3.1. Filter Methods Based on LC and MI

In this subsection, we explain the LC- and MI-based filter methods (i.e., the comparison methods). In the filter methods, the ranking function (RF) values, RF(xj, y), j = 1, …, M, between all of the potential inputs and the output are calculated, and the inputs whose RF values are greater than or equal to a threshold value RFth are selected as SIs. More details regarding the LC and MI used as RFs are available elsewhere [28,29,39,40]. Among the M IICs {x1, …, xM}, the m SIs {x1st, x2nd, …, xmth} are determined as
$$\{x_{1\text{st}}, x_{2\text{nd}}, \ldots, x_{m\text{th}}\} = \{x_j \mid RF(x_j, y) \geq RF_{\text{th}},\ j = 1, \ldots, M\}, \tag{7}$$
where the condition $RF(x_{1\text{st}}, y) > \cdots > RF(x_{m\text{th}}, y)$ is satisfied.
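A minimal sketch of this rank-and-threshold filter (the function name is ours; scikit-learn's mutual_info_regression is a k-nearest-neighbor MI estimator in the spirit of [40], which may differ from the estimator used here):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def filter_select(X, y, rf="lc", rf_th=0.8):
    """Select the column indices whose ranking-function value meets the
    threshold (Eq. (7)), returned in descending RF order. The default
    thresholds in Section 5 are 0.8 for LC and 0.6 for normalized MI."""
    if rf == "lc":
        scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    else:  # "mi"
        scores = mutual_info_regression(X, y)
        scores = scores / scores.max()          # normalize MI to [0, 1] as in Figure 5
    selected = np.flatnonzero(scores >= rf_th)
    return selected[np.argsort(scores[selected])[::-1]]
```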

3.2. The Proposed Method: GMDH-Based and Bootstrapping-Based Input Selection

As explained in Section 2, the components of the input layer in the finished GMDH network are considered as relevant inputs. Since D is divided randomly into $D_A$ and $D_B$ by bootstrap sampling, the input selection results vary in each experiment; moreover, the relevant inputs obtained by constructing only a single network may be biased. Thus, in the proposed method, input selection is performed according to the following procedure. First, GMDH networks are constructed r times under the same experimental conditions. The set of relevant inputs from the ith experiment is denoted as $X_i$, i = 1, …, r, and their union is $X = \bigcup_{i=1}^{r} X_i$. Then, in the union X, the number of occurrences, $N_j$, of each IIC, $x_j$, j = 1, …, M, is counted. Finally, the inputs for which $N_j$ is greater than or equal to a predefined threshold, $N_{th}$, are selected as SIs. Algorithm 2 describes the GMDH-based and bootstrap-based input selection procedure.
Algorithm 2. GMDH-based and bootstrap-based input selection.
Input: learning dataset D
r ← the number of network constructions
Nth ← threshold value
X ← {}
for i from 1 to r
 Divide D into DA and DB using bootstrap method
 Select relevant input set Xi using Algorithm 1
 X ← X ∪ Xi
end
Count the number of occurrences, Nj, of each xj, j = 1, …, M, in X
{x1st, x2nd, …, xmth} = {xj | Nj ≥ Nth, j = 1, …, M}
return {x1st, x2nd, …, xmth}
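A Python counterpart of Algorithm 2 (reusing bootstrap_split from Section 2.2 and build_gmdh from Section 2; the defaults mirror the values used in Section 5):

```python
import numpy as np
from collections import Counter

def gmdh_bootstrap_select(X, y, M, r=30, n_th=15, seed=0):
    """Run r bootstrap/GMDH constructions and keep the inputs that appear
    in at least n_th of the r relevant input sets."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    for _ in range(r):
        idx_a, idx_b = bootstrap_split(len(y), rng)
        counts.update(build_gmdh(X[idx_a], y[idx_a], X[idx_b], y[idx_b], M))
    return sorted(j for j, n in counts.items() if n >= n_th)
```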

3.3. Hybrid Input Selection Method

The hybrid input selection procedures that combine the filter methods with the proposed method are carried out in two phases. First, many redundant inputs are eliminated by applying the LC- or MI-based filter method to the IICs. Then, the proposed input selection procedure described in Algorithm 2 is applied to the remaining inputs, as sketched below.
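A sketch of the two-phase hybrid for the 'MI + GMDH' variant, keeping the top third of the IICs to match the two-thirds filtered out in Section 5.2 (reuses gmdh_bootstrap_select from the sketch in Section 3.2):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def hybrid_mi_gmdh(X, y, M, r=30, n_th=15):
    """Phase 1: keep the top third of IICs by MI; phase 2: Algorithm 2."""
    keep = np.argsort(mutual_info_regression(X, y))[::-1][: X.shape[1] // 3]
    chosen = gmdh_bootstrap_select(X[:, keep], y, M, r=r, n_th=n_th)
    return sorted(int(keep[j]) for j in chosen)   # map back to original indices
```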

4. ν-Support Vector Regression

In this section, we briefly summarize ν-SVR as proposed by Schölkopf et al. [41]. Let $\{(\mathbf{x}_i; y_i)\}_{i=1}^{l}$ denote a collected learning dataset, where $\mathbf{x}_i \in \mathbb{R}^n$ is an n-dimensional input vector and $y_i \in \mathbb{R}$ is the target output. The basic idea of SVR is to find a linear regression function after transforming the original space into a high-dimensional feature space; the function is defined as
$$f(\mathbf{x}) = \mathbf{w}^T \phi(\mathbf{x}) + b, \tag{8}$$
where ϕ(·) is a nonlinear mapping function from the input space to the feature space, and $\mathbf{w}$ and b are parameters of f that should be estimated from the learning dataset. The constrained optimization problem of ν-SVR is defined by introducing two positive slack variables, $\xi_i$ and $\xi_i^*$, as follows [42]:
$$\min_{\mathbf{w}, b, \boldsymbol{\xi}, \boldsymbol{\xi}^*, \varepsilon}\ \frac{1}{2}\|\mathbf{w}\|^2 + C\left(\nu\varepsilon + \frac{1}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^*)\right) \quad \text{subject to} \quad \begin{cases} (\mathbf{w}^T\phi(\mathbf{x}_i) + b) - y_i \leq \varepsilon + \xi_i, \\ y_i - (\mathbf{w}^T\phi(\mathbf{x}_i) + b) \leq \varepsilon + \xi_i^*, \\ \xi_i, \xi_i^* \geq 0,\ i = 1, \ldots, l,\ \varepsilon \geq 0, \end{cases} \tag{9}$$
where $\frac{1}{2}\|\mathbf{w}\|^2$ is a regularization term, the parameter $\nu \in (0, 1]$ controls the number of support vectors and training errors, and C is a regularization constant that determines the tradeoff between the model's complexity and its accuracy. The training errors of the regression function f are penalized by $\xi_i$ and $\xi_i^*$ if they are larger than ε, and the size of ε is traded off against the model's complexity and the slack variables via the constant ν [41]. As described in (9), SVR avoids under-fitting and over-fitting by minimizing both the regularization term and the training errors.
By introducing Lagrange multipliers αi and αi*, and applying Karush-Kuhn-Tucker conditions [43], the constrained optimization problem given by (9) is reformulated as follows:
$$\min_{\boldsymbol{\alpha}, \boldsymbol{\alpha}^*}\ \frac{1}{2}(\boldsymbol{\alpha} - \boldsymbol{\alpha}^*)^T Q (\boldsymbol{\alpha} - \boldsymbol{\alpha}^*) + \mathbf{y}^T(\boldsymbol{\alpha} - \boldsymbol{\alpha}^*) \quad \text{subject to} \quad \begin{cases} \sum_{i=1}^{l}(\alpha_i - \alpha_i^*) = 0, \\ \sum_{i=1}^{l}(\alpha_i + \alpha_i^*) \leq C\nu, \\ 0 \leq \alpha_i, \alpha_i^* \leq C/l,\ i = 1, \ldots, l, \end{cases} \tag{10}$$
where Q is the kernel matrix whose entry in the ith row and jth column is $Q_{i,j} = K(\mathbf{x}_i, \mathbf{x}_j) \equiv \phi(\mathbf{x}_i)^T \phi(\mathbf{x}_j)$. The kernel function $K(\mathbf{x}_i, \mathbf{x}_j)$, defined in the input space, equals the inner product of $\mathbf{x}_i$ and $\mathbf{x}_j$ in the feature space, so the nonlinear mapping and the inner products can be calculated in the original input space. In this paper, among the widely used kernel functions, such as the polynomial, radial basis function (RBF), and sigmoid kernels, we employ the RBF kernel, defined as
$$K(\mathbf{x}_i, \mathbf{x}) = \exp\left(-\gamma \|\mathbf{x}_i - \mathbf{x}\|^2\right), \tag{11}$$
where γ controls the width of the RBF function. By solving (10), we can obtain the approximated regression function as follows:
$$\hat{f}(\mathbf{x}) = \sum_{i=1}^{l} (\alpha_i^* - \alpha_i) K(\mathbf{x}_i, \mathbf{x}) + b. \tag{12}$$
The prediction accuracy of ν-SVR depends on the selection of appropriate design parameters, i.e., ν, C, and γ. Trial and error, cross-validation, grid search, and global optimization have been widely applied to determine the parameter values [2]. Because the aim of this paper is to examine the performance of the proposed input selection method, the procedure for selecting the design parameters is not described in detail; in this study, the method presented in [3] is used to determine them.
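As an illustration, ν-SVR with an RBF kernel can be run through scikit-learn's NuSVR class, which is built on LIBSVM; the data and parameter values below are placeholders rather than the values selected by the method of [3]:

```python
import numpy as np
from sklearn.svm import NuSVR  # scikit-learn's NuSVR wraps LIBSVM

rng = np.random.default_rng(0)
X_train = rng.random((200, 7))           # e.g., five SIs plus calendar features
y_train = np.sin(X_train.sum(axis=1))    # toy target for the example
model = NuSVR(nu=0.5, C=10.0, kernel="rbf", gamma=0.1)  # illustrative values
model.fit(X_train, y_train)
y_hat = model.predict(X_train[:5])       # one-step-ahead predictions
```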

5. Experimental Results

To verify the performance of the proposed method, hourly load data from South Korea between 1 January 2012 and 31 December 2014 was used. Figure 2 shows the hourly load curves for South Korea during one year (i.e., 2013) and Figure 3 shows an example of the weekly profile for the curves during three weeks from 7 January 2013 until 27 January 2013.
As shown in Figure 2 and Figure 3, the target loads exhibit clear seasonality with weekly and daily periodicities. The load demand is usually higher in the summer and winter seasons than in the spring and autumn seasons. The demand for electricity is higher on weekdays (Monday to Friday) than on weekends, and the load demand is slightly higher on Saturday than on Sunday. The minimum load on Monday is lower than that on the other working days. In addition, the load demand on special days (e.g., national holidays and election days) exhibits remarkably unusual behavior. In this paper, we are not concerned with hourly load forecasting on special days, i.e., we focus only on hourly load forecasting for ordinary days.
To validate the seasonal forecasting performance, we selected three validation months, i.e., April, July and November 2014, where the objective is to predict their hourly load demands one hour, one day, and one week ahead. For the one-day and one-week-ahead forecasts, which correspond to multi-step-ahead predictions, we employed the recursive strategy [44] explained in Appendix A.

5.1. Data Preparation

To prepare the learning dataset, we considered historical time-series load data {φt, t = 1, …, N}. For example, to predict the hourly load demand from 1 to 30 April in 2014, the load data from 1 January 2012 to 31 March 2014 is employed to prepare the learning data. After organizing the learning data matrix described in (13) using N historical data, the max-min normalization process is applied to each column as presented in (14).
$$\Phi = \begin{bmatrix} \varphi_d & \varphi_{d-1} & \cdots & \varphi_1 & \varphi_{d+1} \\ \varphi_{d+1} & \varphi_d & \cdots & \varphi_2 & \varphi_{d+2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \varphi_{N-1} & \varphi_{N-2} & \cdots & \varphi_{N-d} & \varphi_N \end{bmatrix}. \tag{13}$$
$$\Phi'_{i,j} = \frac{\Phi_{i,j} - \min_i\{\Phi_{i,j}\}}{\max_i\{\Phi_{i,j}\} - \min_i\{\Phi_{i,j}\}}, \tag{14}$$
where $\Phi_{i,j}$ is the original component in the ith row and jth column of Φ and $\Phi'_{i,j}$ is its normalized value. The learning data matrix is an (N − d) × (d + 1) matrix, where d is the window size, i.e., the number of IICs. In this paper, to capture the daily and weekly periodicities of the target load, we chose a one-week window size, i.e., d = 168. The last column of Φ is composed of the desired outputs and the remaining columns consist of the 168 IICs; each row vector of Φ corresponds to an input-output learning data pair.
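A minimal sketch of this construction (the helper name is ours; it assumes phi is a 1-D NumPy array with phi[0] = φ1 and no constant columns):

```python
import numpy as np

def learning_matrix(phi, d=168):
    """Build the (N - d) x (d + 1) learning matrix of Eq. (13) and
    min-max normalize each column as in Eq. (14)."""
    N = len(phi)
    cols = [phi[d - 1 - j : N - 1 - j] for j in range(d)]  # lag columns, most recent first
    Phi = np.column_stack(cols + [phi[d:]])                # last column: desired outputs
    return (Phi - Phi.min(axis=0)) / (Phi.max(axis=0) - Phi.min(axis=0))
```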
To take account of the seasonality, we reformulated a new matrix Φnew composed of the row vectors of Φ, where the last column’s components (i.e., desired outputs) only correspond to the load values of a validation month and the previous month. For example, to forecast the hourly load during April in 2014, Φnew is made up of row vectors where the desired outputs correspond to the hourly load during March and April in 2012 and 2013 and March in 2014.
Finally, the row vectors that contain hourly load values on special days are removed from Φnew; that is, if even one component of a row vector corresponds to the hourly load of a special day, the row vector is discarded. This is necessary because the hourly load curves for special days have different shapes from those for normal days. If the prediction models for normal days were trained on learning data containing load values from special days, biased forecasting results could be obtained.
The input selection methods explained in Section 3 were applied to the learning dataset prepared according to the procedures described above.

5.2. Input Selection Results

In this subsection, we present the results of applying the input selection methods to the prepared learning dataset. As explained in Section 3, we employed five input selection methods, which are abbreviated as follows.
(1)
‘LC’: LC-based filter method.
(2)
‘MI’: MI-based filter method.
(3)
‘GMDH’: the proposed method described in Algorithm 2.
(4)
‘LC + GMDH’: hybrid input selection method combining ‘LC’ and ‘GMDH’.
(5)
‘MI + GMDH’: hybrid input selection method combining ‘MI’ and ‘GMDH’.
(Note: The acronyms, LC and MI, denote the RFs, but the abbreviations, ‘LC’ and ‘MI’, refer to the LC- and MI-based filter methods.)
Figure 4 and Figure 5 show the LC and MI values calculated using the prepared learning dataset for predicting the loads in April, July and November in 2014.
In Figure 5, the MI values are normalized to the range [0, 1]. Figure 4 and Figure 5 demonstrate that the shapes of the LC and MI curves differ across the validation months because the target loads exhibit distinct seasonal behaviors; daily and weekly periodicities can also be observed in these figures. In the filter methods based on LC and MI, after rearranging the calculated RF values in descending order, the m inputs with RF values greater than or equal to the predefined RFth are finally selected as SIs. Figure 6 and Figure 7 show the input selection results using 'LC' and 'MI'.
As shown in Figure 6 and Figure 7, the threshold values for LC and MI are set to 0.8 and 0.6, respectively, and they are indicated by thick solid black lines. The selected inputs are indicated by dashed red lines in the figures.
As explained in Section 3.2, in 'GMDH', GMDH networks are constructed independently r times under the same conditions; in this paper, r is set to 30. The maximum layer, Imax, is set to 5 based on several trial-and-error experiments. Note that Imax corresponds to the maximum acceptable depth of the GMDH networks. Through several prior experiments, we confirmed that the forecasting performance may be degraded when Imax is too large (e.g., Imax ≥ 7) or too small (e.g., Imax ≤ 3). Moreover, if Imax is set to an excessively large value and, consequently, too many inputs are selected, it becomes difficult to intuitively understand the predictive principles of the target data. Table 1 lists the results of applying Algorithm 1 30 times to the prepared learning dataset for predicting the loads in April 2014.
After the relevant input sets Xi are merged into the union $X = X_1 \cup \cdots \cup X_{30}$, the frequency of each element in X, Nj (j = 1, …, 168), is counted. Next, the Nj are arranged in descending order and the m inputs that satisfy the condition Nj ≥ Nth are finally selected, where Nth is the threshold value of Nj. In the hybrid methods, after removing many redundant inputs using 'LC' or 'MI', 'GMDH' is applied to the remaining inputs; in this study, two-thirds of the IICs were filtered out by 'LC' or 'MI'. Figure 8, Figure 9 and Figure 10 show the input selection results using 'GMDH', 'LC + GMDH' and 'MI + GMDH', respectively.
As shown in Figure 8, Figure 9 and Figure 10, the threshold value, Nth, is set to 15 and is marked by thick solid black lines in the figures; the inputs for which Nj ≥ Nth are enclosed within dashed red lines. Table 2 summarizes the SIs selected by the five methods for each validation month (i.e., April, July, and November 2014). As listed in Table 2, the loads at the same hour during the past several days (i.e., φt−23 and φt−167) as well as several recent loads (i.e., φt and φt−1) are useful for predicting future loads.
In addition to the SIs listed in Table 2, we also employed the binary-valued vectors used in [45], i.e., $\mathbf{w} \in \{0,1\}^7$ and $\mathbf{h} \in \{0,1\}^{24}$, in the ELF models to keep track of the daily and weekly periodicities, where w and h are zero vectors with a 1 in the position of the day of the week and the hour of the day, respectively, for the load under consideration.
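For concreteness, a small sketch of these encodings (the 0-based index conventions are our assumption):

```python
import numpy as np

def calendar_vectors(day_of_week, hour_of_day):
    """One-hot day-of-week (7 bits) and hour-of-day (24 bits) vectors
    appended to the SIs, following [45]."""
    w = np.zeros(7, dtype=int)
    w[day_of_week] = 1      # assumed convention: 0 = Monday
    h = np.zeros(24, dtype=int)
    h[hour_of_day] = 1      # assumed convention: 0 = hour 00
    return w, h

# e.g., a Wednesday at 14:00
w, h = calendar_vectors(2, 14)
```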

5.3. Forecasting Results

After conducting the input selection procedures, ν-SVR learning was carried out with the SIs and binary-valued vectors, w and h. For example, when the load demand during April 2014 was predicted with ‘LC’, the input vector for ν-SVR corresponds to x = [φt, φt−1, φt−165, φt−166, φt−167, wt+1, ht+1]T. In this paper, we used LIBSVM [46] to implement ν-SVR. The accuracy of the ELF was measured using six performance indices, which were also employed in [2,3], i.e., mean absolute percentage error (MAPE), symmetric mean absolute percentage error (SMAPE), mean absolute error (MAE), normalized mean squared error (NMSE), relative error percentage (REP) and magnitude of maximum error (MME). In the following, we only present the results of the performance comparisons using MAPE, which is defined as
$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|A_i - F_i|}{A_i} \times 100, \tag{15}$$
where Ai and Fi are the actual and predicted values for the ith validation data point, respectively, and N is the number of validation data points. The performance comparisons using the other indices are presented in Appendix B. Table 3 and Figure 11 illustrate the MAPE values for each validation month and the overall values for the five input selection methods.
In Table 3, the best entries among 'LC', 'MI' and 'GMDH' are indicated by a superscript plus sign and the best entries among the five input selection methods are highlighted by a superscript asterisk. When H = 1, i.e., for one-hour-ahead forecasting, the MAPE values of the validation months are similar to each other, but when H = 24 and 168, they differ considerably. This can be attributed to the fact that load variations in July and November are higher than those in April because of seasonal effects; accordingly, the MAPE values for multi-step-ahead forecasting in July and November are larger than those in April.
As illustrated in Table 3 and Figure 11, 'GMDH' shows the best prediction performance among 'LC', 'MI', and 'GMDH' in terms of the overall MAPE, except for one-week-ahead forecasting. Among the five methods, 'MI + GMDH' shows the best forecasting performance, except for the one-day-ahead forecast for November 2014. Let us examine how much 'MI + GMDH' improves the overall MAPE compared with the other methods. For one-hour-ahead forecasting, the percentage improvements with 'MI + GMDH' are 8.19%, 6.79%, 3.80%, and 5.72% over 'LC', 'MI', 'GMDH', and 'LC + GMDH', respectively. For one-day and one-week-ahead forecasting, 'MI + GMDH' improves the overall MAPE compared with the other methods by 17.35%, 6.45%, 4.91%, and 8.64%, and by 9.34%, 2.93%, 3.64%, and 5.65%, respectively. Figure 12 shows box plots of the overall absolute percentage errors; as shown there, 'MI + GMDH' exhibits superior performance compared with the other methods. Figure 13 shows examples of the actual and predicted load curves obtained using 'MI + GMDH' and ν-SVR (due to space constraints, the load curves for the other validation months and periods are not presented); the predicted values are very similar to the actual ones. Figure 14 shows a histogram illustrating the one-hour-ahead prediction errors of 'MI + GMDH' and ν-SVR for July 2014.
As shown in Figure 14, the errors are centered symmetrically on zero and there are no severe outliers. Figure 15 illustrates the results of linear regression analysis between the actual and predicted load values by ‘MI + GMDH’ and ν-SVR for July 2014.
The results for the other validation months are not presented due to space limitations. In Figure 15, the X and Y axes correspond to the actual and predicted load values, respectively, the black circles are scatter plots of the data points, i.e., (X, Y), and the dashed black lines and thick solid blue lines indicate the straight lines, Y = X, and the best linear regression lines, respectively. The r2 values presented above the figures represent the relationships between X and Y. A value close to 1 indicates that X and Y have a strong linear relationship. As illustrated in Figure 15, we can confirm that there are strong linear relationships between the actual and predicted load values.

6. Discussion

Based on the experimental results presented in Section 5, we can highlight several key findings. Let us begin with the comparisons between the filter methods (i.e., 'LC' and 'MI') and 'GMDH'. For one-hour and one-day-ahead forecasting, 'GMDH' performed better than the filter methods in terms of the overall MAPE. There are two main reasons for this improvement: (1) 'GMDH' can capture many-to-one nonlinear relationships between the inputs and output via its hierarchical network structure, whereas 'LC' and 'MI' can only capture one-to-one linear and nonlinear relationships, respectively; and (2) unlike the filter methods, which select SIs based on statistical RFs, 'GMDH' employs prediction accuracy (i.e., the CEC) as its input selection criterion.
Second, we consider the performance comparisons between 'GMDH' and the hybrid methods. In all cases, 'MI + GMDH' obtained the best forecasts in terms of the overall MAPE. Because the load series has nonlinear characteristics and ν-SVR with a nonlinear kernel (i.e., the RBF kernel) was used for the prediction models, the hybrid method that combines 'MI' with 'GMDH' performed better. The prediction results of 'LC + GMDH' were poor compared with 'GMDH' alone because the LC-based filtering procedure may remove inputs with weak linear but strong nonlinear relationships with the output.
Finally, let us discuss the performance of the five methods in terms of their computational time requirements. Table 4 lists the computational time required by each input selection method for April 2014.
The learning dataset is composed of 2904 input-output pairs, and 168 IICs comprise the input part of each pair. The computer used to measure the computational time had 8 GB of RAM and a 2.8 GHz quad-core CPU. The results clearly show that the computational efficiency of 'GMDH' is higher than that of 'MI'. The computational time required by 'LC + GMDH' is much less than that of 'GMDH' alone because the number of inputs handled by 'GMDH' is reduced to one-third of the IICs by the LC-based filtering procedure.

7. Conclusions

In this study, we proposed a new input selection procedure, which combines GMDH with the bootstrap method, for SVR-based short-term hourly load forecasting. After constructing GMDH networks many times under the same experimental conditions, the inputs that appear frequently in the input layers are finally selected as SIs. The networks are constructed several times because the relevant inputs differ between runs owing to the random division of the learning dataset; indeed, constructing only a single network could yield biased input selection results. In the experimental assessments, we employed LC- and MI-based filter methods for comparison and also verified the performance of two hybrid methods, so five input selection methods were examined in total. To illustrate the performance of the proposed method, an hourly load dataset from South Korea was used, and the one-hour, one-day and one-week-ahead forecasting performances were compared.
The experimental results showed that the proposed method can select SIs in an effective manner. The forecasting performance of the proposed method (i.e., 'GMDH') was better than that of the LC- and MI-based filter methods. Specifically, in the one-hour, one-day, and one-week-ahead forecasts, 'GMDH' improves the overall MAPE values by 0.024, 0.226, and 0.124, respectively, over 'LC'. Compared with 'MI', although the overall MAPE with H = 168 worsens by 0.014, 'GMDH' improves the overall MAPE values by 0.016 and 0.025 when H = 1 and 24, respectively. In addition, the computational efficiency of the proposed method was higher than that of the MI-based filter method. Among the five methods, 'MI + GMDH' achieved the best prediction accuracy: when H = 1, 24, and 168, it outperforms 'GMDH' with improvements of 0.019, 0.073, and 0.072 in the overall MAPE values, respectively.
Let us summarize the main reasons for the improved performance of the proposed method. First, the proposed method can perform input selection by capturing many-to-one nonlinear relationships between potential inputs and output via the hierarchical structure of the GMDH network. Second, compared with the filter methods, the prediction accuracy (i.e., CEC) is employed as a selection criterion, which improves the prediction performance. Finally, SIs are selected by constructing GMDH networks many times, thereby facilitating more robust input selections.
In future research, the proposed method will be applied to hourly load forecasting on special days and daily peak load forecasting. Moreover, the proposed method can be used in various real-world applications such as financial time-series analysis, process monitoring, and nonlinear function approximations.

Author Contributions

J.Y. analyzed the data and wrote the paper. The analysis results and the paper were supervised by J.H.P. and S.K.

Acknowledgments

This work was supported by the Human Resources Program in Energy Technology of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20174030201770).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Recursive Strategy for Multi-Step-Ahead Prediction

In H-step-ahead forecasting, the objective is to predict H future time-series data, {φt, t = N + 1, …, N + H}, using N historical data, {φt, t = 1, …, N}, where H ≥ 1 is called the prediction horizon, i.e., the values of H for one-hour, one-day, and one-week-ahead hourly load forecasting correspond to 1, 24, and 168, respectively.
In the recursive strategy [44], after constructing one-step-ahead prediction models, multi-step predictions are performed by feeding back the predicted values recursively as inputs. The following describes a one-step-ahead time-series predictor:
$$\varphi_{t+1} = f(\varphi_t, \ldots, \varphi_{t-d+1}) + w,$$
where only the selected inputs are used for the predictor f, w is the zero-mean error term, and d is the window size, i.e., the number of IICs. After building the predictor using the historical dataset, H-step-ahead predictions are carried out as follows:
$$\hat{\varphi}_{N+h} = \begin{cases} \hat{f}(\varphi_N, \ldots, \varphi_{N-d+1}) & \text{if } h = 1, \\ \hat{f}(\hat{\varphi}_{N+h-1}, \ldots, \hat{\varphi}_{N+1}, \varphi_N, \ldots, \varphi_{N-d+h}) & \text{if } h \in \{2, \ldots, d\}, \\ \hat{f}(\hat{\varphi}_{N+h-1}, \ldots, \hat{\varphi}_{N-d+h}) & \text{if } h \in \{d+1, \ldots, H\}. \end{cases}$$
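A minimal sketch of this recursion for a fitted one-step-ahead model (for simplicity, the full d-lag window is passed to the model; in the paper, only the selected SIs and the calendar vectors would be extracted from it):

```python
import numpy as np

def recursive_forecast(model, history, H, d=168):
    """H-step-ahead forecast by feeding one-step predictions back as inputs."""
    buf = list(history[-d:])                  # phi_{N-d+1}, ..., phi_N
    preds = []
    for _ in range(H):
        x = np.array(buf[-d:])[::-1]          # phi_t, phi_{t-1}, ..., phi_{t-d+1}
        preds.append(float(model.predict(x.reshape(1, -1))[0]))
        buf.append(preds[-1])                 # recursion: prediction becomes an input
    return preds
```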

Appendix B. Performance Evaluation Using Five Indices Excluding MAPE

Excluding MAPE, the performance indices used in this study, i.e., SMAPE, MAE, NMSE, REP, and MME, are defined as follows:
$$\mathrm{SMAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|A_i - F_i|}{(|A_i| + |F_i|)/2},$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |A_i - F_i|,$$
$$\mathrm{NMSE} = \frac{1}{\Delta^2 N} \sum_{i=1}^{N} (A_i - F_i)^2, \quad \text{where} \quad \Delta^2 = \frac{1}{N-1} \sum_{i=1}^{N} (A_i - \bar{A})^2,$$
$$\mathrm{REP} = \sqrt{\frac{\sum_{i=1}^{N} (A_i - F_i)^2}{\sum_{i=1}^{N} A_i^2}} \times 100,$$
$$\mathrm{MME} = \max_i \left( |A_i - F_i| \right), \quad i = 1, \ldots, N,$$
where Ai and Fi are the actual and predicted values of the ith validation data point, respectively, Ā is the mean of the actual values, and N is the number of validation data points. Table A1 lists the results of the performance comparisons of the five input selection methods using these five indices.
Table A1. Performance comparisons of the five input selection methods using the five indices.
IndicesH‘LC’‘MI’‘GMDH’‘LC + GMDH’‘MI + GMDH’
SMAPE10.00530.00520.0051 +0.00520.0049 *
240.01730.01520.0150 +0.01560.0142 *
1680.02090.0195 +0.01970.02010.0190 *
MAE1314.60310.02300.51 +306.45289.27 *
241046.18918.81903.30 +943.22860.06 *
1681245.111162.16 +1171.441199.361128.97 *
NMSE10.00500.00480.0045 +0.00460.0041 *
240.05080.04010.0390 +0.04090.0358 *
1680.06550.0600 +0.06090.06290.0580 *
REP10.7060.6970.678 +0.6900.651 *
242.3352.0812.048 +2.1191.961 *
1682.6352.511 +2.5182.5792.456 *
MME11710.361561.011489.93 +1548.481420.95 *
244549.244162.12 +4217.884152.854061.22 *
1684610.244484.984389.05 +4502.894354.41 *
Due to space limitations, only the overall indices for the three validation months (i.e., April, July, and November 2014) are presented in Table A1. In Table A1, the best entries among 'LC', 'MI' and 'GMDH' are indicated by a superscript plus sign, and the best entries among the five input selection methods are indicated by a superscript asterisk. As listed in Table A1, among 'LC', 'MI' and 'GMDH', the proposed method improves the four indices other than MME for one-hour and one-day-ahead forecasting, and improves MME for one-hour and one-week-ahead forecasting, compared with the filter methods. Among the five methods, 'MI + GMDH' obtains the best predictions with respect to all the indices for one-hour, one-day and one-week-ahead forecasting.
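For reference, a compact sketch computing all six indices from Section 5.3 and this appendix:

```python
import numpy as np

def accuracy_indices(A, F):
    """MAPE (Eq. (15)) and the five appendix indices for actual values A
    and forecasts F."""
    A, F = np.asarray(A, float), np.asarray(F, float)
    e = A - F
    return {
        "MAPE":  np.mean(np.abs(e) / A) * 100,
        "SMAPE": np.mean(np.abs(e) / ((np.abs(A) + np.abs(F)) / 2)),
        "MAE":   np.mean(np.abs(e)),
        "NMSE":  np.mean(e ** 2) / np.var(A, ddof=1),  # Delta^2: sample variance
        "REP":   np.sqrt(np.sum(e ** 2) / np.sum(A ** 2)) * 100,
        "MME":   np.max(np.abs(e)),
    }
```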

References

1. Senjyu, T.; Takara, H.; Uezato, K.; Funabashi, T. One-hour-ahead load forecasting using neural network. IEEE Trans. Power Syst. 2002, 17, 113–118.
2. Nagi, J.; Yap, K.S.; Nagi, F.; Tiong, S.K.; Ahmed, S.K. A computational intelligence scheme for the prediction of the daily peak load. Appl. Soft Comput. 2011, 11, 4773–4788.
3. Elattar, E.E.; Goulermas, J.; Wu, Q.H. Electric load forecasting based on locally weighted support vector regression. IEEE Trans. Syst. Man Cybern. C Appl. Rev. 2010, 40, 438–447.
4. Apostolopoulos, P.A.; Tsiropoulou, E.E.; Papavassiliou, S. Demand Response Management in Smart Grid Networks: A Two-Stage Game-Theoretic Learning-Based Approach. Mob. Netw. Appl. 2018, 1–14.
5. Shi, W.; Li, N.; Xie, X.; Chu, C.C.; Gadh, R. Optimal residential demand response in distribution networks. IEEE J. Sel. Areas Commun. 2014, 32, 1441–1450.
6. Maharjan, S.; Zhu, Q.; Zhang, Y.; Gjessing, S.; Basar, T. Dependable demand response management in the smart grid: A Stackelberg game approach. IEEE Trans. Smart Grid 2013, 4, 120–132.
7. Hippert, H.S.; Pedreira, C.E.; Souza, R.C. Neural networks for short-term load forecasting: A review and evaluation. IEEE Trans. Power Syst. 2001, 16, 44–55.
8. Taylor, J.W.; Buizza, R. Neural network load forecasting with weather ensemble predictions. IEEE Trans. Power Syst. 2002, 17, 626–632.
9. Chen, Y.; Luh, P.B.; Guan, C.; Zhao, Y.; Michel, L.D.; Coolbeth, M.A.; Friedland, P.B.; Rourke, S.J. Short-term load forecasting: Similar day-based wavelet neural networks. IEEE Trans. Power Syst. 2009, 25, 322–330.
10. Felice, M.D.; Yao, X. Short-term load forecasting with neural network ensembles: A comparative study. IEEE Comput. Intell. Mag. 2011, 6, 47–56.
11. Chen, B.J.; Chang, M.W.; Lin, C.J. Load forecasting using support vector machines: A study on EUNITE competition 2001. IEEE Trans. Power Syst. 2004, 19, 1821–1830.
12. Ceperic, E.; Ceperic, V.; Baric, A. A strategy for short-term load forecasting by support vector regression machines. IEEE Trans. Power Syst. 2013, 28, 4356–4364.
13. Fan, G.F.; Peng, L.L.; Hong, W.C.; Sun, F. Electric load forecasting by the SVR model with differential empirical mode decomposition and auto regression. Neurocomputing 2016, 173, 958–970.
14. Che, J.; Wang, J. Short-term load forecasting using a kernel-based support vector regression combination model. Appl. Energy 2014, 132, 602–609.
15. Ghelardoni, L.; Ghio, A.; Anguita, D. Energy load forecasting using empirical mode decomposition and support vector regression. IEEE Trans. Smart Grid 2013, 4, 549–556.
16. Vapnik, V. The Nature of Statistical Learning Theory; Springer: Berlin, Germany, 1995; ISBN 9780387987804.
17. Kim, J.S. Vessel Target Prediction Method and Dead Reckoning Position Based on SVR Seaway Model. Int. J. Fuzzy Logic Intell. Syst. 2017, 17, 279–288.
18. Sindelar, R.; Babuska, R. Input selection for nonlinear regression models. IEEE Trans. Fuzzy Syst. 2004, 12, 688–696.
19. Hu, Z.; Bao, Y.; Xiong, T.; Chiong, R. Hybrid filter–wrapper feature selection for short-term load forecasting. Eng. Appl. Artif. Intell. 2015, 40, 17–27.
20. Ghofrani, M.; Ghayekhloo, M.; Arabali, A.; Ghayekhloo, A. A hybrid short-term load forecasting with a new input selection framework. Energy 2015, 81, 777–786.
21. Koprinska, I.; Rana, M.; Agelidis, V.G. Correlation and instance based feature selection for electricity load forecasting. Knowl.-Based Syst. 2015, 82, 29–40.
22. Sheikhan, M.; Mohammadi, N. Neural-based electricity load forecasting using hybrid of GA and ACO for feature selection. Neural Comput. Appl. 2012, 21, 1961–1970.
23. Tikka, J.; Hollmén, J. Sequential input selection algorithm for long-term prediction of time series. Neurocomputing 2008, 71, 2604–2615.
24. Sorjamaa, A.; Hao, J.; Reyhani, N.; Ji, Y.; Lendasse, A. Methodology for long-term prediction of time series. Neurocomputing 2007, 70, 2861–2869.
25. Da Silva, A.P.A.; Ferreira, V.H.; Velasquez, R.M.G. Input space to neural network based load forecasters. Int. J. Forecast. 2008, 24, 616–629.
26. Tran, H.D.; Muttil, N.; Perera, B.J.C. Selection of significant input variables for time series forecasting. Environ. Model. Softw. 2015, 64, 156–163.
27. Crone, S.F.; Kourentzes, N. Feature selection for time series prediction–A combined filter and wrapper approach for neural networks. Neurocomputing 2010, 73, 1923–1936.
28. Guyon, I.; Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
29. May, R.; Dandy, G.; Maier, H. Review of input variable selection methods for artificial neural networks. IntechOpen 2011.
30. Abdel-Aal, R.E. Short-term hourly load forecasting using abductive networks. IEEE Trans. Power Syst. 2004, 19, 164–173.
31. Elattar, E.E.; Goulermas, J.Y.; Wu, Q.H. Generalized locally weighted GMDH for short term load forecasting. IEEE Trans. Syst. Man Cybern. C Appl. Rev. 2011, 42, 345–356.
32. Madala, H.R.; Ivakhnenko, A.G. Inductive Learning Algorithms for Complex Systems Modeling; CRC Press: Boca Raton, FL, USA, 1994; ISBN 0-8493-4438-7.
33. Mueller, J.A.; Lemke, F. Self-Organising Data Mining: An Intelligent Approach to Extract Knowledge from Data; Libri: Hamburg, Germany, 2000; ISBN 9783898118613.
34. Yu, J.; Kim, S. Locally-weighted polynomial neural network for daily short-term peak load forecasting. Int. J. Fuzzy Logic Intell. Syst. 2016, 16, 163–172.
35. Burden, R.L.; Faires, J.D. Numerical Analysis; Brooks/Cole, Cengage Learning: Boston, MA, USA, 2011; ISBN 9780538733519.
36. Strang, G. Linear Algebra and Its Applications; Thomson, Brooks/Cole: Belmont, CA, USA, 2005; ISBN 9780534422004.
37. Chiu, S.L. Selecting input variables for fuzzy models. J. Intell. Fuzzy Syst. 1996, 4, 243–256.
38. Han, J.; Kamber, M.; Pei, J. Data Mining: Concepts and Techniques; Elsevier: Amsterdam, The Netherlands, 2011; ISBN 9780123814791.
39. Rossi, F.; Lendasse, A.; François, D.; Wertz, V.; Verleysen, M. Mutual information for the selection of relevant variables in spectrometric nonlinear modelling. Chemom. Intell. Lab. Syst. 2006, 80, 215–226.
40. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E 2004, 69.
41. Schölkopf, B.; Smola, A.J.; Williamson, R.C.; Bartlett, P.L. New support vector algorithms. Neural Comput. 2000, 12, 1207–1245.
42. Chang, C.C.; Lin, C.J. Training ν-support vector regression: Theory and algorithms. Neural Comput. 2002, 14, 1959–1977.
43. Meng, Q.; Ma, X.; Zhou, Y. Forecasting of coal seam gas content by using support vector regression based on particle swarm optimization. J. Nat. Gas Sci. Eng. 2014, 21, 71–78.
44. Taieb, S.B.; Sorjamaa, A.; Bontempi, G. Multiple-output modeling for multi-step-ahead time series forecasting. Neurocomputing 2010, 73, 1950–1957.
45. Espinoza, M.; Suykens, J.A.K.; Belmans, R.; De Moor, B. Electric load forecasting. IEEE Control Syst. 2007, 27, 43–57.
46. Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2.
Figure 1. Group method of data handling (GMDH) network. In each layer, polynomial neurons are described by squares and they constitute a hierarchical and forward network structure. Here, the number of input variables entering each neuron is limited to 2.
Figure 2. Hourly load curves for South Korea from 1 January 2013 to 31 December 2013. Dashed red lines indicate the main holidays in South Korea, i.e., Lunar New Year’s Day and Korean Thanksgiving Day.
Figure 3. Example of the weekly load profile from South Korea over three weeks, i.e., from 7 January 2013 to 27 January 2013. The weekly and daily periodicities are clearly evident.
Figure 4. Linear correlation (LC) values calculated using the prepared learning dataset for predicting the loads in: (a) April 2014; (b) July 2014; (c) November 2014.
Figure 5. Normalized mutual information (MI) values calculated using the prepared learning dataset for predicting the loads in: (a) April 2014; (b) July 2014; (c) November 2014.
Figure 6. Results of the LC-based filter method ('LC') for predicting the loads in: (a) April 2014; (b) July 2014; (c) November 2014. The threshold values and selected inputs are indicated by thick solid black lines and dashed red lines, respectively.
Figure 7. Results of MI-based filter method (‘MI’) for predicting the loads in: (a) April 2014; (b) July 2014; (c) November 2014.
Figure 8. Results of the proposed method ('GMDH') for predicting the loads in: (a) April 2014; (b) July 2014; (c) November 2014. The threshold values, Nth, and selected inputs are indicated by thick solid black lines and dashed red lines, respectively.
Figure 9. Results of ‘LC + GMDH’ for predicting the loads in: (a) April 2014; (b) July 2014; (c) November 2014.
Figure 10. Results of ‘MI + GMDH’ for predicting the loads in: (a) April 2014; (b) July 2014; (c) November 2014.
Figure 11. Bar chart showing the mean absolute percentage error (MAPE) values for each validation month and the overall values. H = 1, H = 24, and H = 168 correspond to one-hour, one-day and one-week-ahead forecasting, respectively.
Figure 12. Overall absolute percentage errors for the three validation months based on comparisons between the actual and predicted loads: (a) H = 1; (b) H = 24; (c) H = 168.
Figure 13. Examples of the actual and predicted hourly load curves obtained using ‘MI + GMDH’, and their errors from 1 to 7 July 2014: (a) H = 1; (b) H = 24; (c) H = 168; (d) Errors.
Figure 14. Histogram showing the one-hour-ahead prediction errors (=actual loads − predicted loads) by ‘MI + GMDH’ and ν-SVR for July 2014. The solid red line is estimated by the kernel method.
Figure 15. Linear regression analysis between the actual and predicted load values by ‘MI + GMDH’ and ν-SVR for July 2014: (a) H = 1; (b) H = 24; (c) H = 168.
Table 1. Example illustrating the application of Algorithm 1 30 times independently to the prepared learning dataset for predicting the loads in April 2014. The IICs and output are {φt−d+1, d = 1, 2, …, 168} and φt+1, respectively.
Experiment, i   Set of Relevant Inputs, Xi
1               X1 = {φt−d+1 | d = 1, 2, 5, 7, 8, 22, 23, 135, 167, 168}
2               X2 = {φt−d+1 | d = 1, 2, 6, 7, 24, 25, 167, 168}
3               X3 = {φt−d+1 | d = 1, 2, 4, 7, 21, 24, 48, 49, 167, 168}
⋮               ⋮
30              X30 = {φt−d+1 | d = 1, 2, 6, 7, 21, 23, 24, 96, 167, 168}
Table 2. Significant inputs (SIs) selected for each validation month using the five methods: ‘LC’, ‘MI’, ‘GMDH’, ‘LC + GMDH’ and ‘MI + GMDH’. The IICs and output are {φtd+1, d = 1, 2, …, 168} and φt+1, respectively.
Validation Month   Method         Set of Selected SIs                            No.
April 2014         'LC'           {φt−d+1 | d = 1, 2, 166, 167, 168}             5
                   'MI'           {φt−d+1 | d = 1, 2, 167, 168}                  4
                   'GMDH'         {φt−d+1 | d = 1, 2, 24, 167, 168}              5
                   'LC + GMDH'    {φt−d+1 | d = 1, 2, 5, 24, 167, 168}           6
                   'MI + GMDH'    {φt−d+1 | d = 1, 2, 23, 24, 25, 167, 168}      7
July 2014          'LC'           {φt−d+1 | d = 1, 2, 166, 167, 168}             5
                   'MI'           {φt−d+1 | d = 1, 2, 24, 167, 168}              5
                   'GMDH'         {φt−d+1 | d = 1, 2, 21, 24, 25, 152, 168}      7
                   'LC + GMDH'    {φt−d+1 | d = 1, 2, 24, 25, 48, 49, 168}       7
                   'MI + GMDH'    {φt−d+1 | d = 1, 2, 21, 24, 25, 168}           6
November 2014      'LC'           {φt−d+1 | d = 1, 2, 166, 167, 168}             5
                   'MI'           {φt−d+1 | d = 1, 24, 167, 168}                 4
                   'GMDH'         {φt−d+1 | d = 1, 4, 24, 25, 167, 168}          6
                   'LC + GMDH'    {φt−d+1 | d = 1, 4, 10, 24, 25, 26, 167, 168}  8
                   'MI + GMDH'    {φt−d+1 | d = 1, 24, 25, 152, 167, 168}        6
Table 3. Mean absolute percentage error (MAPE) values of each validation month and the overall values. The last three rows indicate the overall MAPE values for three validation months and the second column represents the prediction horizons, i.e., H = 1, H = 24, and H = 168, which correspond to one-hour, one-day and one-week-ahead forecasting, respectively.
Validation MonthH‘LC’‘MI’‘GMDH’‘LC + GMDH’‘MI + GMDH’
April 201410.5590.5450.526 +0.5310.496 *
241.3611.3211.252 +1.2651.162 *
1681.6751.6561.604 +1.6041.559 *
July 201410.4590.4740.457 +0.4780.438 *
241.7591.5761.514 +1.6781.434 *
1682.1081.900 +1.9252.0251.834 *
November 201410.5820.5560.544 +0.5480.534 *
242.0471.666 +,*1.7251.7281.676
1682.4852.301 +2.3712.3932.293 *
Overall10.5320.5240.508 +0.5190.489 *
241.7231.5221.497 +1.5581.424 *
1682.0901.952 +1.9662.0081.894 *
Table 4. Computational time required by the five input selection methods for April 2014.
Input Selection Method   Time
'LC'                     9.863 s
'MI'                     3402.265 s
'GMDH'                   1483.911 s
'LC + GMDH'              189.622 s
'MI + GMDH'              3572.328 s
