SVR with Hybrid Chaotic Immune Algorithm for Seasonal Load Demand Forecasting

Accurate electric load forecasting has become one of the most important issues in energy management; however, electric load exhibits a seasonal/cyclic tendency arising from economic activities or the cyclic nature of climate. Applications of the support vector regression (SVR) model to seasonal/cyclic electric load forecasting have not been widely explored. The purpose of this paper is to present an SVR model that combines a seasonal adjustment mechanism and a chaotic immune algorithm (namely SSVRCIA) to forecast monthly electric loads. In the operating procedure of the immune algorithm (IA), if the diversity of the initial population cannot be maintained under selective pressure, the IA can only search a narrow region of the solution space, and the solution it finds is far from the global optimum (premature convergence). The proposed chaotic immune algorithm (CIA), based on the chaos optimization algorithm and the IA, diversifies the initial definition domain in stochastic optimization procedures and is used to overcome premature convergence to a local optimum when determining the three parameters of an SVR model. A numerical example from an existing reference is used to elucidate the forecasting performance of the proposed SSVRCIA model. The forecasting results indicate that the proposed model yields more accurate forecasts than the ARIMA and TF-ε-SVR-SA models, and therefore the SSVRCIA model is a promising alternative for electric load forecasting.

In the past decade, many researchers have applied artificial intelligence techniques to improve the accuracy of load forecasting. Knowledge-based expert systems (KBESs) and artificial neural networks (ANNs) are the most popular representatives. KBES approaches, such as that of Rahman and Bhatnagar [14], perform electric load forecasting by simulating the experience of system operators who are well versed in the electricity generation process. The characteristic feature of this approach is that it is rule-based, which implies that the system derives new rules from received information. In other words, it is presumed that an expert system trained using existing data will provide increased forecasting accuracy [14][15][16]. However, the rules are derived from on-the-job training, and transforming this information logic into equations can be impractical. Many researchers have also applied ANNs to improve load forecasting accuracy. Park et al. [17] proposed a 3-layer back-propagation neural network for daily load forecasting problems. The inputs include three temperature indices (average, peak, and lowest); the outputs are peak loads. The proposed model outperforms the regression model and the time series model in terms of the forecasting accuracy index, mean absolute percentage error (MAPE). Novak [18] applied radial basis function (RBF) neural networks to forecasting electricity loads. The results indicate that RBF is at least 11 times faster and more reliable than back-propagation neural networks. Darbellay and Slama [19] applied ANNs to predict the electricity load in the Czech Republic. The experimental results indicate that the proposed ANN model outperformed the ARIMA model in terms of the forecasting accuracy index, normalized mean square error (NMSE). Abdel-Aal [20] proposed an abductive network to conduct one-hour-ahead load forecasting for five years. Hourly temperature and hourly load data are considered. 
The results of the proposed model are very promising in terms of the forecasting accuracy index, MAPE. Hsu and Chen [21] employed an ANN model to forecast the regional electricity load in Taiwan. The empirical results indicate that the proposed model is superior to the traditional regression model. However, ANN models can become trapped in local minima, and the selection of the model architecture is subjective [22].
Support vector machines (SVMs) were originally developed to solve pattern recognition and classification problems. With the introduction of Vapnik's ε-insensitive loss function, SVMs have been extended to solve nonlinear regression estimation problems, i.e., the so-called support vector regression (SVR), and have been successfully applied to forecasting problems in many fields, such as financial time series (stock indexes and exchange rates) forecasting [23][24][25][26][27], tourist arrival forecasting [28,29], the engineering and software field (production values and reliability forecasting) [30,31], atmospheric science forecasting [32][33][34][35], and so on. Meanwhile, the SVR model has also been successfully applied to forecasting electric loads [36][37][38][39][40][41]. These practical results indicate that forecasting accuracy suffers when the three parameters (σ, C, and ε) of an SVR model are poorly selected; however, structured methods for determining these three free parameters are lacking.
Recently, some major nature-inspired evolutionary algorithms have been applied to solve optimization problems, the immune algorithm (IA) being one of them. The IA, proposed by Mori et al. [42] and used in this study, is based on the learning mechanism of natural immune systems. Similar to the genetic algorithm (GA), simulated annealing (SA), and particle swarm optimization (PSO), the IA is a population-based evolutionary algorithm; therefore, it provides a set of solutions for exploration and exploitation of the search space to obtain optimal or near-optimal solutions [43]. 
In addition, the diversity of the population determines whether the search reaches the desired solution or converges prematurely (becoming trapped in a local minimum). To overcome this drawback, it is necessary to find effective improvements to the IA that maintain the population diversity and avoid convergence to a local optimum. One possible approach is to divide the chromosome population into several subgroups and limit the crossover between members of different subgroups to maintain the population diversity. However, such a method would require a huge population size, which is not typical in business forecasting applications. Another feasible approach is the chaos approach, owing to its easy implementation and special ability to avoid being trapped in local optima [44]. Chaos often occurs in a deterministic nonlinear dynamic system [45,46]. It is a highly unstable motion in a finite phase space. Such a motion is very similar to a random process ("randomicity"). Therefore, any variable in the chaotic space can travel ergodically over the whole space of interest ("ergodicity"). The variation of a chaotic variable obeys delicate inherent rules even though it may appear disordered ("regularity"). In addition, chaos is extremely sensitive to the initial conditions, an important property sometimes referred to as the butterfly effect [47]. Attempting to simulate a global weather system numerically, Lorenz discovered that minute changes in initial conditions steered subsequent simulations towards radically different final states. Based on these properties of chaos, the chaotic optimization algorithm (COA) was proposed to solve complex function optimizations [45]. 
The basic idea of the COA is to transform the problem variables from the solution space to the chaos space and then search for the solution by exploiting the three characteristics (randomicity, ergodicity, and regularity) of the chaotic variables. In this investigation, the chaotic immune algorithm (CIA) is used to determine the values of the three parameters in an SVR model. On the other hand, as indicated in the literature [48][49][50], electric energy demand also exhibits a cyclic (seasonal) trend caused by differences in demand from month to month and season to season; however, applications of SVR models to cyclic (seasonal) trend time series have not been widely explored. Therefore, this paper also applies the seasonal adjustment method [50,51] to deal with seasonal trend time series problems. The proposed SSVRCIA model is intended to improve forecasting performance by capturing the non-linear and seasonal tendencies of electric load changes. Two other forecasting approaches, the ARIMA and TF-ε-SVR-SA models proposed by Wang et al. [50], are used to compare the forecasting accuracy of electric load. The rest of this paper is organized as follows: the SSVRCIA model, including the formulation of SVR, the CIA algorithm, and the seasonal adjustment process, is introduced in Section 2. A numerical example is presented in Section 3. Conclusions are discussed in Section 4.

Support Vector Regression (SVR) Model
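The basic concepts of SVMs for the case of regression are briefly introduced. A nonlinear mapping $\phi(\cdot)$ is defined to map the training data $\{(x_i, y_i)\}_{i=1}^{N}$ into a high dimensional feature space, in which a linear function $f$ formulates the nonlinear relationship between the input and output data (Equation 1):
$$f(x) = w^{T}\phi(x) + b \qquad (1)$$
where $f(x)$ denotes the forecast value, and the coefficients $w$ and $b$ are adjustable. The SVM method aims at minimizing the empirical risk (Equation 2):
$$R_{emp}(f) = \frac{1}{N}\sum_{i=1}^{N}\Theta_{\varepsilon}\left(y_i, w^{T}\phi(x_i) + b\right) \qquad (2)$$
where $\Theta_{\varepsilon}$ is the ε-insensitive loss function, defined as Equation 3:
$$\Theta_{\varepsilon}(y, f(x)) = \begin{cases} |f(x) - y| - \varepsilon, & \text{if } |f(x) - y| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$
In addition, $\Theta_{\varepsilon}$ is employed to find an optimum hyperplane in the high dimensional feature space that maximizes the distance separating the training data into two subsets. Thus, the SVR focuses on finding the optimum hyperplane and minimizing the training error between the training data and the ε-insensitive loss function. Then, the SVR minimizes the overall errors (Equation 4):
$$\min_{w, b, \xi^{*}, \xi} R(w, \xi^{*}, \xi) = \frac{1}{2}w^{T}w + C\sum_{i=1}^{N}\left(\xi_i^{*} + \xi_i\right) \qquad (4)$$
with the constraints:
$$y_i - w^{T}\phi(x_i) - b \le \varepsilon + \xi_i^{*}, \quad -y_i + w^{T}\phi(x_i) + b \le \varepsilon + \xi_i, \quad \xi_i^{*}, \xi_i \ge 0, \quad i = 1, 2, \ldots, N$$
The first term of Equation 4, employing the concept of maximizing the distance between two separated sets of training data, is used to regularize weight sizes, to penalize large weights, and to maintain regression function flatness. The second term penalizes training errors between $f(x)$ and $y$ by using the ε-insensitive loss function. $C$ is a parameter that trades off these two terms. Training errors above $\varepsilon$ are denoted as $\xi_i^{*}$, whereas training errors below $-\varepsilon$ are denoted as $\xi_i$.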
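As an illustration of the two terms of the SVR objective, the following minimal Python sketch evaluates the ε-insensitive loss and the regularized objective for a given weight vector. It assumes the standard form of the loss (zero inside the ε-tube, linear outside it); the function names and the idea of passing precomputed residuals are illustrative choices, not part of the original study.

```python
def eps_insensitive_loss(y, f_x, eps):
    """Standard eps-insensitive loss: errors smaller than eps cost nothing."""
    err = abs(f_x - y)
    return err - eps if err >= eps else 0.0


def regularized_risk(w, residuals, C, eps):
    """Value of 0.5 * ||w||^2 + C * (sum of eps-insensitive losses).

    `residuals` holds f(x_i) - y_i for each training point (hypothetical input
    layout chosen for this sketch).
    """
    flatness = 0.5 * sum(wi * wi for wi in w)          # first term: weight size
    penalty = C * sum(eps_insensitive_loss(0.0, r, eps) for r in residuals)
    return flatness + penalty                           # C trades off both terms
```

A larger C makes training errors outside the ε-tube more expensive relative to flatness, which is why C, ε, and the kernel width must be tuned together.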
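After the quadratic optimization problem with inequality constraints is solved, the parameter vector $w$ in Equation 1 is obtained from Equation 5:
$$w = \sum_{i=1}^{N}\left(\beta_i^{*} - \beta_i\right)\phi(x_i) \qquad (5)$$
where $\beta_i^{*}$ and $\beta_i$ are the Lagrangian multipliers, obtained by solving a quadratic program.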
Finally, the SVR regression function is obtained as Equation 6 in the dual space:
$$f(x) = \sum_{i=1}^{N}\left(\beta_i^{*} - \beta_i\right)K(x_i, x) + b \qquad (6)$$
where $K(x_i, x_j)$ is called the kernel function, and the value of the kernel equals the inner product of two vectors, $x_i$ and $x_j$, in the feature space, $\phi(x_i)$ and $\phi(x_j)$, respectively; that is, $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$. Any function that meets Mercer's condition [52] can be used as the kernel function.
There are several types of kernel functions. The most used kernel functions are the Gaussian RBF with a width of $\sigma$: $K(x_i, x_j) = \exp\left(-\|x_i - x_j\|^{2} / (2\sigma^{2})\right)$, and the polynomial kernel with an order of $d$ and constants $a_1$ and $a_2$: $K(x_i, x_j) = (a_1 x_i \cdot x_j + a_2)^{d}$. To date, it has been hard to determine the appropriate type of kernel function for specific data patterns [53,54]. However, the Gaussian RBF kernel is not only easier to implement, but also capable of nonlinearly mapping the training data into an infinite dimensional space; thus, it is suitable for nonlinear relationship problems. Therefore, the Gaussian RBF kernel function is specified in this study.
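The two kernels and the dual-form regression function can be sketched in a few lines of Python. This is only an illustration: the Gaussian kernel is written with the common exp(-||x_i - x_j||^2 / (2σ^2)) scaling, which is an assumption here, and `svr_predict` takes already-solved coefficients (β_i* - β_i) and bias b as inputs rather than solving the quadratic program.

```python
import math


def rbf_kernel(x_i, x_j, sigma):
    """Gaussian RBF kernel with width sigma (2*sigma^2 scaling assumed)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def poly_kernel(x_i, x_j, a1, a2, d):
    """Polynomial kernel of order d with constants a1 and a2."""
    dot = sum(a * b for a, b in zip(x_i, x_j))
    return (a1 * dot + a2) ** d


def svr_predict(x, support_vectors, coeffs, b, sigma):
    """Dual-form forecast: sum_i coeff_i * K(x_i, x) + b.

    `coeffs` are the differences of Lagrangian multipliers, assumed to come
    from a separate quadratic-programming step not shown here.
    """
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, coeffs)) + b
```

Because only kernel values enter the forecast, the high dimensional mapping itself never has to be computed explicitly.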

Chaotic Immune Algorithm (CIA) in Selecting Parameters of the SVR Model
The selection of the three parameters, σ, ε, and C, of an SVR model influences the forecasting accuracy. However, structured methods for selecting these parameters efficiently are lacking. Recently, Hong [38] applied an immune algorithm (IA) to determine the parameters of an SVR model, and found that the proposed model is superior to other competitive forecasting models (ANN and regression models). However, in the IA operating procedure, if the diversity of the initial population cannot be maintained under selective pressure, i.e., the initial individuals are not fully diversified in the search space, then the IA can only search a narrow region of the solution space, and the solution it finds is far from the global optimum (premature convergence). To overcome this shortcoming, it is necessary to find an effective approach that improves the design or procedures of the IA so that it explores the solution space effectively and efficiently. One feasible approach is the chaos approach, owing to its easy implementation and special ability to avoid being trapped in local optima [44]. Chaotic sequences are a good alternative for diversifying the initial definition domain in stochastic optimization procedures: because they are highly sensitive to initial conditions, small changes in the parameter settings or the initial values of the model lead to very different future solution-finding behaviors. Together with their ergodicity, this means that chaotic sequences can be used to enrich the search behavior and to avoid being trapped in a local optimum [55]. Chaotic sequences have been applied in many optimization problems [56][57][58][59][60]. 
Coelho and Mariani [61] recently applied a chaotic artificial immune network (chaotic opt-aiNET) to solve the economic dispatch problem (EDP), using Zaslavsky's map, with its spread-spectrum characteristic and large Lyapunov exponent, to escape successfully from local optima and to converge to a stable equilibrium. Therefore, it is reasonable to expect that applying chaotic sequences to diversify the initial definition domain in the IA's initialization procedure (CIA) is a feasible approach to optimizing the parameter selection in an SVR model. Recently, Wang et al. [62] also employed a similar CIA to determine the three parameters of an SVR model and obtained good performance in jumping out of local optima.
In designing the CIA, many principal factors, such as identifying the affinity, selecting antibodies, and crossover and mutation of the antibody population, are similar to those of the IA. The procedural details of the CIA used in this study are as follows, and the corresponding flowchart is shown in Figure 1.
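Step 1. Initialization of Antibody Population. The values of the three parameters in the $i$th iteration are denoted $X_k^{(i)}$, $k = C, \sigma, \varepsilon$, and are mapped from their intervals $(Min_k, Max_k)$ into chaotic variables $x_k^{(i)}$ in the interval (0,1) (Equation 7):
$$x_k^{(i)} = \frac{X_k^{(i)} - Min_k}{Max_k - Min_k}, \quad k = C, \sigma, \varepsilon \qquad (7)$$
Then, the chaotic sequence defined as Equation 8, with $\mu = 4$, is employed to compute the next iteration chaotic variable, $x_k^{(i+1)}$:
$$x_k^{(i+1)} = \mu x_k^{(i)}\left(1 - x_k^{(i)}\right) \qquad (8)$$
With $\mu = 4$ and $x^{(i)}$ distributed in the range (0,1), the above system exhibits chaotic behavior. We transform $x_k^{(i+1)}$ to obtain the three parameters for the next iteration, $X_k^{(i+1)}$, by Equation 9:
$$X_k^{(i+1)} = Min_k + x_k^{(i+1)}\left(Max_k - Min_k\right) \qquad (9)$$
After this transformation, the three parameters, C, σ, and ε, constitute the initial antibody population, and each antibody is then represented by a binary-code string. For example, assume that an antibody contains 12 binary codes to represent the three SVR parameters. Each parameter is thus expressed by four binary codes. Assume the set-boundaries for parameters σ, C, and ε are 2, 10, and 0.5, respectively; then, the antibody with binary-code "1 0 0 1 0 1 0 1 0 0 1 1" implies that the real values of the three parameters σ, C, and ε are 1.125, 3.125, and 0.09375, respectively (for instance, 1001 in binary is 9, and 9/16 × 2 = 1.125). The number of initial antibodies is the same as the size of the memory cell. The size of the memory cell is set to 10 in this study.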
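The initialization step can be sketched in Python as follows. The sketch assumes that the chaotic sequence is the logistic map with μ = 4 and that a 4-bit code n decodes to (n / 16) × boundary, which reproduces the worked binary example above; the function names, the seed value 0.3, and the lower bound of 0 are illustrative assumptions.

```python
def logistic_map(x, mu=4.0):
    """One step of the logistic map (assumed form of the chaotic sequence)."""
    return mu * x * (1.0 - x)


def chaotic_initial_values(x0, lo, hi, count):
    """Generate `count` parameter values in (lo, hi) from a chaotic sequence.

    x0 should avoid 0, 0.25, 0.5, 0.75, and 1, which fall onto fixed points
    of the logistic map when mu = 4.
    """
    values, x = [], x0
    for _ in range(count):
        x = logistic_map(x)                 # next chaotic variable in (0, 1)
        values.append(lo + x * (hi - lo))   # map back to the parameter range
    return values


def decode_antibody(bits, bounds, bits_per_param=4):
    """Decode a binary string into real parameter values, one per boundary."""
    values = []
    for k, bound in enumerate(bounds):
        chunk = bits[k * bits_per_param:(k + 1) * bits_per_param]
        values.append(int(chunk, 2) / 2 ** bits_per_param * bound)
    return values


# Worked example from the text: boundaries 2, 10, 0.5 for sigma, C, epsilon.
sigma, C, eps = decode_antibody("100101010011", [2, 10, 0.5])
```

Ten calls per parameter would fill a memory cell of size 10; in practice a longer bit string would give a finer parameter resolution than the four bits used in the example.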
Step 2. Identify the Affinity and the Similarity. A higher affinity value implies that an antibody has a higher activation with an antigen. To keep the antibodies stored in the memory cell diverse, antibodies with lower similarity have a higher probability of being included in the memory cell. Therefore, an antibody with a higher affinity value and a lower similarity value has a good likelihood of entering the memory cell. The affinity between the antibody and the antigen is defined as Equation 10:
$$Ag_k = \frac{1}{1 + d_k} \qquad (10)$$
where $d_k$ denotes the SVR forecasting error obtained by antibody $k$. The similarity between antibodies is expressed as in Equation 11:
$$Ab_{ij} = \frac{1}{1 + T_{ij}} \qquad (11)$$
where $T_{ij}$ denotes the difference between the two SVR forecasting errors obtained by the antibody already inside the memory cell and the antibody about to enter it.
Step 3. Selection of Antibodies in the Memory Cell. Antibodies with higher values of $Ag_k$ are considered to be potential candidates for entering the memory cell. However, potential antibody candidates with $Ab_{ij}$ values exceeding a certain threshold are not qualified to enter the memory cell. In this investigation, the threshold value is set to 0.9.
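Steps 2 and 3 can be illustrated with the short Python sketch below. It assumes that affinity and similarity take the forms 1 / (1 + d) and 1 / (1 + T), with T taken as the absolute difference between two forecasting errors; these forms, the function names, and the (antibody, error) pair layout are assumptions made for illustration.

```python
def affinity(error):
    """Assumed form: a smaller forecasting error gives a higher affinity."""
    return 1.0 / (1.0 + error)


def similarity(error_in, error_out):
    """Assumed form: close forecasting errors give a similarity near 1."""
    return 1.0 / (1.0 + abs(error_in - error_out))


def update_memory(memory, candidates, size=10, threshold=0.9):
    """Admit high-affinity candidates unless they are too similar to members.

    `memory` and `candidates` are lists of (antibody, forecasting_error) pairs.
    """
    memory = list(memory)
    ranked = sorted(candidates, key=lambda c: affinity(c[1]), reverse=True)
    for antibody, err in ranked:
        if len(memory) >= size:
            break
        too_similar = any(similarity(m_err, err) > threshold
                          for _, m_err in memory)
        if not too_similar:                 # keeps the memory cell diverse
            memory.append((antibody, err))
    return memory
```

With the threshold of 0.9 used in the study, a candidate whose error differs from that of a stored antibody by less than about 0.11 would be rejected under these assumed forms.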
Step 4. Crossover of Antibody Population. New antibodies are created via crossover and mutation operations. To perform the crossover operation, strings representing antibodies are paired randomly. Moreover, the proposed scheme adopts the single-point-crossover principle. Segments of paired strings (antibodies) between two determined break-points are swapped. In this investigation, the probability of crossover ($p_c$) is set as 0.5. Finally, the three crossover parameters are decoded into a decimal format.
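A minimal Python sketch of the crossover step on binary-coded antibodies is given below. It implements plain single-point crossover with random pairing; how the break-point is drawn and how an unpaired antibody is handled are assumptions of this sketch rather than details given in the text.

```python
import random


def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length binary strings after `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def crossover_population(antibodies, p_c=0.5, rng=None):
    """Pair antibodies at random and cross each pair with probability p_c."""
    rng = rng or random.Random()
    pool = list(antibodies)
    rng.shuffle(pool)                                   # random pairing
    offspring = []
    for a, b in zip(pool[0::2], pool[1::2]):
        if rng.random() < p_c:
            point = rng.randint(1, len(a) - 1)          # assumed: uniform break-point
            a, b = single_point_crossover(a, b, point)
        offspring.extend([a, b])
    if len(pool) % 2:                                   # assumed: odd one passes through
        offspring.append(pool[-1])
    return offspring
```

The offspring strings would then be decoded back into decimal values of σ, C, and ε before the mutation step.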
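Step 5. Annealing Chaotic Mutation of Antibody Population. For the $i$th iteration (generation), the crossover antibody population ($\hat{X}_k^{(i)}$, $k = C, \sigma, \varepsilon$) in the current solution space $(Min_k, Max_k)$ is mapped to the chaotic variable interval [0,1] to form the crossover chaotic variable $\hat{x}_k^{(i)}$, as in Equation 12:
$$\hat{x}_k^{(i)} = \frac{\hat{X}_k^{(i)} - Min_k}{Max_k - Min_k}, \quad k = C, \sigma, \varepsilon, \quad i = 1, 2, \ldots, q_{max} \qquad (12)$$
where $q_{max}$ is the maximum evolutional generation of the population. Then, the $i$th chaotic variable $x_k^{(i)}$ is added to $\hat{x}_k^{(i)}$, and the resulting chaotic mutation variable is also mapped to the interval [0,1], as in Equation 13:
$$\tilde{x}_k^{(i)} = \hat{x}_k^{(i)} + \delta x_k^{(i)} \qquad (13)$$
where $\delta$ is the annealing operation. Finally, the chaotic mutation variable obtained in the interval [0,1] is mapped to the solution interval $(Min_k, Max_k)$ with a definite probability of mutation ($p_m$), thus completing a mutative operation (Equation 14):
$$\tilde{X}_k^{(i)} = Min_k + \tilde{x}_k^{(i)}\left(Max_k - Min_k\right) \qquad (14)$$
Step 6. Stopping Criteria. If the number of generations equals a given scale, then the best antibody is taken as the solution; otherwise, return to Step 2.
The CIA is used to seek a better combination of the three parameters in the SVR. The mean absolute percentage error (MAPE) of the forecasts is used as the criterion for determining suitable parameters for the SVR model in this investigation (the smallest MAPE is preferred); it is given by Equation 15:
$$MAPE = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{f_i - \hat{f}_i}{f_i}\right| \times 100\% \qquad (15)$$
where $N$ is the number of forecasting periods, $f_i$ is the actual value at period $i$, and $\hat{f}_i$ is the forecast value at period $i$.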
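The mutation step and the MAPE criterion can be sketched in Python as follows. The mutation function assumes that the annealing factor δ simply scales the chaotic variable before it is added, and that values leaving [0,1] are clipped; both are assumptions of this sketch, as is treating δ as a plain number supplied by the caller.

```python
def chaotic_mutation(X_hat, x_chaos, delta, lo, hi):
    """Perturb a crossover parameter value with a scaled chaotic variable."""
    x_hat = (X_hat - lo) / (hi - lo)          # map to the chaotic interval [0, 1]
    x_mut = x_hat + delta * x_chaos           # add the annealed chaotic variable
    x_mut = min(max(x_mut, 0.0), 1.0)         # assumption: clip back into [0, 1]
    return lo + x_mut * (hi - lo)             # map to the solution interval


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))
```

In a full search loop, each candidate (σ, C, ε) would be scored by the MAPE of its SVR forecasts, and the mutation would be applied to each parameter with probability p_m.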

Seasonal Adjustment
Because demand differs from month to month and season to season, electric energy demand also exhibits a cyclic (seasonal) tendency, so any model aiming at highly accurate forecasts must estimate this seasonal component. There are several approaches to estimating the seasonal index of a data series [50,63,64], including product-model and non-product-model types. Based on the type of the data series, this investigation employs Deo and Hurvich's [63] approach to compute the seasonal index, as shown in Equation 16:
$$peak_t = \frac{a_t}{f_t} \qquad (16)$$
where $a_t$ is the actual load and $f_t$ is the SVRCIA forecast at time $t$, with $t = j, l + j, 2l + j, \ldots, (m - 1)l + j$ taken only for the same time point (month) in each period; $l$ is the seasonal length and $m$ is the number of periods. Then, the seasonal index (SI) for each time point (month) $j$ is computed as Equation 17:
$$SI_j = \frac{1}{m}\left(peak_j + peak_{l+j} + \cdots + peak_{(m-1)l+j}\right) \qquad (17)$$
where $j = 1, 2, \ldots, l$. Finally, the forecast value of the SSVRCIA model is obtained by Equation 18:
$$f_k^{SSVRCIA} = f_k^{SVRCIA} \times SI_k \qquad (18)$$
where $k = 1, 2, \ldots, l$ denotes the time point (month) in another period (the forecasting period).
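The seasonal adjustment can be sketched in Python as below. The sketch assumes that the per-period ratio is actual load divided by model forecast, so that an index below 1 signals overestimation, and that the index of a month is the average of its ratios; the function names and the `start` offset argument are illustrative.

```python
def seasonal_indexes(actual, forecast, season_length):
    """Average actual/forecast ratio for each position in the seasonal cycle."""
    ratios = [a / f for a, f in zip(actual, forecast)]
    indexes = []
    for j in range(season_length):
        same_month = ratios[j::season_length]     # t = j, l + j, 2l + j, ...
        indexes.append(sum(same_month) / len(same_month))
    return indexes


def seasonally_adjust(forecasts, indexes, start=0):
    """Multiply each raw forecast by the index of its month.

    `start` is the cycle position of the first forecast (0-based).
    """
    l = len(indexes)
    return [f * indexes[(start + k) % l] for k, f in enumerate(forecasts)]


# Toy data with a two-point cycle, for illustration only.
si = seasonal_indexes([110, 90, 120, 80], [100, 100, 100, 100], 2)
adjusted = seasonally_adjust([100, 100], si)
```

When the number of fitted values is not a multiple of the seasonal length, some months simply contribute fewer ratios to their average.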

Data Set
To base our comparisons on the same conditions, this study uses historical monthly electric load data from Northeast China to compare the forecasting performance of the proposed SSVRCIA model with that of the ARIMA and TF-ε-SVR-SA models proposed by Wang et al. [50]. In addition, to verify the performance of the seasonal adjustment mechanism, the SVRCIA model (without the seasonal adjustment mechanism) is also implemented for comparison.
Table 1. Table 2.
Figure 2. The rolling-based forecasting procedure (training stage).

SSVRCIA Electric Load Forecasting Model
Before conducting the seasonal adjustment for the SSVRCIA model, it is necessary to run the CIA algorithm to determine suitable values of the three parameters in an SVR model. The parameters of the CIA in the proposed model are set experimentally, as shown in Table 3. For the SVRCIA modeling procedure, the training stage uses a rolling-based forecasting procedure (Figure 2), which divides the training data into two subsets, namely the fed-in subset (for example, 25 load data points) and the fed-out subset (7 load data points). First, the initial 25 load data points of the fed-in subset are fed into the proposed model, and the structural risk minimization principle is employed to minimize the training error; this yields the one-step-ahead forecasting load, namely the 26th forecasting load. Second, the next 25 load data points, comprising 24 of the fed-in subset data (from the 2nd to the 25th) plus the 26th data point in the fed-out subset, are similarly fed into the proposed model, and the structural risk minimization principle is again employed to minimize the training error; this yields the next one-step-ahead forecasting load, namely the 27th forecasting load. The rolling-based forecasting procedure is repeated until the 32nd forecasting load is obtained. Meanwhile, the training error of this training stage is also obtained. In the validation and testing stages, a one-step-ahead forecasting policy is adopted. Then, several types of data-rolling are considered to forecast the electric load at the next point (month). Different numbers of electric load values in the time series are fed into the SVRCIA model to forecast the electric load in the next validation period. Whenever the training error improves, the three kernel parameters, σ, C, and ε, of the SVRCIA model adjusted by the CIA algorithm are employed to calculate the validation error. Then, the adjusted parameters with the minimum validation error are selected as the most appropriate parameters. 
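The rolling-based procedure of the training stage can be illustrated with the following Python sketch. The window length of 25 follows the example in the text; the `fit_predict` callable is a hypothetical stand-in for training the SVR on a window and returning its one-step-ahead forecast, and the naive model used in the usage lines is only a placeholder.

```python
def rolling_windows(series, window=25):
    """Yield (fed-in window, next actual value) pairs, sliding one step at a time."""
    for start in range(len(series) - window):
        yield series[start:start + window], series[start + window]


def rolling_forecast(series, fit_predict, window=25):
    """Collect one-step-ahead forecasts and the matching actual values."""
    forecasts, actuals = [], []
    for inputs, target in rolling_windows(series, window):
        forecasts.append(fit_predict(inputs))   # e.g., train the model, predict next point
        actuals.append(target)
    return forecasts, actuals


# With 32 training points and a window of 25, this yields forecasts for
# points 26 through 32 (seven forecasts in total).
loads = list(range(1, 33))                      # placeholder data, not real loads
naive = lambda xs: xs[-1]                       # placeholder model: repeat last value
forecasts, actuals = rolling_forecast(loads, naive)
```

The training error would then be computed from `forecasts` and `actuals`, for example with MAPE.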
Table 4 indicates that the SVRCIA model performs best when 25 input data points are used for electric load forecasting.
Now the seasonal term is considered. For the monthly electric load in Northeast China, each month has a different electric demand pattern, and the seasonal length is verified as 12 [50]; thus, there are 12 points (months) in each electric load cyclic year. The seasonal indexes for each point (month) are calculated from the 46 forecasting values of the SVRCIA model in the training (32 forecasting values) and validation (14 forecasting values) stages, as shown in Table 5. Indexes with values smaller than 1 imply that the average forecasts (based on the 32 training forecasts and 14 validation forecasts) of the SVRCIA model are overestimated, i.e., the smaller a seasonal index is, the greater the overestimation of electric load is. Such overproduced supply would lead to energy losses. Conversely, the higher a seasonal index is above 1, the greater the underestimation of electric load is, i.e., the months with a seasonal index larger than 1 are potentially months in which the usable electric load will be limited in the future.
Table 6 shows the actual values and the forecast values obtained using the various forecasting models: ARIMA(1,1,1), TF-ε-SVR-SA, SVRCIA, and SSVRCIA. The MAPE values are calculated to compare the proposed models fairly with the alternative models. The proposed SSVRCIA model has smaller MAPE values than the ARIMA, TF-ε-SVR-SA, and SVRCIA models in capturing electric load cyclic trends on a monthly average basis. Furthermore, to verify the significance of the accuracy improvement of the SSVRCIA model compared with the ARIMA(1,1,1), TF-ε-SVR-SA, and SVRCIA models, a statistical test, namely the Wilcoxon signed-rank test, is conducted at the 0.025 and 0.05 significance levels in one-tailed tests (Table 7). 
Table 7 shows that the SSVRCIA model significantly outperforms the ARIMA(1,1,1) model, owing to the convex-set assumption underlying SVR modeling. In addition, the SSVRCIA model is also significantly superior to the TF-ε-SVR-SA model, not only because of the superior searching capability of the CIA in determining proper parameters for an SVR model, but also because of the use of a seasonal adjustment mechanism to adjust for the seasonal/cyclic effects of electric loads. Finally, the SSVRCIA model also significantly outperforms the SVRCIA model, which indicates that the seasonal adjustment mechanism employed here is proficient in dealing with such cyclic data.
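For readers unfamiliar with the Wilcoxon signed-rank test, the following Python sketch computes its test statistic from two paired series of forecasting errors. It follows the usual textbook construction (drop zero differences, rank absolute differences with average ranks for ties, and sum the ranks by sign); the function name and input layout are illustrative, and the sketch does not reproduce the critical values or the results in Table 7.

```python
def wilcoxon_statistic(errors_a, errors_b):
    """Return (W, W_plus, W_minus) for paired samples, with W = min(W+, W-)."""
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the absolute differences are tied.
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        avg_rank = (pos + end) / 2.0 + 1.0      # average of 1-based ranks in the group
        for idx in order[pos:end + 1]:
            ranks[idx] = avg_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus), w_plus, w_minus
```

The statistic is then compared with the tabulated critical value for the chosen significance level and sample size; a sufficiently small W indicates a significant difference between the two models' errors.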

Conclusions
The historical data show that electric load values in Northeast China exhibit not only a strong growth trend but also an obvious monthly seasonal/cyclic tendency. This is a common electric load phenomenon in developing countries. In this setting, the role of electric demand forecasting is to help avoid overproduction or underproduction of electricity. This study introduces a forecasting technique, SSVRCIA, and investigates its feasibility for forecasting electric loads. The experimental results indicate that the SSVRCIA model has better forecasting performance than the ARIMA(1,1,1), TF-ε-SVR-SA, and SVRCIA models. The superior performance of the SSVRCIA model is due not only to the convex-set assumption underlying SVR modeling, but also to the superior searching capability of the CIA in determining proper parameters for the SVR (which is why it outperforms the TF-ε-SVR-SA model) and to the effective seasonal adjustment mechanism (which is why it outperforms the SVRCIA model). By contrast, the ARIMA model employs a parametric technique based on specific assumptions, such as linear relationships between the current value of the underlying variable and its previous values and error terms, and these assumptions are not completely in line with real world problems.
This investigation is the first to apply the SVR with CIA and seasonal adjustment to forecasting monthly electric loads. Many forecasting methodologies have been proposed to deal with the seasonality of electric loads, but most models are time consuming in verifying suitable time-phase divisions, particularly when the sample size is large. In this investigation, the SSVRCIA model provides a convenient and valid alternative for electric load forecasting. The SSVRCIA model directly uses historical monthly electric data and then determines suitable parameters by an efficient optimization algorithm. In addition, because the proposed SSVRCIA model is a hybrid forecasting model, other advanced optimization algorithms for parameter selection can be applied to the SVR model in future work.