Hybridizing Chaotic and Quantum Mechanisms and Fruit Fly Optimization Algorithm with Least Squares Support Vector Regression Model in Electric Load Forecasting

: Compared with a large power grid, a microgrid electric load (MEL) has the characteristics of strong nonlinearity, multiple factors, and large fluctuation, which lead to it being difficult to receive more accurate forecasting performances. To solve the abovementioned characteristics of a MEL time series, the least squares support vector machine (LS-SVR) hybridizing with meta-heuristic algorithms is applied to simulate the nonlinear system of a MEL time series. As it is known that the fruit fly optimization algorithm (FOA) has several embedded drawbacks that lead to problems, this paper applies a quantum computing mechanism (QCM) to empower each fruit fly to possess quantum behavior during the searching processes, i.e., a QFOA algorithm. Eventually, the cat chaotic mapping function is introduced into the QFOA algorithm, namely CQFOA, to implement the chaotic global perturbation strategy to help fruit flies to escape from the local optima while the population’s diversity is poor. Finally, a new MEL forecasting method, namely the LS-SVR-CQFOA model, is established by hybridizing the LS-SVR model with CQFOA. The experimental results illustrate that, in three datasets, the proposed LS-SVR-CQFOA model is superior to other alternative models, including BPNN (back-propagation neural networks), LS-SVR-CQPSO (LS-SVR with chaotic quantum particle swarm optimization algorithm), LS-SVR-CQTS (LS-SVR with chaotic quantum tabu search algorithm), LS-SVR-CQGA (LS-SVR with chaotic quantum genetic algorithm), LS-SVR-CQBA (LS-SVR with chaotic quantum bat algorithm), LS-SVR-FOA, and LS-SVR-QFOA models, in terms of forecasting accuracy indexes. In addition, it passes the significance test at a 97.5% confidence level.


Motivation
MEL forecasting is the basis of microgrid operation scheduling and energy management.It is an important prerequisite for the intelligent management of distributed energy.The forecasting performance would directly affect the microgrid system's energy trading, power supply planning, and power supply quality.However, the MEL forecasting accuracy is not only influenced by the mathematical model, but also by the associated historical dataset.In addition, compared with the large power grid, microgrid electric load (MEL) has the characteristics of strong nonlinearity, multiple factors, and large fluctuation, which lead to it being difficult to achieve more accurate forecasting performances.Along with the development of artificial intelligent technologies, load forecasting methods have been continuously applied to load forecasting.Furthermore, the hybridization or combination of the intelligent algorithms also provides new models to improve the load forecasting performances.These hybrid or combined models either employ a novel intelligent algorithm or framework to improve the embedded drawbacks or apply the advantages of two of the above models to achieve more satisfactory results.The models apply a wide range of load forecasting approaches and are mainly divided into two categories, traditional forecasting models and intelligent forecasting models.

Relevant Literature Reviews
Conventional load forecasting models include exponential smoothing models [1], time series models [2], and regression analysis models [3].An exponential smoothing model is a curve fitting method that defines different coefficients for the historical load data.It can be understood that a series with the forecasted load time has a large influence on the future load, while a series with the long time from the forecasted load has a small influence on the future load [1].The time series model is applied to load forecasting, which is characterized by a fast forecasting speed and can reflect the continuity of load forecasting, but requires the stability of the time series.The disadvantage is that it cannot reflect the impact of external environmental factors on load forecasting [2].The regression model seeks a causal relationship between the independent variable and the dependent variables according to the historical load change law, determining the regression equation, and the model parameters.The disadvantage of this model is that there are too many factors affecting the forecasting accuracy.It is not only affected by the parameters of the model itself, but also by the quality of the data.When the external influence factors are too many or the relevant influent factor data are difficult to analyze, the regression forecasting model will result in huge errors [3].
Intelligent forecasting models include the wavelet analysis method [4,5], grey forecasting theory [6,7], the neural network model [8,9], and the support vector regression (SVR) model [10].In load forecasting, the wavelet analysis method is combined with external factors to establish a suitable load forecasting model by decomposing the load data into sequences on different scales [4,5].The advantages of the grey model are easy to implement and there are fewer influencing factors employed.However, the disadvantage is that the processed data sequence has more grayscale, which results in large forecasting error [6,7].Therefore, when this model is applied to load forecasting, only a few recent data points would be accurately forecasted; more distant data could only be reflected as trend values and planned values [7].Due to the superior nonlinear performances, many models based on artificial neural networks (ANNs) have been applied to improve the load forecasting accuracy [8,9].To achieve more accurate forecasting performance, these models and other new or novel forecasting approaches have been hybridized or combined [9].For example, an adaptive network-based fuzzy inference system is combined with an RBF neural network [11], the Monte Carlo algorithm is combined with the Bayesian neural network [12], fuzzy behavior is hybridized with a neural network (WFNN) [13], a knowledge-based feedback tuning fuzzy system is hybridized with a multi-layer perceptron artificial neural network (MLPANN) [14], and so on.However, these ANNs-based models suffer from some serious problems, such as trapping into local optimum easily, it being time-consuming to achieve a functional approximation, and the difficulty of selecting the structural parameters of a network [15,16], which limits its application in load forecasting to a large extent.
The SVR model is based on statistical learning theory, as proposed by Vapnik [17].It has a solid mathematical foundation, a better generalization ability, a relatively faster convergence rate, and can find global optimal solutions [18].Because the basic theory of the SVR model is perfect and the model is also easy to establish, it has attracted extensive attention from scholars in the load forecasting fields.In recent years, some scholars have applied the SVR model to the research of load forecasting [18] and achieved superior results.One study [19] proposes the EMD-PSO-GA-SVR model to improve the forecasting accuracy, by hybridizing the empirical mode decomposition (EMD) with two particle swarm optimization (PSO) and the genetic algorithm (GA).In addition, a modified version of the SVR model, namely the LS-SVR model, only considers equality constraints instead of inequalities [20,21].Focused on the advantages of the LS-SVR model to deal with such problems, this paper tries to simulate the nonlinear system of the MEL time series to receive the forecasting values and improve the forecasting accuracy.However, the disadvantages of the SVR-based models in load forecasting are that when the sample size of the load is large, the time of system learning and training is highly time-consuming, and the determination of parameters mainly depends on the experience of the researchers.This has a certain degree of influence on the accuracy in load forecasting.Therefore, exploring more suitable parameter determination methods has always been an effective way to improve the forecasting accuracy of the SVR-based models.To determine more appropriate parameters of the SVR-based models, Hong and his colleagues have conducted research using different evolutionary algorithms hybridized with an SVR model [22][23][24].In the meantime, Hong and his successors have also applied different chaotic mapping functions (including the logistic function [22,23] and the cat mapping function [10]) to diversify the population during modeling processes, and the cloud theory to make sure the temperature continuously decreases during the annealing process, eventually determining the most appropriate parameters to receive more satisfactory forecasting accuracy [10].
The fruit fly optimization algorithm (FOA) is a new swarm intelligent optimization algorithm proposed in 2011, it searches for global optimization based on fruit fly foraging behavior [25,26].The algorithm has only four control parameters [27].Compared with other algorithms, FOA has the advantages of being easy to program and having fewer parameters, less computation, and high accuracy [28,29].FOA belongs to the domain of evolutionary computation; it realizes the optimization of complex problems by simulating fruit flies to search for food sources by using olfaction and vision.It has been successfully applied to the predictive control fields [30,31].However, similar to those swarm intelligent optimization algorithms with iterative searching mechanisms, the standard FOA also has drawbacks such as a premature convergent tendency, a slow convergent rate in the later searching stage, and poor local search performance [32].
Quantum computing has become one of the leading branches of science in the modern era due to its powerful computing ability.This not only prompted us to study new quantum algorithms, but also inspired us to re-examine some traditional optimization algorithms from the quantum computing mechanism.The quantum computing mechanism (QCM) makes full use of the superposition and coherence of quantum states.Compared with other evolutionary algorithms, the QCM uses a novel encoding method-quantum bit encoding.Through the encoding of qubits, an individual can characterize any linear superposition state, whereas traditional encoding methods can only represent one specific one.As a result, with QCM it is easier to maintain population diversity than with other traditional evolutionary algorithms.Nowadays, it has become a hot topic of research that QCM is able to hybridize with evolutionary algorithms to receive more satisfactory searching results.The literature [33] introduced QCM into genetic algorithms and proposed quantum derived genetic algorithm (QIGA).From the point of view of algorithmic mechanism, it is very similar to the isolated niches genetic algorithm.Han and Kim [34] proposed a genetic quantum algorithm (GQA) based on QCM.Compared with traditional evolutionary algorithms, its greatest advantage is its better ability to maintain population diversity.Han and Kim [35] further introduced the population migration mechanism based on ure [34], and renamed the algorithm a quantum evolution algorithm (QEA).Huang [36], Lee and Lin [37,38], and Li et al. [39] hybridized the particle swam optimization (PSO) algorithm, Tabu search (TS) algorithm, genetic algorithm (GA), and bat algorithm (BA) with the QCM and the cat mapping function, and proposed the CQPSO, CQTS, CQGA, and CQBA algorithms, which were employed to select the appropriate parameters of an SVR model.The results of the application indicate that the improved algorithms obtain more appropriate parameters, and higher forecasting accuracy is achieved.The above applications also reveal that the improved algorithm, by hybridizing with QCM, could effectively avoid local optimal position and premature convergence.

Contributions
Considering the inherent drawback of the FOA, i.e., suffering from premature convergence, this paper tries to hybridize the FOA with QCM and the cat chaotic mapping function to solve the premature problem of FOA.Eventually, determine more appropriate parameters of an LS-SVR model.The major contributions are as follows: (1) QCM is employed to empower the search ability of each fruit fly during the searching processes of QFOA.The cat chaotic mapping function is introduced into QFOA and implements the chaotic global perturbation strategy to help a fruit fly escape from the local optima when the population's diversity is poor.(2) We propose a novel hybrid optimization algorithm, namely CQFOA, to be hybridized with an LS-SVR model, namely the LS-SVR-CQFOA model, to conduct the MEL forecasting.Other similar alternative hybrid algorithms (hybridizing chaotic mapping function, QCM, and evolutionary algorithms) in existing papers, such as the CQPSO algorithm used by Huang [36], the CQTS and CQGA algorithms used by Lee and Lin [37,38], and the CQBA algorithm used by Li et al. [39], are selected as alternative models to test the superiority of the LS-SVR-CQFOA model in terms of forecasting accuracy.(3) The forecasting results illustrate that, in three datasets, the proposed LS-SVR-CQFOA model is superior to other alternative models in terms of forecasting accuracy indexes; in addition, it passes the significance test at a 97.5% confidence level.

The Organization of This Paper
The rest of this paper is organized as follows.The modeling details of an LS-SVR model, the proposed CQFOA, and the proposed LS-SVR-CQFOA model are introduced in Section 2. Section 3 presents a numerical example and a comparison of the proposed LS-SVR-CQFOA model with other alternative models.Some insight discussions are provided in Section 4. Finally, the conclusions are given in Section 5.

Least Squares Support Vector Regression (LS-SVR)
The SVR model is an algorithm based on pattern recognition of statistical learning theory.It is a novel machine learning approach proposed by Vapnik in the mid-1990s [17].The LS-SVR model was put forward by Suykens [20].It is an improvement and an extension of the standard SVR model, which replaces the inequality constraints of an SVR model with equality constraint [21].The LS-SVR model converts quadratic programming problem into linear programming solving, reduces the computational complexity, and improves the convergent speed.It can solve the load forecasting problems due to its characteristics of nonlinearity, high dimension, and local minima.

Principle of the Standard SVR Model
Set a dataset as {(x i , y i )} N i=1 , x i ∈ R n is the input vector of n-dimensional system, y i ∈ R is the output (not a single real value, but a n-dimensional vector) of system.The basic idea of the SVR model can be summarized as follows: n-dimensional input samples are mapped from the original space to the high-dimensional feature space F by nonlinear transformation φ(•), and the optimal linear regression function is constructed in this space, as shown in Equation (1) [17]: where f (x) represents the forecasting values; the weight, w, and the coefficient, b, would be determined during the SVR modeling processes.
The standard SVR model takes the ε insensitive loss function as an estimation problem for risk minimization, thus the optimization objective can be expressed as in Equation (2) [17]: where c is the balance factor, usually set to 1, and ξ i and ξ * i are the error of introducing the training set, which can represent the extent to which the sample point exceeds the fitting precision ε.
Equation ( 2) could be solved according to quadratic programming processes; the solution of the weight, w, in Equation ( 2) is calculated as in Equation ( 3) [17]: where α i and α * i are Lagrange multipliers.The SVR function is eventually constructed as in Equation ( 4) [17]: where Ψ(x i , x), the so-called kernel function, is introduced to replace the nonlinear mapping function, φ(•), as shown in Equation ( 5) [15]: (5)

Principle of the LS-SVR Model
The LS-SVR model is an extension of the standard SVR model.It selects the binomial of error ξ t as the loss function; then the optimization problem can be described as in Equation ( 6) [20]: where the bigger the positive real number γ is, the smaller the regression error of the model is.
The LS-SVR model defines the loss function different from the standard SVR model, and changes its inequality constraint into an equality constraint so that w can be obtained in the dual space.After obtaining parameters α and b by quadratic programming processes, the LS-SVR model is described as in Equation ( 7) [20]: It can be seen that an LS-SVR model contains two parameters, the regularization parameter γ and the radial basis kernel function, σ 2 .The forecasting performance of an LS-SVR model is related to the selection of γ and σ 2 .The role of γ is to balance the confidence range and experience risk of learning machines.If γ is too large, the goal is only to minimize the experience risk.On the contrary, when the value of γ is too small, the penalty for the experience error will be small, thus increasing the value of experience risk σ controls the width of the Gaussian kernel function and the distribution range of the training data.The smaller σ is, the greater the structural risk there is, which leads to overfitting.Therefore, the parameter selection of an LS-SVR model has always been the key to improve the forecasting accuracy.

Chaotic Quantum Fruit Fly Algorithm (CQFOA)
FOA is a population intelligent evolutionary algorithm that simulates the foraging behavior of fruit flies [26].Fruit flies are superior to other species in smell and vision.In the process of foraging, firstly, fruit flies rely on smell to find the food source.Secondly, they visually locate the specific location of food and the current position of other fruit flies, and then fly to the location of food through population interaction.At present, FOA has been applied to the forecasting of traffic accidents, export trade, and other fields [40].

Fruit Fly Optimization Algorithm (FOA)
According to the characteristics of fruit flies searching for food, FOA includes the following main steps.
Step 1. Initialize randomly the fruit flies' location (X 0 and Y 0 ) of population.
Step 2. Give individual fruit flies the random direction and distance for searching for food by smell, as in Equations ( 8) and ( 9) [26]: Step 3. Due to the location of food being unknown, firstly, the distance from the origin (Dist) is estimated as in Equation ( 10) [25], then the determination value of taste concentration (S) is calculated as in Equation ( 11) [25], i.e., the value is the inverse of the distance.
Step 4. The determination value of taste concentration (S) is substituted into the determination function of taste concentration (or Fitness function) to determine the individual position of the fruit fly (Smell i ), as shown in Equation ( 12) [26]: Step 5. Find the Drosophila species (Best index and Best Smell values) with the highest odor concentrations in this population, as in Equation ( 13) [26]: Step 6.The optimal flavor concentration value (Optimal_Smell) is retained along with the x and y coordinates (with Best_index) as in Equations ( 14)-( 16) [25], then the Drosophila population uses vision to fly to this position.
Step 7. Enter the iterative optimization, repeat Steps 2 to 5 and judge whether the flavor concentration is better than that of the previous iteration; if so, go back to Step 6.
The FOA algorithm is highly adaptable, so it can efficiently search without calculating partial derivatives of the target function.It overcomes the disadvantage of trapping into local optima easily.
However, as a swarm intelligence optimization algorithm, FOA still tends to fall into a local optimal solution, due to the declining diversity in the late evolutionary population.
It is noticed that there are some significant differences between the FOA and PSO algorithms.For FOA, the taste concentration (S) is used to determine the individual position of each fruit fly, and the highest odor concentration in this population is retained along with the x and y coordinates; eventually, the Drosophila population uses vision to fly to this position.Therefore, it is based on the taste concentration to control the searching direction to find out the optimal solution.For the PSO algorithm, the inertia weight controls the impact of the previous velocity of the particle on its current one by using two positive constants called acceleration coefficients and two independent uniformly distributed random variables.Therefore, it is based on the inertia weight to control the velocity to find out the optimal solution.
Thus, aiming to deal with the inherent drawback of FOA, i.e., suffering from premature convergence or trapping into local optima easily, this paper tries to use the QCM to empower each fruit fly to possess quantum behavior (namely QFOA) during the modeling processes.At the same time, the cat mapping function is introduced into QFOA (namely CQFOA) to implement the chaotic global perturbation strategy to help a fruit fly escape from the local optima when the population's diversity is poor.Eventually, the proposed CQFOA is employed to determine the appropriate parameters of an LS-SVR model and increase the forecasting accuracy.

Quantum Computing Mechanism for FOA (1) Quantization of Fruit Flies
In the quantum computing process, a sequence consisting of quantum bits is replaced by a traditional sequence.The quantum fruit fly is a linear combination of state |0⟩ and state |1⟩, which can be expressed as in Equation ( 17) [34,35]: where α 2 and β 2 are the probability of states, |0⟩ and |1⟩, respectively, satisfying α 2 + β 2 = 1, and (α, β) are qubits composed of quantum bits.A quantum sequence, i.e., a feasible solution, can be expressed as an arrangement of l qubits, as shown in Equation (18) [34,35]: where the initial values of α j and β j are all set as 1/ √ 2 to meet the equity principle, α 2 j + β 2 j = 1 (j = 1, 2, . . ., l), which is updated through the quantum revolving door during the iteration.
Conversion between quantum sequence and binary sequence is the key to convert FOA to QFOA.Randomly generate a random number of [0,1], rand j , if rand j ≥ α 2 j , the corresponding binary quantum bit value is 1, otherwise, 0, as shown in Equation ( 19): Using the above method, the quantum sequence, q, can be transformed into a binary sequence, x; then the optimal parameter problem of an LS-SVR model can be determined using QFOA.

(2) Quantum Fruit Fly Position Update Strategy
In the QFOA process, the position of quantum fruit flies represented by a quantum sequence is updated to find more feasible solutions and the best parameters.This paper uses quantum rotation to update the position of quantum fruit flies.The quantum position of individual i (there are in total N quantum fruit flies) can be extended from Equation (18) and is expressed as in Equation ( 20): where Quantum rotation is a quantum revolving door determined by the quantum rotation angle, which updates the quantum sequence and conducts a random search around the position of quantum fruit flies to explore the local optimal solution.The θ g ij is the jth quantum rotation angle of the population iterated to the ith fruit fly of generation, g, and the quantum bit q g ij (due to the nonnegative position constraint of q g ij , the absolute function, abs() is used to take the absolute value of each element in the calculation result) is updated according to the quantum revolving gate U θ g ij , as shown in Equations ( 21) and ( 22) [34,35]: In special cases, when the quantum rotation angle, θ g+1 ij , is equal to 0, the quantum bit, q g+1 ij , uses quantum non-gate N to update with some small probability, as indicated in Equation ( 23) [35]:

Chaotic Quantum Global Perturbation
For a bionic evolutionary algorithm, it is a general phenomenon that the population's diversity would be poor, along with the increased iterations.This phenomenon would also lead to being trapped into local optima during modeling processes.As mentioned, the chaotic mapping function can be employed to maintain the population's diversity to avoid trapping into local optima.Many studies have applied chaotic theory to improve the performances of these bionic evolutionary algorithms, such as the artificial bee colony (ABC) algorithm [41], and the particle swarm optimization (PSO) algorithm [42].The authors have also employed the cat chaotic mapping function to improve the genetic algorithm (GA) [43], the PSO algorithm [44], and the bat algorithm [39], the results of which demonstrate that the searching quality of GA, PSO, ABC, and BA algorithms could be improved by employing chaotic mapping functions.Hence, the cat chaotic mapping function is once again used as the global chaotic perturbation strategy (GCPS) in this paper, and is hybridized with QFOA, namely CQFOA, which hybridizes GCPS with the QFOA while suffering from the problem of being trapped into local optima during the iterative modeling processes.
The two-dimensional cat mapping function is shown as in Equation ( 24) [39]: where frac function is used to calculate the fractional parts of a real number, y, by subtracting an approached integer.
The global chaotic perturbation strategy (GCPS) is illustrated as follows.
(1) Generate 2popsize chaotic disturbance fruit flies.For each Fruit f ly i (I = 1, 2, . . ., 2popsize), Equation ( 24) is applied to generate d random numbers, z j , j = 1, 2, . . ., d.Then, the qubit (with quantum state, |0⟩) amplitude, cos θ i j , of Fruit f ly i is shown in Equation ( 25): (2) Select 0.5 popsize better chaotic disturbance fruit flies.Compute the fitness value of each Fruit f ly from 2 popsize chaotic disturbance fruit flies, and arrange these fruit flies to be a sequence based on the order of fitness values.Then, select the fruit flies with 0.5 popsize ranking ahead in the fitness values; as a result, the 0.5 popsize better chaotic disturbance fruit flies are obtained.(3) Determine 0.5 popsize current fruit flies with better fitness.Compute the fitness value of each Fruit f ly from current QFOA, and arrange these fruit flies to be a sequence based on the order of fitness values.Then, select the fruit flies with 0.5 popsize ranking ahead in the fitness values.(4) Form the new CQFOA population.Mix the 0.5 popsize better chaotic disturbance fruit flies with 0.5 popsize current fruit flies with better fitness from current QFOA, and form a new population that contains new 1popsize fruit flies, and name it the new CQFOA population.( 5) Complete global chaotic perturbation.After obtaining the new population of CQFOA, take it as the new population of QFOA and continue to execute the QFOA process.

Implementation Steps of CQFOA
The steps of the proposed CQFOA for parameter optimization of an LS-SVR model are as follows as shown in Figure 1.
Step 1. Initialization.The population size of quantum Drosophila is 1 popsize; the maximum number of iterations is Gen-max; the random search radius is R; and the chaos disturbance control coefficient is N GCP .
Step 2. Random searching.For quantum rotation angle, θ ij , of a random search, according to the quantum rotation angle, fruit fly locations on each dimension are updated, and then, a quantum revolving door is applied to update the quantum sequence, as shown in Equations ( 26) and (27) [34,35]: where i is an individual of quantum fruit flies, i = 1, 2, . . ., 1popsize; j is the position dimension of quantum fruit flies, j = 1, 2, . . ., l.As mentioned above, the position of q ij is non-negative constrained, thus, the absolute function, abs() is used to take the absolute value of each element in the calculation result.
Step 3. Calculating fitness.Mapping each Drosophila location, q i , to the feasible domain of an LS-SVR model parameters to receive the parameters, (γ i , σ i ).
The training data are used to complete the training processes of the LS − SVR i model and calculate the forecasting value in the training stage corresponding to each set of parameters.Then, the forecasting error is calculated as in Equation ( 12) of CQFOA by the mean absolute percentage error (MAPE), as shown in Equation ( 28): where N is the total number of data points; f i (x) is the actual load value at point i; and fi (x) is the forecasted load value at point i.
Step 4. Choosing the current optimum.Calculate the taste concentration of fruit fly, Smell i , by using Equation (12), and find the best flavor concentration of individual, Best_Smell i , by Equation ( 13), as the optimal fitness value.
Step 5. Updating global optimization.Compare whether the contemporary odor concentration, Best_Smell i=current , is better than the global optima, Best_Smell i .If so, update the global value by Equation ( 14), and enable the individual quantum fruit fly to fly to the optimal position with vision, as in Equations ( 29) and ( 30), then go to Step 6.Otherwise, go to Step 6 directly. Step

Dataset of Experimental Examples
To test the performance of the proposed LS-SVR-CQFOA model, this paper employs the MEL data from an island data acquisition system in 2014 (IDAS 2014) [45] and the data of GEFCom2014-E [46] to carry out a numerical forecast.Taking the whole time of 24 h as the sampling interval, the

Dataset of Experimental Examples
To test the performance of the proposed LS-SVR-CQFOA model, this paper employs the MEL data from an island data acquisition system in 2014 (IDAS 2014) [45] and the data of GEFCom2014-E [46] to carry out a numerical forecast.Taking the whole time of 24 h as the sampling interval, the load data contains 168-hour load values in total, i.e., from 01:00 14 July 2014 to 24:00 20 July 2014 in IDAS 2014 (namely IDAS 2014), and another two load datasets with the same 168-hour load values, i.e., from 01:00 1 January 2014 to 24:00 7 January 2014 (namely GEFCom2014 (Jan.)) and from 01:00 1 July 2014 to 24:00 7 July 2014 (namely GEFCom2014 (July)) in GEFCom2014-E, respectively.
The preciseness and integrity of historical data directly impact the forecasting accuracy.The data of the historical load are collected and obtained by electrical equipment.To some extent, the data transmission and measurement will lead to some "bad data" in the data of historical load, which mainly includes missing and abnormal data.If these data are used for modeling, the establishment of load forecasting model and the forecasting will bring adverse effects.Thus, the preprocessing of historical data is essential to load forecasting.In this paper, before the numerical test, the data of the MEL are preprocessed, including: completing the missing data; identifying abnormal data; eliminating and replacing unreasonable data; and normalizing data.When the input of an LS-SVR model is multidimensional with a large data size (e.g., several orders of magnitude), it may lead to problems when using the raw data to implement model training directly.Therefore, it is essential that the sample data are normalized for processing, to keep all the sample data values in a certain interval (this topic limits [0,1]), ensuring that all of the data have the same order of magnitude.
The normalization of load data is converted according to Equation (31), where i = 1, 2, . . ., N (N is the number of samples); x i and y i represent the values of before and after the normalization of sample data, respectively; and min(x i ) and max(x i ) represent the minimal and maximal values of sample data, respectively.
After the end of the forecasting, it is necessary to use the inverse normalization equation to calculate the actual load value, as shown in Equation ( 32): The normalized data of the values in IDAS 2014, GEFCom2014 (Jan.) and GEFCom2014 (July) are collected and shown in Tables 1-3, respectively.During the modeling processes, the load data are divided into three parts: the training set with the former 120 h, the validation set with the middle 24 h, and the testing set with the latter 24 h.Then, the rolling-based modeling procedure, proposed by Hong [18,47], is applied to assist CQFOA to look for appropriate parameters, (γ, σ), of an LS-SVR model during the training stage.Repeat this modeling procedure until all forecasting loads are received.The training error and the validation error can be calculated simultaneously.The adjusted parameters, (γ, σ), would be selected as the most suitable parameters only with both the smallest validation and testing errors.The testing dataset is never used during the training and validation stages; it will only be used to calculate the forecasting accuracy.Eventually, the 24 h's load data are forecasted by the proposed LS-SVR-CQFOA model.

Forecasting Accuracy Index
This study uses the MAPE (mentioned in Equation ( 28)), the root mean square error (RMSE), and the mean absolute error (MAE) as forecasting accuracy indexes.The RMSE and MAE are defined as in Equations ( 33) and (34), respectively: where N is the total number of data points; f i (x) is the actual value at point i; and fi (x) is the forecasting value at point i.

Forecasting Performance Improvement Tests
To demonstrate the significant forecasting performances of the proposed model, Diebold and Mariano [48] and Derrac et al. [49] suggest that, for a small data size (24-h load forecasting) test, a Wilcoxon signed-rank test [50] is suitable.Thus, we decided to apply the Wilcoxon signed-rank test.For the same data size, a Wilcoxon test detects the significance of the difference (i.e., the forecasting errors from two forecasting models) in the central tendency.Therefore, let d i be the absolute forecasting errors from any two models on ith forecasting value: R + be the sum of ranks that d i > 0; R − the sum of ranks that d i < 0. If d i = 0, then, remove this comparison and decrease the sample size.The statistics of Wilcoxon test, W, is calculated as in Equation ( 37): If W is smaller than or equal to the critical value, based on the Wilcoxon distribution under n degrees of freedom, then the null hypothesis (i.e., equal performance from the two compared forecasting models) could not be accepted, i.e., the proposed model achieves significance.The parameters of the proposed CQFOA algorithm for the numerical example are set as follows: the population size, popsize, is set to 200; the maximal iteration, gen-max, is set to 1000; and the control coefficient of chaotic disturbance, N GCP , is set to 15.These two parameters of the LS-SVR model are set as, γ ∈ [0, 1000], and σ ∈ [0, 500], respectively.The iterative time of each algorithm is set as the same to ensure the reliability of the forecasting results.

Results and Analysis
Considering the CQPSO, CQTS, and CQGA algorithms have been used to determine the parameters of an SVR-based load forecasting model in [36][37][38][39], those existing algorithms are also hybridized with an LS-SVR model to provide forecasting values to compare with the proposed model here.These alternative models include LS-SVR-FOA, LS-SVR-QFOA, LS-SVR-CQPSO (LS-SVR hybridized with chaotic quantum particle swarm optimization algorithm [36]), LS-SVR-CQTS (LS-SVR hybridized with chaotic quantum Tabu search algorithm [37]), LS-SVR-CQGA (LS-SVR hybridized with chaotic quantum genetic algorithm [38]), and LS-SVR-CQBA (LS-SVR hybridized with chaotic quantum bat algorithm [39]), in order to compare the forecasting performance of LS-SVR-based models comprehensively, this article also selects BPNN method as a contrast model.The parameters of an LS-SVR model are selected by the CQPSO, CQTS, CQGA, CQBA, FOA, QFOA, and CQFOA algorithms, respectively.The details of the suitable parameters of all models for the IDAS 2014, the GEFCom2014 (Jan.) and the GEFCom2014 (July) data are shown in Tables 4-6, respectively.Based on the same training settings, another representative model, the back-propagation neural network (BPNN) is compared with the proposed model.The forecasting results of these models mentioned above and the actual values for IDAS 2014, GEFCom2014 (Jan.) and GEFCom2014 (July) are given in Figures 2-4, respectively.This indicates that the proposed LS-SVR-CQFOA model achieves a better performance than the other alternative models, i.e., closer to the actual load values.Tables 7-9 indicate the evaluation results from different forecasting accuracy indexes for IDAS 2014, GEFCom2014 (Jan.) and GEFCom2014 (July), respectively.For Table 7, the proposed LS-SVR-CQFOA model achieves smaller values for all employed accuracy indexes than the seven other models: RMSE (14.10),MAPE (2.21%), and MAE (13.88), respectively.For Table 8, similarly, the proposed LS-SVR-CQFOA model also achieves smaller values for all employed accuracy indexes compared to the seven other models: RMSE (40.62),MAPE (1.02%), and MAE (39.76), respectively.Similarly in Table 9, the proposed LS-SVR-CQFOA model also achieves smaller values for all employed accuracy indexes than the other seven models: RMSE (38.70),MAPE (1.01%), and MAE (37.48), respectively.The details of the analysis results are as follows.Tables 7-9 indicate the evaluation results from different forecasting accuracy indexes for IDAS 2014, GEFCom2014 (Jan.) and GEFCom2014 (July), respectively.For Table 7, the proposed LS-SVR-CQFOA model achieves smaller values for all employed accuracy indexes than the seven other models: RMSE (14.10),MAPE (2.21%), and MAE (13.88), respectively.For Table 8, similarly, the proposed LS-SVR-CQFOA model also achieves smaller values for all employed accuracy indexes compared to the seven other models: RMSE (40.62),MAPE (1.02%), and MAE (39.76), respectively.Similarly in Table 9, the proposed LS-SVR-CQFOA model also achieves smaller values for all employed accuracy indexes than the other seven models: RMSE (38.70),MAPE (1.01%), and MAE (37.48), respectively.The details of the analysis results are as follows.Finally, to test the significance in terms of forecasting accuracy improvements from the proposed LS-SVR-CQFOA model, the Wilcoxon signed-rank test is conducted under two significant levels, α = 0.025 and α = 0.05, by one-tail test.The test results for the IDAS 2014, the GEFCom2014 (Jan.), and the GEFCom2014 (July) datasets are described in Tables 10-12, respectively.In these three tables, the results demonstrate that the proposed LS-SVR-CQFOA model achieved significantly better forecasting performance than the other alternative models.For example, in the IDAS 2014 dataset, for LS-SVR-CQFOA vs. LS-SVR-CQPSO, the statistic of Wilcoxon test, W = 72, is smaller than the critical statistics, W** = 81 (under α = 0.025) and W* = 91 (under α = 0.05), thus we could conclude that the proposed LS-SVR-CQFOA model is significantly outperform the LS-SVR-CQPSO model.In addition, the p-value = 0.022 is also smaller than the critical α = 0.025 and α = 0.05, which support the conclusion.mapping function to determine more appropriate parameters of an LS-SVR model to improve the forecasting accuracy.
Comparing the LS-SVR-QFOA model with the LS-SVR-FOA model, the forecasting accuracy of the LS-SVR-QFOA model is superior to that of the LS-SVR-FOA model.This demonstrates that the QCM empowers the fruit fly to have quantum behaviors, i.e., the QFOA find more appropriate parameters of an LS-SVR model, which improves the forecasting accuracy of the LS-SVR-FOA model in which the FOA is hybridized with an LS-SVR model.For example, in Table 4, the usage of the QCM in FOA changes the forecasting performances (RMSE = 15.93,MAPE = 2.48%, MAE = 15.63) of the LS-SVR-FOA model to the much better performance (RMSE = 14.87,MAPE = 2.32%, MAE = 14.61) of the LS-SVR-QFOA model.Similar results are demonstrated in the GEFCom2014 (Jan.) and the GEFCom2014 (July) from Tables 5 and 6, respectively.
For forecasting performance comparison between the LS-SVR-CQFOA and LS-SVR-QFOA models, the values of RMSE, MAPE, and MAE for the LS-SVR-CQFOA model are smaller than those of the LS-SVR-QFOA model.This reveals that the introduction of cat chaotic mapping function into QFOA plays a positive role in searching appropriate parameters when the population of QFOA algorithm is trapped into local optima.Then, the CQFOA finds more appropriate parameters.As a result, as shown in Table 4, employing CQFOA to select the parameters for an LS-SVR model markedly improves the performance (RMSE = 14.87,MAPE = 2.32%, MAE = 14.61) of the LS-SVR-QFOA model to the much better one (RMSE = 14.10,MAPE = 2.21%, MAE = 13.88) of the LS-SVR-CQFOA model.Similar results are illustrated in the GEFCom2014 (Jan.) and the GEFCom2014 (July) from Tables 5 and 6, respectively.
Comparing the time-consuming problem during the parameter searching processes in all the IDAS 2014, the GEFCom2014 (Jan.), and the GEFCom2014 (July) datasets, the proposed CQFOA is less than that of the CQGA and CQBA algorithms, but more than that of the CQPSO and CQTS algorithms.However, considering the time requirements of the actual application, the increase in time compared with CQPSO (more than 7 s) and CQTS (more than 23 s) is acceptable.
Finally, some limitations should be noticed.This paper only employs an existing dataset to establish the proposed model; thus, for different seasons, months, weeks, and dates, the electricity load patterns should be changed season by season, month by month, and week by week.For real-world applications, this paper should be a good beginning to guide planners and decision-makers to establish electricity load forecasting models overlapping the seasons, months, and weeks to achieve more comprehensive results.Thus, our planned future research direction is to explore the feasibility of hybridizing more powerful novel optimization frameworks (e.g., chaotic mapping functions, quantum computing mechanism, and hourly, daily, weekly, monthly adjusted mechanism) and novel meta-heuristic algorithms with an LS-SVR model to overcome the drawbacks of evolutionary algorithms to achieve excellent forecasting accuracy.

Conclusions
This paper proposes a novel hybrid forecasting model by hybridizing an LS-SVR model with the QCM, the cat chaotic mapping function, and the FOA.The forecasting results show that the proposed model achieves better performance than the alternative forecasting models, by hybridizing chaotic mapping function, QCM, and other evolutionary algorithms with an LS-SVR-based model.Employing the cat chaotic mapping function to enrich the diversity of searching scope and enhance the ergodicity of the population could successfully avoid trapping into local optima, and, also proves that applying QCM to overcome the limitations of the fruit fly's searching behaviors empowers the fruit fly to undertake quantum searching behaviors, thereby achieving more satisfactory results for MEL forecasting.The global chaotic perturbation strategy based on the cat mapping function is employed to jump out of local minima while the population of QFOA suffers from premature convergence, and also helps to improve the forecasting performance.

Energies 2018 ,
11, x 15 of 21GEFCom2014 (July) are given in Figures2-4, respectively.This indicates that the proposed LS-SVR-CQFOA model achieves a better performance than the other alternative models, i.e., closer to the actual load values.

Author
Contributions: M.-W.L. and W.-C.H. conceived and designed the experiments; J.G. and Y.Z.performed the experiments; M.-W.L. and W.-C.H. analyzed the data and wrote the paper.
6. Global chaos perturbation judgment.If the distance from the last disturbance is equal to N GCP , go to Step 7; otherwise, go to Step 8. Step 7. Global chaos perturbation operations.Based on the current population, conduct the global chaos perturbation algorithm to obtain the new CQFOA population.Then, take the new CQFOA population as the new population of QFOA, and continue to execute the QFOA process.
Step 8. Iterative refinements.Determine whether the current population satisfies the condition of evolutionary termination.If so, stop the optimization process and output the optimal results.Otherwise, repeat Steps 2 to 8.

Table 1 .
Normalization values of load data for IDAS 2014.

Table 3 .
Normalization values of load data for GEFCom2014 (July).

Table 4 .
LS-SVR parameters, MAPE, and computing times of CQFOA and other algorithms for IDAS 2014.

Table 5 .
Parameters combination of LS-SVR determined by CQFOA and other algorithms for GEFCom2014 (Jan.).

Table 6 .
Parameters combination of LS-SVR determined by CQFOA and other algorithms for GEFCom2014 (July).

Table 7 .
Forecasting indexes of LS-SVR-CQFOA and other models for IDAS 2014.

Table 7 .
Forecasting indexes of LS-SVR-CQFOA and other models for IDAS 2014.