Individualized Short-Term Electric Load Forecasting Using Data-Driven Meta-Heuristic Method Based on LSTM Network

Short-term load forecasting is a promising technology for demand prediction and one of the most critical inputs for scheduling power plant units. It is therefore imperative to develop new methods that support power system operation and electricity management. This paper proposes an approach for short-term electric load forecasting using long short-term memory (LSTM) networks and an improved sine cosine algorithm called MetaREC. First, the LSTM network, a special kind of recurrent neural network, is used because it can store and transmit both long-term and short-term memories. Next, four important parameters are determined using the sine cosine algorithm based on a logistic chaos operator and a multilevel modulation factor, to overcome the inaccuracy in LSTM prediction caused by the manual selection of parameter values. Moreover, the MetaREC method outperforms other methods with regard to convergence accuracy and convergence speed on a variety of test functions. Finally, the MetaREC_LSTM is compared with a back propagation neural network, an LSTM network with default parameters, an LSTM network with the conventional sine cosine algorithm, and an LSTM network with whale optimization for power load forecasting on a real electric load dataset. Simulation results demonstrate that the MetaREC_LSTM achieves high accuracy and stability for short-term power load forecasting.


Introduction
The smart grid (SG) is a new type of power system that has emerged in recent years and is widely used by power companies due to its accuracy in power load forecasting [1][2][3][4]. Energy issues are very important nowadays, especially with the spread of appliances and the concepts of the Internet of Things [5,6], the smart house [7], and the smart city [8]. Energy efficiency is therefore a key problem, and a lot of attention is paid to reliable energy sources for domestic use [9][10][11] to maintain the required infrastructure. There are also many energy saving techniques [12,13]. However, the main problems of energy efficiency lie in energy generation and transfer, in particular, inaccurate sensors in production facilities [14,15]. These inaccuracies have a considerable effect on the world economy. According to one estimate [16], yearly losses are about USD 400 million worldwide; therefore, many studies are carried out in this field [17,18]. In particular, new sensors [15,17] and techniques for data processing were proposed [17,19]. However, proper

Materials and Methods
Short-term electric load forecasting is critical to daily life. Electricity load forecasting models are mainly divided into traditional forecasting models and forecasting models based on machine learning methods, and much research has been conducted to improve forecasting accuracy. To describe previous studies on electric load forecasting more clearly, they are divided here into single forecasting models and hybrid integrated forecasting models, which are discussed below.
A single model is defined as a forecasting model that is used only to forecast the electric load. In [43], the authors proposed a multilayer bidirectional recurrent neural network based on the LSTM and the gated recurrent unit (GRU). They used this network for short-term power load forecasting, and compared this method with the LSTM, SVM, and BP on two data sets. The comparison of the prediction results showed the superiority of the method proposed in [43]. An improved exponential smoothing gray model based on this model is proposed in [44], and this model is applied to short-term power load forecasting. The method not only improves the accuracy of short-term power load forecasting but also extends the application scope of gray forecasting. In [45], a radial basis function neural network (RBFNN) is used in the prediction of electric load, and the results show that it has considerable accuracy and stability in the prediction of electric load. An autoregressive integrated moving average (ARIMA) and support vector machine (SVM)-based power forecasting method was proposed [46]. Its core idea is to first use ARIMA to forecast the daily load and then use SVM to correct the previously obtained forecast deviations. The experimental results show the high accuracy of the method for forecasting large sample electricity data. A model using dynamic neural networks for power load forecasting was presented in [47] and the structure of the neural network was also validated in the paper using regression plots. The experimental results show the efficiency of the method in [47] for forecasting complex time series. The authors in [48] proposed a support vector regression (SGA-SVR) based on a sequential grid approach for forecasting electric loads. In the experimental results, the SGA-SVR showed considerable prediction performance.
A hybrid integrated prediction framework combines feature extraction or optimization models with prediction models to achieve improved prediction accuracy. A hybrid power forecasting framework based on SVM and ant colony optimization was proposed in [49]. The authors compared the performance of this method with SVM and BP on short-term power loads, and the results showed that this method can achieve better prediction accuracy. In [50], a hybrid power forecasting method based on the generalized regression neural network (GRNN) and the fruit fly optimization algorithm (FOA) was proposed. The authors used the FOA to select appropriate propagation parameters in the GRNN. This method was then compared with a variety of other forecasting methods, and the experimental results prove its effectiveness [50]. A hybrid power forecasting method based on the least squares support vector machine (LSSVM) and the moth-flame optimization algorithm (MFO) was proposed in [51]. The authors in [36] used MFO to determine two parameters (σ and C) in the LSSVM model. A prediction method that combines second-order oscillation and repulsive particle swarm optimization to optimize SVM parameters was proposed in [52] and applied to power data prediction in Singapore. In [53], a power load forecasting method combining the differential evolution (DE) algorithm with SVR was proposed, and its forecasting performance was compared with SVR with default parameters, the BP artificial neural network (BPNN), and regression forecasting methods. The experimental results show the efficiency of the method in [53]. The authors in [54] used global optimal particle swarm optimization (GPAO) to improve the prediction accuracy of feedforward artificial neural networks (ANNs) and tested this model on ISO New England grid data. The test results demonstrated the prediction accuracy of the method. A wavelet neural network (WNN) hybrid electric load forecasting model based on improved empirical mode decomposition (IEMD), the autoregressive integrated moving average (ARIMA), and the FOA was proposed in [55]; the simulation results show that the model has good performance in power load forecasting.
Although all the above forecasting methods in the literature have achieved different degrees of improvement in electric load forecasting, few researchers have applied an LSTM forecasting model whose parameters are optimized by the sine cosine algorithm (SCA) to electric load forecasting. The LSTM performs well on long-sequence problems, and the SCA is an optimization algorithm with considerable optimization effect. The purpose of this paper is to propose a power load forecasting model in which the LSTM parameters are optimized by an improved SCA, in order to achieve higher forecasting accuracy.

Problem Formulation
In this section, the short-term electric load forecasting problem is defined mathematically and an optimized forecasting framework is presented.
Electric load forecasting first splits the historical power load data into training and testing sets, then trains the prediction model on the training set, and finally validates the fitted model on the testing set. The main symbols and their meanings in this paper are listed in Table 1. Assuming that n is the number of samples in the testing set, the optimization problem of electric load forecasting can be defined as follows:

$$\min_{x} f(x) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2, \quad \text{s.t. } L_j \le x_j \le U_j, \; j = 1, 2, 3, 4 \tag{1}$$

where x is a solution in the optimization algorithm. It is a row vector with four elements, which represent the learning rate, the number of training sessions, the number of neurons in the first layer, and the number of neurons in the second layer in the LSTM network, respectively. $L_j$ and $U_j$ are the lower and upper bounds of the j-th parameter, respectively. f(x) is the fitness value of the solution x, that is, the mean square error of the optimized LSTM on the testing set. n is the number of samples in the testing set, and Y represents the testing set. $y_i$ denotes the real power load data of the i-th time period, and $\hat{y}_i$ denotes the forecast power load data for the i-th time period. The short-term power load forecasting framework in this paper is shown in Figure 1, and the MetaREC method will be described in detail in Section 4.
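The formulation above can be illustrated with a minimal Python sketch of the two ingredients it names: the box constraint on the four-element solution and the mean square error used as the fitness value. The bound values in `LOWER` and `UPPER` and the helper names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Hypothetical bounds for (learning rate, epochs, N1, N2); illustrative only.
LOWER = [0.001, 10, 8, 8]
UPPER = [0.1, 200, 128, 128]


def clip_to_bounds(x, lower=LOWER, upper=UPPER):
    """Project a candidate solution onto the box L_j <= x_j <= U_j."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]


def mse(y_true, y_pred):
    """Mean square error between real and forecast loads (the fitness value)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


candidate = clip_to_bounds([0.5, 50, 4, 300])
print(candidate)                            # [0.1, 50, 8, 128]
print(mse([100.0, 110.0], [98.0, 113.0]))   # 6.5
```

In a full pipeline, the forecast values passed to `mse` would come from an LSTM trained with the parameters encoded in the candidate solution.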

Figure 1. MetaREC_LSTM prediction framework for demand prediction with electricity management.

Standard Sine Cosine Algorithm
The essence of the sine cosine algorithm (SCA) is to find the optimal value by using the perturbation properties of the sine and cosine functions [56][57][58]. Contrasted with other meta-heuristic algorithms, the advantages of the SCA are as follows: fewer parameters and a simpler structure. The optimization search process of the SCA can be split into three steps as follows.
Step 1. Determination of initial populations. The initial population is calculated according to the following:

$$X(i, j) = X(\min, j) + R \cdot \left( X(\max, j) - X(\min, j) \right) \tag{2}$$

where X(max, j) and X(min, j) are the upper and lower limits of the individual on dimension j, respectively, and R is a random number within (0, 1).
Step 2. Calculation of the amplitude factor, $r_1(t)$, and the random numbers $r_2$, $r_3$, and $r_4$. The amplitude factor is the key part that controls how the SCA switches between global search and local search, and it is updated as follows:

$$r_1(t) = \alpha \left( 1 - \frac{t}{T} \right) \tag{3}$$

The parameters $r_2$, $r_3$, and $r_4$ are random numbers, each obeying a uniform distribution. α is a constant, generally set to 2, and T is the total number of iterations of the algorithm in the optimization search process.
Step 3. Renewal of populations. The population is updated according to the following formula:

$$X^{t+1}(i, j) = \begin{cases} X^{t}(i, j) + r_1(t) \sin(r_2) \left| r_3 P^{t}(i, j) - X^{t}(i, j) \right|, & r_4 < 0.5 \\ X^{t}(i, j) + r_1(t) \cos(r_2) \left| r_3 P^{t}(i, j) - X^{t}(i, j) \right|, & r_4 \ge 0.5 \end{cases} \tag{4}$$

where $X^{t}(i, j)$ denotes the position of individual i in dimension j at iteration t, and $P^{t}(i, j)$ represents the optimum position achieved in the previous t iterations. The SCA has been successfully used in many fields owing to its efficient optimization capability. However, thus far, little research has used an optimized SCA to solve optimum production schemes for electricity. Therefore, in this paper, an improved version of the SCA is proposed for short-term power load forecasting.
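The three steps above can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the sampling ranges of $r_2$ (0 to 2π), $r_3$ (0 to 2), and $r_4$ (0 to 1) follow the commonly used SCA formulation and are assumptions here, as are the function names, the clipping to the bounds, and the choice to refresh the best position immediately after each individual is updated.

```python
import math
import random


def sphere(x):
    """Simple test objective with minimum 0 at the origin."""
    return sum(v * v for v in x)


def sca_minimize(f, lower, upper, pop=20, iters=200, alpha=2.0, seed=0):
    """Minimal sine cosine algorithm sketch (assumed parameter ranges)."""
    rng = random.Random(seed)
    dim = len(lower)
    # Step 1: random initial population inside the bounds.
    X = [[lower[j] + rng.random() * (upper[j] - lower[j]) for j in range(dim)]
         for _ in range(pop)]
    best = min(X, key=f)[:]
    best_fit = f(best)
    for t in range(iters):
        # Step 2: amplitude factor decreases linearly from alpha towards 0.
        r1 = alpha * (1 - t / iters)
        for i in range(pop):
            for j in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)  # assumed range
                r3 = rng.uniform(0.0, 2.0)            # assumed range
                r4 = rng.random()
                wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                # Step 3: move relative to the best position found so far.
                new = X[i][j] + r1 * wave * abs(r3 * best[j] - X[i][j])
                X[i][j] = min(max(new, lower[j]), upper[j])
            fit = f(X[i])
            if fit < best_fit:
                best, best_fit = X[i][:], fit
    return best, best_fit


best, fit = sca_minimize(sphere, [-10.0, -10.0], [10.0, 10.0], seed=1)
print(best, fit)
```

Because the best position is only replaced when a strictly better one is found, the returned fitness can never be worse than that of the best initial individual.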

Modified Sine Cosine Method
The conventional SCA uses a random initialization strategy in the population initialization phase, which makes the algorithm prone to falling into local optima and slows its convergence. The introduction of a chaos operator can mitigate this drawback. Chaos is a phenomenon in which deterministic systems spontaneously produce instability. Owing to its three characteristics, namely randomness, regularity, and ergodicity, chaotic motion can traverse all states without repetition in a certain range according to its own laws. Therefore, using chaotic variables in an optimization search is far more efficient than using blind and disorderly random searches [59][60][61].
There are various types of chaotic variables, and in this paper, the logistic chaos operator is used, whose expression is shown below. The logistic chaos mapping is shown in Section 4.2.1.

$$X(t + 1) = \mu X(t) \left( 1 - X(t) \right)$$

where t is the number of iteration steps, and for any t there is X(t) ∈ [0, 1]. When µ is taken in different ranges, the results of the logistic system appear in three different states: stable point, periodic, and chaotic. When the parameter µ is in the range of [3.57, 4], the motion trajectory of the logistic map shows chaotic characteristics, with chaos first appearing at µ = 3.57. The main characteristic of chaos is that even small changes in the initialized populations lead to significant differences in the results as time increases. As µ continues to increase, the results of the logistic iteration switch between periodic and chaotic types. When µ = 4, the logistic results are completely chaotic, which eventually leads to a uniform distribution of the results on the interval [0, 1]. In this paper, the logistic operator with µ = 4 is chosen for the experiments.
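As a minimal sketch of how the logistic operator could seed a population, the Python snippet below iterates the map with µ = 4 and scales each chaotic value into the search bounds. The linear scaling into `[lower, upper]`, the starting value `x0`, and the function names are assumptions for illustration; the paper's exact mapping is not reproduced here.

```python
def logistic_sequence(x0, n, mu=4.0):
    """Return n successive iterates of x <- mu * x * (1 - x)."""
    values = []
    x = x0
    for _ in range(n):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_population(pop, lower, upper, x0=0.37, mu=4.0):
    """Build a population by scaling logistic iterates into the bounds.

    x0 should avoid 0, 0.25, 0.5, 0.75, and 1, which collapse onto
    fixed points of the map when mu = 4.
    """
    dim = len(lower)
    seq = logistic_sequence(x0, pop * dim, mu)
    return [[lower[j] + seq[i * dim + j] * (upper[j] - lower[j])
             for j in range(dim)]
            for i in range(pop)]


population = chaotic_population(5, [0.001, 10, 8, 8], [0.1, 200, 128, 128])
for individual in population:
    print(individual)
```

The fixed-point behaviour is easy to check: starting from 0.5, the first iterate is 1.0 and the second is 0.0, after which the sequence stays at 0.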
In addition, in the conventional SCA the value of the adjustment factor, $r_1(t)$, decreases linearly with the number of iterations. According to [62], the SCA converges faster when $r_1(t) < 1$. If the SCA falls into a local optimum at this point, its result becomes unsatisfactory. To solve this problem, this paper uses a multilevel adjustment factor, $r_1^*(t)$, which changes the value of the adjustment factor according to the number of iterations. The multilevel adjustment factor, $r_1^*(t)$, is set to four different equations, as shown in Figure 2, and is defined in Equation (7). The number of iteration rounds is split into four phases, $T_1$, $T_2$, $T_3$, and $T_4$.

Figure 2. Multilevel regulatory factor.

Substituting α = 2 into the α × … and tanh(α × (1 − …)) terms of Equation (7) gives the result in Equation (8).

Sensors 2022, 22, 7900
From (8), when $t \in T_1 \cup T_3$, $r_1^*(t)$ increases from 1 to 2, and the SCA performs a global search. When $t \in T_2 \cup T_4$, $r_1^*(t)$ decreases from tanh(2) to 0, and the SCA performs a local search. The introduction of the multilevel adjustment factor makes the SCA switch between exploration and exploitation several times while searching for the optimum, which reduces the chance of the SCA falling into a local optimum to some extent.
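The alternating behaviour described here can be illustrated with a purely hypothetical schedule. The sketch below assumes four phases of equal length, a linear rise from α/2 to α in the first and third phases, and a tanh decay from tanh(α) to 0 in the second and fourth phases. These forms were chosen only because they reproduce the endpoints stated in the text for α = 2; they should not be read as the paper's Equations (7) and (8).

```python
import math


def multilevel_r1(t, T, alpha=2.0):
    """Hypothetical four-phase adjustment factor (illustrative only).

    Phases 1 and 3 rise linearly from alpha / 2 to alpha (global search);
    phases 2 and 4 fall from tanh(alpha) to 0 (local search).
    """
    phase_len = T / 4.0
    phase = min(int(t // phase_len), 3)           # phase index 0..3
    s = (t - phase * phase_len) / phase_len       # progress inside the phase
    s = min(max(s, 0.0), 1.0)
    if phase % 2 == 0:
        return alpha * (0.5 + 0.5 * s)
    return math.tanh(alpha * (1.0 - s))


for t in (0, 24, 25, 49, 50, 74, 75, 100):
    print(t, round(multilevel_r1(t, 100), 4))
```

With T = 100 the factor starts at 1, approaches 2 by the end of the first phase, jumps to tanh(2) at t = 25, decays towards 0, and then repeats the same pattern in the second half of the run.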
Based on the conventional SCA, the logistic chaos operator is added to initialize the distribution of population solutions in order to enhance the diversity of the initial population of the optimization algorithm. The multilevel regulatory factor, $r_1^*(t)$, replaces the original $r_1$, so that the SCA can change its convergence strategy according to the number of iteration rounds during optimization and avoid falling into a local optimum. With the addition of the logistic chaos operator and the multilevel adjustment factor strategy, the solution is updated as follows.

$$X^{t+1}(i, j) = \begin{cases} X^{t}(i, j) + r_1^*(t) \sin(r_2) \left| r_3 P^{t}(i, j) - X^{t}(i, j) \right|, & r_4 < 0.5 \\ X^{t}(i, j) + r_1^*(t) \cos(r_2) \left| r_3 P^{t}(i, j) - X^{t}(i, j) \right|, & r_4 \ge 0.5 \end{cases} \tag{9}$$
The algorithm of the MetaREC model is shown in Algorithm 1, and the structure diagram is shown in Figure 3.

Algorithm 1.
3. Calculate the fitness value of each solution $X^t$ according to the fitness function f(x) and find the solution $X^t_{best}$ with the smallest fitness value.
7. Update the position of each solution according to Equation (9).
8. Calculate the fitness value of each solution according to f(x).
9. Update the global optimal solution $X^t_{best}$.
10. While (t < T)
11. Output: the global optimal solution $X^t_{best}$ after iteration.

Figure 3. The MetaREC procedure.

Basic LSTM Process
The LSTM is a temporal recurrent neural network [63]. It handles events separated by very long intervals in a time series by adding three control units: the input gate, the forget gate, and the output gate [64][65][66]. The structure of the LSTM is given in Figure 4 and the main process of the LSTM is shown below.

Figure 4. The LSTM structure diagram.

1. The cell information from the previous moment is selectively filtered by the forget gate, which picks out the cell information that has an impact on the current moment before it is fed into the neural network for calculation:

$$f_t = \sigma \left( w_f \cdot [h_{t-1}, x_t] + b_f \right)$$

where $f_t$ is the output of the forget gate, σ is the activation function, $w_f$ and $b_f$ are the weight coefficient and offset, respectively, $h_{t-1}$ is the hidden state of the previous time step, and $x_t$ is the input data of the current step.

2. The input gate determines which information will be stored in the cell state:

$$i_t = \sigma \left( w_i \cdot [h_{t-1}, x_t] + b_i \right)$$

$$a_t = \tanh \left( w_a \cdot [h_{t-1}, x_t] + b_a \right)$$

$$C_t = f_t \odot C_{t-1} + i_t \odot a_t$$

where $i_t$ is the input gate, $a_t$ is the state of the node at moment t, $C_{t-1}$ and $C_t$ denote the state of the cell at moments t − 1 and t, respectively, and $h_{t-1}$ denotes the output at moment t − 1. Here $w_i$, $b_i$, $w_a$, and $b_a$ are the corresponding weight coefficients and offsets.

3. The output gate is calculated to obtain the current hidden layer state $h_t$:

$$O_t = \sigma \left( w_o \cdot [h_{t-1}, x_t] + b_o \right)$$

$$h_t = O_t \odot \tanh(C_t)$$

where $O_t$ is the value obtained from the previous hidden layer state, $h_{t-1}$, together with the current layer input, $x_t$, after computing σ.
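The three gate computations can be traced with a small Python sketch of a single LSTM step. To keep it dependency-free, the sketch assumes a scalar input and a scalar hidden state, uses the standard LSTM formulation, and stores each gate's parameters as a hypothetical `(weight on h_prev, weight on x_t, bias)` tuple; none of these choices come from the paper.

```python
import math


def sigmoid(z):
    """Logistic activation used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step for scalar input and scalar hidden state.

    params maps a gate name ("f", "i", "a", "o") to a tuple
    (weight on h_prev, weight on x_t, bias).
    """
    def pre_activation(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre_activation("f"))     # forget gate
    i_t = sigmoid(pre_activation("i"))     # input gate
    a_t = math.tanh(pre_activation("a"))   # candidate cell state
    c_t = f_t * c_prev + i_t * a_t         # new cell state
    o_t = sigmoid(pre_activation("o"))     # output gate
    h_t = o_t * math.tanh(c_t)             # new hidden state
    return h_t, c_t


zero_params = {gate: (0.0, 0.0, 0.0) for gate in ("f", "i", "a", "o")}
h, c = lstm_step(x_t=1.0, h_prev=0.0, c_prev=1.0, params=zero_params)
print(h, c)
```

With all weights and biases set to zero, every gate outputs 0.5 and the candidate state is 0, so the cell state is simply halved; this makes the data flow easy to verify by hand.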
In addition to the parameters mentioned above, the four parameters that need to be set manually and are important for the prediction efficiency of the LSTM model are the learning rate, α, the number of training sessions, epochs, the number of neurons in the first layer, N 1 , and the number of neurons in the second layer, N 2 . The MetaREC will be used to find the best combination of parameters (α, epochs, N 1 , N 2 ) in the LSTM model in view of its excellent merit-seeking capability.

MetaREC Process via LSTM Network
When it comes to choosing values for the parameters (α, epochs, N 1 , N 2 ) in the LSTM, the most widespread approach is to let the parameters vary within a limited range. With a set of parameters selected, the training set is applied to train the LSTM and obtain prediction accuracy. Finally, the set of parameters that obtains the best prediction accuracy is selected as the optimum parameter combination. In this experiment, the MetaREC is used to find the optimum combination of parameters. The MetaREC_LSTM obtains the optimum prediction accuracy by the following steps. The structure diagram and pseudo-code of the MetaREC_LSTM are shown in Figure 5 and Algorithm 2, respectively.
Step 1. Determine the initial parameters in the MetaREC_LSTM, such as the number of solutions, pop, the dimensionality of the solutions, dim, the lower and upper boundaries on the values of the solutions, [lb, ub], and the number of iterations for the optimization search, Iternum.
Step 2. The combination of parameters (α, epochs, N 1 , N 2 ) is used as the location of the solution, and the location of the initial solution is initialized using the logistic chaos operator.
Step 3. The fitness value of each solution is calculated according to (1); the fitness of each solution represents the training error obtained with the parameter combination encoded by that solution.
Step 4. The position of each solution is updated according to (9), the fitness of each solution is recalculated, and the combination of parameters for which the minimum fitness is obtained is taken as the best combination of parameters.
Step 5. The best combination of parameters is applied to train the LSTM model and make predictions on the testing set.

The values of the important parameters in the LSTM model have a significant impact on the predictive performance of the model. Therefore, it is important to use effective methods to determine the values of these parameters.

LSTM-Based Heuristic Structure for Electric Forecasting
In the LSTM model, four parameters play a crucial role in the prediction performance of the model: the numbers of neurons in the two LSTM layers, hidden_node_1 and hidden_node_2, the learning rate, alpha, and the number of training epochs, num_epochs. The values of these four parameters are taken as the object of the MetaREC optimization. The flow chart used by MetaREC_LSTM to forecast the electric load is shown in Figure 6.
The steps for forecasting the electric load using the MetaREC_LSTM model are as follows.
Step 1: Data preprocessing. The acquired data are normalized and split into training and testing sets.
Step 2: Constructing the LSTM prediction model. Set the ranges of values for the numbers of neurons, $N_1$ and $N_2$, the learning rate, α, and the number of training iterations, epochs.
Step 3: Build the MetaREC model. Initialize the parameters of the model, including the population size, the individual dimension, and the maximum number of iterations, where each solution is a (α, epochs, $N_1$, $N_2$) combination, and the dimension of each solution is 4.
Step 4: Set the fitness function. The mean of squared errors is set as the fitness function, as shown in (16).

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{16}$$

Step 5: The fitness value of each solution is calculated to determine the optimum solution in the population and its corresponding optimum fitness value.
Step 6: The position of each solution is updated according to (9), and the fitness value of each solution is recalculated to update the historical optimum value and the optimum solution.
Step 7: Terminate the iteration. When the number of iterations reaches the set maximum, output the optimum combination of parameters (α, epochs, $N_1$, $N_2$) and the optimum fitness at this point.
Step 8: Forecasting of electrical loads. The best combination of parameters obtained in Step 7 is used as the parameter values for the LSTM model, which is fitted to the training set and then used to make predictions on the testing set.
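Step 1 of the procedure above can be sketched in Python as follows. The text does not state which normalization is used, so min-max scaling is an assumption here, as are the helper names and the synthetic hourly series. The sketch also holds out the final 24 hourly values first and derives the scaling constants from the training part only, which is one common way to keep test information out of training and may differ from the authors' ordering.

```python
def split_last_day(series, hours_per_day=24):
    """Hold out the final day of an hourly series as the testing set."""
    if len(series) <= hours_per_day:
        raise ValueError("series must contain more than one day of data")
    return series[:-hours_per_day], series[-hours_per_day:]


def min_max_fit(values):
    """Return the (min, max) pair used for scaling."""
    return min(values), max(values)


def min_max_apply(values, lo, hi):
    """Scale values with (v - lo) / (hi - lo)."""
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def min_max_invert(values, lo, hi):
    """Map scaled values back to the original load units."""
    return [v * (hi - lo) + lo for v in values]


# Synthetic three-day hourly load used only to exercise the helpers.
hourly_load = [float(100 + (h % 24)) for h in range(72)]
train, test = split_last_day(hourly_load)
lo, hi = min_max_fit(train)
train_scaled = min_max_apply(train, lo, hi)
test_scaled = min_max_apply(test, lo, hi)
print(len(train), len(test), min(train_scaled), max(train_scaled))
```

Forecasts produced on the scaled data would be passed through `min_max_invert` before computing error metrics in the original load units.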

Simulation Results
The experimental platform used to evaluate the performance of all the prediction models in this paper was configured with an Intel(R) Core(TM) i7-10750H 2.6 GHz processor and 16 GB of memory.
The purpose of this section is to illustrate the performance of the proposed method. PSO, WOA, and the conventional SCA are among the more efficient search algorithms in the swarm intelligence domain. Therefore, in the first part of the simulation results, the search capability and convergence speed of the MetaREC proposed in this paper are compared with those of PSO, WOA, and the conventional SCA. In the second part, the MetaREC is used to adjust the set parameters of the LSTM model to predict the electricity load. The prediction results are compared with the results

Table 2. Step: range [−100, 100], F min = 0.

Evaluation Setup
Different evaluation parameters for various electric prediction models in simulation experiments are given as follows:
• Relative percentage error: the magnitude of this parameter illustrates the difference between the predicted and true data of the load. The smaller this parameter is, the better the prediction of the model is [67].
• Mean absolute percentage error (MAPE): if this parameter is 0 the prediction model is perfect and when this parameter is greater than 100%, it means that the model is inferior [68].
• Root mean square error (RMSE): The smaller this parameter is, the better the prediction model is, and vice versa, the bigger the value the worse the model is [69].
• Mean absolute error (MAE): The smaller this parameter is, the better the prediction performance of the prediction model.
• Coefficient of determination (R 2 ): This parameter implies the degree of fit of the prediction model. The closer the value of this parameter is to 1, the better the fit of the model. It is the proportion of variation in the dependent variable that is predicted by the model [70].
where $y_i$ denotes the real power load data of the i-th time period, $\hat{y}_i$ denotes the forecast electric load data of the i-th time period, and $\bar{y}$ is the mean value of the real power load data.
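The metrics listed above can be computed with the short Python sketch below. It assumes the standard textbook definitions of MAPE, RMSE, MAE, and the coefficient of determination; the function names are illustrative, and the exact form of the relative percentage error used in the paper is not reproduced.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mape(actual, forecast), rmse(actual, forecast),
      mae(actual, forecast), r2(actual, forecast))
```

For this two-point example the errors are 10 load units each, so RMSE and MAE are both 10, MAPE is the mean of 10% and 5%, and the coefficient of determination is 1 − 200/5000.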

Test Functions Assessment
The purpose of this section is to test the optimization capability of the MetaREC method in solving complex optimization problems. Commonly used benchmark functions are characterized by three properties: regularity, separability, and multimodality.
In addition, as the dimensionality of the search space increases, the difficulty of finding the optimum of a function increases. Therefore, six benchmark functions were selected and the search space dimensions were set to 10, 50, and 100, respectively, to test the search capability of the MetaREC. The results obtained by the MetaREC were compared with those acquired by PSO, WOA, and SCA. The comparison results are shown in Table 3 and Figures 7-9. The expressions of the six benchmark functions are as follows:

Each algorithm was run 10 times to obtain the mean and variance of the algorithm's search results. In Table 3, the results of the five optimization algorithms (SCA, PSO, FA, WOA, and MetaREC) are shown in different dimensions for the benchmark function search. For the sphere function, the MetaREC obtained better optimum values and variances in all three dimensions than the other four methods. For the Rastrigin function, although the conventional SCA can reach the theoretical optimum at dim = 10, the results in the table show that as the dimensionality of the search space increases, the SCA's ability to find the optimum decreases, whereas the MetaREC always obtains the theoretical optimum. For the quartic, Griewank, and step functions, the optimum values and variances obtained by the MetaREC are better than those obtained by the SCA, PSO, and FA. Only the results obtained by the WOA are similar to those of the MetaREC. In addition, the MetaREC obtained the theoretical optimum value for the Griewank and step functions. The above simulation results and analysis show that the MetaREC proposed in this paper has considerable accuracy and stability in dealing with low, medium, and high dimensional problems.
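For reference, the benchmark functions named in this discussion can be written in Python as below. These are the commonly used textbook definitions and are assumptions here: the paper's own expressions may differ in detail, and the quartic function is often defined with an added random noise term that is omitted in this sketch so that the value at the origin is exactly zero.

```python
import math


def sphere(x):
    """Sum of squares; minimum 0 at the origin."""
    return sum(v * v for v in x)


def rastrigin(x):
    """Highly multimodal function; minimum 0 at the origin."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


def griewank(x):
    """Multimodal function with a product term; minimum 0 at the origin."""
    total = sum(v * v for v in x) / 4000.0
    product = 1.0
    for i, v in enumerate(x, start=1):
        product *= math.cos(v / math.sqrt(i))
    return 1.0 + total - product


def step(x):
    """Piecewise-constant function; minimum 0 for all x_i in [-0.5, 0.5)."""
    return sum(math.floor(v + 0.5) ** 2 for v in x)


def quartic(x):
    """Noise-free quartic: sum of i * x_i**4; minimum 0 at the origin."""
    return sum(i * v ** 4 for i, v in enumerate(x, start=1))


origin = [0.0] * 10
for func in (sphere, rastrigin, griewank, step, quartic):
    print(func.__name__, func(origin))
```

Each function evaluates to zero at the origin, which matches the theoretical optimum F min = 0 reported for the step function in Table 2.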
To show the convergence speed of the MetaREC during optimization, this paper presents the iteration curves of the five optimization algorithms on the six benchmark functions at dim = 10, dim = 50, and dim = 100, respectively. Firstly, the convergence speed of the different optimization algorithms on the six benchmark functions is tested by setting dim = 10. The test results are shown in Figure 7. As shown in Figure 7a,b,d-f, the MetaREC not only gives more accurate iterative results, but also has the fastest convergence rate among the five optimization algorithms. Although the convergence accuracy of the MetaREC is slightly lower than that of WOA in Figure 8c, its convergence speed is significantly higher than that of WOA.
Secondly, dim = 50 was set to test the results of iterations of the five algorithms on the six benchmark functions. The simulation results are exhibited in Figure 8. In Figure 9a, the MetaREC achieves the optimum accuracy along with the optimum iterative convergence speed. As shown in Figure 8b,d-f, the convergence speed and accuracy of the MetaREC's optimization search are significantly higher than those of the SCA, PSO, and FA. Although the convergence accuracy of WOA can be the same as that of the MetaREC, the convergence speed of the MetaREC is significantly greater than that of the WOA. Finally, dim = 100 was set to test the convergence results of the different algorithms on the six benchmark functions. The simulation results are presented in Figure 9; the convergence speed of the MetaREC almost does not change as the dimensionality of the search space increases. In Figure 9b,d-f, the convergence speed of the MetaREC is still the fastest among the five algorithms.
From the above experimental results, it can be concluded that in most cases MetaREC achieves higher accuracy while also converging faster.
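As an illustration of how an iteration curve of this kind can be produced, the following is a minimal sketch of the conventional SCA (not the improved MetaREC variant) recording its best-so-far objective value at each iteration. The sphere function, the population size, the iteration count, the search bounds, and the names `sphere` and `sca_convergence` are assumptions chosen for the example; they are not taken from this paper's experimental setup.

```python
import numpy as np

def sphere(x):
    """Sphere benchmark: f(x) = sum(x_i^2), global minimum 0 at the origin."""
    return float(np.sum(x ** 2))

def sca_convergence(objective, dim, n_agents=30, n_iter=200,
                    lb=-100.0, ub=100.0, a=2.0, seed=0):
    """Run the conventional SCA and return the best-so-far value per iteration."""
    rng = np.random.default_rng(seed)
    positions = rng.uniform(lb, ub, size=(n_agents, dim))
    fitness = np.array([objective(p) for p in positions])
    best_idx = int(np.argmin(fitness))
    best_pos = positions[best_idx].copy()
    best_val = float(fitness[best_idx])
    curve = []
    for t in range(n_iter):
        r1 = a - t * (a / n_iter)  # step amplitude, decreasing linearly from a
        r2 = rng.uniform(0.0, 2.0 * np.pi, size=(n_agents, dim))
        r3 = rng.uniform(0.0, 2.0, size=(n_agents, dim))
        r4 = rng.uniform(0.0, 1.0, size=(n_agents, dim))
        distance = np.abs(r3 * best_pos - positions)
        # Sine update where r4 < 0.5, cosine update otherwise.
        step = np.where(r4 < 0.5,
                        r1 * np.sin(r2) * distance,
                        r1 * np.cos(r2) * distance)
        positions = np.clip(positions + step, lb, ub)
        fitness = np.array([objective(p) for p in positions])
        idx = int(np.argmin(fitness))
        if fitness[idx] < best_val:  # keep the best solution found so far
            best_val = float(fitness[idx])
            best_pos = positions[idx].copy()
        curve.append(best_val)
    return np.array(curve)

# One curve per search-space dimensionality, mirroring dim = 10, 50, 100.
curves = {dim: sca_convergence(sphere, dim) for dim in (10, 50, 100)}
for dim, curve in curves.items():
    print(f"dim={dim}: first best={curve[0]:.3e}, final best={curve[-1]:.3e}")
```

Plotting each array in `curves` against the iteration index on a logarithmic vertical axis would give curves comparable in form to those in Figures 7 to 9, although the values themselves depend on the assumed settings.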

Comparison of Electricity Load Forecasting
The purpose of this section is to apply the forecasting techniques of this paper to electrical load data and to test their performance. In this experiment, published electrical load data from a region in Zhejiang, China, are used. The dataset includes the hourly electrical load values (0-23 h) for each day from 13 February to 20 May 2021. The load data from 13 February to 19 May were used as the training set to train the forecasting model, and the data from 20 May were used as the testing set to check the performance of the forecasting technique. To verify the efficiency of the MetaREC_LSTM technique, the test results are compared with those of the LSTM, WOA_LSTM, and SCA_LSTM models.
Table 4 gives the absolute values of the deviations between the predicted and actual electric load for the 0-23 h period using three forecasting models: WOA_LSTM, SCA_LSTM, and MetaREC_LSTM. According to Table 4, the minimum and maximum absolute deviations of the WOA_LSTM forecasts are 0.01% at 21:00 and 4.79% at 6:00, respectively. The minimum and maximum absolute deviations of the SCA_LSTM forecasts are 0.53% at 16:00 and 4.82% at 6:00, respectively. The minimum and maximum absolute deviations of the MetaREC_LSTM forecasts are 0.02% at 0:00 and 3.18% at 6:00, respectively. These data show that the minimum deviation of MetaREC_LSTM is similar to that of WOA_LSTM and smaller than that of SCA_LSTM. Furthermore, the maximum deviation of MetaREC_LSTM is markedly smaller than those of WOA_LSTM and SCA_LSTM.
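The hourly deviations discussed above can be sketched as follows. This is a minimal example that assumes the deviation is expressed relative to the actual load; the load values are placeholders rather than entries of Table 4, and the helper names `abs_pct_deviation` and `extreme_hours` are hypothetical.

```python
import numpy as np

def abs_pct_deviation(actual, predicted):
    """Absolute deviation of each forecast as a percentage of the actual load."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(predicted - actual) / np.abs(actual) * 100.0

def extreme_hours(deviation):
    """Return (hour, value) pairs for the smallest and largest deviation."""
    lo = int(np.argmin(deviation))
    hi = int(np.argmax(deviation))
    return (lo, float(deviation[lo])), (hi, float(deviation[hi]))

# Placeholder loads for four consecutive hours (illustrative, not from Table 4).
actual = [100.0, 200.0, 400.0, 500.0]
predicted = [101.0, 190.0, 400.0, 510.0]

deviation = abs_pct_deviation(actual, predicted)
(min_hour, min_dev), (max_hour, max_dev) = extreme_hours(deviation)
print(f"min deviation {min_dev:.2f}% at hour {min_hour}")
print(f"max deviation {max_dev:.2f}% at hour {max_hour}")
```

With a full 24-value day, the same two calls would return the hours at which each model performs best and worst, which is the comparison made for 6:00, 16:00, and 21:00 in the text.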
Using the error data from Table 4, the minimum, maximum, first quartile, median, third quartile, mean, and interquartile range of the absolute errors were computed for each method.
From Table 5 and Figure 11, the minimum error, median, and mean of WOA_LSTM and MetaREC_LSTM are quite close, but the upper 50% of the errors of WOA_LSTM spans a far wider range than that of MetaREC_LSTM. For instance, the third quartile of WOA_LSTM is more than 20% larger, and its maximum value is 50.6% larger, than the corresponding values for MetaREC_LSTM. The mean, median, and first quartile of MetaREC_LSTM are slightly larger than those of WOA_LSTM and somewhat smaller than those of SCA_LSTM. The third quartile of MetaREC_LSTM is almost the same as that of SCA_LSTM. The maximum absolute error of SCA_LSTM is almost the same as that of WOA_LSTM and considerably larger than that of MetaREC_LSTM. Thus, the proposed method considerably reduces the magnitude of prediction errors.
Figure 11. Minimum and maximum errors, mean (marked with cross), the first quartile, the median error, and the third quartile of absolute values of errors for the three methods.
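The summary statistics reported in Table 5 and drawn in Figure 11 can be computed along the following lines. This is a sketch only: it assumes quartiles are obtained by NumPy's default linear interpolation, which may differ from the convention used for Table 5, and the function name `error_summary` and the sample values are illustrative.

```python
import numpy as np

def error_summary(abs_errors):
    """Box-plot statistics of a sequence of absolute errors."""
    e = np.asarray(abs_errors, dtype=float)
    q1, median, q3 = np.percentile(e, [25, 50, 75])
    return {
        "min": float(e.min()),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "max": float(e.max()),
        "mean": float(e.mean()),
        "iqr": float(q3 - q1),  # interquartile range
    }

# Illustrative absolute errors in percent (not taken from Table 4).
summary = error_summary([1.0, 2.0, 3.0, 4.0, 5.0])
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Applying `error_summary` to the 24 hourly errors of each model would yield one row per method, which is the layout the comparison in the text relies on.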
In Table 6, the values of MAPE, RMSE, MAE, and R² are given for the BP, LSTM, WOA_LSTM, SCA_LSTM, and MetaREC_LSTM models on the testing dataset. According to Table 6, the values of MAPE, RMSE, MAE, and R² obtained by the MetaREC_LSTM prediction technique proposed in this paper are the best among the five prediction methods. The MAPE, RMSE, and MAE of MetaREC_LSTM are approximately 28%, 33%, and 29% lower than those of LSTM, respectively; 5%, 15%, and 9% lower than those of WOA_LSTM, respectively; and 23%, 24%, and 25% lower than those of SCA_LSTM, respectively. In terms of R², MetaREC_LSTM is approximately 1.00%, 0.31%, and 0.59% higher than LSTM, WOA_LSTM, and SCA_LSTM, respectively. The graphic presentation is given in Figure 12.
It shows the load curves predicted by the five methods together with the true load curve. As can be seen in Figure 12, the forecasting curve of MetaREC_LSTM fits the real load curve better than those of the other four methods, indicating that MetaREC_LSTM has higher forecasting accuracy and supporting its efficiency in electricity load forecasting.
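For reference, the four indicators compared in Table 6 can be sketched with their standard textbook definitions, which are assumed here to match the ones used in this paper. The sample loads are illustrative values, not measurements from the dataset.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the load."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error, in the units of the load."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative loads and forecasts (not taken from the paper's dataset).
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

print(f"MAPE = {mape(y_true, y_pred):.3f}%")
print(f"RMSE = {rmse(y_true, y_pred):.3f}")
print(f"MAE  = {mae(y_true, y_pred):.3f}")
print(f"R^2  = {r2(y_true, y_pred):.4f}")
```

Lower MAPE, RMSE, and MAE indicate a better forecast, whereas R² is better when closer to 1, which is why the text reports the first three as reductions and the last as an increase.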
In order to make the conclusions more general, a different split into training and testing sets was used to fit and test the prediction methods. Electricity load data from 13 February to 18 May were used as the training set, and load data from 19 May were used as the testing set.
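The two chronological splits used in this section can be sketched as below. The example assumes a complete hourly record with no missing readings and works on timestamps only; the helper names `hourly_timestamps` and `split_by_test_day` are hypothetical, and the load values themselves are omitted.

```python
from datetime import date, datetime, timedelta

def hourly_timestamps(first_day, last_day):
    """All hourly timestamps from 00:00 on first_day to 23:00 on last_day."""
    stamps = []
    current = datetime.combine(first_day, datetime.min.time())
    end = datetime.combine(last_day, datetime.min.time()) + timedelta(hours=23)
    while current <= end:
        stamps.append(current)
        current += timedelta(hours=1)
    return stamps

def split_by_test_day(stamps, test_day):
    """Train on every hour before test_day, test on the 24 hours of test_day."""
    train = [s for s in stamps if s.date() < test_day]
    test = [s for s in stamps if s.date() == test_day]
    return train, test

stamps = hourly_timestamps(date(2021, 2, 13), date(2021, 5, 20))

# First experiment: train on 13 February to 19 May, test on 20 May.
train_a, test_a = split_by_test_day(stamps, date(2021, 5, 20))
# Second experiment: train on 13 February to 18 May, test on 19 May.
train_b, test_b = split_by_test_day(stamps, date(2021, 5, 19))

print(len(stamps), len(train_a), len(test_a), len(train_b), len(test_b))
```

Splitting by date rather than at random keeps every test hour strictly after the training period, so the model is never evaluated on hours that precede its training data.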
The error rate curves obtained by the four prediction methods LSTM, WOA_LSTM, SCA_LSTM, and MetaREC_LSTM were compared at each time point, and the results are shown in Figure 13. MetaREC_LSTM achieves lower error rates than the other three forecasting methods at most points in time. Figure 14 shows the load forecasting curves of the different forecasting methods on 19 May; the forecasting curve of this paper's method fits the real load curve more closely than those of the other methods in most time periods. These data and analysis support the higher forecasting performance of MetaREC_LSTM.
The values of MAPE, RMSE, MAE, and R² for each forecasting method on 19 May (the testing set of this split) were compared, and the results are exhibited in Figure 15 and Table 7.
It can be seen that the MAPE, RMSE, and MAE obtained by MetaREC_LSTM are smaller than those of the other four methods, markedly so relative to BP and LSTM, while its R² is practically equal to those of WOA_LSTM and SCA_LSTM. To compare the assessment indicators of the five forecasting methods more precisely, Table 7 lists the exact values obtained. According to Table 7, the MAPE obtained by MetaREC_LSTM is 54%, 52%, 7%, and 5% lower than the results obtained by BP, LSTM, WOA_LSTM, and SCA_LSTM, respectively. The RMSE is 49%, 46%, 2.8%, and 3.2% lower than the results obtained by BP, LSTM, WOA_LSTM, and SCA_LSTM, respectively. The MAE is 55%, 55%, 12%, and 3.6% lower than the results obtained by BP, LSTM, WOA_LSTM, and SCA_LSTM, respectively. In addition, MetaREC_LSTM improves R² by 4.8%, 4.0%, 0.04%, and 0.09% over the results of BP, LSTM, WOA_LSTM, and SCA_LSTM, respectively. These results and analysis further support the forecasting capability of the proposed MetaREC_LSTM for electricity load forecasting.
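Percentage comparisons of this kind depend on which value is taken as the base. The sketch below assumes the usual relative-change definitions and uses the two maximum deviations quoted earlier in this section (4.79% for WOA_LSTM and 3.18% for MetaREC_LSTM) as worked values; the function names `pct_lower` and `pct_higher` are illustrative.

```python
def pct_lower(baseline, proposed):
    """How much lower `proposed` is than `baseline`, as a percentage of baseline."""
    return (baseline - proposed) / baseline * 100.0

def pct_higher(reference, other):
    """How much higher `other` is than `reference`, as a percentage of reference."""
    return (other - reference) / reference * 100.0

woa_max = 4.79      # maximum absolute deviation of WOA_LSTM (%), quoted in the text
metarec_max = 3.18  # maximum absolute deviation of MetaREC_LSTM (%), quoted in the text

# WOA_LSTM's maximum relative to MetaREC_LSTM's (base: MetaREC_LSTM).
woa_vs_metarec = pct_higher(metarec_max, woa_max)
# MetaREC_LSTM's maximum relative to WOA_LSTM's (base: WOA_LSTM).
metarec_vs_woa = pct_lower(woa_max, metarec_max)

print(f"WOA_LSTM maximum is {woa_vs_metarec:.1f}% higher than MetaREC_LSTM's")
print(f"MetaREC_LSTM maximum is {metarec_vs_woa:.1f}% lower than WOA_LSTM's")
```

The first figure reproduces the 50.6% stated for the maximum error, while the second shows that the same pair of values corresponds to a reduction of roughly one third when WOA_LSTM is taken as the base.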

Conclusions
Short-term power load forecasting is an important part of grid management and the foundation on which power dispatch centers develop generation plans, and it therefore plays a significant role in the efficient operation of power systems. To improve short-term power load forecasting accuracy, a hybrid forecasting framework (MetaREC_LSTM) was proposed. A logistic chaos operator and a multilevel modulation factor are used to improve the SCA, and the improved SCA is then used to optimize the parameters of the LSTM. Finally, the MetaREC_LSTM framework is used to forecast the electric load, and its forecasting performance is compared with that of several other single and hybrid forecasting models. The experimental results and analysis indicate that the prediction model in this paper has higher forecast accuracy and stability. In future work, factors such as temperature, humidity, and holidays can be taken into account to further improve the accuracy of the prediction model.
Given the forecasting performance of MetaREC_LSTM and its ability to reduce the magnitude of the forecast error, power companies may consider applying it to their own short-term electric load forecasting in order to schedule their planned total power generation and thus improve their economic value. The prediction framework can also be applied to other prediction fields, such as wind prediction, traffic flow prediction, flu prediction, and pollution prediction of complex ecosystems.