Hybrid Model Based on an SD Selection, CEEMDAN, and Deep Learning for Short-Term Load Forecasting of an Electric Vehicle Fleet

Abstract: Forecasting the aggregate charging load of a fleet of electric vehicles (EVs) plays an important role in the energy management of the future power system. Therefore, accurate charging load forecasting is necessary for reliable and efficient power system operation. A hybrid method combining similar day (SD) selection, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and deep neural networks is proposed and explored in this paper. For the SD selection, an extreme gradient boosting (XGB)-based weighted k-means method is chosen and applied to evaluate the similarity between the prediction and historical days. The CEEMDAN algorithm, an advanced variant of empirical mode decomposition (EMD), is used to decompose the original data into intrinsic mode functions (IMFs) and residuals and to improve the noise reduction effect. Three popular deep neural networks that have been utilized for load prediction are gated recurrent units (GRUs), long short-term memory (LSTM), and bidirectional long short-term memory (BiLSTM). The developed models were assessed on a real-life charging load dataset collected from 1000 EVs in nine provinces in Canada from 2017 to 2019. The numerical results of six predictive combination models show that the proposed hybrid SD-CEEMDAN-BiLSTM model outperformed the single and other hybrid models, with the smallest forecasting mean absolute percentage error (MAPE) of 2.63% Canada-wide.


Introduction

Problem Statement
The global EV market is growing exponentially. The World Economic Forum reports that worldwide sales of EVs reached 6.6 million in 2021, almost double the previous year [1]. The International Energy Agency estimated that the global EV fleet could reach 250 million by 2030 [2]. The significant growth of the EV industry is bound to bring new challenges to power systems due to the large battery capacity and uncertain charging behaviors of EV users [3]. This would result in significant peak-valley load differences in featured time slots, particularly on a super-short-term time scale. Therefore, utilities and other power producers need to be prepared to meet the increased loads as transportation electrification grows, and to be able to forecast the required electricity with minimum error to maintain stable and effective power system operation.

Proposed Solutions
Over the past few decades, scientists have proposed various solutions to improve the accuracy of short-term load forecasting (STLF), including traditional methods, the SD selection, the EMD methods, artificial intelligence (AI), and hybrid forecasting models [4][5][6][7][8][9][10][11][12][13]. With energy grid diversification, an increasing number of factors impact the load demand, such as weather, holidays, and electricity prices [3]. Traditional load forecasting methods are unable to provide prediction models with sufficient accuracy [14]. These models are based on mathematical methods that often perform poorly when predicting non-linear problems [15].
The SD selection is based on historical days that have features comparable to those of the day to be forecasted. To predict short-term load, a weekday index and weather events similar to the forecasted day were used [16]. The SD selection was applied to assess the attribute weights using the XGB algorithm and to compute the distance between the chosen day and the forecasted day, with different measurement attributes carrying different weights [12]. The authors also used the k-means algorithm to merge SDs into one cluster based on the XGB distance and applied it as input data for the subsequent load prediction. However, researchers found that the SD method alone cannot sufficiently capture the complex electric load features, and it should be combined with other methods [12,[16][17][18].
AI and machine learning (ML) modeling techniques are broadly used by many electric utility companies to handle accurate load forecasting problems [19]. Although comprehensive research has been accomplished, accurate STLF remains a challenge due to non-stationary electric load data and long-term dependencies over the estimation horizon [12]. The LSTM model, a special type of recurrent neural network (RNN), was used to predict the aggregated demand-side electric load over short- and medium-term horizons [20]. The BiLSTM has been applied in several areas of study to provide accurate aggregated electric load prediction results [21][22][23]. The GRU model was used to forecast the short-term load of EV charging stations and the state of charge of batteries [3,24]. A comparative study of ML approaches using LSTM, BiLSTM, and GRU was evaluated for day-ahead charging of the EV fleet in Canada [25]. The results showed that the BiLSTM algorithm had the lowest error among the models used and was best suited for load prediction of the EV fleet. However, in view of the complexity and non-stationarity of the aggregate load of the EV fleet, it is difficult to obtain accurate results using neural networks alone.
The EMD method has been used by many scholars in a wide range of applications, including electric load, wind speed, solar radiation, and crude oil price, to improve prediction accuracy [5][6][7]12,26,27]. The EMD method can sufficiently extract the features of the original data, separating non-stationary and unstable time series into a series of frequency components [28]. There are many variants of EMD that can be applied to time series, including ensemble EMD (EEMD), complete EEMD (CEEMD), and CEEMD with adaptive noise (CEEMDAN). Compared with EMD and EEMD, the CEEMDAN method can perform a better spectral separation of the IMFs at a lower computational cost [29]. More recently, CEEMDAN was used to reconstruct the original input/output variables for electricity demand forecasting [30]. The authors found that the accuracy of the load model with the CEEMDAN method is greater than that without it.
Based on the above-mentioned solutions, a hybrid approach consisting of the SD selection, CEEMDAN, and deep neural networks is proposed to forecast the aggregated load of an EV fleet. The SD method is used to capture the features of the load using the XGB algorithm and to cluster them using the k-means method. The CEEMDAN technique is introduced to decompose historical data into a set of IMFs and a residue. Three ML methods, comprising LSTM, BiLSTM, and GRU, were applied to compare and select the best-suited method for predicting the charging load of the EV fleet.
The contribution of this paper is the application of the CEEMDAN and SD methods together with deep neural networks to the problem of STLF in EV fleet research, with:

• a unique heterogeneous fleet of 1000 EVs, grouped into long-range, mid-range, and short-range EVs in terms of their battery capacity;
• three-day-ahead predictions with high accuracy achieved on real data provided by FleetCarma Inc. (Waterloo, ON, Canada);
• an evaluation with various methods comprising the SD selection, the CEEMDAN method, and different RNN architectures.
The following section provides the theoretical background of our method. Section 3 describes data pre-processing and feature analysis. Section 4 describes the experimental results and validation. Section 5 presents future work. Section 6 concludes the paper.

Materials and Methods
This section describes the SD approach, the signal-processing algorithm called CEEMDAN, and the three RNN structures, comprising LSTM, BiLSTM, and GRU, investigated in this study.

SD Approach Using XGB and k-Means
This section estimates the weights of the features, i.e., vehicle groups, charging levels, charging locations, temperature, electricity rates, seasonal category, weather events, day type, etc. [25], using the XGB algorithm. Then, the weighted features are integrated using k-means clustering.
XGB is a decision tree-based ensemble ML algorithm that applies a gradient boosting framework. XGB is a supervised algorithm that assembles base learners into a strong learner. The prediction result of XGB is equal to the sum of all base learners.
The XGB process is as follows:

$$\hat{y}_i = \sum_{n=1}^{N} f_n(x_i), \quad f_n \in F,$$

where $f_n$ represents the $n$th decision tree, $N$ is the number of decision trees, and $F$ is the space that covers the set of decision trees. The objective function of XGB includes the loss function and the regularization term, which are defined as:

$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{n=1}^{N} \Omega(f_n), \quad \Omega(f) = \gamma T + \frac{1}{2}\beta \sum_{j=1}^{T} w_j^2,$$

where $l(y_i, \hat{y}_i)$ denotes the loss function, $y_i$ and $\hat{y}_i$ are the target and predicted values, $\Omega$ states the regularization term, $n$ is the number of targets $y_i$, $T$ specifies the number of leaf nodes in a decision tree, $w_j$ is the score of leaf node $j$, and $\gamma$ and $\beta$ act as penalty factors. Then, k-means clustering, an unsupervised learning algorithm, is used to partition $n$ observations into different clusters so that similar data points are grouped together and underlying patterns can be discovered.
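As a rough illustration of the SD selection step, the sketch below scales hypothetical day features by assumed importance weights (standing in for the XGB-derived feature weights) and clusters them with a plain k-means loop. The feature matrix, weights, and cluster count are illustrative placeholders, not the paper's data or exact procedure.

```python
import numpy as np

# Hypothetical feature matrix: rows = historical days, columns = features
# (e.g., temperature, day type, electricity rate). The weights below are
# assumed stand-ins for XGB feature importances (they sum to 1).
rng = np.random.default_rng(42)
X = rng.random((60, 3))
w = np.array([0.6, 0.3, 0.1])

# Scaling columns by sqrt(w) makes ordinary Euclidean distance equal to
# the weighted distance used for similarity measurement.
Xw = X * np.sqrt(w)

def kmeans(data, k, iters=50):
    """Plain k-means on (already weighted) features."""
    centers = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((data[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.stack([data[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

labels, centers = kmeans(Xw, k=4)

# Similar days for a prediction day = members of its nearest cluster.
day = rng.random(3) * np.sqrt(w)
similar_cluster = np.argmin(((centers - day) ** 2).sum(-1))
```

The members of `similar_cluster` would then serve as the SD input set for the downstream forecasting model.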

Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN)
CEEMDAN is an algorithm based on EMD [31] that can effectively solve the EMD mode-mixing and the EEMD residual noise problems. The decomposition process is as follows:

i. Add white noise $V^{i}(t)$ with a normal distribution to the original signal $S(t)$. The signal for the $i$th trial, where $i$ denotes the trial index, is represented as
$$S^{i}(t) = S(t) + \varepsilon_0 V^{i}(t), \quad i = 1, 2, \cdots, I.$$
ii. EMD decomposes each trial signal $S^{i}(t)$ to obtain its first-order component $IMF_1^{i}$, so the first mode and residual are
$$IMF_1(t) = \frac{1}{I}\sum_{i=1}^{I} IMF_1^{i}(t), \quad r_1(t) = S(t) - IMF_1(t).$$
iii. Add white noise $V^{i}(t)$ to the residual $r_1(t)$, execute the trial $I$ times $(i = 1, 2, \cdots, I)$, and in each trial use EMD to extract the first-order component, obtaining the second mode and residual in the same manner.
iv. The signal is then further decomposed by EMD to calculate the subsequent IMF modes and the related residues by repeating the above process. When the residual can no longer be decomposed by EMD, the procedure ends. The original signal can be represented as
$$S(t) = \sum_{m=1}^{M} IMF_m(t) + r_M(t).$$
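The core reason CEEMDAN averages over $I$ noisy trials is that the deliberately added white noise cancels in the ensemble mean while the underlying mode survives. The toy sketch below demonstrates only that averaging principle on a known sinusoid; it is not a full EMD sifting implementation, and the signal, noise level, and trial count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
s = np.sin(2 * np.pi * 5 * t)      # the "clean mode" we hope to recover

I = 200                            # number of noisy trials
# Each trial adds fresh white noise, as in step i of the decomposition.
trials = np.stack([s + 0.2 * rng.standard_normal(t.size) for _ in range(I)])

# Averaging over trials cancels the added noise (std shrinks as 1/sqrt(I)).
mode_estimate = trials.mean(axis=0)
residual_noise = np.abs(mode_estimate - s).max()
```

With 200 trials, the worst-case deviation of the averaged estimate from the clean mode drops well below the per-trial noise amplitude of 0.2.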


Long Short-Term Memory (LSTM) and Bidirectional LSTMs
LSTM is a specific RNN architecture used to prevent the gradient vanishing and exploding problems. LSTMs can learn long-term dependencies through a gate mechanism that controls the flow of information. These gates determine which data in the sequence are important to preserve or forget. There are three different gates in the LSTM cell: input, forget, and output (Figure 1).

The following equations give the LSTM cell states and parameters' updating scheme [12,20,32]:

$$f_t = \sigma(W_f \cdot [H_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i \cdot [H_{t-1}, x_t] + b_i)$$
$$\check{C}_t = \tanh(W_C \cdot [H_{t-1}, x_t] + b_C)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \check{C}_t$$
$$O_t = \sigma(W_O \cdot [H_{t-1}, x_t] + b_O)$$
$$H_t = O_t \odot \tanh(C_t)$$

where $i_t$, $f_t$, and $O_t$ represent the activation of the input, forget, and output gates at time step $t$, respectively, $H_{t-1}$ is the output at time step $t-1$, $x_t$ is the input at the present moment, and $C_{t-1}$ is the memory cell from the previous state. The forget gate ($f_t$) takes the information in $H_{t-1}$ and $x_t$ and outputs a vector with values ranging from 0 to 1 for the cell state $C_{t-1}$, where 1 means completely reserved and 0 means completely discarded. The input gate $i_t$ decides what new information is going to be stored in the cell state: the first part is the sigmoid layer, called the "input gate layer", which decides which values will be updated; the second part is the tanh layer, which creates a vector of new candidate values $\check{C}_t$ to be added to the state. The current block memory $C_t$ is created by accumulating the items from the previous block and the input gate. Finally, the output gate ($O_t$) determines the cell state output, where $W$ and $b$ are weights and biases, and $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent functions. The idea of BiLSTM is to combine input information from the past and future time steps in LSTM models. In BiLSTM, information can be preserved from both the past and future at any point in time [33].
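The gate updates above can be sketched as a single-step NumPy function. The random weight matrices and the sequence below are placeholders purely for demonstrating the update scheme, not trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell update. Each W[g] acts on the concatenated [h_prev; x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate: keep/discard old cell state
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate: admit new information
    c_hat = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat        # new memory cell
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # new hidden state (output)
    return h_t, c_t

rng = np.random.default_rng(1)
n_in, n_hid = 4, 8                          # illustrative sizes
W = {g: rng.standard_normal((n_hid, n_hid + n_in)) * 0.1 for g in "fico"}
b = {g: np.zeros(n_hid) for g in "fico"}

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((10, n_in)):   # run a short input sequence
    h, c = lstm_step(x, h, c, W, b)
```

Because the hidden state is an output-gated tanh of the cell, every component of `h` stays strictly inside (−1, 1); a BiLSTM simply runs one such recurrence forward and a second one backward and concatenates the two hidden states.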

Gated Recurrent Unit (GRU)
GRU is a newer generation of RNN architecture and is similar to LSTM. GRU does not have the cell state; instead, it uses a hidden state to transfer information. Therefore, GRU has only two gates: a reset and an update gate (Figure 2). The key equations for GRU are shown below [24]:

$$z_t = \delta_g(W_z x_t + U_z h_{t-1} + b_z)$$
$$r_t = \delta_g(W_r x_t + U_r h_{t-1} + b_r)$$
$$\hat{h}_t = \Phi_h(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)$$
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$$

where $x_t$, $h_t$, $\hat{h}_t$, $z_t$, and $r_t$ are the input, output, candidate activation, update gate, and reset gate, respectively. $W$, $U$, and $b$ are parameter matrices and vectors. $\delta_g$ and $\Phi_h$ are the sigmoid and hyperbolic tangent functions. GRUs have fewer parameters and thus may train faster than LSTMs.
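A single GRU step can be sketched the same way. Note that some presentations swap the roles of $z_t$ and $1 - z_t$ in the final interpolation; the sketch follows the convention written above, and the weights are random placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU update following the reset/update-gate equations above."""
    z_t = sigmoid(W["z"] @ x_t + U["z"] @ h_prev + b["z"])   # update gate
    r_t = sigmoid(W["r"] @ x_t + U["r"] @ h_prev + b["r"])   # reset gate
    # Candidate activation uses the reset-gated previous hidden state.
    h_hat = np.tanh(W["h"] @ x_t + U["h"] @ (r_t * h_prev) + b["h"])
    # Interpolate between the old state and the candidate.
    return (1.0 - z_t) * h_prev + z_t * h_hat

rng = np.random.default_rng(2)
n_in, n_hid = 4, 8                           # illustrative sizes
W = {g: rng.standard_normal((n_hid, n_in)) * 0.1 for g in "zrh"}
U = {g: rng.standard_normal((n_hid, n_hid)) * 0.1 for g in "zrh"}
b = {g: np.zeros(n_hid) for g in "zrh"}

h = np.zeros(n_hid)
for x in rng.standard_normal((10, n_in)):    # run a short input sequence
    h = gru_step(x, h, W, U, b)
```

Compared with the LSTM step, there is no separate cell state and one fewer gate, which is where the parameter saving comes from.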

The Proposed SD-CEEMDAN-BiLSTM Prediction Model
In the BiLSTM model, the rectified linear unit (ReLU) function is used as the activation function of the stacked fully connected layers, and the mean square error (MSE) is used as the loss function.
Appl. Sci. 2022, 12, 9288

$$MSE = \frac{1}{N}\sum_{n=1}^{N}(y_n - d_n)^2,$$

where $N$ is the total number of estimated days, $y_n$ is the estimated value, and $d_n$ is the actual value.
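These two building blocks, the ReLU activation and the MSE loss, can be written directly as a minimal NumPy sketch:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: elementwise max(0, x)."""
    return np.maximum(0.0, np.asarray(x, dtype=float))

def mse(y_est, y_act):
    """Mean square error between estimated and actual series."""
    y_est = np.asarray(y_est, dtype=float)
    y_act = np.asarray(y_act, dtype=float)
    return float(np.mean((y_est - y_act) ** 2))

a = relu([-3.0, 0.5, 2.0])     # → array([0. , 0.5, 2. ])
loss = mse([1.0, 2.0], [1.0, 4.0])  # → 2.0
```

ReLU keeps the dense head cheap to train, while MSE penalizes large deviations quadratically, which suits a regression target such as charging load.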
The proposed model generally comprises the following steps, which are shown in Figure 3.

I. The feature weight is calculated by the XGB method and merged with the k-means algorithm to establish the SD cluster.
II. The CEEMDAN method is utilized to decompose the charging load into several IMF sequences $C_i(t)$ $(i = 1, 2, \cdots, M)$ and one residue $R_M(t)$.
III. Each IMF and the residue item are normalized and used as the input to the BiLSTM model for training and obtaining the predicted values, respectively. The results of the test set predictions are $\hat{C}_i(t)$ $(i = 1, 2, \cdots, M)$ and $\hat{R}_M(t)$.
IV. Then, the final forecast results are obtained using the formula below:
$$\hat{S}(t) = \sum_{i=1}^{M}\hat{C}_i(t) + \hat{R}_M(t), \quad t = 1, 2, \cdots, L,$$
where $L$ is the test series length and $\hat{S}(t)$ is the final predictive series of the test set.
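Step IV can be sketched in miniature: each component (IMF or residue) is forecast separately and the component forecasts are summed into the final series. Here toy sinusoidal components and a naive persistence forecast stand in for the CEEMDAN IMFs and the per-component BiLSTM predictions.

```python
import numpy as np

t = np.arange(48)
# Toy stand-ins for the decomposition output: two oscillatory IMFs and a trend.
components = [np.sin(2 * np.pi * t / 24),
              0.3 * np.sin(2 * np.pi * t / 12),
              0.01 * t]
load = np.sum(components, axis=0)            # decomposition sums back to the load

horizon = 6
# Placeholder per-component "model": persistence of the last observed value.
forecasts = [np.repeat(c[-1], horizon) for c in components]

# Step IV: the final forecast is the sum of the component forecasts.
final_forecast = np.sum(forecasts, axis=0)
```

Because persistence carries each component's last value forward, the summed forecast here equals the last observed total load; with trained per-IMF models, each component would instead evolve on its own time scale before being recombined.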
The architecture of the BiLSTM was derived by testing different configurations of units in each layer and calculating the MAPE on the training and testing datasets. Table 1 shows the MAPE for different numbers of layers and units for one-day-ahead prediction. Increasing the capacity of the neural network, by increasing the number of layers and units, improves the error on the training and testing datasets. However, the proposed model already performs well on the training data with a 2-layer network with 30 units in layer 1 and 100 units in layer 2.

Data Description and Input Variable
A large dataset with over 727,000 charging load events was collected in nine provinces in Canada (Figure 4) by FleetCarma Inc., in partnership with 10 electric utility companies and the University of Waterloo, over a three-year period using FleetCarma data loggers connected via the OBD II port of 1000 EVs. The collected data include charging loads of thirty-five vehicle models for three charging locations and three charging levels. The data include charging load start and end times, energy consumption, energy loss, and the state of charge of the battery at the start and end times.
For the charging load prediction, two individual models were developed to process battery size and weather events into feature input parameters, and the full results have been published recently [25].

XGB Feature Importance
Several features described in Section 2.1 are used as inputs for the XGB algorithm to compute the feature importance with respect to the charging load of the EV fleet (Figure 5). The gain in feature importance of the individual predictors in the tree is then visualized. A higher value of the "feature gain score" metric, compared to another feature, implies that the feature is more important for generating a prediction. It can be seen from Figure 5 that the charging load is sensitive to the temperature variables. In addition, the SD charging load, which has the highest feature gain score, is an important feature for load prediction. This is consistent with the data analysis results.

Data Decomposition
The CEEMDAN method is utilized to decompose the aggregated charging load into low- and high-frequency components, corresponding to seasonal and daily variations, respectively. All graphs in Figure 6 are on the same scale, enabling the evaluation of the contribution of each extracted IMF on the daily, weekly, monthly, and seasonal scales (Figure 6b-e). It can be seen that increasing the range of data for decomposition from daily to seasonal decreases the amplitude of the fluctuation. This demonstrates the quality of the extracted IMFs and provides a valuable abstraction for data visualization. For load forecasting, the generated IMFs were divided into training and testing sets, with the ratio selected as 76/24 (16 months of training data and 5 months of testing data).

For time-interval processing, Anaconda, a distribution of the Python programming language, was used to integrate all the charging load event data based on their time stamps and to aggregate them into 24 hourly time stamps per day.

Evaluation Indicators
To evaluate the performance of the models and the prediction errors, three commonly used metrics were employed: root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The error indicators are defined as:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}, \quad MAE = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|, \quad MAPE = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|,$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $N$ is the total amount of data.
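The three indicators follow directly from their definitions; a minimal NumPy version, with illustrative actual/predicted values only:

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean squared error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    """Mean absolute error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y - y_hat)))

def mape(y, y_hat):
    """Mean absolute percentage error (requires nonzero actual values)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

# Illustrative hourly loads (kW): actual vs. predicted.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 380.0]
```

Note that MAPE is undefined when an actual value is zero, which is why aggregated (rather than per-vehicle) charging load is a natural target for it.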

Results
Six predictive combination models, including the SD selection, the CEEMDAN method, and three different RNN architectures (GRU, LSTM, and BiLSTM), were selected for comparative analysis and for verifying the predictive performance of the proposed model. The six predictive combination models in this research were implemented using Python (Anaconda).
Figures 7 and 8 illustrate the results of the aggregated charge power, in kilowatts (kW), during the three-day-ahead prediction period for all single and hybrid prediction models in spring and winter, respectively. The training period is from 20 August 2017 to 31 December 2018, and the three-day-ahead prediction periods are from 30 April 2019 to 3 May 2019 and from 26 January 2019 to 29 January 2019 for spring and winter, respectively. It can be seen that the peak charging load during the three-day-ahead period occurred around 7 p.m., and the real error, which is the difference between the actual and predicted load, is largest at that point (Figures 7 and 8). However, the proposed hybrid model that includes the SD selection and decomposition method outperformed the single (Figures 7b and 8b) and other hybrid models (Figures 7c and 8c) at the peak points. Although all six models show good predictive results on the three-day-ahead period, the SD-CEEMDAN-BiLSTM model achieves a comparatively better performance with the smallest forecasting MAPE of 2.63% Canada-wide. Similar improvements from hybrid methods have also been observed in reported experimental studies on electricity load, wind speed, solar radiation, and crude oil price prediction [5][6][7]12].
The aggregated charge power (kW) in spring and winter is compared, and the results show that the peak periods in spring are shorter than in winter, resulting in higher overall charge power in winter (Figures 7 and 8). This might be due to decreased battery efficiency, a higher charging load for cabin heating, and more charging load per season in cold winter weather. Comparisons between the forecasting curves of the single models in spring and winter (Figures 7b and 8b) show that the BiLSTM curves follow the actual data better than the other models. Comparisons between the prediction curves of the hybrid models in spring and winter (Figures 7c and 8c) show that the prediction curve of the proposed SD-CEEMDAN-BiLSTM model is closer to the original charge power curve than those of the other models.
Table 2 shows the error values of the six prediction models. The training period is from 20 August 2017 to 31 December 2018, and the prediction period is from 1 January 2019 to 30 May 2019. Among the six prediction models, the results obtained using SD-CEEMDAN-BiLSTM fit the original data best. The RMSE, MAPE, and MAE values are 14.5, 2.6, and 13.1, respectively, which are much smaller than those of the other prediction models. These data indicate that the SD-CEEMDAN-BiLSTM model has better stability and accuracy and can be well applied to EV load prediction. LSTM has the maximum MAPE value. Although all three single models and the other hybrid models followed the general trend of the raw data, their forecasting errors were high.

Future Work
Issues that could be addressed in future work:
1. Examine reinforcement learning approaches for dealing with real large datasets composed of time series.
2. Learn the individual EV user energy consumption in order to reduce peak power on an aggregate level.
3. Investigate the individual and cumulative impact of battery capacity and time-of-use rates on the charging behavior of EV users.
4. Apply smart charging strategies to minimize overall vehicle energy use costs.
5. Develop a methodology that facilitates decision making in real time, using models that can be applied to a wide range of vehicle models and/or groups.

Conclusions
This study proposes a hybrid model based on the SD selection, the CEEMDAN signal processing technique, and the BiLSTM network. The proposed approach was compared with LSTM, GRU, SD-BiLSTM, and CEEMDAN-BiLSTM models to assess its effectiveness for hourly EV charging load forecasting. To achieve the best performance, fine-tuning and proper hyper-parameters were investigated. The performance of these prediction models was evaluated in terms of MAE, RMSE, and MAPE. The main conclusions of this study can be summarized as follows:
1. The hybrid approach is feasible and reasonable for the three-day-ahead load forecasting of a real-life dataset that was collected from 1000 EVs in nine provinces in Canada from 2017 to 2019.
2. The SD algorithm applied to optimize the single models and the CEEMDAN technique applied to extract the various components could both improve the prediction performance of the single models.
3. Overall, the proposed SD-CEEMDAN-BiLSTM model is a competitive technique for enhancing the charging load prediction accuracy of the EV fleet.

Figure 1. The architecture of the LSTM cell. Numbered values correspond to the equation numbers below.

Figure 2. The architecture of the GRU cell. Numbered values correspond to the equation numbers below.

Figure 3. Hybrid forecast model based on SD-CEEMDAN-BiLSTM.

Figure 4. Number and distribution of EVs used in this project.

Figure 6. The original data sequence of the aggregated daily charging load (a) and the results of CEEMDAN: daily (b), weekly (c), monthly (d), and seasonal (e).

Figure 7. Predictions and actual aggregated charge power during the three-day-ahead prediction period in spring for (a) all prediction models, (b) single prediction models, (c) hybrid prediction models.

Tables 3 and 4 compare the MAPE values for all the models by month for the one-day-ahead and the three-day-ahead prediction. The training period is from 20 August 2017 to 31 December 2018, and the prediction period is from 1 January 2019 to 30 May 2019. The results indicate that the proposed model is significantly superior to the single models and the other hybrid models. The MAPE of the SD-CEEMDAN-BiLSTM model is the lowest among all the models. The average forecasting accuracy of the proposed model reaches 97.93% and 97.29% in the one-day-ahead and the three-day-ahead prediction, respectively.

Figure 8. Predictions and actual aggregated charge power during the three-day-ahead prediction period in winter for (a) all prediction models, (b) single prediction models, (c) hybrid prediction models.

Table 3 .
MAPE (%) of all the models per month for the one-day-ahead prediction.

Table 4 .
MAPE (%) of all the models per month for the three-day-ahead prediction.
Nomenclature:
Č_t: the current candidate state of the cell
C_{t−1}: the previous cell state
x_t, h_t: input and output
z_t: update gate
ĥ_t: candidate activation
r_t: reset gate
W, U, b: parameter matrices and vectors
δ_g, Φ_h: sigmoid and hyperbolic tangent functions
N: total predicted number of days
y_n: predictive value