Wind Speed and Solar Irradiance Prediction Using a Bidirectional Long Short-Term Memory Model Based on Neural Networks

Abstract: The rapid growth of wind and solar energy penetration has created critical issues, such as fluctuation, uncertainty, and intermittency, that influence power system stability, grid operation, and the balance of the power supply. Improving the reliability and accuracy of wind and solar energy predictions can enhance power system stability. This study aims to address the issues of wind and solar energy fluctuation and intermittency by proposing a high-quality prediction model based on neural networks (NNs). Recurrent neural networks (RNNs) are among the most effective technologies for analyzing the future performance of wind speed and solar irradiance. Bidirectional RNNs (BRNNs) have the advantage of processing information in two opposing directions through two separate hidden layers that feed the same output layer: a BRNN's output layer concurrently receives information from both the backward and the forward layers. A bidirectional long short-term memory (BI-LSTM) prediction model was designed to predict wind speed, solar irradiance, and ambient temperature for the next 169 h. The solar irradiance data include global horizontal irradiance (GHI), direct normal irradiance (DNI), and diffuse horizontal irradiance (DHI). The historical data collected from Dumat al-Jandal City cover the period from 1 January 1985 to 26 June 2021 at hourly intervals. The findings demonstrate that the BI-LSTM model performs well, with considerable accuracy for all five types of historical data, particularly for the wind speed and ambient temperature values. The model can handle different sizes of sequential data and generates low error metrics.


Introduction
In recent years, artificial intelligence (AI) technologies such as deep learning (DL) have become incredibly influential as a promising branch of machine learning [1], supported by several advantages over shallow models, including high generalization capability, big data processing, and both supervised and unsupervised feature learning algorithms [2]. Supervised learning algorithms are applied to an original dataset that is already labelled: the dataset has input and output variables, and the supervised learning process employs algorithms to learn the mapping function between them [3]. Supervised learning algorithms enable the model to relate a signal dataset to an activity class [3], while unsupervised learning algorithms can extract learned features from the original dataset [4] and reconstruct its patterns. These technologies are characterized by multiple layers of processing and large-scale hierarchical data representation [5]. A larger number of layers and greater computational complexity correspond to a more complex DL architecture. DL can address major issues in big data, including the extraction of complex patterns from large volumes, semantic indexing, data tagging, the fast retrieval of information, and the simplification of discriminative tasks. DL has several approaches, such as autoencoders (AEs) [6], deep belief networks (DBNs) [7], deep Boltzmann machines (DBMs) [8], convolutional neural networks (CNNs) [9], and recurrent neural networks (RNNs) [10]. DL provides comprehensive solutions for various applications in engineering research, such as energy prediction and power system monitoring. In optimal power system operation and planning, interval prediction of wind speed and solar energy is gaining vital importance. However, forecasting is an arduous process due to the fluctuation of wind speed and solar power. Therefore, numerous advanced DL-based approaches have been developed in previous studies to enhance prediction accuracy and to foster innovation in the field. Wang et al. [11] predicted photovoltaic (PV) power using the wavelet transform and a deep CNN; this technique decomposes the input signal into multiple frequency sequences. Piazza et al. [12] used a nonlinear autoregressive model with exogenous input, based on a neural network, to predict hourly intervals of solar irradiation and wind speed; the approach required external data, such as temperature or wind direction values, to provide the model with more details. Another important study was conducted by Wang et al. [13], who predicted short-term solar irradiance with an artificial neural network (ANN) model. Sözen et al. [14] employed an ANN model to predict solar potential energy in Turkey. However, these types of models can be improved by increasing the number of hidden layers. Altan et al. [15] developed a long short-term memory (LSTM) neural network with a decomposition technique and a grey wolf optimizer to predict short-term wind speed. Moreover, combining multiple techniques into a hybrid prediction model can improve performance and prediction accuracy. Hu et al.
[16] created a nonlinear hybrid model based on LSTM, a differential evolution algorithm, a nonlinear hybrid mechanism, and a hysteretic extreme learning machine to enhance wind speed prediction accuracy; the differential evolution algorithm is used to tune the model, although it is difficult to balance. Alli et al. [17] used a time series model based on an LSTM neural network to predict solar radiation, wind speed, precipitation, relative humidity, and temperature values; the model is able to cope with different types of weather data. Liu and Lin [18] conducted a study to predict daily load performance in the UK, during the COVID-19 pandemic restrictions and lockdown, using a multivariate time series forecasting model based on a BI-LSTM neural network. The prediction model considered solar and wind power, including wind speed, biomass, and temperature values; however, it recorded high root mean square error (RMSE) values with obvious overfitting. K.U. and Kovoor [19] proposed a wind speed prediction model based on ensemble empirical mode decomposition and a BI-LSTM neural network; the model improved accuracy owing to its data denoising and decomposition characteristics. Zhen et al. [20] developed a short-term prediction model using BI-LSTM and a genetic algorithm to estimate PV power; the model is capable of capturing the connections between several PV output series. In addition to the abovementioned studies, numerous other works present promising ideas in the field of energy prediction.
In this study, a prediction model was designed based on bidirectional long short-term memory (BI-LSTM). The model estimates the future performance of solar irradiance, including global horizontal irradiance (GHI), direct normal irradiance (DNI), and diffuse horizontal irradiance (DHI), together with wind speed and ambient temperature values, based on time series. The model predicts future values one week (169 h) ahead at hourly intervals. The aim of the study is to address the issues of wind and solar energy fluctuation and intermittency, whose implications for the stability of conventional power systems are shaped by wind and solar energy factors, along with the ambient temperatures. Analyzing large quantities of historical data for a specific region can help in understanding the nature of the variability of wind and solar energy in that region. The paper is organized as follows: Section 2 contains an overview of deep learning (DL) approaches. Section 3 presents the materials and methods. Section 4 consists of the results and discussion. Finally, Section 5 outlines the conclusions.

Overview of Deep Learning (DL) Approaches
Artificial neural networks (ANNs) are built from hundreds of simple interconnected processing elements, called neurons, linked by coefficients (weights, the adjustable parameters) that define the neural structure and are organized in layers [21]. ANNs are designed to simulate specific biological neural processes and act similarly to the human brain [22] in order to solve complicated problems. The basic principle of an ANN is to infer and extract knowledge by detecting patterns and analyzing the relationships between the data, learning from experience rather than from explicit programming [21,23], through a series of processes that evaluate the available inputs. Each neuron receives input signals, as aggregated data from other neurons or as external stimuli from outside the network, which are analyzed and processed locally through a transfer function to generate an output signal that is conveyed to other neurons or emitted as an external output [22]. Each processing element therefore computes a weighted sum of its inputs, the neuron activation, and passes it through a transfer function to produce an output [21]. The single output of the neuron is produced by multiplying the input signals by the connection weights, combining them, and then passing the activation signal through a transfer function (see Figure 1). The transfer function can incorporate time-lagged observations and predictor variables into an ANN model [22]. The transfer function specifies the relationship between a node and a neural network's inputs and outputs, and it introduces the nonlinearity that is necessary for most ANN applications [22]. The transfer functions of the neurons are the basic elements that shape the behavior of the neural network, which depends on its learning rules and architecture [21]. The neurons are thus the basic components of an ANN and are considered the neural network's power, simulating the function of biological human neurons, as shown in Figure 1 [21]. Effective ANN-based prediction approaches are summarized in the next subsections.
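To make the computation in Figure 1 concrete, the following minimal Python sketch implements a single neuron as a weighted sum passed through a transfer function (the use of NumPy and of tanh as the transfer function are illustrative assumptions; the text does not fix either choice):

```python
import numpy as np

def neuron_output(inputs: np.ndarray, weights: np.ndarray, bias: float) -> float:
    """Single artificial neuron: weighted sum of inputs passed through a transfer function."""
    activation = np.dot(weights, inputs) + bias   # weighted sum (the neuron activation)
    return np.tanh(activation)                    # nonlinear transfer function (tanh here)

# Example: three input signals combined into one output signal.
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.1, -0.6])
print(neuron_output(x, w, bias=0.2))
```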

Autoencoder (AE)
The AE is an unsupervised learning [23] feed-forward neural network approach that is trained to replicate its inputs at its outputs via the hidden layers [24,25]. AEs can be stacked to create a deep and complex structure that forms a multilayer neural network; a stacked autoencoder produces fewer reconstruction errors than shallow models [26]. The AE consists of four major components: the encoder, the decoder, the reconstruction loss, and the bottleneck [23]. The AE has two advantages: (i) applying latent representations of features can improve model efficiency, and (ii) the reduction in dimensionality decreases training time [27].
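As a minimal illustration of these four components, the sketch below builds a one-layer autoencoder in Keras (the library choice and the layer sizes are assumptions for illustration; the paper does not specify an implementation):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Encoder compresses the input to a low-dimensional bottleneck;
# the decoder reconstructs the input from it.
input_dim, bottleneck_dim = 64, 8          # hypothetical sizes
inputs = keras.Input(shape=(input_dim,))
encoded = layers.Dense(bottleneck_dim, activation="relu")(inputs)  # encoder + bottleneck
decoded = layers.Dense(input_dim, activation="linear")(encoded)    # decoder
autoencoder = keras.Model(inputs, decoded)

# Reconstruction loss: the network is trained to replicate its inputs at its outputs.
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=50, batch_size=32)  # note: targets are the inputs themselves
```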

The Restricted Boltzmann Machine (RBM)
The RBM is commonly used as a deep probabilistic model and represents an undirected probabilistic approach with two basic layers: the Boolean visible neuron layer and the binary-valued hidden layer [2,25]. The first layer consists of visible inputs (v), and the second layer consists of hidden variables (h). Figure 2 represents the configuration of an RBM, where W stands for the weights, and a and b stand for the biases. The layers of the RBM can be stacked one on top of the other to make it deeper [25]. The RBM is used to learn the probability distribution over its data input space, demonstrating the desirable potential in its configuration [28]. The distribution is learned by minimizing an energy function defined over the network parameters, by analogy with thermodynamic systems [2].
The inference procedure defines the probabilities that drive data reconstruction through both the visible and the hidden layers.
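A hedged sketch of this learning procedure is shown below, using scikit-learn's BernoulliRBM (the library choice is an assumption for illustration); transform returns the hidden-unit probabilities given the visible layer:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Binary-valued training data: rows are visible-unit configurations v.
X = np.random.randint(0, 2, size=(200, 16)).astype(float)

# 16 visible units, 8 hidden units; the weights W and biases a, b are learned internally.
rbm = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20, random_state=0)
rbm.fit(X)

# Hidden-unit activation probabilities given the visible layer, P(h = 1 | v).
H = rbm.transform(X)
print(H.shape)   # (200, 8)
```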

Deep Belief Network (DBN)
The DBN is a type of neural network that consists of multiple hidden layer units [30]. The DBN acts as a stack of RBMs, composed of several hidden layers, and employs the backpropagation algorithm for training [25,31]. Each DBN layer contains an RBM; however, each RBM is used with one hidden layer only. In the DBN architecture, there are no interconnections between the units within a layer [7]; the connections link each unit in a layer to the units of the next layer [25]. Figure 3 illustrates the configuration of a DBN with four layers: three hidden layers and one visible layer, with the connections between the top two layers remaining undirected, while all of the lower connections are directed toward the data layer. The DBN can be trained with unsupervised learning to extract discriminant features [7] by contrastive divergence, and the DBN algorithm can decrease the dimensionality of the input dataset [32]. Two steps must be carried out to train a DBN for regression: first, the DBN is trained by contrastive divergence in an unsupervised manner, which yields a reduced set of features from the data [32,33]; second, an ANN is appended as a single layer of fully linked neurons to the pretrained architecture [32,33]. To perform forecasting, the newly attached layers must be trained for the desired target [32].
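The two-step training procedure described above can be sketched as follows, with scikit-learn RBMs standing in for the greedy unsupervised pretraining and a linear output layer standing in for the appended supervised layer (both are illustrative assumptions; a full DBN would also fine-tune the whole stack by backpropagation):

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LinearRegression

X = np.random.rand(500, 32)   # hypothetical input features scaled to [0, 1]
y = np.random.rand(500)       # hypothetical regression target

# Step 1: greedy, unsupervised layer-wise pretraining of stacked RBMs.
rbm1 = BernoulliRBM(n_components=16, n_iter=20, random_state=0).fit(X)
h1 = rbm1.transform(X)
rbm2 = BernoulliRBM(n_components=8, n_iter=20, random_state=0).fit(h1)
h2 = rbm2.transform(h1)

# Step 2: append a supervised layer trained on the pretrained features
# (a linear output layer stands in for the single fully linked ANN layer).
head = LinearRegression().fit(h2, y)
print(head.predict(h2[:5]))
```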

Deep Boltzmann Machine (DBM)
The DBM is a type of deep-structure neural network with a design similar to that of the RBM. The DBM has more hidden layers and variables than the RBM [25], and the DBM's hidden units are organized into a layer hierarchy rather than a single layer [8]. As in the RBM, connectivity is restricted: there is complete connectivity between subsequent layers, but no connections within a layer or between non-neighboring layers [8]. A DBM consists of undirected connections between the variables of all layers, including the visible layer and multiple hidden layers. Each layer can capture higher-order correlations between the hidden features of the layer below it (see Figure 4) [8]. Along with a bottom-up pass, the approximate inference method can integrate top-down feedback, which allows the DBM to propagate ambiguity and handle uncertain information more robustly. The DBM can learn more complicated internal representations, which makes it a promising method for difficult problems [8]. The DBM is trained as a joint model and represents a completely undirected graphical model, while the DBN mixes directed and undirected connections and is trained layer by layer [8]. In terms of computation, DBM training is costlier than DBN training [34].

Convolutional Neural Network (CNN)
The CNN is a feed-forward neural network that requires fewer parameters to learn than a fully connected network [23,25]. The CNN model consists of three main layer types, namely, fully connected layers, pooling layers, and convolutional layers (see Figure 5). Usually, the data input applied to a CNN is 2D [5]. The convolutional layers contain feature maps and multiple filters as parallel layers, which are the neuron layers with weighted inputs that produce the output values [35,36]. The pooling layer is used to minimize overfitting, generalize the feature representations, and subsample the feature maps of the previous layers [36]. The fully connected layer is applied at the end of the network for prediction applications, acting as the detector stage with rectified linear activations. CNN models are commonly used in several applications, such as image and audio processing, video recognition, speech recognition, and natural language processing.
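A minimal Keras sketch of the three layer types, under the assumption of a 2D input as described above, is given below (the input shape, filter counts, and library choice are hypothetical):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Convolutional layer (parallel filters producing feature maps),
# pooling layer (down-samples the maps), and a fully connected head.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),                        # hypothetical 2D input
    layers.Conv2D(16, kernel_size=3, activation="relu"),   # feature maps
    layers.MaxPooling2D(pool_size=2),                      # subsampling / generalization
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),                # fully connected output stage
])
model.summary()
```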

Materials and Methods
This section discusses the main approaches that we used to develop the prediction model, based on an RNN, LSTM, and BI-LSTM, to predict the future performance of solar irradiance (GHI, DNI, and DHI), as well as the wind speed and ambient temperature values. In addition, the section presents the historical solar irradiance, wind speed, and ambient temperature data. Furthermore, the prediction model configurations and error metrics are discussed.

Recurrent Neural Network (RNN)
An RNN is a neural network designed to process sequence data, handling each element of the sequence iteratively [37]. RNNs contain an internal memory, which is used to update the neuron states in the network based on the preceding inputs, and they employ backpropagation through time for training [25]. RNNs basically provide one-way information transfer from the inputs to the hidden units, combined with one-way information transfer from the previous temporal unit [38]. Moreover, the RNN contains a directional loop that can learn and utilize previous data for the current output, an important difference from standard feed-forward neural networks. RNNs gain unique advantages from the increased number of stacked layers in their architecture [39]. The vanishing gradient problem can be a disadvantage of RNNs in some cases [40]. In addition, several intermediate steps are needed by RNN-based approaches, which does not support training and configuration in an end-to-end manner [41]. However, because of the parameter sharing of RNNs, the algorithm is able to learn uncertainties reflected in previous time steps, and the RNN obtains much deeper learning over time. RNNs map an input sequence $x$ to a corresponding output sequence $y$, as shown in Equations (1)-(3), with the learning process carried out at each time step from $t = 1$ to $t = \tau$ [42]:

$$a^{(t)} = b + W h^{(t-1)} + U x^{(t)} \quad (1)$$

$$h^{(t)} = \tanh\big(a^{(t)}\big) \quad (2)$$

$$\hat{y}^{(t)} = c + V h^{(t)} \quad (3)$$

The RNN neuron parameters at each layer update their shared states at each time step $t$. Here, $x^{(t)}$ indicates the data input, $h^{(t)}$ is the shared state of the layer, $\hat{y}^{(t)}$ refers to the corresponding prediction, $a^{(t)}$ is the pre-activation input of the layer, and $b$ indicates the bias.
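A direct NumPy transcription of Equations (1)-(3), with hypothetical layer sizes, may clarify the recurrence (this is an illustrative sketch, not the paper's implementation):

```python
import numpy as np

def rnn_forward(x_seq, W, U, V, b, c):
    """Vanilla RNN over a sequence, following Equations (1)-(3)."""
    h = np.zeros(W.shape[0])              # initial shared state h^(0)
    outputs = []
    for x_t in x_seq:                     # t = 1 ... tau
        a_t = b + W @ h + U @ x_t         # Equation (1): pre-activation
        h = np.tanh(a_t)                  # Equation (2): updated shared state
        outputs.append(c + V @ h)         # Equation (3): prediction y_hat^(t)
    return np.array(outputs)

# Hypothetical sizes: 3 input features, 5 hidden units, 1 output.
rng = np.random.default_rng(0)
W, U, V = rng.normal(size=(5, 5)), rng.normal(size=(5, 3)), rng.normal(size=(1, 5))
b, c = np.zeros(5), np.zeros(1)
print(rnn_forward(rng.normal(size=(10, 3)), W, U, V, b, c).shape)  # (10, 1)
```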

Long Short-Term Memory (LSTM)
LSTM is a complex computing unit that can produce strong results in a range of sequence modeling tasks because of its exceptional capacity to retain sequence information over time. LSTM addresses the exploding and vanishing gradient phenomena that arise in the training of RNNs [43]. The remarkable findings from the implementation of LSTM in numerous fields indicate that it is capable of capturing data variance patterns and of modeling the dependencies and relationships in time series data. This type of neural network was created to overcome the limitations of RNNs in learning long-term dependencies [44]. The memory cells incorporated into the LSTM architecture to store data are its strongest feature for identifying and manipulating long-range context [45,46]. Each LSTM block comprises three main gates, an input gate, a forget gate, and an output gate, together with a memory cell [47]. These gates, with their sigmoid activation functions, regulate the changes in the cell state. Figure 6 shows the LSTM cell configuration, with the key components of the LSTM, such as the element-wise addition and multiplication symbols, where the multiplication symbol corresponds to element-wise multiplication, and "con" represents vector concatenation [48].
The input gate ($i_t$) specifies the magnitude of the values flowing into the cell and stored in memory. The forget gate ($f_t$) specifies the degree to which values remain in the cell and removes the data that are no longer required from the memory. The output gate ($o_t$) triggers the LSTM's output activation and defines which information is used as the output. In addition, the input node ($g_t$) is a vector of cell activation. Equations (4)-(8) represent the mathematical behavior of the LSTM, where $h_t$ represents the hidden variable and the logistic sigmoid is denoted by $\sigma$:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \quad (4)$$

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \quad (5)$$

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \quad (6)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_g x_t + U_g h_{t-1} + b_g) \quad (7)$$

$$h_t = o_t \odot \tanh(c_t) \quad (8)$$
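The gate equations translate directly into code; the following NumPy sketch of a single LSTM step mirrors Equations (4)-(8), with hypothetical weight shapes (illustrative only):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM cell update following Equations (4)-(8); p holds the weights."""
    i = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])   # input gate, Eq. (4)
    f = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])   # forget gate, Eq. (5)
    o = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])   # output gate, Eq. (6)
    g = np.tanh(p["Wg"] @ x_t + p["Ug"] @ h_prev + p["bg"])   # input node
    c = f * c_prev + i * g                                    # cell state, Eq. (7)
    h = o * np.tanh(c)                                        # hidden state, Eq. (8)
    return h, c

# Hypothetical sizes: 3 inputs, 4 hidden units.
rng = np.random.default_rng(1)
p = {k: rng.normal(size=(4, 3)) for k in ("Wi", "Wf", "Wo", "Wg")}
p.update({k: rng.normal(size=(4, 4)) for k in ("Ui", "Uf", "Uo", "Ug")})
p.update({k: np.zeros(4) for k in ("bi", "bf", "bo", "bg")})
h, c = lstm_step(rng.normal(size=3), np.zeros(4), np.zeros(4), p)
```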

Bidirectional Long Short-Term Memory (BI-LSTM)
BI-LSTM was developed from the LSTM approach [49] to improve classification performance. BI-LSTM is created by combining bidirectional RNNs with LSTM cells (see Figure 7), which can access a long-range context in both directions [50]. BI-LSTM networks outperform unidirectional networks, such as LSTM, and they are significantly faster and more accurate than both conventional RNNs and time-windowed multilayer perceptrons (MLPs). Moreover, BI-LSTM captures the sequential information of all the stages before and after each step in a given sequence [50]. BI-LSTM employs the LSTM algorithm to compute the hidden layers. Unlike LSTM, BI-LSTM is capable of processing data in two different directions by utilizing two hidden layers and forwarding the results to the same output layer [51,52]. The forward hidden layer $\vec{h}_t$ and the backward hidden layer $\overleftarrow{h}_t$ are the main components of BI-LSTM, presented in Equations (9) and (10), respectively; Equation (11) combines the forward and backward hidden layers:

$$\vec{h}_t = \mathcal{H}\big(W_{x\vec{h}} x_t + W_{\vec{h}\vec{h}} \vec{h}_{t-1} + b_{\vec{h}}\big) \quad (9)$$

$$\overleftarrow{h}_t = \mathcal{H}\big(W_{x\overleftarrow{h}} x_t + W_{\overleftarrow{h}\overleftarrow{h}} \overleftarrow{h}_{t+1} + b_{\overleftarrow{h}}\big) \quad (10)$$

$$y_t = W_{\vec{h}y} \vec{h}_t + W_{\overleftarrow{h}y} \overleftarrow{h}_t + b_y \quad (11)$$

Backpropagation through time is employed to learn the parameters from the training data, using the output sequence $y^{(t)}$ to iterate over the backward layers based on the error function. The hidden-layer function $\mathcal{H}$ is applied through $N$ stacked layers, and the hidden vector series $h^n_t$ is computed sequentially from $n = 1$ to $N$ and $t = 1, \dots, T$, as indicated in Equation (12); the network's ultimate output $y_t$ is given by Equation (13):

$$h^n_t = \mathcal{H}\big(W_{h^{n-1} h^n} h^{n-1}_t + W_{h^n h^n} h^n_{t-1} + b^n_h\big) \quad (12)$$

$$y_t = W_{h^N y} h^N_t + b_y \quad (13)$$
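Since the paper does not publish its implementation, the following is a minimal sketch of a BI-LSTM regressor in Keras; the look-back window, layer width, and library are all assumptions for illustration:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Two hidden LSTM layers process the sequence forward and backward;
# Bidirectional concatenates both and feeds them to the same output layer.
window, n_features = 24, 1          # hypothetical look-back window
model = keras.Sequential([
    keras.Input(shape=(window, n_features)),
    layers.Bidirectional(layers.LSTM(64)),   # forward and backward hidden layers
    layers.Dense(1),                         # next-step prediction
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```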

Study Area and Historical Data Collection
In this study, Dumat al-Jandal City in Al-Jouf Province was selected as a case study. Dumat al-Jandal is situated at 29°48′41″ N latitude and 39°52′06″ E longitude, in northwest Saudi Arabia. In 2019, a 400 MW onshore wind farm was constructed in Dumat al-Jandal City, the first and largest regional utility-scale wind power project in Saudi Arabia. The historical data were collected from the Meteoblue weather service's portal [54] for Dumat al-Jandal City and consist of the GHI, DNI, DHI, wind speed (m/s) at 80 m height, and ambient temperature (°C) at 2 m height, as presented in Figures 8-16. The historical data cover over 36 years at hourly intervals, for the period from 1 January 1985 to 26 June 2021, and were used to evaluate and predict the behavior of the GHI, DNI, DHI, wind speed, and ambient temperature for the next 169 h. The geological conditions have changed and varied over the previous 36 years, and this large quantity of historical data provides details that the BI-LSTM neural network prediction model can accurately reflect. The data were analyzed, verified, and cleaned to ensure that no values were missing or duplicated. Moreover, the augmented Dickey-Fuller test confirmed that all the data were stationary: the p-values were less than 0.05, so the null hypothesis of a unit root was rejected. Furthermore, the quality of the historical data was a key element, allowing the BI-LSTM model to extract the essential components and produce accurate results.
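The stationarity check described here corresponds to the augmented Dickey-Fuller test, available in statsmodels; a hedged sketch follows, where the file name and column are hypothetical placeholders:

```python
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Hypothetical file and column names for one of the hourly series.
series = pd.read_csv("dumat_al_jandal_hourly.csv")["wind_speed"]
stat, p_value, *_ = adfuller(series.dropna())
print(f"ADF statistic: {stat:.3f}, p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Reject the unit-root null hypothesis: the series is stationary.")
```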

Experimental Setup and Model Configuration
The prediction model was built in Python, which can handle enormous datasets and includes a wide range of prebuilt libraries. The datasets were prepared and cleaned for processing. A BI-LSTM model was then developed, based on the computational approaches of Sections 3.1-3.3, using various numbers of layers and model parameters. The number of layers, the neuron settings, and the training, test, and validation split sizes are presented in Table 1. Each type of data required different parameter settings to optimize the BI-LSTM fit, owing to the dataset's size and nature. In configuring the network, the two most essential BI-LSTM model parameters are the number of inputs and the number of neurons in the hidden layer. The error metrics remain nearly constant, with very minor fluctuations, when the number of neurons is between 150 and 300, a consequence of the randomization of the input weights. Figure 17 shows the flowchart of the proposed BI-LSTM model.
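A sketch of the data preparation implied by this setup is given below: the hourly series is windowed into input/target pairs and split chronologically into training, validation, and test sets (the 24 h window and the 80/10/10 ratios are illustrative assumptions; the actual split sizes are those of Table 1):

```python
import numpy as np

def make_windows(series: np.ndarray, window: int):
    """Turn an hourly series into (input window, next value) training pairs."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., None], np.array(y)   # add a feature axis for the LSTM

series = np.sin(np.linspace(0, 100, 5000))       # stand-in for one cleaned hourly series
X, y = make_windows(series, window=24)           # hypothetical 24 h look-back

# Chronological train/validation/test split (ratios are illustrative only).
n = len(X)
i_train, i_val = int(0.8 * n), int(0.9 * n)
X_train, y_train = X[:i_train], y[:i_train]
X_val, y_val = X[i_train:i_val], y[i_train:i_val]
X_test, y_test = X[i_val:], y[i_val:]
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=50)
```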

The Error Indices
Several indices were specified as popular error metrics to enable a rigorous evaluation of the prediction performance of the proposed model. The modeling error indicators utilized to examine the accuracy of the BI-LSTM model include the mean square error (MSE), the root mean square error (RMSE), the mean absolute percentage error (MAPE), and the mean absolute error (MAE), as presented mathematically in Equations (14)-(18), where $y_i$ represents the real value, $\hat{y}_i$ indicates the forecast observation, and $n$ denotes the number of samples used for the investigation:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \quad (14)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \quad (15)$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \quad (16)$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| \quad (17)$$

$$R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}} \quad (18)$$

The RMSE is the standard deviation of the residuals (prediction errors); it is the square root of the mean square error and is considered an effective general-purpose error index for numerical forecasts. The best model obtains an MAE value around zero, which is the average error between the actual and predicted variables. The MSE simply averages the squared differences between the estimated and original values, and the MAPE expresses the error range as a percentage. In addition, R-squared ($R^2$) is a statistical index that refers to how much of a dependent variable's variation is explained by an independent variable; it is the square of the correlation. In Equation (18), $SS_{\mathrm{res}}$ is the sum of squared residuals, and $SS_{\mathrm{tot}}$ is the total sum of squares, which is proportional to the data variance.
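Equations (14)-(18) can be computed directly; the sketch below also masks zero actual values before the MAPE, reflecting the night-time zero-irradiance issue discussed in Section 4 (the masking rule is an assumption, as the paper does not state how it handled zeros):

```python
import numpy as np

def error_indices(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Error indices of Equations (14)-(18) for n forecast samples."""
    err = y_true - y_pred
    mse = np.mean(err ** 2)                                   # Equation (14)
    rmse = np.sqrt(mse)                                       # Equation (15)
    mask = y_true != 0                                        # MAPE is undefined at zeros
    mape = 100 * np.mean(np.abs(err[mask] / y_true[mask]))    # Equation (16)
    mae = np.mean(np.abs(err))                                # Equation (17)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot                                  # Equation (18)
    return {"MSE": mse, "RMSE": rmse, "MAPE": mape, "MAE": mae, "R2": r2}
```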

Results and Discussion
The developed BI-LSTM model was implemented to predict the future values of solar irradiance, wind speed, and ambient temperature for 169 h ahead, from 23:00 27 June 2021 to 23:00 3 July 2021, based on the historical dataset covering 36 years of activity in Dumat al-Jandal City. On the basis of the predicted results for solar irradiance, shown in Figures 18-22, the BI-LSTM model effectively handled the three different types of irradiance as a multimodal dataset. The GHI (Figure 18a), the DNI (Figure 19a), and the DHI (Figure 20a) did not show any major overfitting between the experimental and predicted results, which means the BI-LSTM model performs promisingly with this type of dataset. Furthermore, in terms of the accuracy indicators, the BI-LSTM model achieved notable error metrics, as presented in Table 2. The error metrics of the GHI were as follows: the RMSE was 7.8 W/m²; the MAE was 5 W/m²; the MSE was 61 W/m²; and the MAPE was 2.5%. The MSE of the GHI, 61 W/m², was a comparatively high value, with an R² of 99%. The RMSE of the DNI was reduced to 7 W/m², while the MAE increased to 5.8 W/m², due to two main factors: (i) the type of dataset, and (ii) the different parameter settings in Table 1. In addition, the MSE of the DNI was reduced to 53 W/m², while the MAPE value grew rapidly to 45%, with an R² of 99%. The MAPE values were affected by the zero values of solar irradiance during the night in the historical data. Moreover, the error metrics of the DHI recorded a sharp decrease, with an RMSE value of 1.7 W/m², an MAE of 1.4 W/m², and an MSE of 3 W/m², while the MAPE value was reduced to 30%, with an R² of 99%. The DHI's error metric values were notable compared to the GHI and DNI error values. Figures 18b, 19b and 20b illustrate the predicted future values of the GHI, DNI, and DHI for 169 h ahead. The performance comparison between the proposed BI-LSTM model and other external solar irradiance models (Table 3) also revealed that the proposed model's evaluation metrics were superior.

The BI-LSTM model showed remarkable outcomes with the short-term wind speed prediction values, as illustrated in Figure 21. Despite the massive quantity of training data, the BI-LSTM model did not demonstrate overfitting, as seen in Figure 21a. Furthermore, the wind speed values for the next 169 h, from 23:00 27 June 2021 to 23:00 3 July 2021, were forecast with substantial error metrics, as shown in Figure 21b. The wind speed's error indicators were as follows: the RMSE was 0.6 m/s; the MAE was 0.4 m/s; the MSE was 0.4 m/s; and the MAPE was 15%. All the error metrics, except the MAPE value, decreased for the historical wind speed dataset compared to the other types of historical datasets, because of the nature and type of the wind speed data. The R² was reduced to 93%; however, it still showed a notable and agreeable correlation for the variation of the wind speed values. Furthermore, as demonstrated in Figure 22a, the BI-LSTM performed admirably for both the training and test predicted values for ambient temperature. The evaluation metrics of the predicted values for ambient temperature were as follows: the RMSE was 1 °C; the MAE was 0.9 °C; and the MSE was 1 °C. In addition, compared to the wind speed, the MAPE was reduced to 3%, and the R² increased to 98%, which reflects the high performance of the BI-LSTM model with this type of historical dataset. Nevertheless, all the error values of the BI-LSTM model across these different types of historical datasets were relatively small compared to other prediction models, demonstrating its flexibility with new historical data observations. Figure 22b presents the predicted values of the ambient temperature for the next 169 h.

The BI-LSTM prediction model was evaluated by implementing the five main types of historical datasets, namely, the three types of solar irradiance, the wind speed, and the ambient temperature, based on the parameter settings in Table 1. In addition, the proposed BI-LSTM model was compared with several external ANN prediction models to evaluate its performance, as presented in Tables 3-5. In comparison to the observational results and the ANN models, the BI-LSTM achieved significant gains in terms of accuracy and interpretability. The BI-LSTM prediction model showed admirable performance with all five types of historical datasets, especially for the wind speed and ambient temperature values. The epoch numbers and error indicators had a significant connection, implying that as the epoch numbers rose, the error values would decrease, but the training time would increase. Moreover, combining a large amount of historical data with a significant amount of training data extended the duration of the simulation periods even further. The subsequent steps rely on the preceding stages, so the BI-LSTM prediction model must be handled sequentially. Nevertheless, one of the most challenging problems was enhancing this model's generalization capability and achieving improved outcomes. Generalizability is defined as the variation in the recognition rate of the BI-LSTM model when comparing earlier observational datasets (training data) with a dataset that the model has not previously seen (testing data); a lack of generalization leads to the excessive overfitting of the training and testing datasets. The BI-LSTM model processes the dataset in two directions, with two hidden layers that feed the data forward to the same output layer. In addition, the BI-LSTM model employs the time series to enhance the generalization ability by taking into consideration the impacts of the features on the predicted values of the upcoming instant. The BI-LSTM model can be configured to predict the next 1 h, 24 h, or up to 4 weeks in the future, but increasing the future prediction interval raises the error metric indicators. Whenever a sufficient amount of historical data is chosen, the effectiveness of the BI-LSTM model becomes apparent, and the short-term predictive performance and accuracy are improved.

Conclusions
This study contributes to knowledge gaps in the short-term forecasting of renewable (wind and solar) energy by considering the impacts of solar irradiance and wind speed fluctuation, uncertainty, and intermittency, along with the ambient temperatures that affect power system stability. The scheduling and operation of power systems integrating wind and solar energy would be greatly influenced by improving the quality of wind and solar energy predictions. The proposed deep-learning BI-LSTM prediction model, based on an RNN time series, was trained and validated through a set of selected features, including the algorithms' neural connection strengths, gated units, and layers, which provide the best prediction sequence processes for the future performance of solar irradiance, wind speed, and ambient temperature values. The BI-LSTM prediction model was examined and evaluated by implementing five different types of historical datasets: three types of solar irradiance (GHI, DNI, and DHI), wind speed, and ambient temperature. A historical dataset (1 January 1985 to 26 June 2021) was collected from Dumat al-Jandal City in Saudi Arabia at hourly intervals to predict the next 169 h. The BI-LSTM prediction model proved its ability to handle different types of historical datasets.
The efficacy of the proposed model was evaluated in terms of the error indexes, including the RMSE, MAE, MSE, and MAPE values with R², as summarized in Table 2. Among the three types of solar irradiance (GHI, DNI, and DHI), the highest RMSE value was 7.8 W/m² and the smallest was 1.7 W/m², while the greatest MAE value was 5.8 W/m² and the lowest was 1.4 W/m². Moreover, the MSE recorded 61 W/m² as its highest value and 3 W/m² as its smallest. The MAPE values ranged between 2.5% and 45%. The R² value for all three types of solar irradiance was 99%. The proposed BI-LSTM model produced notable RMSE, MAE, MSE, and MAPE values of 0.6, 0.4, and 0.4 m/s and 15%, respectively, for the wind speed, and 1, 0.9, and 0.3 °C and 3%, respectively, for the ambient temperature. The R² values for the wind speed and ambient temperature were 93% and 98%, respectively. In addition, the results indicate that the proposed BI-LSTM model had a substantial edge over its competitors in terms of prediction accuracy, resistance to overfitting, minimization of redundancy, and training and testing execution time. These results were supported by the comparison of the proposed BI-LSTM model with the external ANN prediction models, as shown in Tables 3-5. Nevertheless, the BI-LSTM model can be upgraded, and the error indicators can be minimized, by improving the learning performance, optimizing the hidden layers, and modifying the epoch size or learning iterations.
Finally, future work in this field will focus on new short-term forecasting models for wind speed and solar irradiance, using historical data with various interval sizes and from different regions.

Figure 3. The configuration of the DBN consists of three main stacked RBMs with an output layer. Copyright © 2021 Elsevier [29].

Figure 5. The three main layers of a CNN with the softmax layer. Copyright © 2021 Elsevier [5].

Figure 8. Hourly-interval historical performance of the GHI over 36 years.

Figure 9. Hourly-interval historical performance of the DNI over 36 years.

Figure 10. Hourly-interval historical performance of the DHI over 36 years.

Figure 11. Hourly-interval historical performances of the GHI, DHI, and DNI.

Figure 12. Hourly-interval historical performances of the GHI, DHI, and DNI.

Figure 13. Hourly-interval historical performance of the wind speed over 36 years.

Figure 14. Hourly-interval historical performance of the wind speed.

Figure 15. Hourly-interval historical performance of the temperature.

Figure 16. Hourly-interval historical performance of the temperature.

Figure 17. The flowchart of the proposed BI-LSTM model.

Figure 18. (a) The real and predicted values of the GHI, showing the fit and performance of the BI-LSTM model; and (b) the GHI's predicted future values for the next 169 h.

Figure 19. (a) The real and predicted values of the DNI; and (b) the DNI's predicted future values for the next 169 h.

Figure 20. (a) The real and predicted values of the DHI; and (b) the DHI's predicted future values for the next 169 h.

Figure 21. (a) The real and predicted values of wind speed, showing the fit and performance of the BI-LSTM model; and (b) the wind speed's predicted future values for the next 169 h.

Figure 22. (a) The real and predicted values of ambient temperature; and (b) the ambient temperature's predicted future values for the next 169 h.

Table 1. Number of layers and the parameter settings of the BI-LSTM model.

Table 2. The forecasting accuracy indicators for the BI-LSTM model.

Table 3. Comparison of the proposed BI-LSTM model's performance with other solar irradiance models.