A Hybrid Deep Learning Approach for Crude Oil Price Prediction

Abstract: Crude oil is one of the world's most important commodities. Its price can affect the global economy, as well as the economies of importing and exporting countries. As a result, forecasting the price of crude oil is essential for investors. However, crude oil prices tend to fluctuate considerably during significant world events, such as the COVID-19 pandemic and geopolitical conflicts. In this paper, we propose a deep learning model for one-step- and multi-step-ahead crude oil price forecasting. The model extracts important features that impact crude oil prices and uses them to predict future prices. The prediction model combines convolutional neural networks (CNN) with long short-term memory networks (LSTM). We compared our one-step CNN-LSTM model with other LSTM models, the CNN model, support vector machine (SVM), and the autoregressive integrated moving average (ARIMA) model. We also compared our multi-step CNN-LSTM model with LSTM, CNN, and the time series encoder-decoder model. Extensive experiments were conducted using short-, medium-, and long-term price data of one, five, and ten years, respectively. In terms of accuracy, the proposed model outperformed existing models in both one-step and multi-step predictions.


Introduction
Forecasting the crude oil price is important for many stakeholders, such as governments, companies, and investors. Crude oil is one of the most influential commodities on the global stage, exerting a profound impact on economies, industries, and financial markets worldwide. Its price is subject to complex interactions of geopolitical events, supply and demand dynamics, economic fluctuations, and environmental factors. As such, the ability to anticipate changes in crude oil prices is pivotal for informed decision-making by governments, corporations, and investors. This is a challenging task because of crude oil's high volatility (Saltik et al. 2016), which makes prices susceptible to sudden fluctuations driven by multiple factors. Developing prediction models for crude oil prices has therefore been the focus of many researchers; the models include traditional econometric/statistical models and complex machine learning models (Jahanshahi et al. 2022).
In this paper, we focus on machine learning models. We propose a hybrid model that combines CNN and LSTM to forecast oil prices. It can make both one-step and multi-step oil price predictions: a one-step prediction forecasts the oil price for the next day, while a multi-step prediction forecasts the oil prices for the following week. Multi-step predictions are useful for identifying promising opportunities and minimizing potential risks. For governments, especially those that rely heavily on oil revenues, accurate price forecasts are imperative for fiscal planning; budgeting, taxation, and public expenditure allocation all depend on oil prices. Sound forecasting aids in managing deficits, stabilizing economies, and mitigating potential shocks.
For one-step predictions, the proposed model combines CNN and LSTM models. The CNN model is effective at extracting new features from time series data, while the LSTM model is suitable for modeling long sequences of dependencies. The combined CNN-LSTM model was tested on short-, medium-, and long-term datasets, and the results demonstrated the superiority of the proposed model over existing models. For multi-step predictions, we implemented two models and compared their results. The first was the vector output model, which is based on LSTM models making multi-step predictions; the second was an encoder-decoder model, which is also based on LSTM. We tested them on short-, medium-, and long-term datasets and found that the multi-step CNN-LSTM model is superior to the encoder-decoder LSTM model.
The paper makes three contributions. First, it proposes a hybrid one-step CNN-LSTM model. Second, it extends the one-step model and proposes a multi-step model. Third, it conducts comprehensive experiments to show the models' effectiveness; in particular, it compares the hybrid models with various machine learning and ARIMA models on short-, medium-, and long-term datasets.
The rest of the paper is organized as follows. Section 2 reviews the existing methods used for oil price prediction. Section 3 presents our hybrid deep learning model. Section 4 describes the datasets that we used, the evaluation metrics, and the results of our experiments. Finally, Section 5 summarizes our work, states the advantages and limitations of the proposed method, and discusses future work.

Literature Review
In the literature, researchers have used both statistical/econometric time series models and machine learning models to predict crude oil prices. A random walk is a process describing a path that consists of a series of random steps (Xia et al. 2020). Among statistical models, random walk-based methods have been adopted for oil price prediction (Panopoulou and Pantelidis 2015); their main drawback is that they oversimplify the complexity of financial markets (Smith 2023). Econometric time series models are quantitative models that use historical data to predict future prices. Among these, autoregressive integrated moving average (ARIMA)-based models have been used to predict oil prices (Yu et al. 2016). ARIMA models constitute a family of statistical models that offer a framework for understanding and predicting time series data. Their three key components are the autoregressive (AR) component, which captures the serial correlation of the time series; the integrated (I) component, which accounts for differencing to achieve stationarity; and the moving average (MA) component, which models short-term dependencies in the data. The limitation of ARIMA models is their limited capability to capture the nonlinearity of oil prices.
To overcome the shortcomings of econometric algorithms, various machine learning techniques have been suggested, such as support vector machines (SVM) (Fan et al. 2016) and artificial neural networks (ANN) (Hu 2021). SVM is based on small-sample statistical learning theory, which concerns the analysis of limited datasets within the framework of statistical learning principles and has applications in tasks such as pattern classification and nonlinear regression. The SVM algorithm seeks a nonlinear mapping from the input space to a feature space, where subsequent linear regression is performed (Guo et al. 2012).
Artificial neural networks are computational systems inspired by biological neural networks. Their primary objective is to generate an output pattern from a given input pattern. An ANN has an architecture characterized by a vast number of nodes (neurons) and connections, distributed in a parallel fashion (Lakshmanan and Ramasamy 2015). The primary advantage of these algorithms is their ability to handle nonlinearity, which makes them popular in forecasting tasks. The ANN technique is well suited to pattern recognition and has thus become the most popular technique in this field.
Deep learning techniques have been used extensively in economics and finance, owing to their capability to learn complex patterns in high-dimensional data. Currently, the most frequently used deep learning techniques are convolutional neural networks (CNN) and recurrent neural networks (RNN), including extensions such as long short-term memory (LSTM) and deep recursive neural networks (DRNN). Li et al. (2019) presented a novel method based on analyzing and text-mining online media using a CNN. Similarly, Wu et al. (2021) proposed a text-based, big-data-driven technique that employs a CNN model to automatically read crude oil news updates, processing more than 8000 news headlines. Chen and Huang (2021) devised a CNN to predict stock prices using gold and oil prices. Using an RNN, Wang and Wang (2016) forecasted crude oil indices. Cen and Wang (2019) proposed LSTM-based models to predict the fluctuating behavior of crude oil prices. Jahanshahi et al. (2022) employed LSTM and bidirectional LSTM (Bi-LSTM) models to predict crude oil prices affected by the Russia-Ukraine war and the COVID-19 pandemic; they tested the models on a dataset collected over 20 years and used seven features, including the crude oil opening, closing, intraday highest, and intraday lowest price values. Similarly, Daneshvar et al. (2022) explored LSTM and Bi-LSTM to predict Brent crude oil prices.

A Hybrid Deep Learning Model
Before describing our approach, we first provide background on the convolutional neural network (CNN) and long short-term memory (LSTM) architectures. We then briefly describe the vector output model and the encoder-decoder LSTM model.

Convolutional Neural Network
Convolutional neural networks were introduced by LeCun and Bengio (1998) in the context of computer vision. CNNs mimic the perception and learning processes of the human eye in many tasks, such as image processing, natural language processing, face recognition, classification problems, and recommendation systems. They can be very effective at automatically extracting and learning features from one-dimensional sequence data, such as univariate time series data. They are composed of several kinds of layers: the input layer, the convolutional layers, the pooling layers, the fully connected layers, and the output layer. The role of a convolutional layer is to apply a convolution operation to the data, which involves sliding a filter over the input; the size of the filter indicates its coverage, and each filter uses a shared set of weights to perform the convolution operation. Normally, the weights are updated during the training process. The output v^l_{i,j} for an input layer represented by an N × N matrix and a convolution filter represented by an F × F matrix is calculated by Equation (1):

v^l_{i,j} = δ( ∑_{k=1}^{F} ∑_{m=1}^{F} w_{k,m} v^{l−1}_{i+k−1, j+m−1} )  (1)

where v^l_{i,j} is the value at row i and column j in layer l, w_{k,m} is the weight at row k and column m of the filter, and δ is the activation function. The output of the filter is passed to an activation function of the next layer. Common nonlinear activation functions include the ReLU (Rectified Linear Unit) function, f(x) = max(0, x).
Figure 1 shows the calculation of v_{1,1} in a matrix of size E × E at layer l, where E = N − F + 1. The process performs the convolution of the input data matrix with a convolutional filter.
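As a concrete illustration, the valid (no-padding) convolution of Equation (1) with a ReLU activation can be sketched in NumPy; the input matrix and filter values below are arbitrary examples, not data from the paper.

```python
import numpy as np

def relu(x):
    # ReLU activation: f(x) = max(0, x)
    return np.maximum(0.0, x)

def conv2d_valid(x, w, activation=relu):
    """Valid (no-padding) convolution of an N x N input with an F x F filter,
    producing an E x E output with E = N - F + 1, as in Equation (1)."""
    n, f = x.shape[0], w.shape[0]
    e = n - f + 1
    out = np.zeros((e, e))
    for i in range(e):
        for j in range(e):
            # v_{i,j} = activation( sum_k sum_m w_{k,m} * x_{i+k, j+m} )
            out[i, j] = activation(np.sum(w * x[i:i + f, j:j + f]))
    return out

x = np.arange(16, dtype=float).reshape(4, 4)   # N = 4
w = np.ones((2, 2))                            # F = 2
v = conv2d_valid(x, w)
print(v.shape)  # (3, 3): E = N - F + 1
```

With an all-ones filter, each output cell is simply the sum of the 2 × 2 window it covers, which makes the sliding-window behavior easy to verify by hand.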
To avoid overfitting in a CNN, an additional pooling layer is added; deep models are more prone to overfitting than shallow models. Max pooling is the most common type of pooling, where the maximum value in a certain window is chosen.
The last step in a CNN is the fully connected layer, which is a multi-layer perceptron (MLP) network. This layer maps the features extracted in the previous layers to the final output, which is calculated by Equation (2):

v^j_i = δ( ∑_k w^{j−1}_{k,i} v^{j−1}_k )  (2)

where v^j_i is the value of neuron i at layer j, δ is the activation function, and w^{j−1}_{k,i} is the weight of the connection between neuron k in layer j − 1 and neuron i in layer j.

Long Short-Term Memory
LSTM is a special variant of RNN, first introduced by Hochreiter and Schmidhuber (1996). It is designed to model long sequence dependencies. Figure 2 shows an RNN unit, where X_t denotes the input vector at time t, O_t denotes the output vector at time t, and A_t denotes the hidden state at time t, which depends on the input vector and the previous hidden state. U denotes the weights of the hidden layer, V denotes the weights of the output layer, and W denotes the transition weights of the hidden layer. Equations (3) and (4) calculate the output and hidden vectors, respectively, where f is the activation function, which can be sigmoid, tanh, softmax, or ReLU:

O_t = f(V A_t)  (3)
A_t = f(U X_t + W A_{t−1})  (4)

The original, fully connected RNN suffers from the vanishing gradient problem when modeling long time series. To solve this problem, LSTM replaces the ordinary node in a hidden layer with a memory cell that has a complex internal gate structure, which gives LSTM a powerful learning capability. Because it can extract features automatically and incorporate exogenous variables easily, LSTM is expected to do well in crude oil price prediction; it overcomes the vanishing gradient problem of RNN and is well suited for long-term dependency problems. The detailed structure of the model is shown in Figure 3, where the cell state C records the long-term status of the sequence and the hidden state h records its current status. The first step is the forget gate layer, which decides which information will be discarded from the cell state. This is accomplished using a sigmoid layer, whose output value is between 0 and 1 and determines the degree to which the input information is forgotten, where 0 means completely forgetting and 1 means the opposite. It takes h_{t−1} and x_t as input and outputs a number in the range [0, 1], as shown in Equation (5):

f_t = σ(W_f x_t + U_f h_{t−1} + b_f)  (5)

where x_t is the input vector of the memory cell at time t, h_{t−1} is the value of the memory cell at time t − 1, W_f and U_f are weight matrices, and b_f is a bias vector. The next steps are the input gate layer and the tanh layer, which decide which information will be stored in the memory cell state. The input gate layer is a sigmoid layer that decides which values will be updated, and the tanh layer creates a vector of new candidate values Ĉ_t, as shown in Equations (6) and (7):

i_t = σ(W_i x_t + U_i h_{t−1} + b_i)  (6)
Ĉ_t = tanh(W_c x_t + U_c h_{t−1} + b_c)  (7)

where W_i, W_c, U_i, and U_c are weight matrices and b_i and b_c are bias vectors. The third step is to update the old cell state C_{t−1} into the new cell state C_t using Equation (8):

C_t = f_t ⊙ C_{t−1} + i_t ⊙ Ĉ_t  (8)
The final step generates the output based on a filtered cell state through two stages comprising Equations (9) and (10):

o_t = σ(W_o x_t + U_o h_{t−1} + b_o)  (9)
h_t = o_t ⊙ tanh(C_t)  (10)

where W_o and U_o are weight matrices and b_o is a bias vector.
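The gate computations of Equations (5)-(10) can be sketched as a single NumPy update step; the weight matrices below are random and the hidden size is an illustrative assumption, since the point is only to show the data flow through the gates.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM memory-cell update following Equations (5)-(10).
    p holds the weight matrices W_*, U_* and bias vectors b_*."""
    f_t = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])    # forget gate, Eq. (5)
    i_t = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])    # input gate, Eq. (6)
    c_hat = np.tanh(p["Wc"] @ x_t + p["Uc"] @ h_prev + p["bc"])  # candidates, Eq. (7)
    c_t = f_t * c_prev + i_t * c_hat                             # cell state, Eq. (8)
    o_t = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])    # output gate, Eq. (9)
    h_t = o_t * np.tanh(c_t)                                     # hidden state, Eq. (10)
    return h_t, c_t

rng = np.random.default_rng(0)
d_in, d_h = 1, 4   # univariate price input, 4 hidden units (illustrative sizes)
p = {f"W{g}": rng.normal(size=(d_h, d_in)) for g in "fico"}
p.update({f"U{g}": rng.normal(size=(d_h, d_h)) for g in "fico"})
p.update({f"b{g}": np.zeros(d_h) for g in "fico"})

h, c = np.zeros(d_h), np.zeros(d_h)
for price in [70.1, 71.3, 69.8]:     # a short synthetic price sequence
    h, c = lstm_step(np.array([price]), h, c, p)
print(h.shape)  # (4,)
```

Because h_t = o_t ⊙ tanh(C_t) with o_t in (0, 1), every component of the hidden state stays strictly inside (−1, 1), which keeps the recurrence numerically stable over long sequences.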

The Hybrid Model Architecture
The proposed hybrid model combines a CNN and LSTMs to forecast daily oil prices. The CNN model effectively uncovers and acquires novel features in time series data, while the LSTM model excels at capturing extended sequential dependencies. The combined CNN-LSTM model is good at time-based analysis and at abstracting meaningful features, and it has been widely applied in computer vision and natural language processing with highly satisfactory results (Liang et al. 2020). Our crude oil price prediction model learns a function that maps a sequence of past observations (past oil prices) as input to an output observation (the future oil price). As such, the sequence of observations must be transformed into multiple samples from which the LSTM can learn. For one-step prediction, we divide the sequence into multiple input/output samples, where three time steps are used as input and one time step is used as output. We experimented with three LSTM models: vanilla LSTM, stacked LSTM, and the proposed hybrid model.
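The windowing step described above (three time steps in, one time step out) can be sketched as follows; the price values are synthetic and for illustration only.

```python
import numpy as np

def split_sequence(seq, n_in, n_out=1):
    """Split a univariate series into overlapping input/output samples:
    n_in past observations as input, the next n_out as the target."""
    X, y = [], []
    for i in range(len(seq) - n_in - n_out + 1):
        X.append(seq[i:i + n_in])
        y.append(seq[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(y)

prices = [70, 71, 72, 73, 74, 75]      # synthetic daily prices
X, y = split_sequence(prices, n_in=3, n_out=1)
print(X.tolist())  # [[70, 71, 72], [71, 72, 73], [72, 73, 74]]
print(y.tolist())  # [[73], [74], [75]]
```

Each row of X is one training sample of three consecutive prices, and the corresponding row of y is the next-day price the model learns to predict.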
The vanilla LSTM model is composed of a single hidden-layer LSTM unit and an output layer for prediction; the number of LSTM units in the hidden layer is 50. The model is trained using the Adam stochastic gradient descent algorithm and optimized using the mean square error loss function. The stacked LSTM is composed of multiple LSTM hidden layers stacked on top of each other; we defined our model with two hidden layers, each with 50 LSTM units.
Our hybrid model is composed of a CNN and an LSTM model, where the CNN is used to interpret sub-sequences of the input that together are provided as input to the LSTM. Figure 4 presents the architecture of our hybrid model. We first format our training sample, using samples of an n × k matrix as input to the convolution layer. We split our time series data into input/output samples with four steps as input and one as output. Each sample can then be split into two sub-samples, each with two time steps. The CNN interprets each sub-sequence of two time steps and provides a time series of interpretations of the sub-sequences to the LSTM model as input. All experiments were run on a PC with a 1.8 GHz CPU and 64 GB RAM. The CNN and LSTM were implemented in Python 3.7.4 via Keras 2.4.3. The neural networks were trained using the Nadam algorithm with a default learning rate of 0.001.
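A minimal Keras sketch of this wiring (not the authors' exact configuration; the filter count and layer sizes are illustrative assumptions) wraps the CNN in a TimeDistributed layer so it is applied to each two-step sub-sequence, and feeds the resulting sequence of interpretations to the LSTM:

```python
# Sketch of the hybrid CNN-LSTM: 4 input steps -> 2 sub-sequences of 2 steps.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Conv1D, MaxPooling1D, Flatten,
                                     TimeDistributed, LSTM, Dense)

n_sub, n_steps, n_feat = 2, 2, 1

model = Sequential([
    Input(shape=(n_sub, n_steps, n_feat)),
    # The CNN interprets each 2-step sub-sequence independently...
    TimeDistributed(Conv1D(filters=64, kernel_size=1, activation="relu")),
    TimeDistributed(MaxPooling1D(pool_size=1)),
    TimeDistributed(Flatten()),
    # ...and the LSTM models the sequence of sub-sequence interpretations.
    LSTM(50),
    Dense(1),   # one-step-ahead price prediction
])
model.compile(optimizer="nadam", loss="mse")
```

Calling model.fit on windowed samples reshaped to (samples, 2, 2, 1) then trains the whole pipeline end to end.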

Multi-Step Prediction
A time series forecasting problem that requires the prediction of multiple time steps into the future is referred to as multi-step time series forecasting; specifically, it is a problem where the forecast horizon or interval is more than one time step. Two types of LSTM models can be used for multi-step forecasting: the vector output model and the encoder-decoder model.

Multi-Step Vector Output LSTM Model
Our multi-step LSTM model predicts a week (i.e., seven days) into the future. The LSTM directly outputs a vector that can be interpreted as a multi-step forecast. We extended our proposed CNN-LSTM model to include multi-step prediction. The input to the CNN-LSTM model is a vector consisting of a series of days, and the output of the model is a vector containing the price predictions for the following seven days. The multiple output strategy entails the construction of a unified model capable of performing one-shot predictions for the entire forecast sequence, as shown in Figure 5, where an input in the range [1, n] days is fed to the CNN-LSTM units, with an expected output of a sequence of n days of predictions.
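For the one-shot vector-output setup, the only change to the sample construction is that the target window widens to seven days; a sketch with synthetic prices:

```python
import numpy as np

def make_multistep_samples(seq, n_in, n_out):
    """Input: n_in past prices; target: the following n_out prices (one-shot)."""
    X, y = [], []
    for i in range(len(seq) - n_in - n_out + 1):
        X.append(seq[i:i + n_in])
        y.append(seq[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(y)

prices = list(range(60, 80))            # 20 synthetic daily prices
X, y = make_multistep_samples(prices, n_in=7, n_out=7)
print(X.shape, y.shape)  # (7, 7) (7, 7): 20 - 7 - 7 + 1 = 7 samples
```

Each target row y[i] is a full seven-day sequence, so the model's output layer emits the whole forecast vector in a single forward pass rather than iterating one day at a time.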


Encoder-Decoder LSTM Model
The encoder-decoder LSTM model adopts the autoencoder paradigm (Baldi 2012). It is suitable for addressing the task of multi-step time series forecasting, where both input and output sequences are involved. The problem is commonly referred to as a sequence-to-sequence (seq2seq) problem, and the model is specifically designed to solve problems such as text translation from one language to another.
The model architecture consists of two distinct sub-models, the encoder and the decoder, each playing a crucial role in the overall functioning of the model. The encoder, as its name implies, is responsible for processing and absorbing the input sequence. A vanilla LSTM model is the default choice for the encoder; however, alternative encoder models, including stacked LSTMs, bidirectional LSTMs, and CNN-based models, can be employed based on the specific requirements and characteristics of the input sequence.
The primary objective of the encoder is to generate a fixed-length vector that encapsulates the model's interpretation of the input sequence. This vector serves as a meaningful representation of the input information and is subsequently used by the decoder to generate the desired output sequence, as shown in Figure 6. The encoder input is the oil price for a sequence of days, and the decoder output is a sequence of predicted prices.
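The fixed-length-vector idea can be illustrated with a deliberately simplified toy recurrence (not an LSTM; the weights are random and the state size is an arbitrary assumption): the encoder folds an input price sequence of any length into one context vector, and the decoder unrolls a fixed number of predictions from it.

```python
import numpy as np

rng = np.random.default_rng(1)
d_h = 8   # size of the fixed-length context vector (illustrative)

def encode(seq, w_in, w_h):
    """Toy RNN encoder: fold the whole input sequence into one vector."""
    h = np.zeros(d_h)
    for x in seq:
        h = np.tanh(w_in * x + w_h @ h)
    return h   # the encoder's summary of the entire sequence

def decode(context, w_out, n_steps):
    """Toy decoder: unroll n_steps predictions from the context vector."""
    h, outputs = context, []
    for _ in range(n_steps):
        y = w_out @ h          # one predicted value per step
        outputs.append(float(y))
        h = np.tanh(h + y)     # feed the prediction back into the state
    return outputs

w_in = rng.normal(size=d_h)
w_h = rng.normal(size=(d_h, d_h))
w_out = rng.normal(size=d_h)

context = encode([70.2, 71.0, 69.5], w_in, w_h)   # synthetic prices
forecast = decode(context, w_out, n_steps=7)
print(len(forecast))  # 7 predictions from one fixed-length context vector
```

The essential property is visible even in this sketch: whatever the input length, the decoder only ever sees the d_h-dimensional context, exactly as in the LSTM encoder-decoder of Figure 6.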

Experimental Evaluation and Result Analysis
In this section, we evaluate the performance of our hybrid forecasting model using several evaluation criteria, and compare it with other oil price prediction techniques in the literature.


Dataset Description
We downloaded recent crude oil price data from MarketWatch covering ten years (2013 to 2022). The oil type is WTI crude oil, which serves as the benchmark North American oil price. We split the data into three sub-datasets, each composed of the daily prices of WTI crude oil.

• The first sub-dataset is a long-term dataset, which spans 10 years from January 2013 to December 2022, constituting 2521 data points. Figure 7 shows the evolution of the time series data, and Table 1 provides the descriptive analysis for this sub-dataset.

• The second sub-dataset is a medium-term dataset, which spans five years from January 2018 to December 2022, constituting 1262 data points. Figure 8 shows the evolution of the time series data, and Table 2 provides the descriptive analysis for this sub-dataset.

• The third sub-dataset is a short-term dataset, which covers only January to December 2022, constituting 251 data points. Figure 9 shows the evolution of the time series data, and Table 3 provides the descriptive analysis for this sub-dataset.




Our observations are summarized as follows: (1) The mean crude oil prices in the long- and medium-term datasets were close to USD 65 per barrel, while prices increased in the short-term period during 2022, hitting a mean of USD 94. This can be explained by the fact that, during the initial months of 2022, crude oil prices surged to levels surpassing USD 120 per barrel, the highest in the 10-year period; these elevated prices were considered a potential source of inflationary pressure on economic growth. This scenario stands in contrast to the sharp decline in crude oil prices observed during the spring of 2020, which was a direct response to the onset of the COVID-19 pandemic. (2) The price distribution is not normal: the skewness is greater than zero and the kurtosis is less than three, indicating a right-skewed distribution with thickened tails. (3) Fluctuations in oil prices exhibit diverse magnitudes and durations, implying a dynamic nonlinear structure in the data and suggesting the need for nonlinear models capable of accommodating these irregularities.

Each dataset was split into two parts, with 70% of the dataset used for training and 30% used for testing. Figure 10 shows the training and testing data for the three sub-datasets; the blue color represents the samples for training the model, while the orange color represents the samples for testing the model.

Evaluation Criteria
We used two standard performance metrics to measure the difference between the actual and predicted oil prices: the root mean square error (RMSE) and the mean absolute percentage error (MAPE), both of which have often been used (Zhang 2023). The first metric, RMSE, quantifies the difference between the actual and the predicted prices. If y_1, y_2, ..., y_n are the actual prices and ŷ_1, ŷ_2, ..., ŷ_n are the corresponding predicted prices, then the RMSE is calculated using Equation (11):

RMSE = sqrt( (1/n) ∑_{i=1}^{n} (y_i − ŷ_i)² )  (11)
The RMSE is by far the most frequently used metric for measuring the performance of models that predict commodity prices. It gives a higher weight to larger absolute errors.
The second metric, MAPE, is scale-independent and therefore suited to comparing the results of different models. Expressed as a percentage, it is calculated using Equation (12):

$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$  (12)
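Equations (11) and (12) translate directly into code. A minimal numpy implementation:

```python
import numpy as np

def rmse(y_true, y_pred):
    # Root mean square error, Equation (11).
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    # Mean absolute percentage error, Equation (12); assumes no zero prices.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))
```

For example, predicting 110 and 180 against actual prices of 100 and 200 gives a MAPE of 10%, since both predictions are off by 10% of the actual value.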

Parameter Tuning and Optimization
For parameter tuning and optimization, we experimented with different optimizers and activation functions during the training phase. The optimizers included Adam, Adadelta, Adagrad, Adamax, Nadam, Ftrl, and RMSprop. The activation functions tested were ReLU, Softplus, Softsign, tanh, SELU, and ELU. Table 4 shows the RMSE obtained with each optimizer and activation function. From this table, we conclude that the best optimizer is Nadam and the best activation function is ReLU.
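A tuning loop of this kind can be sketched in Keras as below. This is not the authors' code: the window length (30 days), the layer sizes, the synthetic data, and the trimmed 2×2 grid are assumptions for illustration; the paper's full grid covers 7 optimizers × 6 activations.

```python
import numpy as np
import tensorflow as tf

def build_model(optimizer: str, activation: str) -> tf.keras.Model:
    # Small Conv1D + LSTM network standing in for the paper's architecture.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(30, 1)),
        tf.keras.layers.Conv1D(32, kernel_size=3, activation=activation),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.LSTM(50),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=optimizer, loss="mse")
    return model

# Trimmed grids so the sketch runs quickly.
optimizers = ["nadam", "adam"]    # also: adadelta, adagrad, adamax, ftrl, rmsprop
activations = ["relu", "tanh"]    # also: softplus, softsign, selu, elu

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 30, 1)).astype("float32")  # synthetic stand-in data
y = rng.normal(size=(64, 1)).astype("float32")

results = {}
for opt in optimizers:
    for act in activations:
        m = build_model(opt, act)
        m.fit(X, y, epochs=1, batch_size=16, verbose=0)
        results[(opt, act)] = float(np.sqrt(m.evaluate(X, y, verbose=0)))
best = min(results, key=results.get)  # the paper reports Nadam + ReLU as best
```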

One-Step-Ahead Prediction
To evaluate the effectiveness of our proposed model, we compared it with two other models based on long short-term memory (LSTM). The first was a vanilla LSTM, with a single layer of LSTM units followed by an output layer for the oil price prediction. The second was a stacked LSTM, in which multiple hidden LSTM layers are stacked one on top of another.
To further verify the efficacy of our proposed model, we compared it with three other benchmark models, namely ARIMA, SVM, and CNN. Our model is a hybrid composed of a CNN and an LSTM; we used a CNN because it can automatically extract features from one-dimensional data. Tables 5-7 show the RMSE and MAPE for the three sub-datasets. Table 5 shows the results of the long-term predictions. Here, the vanilla LSTM, stacked LSTM, and CNN models yielded similar performances. The proposed CNN-LSTM model improved on them, obtaining the lowest RMSE; similar behavior was observed with MAPE. The SVM and ARIMA models had the lowest accuracy, with the highest error rates.
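The hybrid architecture described above can be sketched as follows. This is a plausible Keras reading of the design, not the authors' exact model: the 30-day window, filter count, and LSTM width are hypothetical choices for illustration.

```python
import numpy as np
import tensorflow as tf

def make_cnn_lstm(n_steps: int = 30) -> tf.keras.Model:
    # CNN front-end extracts local patterns from the 1-D price window;
    # the LSTM then models temporal dependencies in those features.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_steps, 1)),
        tf.keras.layers.Conv1D(64, kernel_size=3, activation="relu"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.LSTM(50),
        tf.keras.layers.Dense(1),   # one-step-ahead price
    ])
    model.compile(optimizer="nadam", loss="mse")
    return model

model = make_cnn_lstm()
# A forward pass on dummy input yields one predicted price per sample.
pred = model.predict(np.zeros((2, 30, 1), dtype="float32"), verbose=0)
```

The vanilla and stacked LSTM baselines differ only in the recurrent part: one `LSTM` layer for the former, several with `return_sequences=True` for the latter.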
Table 6 shows the results of the medium-term prediction. While the medium-term RMSE values were worse than the long-term ones, the medium-term MAPE values were better. Once again, the proposed CNN-LSTM model yielded the most accurate predictions. The SVM model performed poorly compared to the other models on the medium-term dataset. This is because the medium-term dataset covers the years 2018 to 2022 and the testing data came from the last 30% of that period, approximately 2021 and 2022; the SVM model could not predict the spikes in oil prices during the Russia-Ukraine crisis. Deep learning models tend to perform better under high volatility.
Table 7 shows the results of the short-term prediction, which fell between the long- and medium-term results. From Tables 5-7, we can observe that the hybrid CNN-LSTM model outperformed the other models on all three sub-datasets, achieving the lowest RMSE and MAPE values.
Figures 11-13 depict the actual oil prices versus the predicted ones on the three sub-datasets, using the hybrid model. In Figure 11, we can observe an almost perfect match between the predicted and actual values, except for the days between 50 and 70. As can be seen from the figure, the deviation in this interval corresponds to the high fluctuation in oil prices during the COVID-19 period. In Figure 12, we plotted the actual and predicted oil prices from 2018 to 2022. The training data, 70% of this interval, corresponded approximately to the years 2018, 2019, and 2020, leaving 2021 and 2022 for testing. Our model captured the price changes with excellent performance, even during the recent Russia-Ukraine conflict. Figure 13 shows the actual and predicted oil prices on the short-term dataset; the oil price trend was recognized by our model with only minor deviations. From the above, we conclude that the proposed model performs better than the benchmark models.
Further analysis was conducted to verify the obtained results. We ran a simple moving average (SMA) on the output of the machine learning models, including SVM, CNN, LSTM, and CNN-LSTM. Moving averages are among the main indicators in technical analysis: the SMA is the average over a specified period, computed as a series of averages over fixed-size subsets of the predicted output. Figure 14a shows the SMA of the actual prices versus the predicted prices of the SVM, CNN, LSTM, and CNN-LSTM models on the short-term dataset, with a window size of 3 days. Similarly, Figure 14b shows the SMA of the actual versus the predicted output of the models on the medium-term dataset, with a window size of 300 days. The SMA of the SVM is relatively far from the actual SMA compared with the other models, showing clearly that every model except SVM captured the trend. In some intervals, the CNN and LSTM models were closer to the actual SMA, but the CNN-LSTM model showed a better overall performance.
Figure 15 shows the simple moving average of the actual versus the predicted output of the models on the long-term dataset, with a window size of 50 days. As the accuracy was high and it was hard to spot the differences among the models, we enlarged the first and last intervals of the SMA. The overall SMA is plotted in Figure 15a, with two rectangles delineating the enlarged intervals. Figure 15b-d shows zoomed plots of the interval from point 0 to 150, and Figure 15e-g shows zoomed plots of the interval from point 550 to 700. The figures show the superiority of the CNN-LSTM model, whose black line (the SMA of the CNN-LSTM predicted prices) is closest to the green line (the SMA of the actual prices).
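The SMA computation used in this analysis reduces to a windowed mean. A minimal numpy sketch (the trailing-window convention is an assumption; the paper does not specify alignment):

```python
import numpy as np

def sma(series, window):
    # Simple moving average: each output point is the mean of `window`
    # consecutive values; output length is len(series) - window + 1.
    series = np.asarray(series, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")
```

For example, `sma([1, 2, 3, 4, 5], 3)` yields `[2.0, 3.0, 4.0]`. Applying this to both the actual prices and each model's predictions, with window sizes of 3, 300, and 50 days, reproduces the curves compared in Figures 14 and 15.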

Multi-Step-Ahead Prediction
In these experiments, we focused on comparing several deep learning models by calculating their average performance. We conducted experiments with different multi-step vector output models, namely stacked LSTM, CNN, and CNN-LSTM, as well as the encoder-decoder LSTM model. Tables 8-13 summarize the results. The RMSE values for t+1 to t+7 on the long-, medium-, and short-term datasets are given in Tables 8-10, and the corresponding MAPE values in Tables 11-13; the lowest value in each table is highlighted in bold. We can observe that the vector output CNN-LSTM model was superior to the other models in terms of both RMSE and MAPE.
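Multi-step vector output prediction requires reframing the series so that each sample maps a window of past prices to the next seven prices. A sketch of that windowing, with an assumed 30-day input window (the paper fixes only the 7-day output horizon):

```python
import numpy as np

def make_windows(series, n_in=30, n_out=7):
    # Slide over the series: each sample maps n_in past prices
    # to the following n_out prices (the t+1 ... t+7 targets).
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    # X: (samples, n_in, 1) for Conv1D/LSTM input; y: (samples, n_out).
    return np.array(X)[..., None], np.array(y)

X, y = make_windows(np.arange(100.0))
```

A vector output model then ends in a `Dense(7)` layer trained against `y`, whereas the encoder-decoder variant emits the seven steps recurrently.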
We also compared the performance of the CNN-LSTM and the encoder-decoder model on the t+1 and t+7 days. Figures 16 and 17 show the accuracy of the price prediction for the first day, using the vector output CNN-LSTM model and the encoder-decoder model on the long-term dataset, respectively. Figures 18 and 19 show the accuracy for the seventh day, using the two models on the long-term dataset. We note that the accuracy for the first day was higher than for the seventh, which was expected; nonetheless, the prediction accuracy for the seventh day remained high and acceptable.
When we observed the accuracy of both models, we clearly spotted the superiority of the vector output CNN-LSTM model over the encoder-decoder model for the seventh day. Figures 20 and 21 show the accuracy of the price prediction for the first day, using the vector output CNN-LSTM model and the encoder-decoder model on the medium-term dataset, respectively, and Figures 22 and 23 show the accuracy for the seventh day, using the two models on the medium-term dataset.
Again, the superiority of the vector output CNN-LSTM model over the encoder-decoder model for both the first and seventh days is clear. Figures 24 and 25 show the accuracy of the price prediction for the t+1 day, using the vector output CNN-LSTM model and the encoder-decoder model on the short-term dataset, respectively, and Figures 26 and 27 show the accuracy for the t+7 day. Although Figures 24-27 show close accuracies for the two models, the vector output CNN-LSTM model achieved the higher accuracy. We conclude that the proposed vector output CNN-LSTM model yielded higher accuracy than the other models in this paper, including the encoder-decoder model, on the short-, medium-, and long-term datasets.

Conclusions
This paper proposed a model based on a CNN and an LSTM to predict the WTI crude oil price. Crude oil prices are difficult to predict due to their high volatility. We used two deep learning models, CNN and LSTM, which are useful for modeling nonlinear dynamics, and presented a hybrid model that combines them. The accuracy of all the experimental models was high, but the CNN-LSTM model performed slightly better: it had the lowest RMSE and MAPE among the compared models. This indicates the effectiveness of the CNN-LSTM model, compared to the other models applied to the WTI crude oil market.
Publicly available crude oil price data were used to train the models. Our experiments included short-, medium-, and long-term datasets. Investors can use the model trained on long-term samples to develop a long-term investment plan, and the one trained on short-term samples to make short-term investment decisions.
In addition to daily price prediction, we extended our study to multi-step prediction up to seven days into the future. Our experiments included the vector output CNN-LSTM model and the encoder-decoder LSTM model, tested on short-, medium-, and long-term datasets. The two models had close accuracies, but the vector output CNN-LSTM model outperformed the encoder-decoder LSTM model.
In our experiments, the model predicts oil prices in four stages. In the future, we plan to investigate the impact of the number of stages during model training on overall performance. We also aim to combine the proposed model with optimization algorithms and investigate their effectiveness in increasing prediction accuracy. In addition, information from different online media sources could be integrated into the system.

Figure 1 .
Figure 1. Calculation of the output v_{1,1} by applying an F × F convolution filter to an input layer represented by an N × N matrix.

Figure 3 .
Figure 3.One memory cell of a long short-term memory network.

Figure 7 .
Figure 7. Daily crude oil prices for the long-term period.


Figure 8 .
Figure 8. Daily crude oil prices for the medium-term period.


Figure 9 .
Figure 9. Daily crude oil prices for the short-term period.

Figure 10 .
Figure 10. The training and testing data for the long-, medium-, and short-term datasets.


Figure 11 .
Figure 11. The actual versus the predicted oil price using the hybrid model on the long-term dataset.

Figure 12 .
Figure 12. The actual versus the predicted oil price using the hybrid model on the medium-term dataset.


Figure 14 .
Figure 14. (a) Simple moving average of the actual prices versus the predicted prices on the short-term dataset. (b) Simple moving average of the actual prices versus the predicted prices on the medium-term dataset.

Figure 13 .
Figure 13. The actual versus the predicted oil price using the hybrid model on the short-term dataset.


Figure 15 .
Figure 15. Simple moving average of the actual prices versus the predicted prices on the long-term dataset, with an enlarged view of six time intervals.


Figure 16 .
Figure 16. The actual versus the predicted oil price, using the vector output CNN-LSTM model on the long-term dataset, for the t+1 day price prediction.

Figure 17 .
Figure 17. The actual versus the predicted oil price, using the encoder-decoder model on the long-term dataset, for the t+1 day price prediction.


Figure 18 .
Figure 18. The actual versus the predicted oil price, using the vector output CNN-LSTM model on the long-term dataset, for the t+7 day price prediction.


Figure 19 .
Figure 19. The actual versus the predicted oil price, using the encoder-decoder LSTM model on the long-term dataset, for the t+7 day price prediction. When we observed the accuracy of both models, we clearly spotted the superiority of the vector output CNN-LSTM model over the encoder-decoder model for the seventh day. Figures 20 and 21 show the accuracy of the price prediction for the first day, using the vector output CNN-LSTM model and the encoder-decoder model on the medium-term dataset, respectively.

Figure 20 .
Figure 20. The actual versus the predicted oil price, using the vector output CNN-LSTM model on the medium-term dataset, for the t+1 day price prediction.


Figure 21 .
Figure 21. The actual versus the predicted oil price, using the encoder-decoder model on the medium-term dataset, for the t+1 day price prediction.

Figure 22 .
Figure 22. The actual versus the predicted oil price, using the vector output CNN-LSTM model on the medium-term dataset, for the t+7 day price prediction.

Figure 23 .
Figure 23. The actual versus the predicted oil price, using the encoder-decoder LSTM model on the medium-term dataset, for the t+7 day price prediction.


Figure 24 .
Figure 24. The actual versus the predicted oil price, using the vector output CNN-LSTM model on the short-term dataset, for the t+1 day price prediction.


Figure 25 .
Figure 25. The actual versus the predicted oil price, using the encoder-decoder model on the short-term dataset, for the t+1 day price prediction.

Figures 26 and 27 show the accuracy of the price prediction for the t+7 day, using the vector output CNN-LSTM model and the encoder-decoder model on the short-term dataset, respectively.

Figure 26 .
Figure 26.The actual versus the predicted oil price, using the vector output CNN-LSTM model on the short-term dataset for the t+7 day price prediction.


Figure 27 .
Figure 27. The actual versus the predicted oil price, using the encoder-decoder LSTM model on the short-term dataset, for the t+7 day price prediction.


Table 4 .
RMSE of various optimizers versus activation functions.The lowest RMSE is highlighted in the bold font.

Table 5 .
RMSE and MAPE of each model on the long-term dataset.The lowest RMSE and MAPE are highlighted in the bold font.

Table 6 .
RMSE and MAPE of each model on the medium-term dataset.The lowest RMSE and MAPE are highlighted in the bold font.

Table 7 .
RMSE and MAPE of each model on the short-term dataset.The lowest RMSE and MAPE are highlighted in the bold font.


Table 8 .
RMSE of each model on the long-term dataset over seven consecutive days.The lowest RMSE is highlighted in the bold font.

Table 9 .
RMSE of each model on the medium-term dataset over seven consecutive days.The lowest RMSE is highlighted in the bold font.

Table 10 .
RMSE of each model on the short-term dataset over seven consecutive days.The lowest RMSE is highlighted in the bold font.

Table 11 .
MAPE of each model on the long-term dataset over seven consecutive days.The lowest MAPE is highlighted in the bold font.

Table 12 .
MAPE of each model on the medium-term dataset over seven consecutive days.The lowest MAPE is highlighted in the bold font.

Table 13 .
MAPE of each model on the short-term dataset over seven consecutive days.The lowest MAPE is highlighted in the bold font.
