Article

Improving Deep Learning Models by Bayesian Optimization to Predict Crude Oil Prices

Department of Supply Chain and Business Technology Management, Concordia University, Montreal, QC H3G 1M8, Canada
*
Author to whom correspondence should be addressed.
Algorithms 2025, 18(12), 762; https://doi.org/10.3390/a18120762
Submission received: 6 November 2025 / Revised: 29 November 2025 / Accepted: 30 November 2025 / Published: 2 December 2025
(This article belongs to the Section Algorithms for Multidisciplinary Applications)

Abstract

We implement, optimize, and compare the performance of deep learning models in forecasting prices of crude oil markets, namely West Texas Intermediate (WTI) and Brent. We focus on deep learning models as these are state-of-the-art forecasting systems for complex and nonlinear time series. In this regard, we implement convolutional neural networks (CNNs), long short-term memory (LSTM), and gated recurrent units (GRUs). Classical recurrent neural networks (RNNs) are chosen as the baseline artificial neural networks. We contribute to the literature by examining the effect of fine-tuning the parameters of the predictive systems by means of Bayesian optimization (BO) on their performance. Also, to check the robustness of the optimized models, they are trained and tested on daily, weekly, and monthly data. The assessment of forecasting performance is based on three different metrics: the root mean squared error (RMSE), mean absolute deviation (MAD), and mean absolute percentage error (MAPE). The simulation results show that GRU-BO and RNN-BO are the best systems to predict Brent and WTI prices, respectively. In addition, the simulation results show that BO enhances the accuracy of the predictive models. The results obtained would help oil producers, suppliers, traders, and investors to implement the appropriate prediction system for each market to improve accuracy and generate profits for each time horizon.

1. Introduction

Crude oil, a key energy source in modern economies, holds a vital position in global business operations. The fluctuations in crude oil prices are influenced by a multitude of factors. Therefore, predicting these movements has been a fascinating and challenging research topic in the fields of economics and finance as it is vital for decision-making in the energy, financial, and regulatory sectors. However, crude oil price data exhibits a nonlinear nature and sudden changes. Therefore, it is difficult to comprehend and predict crude oil prices with accuracy.
Uncertainty in crude oil price forecasting poses significant challenges for businesses and governments, thus leading to suboptimal resource allocation and investment decisions. This lack of price foresight exposes the companies to unexpected fluctuations, increasing their financial risks and potential losses. Unpredictable oil prices can disrupt financial markets and aggravate economic volatility, leading to market instability and adverse economic consequences. Moreover, inefficient resource allocation is due to unreliable forecasts of production, inventory management, and supply chain operations, reducing competitiveness and profitability. In summary, accurate crude oil price forecasting is crucial for risk management and stability in global energy markets, the failure of which can give rise to operation planning uncertainty, financial risks, market instability, and delayed economic growth and competitiveness.
Traditional statistical models have been widely applied in crude oil market prediction. For instance, Moshiri and Foroutan [1] utilized a combination of generalized autoregressive conditional heteroskedasticity (GARCH) and autoregressive integrated moving average (ARIMA). In another study by Gao et al. [2], crude oil price data underwent decomposition into multiple modes using ensemble empirical mode decomposition (EEMD). Subsequently, average mutual information was employed to reconstruct the data into stochastic and deterministic elements. The ARIMA model was then utilized to analyze price patterns within these elements, with a Kalman filter (KF) employed to define input parameters. Similarly, Wu et al. [3] applied complementary EEMD to mitigate end effects and mode mixing by decomposing the original crude oil price time series into intrinsic mode functions (IMFs). They subsequently employed ARIMA and sparse Bayesian learning with shared kernels to forecast target values for each IMF individually, with final predictions adaptively selected based on training precision. Marchese et al. [4] investigated the correlation between crude oil prices and its derivatives by exploring fractionally integrated multivariate GARCH (FIGARCH) methods. Notably, these traditional methods (ARIMA, GARCH, FIGARCH, KF) exhibit relatively static operational characteristics and simplicity. Indeed, these methods often rely on strict assumptions regarding the data, such as crude oil time series, despite real-world crude oil price data being nonlinear, non-stationary, highly complex, volatile, and prone to structural breaks.
Further, cutting-edge machine-learning methods are leading the way as they can extract meaningful information from large and complex data. These methods uncover complex linkages that are frequently missed by traditional statistical models. Indeed, traditional statistical methods do not give sound results because they cannot consider all the nonlinear relationships within the data. To address the limitations of traditional econometric models, other authors used the k-nearest neighbor algorithm (kNN) [5], random forests [6], support vector regression (SVR) [7], kernel extreme learning machines (KELMs) [8], and ensemble systems [9,10].
On the other hand, neural networks excel at capturing nonlinear relationships and complex patterns, allowing them to model interactions between various factors more effectively than linear models or time series methods. In this regard, neural networks, and more specifically deep learning, are receiving growing attention in crude oil price forecasting. Indeed, deep learning stands out as a promising avenue for predicting crude oil prices due to its remarkable ability to model complex nonlinear time series data. Consequently, it has garnered considerable research interest in crude oil market prediction. For instance, Niu et al. [11] combined convolutional neural networks (CNNs) and gated recurrent units (GRUs), Busari and Lim [12] compared GRUs and long short-term memory (LSTM) models, Karasu and Altan [13] explored how CNNs and LSTM can be used to analyze financial indicators to accurately predict price trends in crude oil markets, and Sen and Choudhury [14] utilized LSTM and GRUs tuned by particle swarm optimization (PSO). Other authors focused on using LSTM [15,16] to predict crude market prices, while others focused on predicting crude market volatility using CNNs [17].
The main purpose of this work is to implement, optimize, and compare various deep learning models in the task of forecasting crude oil markets. In particular, our investigation includes state-of-the-art models including CNNs, LSTM, and GRUs. Indeed, these models are adept at capturing temporal dependencies and nonlinear patterns in time series data, making them particularly suitable for the volatile nature of crude oil markets. Recurrent neural networks (RNNs) are used as a baseline reference model for comparison.
In summary, we rely on deep learning (DL) as it offers several advantages compared to traditional machine learning. First, DL algorithms possess the ability to automatically learn features from data. Second, they are effective in processing large and multipart datasets. Third, DL algorithms capture complex nonlinear relationships within data that may be difficult to discover using conventional methods. As a result, they contribute to improving representation of complex patterns in the original data. Fourth, DL models are effortlessly scalable and able to generalize in diverse scenarios thanks to their capacity to learn abstract and hierarchical representations of data. Finally, DL has been proven to provide superior performance in forecasting time series [11,12,13,14,15,16,17].
In our study, we consider two predominant oil markets: West Texas Intermediate (WTI) and Brent (European market). By using a comprehensive dataset which spans from 1987 to 2023, a deep historical context is provided, leading to more reliable predictions. The inclusion of these two benchmarks, which are indicators of global oil prices, ensures that the forecasting models account for a broad spectrum of factors influencing the oil market.
The Bayesian optimization algorithm is applied to all neural networks in our study to tune their respective hyperparameters. By tuning a range of hyperparameters, Bayesian optimization streamlines the refinement process of all three deep learning models and the RNN. In this regard, our study contributes to the literature by demonstrating its effectiveness in the context of crude oil price forecasting, showcasing enhanced predictive performance. In addition, in our study, we forecast crude oil markets across various time horizons to address the diverse needs of producers, suppliers, and importers, whose decisions are influenced by price fluctuations over different periods. Specifically, by evaluating daily, weekly, and monthly horizons, the models provide insights tailored to the specific planning and operational timelines relevant to stakeholders in the oil market. In the realm of production planning, deep learning models can provide accurate price forecasts, enabling companies to allocate resources more efficiently and optimize production schedules in line with market demands. This can lead to cost savings and increased profitability. For inventory management, the predictive power of these models helps in maintaining optimal inventory levels, reducing the risk of surplus or shortages, and ensuring a steady supply in fluctuating markets.
The main contributions of our work are summarized as follows. First, deep learning models including the CNN, LSTM, and GRUs are implemented and compared in the task of forecasting crude oil market prices. Second, the predictive models are tested on the largest crude oil markets, namely WTI and Brent. Third, their respective effectiveness is assessed across three different time horizons using daily, weekly, and monthly data. Indeed, daily forecasting is vital for day traders and short-term operational decisions, as it helps navigate the immediate volatility and capitalizes on rapid market movements. Weekly forecasts are essential for logistical planning and operational adjustments, as they allow for a mid-range outlook that can inform shipping schedules and refining activities. Monthly forecasting, on the other hand, aligns with the strategic decision-making of producers and importers, who must plan purchases and manage inventory with an eye on longer-term market trends and economic indicators. Fourth, the standard RNN is chosen as the baseline model as it is the most appropriate artificial neural network for time series analysis and forecasting. Fifth, we investigate the effectiveness of Bayesian optimization on improvement of the forecasting accuracy of all four artificial neural networks considered in our work. Finally, it is expected that our research will shed light on the best deep learning system in forecasting the price of each crude oil market and for each time horizon. The results would provide valuable information to stakeholders so that they can improve the accuracy of their forecasts, thereby making more informed decisions about procurement, production, and distribution. In addition, the insights gained from this study are expected to help oil producers and suppliers in better managing their relationships with stakeholders, including investors, governments, and downstream operators. 
Indeed, with more reliable price forecasts, they can provide more accurate guidance and build trust through transparency, potentially leading to more favorable investment opportunities. In essence, this study equips oil producers and suppliers with advanced tools and methodologies to navigate the complexities of the global oil market more effectively across time horizons, enabling them to enhance operational efficiencies, reduce risk, and capitalize on market opportunities in a strategic and informed manner.
The rest of this paper is organized as follows. Section 2 describes the predictive systems and performance measures. Section 3 describes the data and forecasting protocol and provides the experimental results. Finally, Section 4 discusses the results and conclusions.

2. Materials and Methods

In this study, we implement deep learning models in the task of forecasting WTI and Brent prices. The deep learning models are the CNN, LSTM, and GRUs. The RNN is used as the baseline model for comparison. Each artificial neural network (CNN, LSTM, GRU, and RNN) is used to predict prices of each market before and after applying Bayesian optimization (BO) to tune its parameters. Finally, three performance measures are employed to assess the effectiveness of each model pre- and post-BO, namely the root mean squared error (RMSE), mean absolute deviation (MAD), and mean absolute percentage error (MAPE). The methodology is presented in the flowchart shown in Figure 1. The models are described next.

2.1. Convolutional Neural Networks

The CNN [18] is a deep learning neural network whose topology consists of a sequence of convolution layers and pooling layers. The convolution layer extracts deep features from the input series by applying several convolution kernels to the inputs. A pooling layer then processes the output of the convolution layer to shrink the dimensionality of the system so that important patterns are retained and redundant ones are discarded. In particular, the output of the convolution layer is given by
$$l_t = \sigma(x_t \ast k_t + b_t)$$
where $l_t$ is the output of the convolution layer, $\sigma$ is the activation function, $x_t \in \mathbb{R}^d$ is the input, $k_t \in \mathbb{R}^d$ is the parameter of the convolution kernel, $\ast$ denotes the convolution operation, and $b_t$ is the bias term.
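As a minimal illustration of the convolution and pooling operations above, the following NumPy sketch applies one convolution kernel to a univariate price series and then max-pools the result. The kernel values, pool width, and activation are arbitrary illustrative choices, not the configuration used in this paper.

```python
import numpy as np

def conv1d_layer(x, kernel, bias=0.0, activation=np.tanh):
    """Valid 1-D convolution of series x with a single kernel, then activation
    (the l_t = sigma(x_t * k_t + b_t) step)."""
    k = len(kernel)
    z = np.array([np.dot(x[i:i + k], kernel) + bias
                  for i in range(len(x) - k + 1)])
    return activation(z)

def max_pool(x, width=2):
    """Non-overlapping max pooling: keeps the dominant value in each window."""
    n = len(x) // width
    return x[:n * width].reshape(n, width).max(axis=1)
```

In a full CNN, several such kernel/pooling stages would be stacked before a dense output layer that emits the price forecast.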

2.2. Long Short-Term Memory Neural Networks

The LSTM [19] incorporates three gate units, namely the input gate, forget gate, and output gate. For instance, the input gate layer recognizes relevant information pertaining to the current input, the forget gate is used to discard irrelevant information, and the output gate layer is employed to determine the values for the next hidden state. The units of the LSTM are expressed as follows:
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$
$$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$$
$$h_t = o_t \odot \tanh(c_t),$$
where $x_t \in \mathbb{R}^d$ is the input, $h_t \in \mathbb{R}^h$ the hidden state, $f_t$ the forget gate, $i_t$ the input gate, $o_t$ the output gate, $\tilde{c}_t$ the temporary (candidate) cell state, $c_t \in \mathbb{R}^h$ the cell state, and $W \in \mathbb{R}^{h \times d}$, $U \in \mathbb{R}^{h \times h}$, $b \in \mathbb{R}^h$ the elements of the parameters.
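The gate equations above can be written as a single time step in NumPy; this is a sketch of the cell mechanics only (the parameter dictionary `p` and its key names are illustrative conventions, not the paper's implementation).

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step following the gate equations above.
    p maps names to parameters: W* are (h x d), U* are (h x h), b* are (h,)."""
    f = sigmoid(p["Wf"] @ x + p["Uf"] @ h_prev + p["bf"])        # forget gate
    i = sigmoid(p["Wi"] @ x + p["Ui"] @ h_prev + p["bi"])        # input gate
    c_tilde = np.tanh(p["Wc"] @ x + p["Uc"] @ h_prev + p["bc"])  # candidate cell state
    c = f * c_prev + i * c_tilde                                 # new cell state
    o = sigmoid(p["Wo"] @ x + p["Uo"] @ h_prev + p["bo"])        # output gate
    h = o * np.tanh(c)                                           # new hidden state
    return h, c
```

Iterating this step over a window of past prices yields the final hidden state, from which a dense layer would produce the forecast.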

2.3. Gated Recurrent Units

The GRU system [20] can handle the gradient vanishing problem and has a simple structure with a limited number of parameters. In addition, it has high computational efficiency. The GRU elements are as follows:
$$z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)$$
$$r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)$$
$$\hat{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)$$
$$h_t = z_t \odot \hat{h}_t + (1 - z_t) \odot h_{t-1}$$
where $x_t \in \mathbb{R}^d$ is the input, $h_t \in \mathbb{R}^h$ is the hidden state, $z_t$ is the update gate, $r_t$ is the reset gate, $\hat{h}_t$ is the candidate activation vector, $W \in \mathbb{R}^{h \times d}$, $U \in \mathbb{R}^{h \times h}$, $b \in \mathbb{R}^h$ are the parameter matrices and vectors, and $\sigma$ is an activation function.
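For comparison with the LSTM, one GRU time step can be sketched as follows; note the single hidden state and the smaller parameter set, which is the source of the GRU's computational efficiency. The parameter dictionary and key names are illustrative assumptions.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x, h_prev, p):
    """One GRU time step following the equations above.
    p maps names to parameters: W* are (h x d), U* are (h x h), b* are (h,)."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])            # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])            # reset gate
    h_hat = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])  # candidate state
    return z * h_hat + (1.0 - z) * h_prev                            # interpolation
```

The final line interpolates between the previous hidden state and the candidate, which is how the GRU mitigates the vanishing-gradient problem with fewer parameters than the LSTM.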

2.4. Recurrent Neural Networks

RNNs [21] employ sequential data in their networks, in contrast to traditional artificial neural networks. This property is essential to time series forecasting since the inherent structure in the data sequence provides useful information. The RNN can be thought of as a short-term memory unit, with x being the input layer, y the output layer, and s the state (hidden) layer. The ability of RNNs to preserve an internal state or memory that changes when the artificial neural network receives new inputs is what makes them unique. Because of this memory mechanism, RNNs can analyze and incorporate context from previous inputs, which makes them highly skilled at processing sequential data like time series.
Letting μ be the value of the recurrent weight, and assuming for simplicity that the units are linear (i.e., the function ϕ is the identity function), the activation of the output unit at time t is given by
$$x_2(t) = \mu^t x_2(0) + \sum_{\tau=0}^{t-1} \mu^{\tau} w_{21} x_1(t-\tau)$$
where x 1 t is assumed to be constant over time. This equation shows that the trajectory of the network exponentially approaches a constant state for μ less than one and goes to infinity for larger values of μ . The RNN cell or central unit retains an internal hidden state containing information from previous time steps and processes each input in the sequence recurrently. The network’s predictions are impacted by this hidden state as it changes over time, considering both recent and past inputs.
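The exponential behavior described by this equation can be checked numerically. The sketch below evaluates the closed form for a constant input and confirms convergence to a constant state for $\mu < 1$ and divergence for $\mu > 1$ (the parameter values are arbitrary illustrations).

```python
import numpy as np

def linear_rnn_output(mu, w21, x1, x2_0, t):
    """Closed-form output of the linear recurrent unit:
    x2(t) = mu^t x2(0) + sum_{tau=0}^{t-1} mu^tau w21 x1, with x1 held constant."""
    taus = np.arange(t)
    return mu**t * x2_0 + w21 * x1 * np.sum(mu**taus)
```

For $\mu < 1$ the geometric sum converges to $w_{21} x_1 / (1 - \mu)$, which is the constant state the trajectory approaches.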

2.5. Bayesian Optimization

In this study, we employ Bayesian optimization (BO) [22], an approximation algorithm that is effective when evaluating the objective is computationally expensive and the space of candidate configurations is large. Specifically, BO finds the best parameters according to the conditional probability of the performance on the training set using a surrogate, that is, a probabilistic estimate of the objective function. In our setting, the objective function f to minimize is the forecasting error. In sum, the BO algorithm essentially involves the following two major phases: (i) updating the surrogate model in each iteration to find the best guess and measure the uncertainty of the model and parameters, and (ii) selecting the next sample point based on an acquisition function that guides the search toward the global optimum.
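As a schematic of these two phases, the following self-contained sketch minimizes a one-dimensional objective with a Gaussian-process surrogate (RBF kernel) and an expected-improvement acquisition function. The kernel length scale, candidate grid, and evaluation budgets are illustrative choices; the paper tunes network hyperparameters with BO rather than this toy objective.

```python
import numpy as np
from math import erf, exp, pi, sqrt

def _phi(z):
    """Standard normal pdf."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def _Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def _gp_posterior(X, y, Xs, length=0.3, noise=1e-6):
    """Surrogate model: GP with an RBF kernel; posterior mean/std at points Xs."""
    k = lambda a, b: np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)
    K_inv = np.linalg.inv(k(X, X) + noise * np.eye(len(X)))
    Ks = k(X, Xs)
    mu = Ks.T @ K_inv @ y
    var = 1.0 + noise - np.sum(Ks * (K_inv @ Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def bayes_opt(f, lo, hi, n_init=4, n_iter=12, seed=0):
    """Minimize a 1-D objective: refit the surrogate each iteration, then pick
    the next sample by maximizing expected improvement (EI) over a grid."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, n_init)
    y = np.array([f(v) for v in X])
    grid = np.linspace(lo, hi, 200)
    for _ in range(n_iter):
        mu, sd = _gp_posterior(X, y, grid)   # phase (i): update surrogate
        best = y.min()
        z = (best - mu) / sd
        ei = (best - mu) * np.vectorize(_Phi)(z) + sd * np.vectorize(_phi)(z)
        x_next = grid[np.argmax(ei)]         # phase (ii): acquisition step
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))
    return X[np.argmin(y)], y.min()
```

In the hyperparameter-tuning setting, `f` would map a candidate configuration (e.g., learning rate or number of units) to the model's validation error, and each evaluation would involve training the network.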

2.6. Performance Measures

In this work, three performance measures are adopted to evaluate the performance of each predictive model pre- and post-BO: RMSE, MAD, and MAPE. They are expressed as follows:
The RMSE measures the average magnitude of the errors between predicted and actual values, and it gives a higher weight to larger errors compared to smaller errors. It calculates the square root of the average of squared differences between predicted and actual values over ‘n’ observations. The RMSE is expressed as follows:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \hat{x}_i\right)^2}$$
$$MAD = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - \bar{x}\right|$$
$$MAPE = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{x_t - \hat{x}_t}{x_t}\right| \times 100$$
where $x_i$ represents the actual value of the target variable for the $i$th observation, $\hat{x}_i$ is the predicted value of the target variable for the $i$th observation, $\bar{x}$ is the average of the dataset, and $n$ is the total number of values in the dataset.
The MAD measures the average absolute deviation of a set of values from their mean; it is used to understand the variability or dispersion of a dataset. Finally, the MAPE measures the accuracy of predictions as a percentage of the absolute errors relative to the actual values.

3. Results

We selected our data from Federal Reserve Economic Data [23], which is a widely used and highly relevant data source for economic and financial analysis. Our WTI data span from 2 January 1986 to 18 September 2023, and our Brent data span from 20 May 1987 to 18 September 2023. For our comparative analysis we used the standard partition rule; specifically, we employed 80% of the samples for training the neural networks and the remaining 20% for testing. All models are applied to daily, weekly, and monthly samplings to check their performance across three different time horizons. In addition, all experiments are conducted under two scenarios: when the neural networks are not tuned, and when they are improved by Bayesian optimization. Figure 2 and Figure 3 respectively display the training and test sets for WTI and Brent crude oil prices.
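The 80/20 partition described above must preserve temporal order so that the test set lies strictly after the training period. A sketch of this split, together with a sliding-window construction for supervised training (the lookback length is an illustrative choice, not the paper's setting), might look like:

```python
import numpy as np

def chronological_split(series, train_frac=0.8):
    """Time-ordered split: no shuffling, so the test samples come
    strictly later in time than the training samples."""
    n = int(len(series) * train_frac)
    return series[:n], series[n:]

def make_windows(series, lookback):
    """Turn a price series into (input window, next value) pairs."""
    s = np.asarray(series, float)
    X = np.array([s[i:i + lookback] for i in range(len(s) - lookback)])
    y = s[lookback:]
    return X, y
```

The same construction applies at each sampling frequency; only the series passed in (daily, weekly, or monthly prices) changes.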

3.1. WTI Sampling and Forecasting Results

Based on Table 1, the RNN-BO model exhibits the best forecasting performance for daily WTI crude oil prices after Bayesian optimization, delivering an RMSE of 0.07338, an MAD of 0.04029, and an MAPE of 25.75%. According to Table 2, the RNN-BO model demonstrates superior forecasting accuracy for weekly WTI crude oil prices following Bayesian optimization, with an RMSE of 0.06764, an MAD of 0.07191, and an MAPE of 27.85%. Finally, according to Table 3, RNN-BO turns out to be the best model for forecasting monthly WTI prices, with the lowest RMSE, MAD, and MAPE at 0.07616, 0.06772, and 29.82%, respectively. The RNN-BO model's effectiveness stems from its ability to capture the sequential patterns present in daily, weekly, and monthly WTI price movements, a crucial requirement for time series forecasting, making it well-suited for this specific task. Regarding the role of BO, one can see that it improves the accuracy of all models for all forecasting horizons, as shown in Table 1, Table 2 and Table 3, except for the CNN in Table 1.

3.2. Brent Sampling and Forecasting Results

Considering the Bayesian optimization results from Table 4, it can be concluded that the GRU model is the best for forecasting Brent’s daily prices due to its superior reduction in error metrics. With the lowest RMSE at 0.05304, the lowest MAD at 0.0358, and the most significant reduction in the MAPE to 30.48%, the GRU-BO model outperforms CNN-BO, RNN-BO, and LSTM-BO in terms of prediction accuracy. This suggests that the GRU is more efficient at capturing the complexities of the dataset and yielding more accurate forecasts when enhanced with Bayesian optimization.
Based on the error evaluation values in Table 5, the GRU-BO model emerges as the most accurate for forecasting weekly Brent crude oil prices. It exhibits the lowest RMSE at 0.07888, an MAD of 0.07002, and the most significant reduction in the MAPE to 25.95%, indicating it delivers the closest predictions to the actual prices compared to the CNN, RNN, and LSTM models when Bayesian optimization is applied.
Upon evaluating the metrics for monthly oil price predictions in Table 6, the GRU model enhanced by Bayesian optimization stands out as the best. It records an RMSE of 0.0315 and an MAD of 0.06136, accompanied by a substantial reduction in the MAPE to 31.03%. Lastly, Table 4, Table 5 and Table 6 show strong evidence that BO significantly improves the performance of each model across all time horizons.

4. Discussion and Conclusions

Crude oil is playing an increasingly important role in the world economy. In this regard, scholars have conducted various research works on crude oil markets to analyze, understand, and equip stakeholders with the tools necessary to navigate the complex dynamics of oil price movements and to implement more informed and resilient strategies in such volatile markets. For instance, recent studies examined crude oil market efficiency [24,25], linkages between crude oil markets [26,27], price prediction [28], resource scheduling [29], and information spillover [30]. In this study, historical oil price data were captured from January 1986 to September 2023, a span of almost four decades, to predict pricing trends for future markets. The two selected datasets are WTI, which is extracted from fields located in Texas, and Brent crude, which is extracted from the North Sea near Europe; both are listed on US and European exchanges. To predict the prices of WTI and Brent crude oil, we conducted a comparative study of four neural network models: the CNN, LSTM, GRU, and RNN. The experimental results revealed that the GRU model outperformed its counterparts in forecasting Brent crude prices, while the RNN model was superior for WTI predictions, as determined by MAPE, MAD, and RMSE metrics.
Based on the comprehensive analysis, for Brent crude oil, the GRU model with Bayesian optimization emerges as the top performer across all time horizons (daily, weekly, and monthly), consistently achieving the lowest RMSE, MAD, and MAPE. The GRU model surpassed the CNN, the RNN, and LSTM in Brent crude forecasting primarily due to its effective long-term dependency handling and lower computational complexity. GRUs simplify the learning process while maintaining high accuracy, making them particularly suited for the dynamic Brent market.
For WTI crude oil, the RNN model after Bayesian optimization exhibits superior forecasting accuracy in daily, weekly, and monthly timeframes, consistently outperforming other models in terms of RMSE, MAD, and MAPE. The RNN model excelled in forecasting WTI crude oil prices due to its straightforward architecture, which is particularly effective at processing the types of patterns prevalent in the WTI market. This simplicity enables RNNs to quickly adapt to the market’s short-term volatilities driven by immediate supply and demand changes, inventory updates, and regional economic indicators without the computational overhead and complexity of LSTM and GRU models.
Recall that the GRU-BO is the best performer for forecasting Brent prices across all time horizons, and that RNN-BO is the best performer for forecasting WTI prices across all time horizons. To provide some plausible explanations for this, we computed the main descriptive statistics, as displayed in Table 7. As shown, the average price in Brent is higher than that in the WTI market, and the price in the Brent market is more volatile compared to that in the WTI market. Furthermore, the kurtosis of the distribution of Brent prices is lower than that of WTI prices. Finally, both distributions show similar negative skewness. Hence, on the one hand, one can conclude that GRU-BO can learn crude oil price distribution (for instance, in the Brent market) with a high price average, large volatility, and limited outliers. On the other hand, one can conclude that RNN-BO can learn crude oil price distribution (for instance, in the WTI market) with a low price average, low volatility, and significant outliers.
The application of Bayesian optimization significantly enhances the predictive capabilities of all models, resulting in notable improvements in forecasting accuracy for both Brent and WTI crude oil prices by leveraging probabilistic reasoning to systematically explore the hyperparameter space and identify the optimal configuration for each model. These findings underscore the critical role of hyperparameter tuning in optimizing model performance and highlight the effectiveness of these neural network models in capturing the complexities of crude oil price movements.
It is worth mentioning that price forecasting offers significant benefits to crude oil producers by guiding their strategic decision-making processes. With accurate price predictions, producers can better manage their production schedules, adjusting output to align with expected market conditions and price levels. This study’s emphasis on short-term market fluctuations, including time horizons such as weekly and monthly trends, allows producers and suppliers to fine-tune their strategies to be more responsive to market signals. By adopting such approaches, these stakeholders can improve the accuracy of their forecasts, thereby making more informed decisions about procurement, production, and distribution.
Despite the promising results of the GRU-BO and RNN-BO, some limitations remain to be highlighted. First, macroeconomic indicators were not included, mainly because most of them are available on a quarterly basis. Second, information regarding extreme political risk, such as from international military conflicts, was not considered in the current work. In this regard, future studies could use more inputs, such as geopolitical stability, international financial variables, and world economic indicators. Advancements in machine learning and artificial intelligence, including hybrid approaches, could lead to more precise forecasting models that can adapt to the multifaceted influences on oil prices. These models would be invaluable tools for countries, particularly those identified as highly vulnerable, enabling them to strategize more effectively in mitigating oil price risks. Specifically, for future work, to capture the intricate interplay of spatial and temporal dependencies within crude oil and improve accuracy, one could use multivariate time series graph neural networks with temporal attention and learnable adjacency matrices, spatial attention graphs with temporal convolutional networks, and attention-based spatial–temporal graph convolutional networks. Moreover, in future work, one can consider implementing vision transformers and their temporal variants as they have been proven to be effective in time series analysis and forecasting [31].
Indeed, accurate forecasts could inform national energy policies, guide investment in infrastructure, and aid in the strategic diversification of energy resources. Ultimately, such research tasks would contribute significantly to enhancing economic resilience against the backdrop of a dynamic and uncertain global energy market.

Author Contributions

Conceptualization, S.K. and S.L.; methodology, S.K. and S.L.; software, S.K.; validation, S.K. and S.L.; formal analysis, S.K.; investigation, S.K.; data curation, S.K.; writing—original draft preparation, S.K.; writing—review and editing, S.K. and S.L.; visualization, S.K. and S.L.; supervision, S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original data presented in the study are openly available in [23].

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BO	Bayesian optimization
CNN	Convolutional neural network
GRU	Gated recurrent unit
LSTM	Long short-term memory
MAD	Mean absolute deviation
MAPE	Mean absolute percentage error
RMSE	Root mean squared error
RNN	Recurrent neural network

References

1. Moshiri, S.; Foroutan, F. Forecasting Nonlinear Crude Oil Futures Prices. Energy J. 2006, 27, 81–96.
2. Gao, W.; Aamir, M.; Shabri, A.B.; Dewan, R.; Aslam, A. Forecasting Crude Oil Price Using Kalman Filter Based on the Reconstruction of Modes of Decomposition Ensemble Model. IEEE Access 2019, 7, 149908–149925.
3. Wu, J.; Chen, Y.; Zhou, T.; Li, T. An Adaptive Hybrid Learning Paradigm Integrating CEEMD, ARIMA and SBL for Crude Oil Price Forecasting. Energies 2019, 12, 1239.
4. Marchese, M.; Kyriakou, I.; Tamvakis, M.; Di Iorio, F. Forecasting crude oil and refined products volatilities and correlations: New evidence from fractionally integrated multivariate GARCH models. Energy Econ. 2020, 88, 104757.
5. Alam, M.S.; Murshed, M.; Manigandan, P.; Pachiyappan, D.; Abduvaxitovna, S.Z. Forecasting oil, coal, and natural gas prices in the pre- and post-COVID scenarios: Contextual evidence from India using time series forecasting tools. Resour. Policy 2023, 81, 103342.
6. Guo, J.; Zhao, Z.; Sun, J.; Sun, S. Multi-perspective crude oil price forecasting with a new decomposition-ensemble framework. Resour. Policy 2022, 77, 102737.
7. Yu, L.; Xu, H.; Tang, L. LSSVR ensemble learning with uncertain parameters for crude oil price forecasting. Appl. Soft Comput. 2017, 56, 692–701.
8. Zhang, T.; Tang, Z.; Wu, J.; Du, X.; Chen, K. Multi-step-ahead crude oil price forecasting based on two-layer decomposition technique and extreme learning machine optimized by the particle swarm optimization algorithm. Energy 2021, 229, 120797.
9. Wang, J.; Zhou, H.; Hong, T.; Li, X.; Wang, S. A multi-granularity heterogeneous combination approach to crude oil price forecasting. Energy Econ. 2020, 91, 104790.
10. Zhang, Y.; Lahmiri, S. A Deep Learning-Based Ensemble System for Brent and WTI Crude Oil Price Analysis and Prediction. Entropy 2025, 27, 1122.
11. Niu, T.; Wang, J.; Lu, H.; Yang, W.; Du, P. A Learning System Integrating Temporal Convolution and Deep Learning for Predictive Modeling of Crude Oil Price. IEEE Trans. Ind. Inform. 2021, 17, 4602–4612.
12. Busari, G.A.; Lim, D.H. Crude oil price prediction: A comparison between AdaBoost-LSTM and AdaBoost-GRU for improving forecasting performance. Comput. Chem. Eng. 2021, 155, 107513.
13. Karasu, S.; Altan, A. Crude oil time series prediction model based on LSTM network with chaotic Henry gas solubility optimization. Energy 2022, 242, 122964.
14. Sen, A.; Dutta Choudhury, K. Forecasting the Crude Oil prices for last four decades using deep learning approach. Resour. Policy 2024, 88, 104438.
15. Cen, Z.; Wang, J. Crude oil price prediction model with long short term memory deep learning based on prior knowledge data transfer. Energy 2019, 169, 160–171.
16. Nagendra Kumar, Y.J.; Preetham, P.; Kiran Varma, P.; Rohith, P.; Dilip Kumar, P. Crude Oil Price Prediction Using Deep Learning. In Proceedings of the 2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 15–17 July 2020; pp. 118–123.
17. Mohsin, M.; Jamaani, F. A novel deep-learning technique for forecasting oil price volatility using historical prices of five precious metals in context of green financing—A comparison of deep learning, machine learning, and statistical models. Resour. Policy 2023, 86, 104216.
18. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
19. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
20. Cho, K.; van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
21. Elman, J.L. Finding structure in time. Cogn. Sci. 1990, 14, 179–211.
22. Gelbart, M.; Snoek, J.; Adams, R.P. Bayesian Optimization with Unknown Constraints. arXiv 2014, arXiv:1403.5607.
23. Federal Reserve Bank of St. Louis, FRED Economic Data. Available online: https://fred.stlouisfed.org/ (accessed on 21 September 2023).
24. Lahmiri, S. Wavelet Entropy for Efficiency Assessment of Price, Return, and Volatility of Brent and WTI During Extreme Events. Commodities 2025, 4, 4.
25. Lahmiri, S. Price disorder and information content in energy and gold markets: The effect of the COVID-19 pandemic. Energy Nexus 2024, 16, 100343.
26. Lahmiri, S. Causality Between Brent and West Texas Intermediate: The Effects of COVID-19 Pandemic and Russia–Ukraine War. Commodities 2025, 4, 2.
27. Lahmiri, S. The nexus between fossil energy markets and the effect of the COVID-19 pandemic on clustering structures. Energy Nexus 2024, 16, 100344.
28. Tang, Y.; Gao, Z.; Li, Y.; Cai, Z.; Yu, J.; Qin, P. Crude Oil and Hot-Rolled Coil Futures Price Prediction Based on Multi-Dimensional Fusion Feature Enhancement. Algorithms 2025, 18, 357.
29. Ma, N.; Wang, Z.; Ba, Z.; Li, X.; Yang, N.; Yang, X.; Zhang, H. Hierarchical Reinforcement Learning for Crude Oil Supply Chain Scheduling. Algorithms 2023, 16, 354.
30. An, S. Dynamic Multiscale Information Spillover among Crude Oil Time Series. Entropy 2022, 24, 1248.
31. Fnu, N.; Bansal, A. Understanding the architecture of vision transformer and its variants: A review. In Proceedings of the 1st International Conference on Innovative Engineering Sciences and Technological Research (ICIESTR) 2024, Muscat, Oman, 14–15 May 2024; pp. 1–6.
Figure 1. Flowchart of the methodology.
Figure 2. Plot of WTI prices.
Figure 3. Plot of Brent prices.
Table 1. Accuracy of neural networks for WTI based on daily forecasting.

Evaluation Metric | CNN (without BO) | CNN (with BO) | RNN (without BO) | RNN (with BO) | LSTM (without BO) | LSTM (with BO) | GRU (without BO) | GRU (with BO)
RMSE | 0.08518 | 0.08072 | 0.07892 | 0.07338 | 0.18784 | 0.0777 | 0.07671 | 0.07462
MAD | 0.04424 | 0.04639 | 0.04338 | 0.04029 | 0.13301 | 0.04461 | 0.04355 | 0.0424
MAPE | 29.06% | 32.20% | 27.75% | 25.75% | 50.97% | 32.14% | 32.63% | 28.61%
Table 2. Accuracy of neural networks for WTI based on weekly forecasting.

Evaluation Metric | CNN (without BO) | CNN (with BO) | RNN (without BO) | RNN (with BO) | LSTM (without BO) | LSTM (with BO) | GRU (without BO) | GRU (with BO)
RMSE | 0.16538 | 0.12163 | 0.11364 | 0.06764 | 0.14179 | 0.13478 | 0.10912 | 0.09902
MAD | 0.11613 | 0.08267 | 0.07858 | 0.07191 | 0.10199 | 0.099214 | 0.07583 | 0.07304
MAPE | 53.37% | 38.98% | 52.14% | 27.85% | 49.55% | 34.85% | 50.79% | 38.28%
Table 3. Accuracy of neural networks for WTI based on monthly forecasting.

Evaluation Metric | CNN (without BO) | CNN (with BO) | RNN (without BO) | RNN (with BO) | LSTM (without BO) | LSTM (with BO) | GRU (without BO) | GRU (with BO)
RMSE | 0.33162 | 0.11325 | 0.13872 | 0.07616 | 0.31002 | 0.25095 | 0.24242 | 0.23847
MAD | 0.23227 | 0.16383 | 0.16821 | 0.06772 | 0.22913 | 0.17825 | 0.17573 | 0.18848
MAPE | 56.16% | 46.00% | 46.06% | 29.82% | 50.36% | 37.36% | 50.07% | 35.73%
Table 4. Accuracy of neural networks for Brent based on daily forecasting.

Evaluation Metric | CNN (without BO) | CNN (with BO) | RNN (without BO) | RNN (with BO) | LSTM (without BO) | LSTM (with BO) | GRU (without BO) | GRU (with BO)
RMSE | 0.0668 | 0.05688 | 0.05792 | 0.05396 | 0.06837 | 0.05457 | 0.05412 | 0.05304
MAD | 0.04412 | 0.03834 | 0.03864 | 0.03651 | 0.04792 | 0.04355 | 0.037 | 0.03585
MAPE | 43.30% | 40.22% | 38.40% | 37.98% | 43.05% | 34.34% | 37.05% | 30.48%
Table 5. Accuracy of neural networks for Brent based on weekly forecasting.

Evaluation Metric | CNN (without BO) | CNN (with BO) | RNN (without BO) | RNN (with BO) | LSTM (without BO) | LSTM (with BO) | GRU (without BO) | GRU (with BO)
RMSE | 0.16253 | 0.10379 | 0.23467 | 0.10575 | 0.13512 | 0.10641 | 0.10667 | 0.07888
MAD | 0.11734 | 0.07527 | 0.17599 | 0.07544 | 0.09656 | 0.07573 | 0.07602 | 0.0702
MAPE | 54.01% | 40.80% | 55.00% | 33.76% | 55.57% | 36.17% | 44.66% | 25.95%
Table 6. Accuracy of neural networks for Brent based on monthly forecasting.

Evaluation Metric | CNN (without BO) | CNN (with BO) | RNN (without BO) | RNN (with BO) | LSTM (without BO) | LSTM (with BO) | GRU (without BO) | GRU (with BO)
RMSE | 0.13639 | 0.12742 | 0.05464 | 0.03467 | 0.16197 | 0.06279 | 0.03739 | 0.0315
MAD | 0.2502 | 0.16888 | 0.03771 | 0.07599 | 0.14242 | 0.09332 | 0.07497 | 0.06136
MAPE | 66.33% | 45.04% | 43.21% | 38.00% | 45.75% | 38.66% | 44.06% | 31.03%
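The three accuracy measures reported in Tables 1–6 (RMSE, MAD, and MAPE) can be computed with NumPy along the following lines. This is an illustrative sketch, not the authors' code, and the sample series below are made up for the usage example rather than drawn from the study's data.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean of squared errors."""
    e = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(e ** 2)))

def mad(y_true, y_pred):
    """Mean absolute deviation of the forecast errors."""
    e = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(e)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    e = y_true - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(e / y_true)) * 100.0)

# Illustrative actual vs. forecast prices (hypothetical values)
actual = [70.0, 72.5, 71.0]
forecast = [69.5, 73.0, 70.0]
print(rmse(actual, forecast), mad(actual, forecast), mape(actual, forecast))
```

Lower values indicate better forecasts under all three metrics, which is how the with-BO and without-BO columns above are compared.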
Table 7. Summary of descriptive statistics for Brent and WTI prices.

Statistic | Brent | WTI
Average | 73.01 | 68.19
Standard deviation | 19.26 | 18.54
Kurtosis | 0.87 | 1.02
Skewness | −0.14 | −0.14
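Statistics of the kind summarized in Table 7 can be reproduced from a price series as sketched below. The estimator conventions are assumptions (sample standard deviation with the ddof = 1 correction, moment-based skewness, and excess kurtosis, i.e. kurtosis minus 3), since the table does not state which definitions were used.

```python
import numpy as np

def descriptive_stats(prices):
    """Average, standard deviation, skewness, and excess kurtosis
    of a price series (moment-based estimators assumed)."""
    x = np.asarray(prices, dtype=float)
    mu = x.mean()
    z = (x - mu) / x.std()  # standardize with the population std
    return {
        "average": float(mu),
        "std_dev": float(x.std(ddof=1)),          # sample standard deviation
        "skewness": float(np.mean(z ** 3)),        # third standardized moment
        "kurtosis": float(np.mean(z ** 4) - 3.0),  # excess kurtosis
    }

# Illustrative call on a hypothetical series
print(descriptive_stats([68.0, 71.5, 74.0, 70.2, 73.8]))
```

Near-zero skewness and excess kurtosis, as in Table 7, indicate a distribution close to symmetric with tails not far from Gaussian.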
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Kachwaha, S.; Lahmiri, S. Improving Deep Learning Models by Bayesian Optimization to Predict Crude Oil Prices. Algorithms 2025, 18, 762. https://doi.org/10.3390/a18120762
