1. Introduction
The global energy system is undergoing a profound structural transformation driven by decarbonization policies, technological innovation, and shifting demand patterns. This ongoing energy transition is progressively reshaping the role of fossil fuels, including crude oil, within the global economy. Despite this transition, crude oil remains a strategically critical commodity, influencing inflation, production costs, trade balances, financial stability, and geopolitical dynamics [1,2,3]. However, the determinants of oil price dynamics are no longer confined to traditional supply–demand fundamentals. Instead, they increasingly reflect the interaction between fossil fuel markets and transition-related forces such as climate policy, electrification, and technological change [4,5,6].
Within this evolving landscape, oil markets are increasingly embedded in a broader environment that shapes expectations. The acceleration of electric vehicle (EV) adoption, the expansion of carbon pricing mechanisms, and the rapid development of artificial intelligence (AI) are altering both current market behavior and expectations about future energy demand. For instance, the rapid growth of EV markets signals long-term substitution away from oil in transportation, while carbon markets embed policy-driven constraints on carbon-intensive activities. At the same time, AI-related technological expansion influences energy consumption patterns and investment allocation, particularly through increased electricity demand and industrial transformation [7,8]. These developments imply that oil price formation must be analyzed within the broader context of energy transition dynamics rather than as an isolated commodity market phenomenon.
From a theoretical standpoint, traditional oil price models emphasize demand shocks, supply disruptions, precautionary demand, and inventory behavior [1,9]. While these mechanisms remain relevant, they are increasingly complemented, and sometimes dominated, by transition-driven channels. Carbon allowance markets (ETS), for example, affect oil prices by altering the relative cost of fossil fuel use, transmitting climate policy expectations, and influencing fuel substitution decisions [10]. Similarly, geopolitical risk continues to play a central role, but its interaction with transition dynamics can amplify uncertainty, particularly in a fragmented global energy system [11]. In this context, oil markets exhibit heightened sensitivity not only to physical shocks but also to policy signals, financial flows, and technological expectations.
These structural changes have important implications for forecasting. Crude oil price series are typically non-stationary, exhibiting trends, structural breaks, and time-varying variance [12]. To address this issue, empirical studies commonly transform prices into returns, which are generally stationary and more suitable for econometric modeling [13]. However, even when stationarity is achieved in the mean, crude oil returns still exhibit important stylized facts, including nonlinearity, volatility clustering, and heavy-tailed distributions [14]. These features imply that standard linear models may be insufficient to capture the full dynamics of oil markets, particularly under conditions of heightened uncertainty and structural change [15]. Traditional forecasting approaches, which often focus on conditional mean prediction, likewise fail to capture the full risk profile of oil markets. In the context of an energy transition, understanding volatility dynamics and tail risks is particularly important, as extreme price movements may arise from abrupt policy shifts, technological breakthroughs, or geopolitical disruptions. Consequently, a distributional perspective that jointly models expected returns, volatility, and tail behavior becomes essential for both financial and policy applications.
Against this background, this study examines whether key energy transition-related and uncertainty-driven variables, namely AI activity, carbon allowance returns (ETS), geopolitical risk (GPR), and EV market returns (SPKS), contain predictive information for crude oil return dynamics. Specifically, the study aims to (i) assess the extent to which these factors influence the conditional mean and volatility of oil returns; (ii) distinguish between directional predictability and risk-based transmission channels; and (iii) develop a forecasting framework capable of capturing the distributional properties of oil returns in a transition-driven environment.
This paper contributes to the literature in several ways. First, it situates crude oil forecasting explicitly within the energy transition paradigm by integrating technological, environmental, and geopolitical drivers into a unified framework. Second, it advances the understanding of heterogeneous transmission mechanisms by distinguishing between mean and volatility/tail-risk predictability. Third, it adopts a heavy-tailed distributional LSTM approach that jointly models return dynamics and risk characteristics, addressing the limitations of traditional point-forecasting methods. Finally, the study provides practical insights into risk management and energy policy by identifying which transition-related variables are most relevant for forecasting oil market behavior under structural change.
The choice of West Texas Intermediate (WTI) crude oil as the benchmark in this study is motivated by both theoretical and practical considerations. WTI is one of the most widely traded and liquid crude oil benchmarks globally, serving as a key reference price in financial markets and academic research. Its high frequency of trading and deep derivatives market make it particularly suitable for modeling return dynamics and volatility behavior. Moreover, WTI reflects market conditions in a major oil-producing region and is extensively used in forecasting studies, allowing comparability with the existing literature. From an econometric perspective, the availability of high-quality, continuous data and its responsiveness to both global and regional shocks make WTI an appropriate proxy for analyzing the interaction between oil markets and energy transition drivers. Therefore, its selection ensures both empirical robustness and consistency with prior research.
The remainder of the paper is organized as follows.
Section 2 reviews the literature.
Section 3 presents the methodology, including data construction, variable definitions, preliminary predictive-content testing, and the proposed heavy-tailed distributional LSTM framework.
Section 4 describes the experimental design, training configuration, benchmarks, and evaluation metrics.
Section 5 reports and discusses the empirical results for mean forecasting, magnitude/volatility forecasting, and heavy-tail dynamics.
Section 6 concludes with implications for energy economics and finance, limitations, and directions for future research.
3. Data Description and Preliminary Analysis
Before model estimation, preliminary statistical analysis is conducted to understand the properties of the data. To assess stationarity, Augmented Dickey–Fuller (ADF) tests are performed. The results confirm that all return series are stationary at conventional significance levels, justifying their use in the predictive framework. Additionally, correlation analysis is conducted to examine linear dependencies between variables. While correlations remain moderate, nonlinear dependencies may still exist, motivating the use of deep learning models.
Table 2 reports summary statistics, including mean, standard deviation, skewness, and kurtosis for all variables.
The descriptive statistics highlight key features of the data. All return series show small mean values, consistent with financial returns, while standard deviations, especially for WTI and GPR, indicate substantial variability. Skewness reveals asymmetry, with GPR and SPKS positively skewed and WTI slightly negatively skewed. Crucially, all variables exhibit positive excess kurtosis, confirming leptokurtic, heavy-tailed distributions that deviate from normality. The interquartile range (IQR) further underscores dispersion and reveals significant outliers, particularly during periods of market stress. These stylized facts justify the use of robust, nonlinear, and distributional modeling approaches in the subsequent analysis.
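The moments summarized in Table 2 can be computed along the following lines. This is a minimal sketch rather than the procedure used in the study: the function name `describe_returns` and the toy series are hypothetical, population (biased) moment estimators are assumed, and the IQR relies on the default quantile method of Python's `statistics` module.

```python
import statistics

def describe_returns(returns):
    """Mean, standard deviation, skewness, excess kurtosis and IQR.

    Assumption: population (biased) central moments; the estimator
    underlying Table 2 is not stated in the text.
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    q1, _, q3 = statistics.quantiles(returns, n=4)
    return {
        "mean": mean,
        "std": m2 ** 0.5,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # equals 0 for a Gaussian
        "iqr": q3 - q1,
    }

# Hypothetical toy series with two extreme observations (not the paper's data).
toy = [-0.10, -0.01, -0.01, 0.0, 0.0, 0.0, 0.0, 0.01, 0.01, 0.10]
summary = describe_returns(toy)
print(summary)
```

In this toy series the two extreme observations are enough to push excess kurtosis above zero while the symmetric layout keeps skewness at zero, which mirrors the distinction the text draws between asymmetry and tail thickness.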
3.1. Stationarity and Time-Series Properties
A key prerequisite for time-series modeling is the stationarity of the variables. Crude oil price levels are generally non-stationary, which can lead to spurious regression results if used directly [62]. To address this issue, all variables in this study are transformed into daily returns, which are typically stationary [12]. Before formal testing, an exploratory data analysis (EDA) is conducted to examine the statistical and distributional properties of the series.
Figure 1, Figure 2, Figure 3, Figure 4 and Figure 5 present the time-series evolution of WTI, ETS, GPR, AI, and SPKS returns. These figures reveal clear evidence of volatility clustering, with periods of high and low variability, as well as the presence of extreme observations, suggesting non-normal and heavy-tailed behavior.
To formally verify stationarity, Augmented Dickey–Fuller (ADF) tests are conducted, and the results are reported in Table 3 for each return series [63]. The results confirm that the null hypothesis of a unit root is rejected for all variables at the 5% significance level, indicating that all return series are stationary and suitable for the predictive modeling framework. Despite stationarity in the mean, the descriptive statistics (Table 2) show that return distributions exhibit non-zero skewness and high kurtosis, and the series still display time-varying volatility, reinforcing the presence of asymmetry and heavy tails. These characteristics justify the use of advanced nonlinear and distributional models capable of capturing higher-order moments [63].
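The logic of transforming prices into returns and then testing for a unit root can be illustrated with a small simulation. The sketch below rests on several assumptions that are not taken from the paper: log returns are used (the text does not state whether simple or log returns are meant), the data are a simulated random walk, and the test is the zero-lag Dickey–Fuller regression with a constant, which is the simplest special case of the ADF test. The names `log_returns` and `dickey_fuller_t` are hypothetical. In applied work a library routine such as `adfuller` from statsmodels, which also selects the lag length, would normally be used instead.

```python
import math
import random

def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1}) (assumed return definition)."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def dickey_fuller_t(y):
    """t-statistic on rho in  dy_t = alpha + rho * y_{t-1} + e_t.

    Simplification: no lagged differences, i.e. the zero-lag special
    case of the ADF regression.
    """
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(x)
    mx, mdy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (di - mdy) for xi, di in zip(x, dy))
    rho = sxy / sxx
    alpha = mdy - rho * mx
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho

# Asymptotic 5% Dickey-Fuller critical value (constant, no trend).
CRIT_5PCT = -2.86

# Simulated random-walk log prices (hypothetical data, not the paper's series).
random.seed(0)
log_price, log_prices = 4.0, []
for _ in range(1000):
    log_price += random.gauss(0.0, 0.02)
    log_prices.append(log_price)
prices = [math.exp(v) for v in log_prices]
returns = log_returns(prices)

t_levels = dickey_fuller_t(log_prices)
t_returns = dickey_fuller_t(returns)
print(f"levels: {t_levels:.2f}  returns: {t_returns:.2f}  5% critical value: {CRIT_5PCT}")
```

A statistic below the critical value rejects the unit-root null. For the simulated returns the statistic is strongly negative, whereas the statistic for the price levels is much closer to zero, which is the pattern that motivates modeling returns rather than prices.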
3.2. Variable Description
The dataset consists of daily return series covering the period from 18 September 2018 to 9 January 2026. All variables are expressed as daily returns. Descriptive time-series plots of the variables are presented in Figure 1, Figure 2, Figure 3, Figure 4 and Figure 5 to illustrate their dynamic behavior. The variables are defined as follows:
Date: Trading day identifier.
WTI: Daily return of West Texas Intermediate crude oil prices. This variable forms the basis for constructing the forward oil shock indicator and represents the target variable of the forecasting framework.
ETS: Daily return of carbon emission allowance prices under the Emission Trading System (ETS), capturing carbon market dynamics and climate policy intensity that may influence fossil fuel demand expectations.
GPR: Daily return of the Geopolitical Risk index, reflecting changes in geopolitical uncertainty that may affect oil supply expectations.
AI: Daily return of a market-based index tracking publicly listed firms associated with artificial intelligence technologies (e.g., an AI-focused equity index or ETF such as the Global X Artificial Intelligence and Technology ETF). This proxy reflects financial market expectations regarding AI-related innovation and adoption. The use of a market-based index ensures high-frequency availability and consistency with other financial variables in the model. This measure is preferred over alternative proxies such as patent counts or search query indices because it provides a forward-looking, market-based assessment of AI activity that is directly comparable with financial return series.
SPKS: Daily return of the S&P Kensho Electric Vehicles Index, capturing developments in EV markets and electrification trends that may affect long-term oil demand expectations.
The selection of explanatory variables is grounded in the economic mechanisms underlying the energy transition. Carbon allowance returns (ETS) capture climate policy signals and the evolving cost of carbon-intensive production, which can influence oil demand expectations and fuel substitution decisions. Geopolitical risk (GPR) reflects uncertainty related to supply disruptions, conflicts, and global instability, which are well-established drivers of oil price volatility. The inclusion of artificial intelligence (AI) activity is motivated by its role in shaping long-term energy demand through technological innovation, increased electricity consumption, and structural shifts in industrial production. Similarly, the EV market index (SPKS) captures electrification trends and the gradual substitution of oil in transportation. Together, these variables are intended to represent distinct but complementary channels, policy, geopolitical uncertainty, technological change, and energy transition dynamics, through which modern oil markets are influenced. While each variable has theoretical relevance, their combined use reflects the broader objective of capturing forward-looking and cross-market information embedded in financial and economic systems.
It is important to note that AI activity is not interpreted as a direct driver of short-term oil demand, but rather as a forward-looking proxy for technological transformation. Rapid advances in artificial intelligence are associated with structural changes in production processes, energy consumption patterns, and electricity demand, particularly through data centers and digital infrastructure. At the same time, AI-driven efficiency gains may reduce energy intensity in some sectors. As a result, AI activity can influence oil markets indirectly through expectations about future energy demand, technological substitution, and economic restructuring. This forward-looking nature makes it a relevant variable in a predictive framework focused on uncertainty and transition dynamics.
The series exhibits volatility clustering and extreme fluctuations, indicating non-normal behavior and potential heavy-tailed distribution.
5. Results
The reported results are based on models that have been validated using both in-sample validation procedures and out-of-sample evaluation to ensure robustness and to mitigate overfitting concerns. Before discussing the results, it is important to emphasize that the forecasting relationships identified in this study should not be interpreted as evidence of structural causation. Instead, they reflect statistical associations that improve forecasting performance within the sample. This distinction is particularly relevant when using Granger causality tests, which capture temporal predictability rather than true economic causality. In addition to regression-based evaluation metrics, classification-based measures are employed to assess directional predictability.
5.1. Mean Forecasting Performance
The performance of the proposed model is evaluated relative to traditional econometric benchmarks, including ARIMA, GARCH, and VAR models, in addition to the historical mean. Table 7 presents a comparison of the proposed model with standard econometric benchmarks. In terms of mean prediction, the LSTM model achieves an RMSE of 0.0201 and an MAE of 0.0154 (see Figure 6).
Directional accuracy is 52.75%, meaning the model predicts next-day return signs only slightly better than random. This modest gain suggests weak directional signals, largely linked to carbon market dynamics, while the absence of broad mean-causality underscores structural limits in crude oil predictability. The out-of-sample R² relative to the historical mean benchmark is −0.159, indicating no improvement over a constant-mean forecast. From a financial economics perspective, this reflects both limited signal in mean returns and a mismatch between model complexity and the data-generating process. Overall, the results highlight the difficulty of forecasting daily crude oil returns and show that predictive gains remain modest when evaluated by mean prediction.
This result also requires a more nuanced interpretation. A negative out-of-sample R² indicates that the model’s mean forecasts are inferior to a simple historical mean benchmark, implying that the added model complexity does not translate into improved mean-squared accuracy. This outcome is not merely indicative of weak predictability but suggests that the nonlinear LSTM architecture may capture noise rather than stable directional signals when applied to mean prediction. One possible explanation lies in the well-known low signal-to-noise ratio of daily crude oil returns. Directional movements are often dominated by unpredictable shocks, including geopolitical events, policy announcements, and market sentiment shifts. In such environments, complex nonlinear models may overfit transient patterns in the training data, even when regularization techniques such as dropout and early stopping are applied. As a result, the model may exhibit inferior generalization performance relative to a simple benchmark that imposes strong bias but low variance.
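To make the two mean-forecast diagnostics concrete, the sketch below computes an out-of-sample R² against a historical-mean benchmark and the directional accuracy of a set of forecasts. It assumes the common definition R² = 1 − SSE(model) / SSE(benchmark), with the benchmark fixed at the training-sample mean; the paper may instead use an expanding-window mean. All numbers are hypothetical toy values, and the function names are illustrative.

```python
def out_of_sample_r2(actual, forecast, benchmark_mean):
    """1 - SSE(model) / SSE(historical-mean benchmark).

    Negative values mean the model has a larger squared error than the
    constant benchmark forecast.
    """
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - benchmark_mean) ** 2 for a in actual)
    return 1.0 - sse_model / sse_bench

def directional_accuracy(actual, forecast):
    """Share of periods in which forecast and realized return agree in sign."""
    hits = sum((a > 0) == (f > 0) for a, f in zip(actual, forecast))
    return hits / len(actual)

# Hypothetical toy data (not the paper's series).
train_returns = [0.01, -0.02, 0.015, -0.005]
test_returns = [0.02, -0.01, 0.005, -0.015]
noisy_forecast = [0.03, 0.01, -0.02, -0.01]

hist_mean = sum(train_returns) / len(train_returns)
r2 = out_of_sample_r2(test_returns, noisy_forecast, hist_mean)
da = directional_accuracy(test_returns, noisy_forecast)
print(f"out-of-sample R2: {r2:.3f}, directional accuracy: {da:.2%}")
```

The toy forecast has a negative R² because its squared errors exceed those of the constant benchmark. This is the same mechanism behind the reported value of −0.159: a forecast that varies more than the signal warrants is penalized relative to a low-variance constant.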
In contrast, the superior performance of the LSTM model in forecasting return magnitude can be explained by the different statistical nature of volatility dynamics. Unlike mean returns, volatility exhibits persistence, clustering, and nonlinear dependence structures that are more amenable to learning by deep neural networks. The LSTM architecture is particularly well-suited to capturing these temporal dependencies and state-dependent effects, especially when combined with a distributional framework that models higher-order moments. Therefore, the divergence in performance between mean and magnitude forecasting reflects a fundamental distinction between directional predictability and risk predictability in crude oil markets. While the former remains limited and prone to noise, the latter benefits from more stable underlying structures that can be effectively captured by nonlinear models.
The results in Table 7 provide a comparative evaluation of the proposed distributional LSTM model against standard econometric benchmarks. Consistent with the literature, traditional models such as ARIMA and VAR perform slightly better in terms of mean-squared error, reflecting the well-known difficulty of improving upon naive benchmarks in short-horizon oil return prediction. The historical mean model also remains highly competitive, reinforcing the limited predictability of daily returns.
However, the proposed LSTM model outperforms all benchmarks in terms of directional accuracy, indicating its ability to capture weak nonlinear patterns in return direction. More importantly, the LSTM achieves the best performance in forecasting return magnitude, as evidenced by the lowest RMSE and MAE for absolute returns. This confirms that the primary advantage of the model lies in capturing volatility and risk dynamics rather than improving mean forecasts.
These findings support the broader conclusion that, in the context of energy transition and heightened uncertainty, predictive gains are concentrated in the risk dimension rather than in average return predictability. To further evaluate the directional predictive performance of the model, the continuous return forecasts are transformed into a binary classification problem. Specifically, predicted returns are converted into directional signals, where positive returns are classified as “upward movement” and negative returns as “downward movement” (see Table 8). This transformation allows the use of classification-based evaluation metrics such as the confusion matrix and receiver operating characteristic (ROC) curve.
The confusion matrix indicates that the model achieves a balanced classification performance, with comparable numbers of correctly predicted upward and downward movements. While the predictive power remains modest, consistent with the low signal-to-noise ratio of daily oil returns, the results confirm that the model captures weak directional information beyond random chance.
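The conversion of return forecasts into directional classes can be sketched as follows. This is an illustration under stated assumptions: returns that are exactly zero are assigned to the "down" class (the paper does not say how ties are handled), the data are hypothetical, and the helper names `direction` and `confusion_matrix` are not from the paper.

```python
from collections import Counter

CLASSES = ("up", "down")

def direction(r):
    """Map a return to a directional class; zero is treated as 'down' (assumption)."""
    return "up" if r > 0 else "down"

def confusion_matrix(actual, forecast):
    """Counts keyed by (actual class, predicted class)."""
    counts = Counter((direction(a), direction(f)) for a, f in zip(actual, forecast))
    return {(a, p): counts.get((a, p), 0) for a in CLASSES for p in CLASSES}

# Hypothetical realized and predicted returns (not the paper's data).
actual_returns = [0.02, -0.01, 0.005, -0.015]
predicted_returns = [0.03, 0.01, -0.02, -0.01]

cm = confusion_matrix(actual_returns, predicted_returns)
accuracy = (cm[("up", "up")] + cm[("down", "down")]) / len(actual_returns)
print(cm, accuracy)
```

Directional accuracy is the sum of the diagonal entries divided by the number of observations, so the confusion matrix adds information only about how the errors are split between missed upward and missed downward moves.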
To complement regression-based evaluation, the model’s directional predictability is assessed using the receiver operating characteristic (ROC) curve. Given the low signal-to-noise ratio in crude oil forecasting, ROC analysis provides a threshold-independent measure of the model’s ability to distinguish upward from downward movements. This framework is particularly relevant here, as directional accuracy is modest and traditional error metrics may not fully capture weak predictive signals embedded in nonlinear dynamics (see Figure 6).
The ROC curve shows only limited improvement over a random classifier, with an AUC of about 0.54. This indicates that while the LSTM captures some nonlinear patterns, its discriminatory power remains weak. Economically, this aligns with the difficulty of predicting short-term crude oil returns, which are driven by shocks and fast-changing information. Importantly, the modest AUC reinforces that predictive gains lie more in volatility and tail-risk dynamics than in directional forecasting. Thus, the ROC analysis serves as a robustness check, confirming that directional signals exist but remain small and unreliable (see Table 9).
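An AUC figure can be read as the probability that a randomly chosen upward day receives a higher model score than a randomly chosen downward day, with ties counted as one half. The sketch below computes the AUC directly from that pairwise definition. The choice of score is an assumption (for example, the predicted return or a predicted probability of a positive return), and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for an upward move, 0 for a downward move.
    scores: any model output where larger means 'more likely up'.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and scores (not the paper's data).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

Under this reading, an AUC of about 0.54 means the model ranks an upward day above a downward day only slightly more often than a coin flip would. The quadratic loop is kept for clarity; a rank-based implementation would be preferable for long samples.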
These classification metrics further confirm that the model achieves only modest improvements in directional prediction, with performance slightly above random chance (see Figure 7). This result is consistent with the broader literature and highlights that the primary predictive gains of the model lie in volatility and risk estimation rather than directional forecasting.
5.2. Magnitude and Risk Forecasting Performance
While mean predictability remains limited, the model performs better at capturing return magnitude. The RMSE and MAE for absolute returns are 0.0126 and 0.0096, respectively, indicating improved accuracy in modeling the size of daily oil price movements (see Figure 8). This finding is economically meaningful, as volatility dynamics in energy markets are typically more predictable than directional returns.
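The magnitude metrics are the usual RMSE and MAE applied to absolute returns instead of signed returns. The sketch below shows the calculation on hypothetical numbers. How the magnitude forecast is derived from the distributional model is not described in this section, so it is simply taken as given here.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical signed returns and magnitude forecasts (not the paper's data).
signed_returns = [0.01, -0.02, 0.03]
magnitude_forecast = [0.02, 0.02, 0.02]

# The evaluation target is |r_t|, so the sign of the return plays no role.
abs_returns = [abs(r) for r in signed_returns]
magnitude_rmse = rmse(abs_returns, magnitude_forecast)
magnitude_mae = mae(abs_returns, magnitude_forecast)
print(magnitude_rmse, magnitude_mae)
```

Because the target discards the sign, a model can score well on these metrics while having no directional skill, which is consistent with the contrast between Sections 5.1 and 5.2.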
The earlier causality analysis showed that, unlike ETS, AI and SPKS do not exhibit robust mean-causal effects. Instead, their informational content appears to operate primarily through dispersion channels. Developments in artificial intelligence investment cycles and electrification trends may alter expectations about energy demand without generating stable short-run directional pressure on oil prices. As a result, these variables are more likely to influence uncertainty and return variability rather than the sign of returns. This relationship should be interpreted as reflecting the informational content of AI-related market signals rather than a direct causal effect on oil prices.
From an economic perspective, these findings can be interpreted through the lens of expectation formation and uncertainty transmission in energy markets. Variables such as AI activity and EV market performance do not directly determine current oil demand but instead influence expectations about future energy consumption and technological substitution. As a result, their impact is more likely to manifest through changes in uncertainty and dispersion rather than immediate price direction. This is consistent with theoretical frameworks in energy economics where forward-looking expectations, rather than contemporaneous fundamentals alone, play a key role in price formation. In this context, the observed forecasting relationships reflect how market participants incorporate information about structural energy transition dynamics into risk assessments, rather than into short-term directional pricing.
5.3. Heavy-Tail Dynamics
The estimated degrees-of-freedom parameter of the Student-t distribution is close to 4, indicating pronounced heavy-tailed behavior in conditional WTI returns. A value in this range implies substantial excess kurtosis relative to the Gaussian benchmark and confirms that extreme price movements occur with higher probability than under normality.
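The link between the degrees-of-freedom parameter and tail thickness can be stated explicitly. Writing ν for the degrees of freedom of a standard Student-t variable z (the symbols ν and z are notation introduced here for illustration), the standard moment results are:

```latex
\operatorname{Var}(z) = \frac{\nu}{\nu - 2} \quad (\nu > 2),
\qquad
\text{excess kurtosis}(z) = \frac{6}{\nu - 4} \quad (\nu > 4).
```

For 2 < ν ≤ 4 the fourth moment is infinite. As a worked example, ν = 6 gives an excess kurtosis of 3 and ν = 5 gives 6, so the excess kurtosis grows without bound as ν approaches 4 from above, compared with a value of 0 for the Gaussian distribution.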
Economically, this finding suggests that tail risk is a structural feature of daily crude oil markets rather than an episodic anomaly. The presence of persistent heavy tails is consistent with the impact of energy transition uncertainty. The Student-t specification is therefore not merely a technical refinement but a necessary component for accurately characterizing the distributional risk embedded in oil returns.
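One way to see why the Student-t specification matters is to compare the negative log-likelihood that a Gaussian and a Student-t density assign to a large move. The sketch below is an illustration, not the loss function of the study, which is not shown in this section. It assumes a location-scale Student-t with location `mu`, scale `sigma` (a scale parameter, not the standard deviation), and degrees of freedom `nu`; all names and numbers are hypothetical.

```python
import math

def student_t_nll(r, mu, sigma, nu):
    """Negative log-density of a location-scale Student-t at r."""
    z = (r - mu) / sigma
    log_pdf = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )
    return -log_pdf

def gaussian_nll(r, mu, sigma):
    """Negative log-density of a Gaussian at r."""
    z = (r - mu) / sigma
    return 0.5 * math.log(2 * math.pi) + math.log(sigma) + 0.5 * z * z

# Hypothetical five-scale-unit move: a 10% return with scale 0.02.
extreme_return, mu, sigma = 0.10, 0.0, 0.02
nll_t = student_t_nll(extreme_return, mu, sigma, nu=4.0)
nll_gauss = gaussian_nll(extreme_return, mu, sigma)
print(f"Student-t (nu=4): {nll_t:.3f}, Gaussian: {nll_gauss:.3f}")
```

The Student-t density penalizes the extreme observation far less than the Gaussian does. A model trained by minimizing such a loss is therefore not forced to inflate its scale estimate on every day in order to accommodate a few extreme moves.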
Based on the above, the results of this study should be interpreted with caution. While the findings suggest that contemporary crude oil markets may be increasingly influenced by risk and uncertainty channels, the empirical evidence remains limited in terms of mean predictability. Consistent with earlier studies documenting weak daily mean predictability in oil returns [1,2,26], our results indicate that even a nonlinear deep learning framework yields only modest directional improvements. The informational contribution of carbon allowance returns appears to be statistically significant; however, this should be interpreted as evidence of informational content rather than strong economic influence. Similarly, the role of AI activity and electric vehicle markets is primarily reflected in volatility dynamics, but these effects remain relatively weak and context-dependent. Overall, the results suggest that the proposed framework offers incremental improvements in modeling return magnitude and tail risk, rather than substantial gains in predicting average returns. Therefore, the conclusions of this study should be viewed as indicative rather than definitive.
6. Conclusions
This study examined the forecasting of crude oil returns in an environment characterized by energy transition, technological change, and heightened geopolitical uncertainty. Departing from traditional point-forecasting approaches, we proposed a heavy-tailed distributional LSTM framework that jointly models the conditional mean, volatility, and tail behavior of WTI crude oil returns. The model incorporates transition- and risk-related drivers, including AI, ETS, GPR, and EV market returns, to capture cross-market information flows shaping modern oil markets. The empirical results yield several important insights. First, the daily mean predictability of crude oil returns remains structurally weak. Out-of-sample results show that the proposed model does not outperform a historical mean benchmark in terms of squared error, and directional accuracy is only marginally above 50%. This finding is consistent with the broader energy finance literature and reflects the difficulty of extracting stable, short-run directional signals in highly efficient, shock-driven oil markets. Second, the preliminary Granger causality analysis reveals heterogeneous transmission channels across determinants. Carbon allowance returns exhibit statistically significant informational content for the conditional mean of oil returns, indicating that short-run oil price adjustments respond to changes in carbon market expectations and climate-policy signals. In contrast, AI activity and electric vehicles do not robustly predict mean returns but display significant or marginal forecasting power for return volatility, suggesting that technological innovation and electrification primarily influence oil markets through uncertainty and dispersion channels rather than directional price pressure. Third, the distributional LSTM framework delivers meaningful gains in modeling return magnitude and tail risk. 
The model improves the forecasting of absolute returns, highlighting that volatility dynamics are more predictable than mean returns. Moreover, the estimated degrees-of-freedom parameter of the Student-t distribution indicates pronounced heavy-tailed behavior, confirming that extreme oil price movements are a persistent structural feature rather than episodic anomalies. However, it is important to note that the empirical results reveal limited forecasting power for mean returns, which constrains the strength of the conclusions and suggests that the proposed framework should be interpreted primarily as a tool for modeling risk dynamics rather than directional forecasting.
The findings have several potential, but limited, practical implications. For risk managers and financial institutions, the results suggest that oil-market predictability may be more concentrated in the risk dimension than in expected returns. While the proposed distributional LSTM framework shows improved performance in modeling volatility and return magnitude, its advantages in mean forecasting remain modest. As such, its use in practice should be considered as complementary to existing tools rather than a standalone forecasting solution. For energy policymakers and regulators, the observed informational relationships between carbon allowance markets and oil returns may indicate that climate policy signals are reflected in oil markets. However, given the weak overall predictability, these results should be interpreted with caution and not as evidence of strong or stable policy transmission mechanisms. Similarly, while AI activity and EV market developments appear to be associated with volatility dynamics, their informational contribution remains limited. Therefore, although monitoring these indicators may provide useful contextual information, they should not be relied upon as robust predictors of oil price movements. Overall, the practical relevance of the results lies primarily in improving the understanding of risk dynamics rather than providing strong predictive tools for directional price forecasting.
Despite its contributions, this study has several limitations that point to avenues for future research. First, the analysis focuses on daily returns and a single benchmark crude oil market, WTI. Extending the framework to multiple oil benchmarks, intraday frequencies, or longer-horizon forecasts could provide additional insights into horizon-dependent predictability. Second, while the study includes key transition-related variables, future work could incorporate additional climate and policy indicators, such as renewable energy indices, measures of carbon-policy uncertainty, or firm-level emissions exposure, to further enrich the information set. Third, although the distributional LSTM captures heavy tails effectively, it remains a data-driven model. Future research could explore hybrid approaches that combine structural energy-economics models with distributional deep learning, improving both interpretability and forecasting robustness. Fourth, using absolute returns as a proxy for volatility in the preliminary Granger causality analysis is a simplification. Future research could improve upon this by employing realized volatility measures or conditional variance estimates derived from GARCH-type models to provide a more refined characterization of volatility dynamics. Finally, investigating regime-dependent or quantile-based extensions may further clarify how oil markets respond asymmetrically to extreme geopolitical events and rapid technological transitions.