Abstract
In this paper, we propose hybrid models for modelling the daily oil price during the period from 2 January 1986 to 5 April 2021. The models on manifolds that we consider, including the reference ones, employ matrix representations rather than differential operator representations of Lie algebras. First, the performance of the LieNLS model is compared with that of the LieOLS model. Second, both reference models are improved by integrating them with a recurrent neural network model used in deep learning. Third, the forecasting performance of the two proposed hybrid models on the manifold, namely Lie-LSTMOLS and Lie-LSTMNLS, is compared with that of the reference LieOLS and LieNLS models. The in-sample and out-of-sample results show that the proposed methods achieve lower RMSE and MAE values than the LieOLS and LieNLS models and hence can be used more reliably to assess the volatility of time-series data.
1. Introduction
Crude oil is a strategic natural resource since it is a commodity connected with many financial instruments, such as futures, options, and bonds. While most financial instruments have a short-term maturity period, there are cases with long-term pricing maturity for oil. Additionally, the crude oil price exhibits nonlinear behavior.
The nonlinear behavior of the oil price has been discussed and analyzed in many articles. Barone-Adesi et al. [] suggested a semiparametric method to examine the structure of oil prices. Adrangi et al. [] determined the presence of a low-dimensional chaotic structure in oil prices. Lahmiri [], Bildirici and Sonustun [], Komijani et al. [], and He [] are other studies that determine the presence of chaos in oil prices. Bildirici et al. [] suggested a new hybrid modelling technique based on the LSTARGARCH and LSTM models to analyze the volatility of oil prices.
Apart from the works on volatility, the works by [,] are important. Gibson and Schwartz [] also show that “the mean reverting tendency as well as the variability of its changes requires a stochastic representation in order to price oil-linked securities accurately”. In [], a two-factor model for pricing financial and real assets contingent on the price of oil is developed. For valuing futures contracts, the parameters of the model were estimated using data from January 1984 to November 1988, and the model was tested on out-of-sample data from November 1988 to May 1989. The purpose of the current work is to offer an approach to pricing based on the Lie method.
In this paper, we employ the Lie algebra method to solve the stochastic differential equation of the short-term model of the oil price. We posit that the oil price is governed by a stochastic differential equation on a curved state space and develop oil price models on the manifold using matrix representations and differential operators. In the late 19th century, Lie discovered that the special techniques for solving differential equations (DEs) were special cases of a general integration procedure based on the invariance of the DE under a continuous group of symmetries. Today, applications of Lie groups have a deep impact on branches of mathematics, mechanics, and robotics.
In mathematical finance in particular, a few papers have employed the Lie method to gain insight into the structure of the related partial differential equations. The approach of employing general differentiable manifolds in interest rate models appears in [,,]. Gazizov and Ibragimov [] used the Lie method in the context of the Black–Scholes–Merton equation. Lo and Hui [] and Carr et al. [] constructed a concrete example of a short-rate model on the circle S1. Park et al. [] tested the proposition that the nonlinear and random behavior of interest rates is governed by a stochastic differential equation model on a curved state space. They developed short-term interest rate models on S1 and S2 manifolds using matrix representations instead of differential operator representations of Lie groups.
In this paper, we use the daily WTI crude oil spot price for the period from 2 January 1986 to 5 April 2021. The selected period includes important events that affected the oil price, such as multiple economic crises (1981, 2001, and 2008), the US military intervention in Iraq, and COVID-19. These factors lead to nonlinear behavior in the oil price between spot and futures contracts.
Therefore, modeling dynamic processes and solving stochastic differential equations (SDEs) are important. As is widely recognized, the solutions of DEs yield a set of symmetries that corresponds to Lie groups. In this paper, we employ a model on the S2 manifold that uses matrix representations instead of differential operator representations of Lie algebras. As emphasized by [], the drift and noise volatility terms of the stochastic state equations are usually worked out to reflect various observed phenomena. We instead try to keep these terms simple and choose an underlying state space that is curved. Park et al. [] and Goard [] used the ordinary least squares (OLS) method for parameter estimation. We prefer the nonlinear least squares (NLS) method for parameter estimation because of the nonlinear behavior in the specified period.
As our primary contribution, we propose the use of LSTM networks for forecasting in the domain obtained by the Lie method. Specifically, we suggest both the hybrid Lie-LSTMOLS and the hybrid Lie-LSTMNLS models for more reliable forecasting than standard regression methods in this domain. The forecasting performance of the proposed hybrid methods, Lie-LSTMOLS and Lie-LSTMNLS, is compared against that of the standard LieNLS and LieOLS regression methods on the WTI oil price data.
The paper is organized as follows. In Section 2, the orthogonal matrix Lie groups and algebras are given, and then the oil price model is defined on the Lie group SO(3). In Section 3, the data are presented, and some descriptive statistics are given. In Section 4, the results are presented and discussed, and the last section gives the conclusion.
2. Materials and Methods
2.1. Preliminaries on Orthogonal Matrix Lie Groups and Algebras
In this section, the definitions of orthogonal matrix Lie groups, their algebras, and the relations of stochastic dynamics with these groups are given [,].
Geometrically, a Lie group is a differentiable manifold, and its Lie algebra is the tangent space to the manifold at the identity element. Usually, the group is denoted with a capital letter and the algebra with a lowercase letter. Let G and g be a matrix Lie group with dimension n and its algebra, respectively. In this case, the orthogonal matrix groups are denoted as $O(n)$ and defined as follows:
$$O(n) = \{A \in GL(n, \mathbb{R}) : A^{T}A = AA^{T} = I\}.$$
Special orthogonal matrix groups are denoted as $SO(n)$ and defined as follows:
$$SO(n) = \{A \in O(n) : \det A = 1\}.$$
The manifold of the Lie group SO(2) is identified with the unit circle S1, with parametrization $(\cos\theta, \sin\theta)$.
The manifold of the Lie group SO(3) is identified with the unit sphere S2, which is parametrized by two angles. Similarly, the Lie group SO(n) is identified with the $(n-1)$-dimensional manifold $S^{n-1}$.
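The Lie algebras of these groups are denoted by $so(n)$, and the elements of the algebra satisfy the skew-symmetry condition $A^{T} = -A$ for $A \in so(n)$. The relationship between this algebra and the group is expressed by the matrix exponential:
$$e^{A} = \sum_{k=0}^{\infty} \frac{A^{k}}{k!} \in SO(n), \quad A \in so(n).$$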
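The relation between the algebra and the group can be checked numerically. The following is a minimal sketch, assuming a truncated power series for the matrix exponential and plain Python lists for matrices; the helper names (`hat`, `mat_exp`, `det3`) and the sample vector are illustrative choices, not part of the original work. It maps a skew-symmetric matrix to a rotation and measures how far the result is from satisfying the orthogonality and unit-determinant conditions.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(a, terms=30):
    """Truncated power series exp(A) = sum_k A^k / k! (assumes small norm)."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term holds A^k / k! after this update
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def hat(w):
    """Map a vector in R^3 to a skew-symmetric matrix, i.e., an element of so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

# Illustrative algebra element (arbitrary values) and its group element.
omega = hat([0.3, -0.5, 0.2])
R = mat_exp(omega)

# Largest deviation of R^T R from the identity, and the determinant of R.
RtR = mat_mul(transpose(R), R)
orth_error = max(abs(RtR[i][j] - (1.0 if i == j else 0.0))
                 for i in range(3) for j in range(3))
det_R = det3(R)

# The same construction on so(2) gives a planar rotation by the angle theta.
theta = 0.7
R2 = mat_exp([[0.0, -theta], [theta, 0.0]])
```

For production use, a dedicated routine for the matrix exponential would be preferable to a truncated series; the series is used here only to keep the sketch self-contained.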
Proposition 1.
([]). Bilinear state equation
where is a constant, and is the diffusion process, and the quadratic function is are given. Hence the dynamics for are as follows:
where , and is the correlation coefficient between and .
Proposition 2.
([]). Under the conditions given in proposition 1, if and M is symmetric, the dynamics for is given as
2.2. Stochastic Dynamics on Orthogonal Matrix Lie Groups
2.2.1. The Lie Group SO(2)
The Lie group SO(2) is a differentiable manifold that can be identified with the unit circle S1. The oil price is defined as follows: where and is a symmetric positive definite matrix.
Thus the bilinear state equation is given:
Indeed, for
where .
Using Equation (3) for (s, θ) dynamics:
and for s dynamics using the oil price relation:
2.2.2. The Lie Group SO(3)
The Lie group SO(3) is a differentiable manifold, and it can be identified with the unit sphere S2. On this manifold, the bilinear state equation, the oil price, and the dynamics for f are given respectively as follows:
where , and are positive symmetric matrices.
As a result, the terms in Equation (4) can be defined as follows:
Hence, we obtain the oil price and the stochastic dynamic for on as follows:
where
and
Thereby, the ds state dynamic obtained by the matrix representation of the Lie group SO(3) coincides with the state dynamic obtained by the differential representations.
Overall forecasting is formed as
where shows long short-term memory (LSTM). LSTM [] is a recurrent neural network that exploits the dependencies among the samples of a segment of the time series on the SO(3) manifold for accurate prediction. The equations governing the LSTM operation may be stated as
where is a bias vector, W and V are weight matrices, the sigmoid function is denoted as σ(·), and denotes element-wise multiplication.
The LSTM unit takes the state vector and the output vector from time step t − 1, together with the input feature vector at time step t, and yields the state vector and the output vector at time step t. Based on the previous output vector and the current input, the LSTM exploits temporal dependencies as follows: the forget gate determines the part of the previous state that needs to be kept, the new information is formed in normalized form, and its strength is determined by applying the input gate activation to it. The new state is thus formed as in Equation (7); it is then normalized by the tanh function and modulated by the output gate activation (Equation (12)) to yield the bounded output prediction.
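The gate mechanics described above can be made concrete with a single time step of the standard LSTM cell of Hochreiter and Schmidhuber. The sketch below is a minimal illustration, not the implementation used in this work: the gate names, the parameter layout (one triple of input weights W, recurrent weights V, and bias b per gate), and the all-zero test parameters are assumptions introduced here.

```python
import math

def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, V, b, x, h):
    """Return W x + V h + b, with matrices given as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(v * hi for v, hi in zip(v_row, h))
        + b_k
        for w_row, v_row, b_k in zip(W, V, b)
    ]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to its (W, V, b) triple."""
    def gate(name, activation):
        W, V, b = params[name]
        return [activation(z) for z in affine(W, V, b, x_t, h_prev)]

    f_t = gate("forget", sigmoid)        # share of the previous state to keep
    i_t = gate("input", sigmoid)         # strength of the new information
    g_t = gate("candidate", math.tanh)   # new information, bounded by tanh
    o_t = gate("output", sigmoid)        # output gate activation

    # New state: kept part of the old state plus gated new information.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # Output: tanh-normalized state modulated by the output gate.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

def zero_params(n_hidden, n_input):
    """Hypothetical all-zero parameters, used only for a quick sanity check."""
    def block():
        W = [[0.0] * n_input for _ in range(n_hidden)]
        V = [[0.0] * n_hidden for _ in range(n_hidden)]
        b = [0.0] * n_hidden
        return W, V, b
    return {name: block() for name in ("forget", "input", "candidate", "output")}

# One step with a single input feature and two hidden units.
h_t, c_t = lstm_step([0.7], [0.0, 0.0], [1.0, -2.0], zero_params(2, 1))
```

With all-zero parameters every sigmoid gate equals 0.5 and the candidate equals 0, so the state is simply halved; this makes the role of each gate easy to verify by hand.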
3. Data and Some Descriptive Statistics
In this paper, the daily West Texas Intermediate (WTI) crude oil price dataset was acquired from FRED Economic Data. It includes oil price data between 2 January 1986 and 5 April 2021.
The published oil price is the spot price given as
As with [], we have used Monte Carlo simulation to evaluate the above expectation.
First, the descriptive statistics of the WTI oil price data were obtained, and unit root tests were applied. Table 1 shows the statistics. Since the data exhibit excess kurtosis, they cannot be modelled by a normal distribution, as confirmed by the Jarque–Bera (JB) test. The main departure from normality appears to be excess kurtosis rather than skewness.
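For reference, the quantities behind the normality check can be computed as in the following minimal sketch. It uses the textbook definitions of sample skewness, excess kurtosis, and the Jarque–Bera statistic JB = n/6 (S² + K²/4), together with the asymptotic chi-square distribution with two degrees of freedom for the p-value. The function name and the toy input are illustrative and are not the values reported in Table 1.

```python
import math

def jarque_bera(xs):
    """Return (skewness, excess kurtosis, JB statistic, asymptotic p-value)."""
    n = len(xs)
    mean = sum(xs) / n
    # Central moments with the 1/n normalization used by the JB statistic.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return skewness, excess_kurtosis, jb, p_value

# Toy input (not oil price data), chosen so the moments are easy to verify.
skewness, excess_kurtosis, jb, p_value = jarque_bera([1.0, 2.0, 3.0, 4.0, 5.0])
```

A large JB value, and hence a small p-value, leads to rejection of normality; the asymptotic p-value is only a rough guide for very short samples such as the toy input.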

Table 1. Descriptive statistics and unit root tests.
From the unit root test results in Table 1, it can be seen that the H0 hypothesis can be accepted for all variables at the level. The ADF and KSS tests suggest the stationarity of the data at the level.
Next, the results of the nonlinearity tests are presented in Table 2 and Table 3. In Table 2, Teräsvirta’s neural network test, White’s neural network test, the likelihood ratio test for threshold nonlinearity, and Tsay’s test for nonlinearity indicate that the linear form is misspecified. The Teräsvirta and White tests perform similarly to the Tsay test.

Table 2. Nonlinearity test statistics.

Table 3. BDS test statistic.
The BDS test (Brock et al. []) in Table 3 suggests that the (linear) functional form is misspecified for the variables.
4. Models and Results
The Lie parameters in Equation (5) were obtained by using the OLS and NLS methods. Table 4 shows the estimates of the Lie parameters. The estimates of some coefficients obtained with the two methods turned out to be very similar, whereas the estimates of other coefficients were significantly different. The AIC values obtained with both models are similar to each other.

Table 4. Estimations of Lie parameters.
It is interesting that the LieOLS model passes the RESET, BP, and ARCH tests with values very close to the critical value. By contrast, the LieNLS model gave more successful results on these statistical tests than the LieOLS model.
Next, the forecasting performances of the LieOLS and LieNLS models were analyzed. LSTM was used to improve the forecasting performances of these models. In order to apply the LSTM model, the dataset was partitioned into an in-sample training set covering 2 January 1986 to 20 October 2019 and an out-of-sample test set covering 21 October 2019 to 5 April 2021.
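A chronological partition of this kind can be sketched as follows. The function name and the toy observations are hypothetical; only the boundary date of 21 October 2019 is taken from the text. The key point is that the split is by date rather than by random sampling, so no future information leaks into the training set.

```python
from datetime import date

def chronological_split(observations, first_test_day):
    """Split (date, value) pairs into in-sample and out-of-sample parts."""
    train = [(d, v) for d, v in observations if d < first_test_day]
    test = [(d, v) for d, v in observations if d >= first_test_day]
    return train, test

# Placeholder values for illustration only; these are not actual WTI prices.
toy_observations = [
    (date(2019, 10, 17), 1.0),
    (date(2019, 10, 18), 2.0),
    (date(2019, 10, 21), 3.0),
    (date(2019, 10, 22), 4.0),
]
train, test = chronological_split(toy_observations, date(2019, 10, 21))
```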
The configuration of our LSTM network is as follows:
- Input samples consist of sequence segments of 30 timesteps, each having 1 feature (price).
- The input layer is connected to an LSTM unit with 25 hidden neurons and a dropout rate of 0.20.
- The LSTM output feeds a dense output layer with one neuron and a linear activation function.
- Training is performed in batches of 32 samples.
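The input format in the configuration above can be illustrated with a small sketch that cuts a price series into segments of 30 timesteps with one feature each and groups the samples into batches of 32. Using the value that immediately follows each segment as the prediction target is an assumption made here for illustration, as are the function names and the synthetic series.

```python
def make_windows(series, window=30):
    """Build (samples, targets): each sample is `window` steps of 1 feature."""
    samples, targets = [], []
    for start in range(len(series) - window):
        segment = series[start:start + window]
        samples.append([[value] for value in segment])  # shape: window x 1
        targets.append(series[start + window])          # assumed next-step target
    return samples, targets

def batches(items, batch_size=32):
    """Group items into consecutive batches of at most `batch_size`."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# Synthetic series for illustration only; these are not oil prices.
toy_series = [float(t) for t in range(100)]
X, y = make_windows(toy_series)
X_batches = batches(X)
```

A series of 100 points yields 70 samples, which are grouped into two full batches of 32 and one final batch of 6.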
The model giving the lowest RMSE and MAE values is deemed the most successful model.
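The two selection metrics follow their usual definitions: RMSE is the square root of the mean squared forecast error, and MAE is the mean absolute forecast error. A minimal sketch is given below; the function names and the toy numbers are illustrative only.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

# Toy values for illustration only.
toy_actual = [1.0, 2.0, 3.0]
toy_predicted = [1.0, 2.0, 5.0]
toy_rmse = rmse(toy_actual, toy_predicted)
toy_mae = mae(toy_actual, toy_predicted)
```

Because RMSE squares the errors before averaging, it penalizes large individual errors more heavily than MAE, which is why the two metrics can rank models differently.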
4.1. In-Sample Forecast Results
Table 5 presents the results of the LSTM method integrated with the LieOLS model or the LieNLS model. As references for comparison, the results with the LieOLS and LieNLS models (employing traditional regression techniques) are also presented.

Table 5. In-sample forecast results.
It can be observed from Table 5 that the Lie-LSTMNLS model gives more successful results than the Lie-LSTMOLS model. More importantly, both the Lie-LSTMOLS and Lie-LSTMNLS models give more successful results than the reference LieOLS and LieNLS models.
4.2. Out-of-Sample Forecast Results
Table 6 reports the RMSE and MAE values of the LieOLS, LieNLS, Lie-LSTMOLS, and Lie-LSTMNLS models, which were obtained to explore their forecast accuracy at horizons of T+10 and T+20 workdays. The out-of-sample results indicate that Lie-LSTMNLS provides the highest out-of-sample forecast accuracy.

Table 6. The out-of-sample performances of compared methods.
4.3. Test for Forecast Accuracy
The Wilcoxon signed-rank and Diebold–Mariano (DM) tests were applied to test the equivalence of forecast accuracy (null hypothesis H0).
In Table 7 and Table 8, most of the p-values of the calculated Wilcoxon and DM test statistics are 0.00 and are therefore significant at the 1% significance level. The H0 hypothesis of these tests assumes that the compared models have the same level of accuracy. In most cases, since the p-value is <0.05, the H0 hypothesis is rejected. For both tests, the p-value is >0.05 only for the RMSE comparison of the LieOLS and LieNLS models. Hence, these two models are comparable in terms of RMSE performance.
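To make the logic of the Diebold–Mariano comparison concrete, the following minimal sketch computes the statistic for two series of forecast errors. It rests on simplifying assumptions that are not stated in the text: a squared-error loss, one-step-ahead forecasts (so that the long-run variance reduces to the sample variance of the loss differential), and a two-sided p-value from the standard normal approximation. The function name and the toy error series are illustrative.

```python
import math

def diebold_mariano(errors_a, errors_b):
    """DM statistic and two-sided normal p-value under squared-error loss."""
    # Loss differential: positive values mean model A has the larger loss.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    dm = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value: 2 * (1 - Phi(|dm|)) = erfc(|dm| / sqrt(2)).
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))
    return dm, p_value

# Toy forecast errors for illustration only; model B has the smaller errors.
toy_errors_a = [1.0, -1.0, 2.0, -2.0]
toy_errors_b = [0.5, -0.5, 1.0, -1.0]
dm_ab, p_ab = diebold_mariano(toy_errors_a, toy_errors_b)
dm_ba, p_ba = diebold_mariano(toy_errors_b, toy_errors_a)
```

For multi-step forecasts, the variance estimate would additionally need the autocovariances of the loss differential, and a small-sample correction may be appropriate; both are omitted in this sketch.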

Table 7. p-values for the Wilcoxon signed-rank tests.

Table 8. p-values for the Diebold–Mariano tests.
5. Discussion and Conclusions
We have proposed hybrid models for analyzing the short-term model of the oil price during the period from 2 January 1986 to 5 April 2021. Our basic model is built on the Lie group SO(3), which is a differentiable manifold that can be identified with the unit sphere S2. We have additionally integrated this model with LSTM and tested the resulting hybrid models.
The previous works by [,] differ significantly from the current work: [] only discussed the mean-reverting tendency of the oil price, and the model in [] is a two-factor model for pricing financial and real assets contingent on the price of oil. Specifically, it estimates the parameters of joint stochastic processes that model oil-contingent claims and futures contracts based on spot prices and the net convenience yield, and it uses this model to value futures contracts. By contrast, our work develops the short-term model and solves it with a hybrid of the Lie method and an LSTM network. Although Lie algebras have been used in interest rate models in many works in the literature, the current work is the first to use Lie algebras for oil price modelling. The distributions of oil price time series exhibit positive skewness. Although other methods might be tried for analyzing such processes, this problem can be readily addressed with our proposed Lie model. This can be attributed to modelling the oil price on the manifold using matrix representations and differential operators in our suggested LieOLS and LieNLS models. We therefore maintain that modelling with the Lie-LSTM methods yields good forecasting results.
The Lie-LSTMOLS and Lie-LSTMNLS methods facilitate the numerical computations for stochastic differential equations on differentiable manifolds. By employing the SO(3) group structure, oil prices that have positive skewness, high kurtosis, and high JB statistics can be described in a geometric way. Moreover, by jointly using the Lie and LSTM methods, it becomes possible to increase forecasting performance by representing the complex structure of the data.
As stated in Park et al. [] for interest rates, the Lie group models in the current work show that closed-form formulas can only be an exception rather than the rule for oil price prediction, and therefore one should resort to numerical approaches for such prediction. Additionally, while the previous works ([,] for bond pricing) have employed either OLS or NLS methods to estimate the model parameters, the current work investigated which of these methods works best with the Lie method. Each of these combinations, LieOLS and LieNLS, was integrated with the LSTM network to obtain a hybrid model with improved forecasting performance. Specifically, in price forecasting 10 and 20 days into the future, the models incorporating LSTM yielded smaller RMSE and MAE values. According to the Wilcoxon signed-rank and Diebold–Mariano tests, the Lie-LSTMNLS model turned out to be the most successful one in terms of forecasting performance among the four models considered.
In this study, we showed that the analysis of the short-term model of the spot price of oil by using the Lie method is important. The model that we propose can also be used to analyze the relationship between futures and spot prices of many commodities other than oil.
Author Contributions
Conceptualization, M.B., N.G.B. and Y.U.; methodology, M.B., N.G.B. and Y.U.; software, M.B., N.G.B. and Y.U.; validation, M.B., N.G.B. and Y.U.; formal analysis, M.B., N.G.B. and Y.U.; investigation, M.B., N.G.B. and Y.U.; resources, M.B., N.G.B. and Y.U.; data curation, M.B., N.G.B. and Y.U.; writing—original draft preparation, M.B., N.G.B. and Y.U.; writing—review and editing, M.B., N.G.B. and Y.U.; visualization, M.B., N.G.B. and Y.U.; supervision, M.B., N.G.B. and Y.U.; project administration, M.B., N.G.B. and Y.U.; funding acquisition, M.B., N.G.B. and Y.U. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The WTI dataset can be downloaded from https://fred.stlouisfed.org/series/DCOILWTICO, accessed on 15 April 2021.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Barone-Adesi, G.; Bourgoin, F.; Giannopoulos, K. Don’t look back. Risk 1998, 11, 100–103.
- Adrangi, B.; Chatrath, A.; Dhanda, K.K.; Raffiee, K. Chaos in oil prices? Evidence from futures markets. Energy Econ. 2001, 23, 405–425.
- Lahmiri, S. A study on chaos in crude oil markets before and after 2008 international financial crisis. Phys. A Stat. Mech. Its Appl. 2017, 466, 389–395.
- Bildirici, M.E.; Sonustun, F.O. Chaotic Structure of Oil Prices, Inflation and Unemployment. Nonlinear Dyn. Psychol. Life Sci. 2019, 23, 377–394.
- Komijani, A.; Naderi, E.; Gandali Alikhani, N. A hybrid approach for forecasting of oil prices volatility. OPEC Energy Rev. 2014, 38, 323–340.
- He, L.-Y. Chaotic Structures in Brent & WTI Crude Oil Markets: Empirical Evidence. Int. J. Econ. Financ. 2011, 3, 242–249.
- Bildirici, M.; Guler Bayazit, N.; Ucan, Y. Analyzing Crude Oil Prices under the Impact of COVID-19 by Using LSTARGARCHLSTM. Energies 2020, 13, 2980.
- Gibson, R.; Schwartz, E. Valuation of Long Term Oil-Linked Assets; Working Paper; Anderson Graduate School of Management, University of California: Los Angeles, CA, USA, 1989; Volume 6, p. 89.
- Gibson, R.; Schwartz, E.S. Stochastic convenience yield and the pricing of oil contingent claims. J. Financ. 1990, 45, 959–976.
- Nunes, J.; Webber, N.J. Low Dimensional Dynamics and the Stability of HJM Term Structure Models; Working Paper; AIP Publishing: Melville, NY, USA, 1997.
- Gazizov, R.K.; Ibragimov, N.H. Lie symmetry analysis of differential equations in finance. Nonlinear Dyn. 1998, 17, 387–407.
- Ibragimov, N.H.; Soh, C.W. Solution of the Cauchy problem for the Black-Scholes equation using its symmetries. In Proceedings of the Modern Group Analysis, International Conference at the Sophus Lie Conference Center, Nordfjordeid, Norway, 9–13 June 1997.
- Lo, C.F.; Hui, C.H. Valuation of financial derivatives with time-dependent parameters: Lie-algebraic approach. Quant. Financ. 2001, 1, 73–78.
- Carr, P.; Lipton, A.; Madan, D. The Reduction Method for Valuing Derivative Securities; Working Paper; New York University: New York, NY, USA, 2002.
- Park, F.C.; Chun, C.M.; Han, C.W.; Webber, N. Interest rate models on Lie groups. Quant. Financ. 2011, 11, 559–572.
- Goard, J. New solutions to the bond-pricing equation via Lie’s classical method. Math. Comput. Model. 2000, 32, 299–313.
- Klimyk, A.U.; Vilenkin, N.Y. Representations of Lie groups and special functions. In Representation Theory and Noncommutative Harmonic Analysis II; Springer: Berlin/Heidelberg, Germany, 1995; pp. 137–259.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Garcia, C.A. NonlinearTseries: Nonlinear Time Series Analysis; R Foundation for Statistical Computing: Vienna, Austria, 2020.
- Brock, W.; Dechert, W.D.; Scheinkman, J. A Test for Independence Based on the Correlation Dimension; Economy Working Paper SSRI-8702; University of Wisconsin: Madison, WI, USA, 1987.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).