1. Introduction
One of the important topics in modern research is the stochastic modelling of time series with pronounced and persistent fluctuations. To this end, in recent decades, various stochastic models have been proposed, especially in econometrics and financial engineering (see, e.g., [1,2,3,4,5,6,7,8]). A particular problem arises when the observed time series have non-stationary dynamics, which usually increases the complexity of their stochastic structure (for more recent studies, see, e.g., [9,10,11,12,13,14]). To solve these and similar problems, Engle and Smith [15] proposed the stochastic permanent breaking (STOPBREAK) process, which was later examined by many authors, especially in the domain of structural and permanent changes in the fluctuations of some real-world data [16,17,18,19,20]. Stojanović et al. [21] introduced a more general form of the STOPBREAK-based stochastic process, named the Split-BREAK process, which was also applied in modelling various time series with permanent and pronounced fluctuations. It is worth pointing out that a more general form of the Split-BREAK process, the so-called General (or Gaussian) Split-BREAK (GSB) process, was introduced later and discussed in Stojanović et al. [22,23,24] as well as Jovanović et al. [25].
The main characteristics of the GSB process rest on its Gaussian innovations, which somewhat simplifies the examination of the properties of this model. Nevertheless, some time series do not have the Gaussian property. As an alternative approach, other probability distributions can be considered for the innovations. They are usually assumed to be symmetric (as in the case of a zero-mean Gaussian distribution), but with some additional properties that certain non-Gaussian distributions have. For these reasons, a new, general form of the Split-BREAK process with symmetric Laplacian innovations, called the LSB process, is proposed here.
In the next section, the definition and main properties of the LSB process are outlined. Some additional stochastic characteristics of the LSB process, with special emphasis on its asymptotic properties, are discussed in Section 3. As a basic tool, the method of characteristic functions (CFs) is used. Working with CFs often makes probability distributions easier to find than calculating them directly, and is particularly useful in determining cumulative distribution functions (CDFs). As will be seen below, it is quite suitable for investigating the distributions of the time series that constitute the LSB process. Various procedures for parameter estimation of the LSB process, together with the asymptotic properties of the obtained estimators, are given in Section 4. Section 5 describes the Monte Carlo simulations of the obtained estimators, while Section 6 presents the application of the LSB process in the dynamic analysis of prices and trading volumes of crude oil and natural gas in the world market. Lastly, some concluding remarks are highlighted in Section 7.
2. Definition and Structure of the LSB Process
As in previous work on the GSB process [21,22,23,24,25], the basic LSB time series is given by the following additive decomposition:
where
is the finite set of time points. In Equation (1), the series
represents the martingale means and
are the innovations, that is, a series of independent and identically distributed (IID) random variables (RVs). It is usually assumed that the innovations
are given on the probability space
which is equipped with some filtration
Practically interpreted, the filtration
collects the ‘information’ about some financial index at a certain point in time
. Therefore, the RVs
will be
-adapted for each
. In the following, we assume that the innovations
have a centred Laplace distribution, given by the probability density function (PDF):
where
is the scale parameter. The choice of the Laplace distribution is motivated by the fact that, compared to the Gaussian, it can more adequately fit the distributions of many financial time series with pronounced “peaks”. Such a situation is shown in Figure 1, which shows the histograms of the empirical distributions of the log-returns
Here, the prices of crude oil and natural gas on the world market are taken as the basic financial index
, which will be discussed in more detail below. According to the well-known properties of the Laplace distribution (see, e.g., [26], p. 60), the conditional mean and variance of the innovations
are, respectively,
After integrating the function
, where, due to the absolute value, two symmetric cases are distinguished, its CDF is obtained as follows:
On the other hand, for a series of martingale means
it is assumed that they are given recurrently by the relation:
where, almost surely (a.s.),
and
. Moreover, (
) is a series of noise indicators, that is, the RVs that depend on innovation series
as follows:
Here,
is the so-called critical value of the reaction, i.e., the parameter that indicates the significance of previous realizations of innovations
, which would allow their current values to be included in Equation (6). More precisely, in the case
, the value of martingale mean
is equal to its previous value
. Therefore, the value of the basic LSB series (
), given by Equation (1), is obtained with a “small” fluctuation, dependent on (
) only. On the contrary, when
, a pronounced fluctuation of the value
is registered. In this way, the previous realizations of innovations
impact the level of variations of the basic LSB time series (
), that is, the intensity of fluctuations of the LSB process. Using some simple calculations, in the same way as in Jovanović et al. [25], the following properties of the mentioned time series can be easily proven:
Theorem 1. Let and be the series defined by Equations (1) and (5), respectively. Then, both series have the constant and equal mean: as well as the variances: where In addition, the correlation functions of the series and are, respectively,
It is noticeable that the previous theorem gives some additional information about the stochastic properties of the LSB process. Since
is measurable with respect to the field
, this series represents the component that describes the predictability and stability of the LSB process (see
Figure 2 below). On the other hand, the innovation series
is the noise (deviation) component of the main LSB series
, compared to the martingale means
. Furthermore, according to Equation (7), it follows that the variances of the series
and
are non-constant and grow at a rate proportional to the time moment
at which they are observed. Moreover, the correlation functions of
and
depend on both time variables
, thus indicating that these series are non-stationary. On the other hand, when
, it follows:
that is, both correlation functions satisfy the condition of L2-continuity.
Finally, we define another important LSB series, the so-called increments of the LSB process:
This series, according to Equations (1), (5), and (6), can be represented in the following way:
where
. Obviously, the series
represents a bilinear, as well as stationary stochastic process with a random coefficient
, which is usually called the Splitting moving average process (of order 1), that is, a Split-MA(1) process. Note that the term ‘split’, as in the case of the main LSB series
, is justified by the fact that
operates in two modes:
- (a) If the fluctuations of the innovations were pronounced at the previous moment in time, equality (9) reduces to the current innovation alone.
- (b) Fluctuations whose square does not exceed the critical value leave the previous innovation in place, so the increment is given as a linear, integrated MA(1) process, that is, as the difference between the current and the previous innovation.
Obviously, the Split-MA(1) series has a similar structure to MA(1) processes, which can be applied in their examination. Based on earlier assumptions, the basic properties of this series, obtained by some simple computations, can be expressed as follows:
Theorem 2. Let be the time series defined by Equations (8) and (9). Then, the mean and the variance of this series are, respectively, where In addition, the correlation of this series is as follows: and it has the finite moments (of the order):
It is worth noticing that, according to Equations (8) and (9), it follows that:
which represents a nonlinear integrated autoregressive moving average (ARIMA) model with “temporary” components
. As will be seen later, this implies the specific structure of the (stationary) series
, as well as the other (non-stationary) components of the LSB process. Nevertheless, owing to its stationarity, the Split-MA(1) process
plays an important role in examining the stochastic properties of the LSB process, as well as in estimating its parameters.
Figure 2 shows the realizations of all LSB time series obtained by Monte Carlo simulations. Note that non-stationary time series
and
(graphs on the left) can have very different trajectories, in contrast to realizations of the stationary series
and
(graphs on the right).
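To make the structure of such realizations concrete, the following Python sketch generates one path of each LSB series. It is a minimal illustration under stated assumptions, not the code behind Figure 2: the function name `simulate_lsb`, the argument names `scale` (Laplace scale parameter), `c` (critical value) and `m0` (starting level), and the convention of setting the first increment to zero are all introduced here for illustration. The Laplace innovations are drawn as the difference of two IID exponential variables, and the previous innovation is added to the martingale mean only when its square exceeds the critical value.

```python
import random


def simulate_lsb(n, scale, c, m0=0.0, seed=None):
    """Simulate one path of a Split-BREAK process with Laplace innovations.

    Returns four lists of length n: innovations, martingale means,
    the basic series, and its increments (first increment set to 0.0).
    All names and conventions here are illustrative assumptions.
    """
    rng = random.Random(seed)
    rate = 1.0 / scale
    # The difference of two IID exponentials is a centred Laplace variable.
    eps = [rng.expovariate(rate) - rng.expovariate(rate) for _ in range(n)]
    m = [m0]
    for t in range(1, n):
        # The previous innovation enters the mean only if its square exceeds c.
        jump = eps[t - 1] if eps[t - 1] ** 2 > c else 0.0
        m.append(m[-1] + jump)
    # Basic series: martingale mean plus current innovation.
    y = [m[t] + eps[t] for t in range(n)]
    # Increments of the basic series (the stationary component).
    x = [0.0] + [y[t] - y[t - 1] for t in range(1, n)]
    return eps, m, y, x
```

For example, `simulate_lsb(500, scale=1.0, c=1.0, seed=1)` returns the four series for one simulated path. With `c = 0` every innovation is absorbed into the mean, so the basic series behaves like a random walk, whereas for a very large `c` the mean stays at `m0` and the series reduces to IID noise around that level.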
3. Distributional Features of the LSB Process
Some important stochastic properties of the LSB series, regarding their distributions and asymptotic behaviour, are discussed here. To this end, as mentioned earlier, the Split-MA(1) series of increments plays a significant role owing to its stationarity. Using the method of characteristic functions (CFs), the basic stochastic properties of this series can be expressed in the following way.
Theorem 3. Let be the time series defined by Equations (8) and (9). Then, for any and , the CDF of the RVs is given by: where is the CDF of the Laplacian distributed RVs , given by Equation (4), and
Proof. To begin with, let us define the series of RVs
,
Thereby, the RVs (
) and (
) are mutually independent, so it follows:
It can easily be shown that
, i.e., the RVs
are uncorrelated for any
. Using the conditional probabilities, for the CDF of the RVs
one obtains:
where
is the CDF of the RV
. According to this, the CF of the RVs
is obtained as follows:
Here,
and
are, respectively, CFs of the RVs
and
. Substituting these CFs into the previous equality gives:
According to this equality and Equation (9), and using the partial fraction decomposition of rational functions, for the CF of the RVs
one obtains:
Here,
are the CFs of RVs
with chi-square distribution and
degrees of freedom (DF). According to Lévy’s correspondence theorem (see, e.g., [27]), the CDFs which correspond to the CFs
and
are those of the chi-square-based RVs
and
, respectively. By applying the well-known facts about
distribution, these CDFs are, respectively,
Note that the functions and satisfy the equality , where is the function given by Equation (11). Hence, by again applying Lévy’s correspondence theorem to the last expression in (12), Equation (10) immediately follows. □
Remark 1. Let us emphasize that the function is not an ‘ordinary’ CDF, but the sum of two chi-square-based CDFs and , which can also be seen in the left plot in Figure 3. Furthermore, by differentiating Equation (10) when , and after some computations, the probability density function (PDF) of the series can be obtained as follows: Notice that the PDF differs from the PDF of the innovations with respect to the last multiplicative term in Equation (13). Moreover, these PDFs become equal when , and both are symmetric functions, as shown in the right-hand plots of Figure 3. ■
Using a similar procedure, the distributional properties of the non-stationary LSB series and can be shown as follows.
Theorem 4. Let and be the time series defined by Equations (1) and (5), respectively, where . Then, for any and , the CDFs of the series and are, respectively, where is the convolution operator, and , are the CDFs of the RVs , , respectively. Additionally, the following convergences (in distribution) hold when :
Proof. Similarly to the previous theorem, let us introduce RVs
,
It is easy to show that
is a series of mutually uncorrelated RVs, with
where
Applying the conditional probabilities again, for the CDF of
one obtains:
According to this CDF, the appropriate CF of the RVs
can be obtained as follows:
Now, by using Equation (5), for the CFs of the martingale means
one obtains:
where
is CF of the RV
According to Equation (17) and Lévy’s correspondence theorem [27], Equation (14) follows. In a similar way, according to Equation (1), the CFs of the series
can be written as follows:
From here, Equation (15) follows immediately by reapplying Lévy’s correspondence theorem.
Let us now prove the second part of the theorem, that is, Equation (16). First, using previous Equations (17) and (18), for the CFs of RVs
and
, when
one obtains:
From here, taking the limit values, when
and
is a fixed (but arbitrary) value, the following is obtained:
Thus, the obtained limit is obviously the CF of the Gaussian distribution , which confirms the convergences in (16). □
Remark 2. In the proofs of the previous two theorems, the uncorrelated series of RVs and can be interpreted as innovations with non-zero “optional” values. As the relation holds, it is sufficient to take only one of these two series for consideration. Moreover, it is easy to see that these RVs have the following CDFs: Obviously, both of these functions are continuous almost everywhere, with the only point of discontinuity where ‘jumps’ of size and occur, respectively (for more details, see Stojanović et al. [28,29]). For these reasons, the CDFs of the series and represent mixtures of the Laplacian distribution and a discrete distribution concentrated at zero, which we call the Contaminated Laplacian Distribution (CLD). This fact prevents the application of some of the standard procedures in the examination of the properties of the non-stationary series and .
On the other hand, the asymptotic relations in Equation (16) show that, even from the non-stationary time series and , obtained by non-Gaussian innovations, one can generate the series and , which converge to the Gaussian distribution when . These facts are of practical importance when applying the LSB process and can be easily observed through the convergence of the appropriate CFs and . As an illustration of this, convergences of the modulus of both of these CFs, for different time values, are shown in Figure 4. ■
In the following, the asymptotic properties of some other time series, obtained by linear transformations of the non-stationary time series and , are presented. This primarily concerns the possibility of finding their asymptotically normal (AN) distributions, as stated in the following theorem:
Theorem 5. Let us define, for an arbitrary , the -mean time series: where and are the non-stationary time series given by (1) and (5), respectively. Then the following statements are valid:
- (i) When , both series and have an asymptotically normal distribution, i.e., the following relations, when , are valid:
- (ii) When , both series and are asymptotically vanishing, i.e.,
Proof. First, we prove the statement of the theorem in the case of series
. According to the definition of series
, given by Equation (5), the following is obtained:
Therefore, the time series
is represented as the sum of uncorrelated RVs
, when
. Applying the well-known features of the CFs, as well as the CF of the series
, for the CFs of RVs
one obtains:
From here, by taking the logarithm of the function
, the following function is obtained:
where:
After some computation, we find that:
Thus, the functions
have a local maximum at the point
By applying the Laplace approximation of functions
at
(see, e.g., [25,28]), it follows:
Here
denotes an infinitesimally small value of a higher order in relation to
, when
. Taking the asymptotic value in the previous expression, when
, gives:
Substituting this expression in CFs , it is easy to see that the first asymptotic relation in Equation (19) is valid.
The proof for the series
can be conducted in an analogous manner. Using the previously proven facts and Equation (1), we find that:
Since
,
, are mutually independent RVs, the CFs of time series
can be obtained, after some calculations, as follows:
Based on this, and applying the same procedure as in the previous part of the proof, that is, by taking the logarithm of the function
, and by expanding the function
at
, we have:
Taking the asymptotic values, when
, the following is obtained:
Replacing this expression in the CFs , the theorem is completely proven. □
Remark 3. As with the GSB process, the case is of special interest in the previous theorem. The asymptotic relations (19) then give: As in the case of the GSB process, we call the convergences (21) the central limit theorems (CLTs) for the LSB process. ■
5. Numerical Simulations of the LSB Estimators
In this section, the parameter estimation procedures of the LSB model are examined numerically, using
Monte Carlo simulations of the basic LSB series, each of length
. The main goal is to examine the quality of the previously proposed estimators, that is, their asymptotic properties, which were analysed in the previous section. To this end, appropriate estimation errors and normality testing procedures are also used. The summarized values of the estimated parameters, i.e., the mean (Mean), minimum (Min.), and maximum (Max.), as well as the corresponding mean square estimation errors (MSEE), are shown in the left part of Table 1. In addition, the obtained estimates were tested for the AN property using the Anderson–Darling and Cramér–von Mises normality tests. The appropriate test statistics (denoted by AD and W, respectively), as well as their p-values, were computed using the R-package “nortest” [38] and are presented in the right part of Table 1.
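As a rough illustration of the kind of normality check described above, the following Python sketch computes the Anderson–Darling statistic for a sample of estimates, with the mean and standard deviation estimated from the sample itself. It is a hand-written stand-in and does not reproduce the R routines: the function names are assumptions made here, only the raw statistic is returned, and no p-value or small-sample correction is computed.

```python
import math
from statistics import mean, stdev


def std_normal_cdf(z):
    """Standard normal CDF via erfc, which stays accurate in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def anderson_darling_normal(sample):
    """Anderson-Darling statistic A^2 for normality (parameters estimated)."""
    n = len(sample)
    mu, sd = mean(sample), stdev(sample)
    # Standardize and sort the observations.
    z = sorted((v - mu) / sd for v in sample)
    total = 0.0
    for i in range(n):
        lower = std_normal_cdf(z[i])
        # Upper tail 1 - Phi(z) computed as Phi(-z) for numerical stability.
        upper = std_normal_cdf(-z[n - 1 - i])
        total += (2 * i + 1) * (math.log(lower) + math.log(upper))
    return -n - total / n
```

Small values of the statistic are compatible with normality, while strongly skewed or heavy-tailed samples produce much larger values. In practice the function would be applied to the list of estimates of one parameter collected over all simulated series.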
Based on the obtained estimates, it is evident that almost all of them have the AN property, in agreement with the previous theoretical results. It is worth pointing out that even the estimates of the mean values and , obtained from the realizations of the non-stationary series , have the AN property. This is explained by the theoretical findings of Theorems 4 and 5, which describe the AN properties of this series. However, owing to this non-stationarity, the efficiency of both of these estimators is not pronounced and, because the variance is unbounded, the obtained estimates span a wide range of values. On the other hand, we note that the AN property is not particularly pronounced in the case of the estimates of the critical value . This is a consequence of the three-step procedure for estimating this parameter, because the estimates of the parameter are obtained after the estimates of the parameters and have been computed. Nevertheless, it is clear that, in accordance with the previous theoretical findings, above all Theorem 7, the estimate is more efficient and has a more pronounced AN property than the estimate .
Finally, in the case of estimates of the scale parameter
, their efficiency and AN properties are clearly visible. At the same time, it should be noticed that the moment-based estimate
has a somewhat lower efficiency and a less pronounced AN property compared to the ML estimate
, obtained using Equation (37) and the modelled innovations
. This is fully consistent with the previously explained theoretical findings, that is, the definitions of both of these estimates, as well as Theorems 6 and 7. Some visual confirmation of these facts can be seen in Figure 6, where the histograms of the frequency distribution of the obtained estimates are shown. Thus, for instance, in all cases (with the exception of
), the presence of the AN property can clearly be observed, as well as the efficiency of the obtained estimates.
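The difference in efficiency between a moment-based and a maximum likelihood (ML) estimate of a Laplace scale parameter can be illustrated with a small experiment. The Python sketch below uses the textbook estimators for a centred Laplace sample, assuming the density is parametrized so that the variance equals twice the squared scale: the ML estimate is the mean absolute value, and the moment estimate is the square root of half the mean square. These are generic stand-ins applied directly to simulated innovations; the function names are hypothetical, and the estimators defined earlier in the paper, which work with the observed LSB series, are not reproduced here.

```python
import math
import random


def laplace_sample(n, scale, rng):
    """Centred Laplace sample as a difference of two IID exponentials."""
    rate = 1.0 / scale
    return [rng.expovariate(rate) - rng.expovariate(rate) for _ in range(n)]


def scale_ml(eps):
    """ML estimate of the scale for a zero-location Laplace sample."""
    return sum(abs(e) for e in eps) / len(eps)


def scale_moment(eps):
    """Moment estimate based on E[eps^2] = 2 * scale^2 (assumed parametrization)."""
    return math.sqrt(sum(e * e for e in eps) / (2 * len(eps)))


def compare(n=500, runs=400, scale=1.0, seed=0):
    """Return the Monte Carlo mean square errors of both estimators."""
    rng = random.Random(seed)
    err_ml = err_mom = 0.0
    for _ in range(runs):
        eps = laplace_sample(n, scale, rng)
        err_ml += (scale_ml(eps) - scale) ** 2
        err_mom += (scale_moment(eps) - scale) ** 2
    return err_ml / runs, err_mom / runs
```

Calling `compare()` returns the two mean square errors for the chosen sample size. For these textbook estimators, first-order asymptotics suggest that the moment estimate has a variance roughly a quarter larger than that of the ML estimate, so one would expect the first returned value to be the smaller one in most runs.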
6. Application in Dynamic Analysis of the World Oil and Gas Market
In the following, the application of the LSB process in the dynamic analysis of prices and trading volumes of crude oil and natural gas on the world market is considered. The temporal dynamics of these two energy sources are of particular importance, and they are greatly influenced by global external factors, such as the recent COVID-19 pandemic and the war in Ukraine. It can therefore be assumed that these factors cause permanent and pronounced fluctuations in the dynamics of the price and trading volume of these two energy products, which can be seen in Figure 7. As mentioned in the introduction, it will be shown here that the LSB process can be an appropriate stochastic model for describing dynamics of this kind. To this end, based on official data from the National Association of Securities Dealers Automated Quotations (NASDAQ) Stock Market [39], we observed daily changes in the prices and trading volumes of crude oil (prices in US dollars per barrel) and natural gas (prices in US dollars per cubic meter) from 2 April 2018 to 31 March 2023.
In this way, time series of real-world data of length
were obtained, and their main statistical indicators can be seen in Table 2. Based on these, it can be easily concluded that in both cases there are pronounced and permanent fluctuations. For example, the average price of crude oil is (approximately) 62.57 US dollars per barrel. However, it varies from 9.06 US dollars (on 21 April 2020, just over a month after the official announcement of the COVID-19 pandemic) all the way to 123.7 US dollars (on 8 March 2022, a few weeks after the start of the war in Ukraine). Let us notice that the price of natural gas also varies over a wide range, although less so than the price of crude oil.
Note that, in addition to the basic financial data (prices and volumes of trading), Table 2 also shows descriptive statistics of the so-called log-volumes. They represent an aggregate financial indicator, obtained as the natural logarithm of the total monetary value of trading volumes, i.e.,
Here,
and
are, respectively, the price and trading volumes of crude oil, and
and
are the price and trading volumes of natural gas, observed at some point in time
. As stated in two studies [40,41], the usage of log-volumes changes the interpretation of activity shocks because unexpected values are not affected by the growth trend in their dynamics. In addition, the variance of log-volatility shocks is then more uniform across the sample (that is, over the timeline of the observed data). This can also be seen through the sample variance and standard deviation of both observed log-volume series, which are shown in Table 2. Additionally, the corresponding Split-MA(1) processes for these series are as follows:
i.e., they represent the sum of the log-returns of prices and trading volumes.
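The relation between log-volumes and log-returns described above can be checked numerically. The following Python sketch, written with hypothetical function names and made-up toy numbers rather than the NASDAQ data, computes log-volumes as the logarithm of price times traded volume and verifies that their increments equal the sum of the log-returns of prices and of volumes.

```python
import math


def log_volumes(prices, volumes):
    """Natural logarithm of the monetary value traded (price times volume)."""
    return [math.log(p * v) for p, v in zip(prices, volumes)]


def increments(series):
    """First differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def log_returns(values):
    """Log-returns of a positive series."""
    return [math.log(b / a) for a, b in zip(values, values[1:])]


# Toy data for illustration only (not taken from the paper).
prices = [60.0, 62.0, 59.5, 61.2]
volumes = [1000.0, 1200.0, 900.0, 1100.0]

lhs = increments(log_volumes(prices, volumes))
rhs = [rp + rv for rp, rv in zip(log_returns(prices), log_returns(volumes))]
```

Here `lhs` and `rhs` agree up to floating-point rounding, which is simply the identity that the logarithm of a product is the sum of the logarithms, applied to consecutive observations.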
We further consider the possibility of using the LSB process as a suitable stochastic model of logarithmic volume dynamics. To that end, the basic LSB series, i.e., realizations of log-volumes of crude oil and natural gas, will be referred to as Series A and Series B, respectively. Accordingly, using Equations (1) and (5), the martingale means
and innovations
can be obtained by the following iterative procedure:
where
and
is the estimated critical value, obtained by using Equation (31). As initial values in (47), as before, we have taken
as well as
,
. The estimated values of basic statistical indicators of the increment series
,
, as well as two modelled series, martingale means
,
and innovation series
,
, are presented in Table 3.
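The iterative recovery of the martingale means and innovations from an observed series can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `decompose`, the argument name `c_hat` for the estimated critical value, and in particular the initial values (first mean equal to the first observation, first innovation equal to zero) are assumptions made here and may differ from the initial values used in the paper.

```python
def decompose(y, c_hat):
    """Split an observed series into martingale means and innovations.

    Assumed initial values: m[0] = y[0] and eps[0] = 0.0.
    """
    m = [y[0]]
    eps = [0.0]
    for t in range(1, len(y)):
        # The previous innovation is absorbed into the mean only when
        # its square exceeds the (estimated) critical value.
        jump = eps[-1] if eps[-1] ** 2 > c_hat else 0.0
        m.append(m[-1] + jump)
        # The current innovation is what remains after removing the mean.
        eps.append(y[t] - m[-1])
    return m, eps
```

For instance, `decompose([1.0, 1.1, 3.0, 3.1, 3.0], 1.0)` keeps the mean at 1.0 until the large deviation at the third observation, after which the mean moves to 3.0. By construction, each observation equals the sum of the recovered mean and innovation at the same time point.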
Based on the obtained estimated parameter values, certain observations can be made, which also follow from the previously obtained theoretical results. Note first that the average values of the log-volumes are “close” to the average values of the martingale means, which is consistent with the equality . Moreover, Series A and B have similar values of the other statistical indicators (variance or standard deviation, for example), which indicates a certain similarity in their dynamics and other stochastic characteristics. This can be seen by comparing the statistical indicators of the increments (), , and the innovation series (), . It is noticeable that the sample means of both series are close to zero, which is consistent with the symmetry of their empirical distributions. Finally, it is worth pointing out that both series (), , have estimated values of kurtosis , which could indicate their suitability for stochastic modelling with the Laplace distribution.
In the following, using the estimation procedure described in the previous section, the parameters of both real-world time series can be estimated, assuming that their dynamics follow the LSB model. Table 4 shows the estimates obtained by applying the above-mentioned procedures, that is, two kinds of parameter estimates of the LSB model. Additionally, some other estimates, such as the first-order sample correlation
and the estimates of the threshold parameter
, are shown. Let us notice that the condition
is satisfied for both series, which enables the estimation of the parameter
.
As has already been pointed out, the values of the modelled series (
) and (
) were computed by using the most robust estimators of the LSB process
. The agreement between the modelled and actual data can be seen in Figure 8a, where, in addition to the observed log-volume values (
), the modelled martingale mean values (
) are shown. At the same time, the agreement between the increment series (
) and innovations (
) is shown in Figure 8b. It is worth noting that high correlations between the actual and modelled time series are clearly observable, which can also be explained by the theoretical findings presented in Section 2. Namely, the martingale means (
) are equal to the log-volumes in cases when there were no pronounced fluctuations of the series (
) in the previous time period. On the other hand, if pronounced fluctuations occur, the values of the series (
) and (
) become different, and the resulting deviations indicate the existence of significant fluctuations and potential risk in the market. Similarly, if at some point in time the innovation series (
) has a pronounced fluctuation, the next value of (
) will be equal to increments (
). Clearly, the agreement between the realizations of these time series is better if, in addition to permanent and pronounced fluctuations of (
), the critical value c is relatively small.
Notice that the CDFs and PDFs of the series (
),
can be obtained by using the results given in Theorem 3, that is, Equations (10) and (13), respectively. The upper plots in Figure 9 show the fitted PDFs of both empirical distributions of the series (
),
. Finally, using the results of Theorem 4, that is, Equation (15), the fitted CDFs
of the log-volumes (
) can be obtained by the iterative procedure:
where:
From here, by differentiating the CDFs obtained using Equation (48), the corresponding PDFs of the log-volumes (
) can also be easily computed. Owing to the non-stationarity, these PDFs depend on the time argument
. The lower plots in Figure 9 show the theoretical PDFs of the series (
), obtained by using the numerical procedure in the R-package “distr” [42]. The PDFs of the length
are shown with dashed lines, and the PDFs of the length
are shown with a solid line.
7. Conclusions
The main stochastic properties of the Laplacian Split-BREAK (LSB) process are presented here, along with the asymptotic properties of the corresponding LSB series and the procedures for estimating their parameters. One of the advantages of this stochastic model, as in the case of the Gaussian Split-BREAK (GSB) process, is that it enables the usage of appropriate stationary and non-stationary components, which provide different procedures for estimating its unknown parameters. Of particular importance is the asymptotic behaviour of the LSB series, which is examined here together with that of the obtained parameter estimators.
Let us point out that one of the important features of the LSB process, as well as of the class of STOPBREAK-based processes in general, is the ability to “remove” the sharp boundary between stochastic processes with permanent shocks and those in which they remain transient. Therefore, these stochastic processes can vary between different well-known non-linear stochastic models (for more details, see Stojanović et al. [23,24]). For instance, the LSB process can vary from an IID (white noise) series, for a larger critical value of reaction
, to a random walk process, as
approaches zero.
In addition, some of the possibilities of applying the LSB process in modelling dynamics of the real-world series with emphasized and persistent fluctuations are also described. This provides opportunities for potential future research based on the various kinds of Split-BREAK processes. At the same time, it is worth pointing out that the dynamic analysis of log-volumes, as composite time series, may represent a certain limitation, due to a possibility of omitting some other characteristics of the oil and gas market.