Generalized Cauchy Process: Difference Iterative Forecasting Model

The main contribution of this article is a new stochastic sequence forecasting model, the difference iterative forecasting model based on the Generalized Cauchy (GC) process. The GC process is a Long-Range Dependent (LRD) process described by two independent parameters: the Hurst parameter H and the fractal dimension D. Compared with fractional Brownian motion (fBm), in which H and D are linearly related, the GC process can describe various LRD processes more flexibly. Before building the forecasting model, this article shows how the GC process uses H and D to describe the LRD and fractal properties of stochastic sequences, respectively. The GC process is then taken as the diffusion term of a difference iterative forecasting model, where the distribution of the GC process increments is obtained statistically. The parameters of the forecasting model are estimated by the box dimension, rescaled range, and maximum likelihood methods. Finally, a real wind speed data set is used to verify the performance of the GC difference iterative forecasting model.


Introduction
A stochastic sequence exhibits Long-Range Dependence (LRD) if, as t → ∞, the (auto)correlation function Cov(x_0, x_t) decays to zero like a power function, slowly enough that $\sum_{t=0}^{\infty} \mathrm{Cov}(x_0, x_t) = \infty$. In recent years, the trend forecasting of stochastic sequences with LRD characteristics has become a hot topic that has attracted the interest of many scholars [1][2][3][4]. A large number of experiments show that forecasting methods based on regression analysis [5,6], the Gray system [7][8][9], the Wiener process [10,11], the Markov process [12][13][14], the support vector machine [15,16], fuzzy analysis [17,18], and neural networks [19,20] cannot describe the LRD characteristics of actual stochastic sequences, which leads to low forecasting accuracy. Therefore, stochastic models with LRD characteristics have been proposed, such as fractional Gaussian processes and fBm [21][22][23]. A forecasting model built on such a stochastic model can better forecast the stochastic sequence because it comprehensively considers the impact of the past and current states on the future state [24][25][26]. M. Li et al. propose the forecasting theory of time series with LRD characteristics [27]. W. Song et al. [28] and Wan-Qing et al. [29] established a forecasting model based on fBm and applied it to the time series forecasting of power load and rolling bearing degradation. The fBm is also used to derive a degradation model and forecast the remaining useful life of mechanical equipment [30,31]. Liu et al. [26] establish a forecasting model based on the fractional Levy stable motion. S. Duan et al. [32] use the fractional Levy stable motion forecasting model to forecast the

GC Process: Properties

Preliminary Knowledge
Definition 1. Self-similarity property. A self-similar process is a stochastic process X(t) with the same finite-dimensional distribution globally and locally in a statistical sense, and its mathematical definition is [36,37]:

$$X(at) \equiv a^{H} X(t), \quad a > 0, \tag{1}$$

where ≡ indicates that both sides of the equation have the same probability distribution, and 0 < H < 1 is the self-similar parameter, also known as the Hurst parameter. If Definition 1 holds for all t, then X(t) is a global self-similar process.
Definition 2. ACF. In this article, the ACF of X(t) is represented by R_XX(τ), whose expression is as follows [21,22]:

$$R_{XX}(\tau) = E[X(t)X(t+\tau)], \tag{2}$$

where τ is the time lag and E[·] is the mean operator.
Definition 3. LRD characteristic. If the ACF obeys an asymptotic power function, then X(t) has LRD characteristics, namely [27]:

$$R_{XX}(\tau) \sim c\,\tau^{-\beta}, \quad \tau \to \infty, \tag{3}$$

where c is a constant and 0 < β < 1 is the LRD index, which is given by the Hurst parameter:

$$\beta = 2 - 2H. \tag{4}$$

In this case, the Hurst parameter H is limited to 0.5 < H < 1.

Definition 4. Fractal dimension. The fractal dimension D is related to the fractal index α, which represents the local irregularity of X(t) and is defined as [23,28]:

$$R_{XX}(0) - R_{XX}(\tau) \sim c_1 |\tau|^{\alpha}, \quad \tau \to 0, \tag{5}$$

$$D = 2 - \frac{\alpha}{2}, \tag{6}$$

where 0 < α ≤ 2, 1 ≤ D < 2, and c_1 is a positive constant.
It can be seen from Equations (4) and (6) that the Hurst parameter H and the fractal dimension D represent the global and local properties of the stochastic process X(t), respectively, because the LRD exponent and the fractal exponent are obtained under the conditions of τ → ∞ and τ → 0 , respectively.
In general, therefore, the Hurst parameter H and the fractal dimension D independently describe the LRD characteristics and local irregularities of stochastic sequences. However, there is a linear relationship between the Hurst parameter H and the fractal dimension D in the fBm [36]:

$$D = 2 - H. \tag{7}$$

In other words, the fractal dimension D of fBm increases as the Hurst parameter H decreases, and decreases as the Hurst parameter increases. A stochastic sequence in which both the Hurst parameter H and the fractal dimension D are large, that is, one with both strong LRD and strong local irregularities, cannot be effectively described by the fBm model.

GC Process
The probability density function of X(t) has the following form, which is called the Cauchy distribution [39,40]: where X(t) is a stochastic process, µ is the position parameter and δ is the scale parameter. The Cauchy class model is obtained by generalizing the Cauchy distribution in the autocorrelation domain. Its ACF and fractal dimension are as follows [41]: where R_C(τ) is the ACF of the Cauchy class model. From Equations (5), (6) and (10), the fractal dimension of the Cauchy class process is always limited to D = 1, so a stochastic process with a fractal dimension in the range (1, 2) cannot be described. Therefore, the Cauchy class process is generalized to the GC process.
In this article, the GC distribution is extended to the correlation domain to obtain the ACF, thereby defining the GC process. The GC distribution is defined by the following probability density function [39,41]: where Γ(x) is the gamma function and p is the tail constant. It should be noted that the GC distribution reduces to the Cauchy distribution when p = 2. The GC distribution is introduced into the correlation domain, and the ACF of the Cauchy class model is further extended to obtain the ACF [38,42]: That is, the stationary Gaussian process with the ACF of Equation (12) is called the GC process.
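For orientation, the forms in which these quantities are commonly written in the GC literature are sketched below. They are assumptions rather than the article's own equations: the parametrisation is chosen to match the symbols µ, δ, p, H and D used above, with α = 4 − 2D the fractal index and β = 2 − 2H the LRD index, and the normalisation used in the cited references may differ.

```latex
% Cauchy density with position \mu and scale \delta (assumed standard form)
f(x) = \frac{\delta}{\pi\left[(x-\mu)^{2} + \delta^{2}\right]}

% Generalized Cauchy density with tail constant p (assumed standard form);
% p = 2 gives \Gamma(1) = 1 and \Gamma(1/2)^2 = \pi, recovering the Cauchy density
f(x) = \frac{p\,\Gamma(2/p)}{2\,[\Gamma(1/p)]^{2}}
       \cdot \frac{\delta}{\left(\delta^{p} + |x-\mu|^{p}\right)^{2/p}}

% GC autocorrelation function (assumed standard form)
R_{GC}(\tau) = \left(1 + |\tau|^{\alpha}\right)^{-\beta/\alpha}
             = \left(1 + |\tau|^{4-2D}\right)^{-\frac{1-H}{2-D}}
```

With D = 1 the exponent α equals 2 and the last expression becomes a Cauchy-class correlation, which is consistent with the statement that the Cauchy class is restricted to D = 1.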

The LRD Characteristics of the GC Process
Compared with single-parameter stochastic models, such as fBm, the GC model describes the correlation of stochastic sequences with two parameters; that is, the GC process is an LRD process when 1 ≤ D < 2 and 0.5 < H < 1. Figure 1 shows the ACF curves of the GC process under different parameter values, which illustrate the influence of the Hurst parameter and the fractal dimension on the correlation. On the one hand, the effect of the Hurst parameter on the ACF is global and determines the overall trend. On the other hand, the influence of the fractal dimension on the ACF is local, and its influence on the global behavior is small.
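The global and local roles of H and D can be checked numerically. The sketch below assumes the closed form of the GC ACF that is commonly quoted in the literature, (1 + |τ|^α)^(−β/α) with α = 4 − 2D and β = 2 − 2H; the function name `gc_acf` and the chosen parameter pairs are illustrative only.

```python
def gc_acf(tau, hurst, fractal_dim):
    """Assumed GC autocorrelation (1 + |tau|^alpha)^(-beta/alpha).

    Valid for 1 <= fractal_dim < 2 and 0.5 < hurst < 1.
    """
    alpha = 4.0 - 2.0 * fractal_dim  # fractal index, from D = 2 - alpha/2
    beta = 2.0 - 2.0 * hurst         # LRD index, from beta = 2 - 2H
    return (1.0 + abs(tau) ** alpha) ** (-beta / alpha)


if __name__ == "__main__":
    # Compare a very small lag (local behaviour) with a very large lag (global behaviour).
    for hurst, dim in [(0.6, 1.2), (0.6, 1.8), (0.9, 1.2), (0.9, 1.8)]:
        near = gc_acf(0.01, hurst, dim)
        far = gc_acf(1000.0, hurst, dim)
        print(f"H={hurst} D={dim}  ACF(0.01)={near:.4f}  ACF(1000)={far:.4f}")
```

Under this assumed form, changing D mostly moves the value near τ = 0, while changing H mostly moves the tail.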

Self-Similarity Properties of the GC Process
Unlike the global self-similarity of fBm, the GC process has only weak self-similarity properties, that is, local self-similarity (relative to the global self-similarity property of Equation (1)), which is defined as [38]: where a is a constant.

In [41], it is proposed that the GC process can be transformed into a global self-similar process through the Lamperti transformation. Let Y(t) be the GC process after transformation; then, the transformation form is:
The ACF of Y(t) is shown in Figure 2.


The Generation of the GC Sequence
Ortigueira fractal linear system theory shows that a stationary time series can be obtained by passing white noise through a filter, the specific process being [46,47]:

$$x(t) = w(t) * h(t), \tag{15}$$

where w(t) is the white noise, h(t) is the impulse function, * denotes convolution, and x(t) is the generated stationary time series. Similarly, a non-stationary white noise can be passed through a linear filter to obtain a non-stationary time series. This article introduces a method to generate the GC sequence from Gaussian white noise and an impulse function. The GC sequence is generated as follows:
Step 1: Calculating the Fourier transform of Equation (15):

$$X_{GC}(\omega) = W_{GC}(\omega) H_{GC}(\omega), \tag{16}$$

where X_GC(ω), W_GC(ω) and H_GC(ω) are the Fourier transforms of x_GC(t), w(t) and h_GC(t), respectively.
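Step 2: Calculating H_GC(ω):

$$H_{GC}(\omega) = \left\{F\left[R_{GC}(\tau)\right]\right\}^{0.5}, \tag{17}$$

where F[x] is the Fourier transform operator and R_GC(τ) is the GC ACF of Equation (12).
Step 3: Calculating the inverse Fourier transform of H_GC(ω) to find the impulse function:

$$h_{GC}(t) = F^{-1}\left[H_{GC}(\omega)\right], \tag{18}$$

where F^{-1} is the inverse Fourier transform operator. The impulse function at D = 1.5 and H = 0.6 is shown in Figure 3.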
Figure 4 shows that the global trend of the sequence is approximately the same under the same Hurst parameter, but a higher fractal dimension results in stronger local irregularities (Figure 4b). Figure 5 shows that the local irregularity of the sequence is similar under the same fractal dimension, but the sequence with a higher Hurst parameter has a stronger global trend (Figure 5b).
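The generation steps can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' implementation: the transfer function is taken as the square root of the Fourier transform of the assumed GC ACF (1 + |τ|^(4−2D))^(−(1−H)/(2−D)), the convolution is circular, and a naive O(n²) DFT is used so that the block has no dependencies. All function names are hypothetical.

```python
import cmath
import math
import random


def gc_acf(tau, hurst, fractal_dim):
    """Assumed GC autocorrelation (1 + |tau|^alpha)^(-beta/alpha)."""
    alpha = 4.0 - 2.0 * fractal_dim
    beta = 2.0 - 2.0 * hurst
    return (1.0 + abs(tau) ** alpha) ** (-beta / alpha)


def dft(values, inverse=False):
    """Naive discrete Fourier transform; the inverse divides by n."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for j, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * j / n)
        out.append(acc / n if inverse else acc)
    return out


def generate_gc_sequence(n, hurst, fractal_dim, seed=0):
    """Filter Gaussian white noise with an impulse response derived from the ACF."""
    rng = random.Random(seed)
    # ACF sampled on a circular lag grid, so that its DFT is real-valued.
    acf = [gc_acf(min(j, n - j), hurst, fractal_dim) for j in range(n)]
    # Step 2: transfer function as the square root of the (clipped) spectrum.
    spectrum = [max(s.real, 0.0) for s in dft(acf)]
    transfer = [math.sqrt(s) for s in spectrum]
    # Step 3: impulse response by inverse transform.
    impulse = [h.real for h in dft(transfer, inverse=True)]
    # Circular convolution of white noise with the impulse response.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [
        sum(noise[j] * impulse[(t - j) % n] for j in range(n))
        for t in range(n)
    ]


if __name__ == "__main__":
    sequence = generate_gc_sequence(128, hurst=0.75, fractal_dim=1.5, seed=1)
    print(sequence[:5])
```

For long sequences a fast Fourier transform would replace the naive `dft`; the quadratic version is kept here only to make each step visible.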

The Difference Iterative Forecasting Model Based on the GC Process
Brownian motion is a stochastic process in the mathematical sense [28], which can be described by a stochastic differential equation whose expression is as follows: where B(t) is Brownian motion. Then, Equation (20) is extended to the Ito process with drift and interference terms [43,48]: where X(t) is the stochastic sequence, and µ and δ are the drift rate and the interference intensity (or volatility), respectively. Scholes and F. Black [44], and Wang et al. [45] used the fBm B_H(t) instead of B(t) as the volatility term of the Ito process and established the Black-Scholes model to describe the trend of the financial option S_t: where B_H(t) is the fBm. These articles apply the concept of forecasting from the financial field to stochastic sequences.
The Black-Scholes model assumes that the volatility is constant. This does not match the fact that the non-stationarity of a stochastic sequence causes its trend to change over time, and it cannot explain the influence of local volatility on long-term dependence. To solve this problem, the expression of the Ito process is extended. The constants µ and δ are extended to the time-dependent functions µ(t) and δ(t), respectively, and the interference term B(t) is represented by Z(t), which is defined as follows: From the perspective of the Ito process driven by the fBm, the GC process can be regarded as a stochastic interference term with LRD properties. Therefore, the generalized expression of the Ito process driven by the fBm is as follows: where µ(t) is the drift coefficient, which represents the global trend of the GC process, and δ(t) is the diffusion coefficient, which represents the local uncertainty of the GC process.
The stochastic sequence forecasting model based on the GC process is obtained by combining Equations (22) and (24), and the form is as follows: Discretizing Equation (25), we get: where ∆GC(t) = GC(t + τ) − GC(t). The distribution of the GC process increments ∆GC(t) can be obtained through statistical reasoning. The specific process is as follows:
Step 1: The numerical sequence of the GC process is generated by Equation (19).
Step 2: The difference between two states separated by a time interval τ is computed, namely, ∆GC(t) = GC(t + τ) − GC(t).
Step 3: Step 2 is repeated over the time series generated in Step 1, and the resulting differences are collected into an incremental set.
Step 4: The variance δ_τ of the incremental distribution is calculated. The GC process is a stationary Gaussian process, so the increment over the interval τ also follows a Gaussian distribution, i.e., ∆GC(t) ∼ N(0, δ_τ), as shown in Figure 6.
Therefore, substituting Equation (26) into the difference equation: The difference iterative forecasting model is obtained: where ∆GC(t) ∼ N(0, δ_τ). Let H = 0.75, D = 1.5, µ = 0.28, δ = 0.35, δ_τ = 0.32. Substituting these parameters into Equation (31) and performing 100 Monte Carlo simulations, the generated numerical sequence is similar to the sequence generated by the GC process, as shown in Figure 7.
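The increment statistics and the Monte Carlo iteration can be sketched as follows. Because the discretized model equation is not reproduced in this text, the update rule below is an assumption: a multiplicative, Black-Scholes-like step X(t + τ) = X(t) + µ·X(t)·τ + δ·X(t)·∆GC(t) with constant µ and δ. Drawing each ∆GC(t) independently from N(0, δ_τ) is a further simplification that ignores the correlation between successive increments. The function names are hypothetical.

```python
import math
import random
import statistics


def increment_variance(sequence, lag=1):
    """Steps 2 to 4: variance of the increments seq[t + lag] - seq[t]."""
    increments = [sequence[i + lag] - sequence[i] for i in range(len(sequence) - lag)]
    return statistics.pvariance(increments)


def simulate_paths(x0, mu, delta, delta_tau, steps, n_paths=100, tau=1.0, seed=0):
    """Iterate the assumed difference model for several Monte Carlo paths."""
    rng = random.Random(seed)
    sigma = math.sqrt(delta_tau)  # delta_tau is a variance, gauss() expects a std. dev.
    paths = []
    for _ in range(n_paths):
        x = x0
        path = [x]
        for _ in range(steps):
            d_gc = rng.gauss(0.0, sigma)                  # increment ~ N(0, delta_tau)
            x = x + mu * x * tau + delta * x * d_gc       # assumed update rule
            path.append(x)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Parameter values quoted in the text; x0, steps and tau are illustrative.
    paths = simulate_paths(1.0, mu=0.28, delta=0.35, delta_tau=0.32,
                           steps=50, n_paths=100, tau=0.01)
    mean_end = sum(p[-1] for p in paths) / len(paths)
    print(f"mean terminal value over {len(paths)} paths: {mean_end:.4f}")
```

Averaging the paths at each step gives a point forecast, and the spread across paths gives a rough uncertainty band.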

Estimated Hurst Parameter H
There are many existing methods for estimating the Hurst parameter, such as the periodogram method, the variance method and the absolute value method [29]. A commonly used method is the rescaled range method [26]. To estimate the Hurst parameter by the rescaled range method, the input sample sequence is divided into subsequences of the same length, and the ratio of the range to the standard deviation is averaged over the subsequences, which gives the rescaled range. The sequence is then re-divided with a different subsequence length and the rescaled range is recalculated. Finally, the logarithms of all the rescaled ranges and of the corresponding subsequence lengths are taken, and the estimated value of the Hurst parameter is obtained by least-squares fitting (see Figure 8). The mathematical definition is as follows:

$$\log\left(\frac{R}{S}(d)\right) = H \times \log(d) + \log(c) + \varsigma,$$

where d is the subsequence length, c is a constant, ς is the error, R is the range, S is the standard deviation, R/S(d) is the rescaled range and H is the Hurst parameter estimate.
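A minimal sketch of the rescaled range procedure is given below. It assumes the usual definition in which R is the range of the cumulative deviations from the block mean and S is the block standard deviation; the block lengths and function names are illustrative choices, not values from the article.

```python
import math
import random


def rescaled_range(block):
    """R/S of one block: range of cumulative mean-adjusted sums over the std. dev."""
    n = len(block)
    mean = sum(block) / n
    running = 0.0
    cumulative = []
    for value in block:
        running += value - mean
        cumulative.append(running)
    r = max(cumulative) - min(cumulative)
    s = math.sqrt(sum((value - mean) ** 2 for value in block) / n)
    return r / s if s > 0 else None


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def estimate_hurst(series, block_lengths):
    """Fit log(mean R/S) against log(d); the slope is the Hurst estimate."""
    log_d, log_rs = [], []
    for d in block_lengths:
        values = []
        for start in range(0, len(series) - d + 1, d):
            rs = rescaled_range(series[start:start + d])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            log_d.append(math.log(d))
            log_rs.append(math.log(sum(values) / len(values)))
    return least_squares_slope(log_d, log_rs)


if __name__ == "__main__":
    rng = random.Random(0)
    white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
    print(estimate_hurst(white_noise, [16, 32, 64, 128, 256]))
```

For uncorrelated noise the estimate should lie near 0.5, although the rescaled range method is known to be biased upward for short blocks.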

Estimated Fractal Dimension D
Several techniques are commonly used to estimate fractal dimensions, such as box counting and spectral methods [49]. However, some of these methods, such as the spectral method, are prone to estimation errors. In this article, we use the box-dimension method to calculate the fractal dimension [50,51]. The box-dimension method divides the plane of the sample sequence into small lattices; it then takes the logarithm of the total number of lattices covering the sample sequence and of the corresponding side lengths, and obtains the estimated value of the fractal dimension by least-squares fitting (as shown in Figure 9). The specific mathematical definition of the box-counting method is as follows:

$$\log(N_e) = D \times \log(1/e) + \log(c) + \varsigma, \tag{31}$$

where N_e is the number of grids covering the sample sequence, e is the corresponding side length of the grid, and D is the estimated value of the fractal dimension.
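The box-counting fit can be sketched as below. The sketch makes several assumptions that are not taken from the article: the sequence is rescaled to the unit square, the grid side is e = 1/k for a few values of k, and within each grid column the boxes between the lowest and highest sample are counted as covered. The function names are hypothetical.

```python
import math
import random


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def box_count(series, boxes_per_side):
    """Number of grid boxes of side 1/boxes_per_side touched by the rescaled curve."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0
    scaled = [(value - lo) / span for value in series]  # values mapped to [0, 1]
    n = len(series)
    k = boxes_per_side
    total = 0
    for col in range(k):
        start = col * (n - 1) // k
        stop = (col + 1) * (n - 1) // k
        segment = scaled[start:stop + 1]
        top = min(int(max(segment) * k), k - 1)
        bottom = min(int(min(segment) * k), k - 1)
        total += top - bottom + 1  # boxes spanned vertically in this column
    return total


def estimate_fractal_dimension(series, sides=(4, 8, 16, 32)):
    """Fit log(N_e) against log(1/e) with e = 1/k; the slope estimates D."""
    log_inv_e = [math.log(k) for k in sides]
    log_n = [math.log(box_count(series, k)) for k in sides]
    return least_squares_slope(log_inv_e, log_n)


if __name__ == "__main__":
    rng = random.Random(1)
    line = list(range(1025))
    noise = [rng.random() for _ in range(4097)]
    print(estimate_fractal_dimension(line), estimate_fractal_dimension(noise))
```

A smooth curve such as a straight line should give an estimate close to 1, and a very irregular sequence should approach 2.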

Figure 9. Least-squares fitting of fractal dimension.

Estimated Drift and Diffusion Coefficients
Establishing the difference iterative prediction model relies not only on the Hurst parameter and the fractal dimension to generate the GC sequence, but also on the drift coefficient and the diffusion coefficient to establish the stochastic differential equation. Therefore, the maximum likelihood estimation method [42,52] is applied to the probability density function of the GC process to estimate the values of the drift and diffusion coefficients. Let $x = [x_1, x_2, \cdots, x_n]$ be an input sample sequence with unknown parameters µ and δ; then, the unknown parameters can be obtained by maximum likelihood estimation: where f(x; µ, δ, p) is the joint probability distribution of the GC process, which is defined as: Taking logarithms on both sides of Equation (33), then taking the partial derivatives with respect to µ, δ and p and setting them equal to 0: where f = f(x; µ, δ, p) and Ψ(x) is the digamma function, defined as d ln Γ(x)/dx. The estimated values of µ and δ are calculated by applying the iterated conditional mode algorithm [42] to Equations (34)-(36).
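The likelihood maximization can be illustrated with a small coordinate-wise search. This is a stand-in under stated assumptions, not the iterated conditional mode algorithm of [42]: the GC density is assumed to be a(p)·δ/(δ^p + |x − µ|^p)^(2/p) with a(p) = pΓ(2/p)/(2Γ(1/p)²), the tail constant p is held fixed, and each parameter is maximized in turn by a golden-section search that presumes a single peak in the search interval. All names are hypothetical.

```python
import math
import random


def gc_log_pdf(x, mu, delta, p):
    """Log of the assumed GC density a(p) * delta / (delta^p + |x - mu|^p)^(2/p)."""
    log_norm = (math.log(p) + math.lgamma(2.0 / p)
                - math.log(2.0) - 2.0 * math.lgamma(1.0 / p))
    return (log_norm + math.log(delta)
            - (2.0 / p) * math.log(delta ** p + abs(x - mu) ** p))


def log_likelihood(sample, mu, delta, p):
    """Joint log-likelihood of independent observations."""
    return sum(gc_log_pdf(x, mu, delta, p) for x in sample)


def golden_max(func, lo, hi, iterations=40):
    """Golden-section search for the maximum of a single-peaked function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) > func(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


def fit_gc(sample, p=2.0, sweeps=5):
    """Alternate conditional maximization over mu and delta with p fixed."""
    ordered = sorted(sample)
    n = len(ordered)
    q1, q3 = ordered[n // 4], ordered[3 * n // 4]
    mu = ordered[n // 2]                   # start from the median
    delta = max((q3 - q1) / 2.0, 1e-6)     # start from half the interquartile range
    delta_hi = 10.0 * delta
    for _ in range(sweeps):
        mu = golden_max(lambda m: log_likelihood(sample, m, delta, p), q1, q3)
        delta = golden_max(lambda s: log_likelihood(sample, mu, s, p), 1e-6, delta_hi)
    return mu, delta


if __name__ == "__main__":
    rng = random.Random(3)
    # Synthetic Cauchy data (the p = 2 case) with mu = 1.0 and delta = 0.5.
    data = [1.0 + 0.5 * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(400)]
    print(fit_gc(data))
```

Estimating p as well would add a third conditional step over p; it is omitted here to keep the sketch short.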

Case Study
To verify the effectiveness of the difference iterative forecasting model based on the GC process, we consider the problem of forecasting the wind speed trend. In this experiment, the wind speed data set collected by the Sotavento Galicia, S.A. company (Galicia, Spain) [53] every ten minutes from 0:00 to 12:00 on 24 February 2020 is used as the sample data for the forecasting model. With the wind speed data of the previous 12 h as historical data, forecasts of the wind speed trend are given for the next 3 h, 6 h, 9 h and 12 h. The parameter estimation methods described in Section 4 are used to estimate the model parameters, whose values are given in Table 1. The wind speed forecasts are shown in Figure 10. The fBm difference iterative forecasting model is used for comparison; for the modeling process of fBm, see [28].
Figure 10 shows that the difference iterative forecasting models based on the GC process and on the fBm can both accurately forecast the wind speed trend. The prediction error results in Table 2 illustrate that the GC difference iterative forecasting model is superior to the fBm model in three respects: the local level (maximum, minimum and standard deviation) of the forecasting results, their overall level (mean, median and mode), and the fitting effect of the forecasting model (Mean Absolute Percentage Error (MAPE)). Figure 11 shows the boxplot of the relative errors of the two models. No outliers were detected in the figure, thus verifying the effectiveness of the GC and fBm iterative prediction models. Further, as shown in Table 1, the GC model has two parameters to describe the sequence trend, the self-similar parameter H and the fractal dimension D, which respectively predict the global and local trends of the sequence. However, the fBm has only the self-similar parameter H to predict the global trend of the sequence, so in theory, the prediction accuracy of the GC iterative model is higher than that of the fBm model.

Figure 11. Boxplot of the relative error of the forecasting models.
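The error measures used in the comparison are simple to compute. The sketch below shows the relative error, the MAPE and the summary statistics named above; the wind speed values in the usage example are made up for illustration and are not taken from Table 2.

```python
import statistics


def relative_errors(actual, forecast):
    """Absolute relative error of each forecast value (actual values must be non-zero)."""
    return [abs(f - a) / abs(a) for a, f in zip(actual, forecast)]


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    errors = relative_errors(actual, forecast)
    return 100.0 * sum(errors) / len(errors)


def error_summary(actual, forecast):
    """Local (max, min, std) and overall (mean, median) statistics of the relative error."""
    errors = relative_errors(actual, forecast)
    return {
        "max": max(errors),
        "min": min(errors),
        "std": statistics.pstdev(errors),
        "mean": statistics.fmean(errors),
        "median": statistics.median(errors),
    }


if __name__ == "__main__":
    # Hypothetical wind speeds in m/s, for illustration only.
    observed = [6.2, 6.8, 7.1, 6.5]
    predicted = [6.0, 7.0, 7.4, 6.3]
    print(f"MAPE = {mape(observed, predicted):.2f}%")
    print(error_summary(observed, predicted))
```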

Conclusions
In this study, a difference iterative forecasting model is developed from the two-parameter generalized Cauchy process and applied to the changing trend of wind speed. The main contributions are as follows:
1. The properties of the Hurst parameter H and fractal dimension D of the generalized Cauchy process are analyzed by the ACF, which describes the global and local properties of stochastic sequences, that is, long-range dependent characteristics and local irregularities, respectively;
2. The simulation sequence of the generalized Cauchy process is generated by the white