Article

Parsimonious Heterogeneous ARCH Models for High Frequency Modeling

by Juan Carlos Ruilova 1 and Pedro Alberto Morettin 2,*
1 Itaú Bank, São Paulo 04344-902, Brazil
2 Department of Statistics, University of São Paulo, São Paulo 05508-090, Brazil
* Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2020, 13(2), 38; https://doi.org/10.3390/jrfm13020038
Submission received: 6 January 2020 / Revised: 4 February 2020 / Accepted: 8 February 2020 / Published: 20 February 2020
(This article belongs to the Special Issue Financial Statistics and Data Analytics)

Abstract

In this work we study a variant of the GARCH model when we consider the arrival of heterogeneous information in high-frequency data. This model is known as HARCH(n). We modify the HARCH(n) model by taking into consideration some market components that we consider important to the modeling process. This model, called parsimonious HARCH(m,p), takes into account the heterogeneous information present in the financial market and the long memory of volatility. Some theoretical properties of this model are studied. We use maximum likelihood and Griddy-Gibbs sampling to estimate the parameters of the proposed model and apply it to model the Euro-Dollar exchange rate series.

1. Introduction

High frequency data are data measured at small time intervals. They are important for studying the microstructure of financial markets, and their use has become feasible thanks to increases in computational power and data storage.
Perhaps the most popular model used to estimate the volatility in a financial time series is the GARCH(1,1) model; see Engle (1982), Bollerslev (1986):
$$r_t = \sigma_t \varepsilon_t, \qquad \varepsilon_t \overset{iid}{\sim} (0,1), \qquad \sigma_t^2 = \alpha_0 + \alpha_1 r_{t-1}^2 + \beta_1 \sigma_{t-1}^2, \tag{1}$$
with $\alpha_0 > 0$, $\alpha_1 \geq 0$, $\beta_1 \geq 0$, $\alpha_1 + \beta_1 < 1$.
When we use high frequency data in conjunction with GARCH models, the latter need to be modified to incorporate the financial market microstructure. For example, we need to incorporate the heterogeneous characteristics that appear when many traders operate in a financial market with different time horizons.
The HARCH(n) model was introduced by Müller et al. (1997) to address this problem. This model incorporates heterogeneous characteristics of high frequency financial time series and is given by
$$r_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = c_0 + \sum_{j=1}^{n} c_j \left( \sum_{i=1}^{j} r_{t-i} \right)^2, \tag{2}$$
where $c_0 > 0$, $c_n > 0$, $c_j \geq 0$, $j = 1, \ldots, n-1$, and the $\varepsilon_t$ are independent and identically distributed (i.i.d.) random variables with zero expectation and unit variance.
However, this model has a high computational cost to fit when compared with GARCH models: because of the long memory of volatility, the number of parameters to be estimated is usually large.
We propose a new model, the parsimonious heterogeneous autoregressive conditional heteroscedastic model, PHARCH in short form, as an extension of the HARCH model. Specifically, we call PHARCH(m,p), with aggregations of different sizes $a_1, \ldots, a_m$, where m is the number of market components, the model given by
$$r_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = C_0 + C_1 \left( r_{t-1} + \cdots + r_{t-a_1} \right)^2 + \cdots + C_m \left( r_{t-1} + \cdots + r_{t-a_m} \right)^2 + b_1 \sigma_{t-1}^2 + \cdots + b_p \sigma_{t-p}^2, \tag{3}$$
where $\varepsilon_t \overset{iid}{\sim} (0,1)$, $C_0 > 0$, $C_j \geq 0$, $j = 1, \ldots, m-1$, $C_m > 0$, and $b_j \geq 0$, $j = 1, \ldots, p$.
HARCH models are important because they take into account the natural behavior of the traders in the market. However, they have some drawbacks, mainly that they need to include several aggregations, so the number of parameters to estimate is large because of the long memory feature of financial time series. The parsimonious HARCH includes only the most important aggregations in its structure, which makes the model more realistic. Some simulations appear in Figure 1, where volatility clustering is better represented by PHARCH processes than by ARCH or HARCH processes; a simulation sketch is given below.
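For readers who wish to reproduce simulations like those in Figure 1, the following is a minimal sketch (in Python with NumPy; the function and parameter names are ours, not the authors') of how a PHARCH(m,p) path following Equation (3) can be generated. The GARCH(1,1) of Equation (1) corresponds to m = 1, a_1 = 1, p = 1, and a HARCH(n) to aggregations 1, 2, ..., n with p = 0.

```python
import numpy as np

def simulate_pharch(n, C0, C, a, b, nu=None, burn=2000, seed=0):
    """Simulate a PHARCH(m, p) path following Equation (3).

    C0   : positive constant
    C, a : length-m sequences with component weights C_i and aggregation sizes a_i
    b    : length-p sequence with the GARCH-type coefficients
    nu   : if given, innovations are standardized Student-t(nu); otherwise N(0, 1)
    """
    rng = np.random.default_rng(seed)
    p = len(b)
    a_max = max(a)
    T = n + burn
    r = np.zeros(T)
    sigma2 = np.full(T, C0)               # start the variance at the constant level

    for t in range(max(a_max, p), T):
        s2 = C0
        for Ci, ai in zip(C, a):
            s2 += Ci * r[t - ai:t].sum() ** 2     # (r_{t-1} + ... + r_{t-a_i})^2
        for j in range(1, p + 1):
            s2 += b[j - 1] * sigma2[t - j]
        sigma2[t] = s2
        if nu is None:
            eps = rng.standard_normal()
        else:
            # standardized Student-t innovation (unit variance), requires nu > 2
            eps = rng.standard_t(nu) * np.sqrt((nu - 2) / nu)
        r[t] = np.sqrt(s2) * eps
    return r[burn:], sigma2[burn:]

# Example: a small PHARCH(2,1) with aggregations 1 and 4 and Student-t(4) innovations
r, s2 = simulate_pharch(5000, C0=0.5, C=[0.1, 0.02], a=[1, 4], b=[0.3], nu=4)
```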
The organization of the paper is as follows. In Section 2 we provide some background information on Markov chains and give the necessary and sufficient conditions for the PHARCH model to be stationary. In Section 3 we obtain forecasts for the proposed model, and in Section 4 we introduce the data that will be used for illustrative purposes. The actual application is given in Section 5, and we close the paper with some conclusions in Section 6.

2. Background

In this section we briefly provide some background on Markov chains and results on the stationarity of PHARCH models.

2.1. Markov Chains

Suppose that $X = \{X_n, n \in \mathbb{Z}_+\}$, $\mathbb{Z}_+ := \{0, 1, 2, \ldots\}$, are random variables defined over $(\Omega, \mathcal{F}, \mathcal{B}(\Omega))$, and assume that X is a Markov chain with transition probability $P(x, A)$, $x \in \Omega$, $A \subset \Omega$. Then we have the following definitions:
  • A function $f: \Omega \to \mathbb{R}$ is called lower semi-continuous if $\liminf_{y \to x} f(y) \geq f(x)$, for all $x \in \Omega$. If $P(\cdot, A)$ is a lower semi-continuous function for any open set $A \in \mathcal{B}(\Omega)$, we say that (the chain) X is a weak Feller chain.
  • A chain X is called φ-irreducible if there exists a measure φ on $\mathcal{B}(\Omega)$ such that, for all x, whenever $\varphi(A) > 0$, we have
    $$U(x, A) = \sum_{n=1}^{\infty} P^n(x, A) > 0.$$
  • The measure ψ is called maximal with respect to φ, and we write $\psi \succ \varphi$, if $\psi(A) = 0$ implies $\varphi(A) = 0$, for all $A \in \mathcal{B}(\Omega)$. If X is φ-irreducible, then there exists a maximal probability measure ψ such that X is ψ-irreducible.
  • Let $d = \{d(n)\}$ be a distribution, or probability measure, on $\mathbb{Z}_+$, and consider the Markov chain $X_d$ with transition kernel
    $$K_d(x, A) := \sum_{n=0}^{\infty} P^n(x, A)\, d(n).$$
    If there exists a transition kernel T satisfying
    $$K_d(x, A) \geq T(x, A), \qquad x \in \Omega,\ A \in \mathcal{B}(\Omega),$$
    then T is called the continuous component of $K_d$.
  • If X is a Markov chain for which there exists a (sampling) distribution d such that $K_d$ has a continuous component T, with $T(x, \Omega) > 0$ for all x, then X is called a T-chain.
  • A σ-finite measure π over $\mathcal{B}(\Omega)$ with the property
    $$\pi(A) = \int_{\Omega} \pi(dx)\, P(x, A), \qquad A \in \mathcal{B}(\Omega),$$
    is called an invariant measure.
The following lemmas will be useful. See Meyn and Tweedie (1996) for the proofs and further details. We denote by $I_A(\cdot)$ the indicator function of A.
Lemma 1.
Suppose that X is a weak Feller chain. Let $C \in \mathcal{B}(\Omega)$ be a compact set and V a positive function. If
$$V(X_n) - E\left[ V(X_{n+1}) \mid X_n \right] \geq 0, \qquad X_n \in C^c,$$
then there exists an invariant measure, finite on compact sets of Ω.
Lemma 2.
Suppose that X is a weak Feller chain. Let $C \in \mathcal{B}(\Omega)$ be a compact set, and V a positive function that is finite at some $x_0 \in \Omega$. If
$$V(X_n) - E\left[ V(X_{n+1}) \mid X_n \right] \geq 1 - b\, I_C(X_n), \qquad X_n \in \Omega,$$
with b a constant, $b < \infty$, then there exists an invariant probability measure π on $\mathcal{B}(\Omega)$.
Lemma 3.
Suppose that X is a ψ-irreducible aperiodic chain. Then the following conditions are equivalent:
1.
There exists a function $f: \Omega \to [1, \infty)$, a set $C \in \mathcal{B}(\Omega)$, a constant $b < \infty$ and a function $V: \Omega \to [0, \infty]$, such that
$$V(X_n) - E\left[ V(X_{n+1}) \mid X_n \right] \geq f(X_n) - b\, I_C(X_n), \qquad X_n \in \Omega.$$
2.
The chain is positive recurrent with invariant probability measure π, and $\pi(f) < \infty$.

2.2. Stationarity of PHARCH(m,p) Models

We first give a necessary condition for Model (3) to be stationary. We know that $E(r_{t-i} r_{t-j}) = 0$, $i \neq j$, and if $r_t$ is stationary, we must have
$$E(r_t^2) = E(\sigma_t^2) = C_0 + E(r_t^2) \sum_{i=1}^{m} a_i C_i + E(\sigma_t^2) \sum_{i=1}^{p} b_i,$$
so
$$E(r_t^2) = \frac{C_0}{1 - \left( \sum_{i=1}^{m} a_i C_i + \sum_{i=1}^{p} b_i \right)}.$$
Therefore,
$$\sum_{i=1}^{m} a_i C_i + \sum_{i=1}^{p} b_i < 1. \tag{4}$$
To prove a sufficient condition it is necessary to represent the PHARCH(m,p) model as a Markov process. Using the definitions of the previous section, we consider the process
$$X_t = \left( r_{t-1}, \ldots, r_{t-a_m+1}, \sigma_t, \ldots, \sigma_{t-p+1} \right), \tag{5}$$
whose elements follow Equation (3), and show that it is a T-chain.
The proofs of the following results are based on Dacorogna et al. (1996), and they are given in the Appendix A.
Proposition 1.
The Markov chain $X_t$ that represents a PHARCH(m,p) process is a T-chain.
Proposition 2.
The Markov Chain that represents a PHARCH(m,p) process is recurrent with an invariant probability measure (stationary distribution), and its second moments are finite if the condition given in (4) is satisfied.
Note that if $\varepsilon_t \sim t(\nu)$ (a Student's t distribution with ν degrees of freedom) in (3), then the necessary and sufficient condition becomes
$$\sum_{i=1}^{m} \frac{\nu}{\nu - 2}\, a_i C_i + \sum_{i=1}^{p} b_i < 1, \qquad \text{for } \nu > 2.$$
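As a small practical aid (not part of the original paper), the condition in (4) and its Student-t counterpart can be checked numerically with a helper along the following lines:

```python
import numpy as np

def pharch_is_second_order_stationary(C, a, b, nu=None):
    """Check the second-moment condition of a PHARCH(m, p):
    sum_i a_i C_i + sum_j b_j < 1 for (0,1) innovations, with a_i C_i replaced by
    (nu/(nu-2)) a_i C_i for standardized Student-t(nu) innovations (nu > 2)."""
    C, a, b = np.asarray(C, float), np.asarray(a, float), np.asarray(b, float)
    factor = 1.0 if nu is None else nu / (nu - 2.0)
    total = factor * np.sum(a * C) + np.sum(b)
    return total < 1.0, total

# Example: the PHARCH(2,1) used in the simulation sketch above
print(pharch_is_second_order_stationary(C=[0.1, 0.02], a=[1, 4], b=[0.3], nu=4))
# -> (True, 0.66)
```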

3. Forecasting

In this section we make some considerations about forecasting and validation of the proposed model. Usually two data sets are used for testing the forecasting ability of a model: one (in-sample) used for estimation, and the other (out-of-sample) used for comparing forecasts with true values. There is an extra complication in the case of volatility models: there is no unique definition of volatility. Andersen and Bollerslev (1998) show that if wrong estimates of volatility are used, the evaluation of forecasting accuracy is compromised. We could use the realized volatility as a basis for comparison, or use some trading system.
We could, for example, have a model for hourly returns and use the realized volatility computed from 15 min returns for comparisons. In general, we can compute $v_{h,t} = \sum_{i=1}^{a_h} r_{t-i}^2$, where $a_h$ is the aggregation factor (4, in the case of 15 min returns), and then use some measure based on $s_h = \tilde{v}_{h,t} - v_{h,t}$, for example the mean squared error, where $\tilde{v}_{h,t}$ is the volatility predicted by the proposed model; see Taylor and Xu (1997), for example. A sketch of this comparison is given below.
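As an illustration of this out-of-sample check, the sketch below (our own; array names are hypothetical) computes the realized measure $v_{h,t}$ as a rolling sum of $a_h$ squared returns and the mean squared error against a model's volatility forecasts.

```python
import numpy as np

def realized_vol(returns, a_h):
    """Rolling sums of a_h consecutive squared returns (the realized measure v_{h,t})."""
    r2 = np.asarray(returns, float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(r2)))
    return csum[a_h:] - csum[:-a_h]

def forecast_mse(predicted_vol, returns, a_h=4):
    """Mean squared error between predicted and realized volatility."""
    v = realized_vol(returns, a_h)
    n = min(len(v), len(predicted_vol))
    return np.mean((np.asarray(predicted_vol[:n]) - v[:n]) ** 2)
```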
Now consider Model (3). The forecast of volatility at origin t and horizon $\ell$ is given by
$$\hat{\sigma}_t^2(\ell) = E\left( \sigma_{t+\ell}^2 \mid X_t \right) = E\left( C_0 + C_1 \left( r_{t+\ell-1} + \cdots + r_{t+\ell-a_1} \right)^2 + \cdots + C_m \left( r_{t+\ell-1} + \cdots + r_{t+\ell-a_m} \right)^2 + b_1 \sigma_{t+\ell-1}^2 + \cdots + b_p \sigma_{t+\ell-p}^2 \,\middle|\, X_t \right),$$
where $X_t = \left( r_t, \sigma_t, r_{t-1}, \sigma_{t-1}, \ldots \right)$, for $\ell = 1, 2, \ldots$
Since $a_0 = 1 < a_1 < a_2 < \cdots < a_m < \infty$, we have three cases:
(i)
If $\ell = 1$,
$$\begin{aligned} \hat{\sigma}_t^2(1) &= E\left( C_0 + C_1 \left( r_{t+\ell-1} + \cdots + r_{t+\ell-a_1} \right)^2 + \cdots + C_m \left( r_{t+\ell-1} + \cdots + r_{t+\ell-a_m} \right)^2 + b_1 \sigma_{t+\ell-1}^2 + \cdots + b_p \sigma_{t+\ell-p}^2 \,\middle|\, X_t \right) \\ &= C_0 + C_1 \left( r_t + \cdots + r_{t+1-a_1} \right)^2 + \cdots + C_m \left( r_t + \cdots + r_{t+1-a_m} \right)^2 + b_1 \sigma_t^2 + \cdots + b_p \sigma_{t+1-p}^2. \end{aligned}$$
(ii)
If $\ell$ is such that $a_{s-1} < \ell < a_s$, $s = 1, 2, \ldots, m$, then we have
$$\begin{aligned} \hat{\sigma}_t^2(\ell) &= E\left( C_0 + \sum_{i=1}^{s-1} C_i \left( \sum_{j=1}^{a_i} r_{t+\ell-j} \right)^2 + \sum_{i=s}^{m} C_i \left( \sum_{j=1}^{\ell-1} r_{t+\ell-j} + \sum_{j=\ell}^{a_i} r_{t+\ell-j} \right)^2 + b_1 \sigma_{t+\ell-1}^2 + \cdots + b_p \sigma_{t+\ell-p}^2 \,\middle|\, X_t \right) \\ &= E\left( C_0 + \sum_{i=1}^{s-1} C_i \left( \sum_{j=1}^{a_i} \sigma_{t+\ell-j} \varepsilon_{t+\ell-j} \right)^2 + \sum_{i=s}^{m} C_i \left( \sum_{j=1}^{\ell-1} \sigma_{t+\ell-j} \varepsilon_{t+\ell-j} \right)^2 + \sum_{i=s}^{m} C_i \left( \sum_{j=\ell}^{a_i} r_{t+\ell-j} \right)^2 \right. \\ &\qquad \left. + \; \sum_{i=s}^{m} C_i \left( \sum_{j=1}^{\ell-1} \sigma_{t+\ell-j} \varepsilon_{t+\ell-j} \right) \left( \sum_{j=\ell}^{a_i} r_{t+\ell-j} \right) + b_1 \sigma_{t+\ell-1}^2 + \cdots + b_p \sigma_{t+\ell-p}^2 \,\middle|\, X_t \right), \end{aligned}$$
and given the independence of the $\varepsilon_t$ and $E(\varepsilon_t) = 0$, we have $E(r_{t+\ell-i}\, r_{t+\ell-j} \mid X_t) = 0$, $i \neq j$, for the future returns; hence,
$$\hat{\sigma}_t^2(\ell) = C_0 + \sum_{i=1}^{s-1} C_i \sum_{j=1}^{a_i} \hat{\sigma}_{t+\ell-j}^2 + \sum_{i=s}^{m} C_i \left[ \sum_{j=1}^{\ell-1} \hat{\sigma}_{t+\ell-j}^2 + \left( \sum_{j=\ell}^{a_i} r_{t+\ell-j} \right)^2 \right] + b_1 \tilde{\sigma}_{t+\ell-1}^2 + \cdots + b_p \tilde{\sigma}_{t+\ell-p}^2,$$
where, for $i = 1, \ldots, p$,
$$\tilde{\sigma}_{t+\ell-i}^2 = \begin{cases} \sigma_{t+\ell-i}^2, & i \geq \ell, \\ \hat{\sigma}_{t+\ell-i}^2, & i < \ell. \end{cases}$$
(iii)
If $\ell > a_m$, then it follows that
$$\hat{\sigma}_t^2(\ell) = E\left( C_0 + \sum_{i=1}^{m} C_i \sum_{j=1}^{a_i} \sigma_{t+\ell-j}^2 + b_1 \sigma_{t+\ell-1}^2 + \cdots + b_p \sigma_{t+\ell-p}^2 \,\middle|\, X_t \right) = C_0 + \sum_{i=1}^{m} C_i \sum_{j=1}^{a_i} \hat{\sigma}_{t+\ell-j}^2 + b_1 \tilde{\sigma}_{t+\ell-1}^2 + \cdots + b_p \tilde{\sigma}_{t+\ell-p}^2,$$
where, for $i = 1, \ldots, p$,
$$\tilde{\sigma}_{t+\ell-i}^2 = \begin{cases} \sigma_{t+\ell-i}^2, & i \geq \ell, \\ \hat{\sigma}_{t+\ell-i}^2, & i < \ell. \end{cases}$$
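To make the three cases concrete, here is a minimal sketch (ours, not the authors' code) of the forecasting recursion for Model (3): within each aggregation window, future returns contribute through their variance forecasts, already observed returns are kept as they are, and cross terms drop out because $E(\varepsilon_t) = 0$.

```python
import numpy as np

def pharch_forecast(returns, sigma2, C0, C, a, b, horizon):
    """l-step-ahead variance forecasts for the PHARCH(m, p) of Equation (3).

    returns : observed returns ..., r_{t-1}, r_t (most recent last), length >= max(a)
    sigma2  : fitted conditional variances aligned with `returns`, length >= len(b)
    C0, C, a, b : model parameters (C, a of length m; b of length p)
    """
    returns, sigma2 = np.asarray(returns, float), np.asarray(sigma2, float)
    T = len(returns)
    s2_hat = []                               # sigma^2_t(l), l = 1, 2, ...

    for l in range(1, horizon + 1):
        var = C0
        for Ci, ai in zip(C, a):
            future_var, known_sum = 0.0, 0.0
            for j in range(1, ai + 1):        # window r_{t+l-1}, ..., r_{t+l-a_i}
                k = l - j                     # offset relative to the origin t
                if k >= 1:                    # future return: contributes its variance
                    future_var += s2_hat[k - 1]
                else:                         # already observed return r_{t+k}
                    known_sum += returns[T - 1 + k]
            # cross terms between future and observed returns have zero expectation
            var += Ci * (future_var + known_sum ** 2)
        for i, bi in enumerate(b, start=1):
            k = l - i
            var += bi * (s2_hat[k - 1] if k >= 1 else sigma2[len(sigma2) - 1 + k])
        s2_hat.append(var)
    return np.array(s2_hat)
```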

4. High Frequency Data

In this section we further elaborate on high frequency data and introduce the series that will be analyzed later. High frequency data are very important in the financial environment, mainly because large price movements can occur in short intervals of time, which represents an interesting opportunity for trading. Furthermore, it is well known that volatilities at different frequencies have significant cross-correlation; we can even say that coarse volatility predicts fine volatility better than the other way around, as shown in Dacorogna et al. (2001).
As an example, take the tick by tick foreign exchange (FX) Euro-Dollar series, from January 1, 1999 to December 31, 2002. Returns are calculated from bid and ask prices as
$$r_t = \ln\left( \frac{p_t^{bid} + p_t^{ask}}{2} \right) - \ln\left( \frac{p_{t-1}^{bid} + p_{t-1}^{ask}}{2} \right).$$
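In code, this is just a difference of log mid-quotes; a small sketch (hypothetical array names) follows.

```python
import numpy as np

def midquote_log_returns(bid, ask):
    """r_t = log((bid_t + ask_t)/2) - log((bid_{t-1} + ask_{t-1})/2)."""
    mid = (np.asarray(bid, float) + np.asarray(ask, float)) / 2.0
    return np.diff(np.log(mid))
```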
We discard Saturdays and Sundays, and we replace holidays with the means of the last ten observations of the returns for each respective hour and day. After cleaning the data (see Dacorogna et al. (2001) for details), we consider equally spaced returns with sampling interval Δt = 15 min. This seems adequate, as many studies indicate.
Figure 2 shows Euro-Dollar returns calculated as above. The length of this time series is 95,317. The figure shows that the absolute returns present a seasonal pattern. This is due to the fact that physical time does not necessarily follow the same pattern as business time. This is a typical behavior of financial time series, and we use a seasonal adjustment procedure similar to that of Martens et al. (2002). However, we use absolute returns instead of squared returns; that is, we compute the seasonal pattern as
$$S_{d,s,h} = \frac{1}{s} \sum_{j=1}^{s} \left| r_{d,j,h} \right|,$$
where $r_{d,s,h}$ is the return on weekday d, week s and hour h, and s is the number of weeks from the beginning of the series. Therefore, $S_{d,s,h}$ is the rolling-window mean of the absolute returns with the beginning fixed.
In Figure 3 we have the autocorrelation function of these returns and of squared returns. The seasonality pattern is no longer present.
FX data have some distinct characteristics, mainly because they are produced twenty-four hours a day, seven days a week. In particular, the Euro-Dollar is the most liquid FX rate in the world. However, there are periods where activity is greater or smaller, causing the seasonal patterns seen above.
Let us analyze some facts about these returns, which we denote simply by $r_t$. Figure 4 shows the histogram fitted with a non-parametric kernel density estimate, using the unbiased cross-validation method to estimate the bandwidth. It shows fat tails and high kurtosis, namely 121, while the skewness coefficient is −0.079, indicating near symmetry. A normality test (Jarque-Bera) rejects the hypothesis that these returns are normal.
The seasonally adjusted returns are then given by
$$\tilde{r}_t = \tilde{r}_{d,s,h} = \frac{r_{d,s,h}}{S_{d,s,h}}.$$
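A minimal sketch of this adjustment (our own implementation, under the assumption that each observation is labeled by weekday and intraday slot and that the series is in chronological order) is given below; the factor for an observation is the expanding mean of the absolute returns of the same weekday/slot up to and including its week.

```python
import numpy as np

def deseasonalize(r, weekday, slot):
    """Seasonal adjustment r_tilde = r / S.

    r       : returns in chronological order
    weekday : weekday label of each observation (e.g., 0-4)
    slot    : intraday slot label of each observation (e.g., 15 min slot 0-95)
    The factor S is the mean absolute return of all observations so far
    (including the current one) with the same weekday and slot, i.e. an
    expanding mean over weeks with the beginning fixed.
    """
    r = np.asarray(r, float)
    S = np.empty_like(r)
    running = {}                          # (weekday, slot) -> (sum of |r|, count)
    for i in range(len(r)):
        key = (weekday[i], slot[i])
        tot, cnt = running.get(key, (0.0, 0))
        tot, cnt = tot + abs(r[i]), cnt + 1
        running[key] = (tot, cnt)
        S[i] = tot / cnt
    return r / S, S
```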
We may assume, for example, that the errors of a GARCH model fitted to these returns follow a Student's t distribution or a generalized error distribution, which better represent the fat tails of the distribution.
Often the optimization of the likelihood function can be a very difficult task, mainly due to the flat behavior of the likelihood function, as can be seen in Zumbach (2000). Bayesian methods are an alternative, and in the next section we use Griddy-Gibbs sampling to estimate the parameters of a PHARCH model.
Figure 4 also shows that the Euro-Dollar series has some clusters of volatility. This is a typical behavior of financial time series. A problem is that we do not know how many clusters there are and what their sizes are. The reason for this is that the information arriving is different for each sampling frequency.
We can view these clusters as market components, and they depend on the heterogeneity of the market. These market components are considered in our PHARCH model, as seen in Equation (3). Differently from GARCH-type models, PHARCH models have a variance equation involving returns over intervals of different sizes. Therefore PHARCH models take into account the sign of the returns, and not only their absolute value as GARCH models do. Two subsequent returns of similar size in the same direction will cause a higher impact on the variance than two subsequent returns of similar size but opposite signs.
Now we need to determine the number and the size of the market components for the Euro-Dollar FX series. Ruilova (2007) proposed some technical rules to determine these market components, and Dacorogna et al. (2001) proposed some empirical rules.
To help us determine whether the component sizes chosen are adequate, we can use the impact of each component. We define the impact $I_i$ of the ith component as
$$I_i = a_i C_i, \qquad \text{for each } i.$$
Note that the stationarity condition for PHARCH(m) models can be written in terms of these impacts, namely
$$\sum_{i=1}^{m} I_i < 1.$$
We also notice that if we consider the Student's t distribution with ν degrees of freedom, the impact should be defined as
$$I_i = \frac{\nu}{\nu - 2}\, a_i C_i, \qquad i \geq 1.$$
As remarked above, the number of components in a financial series can vary depending on how the returns are traded in the market. That is, a liquid series can have a structure with more components than a non-liquid series.

5. Application

Due to the complexity of the proposed model, the likelihood function may be flat in the neighbourhood of the maxima, so traditional optimization procedures may fail. An alternative is to use Bayesian methods. Some references on the use of Bayesian procedures for the family of ARCH processes are Geweke (1989), Kleibergen and Dijk (1993), Geweke (1994) and Bauwens and Lubrano (1998).
It is well known that when the analytical expressions of the full conditional distributions are known we can use Gibbs sampling. However, if the conditional distributions are not known, we need to modify the algorithm or use another one, such as Metropolis-Hastings. Another alternative is the Griddy-Gibbs sampler of Ritter and Tanner (1992).
Griddy-Gibbs sampling can be used when the full conditional distribution of at least one parameter does not have a known distributional form but has an analytical expression that can be evaluated on a grid of points. We evaluate this expression on the grid and, by numerical integration, generate random draws from the distribution; see Davis and Rabinowitz (1975).
A problem that appears when we use Griddy-Gibbs sampling is determining the window and the number of points at which the desired function will be evaluated numerically. An inadequate choice of this grid of points could cause errors in the parameter estimation. In general, 50 grid points seem suitable for a good evaluation.
We use a variance reduction technique, which is to compute the conditional mean
$$\frac{1}{N} \sum_{n=1}^{N} E\left[ \theta_i \,\middle|\, \theta_1^{(n)}, \ldots, \theta_{i-1}^{(n)}, \theta_{i+1}^{(n)}, \ldots, \theta_k^{(n)}, y \right],$$
instead of $\frac{1}{N}\sum_{n=1}^{N} \theta_i^{(n)}$, to estimate the posterior mean of $\theta_i$. Here $\theta_i^{(n)}$ denotes the value of the parameter $\theta_i$ at iteration n, and k is the number of parameters.
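A generic Griddy-Gibbs update for one parameter, along the lines described above, can be sketched as follows (our own illustration, not the authors' code): the log full conditional is evaluated on a grid of, say, 50 points, normalized, and a draw is obtained by inverse-CDF sampling; the grid mean of the same normalized density gives the conditional-mean (Rao-Blackwellized) estimate used for variance reduction.

```python
import numpy as np

def griddy_gibbs_step(log_cond_post, lower, upper, n_grid=50, rng=None):
    """One Griddy-Gibbs update for a scalar parameter.

    log_cond_post : function theta -> log of the (unnormalized) full conditional
                    posterior, all other parameters held fixed
    lower, upper  : current window in which the conditional density is evaluated
    Returns (draw, conditional_mean).
    """
    rng = rng or np.random.default_rng()
    grid = np.linspace(lower, upper, n_grid)
    logp = np.array([log_cond_post(th) for th in grid])
    p = np.exp(logp - logp.max())              # guard against overflow
    p /= p.sum()                               # normalize on the grid
    cdf = np.cumsum(p)
    u = rng.uniform()
    draw = np.interp(u, cdf, grid)             # inverse-CDF draw (piecewise linear)
    cond_mean = np.sum(grid * p)               # conditional-mean estimate of theta_i
    return draw, cond_mean
```

In the application, one such update would be performed in turn for each of $C_0, \ldots, C_5$ and ν, with the window recentred at each iteration as described in the text.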
An important fact is that aggregated returns generally have a larger magnitude than non-aggregated returns, so the components with larger aggregations have smaller coefficient values. For this reason, we can use the impacts defined above to study the contribution of each component to the model.
In order to establish the number of components in the PHARCH model, we use information on financial market behavior based on the behavior of the traders. We consider five components, as seen in Table 1, corresponding to information arriving at the market at the rates of 15 min, 1 h, 1 day, 1 week and 1 month.
This means that we need to estimate the parameters of a PHARCH(5) process with aggregations 1, 4, 96, 480 and 1920, as follows:
$$r_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = C_0 + C_1 r_{t-1}^2 + C_2 \left( r_{t-1} + \cdots + r_{t-4} \right)^2 + C_3 \left( r_{t-1} + \cdots + r_{t-96} \right)^2 + C_4 \left( r_{t-1} + \cdots + r_{t-480} \right)^2 + C_5 \left( r_{t-1} + \cdots + r_{t-1920} \right)^2,$$
where $C_j \geq 0$, $j = 1, \ldots, 5$, $C_0 > 0$, and $\varepsilon_t \sim t(0, 1, \nu)$.
The number of parameters to estimate is seven, because we consider $\varepsilon_t \sim t(0, 1, \nu)$, $\nu > 2$. We use an autoregressive process to filter the data and to take into account the information given by the autocorrelation function of the returns shown in Figure 3.
We consider non-informative priors, that is, uniform distributions on the parameter space, as follows: $C_0, C_1, C_2, C_3, C_4, C_5 \sim U(0, 1)$ and $\nu \sim U(3, c_t)$, where U denotes the uniform distribution and $c_t$ is a large number; in particular, we used $c_t = 50$.
Estimates using maximum likelihood (ML) are shown in Table 2, and the corresponding impacts are shown in Figure 5. The optimizer used was simulated annealing; see Bélisle (1992). Several problems were faced when using ML, because in some situations the optimizer did not converge. Sometimes this can be solved by using initial values near the optimum, but such values are rarely available in real applications; hence the need for alternative procedures.
As we can see, the impact of the components decreases for larger aggregations. This is a natural result, because intraday traders dominate the market. Another fact is that the weekly component has an impact similar to the monthly component, meaning that both have similar weight contributions to predicting volatility. The results show that an impact can be significant even when the parameter is small. These estimates will serve as a comparison with the Griddy-Gibbs estimates.
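As a numerical illustration (a back-of-the-envelope check based on the estimates reported in Table 2, using the ν/(ν − 2) correction for the Student-t innovations), the impacts can be computed directly:

```python
import numpy as np

a = np.array([1, 4, 96, 480, 1920])                             # aggregations
C = np.array([0.101, 0.0173, 0.000374, 0.0000415, 0.0000105])   # ML estimates (Table 2)
nu = 3.638

I = (nu / (nu - 2)) * a * C       # impacts I_i for Student-t innovations
print(np.round(I, 3))             # roughly [0.224, 0.154, 0.080, 0.044, 0.045]
print(I.sum())                    # about 0.55 < 1: the stationarity condition holds
```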
In the Griddy–Gibbs sampling we use a moving window: we define a new window in each iteration as a function of the mean, mode and standard deviation.
Table 3 shows the results of the estimation of a parsimonious HARCH(5) model for the Euro-Dollar, using this moving-window selection criterion for each parameter, over which the conditional density is computed. The number of points at which this density was evaluated was 50.
We used both the non-conditional and the conditional mean at each iteration step to calculate the parameter estimates. As expected, the conditional method was faster than the non-conditional one, but the difference was very small.
We see that the values are practically the same by both methods (maximum likelihood and Griddy-Gibbs sampling).
Figure 6 shows the convergence of the parameters along the Griddy-Gibbs iterations; convergence is fast.
Now we compare PHARCH modeling with GARCH modeling. In Figure 7, Figure 8 and Figure 9 we present a residual analysis after fitting a GARCH model. In Figure 10, Figure 11 and Figure 12 we have the corresponding graphs for the PHARCH(5) fit. We see a slightly better fit of the PHARCH model. If we use the prediction mean squared error (PMSE) as a criterion for comparison, we obtain values of 15.58 and 15.20 for the GARCH and PHARCH models, respectively, using the standardized residuals and 1000 values for the prediction period.

6. Conclusions

PHARCH models are good models for the analysis of high frequency data, since financial market agents behave differently, incorporating heterogeneous information into the market microstructure. Nevertheless, their use still depends on solving some issues, mainly computational ones.
One big challenge in the analysis of high frequency data is dealing with large numbers of observations, and complex models bring computational difficulties, even with recent breakthroughs in computing technology. Therefore, the first issue is to develop techniques that help improve the computational algorithms. Maximum likelihood estimation may collapse, as we described earlier. Techniques such as genetic algorithms and neural networks are viable optimization alternatives.
Another possibility is to use Bayesian techniques, such as the Griddy-Gibbs sampler that we have used. The disadvantage of the Griddy-Gibbs sampler lies in its high computational load. From another viewpoint, more sophisticated volatility models might be developed taking into account the arrival of information, for example a stochastic volatility or stochastic duration model; or we could adapt existing models, such as CHARN (conditional heteroscedastic autoregressive nonlinear) models, to the heterogeneity of information. Finally, extensions similar to those proposed for GARCH models could be studied for HARCH models.
A feature of HARCH models is that the market components are chosen in a subjective way. In the analysis of the Euro-Dollar series, we considered five components, with different aggregations. A different number of components could be proposed, depending on the degree of information one has. This is clearly a matter for further studies.
One last remark is that the performance of the different estimation methods should be evaluated. This evaluation could be done using prediction capabilities, for example. Another possibility is to calculate some measure of risk. Volatility models are often built for computing the VaR (value at risk) or another risk measure, or for establishing trading strategies. In this context, an evaluation of the performance of the proposed model and of the several estimation procedures would be interesting. A comparison of the returns of different trading systems that use the proposed model would be of fundamental importance. Further details on these aspects can be found in Acar and Satchell (2002), Dunis et al. (2003), Ghysels and Jasiak (1994) and Park and Irwin (2005).
Other models for high frequency data use realized volatility as a basis, instead of models such as ours and others of the ARCH family, which assume that volatility is a latent variable. Among the former, we mention the autoregressive fractionally integrated moving average (ARFIMA) models, the heterogeneous autoregressive model of realized volatility (HAR-RV) of Corsi (2009) and the mixed data sampling regression (MIDAS) proposed by Ghysels, Santa-Clara and Valkanov (2002). A comparison of the PHARCH models with HAR and MIDAS would be useful but, due to the length of the present paper, this will be the object of future research.

Author Contributions

The authors study some theoretical properties of the PHARCH models and illustrate the theory with an application to real data. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially funded by Fapesp grant 2013/00506-1.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Proposition 1.
Let $X_t$ follow (3) and (5). Assume that the innovation distribution has a non-zero absolutely continuous component, with a positive density on a Borel set with a non-empty interior. Examples are the normal and the Student's t distributions.
Then $X_t$ can be written as $X_t = H(X_{t-1}, \varepsilon_t)$, where H is a non-linear continuous function for each fixed $\varepsilon_t$. Using the Continuous Mapping Theorem we obtain the weak convergence of $X_t$, namely, the conditional distribution of $X_t$ given $X_{t-1} = y_k$ converges to the conditional distribution of $X_t$ given $X_{t-1} = y$ if $y_k \to y$. So, the Markov chain $X_t$ that represents the PHARCH(m,p) process is a T-chain. □
Proof of Proposition 2.
Define the function V as
$$V(X_t) := \sum_{i=1}^{a_m - 1} \alpha_i r_{t-i}^2 + 2 \sum_{i=1}^{a_m - 2} \sum_{j=i+1}^{a_m - 1} \beta_j r_{t-i} r_{t-j} + \sum_{i=0}^{p-1} \gamma_i \sigma_{t-i}^2.$$
A simple algebraic computation gives
$$E\left[ \sum_{i=1}^{a_m - 1} \alpha_i r_{t+1-i}^2 \,\middle|\, X_t \right] = \alpha_1 \sigma_t^2 + \sum_{i=1}^{a_m - 2} \alpha_{i+1} r_{t-i}^2,$$
$$E\left[ 2 \sum_{i=1}^{a_m - 2} \sum_{j=i+1}^{a_m - 1} \beta_j r_{t+1-i} r_{t+1-j} \,\middle|\, X_t \right] = 2 \sum_{i=1}^{a_m - 3} \sum_{j=i+1}^{a_m - 2} \beta_{j+1} r_{t-i} r_{t-j},$$
and using (3) we have
$$E\left[ \sum_{i=0}^{p-1} \gamma_i \sigma_{t+1-i}^2 \,\middle|\, X_t \right] = \gamma_0 \left( C_0 + \sum_{i=1}^{m} C_i \left( \sigma_t^2 + \left( \sum_{j=1}^{a_i - 1} r_{t-j} \right)^2 \right) + \sum_{i=1}^{p} b_i \sigma_{t+1-i}^2 \right) + \sum_{i=1}^{p-1} \gamma_i \sigma_{t+1-i}^2.$$
Therefore, taking $\alpha_{a_m} = \beta_{a_m} = \gamma_p = 0$ and $a_0 = 1$, and grouping terms, we have
$$\begin{aligned} V(X_t) - E\left[ V(X_{t+1}) \mid X_t \right] &= \sum_{k=1}^{m} \sum_{j=a_{k-1}}^{a_k - 1} \left( \alpha_j - \alpha_{j+1} - \gamma_0 \sum_{i=k}^{m} C_i \right) r_{t-j}^2 \\ &\quad + \sum_{l=1}^{m} \sum_{j=1}^{a_l - 2} \sum_{k=a_{l-1}+j}^{a_l - 1} \left( \beta_k - \beta_{k+1} - \gamma_0 \sum_{i=l}^{m} C_i \right) r_{t-j} r_{t-k} \\ &\quad + \sum_{i=1}^{p-1} \left( \gamma_i - \gamma_{i+1} - \gamma_0 b_{i+1} \right) \sigma_{t-i}^2 + \left( \gamma_0 - \gamma_1 - \gamma_0 b_1 - \alpha_1 - \gamma_0 \sum_{i=1}^{m} C_i \right) \sigma_t^2 - \gamma_0 C_0. \end{aligned}$$
We choose, for $k \in \mathbb{Z}_+$ with $a_{l-1} < k < a_l$, $l \in \{1, \ldots, m\}$, $\beta_k = \beta_{k+1} + \sum_{i=l}^{m} C_i$.
If we take $\alpha_j > \beta_j$ for all j, then $V(X_t) \geq 0$, and we can take $\gamma_0 = 1$, so
$$V(X_t) - E\left[ V(X_{t+1}) \mid X_t \right] = \sum_{k=1}^{m} \sum_{j=a_{k-1}}^{a_k - 1} \left( \alpha_j - \alpha_{j+1} - \sum_{i=k}^{m} C_i \right) r_{t-j}^2 + \sum_{i=1}^{p-1} \left( \gamma_i - \gamma_{i+1} - b_{i+1} \right) \sigma_{t-i}^2 + \left( 1 - \gamma_1 - b_1 - \alpha_1 - \sum_{i=1}^{m} C_i \right) \sigma_t^2 - C_0. \tag{A1}$$
Similarly, we can choose, for $k \in \mathbb{Z}_+$ with $a_{l-1} < k < a_l$, $l \in \{1, \ldots, m\}$, $\alpha_k > \alpha_{k+1} + \sum_{i=l}^{m} C_i$ and $\gamma_i > \gamma_{i+1} + b_{i+1}$.
If $\sum_{i=1}^{m} a_i C_i + \sum_{i=1}^{p} b_i < 1$, then for the coefficients in Equation (A1) we have:
$\xi_i = \alpha_i - \alpha_{i+1} - \sum_{j=k}^{m} C_j > 0$, $i = 1, \ldots, a_m - 1$, where k is chosen such that $i \in [a_{k-1}, a_k)$;
$\xi_{a_m - 1 + i} = \gamma_i - \gamma_{i+1} - b_{i+1} > 0$, $i = 1, \ldots, p-1$; and
$\xi_{a_m + p - 1} = 1 - \gamma_1 - b_1 - \alpha_1 - \sum_{i=1}^{m} C_i > 0$.
Thus, $V(X_t) - E\left[ V(X_{t+1}) \mid X_t \right]$ can be made as large as we want for $X_t \in C^c$, where C is a suitable compact set; in particular, it is non-negative there.
Then, using Lemma 1 we have that there exists an invariant measure, finite on compact sets of Ω .
Choosing $\varpi_{\min} = \min_i (\xi_i)$, $i = 1, \ldots, a_m + p - 1$, we have
$$V(X_t) - E\left[ V(X_{t+1}) \mid X_t \right] \geq \varpi_{\min} \sum_{j=1}^{a_m - 1} r_{t-j}^2 + \varpi_{\min} \sum_{i=0}^{p-1} \sigma_{t-i}^2 - C_0 \geq 1 - (C_0 + 1)\, I_{B\left(0, \sqrt{(C_0+1)/\varpi_{\min}}\right)}(X_t),$$
where $B(c, r)$ is the ball with center c and radius r.
Therefore, the Markov chain $X_t = \left( r_{t-1}, \ldots, r_{t-a_m+1}, \sigma_t, \ldots, \sigma_{t-p+1} \right)$ that represents the PHARCH(m,p) process is recurrent, with an invariant probability measure (stationary distribution).
Now, if we consider $f(x_1, \ldots, x_{a_m+p-1}) = x_1^2 + \cdots + x_{a_m+p-1}^2$, then
$$V(X_t) - E\left[ V(X_{t+1}) \mid X_t \right] \geq \frac{\varpi_{\min}}{2}\, f\left( r_{t-1}, \ldots, r_{t-a_m+1}, \sigma_t, \ldots, \sigma_{t-p+1} \right) - C_0\, I_{B\left(0, \sqrt{2 C_0 / \varpi_{\min}}\right)}(X_t).$$
We conclude, using Lemmas 2 and 3, that the process X t is a T chain having a stationary distribution with finite second order moments. □

References

  1. Acar, Emmanuel, and Stephen Satchell. 2002. Advanced Trading Rules. Oxford: Butterworth Heinemann.
  2. Andersen, Torben G., and Tim Bollerslev. 1998. Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review 39: 885–905.
  3. Bauwens, Luc, and Michel Lubrano. 1998. Bayesian inference on GARCH models using the Gibbs sampler. Econometrics Journal 1: 23–46.
  4. Bélisle, Claude J. 1992. Convergence theorems for a class of simulated annealing algorithms on R^d. Journal of Applied Probability 29: 885–95.
  5. Bollerslev, Tim. 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31: 307–27.
  6. Corsi, Fulvio. 2009. A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics 7: 174–96.
  7. Dacorogna, Michel M., Ramazan Gençay, Ulrich Müller, Richard B. Olsen, and Olivier V. Pictet. 2001. An Introduction to High-Frequency Finance. Amsterdam: Elsevier.
  8. Dacorogna, Michel M., Ulrich A. Müller, Paul Embrechts, and Gennady Samorodnitsky. 1996. How heavy are the tails of a stationary HARCH(k) process? A study of the moments. In Stochastic Processes and Related Topics. Boston: Birkhäuser.
  9. Davis, Philip J., and Philip Rabinowitz. 1975. Methods of Numerical Integration. New York: Academic Press.
  10. Dunis, Christian L., Jason Laws, and Patrick Naïm. 2003. Applied Quantitative Methods for Trading and Investment. Chichester: John Wiley & Sons Ltd.
  11. Engle, Robert F. 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 50: 987–1008.
  12. Geweke, John. 1989. Exact predictive densities in linear models with ARCH disturbances. Journal of Econometrics 40: 63–86.
  13. Geweke, John. 1994. Bayesian Comparison of Econometric Models. Working Paper 532. Minneapolis: Research Department, Federal Reserve Bank of Minneapolis.
  14. Ghysels, Eric, and Joanna Jasiak. 1994. Stochastic Volatility and Time Deformation: An Application to Trading Volume and Leverage Effects. Technical report. Santa Fe: Western Finance Association Meeting.
  15. Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov. 2002. The MIDAS Touch: Mixed Data Sampling Regression Models. Working paper. Chapel Hill and Los Angeles: UNC and UCLA.
  16. Kleibergen, Frank, and Herman K. Van Dijk. 1993. Non-stationarity in GARCH models: A Bayesian analysis. Journal of Applied Econometrics 8: 41–61.
  17. Martens, Martin, Yuan-Chen Chang, and Stephen J. Taylor. 2002. A comparison of seasonal adjustment methods when forecasting intraday volatility. Journal of Financial Research 25: 283–99.
  18. Meyn, Sean P., and Richard L. Tweedie. 1996. Markov Chains and Stochastic Stability. Heidelberg: Springer.
  19. Müller, Ulrich, Michel M. Dacorogna, Rakhal D. Davé, Richard B. Olsen, Olivier V. Pictet, and Jacob E. von Weizsäcker. 1997. Volatilities of different time resolutions—Analyzing the dynamics of market components. Journal of Empirical Finance 4: 213–39.
  20. Park, Cheol-Ho, and Scott H. Irwin. 2005. The Profitability of Technical Trading Rules in US Futures Markets: A Data Snooping Free Test. AgMAS Project Research Report. Urbana-Champaign: University of Illinois.
  21. Ritter, Christian, and Martin A. Tanner. 1992. Facilitating the Gibbs sampler: The Gibbs stopper and the griddy-Gibbs sampler. Journal of the American Statistical Association 87: 861–68.
  22. Ruilova, Juan Carlos. 2007. Modelos ARCH Heterogêneos e Aplicações à Análise de Dados de Alta Freqüência. Ph.D. thesis, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil.
  23. Taylor, Stephen J., and Xinzhong Xu. 1997. The incremental volatility information in one million foreign exchange quotations. Journal of Empirical Finance 4: 317–40.
  24. Zumbach, Gilles. 2000. The pitfalls in fitting GARCH(1,1) processes. In Advances in Quantitative Asset Management. Boston: Springer.
Figure 1. Simulations of ARCH, HARCH and PHARCH processes.
Figure 2. Euro-Dollar returns: acf of returns, acf of absolute returns and acf of squared returns.
Figure 3. Autocorrelation functions of the Euro-Dollar returns and squared returns after seasonal adjustment.
Figure 4. Euro-Dollar returns, absolute returns, squared returns and histogram after removing the seasonal pattern.
Figure 5. Impact of the components estimated by maximum likelihood.
Figure 6. Convergence of the parameters using the Griddy-Gibbs sampler.
Figure 7. Euro-Dollar residuals, absolute residuals, squared residuals and histogram of the residuals after fitting a GARCH process.
Figure 8. Autocorrelation and partial autocorrelation functions of the residuals, absolute residuals and squared residuals after GARCH fitting.
Figure 9. QQ-plot of GARCH residuals.
Figure 10. Euro-Dollar residuals, absolute residuals, squared residuals and histogram of the residuals after fitting a PHARCH process.
Figure 11. Autocorrelation and partial autocorrelation functions of the residuals, absolute residuals and squared residuals after PHARCH fitting.
Figure 12. QQ-plot of PHARCH residuals.
Table 1. Component description of PHARCH process for Euro-Dollar.
Component | Aggregation (Euro-Dollar) | Range of Time Intervals | Description
1 | 1 | 15 min | Short term, intraday traders, market makers.
2 | 4 | 1 h | Intraday traders with few transactions per day.
3 | 96 | 1 day | Daily traders.
4 | 480 | 1 week | Medium term traders.
5 | 1920 | 1 month | Long term traders, investors, derivative traders.
Table 2. Parameter estimation by maximum likelihood using the simulated annealing optimizer.
Parameter | Estimate
C_0 | 0.529
C_1 | 0.101
C_2 | 0.0173
C_3 | 0.000374
C_4 | 0.0000415
C_5 | 0.0000105
ν | 3.638
Table 3. Estimated parameters for the PHARCH(5) model with aggregations 1, 4, 96, 480 and 1920 for the Euro-Dollar series, using Griddy-Gibbs sampling.
Parameter | Non-Conditional | Conditional | Std. Dev.
C_0 | 0.532 | 0.532 | 0.00471
C_1 | 0.102 | 0.102 | 0.00321
C_2 | 0.0174 | 0.0174 | 0.000709
C_3 | 0.00359 | 0.000360 | 0.0000205
C_4 | 0.0000402 | 0.0000403 | 0.00000421
C_5 | 0.0000104 | 0.0000103 | 0.00000105
ν | 3.649 | 3.649 | 0.034
