A Note on the Asymptotic Normality Theory of the Least Squares Estimates in Multivariate HAR-RV Models

Won-Tak Hong; Jiwon Lee; Eunju Hwang

doi:10.3390/math8112083

¹ College of International Studies, KyungHee University, Yongin 446-701, Korea

² Department of Applied Statistics, Gachon University, Seongnam 13120, Korea

* Author to whom correspondence should be addressed.

Mathematics 2020, 8(11), 2083; https://doi.org/10.3390/math8112083

Abstract

In this work, multivariate heterogeneous autoregressive-realized volatility (HAR-RV) models are discussed together with their least squares estimation. We consider multivariate HAR models of order p with q multiple assets to explore the relationships between the volatilities of two or more assets. The strictly stationary solution of the HAR($p, q$) model is investigated, and the asymptotic normality of the least squares estimates is established for both i.i.d. and correlated errors. In addition, an exponentially weighted multivariate HAR model with a common decay rate on the coefficients is discussed together with the estimation of the common rate. A Monte Carlo simulation is conducted to validate the estimators: the sample mean and standard error of the estimates, as well as the empirical coverage and average length of the confidence intervals, are calculated. Lastly, the model is applied to real volatility data for the gold spot price and the S&P index, and it is shown that the bivariate HAR model fitted with optimally selected lags and estimated coefficients matches the volatility of the financial data well.

Keywords: multivariate HAR models; least squares estimation; asymptotic normality; exponentially weighted HAR models

1. Introduction

Volatility in financial asset returns is one of the most important components of the financial market for optimization decisions such as portfolio selection, risk management and asset pricing. Over recent decades, extensive research by statisticians and econometricians has analyzed volatility using time series models. In particular, because multiple assets in financial markets are correlated with each other, the cross-correlation of two or more asset returns and the spillover effect of volatility have been represented in multivariate rather than univariate time series models. Probabilistic properties such as the long memory of volatility, which means that historical volatility has a persistent impact on future volatility, have been incorporated efficiently into time series models through various statistical techniques.

In order to capture the long-memory persistence of volatility, Corsi [1] suggested a simple but remarkably efficient time series model with additive cascades defined on different time periods, incorporating daily, weekly and monthly volatility components. Corsi’s model [1], called the heterogeneous autoregressive-realized volatility model of order 3 (HAR-RV(3), or simply HAR(3)), is a type of linear autoregression with heterogeneous regressors: the previous lag, the lagged weekly moving average, and the lagged monthly moving average. Since its introduction, various adapted versions of the HAR model have been used to analyze volatility in empirical data. Here, we refer to [2,3,4,5] for univariate data and [6,7,8,9,10,11] for multivariate data.

In particular, Bubak et al. [6] used a multivariate extension of the HAR model to uncover volatility transmission between Central European currencies and the EUR/USD foreign exchange rate, whereas Soucek and Todorova [8] employed a bivariate HAR model to explore the relationship between equity and oil market volatility. Cubadda et al. [9] proposed a vector HAR index model for detecting the presence of co-movements and analyzing the joint behavior in a set of daily realized volatility measures. Cech and Barunik [10] proposed a generalized HAR model for dynamic covariance matrix modelling and forecasting.

All of the references above dealt with the multivariate HAR models only empirically and did not present theoretical results for the HAR model. Indeed, there are only a few theoretical analyses of the HAR model, even in the univariate case. Furthermore, the least squares estimate (LSE) method has been widely used as an estimation tool in time series data analysis. The LSEs are often used in the literature to analyze multivariate data in multivariate HAR models, and good volatility forecasting performance has been reported; see, for instance, [9,12]. However, for the multivariate HAR models, a rigorous formulation of the asymptotic multivariate normality of the LSE has not yet been established, in spite of the frequent use of the LSEs. The earlier studies motivate us to consider multivariate HAR models of order p with q multiple assets and to look into the theoretical analysis of the LSEs. In this work we provide, as main results, the strict stationarity of the HAR($p, q$) model and the asymptotic normality of the LSEs for the parameters in the multivariate HAR models. The multivariate normality of the LSEs is established in two cases: i.i.d. errors and correlated errors. In addition, we consider an exponentially weighted multivariate HAR model and discuss the estimation of a common decay rate on the coefficients in that model.

A Monte Carlo experiment is conducted to illustrate the performance of the estimators. The sample mean and standard error of the LSEs, as well as the empirical coverage and average length of the confidence intervals, are calculated. In addition, we model the volatility of the gold spot price and the S&P 500 index using a bivariate HAR model and estimate the coefficient parameters. The parameter estimates and confidence intervals are computed according to the asymptotic normality we derived. Finally, the bivariate HAR model is fitted to the market data with lags selected optimally under some selection criteria.

The remainder of the paper is organized as follows. In Section 2, the multivariate HAR model is presented with a discussion of its strictly stationary solution, and in Section 3 the LSEs and their multivariate normality are derived. Section 4 deals with the exponentially weighted multivariate HAR model. A Monte Carlo experiment is performed in Section 5, and an application to market data is given in Section 6. We conclude in Section 7, where the main conclusions and a discussion of future work are stated.

2. Multivariate HAR($p, q$) Models

We consider the following HAR-RV model $\{Y_{1,t} : t \in \mathbb{Z}\}$ with multivariate data $\{Y_{j,t} : t \in \mathbb{Z},\ j = 2, \dots, q\}$, which has been discussed in [8] as an extension of the HAR-RV(3) model in [1]:

$$Y_{1,t} \equiv Y_{1,t}^{(d)} = \beta_{0} + \sum_{i = d, w, m} \beta_{1}^{(i)} Y_{1,t-1}^{(i)} + \sum_{i = d, w, m} \beta_{2}^{(i)} Y_{2,t-1}^{(i)} + \dots + \sum_{i = d, w, m} \beta_{q}^{(i)} Y_{q,t-1}^{(i)} + \epsilon_{1,t}, \tag{1}$$

where $\{\epsilon_{1,t},\ t \in \mathbb{Z}\}$ are random error terms and $\{\beta_{0}, \beta_{j}^{(i)} : j = 1, \dots, q;\ i = d, w, m\}$ are coefficient parameters. For each asset $j = 1, 2, \dots, q$, the three volatility components $Y_{j,t-1}^{(i)}$, $i = d, w, m$, correspond to time horizons of one day (d), one week (w) and one month (m). For $i = d$ the component is the realized volatility $Y_{j,t-1}$ on day $t - 1$, and for $i = w, m$ it is the weekly or monthly average of realized volatility,

$$Y_{j,t-1}^{(i)} = \frac{1}{h_{i}} \left(Y_{j,t-1}^{(d)} + \dots + Y_{j,t-h_{i}}^{(d)}\right), \quad j = 1, \dots, q,$$

with $h_{d} = 1$, $h_{w} = 5$, $h_{m} = 22$. Soucek and Todorova [8] examined the volatility linkages between oil and three equity markets in three separate bivariate models, that is, the case of assets $j = 1, 2$ with $q = 2$ in model (1).
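The daily, weekly and monthly components defined above are plain trailing averages of the daily series, which can be sketched in a few lines. This is an illustrative sketch only, not code from the paper; the function name `har_components` and the toy series are assumptions made for the example.

```python
def har_components(series, t, horizons=(1, 5, 22)):
    """Return the lagged averages Y_{t-1}^{(i)} for each horizon h_i.

    series[s] holds the daily realized volatility Y_s; the component for
    horizon h is the mean of series[t-h], ..., series[t-1].
    """
    if t < max(horizons):
        raise ValueError("need at least max(horizons) observations before t")
    return [sum(series[t - h:t]) / h for h in horizons]


# Toy series Y_s = s (an assumption), so the averages are easy to check by hand.
toy = [float(s) for s in range(30)]
daily, weekly, monthly = har_components(toy, t=25)
print(daily, weekly, monthly)
```

With the toy series, the daily component is the single value at day 24, and the weekly and monthly components are the means of days 20 to 24 and days 3 to 24, respectively.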

Motivated by the above, we consider a general multivariate HAR-RV($p, q$) model $\{Y_{j,t} : t \in \mathbb{Z},\ j = 1, \dots, q\}$:

$$Y_{j,t}^{(1)} = \beta_{j0} + \sum_{i=1}^{p} \beta_{j1}^{(i)} Y_{1,t-1}^{(i)} + \sum_{i=1}^{p} \beta_{j2}^{(i)} Y_{2,t-1}^{(i)} + \dots + \sum_{i=1}^{p} \beta_{jq}^{(i)} Y_{q,t-1}^{(i)} + \epsilon_{j,t}, \tag{2}$$

where $\{\epsilon_{j,t} : t \in \mathbb{Z},\ j = 1, \dots, q\}$ are random errors with mean zero and finite variance, the horizons $\{h_{i} : i = 1, 2, \dots, p\}$ are assumed to satisfy $1 = h_{1} < h_{2} < \dots < h_{p} < \infty$, and $\{\beta_{j0}, \beta_{jk}^{(i)} : j, k \in \{1, \dots, q\};\ i \in \{1, \dots, p\}\}$ are parameters to be estimated. The volatility components are defined in the same way as above: for $i = 1, \dots, p$ and $j = 1, \dots, q$,

$$Y_{j,t-1}^{(i)} = \frac{1}{h_{i}} \left(Y_{j,t-1}^{(1)} + \dots + Y_{j,t-h_{i}}^{(1)}\right).$$

Model (2) is more general than model (1), and thus we consider model (2) in this work. As main results, we investigate the strict stationarity of the model and develop the estimation of its parameters.

Let $Y_{t} = (Y_{1,t}, Y_{2,t}, \dots, Y_{q,t})^{\top}$ with $Y_{j,t} \equiv Y_{j,t}^{(1)}$ for $j = 1, \dots, q$. Model (2) can be written as follows:

$$Y_{j,t} = \beta_{j0} + \sum_{k=1}^{h_{p}} \phi_{k}^{(j,1)} Y_{1,t-k} + \sum_{k=1}^{h_{p}} \phi_{k}^{(j,2)} Y_{2,t-k} + \dots + \sum_{k=1}^{h_{p}} \phi_{k}^{(j,q)} Y_{q,t-k} + \epsilon_{j,t},$$

where, for $j, \ell \in \{1, 2, \dots, q\}$ (with $h_{1} = 1$),

$$\phi_{1}^{(j,\ell)} = \sum_{i=1}^{p} \beta_{j\ell}^{(i)} / h_{i}, \qquad \phi_{h_{1}+1}^{(j,\ell)} = \dots = \phi_{h_{2}}^{(j,\ell)} = \sum_{i=2}^{p} \beta_{j\ell}^{(i)} / h_{i},$$

$$\phi_{h_{2}+1}^{(j,\ell)} = \dots = \phi_{h_{3}}^{(j,\ell)} = \sum_{i=3}^{p} \beta_{j\ell}^{(i)} / h_{i}, \qquad \dots, \qquad \phi_{h_{p-1}+1}^{(j,\ell)} = \dots = \phi_{h_{p}}^{(j,\ell)} = \beta_{j\ell}^{(p)} / h_{p}.$$
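The mapping from the HAR coefficients $\beta_{j\ell}^{(i)}$ to the lag coefficients $\phi_{k}^{(j,\ell)}$ can be checked numerically for a single pair $(j, \ell)$. The sketch below is illustrative only; the function name `har_to_var_coeffs` and the coefficient values are assumptions, not taken from the paper.

```python
def har_to_var_coeffs(beta, horizons):
    """Map HAR coefficients beta^{(i)} to lag coefficients phi_k, k = 1..h_p.

    phi_k is the sum of beta^{(i)} / h_i over all horizons with h_i >= k,
    because the i-th moving average contains lag k exactly when k <= h_i.
    """
    h_p = horizons[-1]
    return [
        sum(b / h for b, h in zip(beta, horizons) if h >= k)
        for k in range(1, h_p + 1)
    ]


beta = [0.4, 0.3, 0.2]      # illustrative beta^{(1)}, beta^{(2)}, beta^{(3)}
horizons = [1, 5, 22]       # h_1 < h_2 < h_3
phi = har_to_var_coeffs(beta, horizons)
print(len(phi), phi[0], phi[1], phi[-1])
```

The lag coefficients are piecewise constant between consecutive horizons, and they sum to $\beta^{(1)} + \dots + \beta^{(p)}$, since each $\beta^{(i)}/h_{i}$ is counted $h_{i}$ times.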

We write, for each $j = 1, 2, \dots, q$,

$$Y_{j,t} = \beta_{j0} + \sum_{k=1}^{h_{p}} \phi_{j,k}^{\top} Y_{t-k} + \epsilon_{j,t},$$

where $\phi_{j,k} = (\phi_{k}^{(j,1)}, \phi_{k}^{(j,2)}, \dots, \phi_{k}^{(j,q)})^{\top}$, and thus

$$Y_{t} = B_{0} + H_{t-1}^{*} + E_{t} = B_{0} + H Y_{t-1}^{*} + E_{t},$$

where $B_{0} = (\beta_{10}, \beta_{20}, \dots, \beta_{q0})^{\top}$, $E_{t} = (\epsilon_{1,t}, \epsilon_{2,t}, \dots, \epsilon_{q,t})^{\top}$ and

$$H_{t-1}^{*} = \begin{bmatrix} \sum_{k=1}^{h_{p}} \phi_{1,k}^{\top} Y_{t-k} \\ \sum_{k=1}^{h_{p}} \phi_{2,k}^{\top} Y_{t-k} \\ \vdots \\ \sum_{k=1}^{h_{p}} \phi_{q,k}^{\top} Y_{t-k} \end{bmatrix} = \begin{bmatrix} \phi_{1,1}^{\top} & \phi_{1,2}^{\top} & \dots & \phi_{1,h_{p}}^{\top} \\ \phi_{2,1}^{\top} & \phi_{2,2}^{\top} & \dots & \phi_{2,h_{p}}^{\top} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{q,1}^{\top} & \phi_{q,2}^{\top} & \dots & \phi_{q,h_{p}}^{\top} \end{bmatrix} \begin{bmatrix} Y_{t-1} \\ Y_{t-2} \\ \vdots \\ Y_{t-h_{p}} \end{bmatrix} =: H Y_{t-1}^{*}.$$

Note that $H = (\phi_{j,k}^{\top})$, $j = 1, \dots, q$; $k = 1, 2, \dots, h_{p}$, is a matrix in $\mathbb{R}^{q \times q h_{p}}$ and $Y_{t-1}^{*}$ is a column vector in $\mathbb{R}^{q h_{p}}$. Furthermore, note that $H Y_{t-1}^{*}$ is written as

$$H Y_{t-1}^{*} = \Phi_{1} Y_{t-1} + \Phi_{2} Y_{t-2} + \dots + \Phi_{h_{p}} Y_{t-h_{p}},$$

where, for $k = 1, 2, \dots, h_{p}$,

$$\Phi_{k} = \begin{bmatrix} \phi_{1,k}^{\top} \\ \phi_{2,k}^{\top} \\ \vdots \\ \phi_{q,k}^{\top} \end{bmatrix} = \begin{bmatrix} \phi_{k}^{(1,1)} & \phi_{k}^{(1,2)} & \dots & \phi_{k}^{(1,q)} \\ \phi_{k}^{(2,1)} & \phi_{k}^{(2,2)} & \dots & \phi_{k}^{(2,q)} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{k}^{(q,1)} & \phi_{k}^{(q,2)} & \dots & \phi_{k}^{(q,q)} \end{bmatrix} \in \mathbb{R}^{q \times q}.$$

Therefore we have the following vector autoregression of order $h_{p}$:

$$Y_{t} = B_{0} + \Phi_{1} Y_{t-1} + \Phi_{2} Y_{t-2} + \dots + \Phi_{h_{p}} Y_{t-h_{p}} + E_{t}, \tag{3}$$

which is written as

$$(I_{q} - \Phi_{1} L - \Phi_{2} L^{2} - \dots - \Phi_{h_{p}} L^{h_{p}}) Y_{t} = B_{0} + E_{t}$$

with lag operator $L$ and the identity matrix $I_{q} \in \mathbb{R}^{q \times q}$.

We make the following assumption:

(A1) The roots of $\det(I_{q} - \Phi_{1} z - \Phi_{2} z^{2} - \dots - \Phi_{h_{p}} z^{h_{p}}) = 0$ lie outside the complex unit circle (that is, they have modulus greater than one), or equivalently, the eigenvalues of the companion matrix (of dimension $q h_{p} \times q h_{p}$)

$$F = \begin{bmatrix} \Phi_{1} & \Phi_{2} & \dots & \Phi_{h_{p}-1} & \Phi_{h_{p}} \\ I_{q} & O & \dots & O & O \\ O & I_{q} & \dots & O & O \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ O & O & \dots & I_{q} & O \end{bmatrix}$$

have modulus less than one.
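To make (A1) concrete, the companion matrix can be assembled from the blocks $\Phi_{1}, \dots, \Phi_{h_{p}}$ and its spectral radius bounded from above. The sketch below does not compute eigenvalues; it uses the standard bound $\rho(F) \leq \|F^{k}\|_{\infty}^{1/k}$, so a value below one is a sufficient check for (A1), not a necessary one. In practice an eigenvalue routine from a numerical library would be the usual choice. The function names and the coefficient matrices are assumptions for illustration.

```python
def companion(phis):
    """Build the companion matrix F from [Phi_1, ..., Phi_h], each q x q."""
    h, q = len(phis), len(phis[0])
    n = q * h
    F = [[0.0] * n for _ in range(n)]
    for k, Phi in enumerate(phis):          # first block row: Phi_1 ... Phi_h
        for r in range(q):
            for c in range(q):
                F[r][k * q + c] = Phi[r][c]
    for r in range(q, n):                   # identity blocks below the first block row
        F[r][r - q] = 1.0
    return F


def matmul(A, B):
    """Plain matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def power_norm_root(F, k):
    """Return ||F^k||_inf ** (1/k), an upper bound on the spectral radius of F."""
    P = F
    for _ in range(k - 1):
        P = matmul(P, F)
    norm = max(sum(abs(x) for x in row) for row in P)
    return norm ** (1.0 / k)


# Illustrative bivariate VAR(2) coefficients (assumed values).
Phi1 = [[0.4, 0.1], [0.0, 0.3]]
Phi2 = [[0.2, 0.0], [0.1, 0.1]]
F = companion([Phi1, Phi2])
print(power_norm_root(F, 20) < 1.0)   # True means the sufficient check for (A1) passes
```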

Denote by $[A]_{qq}$ the $q \times q$ submatrix of a square matrix $A$ consisting of the first $q$ rows and the first $q$ columns of $A$, that is, $[A]_{qq} = \mathcal{I}_{q} A \mathcal{I}_{q}^{\top}$, where $\mathcal{I}_{q} = (I_{q}, O) \in \mathbb{R}^{q \times q h_{p}}$ and $I$ denotes the identity matrix of compatible dimension.

Theorem 1.

Assume (A1) holds. Then there exists a unique strictly stationary solution with a finite first-order moment to the model, and the solution has the form

$$Y_{t} = [(I - F)^{-1}]_{qq} B_{0} + \sum_{j=0}^{\infty} [F^{j}]_{qq} E_{t-j}. \tag{4}$$

Proof of Theorem 1.

From (3) we write

$$\begin{bmatrix} Y_{t} \\ Y_{t-1} \\ Y_{t-2} \\ \vdots \\ Y_{t-h_{p}+1} \end{bmatrix} = \begin{bmatrix} B_{0} \\ O \\ O \\ \vdots \\ O \end{bmatrix} + \begin{bmatrix} \Phi_{1} & \Phi_{2} & \dots & \Phi_{h_{p}-1} & \Phi_{h_{p}} \\ I_{q} & O & \dots & O & O \\ O & I_{q} & \dots & O & O \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ O & O & \dots & I_{q} & O \end{bmatrix} \begin{bmatrix} Y_{t-1} \\ Y_{t-2} \\ \vdots \\ Y_{t-h_{p}+1} \\ Y_{t-h_{p}} \end{bmatrix} + \begin{bmatrix} E_{t} \\ O \\ O \\ \vdots \\ O \end{bmatrix},$$

which is denoted by

$$Y_{t}^{*} = B_{0}^{*} + F Y_{t-1}^{*} + E_{t}^{*}. \tag{5}$$

Let $\lambda_{i}(F)$ be the $i$th eigenvalue of the matrix $F$ and $\rho := \max_{i} |\lambda_{i}(F)|$. Under (A1), we have $\rho < 1$. Furthermore, it follows that $\lim_{m \to \infty} \sum_{j=0}^{m} F^{j} E_{t-j}^{*}$ exists, and hence

$$Y_{t}^{*} = (I - F)^{-1} B_{0}^{*} + \sum_{j=0}^{\infty} F^{j} E_{t-j}^{*}$$

exists. Thus $\{Y_{t}^{*}\}$ is strictly stationary and satisfies (5), as can be verified directly by substitution.

Let $[A]_{q}$ denote the $q \times 1$ column vector consisting of the first $q$ components of a column vector $A$. Since $[Y_{t}^{*}]_{q} = Y_{t}$, $[B_{0}^{*}]_{q} = B_{0}$ and $[E_{t}^{*}]_{q} = E_{t}$, the process $\{Y_{t}\}$ is also strictly stationary and

$$Y_{t} = [Y_{t}^{*}]_{q} = [(I - F)^{-1} B_{0}^{*}]_{q} + \Big[\sum_{j=0}^{\infty} F^{j} E_{t-j}^{*}\Big]_{q} = [(I - F)^{-1}]_{qq} B_{0} + \sum_{j=0}^{\infty} [F^{j}]_{qq} E_{t-j},$$

which is the desired result in (4). Now we show the uniqueness of the strictly stationary solution (4). Assume that $\{\breve{Y}_{t}\}$ is another strictly stationary solution of the vector autoregression (3) with $E \| \breve{Y}_{t} \| < \infty$. Then we have

$$\breve{Y}_{t}^{*} = B_{0}^{*} + F \breve{Y}_{t-1}^{*} + E_{t}^{*}$$

in the same way as in (5). For any positive integer $m$ we have

$$\breve{Y}_{t}^{*} = \sum_{j=0}^{m-1} F^{j} (B_{0}^{*} + E_{t-j}^{*}) + F^{m} \breve{Y}_{t-m}^{*},$$

and we observe

$$E \| Y_{t}^{*} - \breve{Y}_{t}^{*} \| = E \Big\| \sum_{j=m}^{\infty} F^{j} (B_{0}^{*} + E_{t-j}^{*}) - F^{m} \breve{Y}_{t-m}^{*} \Big\| \leq C \rho^{m},$$

where $C$ is a constant independent of $t$ and $m$. Since $m$ is arbitrary, we see that

$$E \| Y_{t} - \breve{Y}_{t} \| \leq E \| Y_{t}^{*} - \breve{Y}_{t}^{*} \| = 0,$$

and thus $Y_{t} = \breve{Y}_{t}$ almost surely. This completes the proof of uniqueness. □
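One consequence of representation (4) is that the stationary mean equals $(I_{q} - \Phi_{1} - \dots - \Phi_{h_{p}})^{-1} B_{0}$, and under (A1) the noise-free recursion converges to this value from any starting point. The following sketch compares the two numerically. The helper names (`solve`, `stationary_mean`, `iterate_noise_free`) and the coefficient values are assumptions for illustration, not part of the paper.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def stationary_mean(B0, phis):
    """Return mu = (I_q - Phi_1 - ... - Phi_h)^{-1} B0."""
    q = len(B0)
    A = [[(1.0 if r == c else 0.0) - sum(P[r][c] for P in phis) for c in range(q)]
         for r in range(q)]
    return solve(A, B0)


def iterate_noise_free(B0, phis, steps):
    """Iterate Y_t = B0 + sum_k Phi_k Y_{t-k} with E_t = 0, starting from zeros."""
    q, h = len(B0), len(phis)
    past = [[0.0] * q for _ in range(h)]    # past[k] holds Y_{t-1-k}
    for _ in range(steps):
        y = [B0[r] + sum(phis[k][r][c] * past[k][c]
                         for k in range(h) for c in range(q))
             for r in range(q)]
        past = [y] + past[:-1]
    return past[0]


# Illustrative stable bivariate VAR(2) (assumed values).
phis = [[[0.4, 0.1], [0.0, 0.3]], [[0.2, 0.0], [0.1, 0.1]]]
B0 = [1.0, 2.0]
print(stationary_mean(B0, phis))
print(iterate_noise_free(B0, phis, 500))
```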

3. The Least Squares Estimation

We consider two cases for the error processes when discussing the least squares estimates. In the first case, we assume that $\{\epsilon_{j,t}\}$ are i.i.d. random variables with mean zero and variance $\sigma^{2}$ for all $j = 1, 2, \dots, q$, that is, $E[E_{t} E_{t}^{\top}] = \sigma^{2} I_{q}$. In the second, more general case, we adopt correlated error processes with $E[E_{t} E_{t}^{\top}] = (\sigma_{jl}) \in \mathbb{R}^{q \times q}$, where $\sigma_{jl} = \mathrm{Cov}(\epsilon_{j,t}, \epsilon_{l,t}) \neq 0$.

First, we state the assumption for the case of i.i.d. error processes and investigate the ordinary least squares estimator (OLSE) of the parameters in model (2).

(A2) $\{\epsilon_{j,t}\}$ are i.i.d. random variables with mean zero and variance $\sigma^{2}$ for all $j = 1, 2, \dots, q$, that is, $E[E_{t} E_{t}^{\top}] = \sigma^{2} I_{q}$.

Under (A1) and (A2), let $\mu = (\mu_{1}, \dots, \mu_{q})^{\top} = E[Y_{t}]$ with $\mu_{j} = E[Y_{j,t}]$, and let $\Gamma(\ell) = \mathrm{Cov}(Y_{t}, Y_{t-\ell})$ be the lag-$\ell$ autocovariance matrix function. Note that, by using (4),

$$\mu = [(I - F)^{-1}]_{qq} B_{0} = (I_{q} - \Phi_{1} - \Phi_{2} - \dots - \Phi_{h_{p}})^{-1} B_{0}, \qquad \Gamma(0) = \sigma^{2} \sum_{j=0}^{\infty} [F^{j}]_{qq} [F^{j}]_{qq}^{\top},$$

$$\Gamma(\ell) = \sigma^{2} \sum_{j=0}^{\infty} [F^{j+\ell}]_{qq} [F^{j}]_{qq}^{\top} \ \text{ for } \ell > 0, \qquad \Gamma(\ell) = \sigma^{2} \sum_{j=0}^{\infty} [F^{j}]_{qq} [F^{j-\ell}]_{qq}^{\top} \ \text{ for } \ell < 0.$$

Suppose that data $\{Y_{-h_{p}+1}, \dots, Y_{-1}, Y_{0}, Y_{1}, \dots, Y_{n}\}$ are observed, where $Y_{t} = (Y_{1,t}, Y_{2,t}, \dots, Y_{q,t})^{\top}$ with $Y_{j,t} \equiv Y_{j,t}^{(1)}$ for $j = 1, \dots, q$.

Let

$$\beta_{j} = (\beta_{j0}, \beta_{j1}^{(1)}, \dots, \beta_{j1}^{(p)}, \beta_{j2}^{(1)}, \dots, \beta_{j2}^{(p)}, \dots, \beta_{jq}^{(1)}, \dots, \beta_{jq}^{(p)})^{\top} \in \mathbb{R}^{1 + pq},$$

$$X_{t-1} = (1, Y_{1,t-1}^{(1)}, \dots, Y_{1,t-1}^{(p)}, Y_{2,t-1}^{(1)}, \dots, Y_{2,t-1}^{(p)}, \dots, Y_{q,t-1}^{(1)}, \dots, Y_{q,t-1}^{(p)})^{\top} = (1, \bar{Y}_{1,t-1}^{\top}, \bar{Y}_{2,t-1}^{\top}, \dots, \bar{Y}_{q,t-1}^{\top})^{\top} \in \mathbb{R}^{1 + pq},$$

with $\bar{Y}_{j,t-1} = (Y_{j,t-1}^{(1)}, \dots, Y_{j,t-1}^{(p)})^{\top}$. Then Equation (2) is, for each $j$, given by

$$Y_{j,t} = \beta_{j}^{\top} X_{t-1} + \epsilon_{j,t} = X_{t-1}^{\top} \beta_{j} + \epsilon_{j,t}, \tag{6}$$

and its matrix form is

$$Y_{t} = B X_{t-1} + E_{t}, \qquad Y_{t}^{\top} = X_{t-1}^{\top} B^{\top} + E_{t}^{\top}, \tag{7}$$

where $B = (\beta_{1}, \beta_{2}, \dots, \beta_{q})^{\top}$ is a $q \times (1 + pq)$ matrix and $E_{t} = (\epsilon_{1,t}, \epsilon_{2,t}, \dots, \epsilon_{q,t})^{\top}$. The matrix form with $t = 1, 2, \dots, n$ is given by

$$Y = X B^{\top} + E,$$

where $Y = (Y_{1}, \dots, Y_{n})^{\top} \in \mathbb{R}^{n \times q}$, $X = (X_{0}, X_{1}, \dots, X_{n-1})^{\top} \in \mathbb{R}^{n \times (1 + pq)}$ and $E = (E_{1}, \dots, E_{n})^{\top} \in \mathbb{R}^{n \times q}$. The ordinary least squares estimator (OLSE) of $B$ is obtained by

$$\hat{B} = (\hat{\beta}_{1}, \hat{\beta}_{2}, \dots, \hat{\beta}_{q})^{\top} = \arg\min_{(\beta_{1}, \beta_{2}, \dots, \beta_{q})} \sum_{j=1}^{q} \sum_{t=1}^{n} \epsilon_{j,t}^{2}.$$

Its transpose is given by

$$\hat{B}^{\top} = (X^{\top} X)^{-1} (X^{\top} Y) = \Big(\sum_{t=1}^{n} X_{t-1} X_{t-1}^{\top}\Big)^{-1} \Big(\sum_{t=1}^{n} X_{t-1} Y_{t}^{\top}\Big).$$

Equivalently, from (6) with $Y_{j,t} = X_{t-1}^{\top} \beta_{j} + \epsilon_{j,t}$, whose matrix form with $t = 1, 2, \dots, n$ is given by

$$\tilde{Y}_{j} := (Y_{j,1}, Y_{j,2}, \dots, Y_{j,n})^{\top} = (X_{0}, X_{1}, \dots, X_{n-1})^{\top} \beta_{j} + \tilde{\epsilon}_{j}, \qquad \text{that is,} \quad \tilde{Y}_{j} = X \beta_{j} + \tilde{\epsilon}_{j},$$

where $\tilde{\epsilon}_{j} = (\epsilon_{j,1}, \epsilon_{j,2}, \dots, \epsilon_{j,n})^{\top}$, we obtain the OLSE of $\beta_{j}$ for each $j$ as follows:

$$\hat{\beta}_{j} = (X^{\top} X)^{-1} (X^{\top} \tilde{Y}_{j}) = \Big(\sum_{t=1}^{n} X_{t-1} X_{t-1}^{\top}\Big)^{-1} \Big(\sum_{t=1}^{n} X_{t-1} Y_{j,t}\Big).$$
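The closed form $\hat{B}^{\top} = (X^{\top} X)^{-1} X^{\top} Y$ can be exercised on simulated data. The sketch below builds the regressor vector $X_{t-1}$ for a small bivariate HAR model and solves the normal equations by Gauss-Jordan elimination. It is a minimal illustration under assumed coefficients, horizons and noise level; the names `design_row` and `ols` are hypothetical, and a production implementation would normally rely on a numerical linear algebra library.

```python
import random


def design_row(Y, t, horizons):
    """Return X_{t-1} = (1, Y_{1,t-1}^{(1..p)}, ..., Y_{q,t-1}^{(1..p)}) from rows Y[s] = Y_s."""
    q = len(Y[0])
    row = [1.0]
    for j in range(q):
        row += [sum(Y[s][j] for s in range(t - h, t)) / h for h in horizons]
    return row


def ols(X, Y):
    """Return B_hat^T = (X^T X)^{-1} X^T Y by Gauss-Jordan elimination."""
    m, q = len(X[0]), len(Y[0])
    # Augmented matrix [X^T X | X^T Y].
    M = [[sum(x[a] * x[b] for x in X) for b in range(m)] +
         [sum(x[a] * y[j] for x, y in zip(X, Y)) for j in range(q)]
         for a in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(m):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[m:] for row in M]           # (1 + p q) x q matrix


# Simulate a bivariate HAR model with p = 2 horizons (all values are assumptions).
random.seed(0)
horizons = [1, 3]
B_true = [[0.5, 0.3, 0.2, 0.1, 0.0],       # beta_1
          [0.2, 0.0, 0.1, 0.4, 0.1]]       # beta_2
Y = [[1.0, 1.0] for _ in range(max(horizons))]
for t in range(max(horizons), 600):
    x = design_row(Y, t, horizons)
    Y.append([sum(b * v for b, v in zip(B_true[j], x)) + random.gauss(0.0, 0.1)
              for j in range(2)])

ts = range(max(horizons), len(Y))
X = [design_row(Y, t, horizons) for t in ts]
targets = [Y[t] for t in ts]
B_hat_T = ols(X, targets)                  # column j estimates beta_{j+1}
for j in range(2):
    print([round(row[j], 3) for row in B_hat_T])
```

Each column of `B_hat_T` is the estimate $\hat{\beta}_{j}$ for one asset, so all $q$ regressions share the same matrix $X^{\top} X$.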

The following theorem states the asymptotic normality of the OLSE. We need the following notation for the theorem. Under the stationarity condition (A1), for $j, l \in \{1, \dots, q\}$, let $\mu_{j} = E[Y_{j,t}]$ and

$$\gamma_{jl}(k_{1}, k_{2}) = \mathrm{Cov}(Y_{j,t-k_{1}}, Y_{l,t-k_{2}}), \qquad \bar{\gamma}_{jl}(i_{1}, i_{2}) = \frac{1}{h_{i_{1}} h_{i_{2}}} \sum_{k_{1}=1}^{h_{i_{1}}} \sum_{k_{2}=1}^{h_{i_{2}}} \gamma_{jl}(k_{1}, k_{2}).$$

Furthermore, let $J_{p}$ be the $p \times p$ matrix with all components equal to one, and let $\mathrm{vec}(\cdot)$ be the usual operator that stacks the columns of a given matrix.

Theorem 2.

Assume that (A1) and (A2) hold, and that the matrix $X^{\top} X$ is of full rank.

(a) For each $j = 1, 2, \dots, q$, as $n \to \infty$ we have

$$\sqrt{n} (\hat{\beta}_{j} - \beta_{j}) \overset{d}{\longrightarrow} N(0, \sigma^{2} \Sigma^{-1}),$$

where $\Sigma$ is given by

$$\Sigma = E[X_{0} X_{0}^{\top}] = \begin{bmatrix} 1 & S_{01} & \dots & S_{0q} \\ S_{10} & S_{11} & \dots & S_{1q} \\ \vdots & \vdots & \ddots & \vdots \\ S_{q0} & S_{q1} & \dots & S_{qq} \end{bmatrix} \in \mathbb{R}^{(1 + pq) \times (1 + pq)}$$

with $S_{j0} = (\mu_{j}, \dots, \mu_{j})^{\top} = S_{0j}^{\top} \in \mathbb{R}^{p}$ and

$$S_{jl} = \begin{bmatrix} \bar{\gamma}_{jl}(1, 1) & \dots & \bar{\gamma}_{jl}(1, p) \\ \vdots & \ddots & \vdots \\ \bar{\gamma}_{jl}(p, 1) & \dots & \bar{\gamma}_{jl}(p, p) \end{bmatrix} + \mu_{j} \mu_{l} J_{p} \in \mathbb{R}^{p \times p}.$$

(b) As $n \to \infty$ we have

$$\frac{\sqrt{n}}{\sigma} [\mathrm{vec}(\hat{B}) - \mathrm{vec}(B)] \overset{d}{\longrightarrow} N(0, I_{q} \otimes \Sigma^{-1}),$$

where $\otimes$ is the Kronecker product.

Proof of Theorem 2.

For (a), $\hat{\beta}_{j}$ is written as

$$\hat{\beta}_{j} = \beta_{j} + \Big(\sum_{t=1}^{n} X_{t-1} X_{t-1}^{\top}\Big)^{-1} \Big(\sum_{t=1}^{n} X_{t-1} \epsilon_{j,t}\Big), \qquad \sqrt{n} (\hat{\beta}_{j} - \beta_{j}) = \sqrt{n}\, \hat{\Sigma}_{x}^{-1} \hat{\Sigma}_{xe,j},$$

where

$$\hat{\Sigma}_{x} = \frac{1}{n} \sum_{t=1}^{n} X_{t-1} X_{t-1}^{\top}, \qquad \hat{\Sigma}_{xe,j} = \frac{1}{n} \sum_{t=1}^{n} X_{t-1} \epsilon_{j,t}. \tag{8}$$

For the desired result, it suffices to show that, as $n \to \infty$,

$$\hat{\Sigma}_{x} \overset{p}{\longrightarrow} \Sigma \quad \text{and} \quad \sqrt{n}\, \hat{\Sigma}_{xe,j} \overset{d}{\longrightarrow} N(0, \sigma^{2} \Sigma).$$

First, we prove the convergence of $\hat{\Sigma}_{x}$. We write

$$\hat{\Sigma}_{x} = \frac{1}{n} \sum_{t=1}^{n} X_{t-1} X_{t-1}^{\top} = \frac{1}{n} \sum_{t=1}^{n} \begin{bmatrix} 1 & \hat{\Sigma}_{12,t} \\ \hat{\Sigma}_{21,t} & \hat{\Sigma}_{22,t} \end{bmatrix}, \tag{9}$$

where $\hat{\Sigma}_{12,t} = (\bar{Y}_{1,t-1}^{\top}, \bar{Y}_{2,t-1}^{\top}, \dots, \bar{Y}_{q,t-1}^{\top}) = \hat{\Sigma}_{21,t}^{\top}$ is a $1 \times pq$ row vector and $\hat{\Sigma}_{22,t} = \hat{\Sigma}_{21,t} \hat{\Sigma}_{12,t}$ is a $pq \times pq$ matrix. It can be shown straightforwardly that the right-hand side of (9) converges in probability to $\Sigma$ as $n \to \infty$ by the weak law of large numbers (WLLN) for stationary sequences.

Second, we verify the convergence in distribution of $\sqrt{n}\, \hat{\Sigma}_{xe,j} = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} X_{t-1} \epsilon_{j,t}$. For this purpose, it is enough to show that

$$\frac{1}{\sqrt{n}} \sum_{t=1}^{n} \eta^{\top} X_{t-1} \epsilon_{j,t} \overset{d}{\longrightarrow} N(0, \sigma^{2} \eta^{\top} \Sigma \eta) \tag{10}$$

for any $\eta \in \mathbb{R}^{1 + pq}$. Fix $j$ and let $\xi_{t} = \frac{1}{\sqrt{n}} \eta^{\top} X_{t-1} \epsilon_{j,t}$, and let $\mathcal{F}_{t}^{*} = \sigma\{\epsilon_{j,s} : -\infty < s \leq t\}$ be the $\sigma$-field generated by the errors up to time $t$. Let $M_{t} = \sum_{s=1}^{t} \xi_{s}$. Note that

$$E[M_{t} \mid \mathcal{F}_{t-1}^{*}] = E\Big[\sum_{s=1}^{t} \xi_{s} \,\Big|\, \mathcal{F}_{t-1}^{*}\Big] = E\Big[\xi_{t} + \sum_{s=1}^{t-1} \xi_{s} \,\Big|\, \mathcal{F}_{t-1}^{*}\Big] = M_{t-1},$$

and thus $\{M_{t}\}$ is a martingale sequence with respect to $\{\mathcal{F}_{t}^{*}\}$.

Similarly to the proof of the convergence of $\hat{\Sigma}_{x}$, we can show that $E[(\eta^{\top} X_{t-1})(X_{t-1}^{\top} \eta)]^{2} < \infty$, which implies that, for any $\delta > 0$,

$$\sum_{t=1}^{n} E[\xi_{t}^{2} I(|\xi_{t}| > \delta) \mid \mathcal{F}_{t-1}^{*}] \leq \frac{1}{\delta^{2}} \sum_{t=1}^{n} E[\xi_{t}^{4} \mid \mathcal{F}_{t-1}^{*}] \leq \frac{C}{n^{2} \delta^{2}} \sum_{t=1}^{n} (\eta^{\top} X_{t-1} X_{t-1}^{\top} \eta)^{2} \overset{p}{\longrightarrow} 0$$

for some constant $C > 0$. Furthermore,

$$\sum_{t=1}^{n} E[\xi_{t}^{2} \mid \mathcal{F}_{t-1}^{*}] = \frac{\sigma^{2}}{n} \sum_{t=1}^{n} \eta^{\top} X_{t-1} X_{t-1}^{\top} \eta \overset{p}{\longrightarrow} \sigma^{2} \eta^{\top} \Sigma \eta$$

as $n \to \infty$. Hence, by the central limit theorem for martingale difference sequences (see [13]), the desired convergence in distribution of $M_{n}$ to $N(0, \sigma^{2} \eta^{\top} \Sigma \eta)$ in (10) holds. This completes the proof of (a).

For (b), we consider the covariance matrix $\mathrm{Cov}(\sqrt{n} (\hat{\beta}_{j} - \beta_{j}), \sqrt{n} (\hat{\beta}_{l} - \beta_{l}))$ for $j \neq l$, which is equal to

$$n E[(\hat{\beta}_{j} - \beta_{j}) (\hat{\beta}_{l} - \beta_{l})^{\top}] = n E\big[E[(\hat{\beta}_{j} - \beta_{j}) (\hat{\beta}_{l} - \beta_{l})^{\top} \mid Y_{t},\ t = -h_{p} + 1, \dots, n]\big].$$

By (8),

$$n E[(\hat{\beta}_{j} - \beta_{j}) (\hat{\beta}_{l} - \beta_{l})^{\top} \mid Y_{t},\ t = -h_{p} + 1, \dots, n] = n E[\hat{\Sigma}_{x}^{-1} \hat{\Sigma}_{xe,j} (\hat{\Sigma}_{x}^{-1} \hat{\Sigma}_{xe,l})^{\top} \mid Y_{t},\ t = -h_{p} + 1, \dots, n]$$

$$= \hat{\Sigma}_{x}^{-1} \Big(\frac{1}{n} \sum_{t=1}^{n} \sum_{s=1}^{n} X_{t-1} E[\epsilon_{j,t} \epsilon_{l,s}] X_{s-1}^{\top}\Big) \hat{\Sigma}_{x}^{-1},$$

which is zero because $\{\epsilon_{j,t},\ j = 1, 2, \dots, q\}$ are independent. Thus the covariance matrix is zero, and therefore the desired asymptotic multivariate normality of $\mathrm{vec}(\hat{B})$ is obtained with the covariance matrix $I_{q} \otimes \Sigma^{-1}$. □
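Theorem 2(a) suggests approximate confidence intervals of the form $\hat{\beta}_{j,l} \pm z \sqrt{\hat{\sigma}^{2} [\hat{\Sigma}_{x}^{-1}]_{ll} / n}$ once $\sigma^{2}$ and $\Sigma$ are replaced by estimates. The sketch below assumes such plug-in estimates are available; the plug-in step, the function names and the numerical inputs are assumptions made for illustration rather than the procedure stated in the text.

```python
import math


def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
         for r, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]


def asymptotic_ci(beta_hat, sigma_x_hat, sigma2_hat, n, z=1.96):
    """Return intervals beta_hat[l] +/- z * sqrt(sigma2_hat * [Sigma_x_hat^{-1}]_{ll} / n)."""
    inv = invert(sigma_x_hat)
    half = [z * math.sqrt(sigma2_hat * inv[l][l] / n) for l in range(len(beta_hat))]
    return [(b - w, b + w) for b, w in zip(beta_hat, half)]


# Toy inputs (assumed): two coefficients, n = 100 observations, unit error variance.
intervals = asymptotic_ci([1.0, 0.0], [[2.0, 0.0], [0.0, 0.5]], 1.0, 100)
print(intervals)
```

The default `z=1.96` corresponds to a nominal 95% two-sided interval under the normal limit.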

Now we adopt correlated error processes by assuming the following:

(A3) $\{\epsilon_{j,t} : j = 1, 2, \dots, q,\ t \in \mathbb{Z}\}$ are correlated random variables with mean zero and $\mathrm{Cov}(\epsilon_{j,t}, \epsilon_{l,t}) = \sigma_{jl} \neq 0$, that is, the covariance matrix $\Psi$ is assumed to be $\Psi := E[E_{t} E_{t}^{\top}] = (\sigma_{jl}) \in \mathbb{R}^{q \times q}$.

The covariance matrix $\Psi$ is nonsingular and positive definite, so there exists a nonsingular symmetric matrix $\Psi^{\frac{1}{2}} \in \mathbb{R}^{q \times q}$ such that $\Psi^{\frac{1}{2}} \Psi^{\frac{1}{2}} = \Psi$.

Under (A1) and (A3), note that the lag-$\ell$ autocovariance matrix functions $\Gamma(\ell)$ of $Y_{t}$ are given by

$$\Gamma(0) = \sum_{j=0}^{\infty} [F^{j}]_{qq} \Psi [F^{j}]_{qq}^{\top}$$

and

$$\Gamma(\ell) = \sum_{j=0}^{\infty} [F^{j+\ell}]_{qq} \Psi [F^{j}]_{qq}^{\top} \ \text{ for } \ell > 0, \qquad \Gamma(\ell) = \sum_{j=0}^{\infty} [F^{j}]_{qq} \Psi [F^{j-\ell}]_{qq}^{\top} \ \text{ for } \ell < 0.$$

In this case, the generalized least squares estimator (GLSE) is computed by minimizing the sum of squared standardized errors. The GLSE $\hat{B}_{GLS} = (\hat{\beta}_{1,GLS}, \dots, \hat{\beta}_{q,GLS})^{\top}$ of $B$ is obtained as follows. From (7) we have

$$\Psi^{-\frac{1}{2}} Y_{t} = \Psi^{-\frac{1}{2}} B X_{t-1} + \Psi^{-\frac{1}{2}} E_{t}, \qquad Y_{t}^{\top} \Psi^{-\frac{1}{2}} = X_{t-1}^{\top} B^{\top} \Psi^{-\frac{1}{2}} + E_{t}^{\top} \Psi^{-\frac{1}{2}}.$$

Let $Y_{t,\Psi} = \Psi^{-\frac{1}{2}} Y_{t}$, $B_{\Psi} = \Psi^{-\frac{1}{2}} B$ and $U_{t} = \Psi^{-\frac{1}{2}} E_{t}$. Note that $E[U_{t} U_{t}^{\top}] = I_{q}$, and we have

$$Y_{t,\Psi} = B_{\Psi} X_{t-1} + U_{t}, \qquad Y_{t,\Psi}^{\top} = X_{t-1}^{\top} B_{\Psi}^{\top} + U_{t}^{\top}. \tag{11}$$

The GLSE is obtained by

$$\hat{B}_{GLS} = \arg\min_{B} \sum_{t=1}^{n} U_{t}^{\top} U_{t} = \arg\min_{B} \sum_{t=1}^{n} (Y_{t} - B X_{t-1})^{\top} \Psi^{-1} (Y_{t} - B X_{t-1}).$$

The matrix form of (11) with $t = 1, 2, \dots, n$ is given as

$$Y_{\Psi} = X B_{\Psi}^{\top} + U,$$

where $Y_{\Psi} = (Y_{1,\Psi}, \dots, Y_{n,\Psi})^{\top} \in \mathbb{R}^{n \times q}$, $X = (X_{0}, X_{1}, \dots, X_{n-1})^{\top} \in \mathbb{R}^{n \times (1 + pq)}$ and $U = (U_{1}, \dots, U_{n})^{\top} \in \mathbb{R}^{n \times q}$. The least squares estimator of $B_{\Psi}$ is of the form $\hat{B}_{\Psi}^{\top} = (X^{\top} X)^{-1} (X^{\top} Y_{\Psi})$, which is the estimator of $(\Psi^{-\frac{1}{2}} B)^{\top} = B^{\top} \Psi^{-\frac{1}{2}}$. Furthermore, note that $Y_{\Psi} = Y \Psi^{-\frac{1}{2}}$. Hence the GLSE of $B$ is given as

$$\hat{B}_{GLS}^{\top} = (X^{\top} X)^{-1} (X^{\top} Y),$$

which is of the same form as the OLSE. The following theorem presents the asymptotic normality of the GLSE.

Theorem 3.

Assume that (A1) and (A3) hold, and that the matrix $X^{\top} X$ is of full rank.

(a) For each $j = 1, 2, \dots, q$, as $n \to \infty$ we have

$$\sqrt{n} (\hat{\beta}_{j,GLS} - \beta_{j}) \overset{d}{\longrightarrow} N(0, \sigma_{j}^{2} \Sigma^{-1}),$$

where $\Sigma$ is given as in Theorem 2 and $\sigma_{j}^{2} = \sigma_{jj}$.

(b) As $n \to \infty$ we have

$$\sqrt{n} [\mathrm{vec}(\hat{B}_{GLS}) - \mathrm{vec}(B)] \overset{d}{\longrightarrow} N(0, \Psi \otimes \Sigma^{-1}).$$

Theorem 3 includes the following case of uncorrelated errors with heterogeneous variances, in which $\mathrm{Cov}(\epsilon_{j,t}, \epsilon_{l,t}) = \sigma_{j}^{2}$ if $j = l$, and zero if $j \neq l$.

Corollary 1.

Assume (A1). If $E[E_{t}] = O$ and $E[E_{t} E_{t}^{\top}] = \mathrm{diag}(\sigma_{j}^{2}) \in \mathbb{R}^{q \times q}$, then as $n \to \infty$,

$$\sqrt{n} (\hat{\beta}_{j,GLS} - \beta_{j}) \overset{d}{\longrightarrow} N(0, \sigma_{j}^{2} \Sigma^{-1})$$

and

$$\sqrt{n} [\mathrm{vec}(\hat{B}_{GLS}) - \mathrm{vec}(B)] \overset{d}{\longrightarrow} N(0, \Sigma_{diag}),$$

where $\Sigma_{diag}$ is given by

$$\Sigma_{diag} = \begin{bmatrix} \sigma_{1}^{2} \Sigma^{-1} & O & \dots & O \\ O & \sigma_{2}^{2} \Sigma^{-1} & \dots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \dots & \sigma_{q}^{2} \Sigma^{-1} \end{bmatrix}.$$

Proof of Theorem 3.

The proof is similar to that of Theorem 2, except for the limit of the covariance matrix $\mathrm{Cov}(\sqrt{n} (\hat{\beta}_{j,GLS} - \beta_{j}), \sqrt{n} (\hat{\beta}_{l,GLS} - \beta_{l}))$ for $j \neq l$. Following the argument in the proof of Theorem 2, we get

$$n E[(\hat{\beta}_{j,GLS} - \beta_{j}) (\hat{\beta}_{l,GLS} - \beta_{l})^{\top} \mid Y_{t},\ t = -h_{p} + 1, \dots, n] = n E[\hat{\Sigma}_{x}^{-1} \hat{\Sigma}_{xe,j} (\hat{\Sigma}_{x}^{-1} \hat{\Sigma}_{xe,l})^{\top} \mid Y_{t},\ t = -h_{p} + 1, \dots, n]$$

$$= \hat{\Sigma}_{x}^{-1} \Big(\frac{1}{n} \sum_{t=1}^{n} \sum_{s=1}^{n} X_{t-1} E[\epsilon_{j,t} \epsilon_{l,s}] X_{s-1}^{\top}\Big) \hat{\Sigma}_{x}^{-1},$$

which, because the cross terms with $s \neq t$ vanish, is equal to

$$\hat{\Sigma}_{x}^{-1} \Big(\frac{1}{n} \sum_{t=1}^{n} X_{t-1} \sigma_{jl} X_{t-1}^{\top}\Big) \hat{\Sigma}_{x}^{-1} = \sigma_{jl} \hat{\Sigma}_{x}^{-1} \overset{p}{\longrightarrow} \sigma_{jl} \Sigma^{-1}.$$

Thus the desired asymptotic multivariate normality of $\mathrm{vec}(\hat{B}_{GLS})$ is obtained with the covariance matrix $\Psi \otimes \Sigma^{-1}$. □
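The block structure of the limiting covariance $\Psi \otimes \Sigma^{-1}$ can be made explicit with a small Kronecker product routine: block $(j, l)$ equals $\sigma_{jl} \Sigma^{-1}$, the limiting covariance between $\hat{\beta}_{j,GLS}$ and $\hat{\beta}_{l,GLS}$ found in the proof. The sketch below uses toy matrices; the names `kron` and `block` and all numerical values are assumptions for illustration.

```python
def kron(A, B):
    """Kronecker product of A and B, both given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def block(K, j, l, m):
    """Extract the (j, l) block of size m x m from a block matrix K (0-based indices)."""
    return [row[l * m:(l + 1) * m] for row in K[j * m:(j + 1) * m]]


Psi = [[1.0, 0.5], [0.5, 2.0]]          # toy error covariance (assumed)
Sigma_inv = [[2.0, 0.0], [0.0, 4.0]]    # toy Sigma^{-1} (assumed)
K = kron(Psi, Sigma_inv)
print(block(K, 0, 1, 2))                # sigma_12 * Sigma^{-1}
```

Setting `Psi` to a diagonal matrix reproduces the block-diagonal covariance of Corollary 1, and `Psi` equal to $\sigma^{2} I_{q}$ reproduces the i.i.d. case of Theorem 2.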

We have discussed parameter estimation by means of the least squares method and have established multivariate normality in multivariate HAR($p, q$) models. For a univariate HAR model, Hwang and Shin [3] proposed an infinite-order, long-memory HAR model to capture the long-memory property and studied the asymptotic theory of the LSE. In [3,14], it was assumed that the HAR coefficients decrease exponentially, and it was shown that, under the exponential decay condition, the autocorrelation function of the HAR model decreases algebraically and thus the model has long memory. For this reason, we additionally consider the exponentially weighted multivariate HAR model with an exponential decay rate and develop the asymptotic normality of the rate estimator. As a simple case, we assume that multiple assets have a common rate on the coefficients, and the rate estimation problem is investigated in the following section. The general case of the exponentially weighted multivariate HAR model will be dealt with in a future study.

4. Exponentially Weighted Multivariate HAR Model

In this section, as a special case of model (2), we consider an exponentially weighted multivariate HAR model with a common rate on the parameters of common regressors, as follows. In model (2), for $j, k \in \{1, 2, \dots, q\}$ and for $i \in \{1, 2, \dots, p\}$,

$$\beta_{jk}^{(i)} = c_{jk} \lambda^{i-1}$$

for some $c_{jk}$ and $|\lambda| < 1$. The parameters $c_{jk}$ and $\lambda$ are estimated using the LSE $\hat{\beta}_{jk}^{(i)}$ in Section 3 as follows:

$$\hat{c}_{jk} = \hat{\beta}_{jk}^{(1)} \quad \text{and} \quad \hat{\lambda}_{n} = \frac{\sum_{k=1}^{q} \sum_{j=1}^{q} \hat{\beta}_{jk}^{(2)}}{\sum_{k=1}^{q} \sum_{j=1}^{q} \hat{\beta}_{jk}^{(1)}}.$$
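The common rate estimator is a ratio of two sums of least squares estimates, which the following sketch evaluates on coefficients that satisfy the exponential decay condition exactly. The function name `common_rate`, the nested-list layout of the estimates and the numerical values are assumptions for illustration.

```python
def common_rate(beta_hat):
    """Return lambda_hat = sum_{j,k} beta_hat_{jk}^{(2)} / sum_{j,k} beta_hat_{jk}^{(1)}.

    beta_hat[j][k][i - 1] holds the estimate of beta_{jk}^{(i)}.
    """
    num = sum(b[1] for row in beta_hat for b in row)
    den = sum(b[0] for row in beta_hat for b in row)
    return num / den


# Coefficients built exactly as c_{jk} * lambda^(i-1) with p = 3 (assumed values).
c = [[0.4, 0.1], [0.2, 0.3]]
lam = 0.5
beta = [[[c_jk * lam ** i for i in range(3)] for c_jk in row] for row in c]
c_hat = [[b[0] for b in row] for row in beta]   # c_hat_{jk} = beta_hat_{jk}^{(1)}
print(common_rate(beta), c_hat)
```

With exact coefficients the estimator returns $\lambda$ up to rounding; with estimated coefficients it inherits their sampling error.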

We write $\Sigma^{-1} = (\varrho_{l_{1}, l_{2}}^{(-1)})$ for the matrix appearing in the asymptotic variances in Theorems 2 and 3, with $l_{1}, l_{2} \in \{1, 2, \dots, 1 + pq\}$; that is, $\varrho_{l_{1}, l_{2}}^{(-1)}$ is the $(l_{1}, l_{2})$-component of $\Sigma^{-1}$. By Theorem 2, for each $i, j, k$, we have

$$\sqrt{n} (\hat{\beta}_{jk}^{(i)} - \beta_{jk}^{(i)}) \overset{d}{\longrightarrow} N(0, \varrho_{jk}^{(i)})$$

for some $\varrho_{jk}^{(i)}$, which is the corresponding component of the matrix $\sigma^{2} \Sigma^{-1}$ in Theorem 2. Indeed, $\varrho_{jk}^{(i)}$ is the same variance for all $j$, say $\varrho_{k}^{(i)}$, and we can easily represent $\varrho_{k}^{(i)} = \sigma^{2} \varrho_{l,l}^{(-1)}$ with $l = (k - 1) p + 1 + i$. The following theorem states the asymptotic normality of the estimator of the common rate.

Theorem 4.

Assume (A1) and (A2) hold. In model (2) with $\beta_{jk}^{(i)} = c_{jk} \lambda^{i-1}$, as $n \to \infty$, we have

$$\sqrt{n} (\hat{\lambda}_{n} - \lambda) \overset{d}{\longrightarrow} N(0, v^{2}),$$

where $v^{2} = \mathrm{Var}\big(\sum_{k=1}^{q} \sum_{j=1}^{q} Z_{jk}^{*}\big) / C^{2}$ with $C = \sum_{k=1}^{q} \sum_{j=1}^{q} c_{jk}$ and $Z_{jk}^{*}$ following the normal distribution with mean zero and variance

$$v_{jk}^{*} = \sigma^{2} \big[\varrho_{(k-1)p+3, (k-1)p+3}^{(-1)} + \lambda^{2} \varrho_{(k-1)p+2, (k-1)p+2}^{(-1)} - 2 \lambda \varrho_{(k-1)p+2, (k-1)p+3}^{(-1)}\big].$$

Proof of Theorem 4.

By Theorem 2 for each

i, j, k

, we have

\sqrt{n} ({\hat{β}}_{j k}^{(i)} - β_{j k}^{(i)}) \overset{d}{⟶} N (0, ϱ_{j k}^{(i)})

for some

ϱ_{j k}^{(i)}

which is the corresponding component of the matrix

σ^{2} Σ^{- 1}

in Theorem 2, where

ϱ_{j k}^{(i)} =

ϱ_{k}^{(i)} = σ^{2} ϱ_{(k - 1) p + 1 + i, (k - 1) p + 1 + i}^{(- 1)}

. We may write

\sum_{k = 1}^{q} \sum_{j = 1}^{q} {\hat{β}}_{j k}^{(i)} = C λ^{i - 1} + \frac{1}{\sqrt{n}} \sum_{k = 1}^{q} \sum_{j = 1}^{q} Z_{j k}^{(i)} + o_{p} (1 / \sqrt{n})

where

Z_{j k}^{(i)}

s are normal random variables with mean zero and variance

ϱ_{k}^{(i)}

for each j and we have

{\hat{λ}}_{n} = \frac{λ + \frac{1}{C \sqrt{n}} \sum_{k = 1}^{q} \sum_{j = 1}^{q} Z_{j k}^{(2)} + o_{p} (1 / \sqrt{n})}{1 + \frac{1}{C \sqrt{n}} \sum_{k = 1}^{q} \sum_{j = 1}^{q} Z_{j k}^{(1)} + o_{p} (1 / \sqrt{n})},

\sqrt{n} ({\hat{λ}}_{n} - λ) = \frac{1}{C} \sum_{k = 1}^{q} \sum_{j = 1}^{q} (Z_{j k}^{(2)} - λ Z_{j k}^{(1)}) + O_{p} (1 / \sqrt{n}) .

Note that

Z_{j k}^{(2)} - λ Z_{j k}^{(1)}

has an asymptotically normal distribution with mean zero and variance

ϱ_{k}^{(2)} + λ^{2} ϱ_{k}^{(1)} - 2 λ C o v (Z_{j k}^{(2)}, Z_{j k}^{(1)}) =

σ^{2} [ϱ_{(k - 1) p + 3, (k - 1) p + 3}^{(- 1)} + λ^{2} ϱ_{(k - 1) p + 2, (k - 1) p + 2}^{(- 1)} - 2 λ ϱ_{(k - 1) p + 2, (k - 1) p + 3}^{(- 1)}]

. Therefore, the desired asymptotic normality of

\sqrt{n} ({\hat{λ}}_{n} - λ)

is obtained. □
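The variance v_{jk}^{*} in Theorem 4 can be evaluated directly once σ^{2}, Σ^{-1} and λ are available. The following is a sketch under stated assumptions: `sigma_inv` is a NumPy array holding Σ^{-1}, the function name `v_star` is hypothetical, and the paper's 1-based positions (k - 1) p + 2 and (k - 1) p + 3 are shifted to zero-based indexing. It computes only the individual variances; the covariances between different Z_{j k}^{*} needed for v^{2} are not covered.

```python
import numpy as np

def v_star(sigma2, sigma_inv, k, p, lam):
    """Variance of Z^{(2)}_{jk} - lam * Z^{(1)}_{jk}, which is the same for all j.

    sigma_inv: Sigma^{-1}, shape (1 + p*q, 1 + p*q); k: 1-based asset index.
    """
    a = (k - 1) * p + 1  # zero-based position of the lag-1 coefficient of asset k
    b = (k - 1) * p + 2  # zero-based position of the lag-2 coefficient of asset k
    return sigma2 * (sigma_inv[b, b]
                     + lam ** 2 * sigma_inv[a, a]
                     - 2.0 * lam * sigma_inv[a, b])

# Toy input (not from the paper): p = 3, q = 2, so Sigma^{-1} is 7 x 7.
S = np.eye(7)
S[1, 1], S[2, 2] = 2.0, 3.0
S[1, 2] = S[2, 1] = 0.5
value = v_star(2.0, S, k=1, p=3, lam=0.5)
```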

Remark 1.

As pointed out by [3], the exponential decay condition

β_{j k}^{(i)} = c_{j k} λ^{i - 1}

is a condition for the long-memory property of HAR models. Ref. [3] discussed the HAR model of infinite order and its finite-order approximations, and showed that the exponential decay condition is equivalent to algebraically decaying autocorrelation functions along with a mild lag condition. We are interested in testing whether or not the model is the exponentially weighted multivariate HAR model with common rate λ. For example, we construct the following hypothesis and test statistic: the null hypothesis

H_{0} :

β_{j k}^{(i)} = c_{j k} λ^{i - 1}

for some

c_{j k}

and common rate

| λ | < 1

, for each

j, k

, versus the alternative hypothesis

H_{A}

: the model is not an exponentially weighted multivariate HAR model with a common rate. For

i = 1, 2, \dots, p - 1

, let

{\hat{λ}}_{(i), n} = \frac{\sum_{k = 1}^{q} \sum_{j = 1}^{q} {\hat{β}}_{j k}^{(i + 1)}}{\sum_{k = 1}^{q} \sum_{j = 1}^{q} {\hat{β}}_{j k}^{(i)}}

and note that asymptotic normality of

{\hat{λ}}_{(i), n}

follows, as in Theorem 4, from that of the OLSEs. Consider the collection of differences

\{{\hat{λ}}_{(i_{1}), n} - {\hat{λ}}_{(i_{2}), n} : for all pairs (i_{1}, i_{2}), i_{1} \neq i_{2}\}

. In this collection, the number of elements that are distinct in absolute value is at most

(p - 1) (p - 2) / 2 (= : p^{*})

. These are relabelled as

{Λ_{ℓ} : ℓ = 1, 2, \dots, p^{*}}

. Let

{\bar{Λ}}_{n} = \sum_{ℓ = 1}^{p^{*}} Λ_{ℓ} / p^{*}

or some related statistics of

{Λ_{ℓ}}

. Similarly to Theorem 4, we can find the limiting distribution of

{\bar{Λ}}_{n}

under the null hypothesis

H_{0}

. The null

H_{0}

might be rejected if

| {\bar{Λ}}_{n} |

is large or if

\max_{ℓ} | Λ_{ℓ} |

is large. In a future work, a specific test statistic related to

Λ_{ℓ}

will be constructed and the limiting distribution of the test will be investigated.
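The quantities in this remark can be computed as in the sketch below. It is an illustration only: the function names, the array layout `beta_hat[j, k, i]` (zero-based lag index), and the choice of one signed difference per unordered pair (i_1 < i_2) are assumptions, and no critical values are implied since the limiting distribution is left to future work. The sketch requires p ≥ 3 so that at least one pair exists.

```python
from itertools import combinations
import numpy as np

def rate_ratios(beta_hat):
    """lambda_hat_{(i),n} for i = 1, ..., p-1 from beta_hat of shape (q, q, p)."""
    sums = beta_hat.sum(axis=(0, 1))  # sum over j and k for each lag order
    return sums[1:] / sums[:-1]

def pairwise_differences(ratios):
    """One signed difference per unordered pair (i1, i2) with i1 < i2."""
    pairs = combinations(range(len(ratios)), 2)
    return np.array([ratios[a] - ratios[b] for a, b in pairs])

def summary(beta_hat):
    """Return (mean of the differences, largest absolute difference)."""
    diffs = pairwise_differences(rate_ratios(beta_hat))
    return diffs.mean(), np.abs(diffs).max()

# Under exact exponential decay every ratio equals lam, so all differences vanish.
c = np.array([[0.30, 0.20], [0.10, 0.25]])
beta = c[:, :, None] * 0.5 ** np.arange(4)  # p = 4
diffs = pairwise_differences(rate_ratios(beta))
mean_diff, max_abs_diff = summary(beta)
```

For p = 4 there are (p - 1)(p - 2)/2 = 3 differences, matching the count p^{*} given above.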

5. A Monte-Carlo Study

We present simulation results for model (2) with

p = 3

and

q = 2

, which is a bivariate case of two assets, using parameters

β_{1} = {(β_{10}, β_{11}^{(d)}, β_{11}^{(w)}, β_{11}^{(m)}, β_{12}^{(d)}, β_{12}^{(w)}, β_{12}^{(m)})}^{⊤} = {(0.3, 0.2, 0.1, 0.05, 0.1, 0.05, 0.02)}^{⊤},

β_{2} = {(β_{20}, β_{21}^{(d)}, β_{21}^{(w)}, β_{21}^{(m)}, β_{22}^{(d)}, β_{22}^{(w)}, β_{22}^{(m)})}^{⊤} = {(- 0.3, 0.07, 0.04, 0.01, 0.25, 0.07, 0.1)}^{⊤}

. We consider both independent errors and correlated ones for

{ϵ_{j, t}}

,

j = 1, 2

: (i) i.i.d. normal distribution

N (0, 1)

with

E [E_{t} E_{t}^{⊤}] = I_{2}

and (ii) correlated normal distribution with

E [E_{t} E_{t}^{⊤}] = (σ_{j l}) \in R^{2 \times 2}

with

σ_{11} = σ_{1}^{2} = 1, σ_{22} = σ_{2}^{2} = 1.21, σ_{12} = ρ σ_{1} σ_{2}

with

ρ = 0.5

. The bivariate HAR(3,2) processes are generated according to (2) with sample size

n = 1000

.
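The simulation design above can be sketched in a few lines. This is not the authors' code: it assumes that model (2) regresses each series on averages of the past h observations of both series, with the conventional windows h = (1, 5, 22) for the (d), (w), (m) terms, and it uses zero initial values and a burn-in of 200 steps, all of which are choices made here. The function names are hypothetical.

```python
import numpy as np

LAGS = (1, 5, 22)  # assumed windows for the (d), (w), (m) terms

# Rows are beta_1 and beta_2: intercept, asset-1 (d, w, m), asset-2 (d, w, m).
B_TRUE = np.array([[0.3, 0.2, 0.1, 0.05, 0.1, 0.05, 0.02],
                   [-0.3, 0.07, 0.04, 0.01, 0.25, 0.07, 0.1]])
COV_IID = np.eye(2)                               # case (i)
COV_CORR = np.array([[1.0, 0.55], [0.55, 1.21]])  # case (ii): 0.5 * 1 * 1.1 = 0.55

def har_row(y, t, lags=LAGS):
    """Regressor vector at time t: 1, then h-step averages of each series."""
    feats = [1.0]
    for k in range(y.shape[1]):
        for h in lags:
            feats.append(y[t - h:t, k].mean())
    return np.array(feats)

def simulate_har(B, n, cov, rng, burn=200, lags=LAGS):
    """Generate n observations after discarding the start-up and burn-in."""
    q, h_max = B.shape[0], max(lags)
    total = h_max + burn + n
    eps = rng.multivariate_normal(np.zeros(q), cov, size=total)
    y = np.zeros((total, q))
    for t in range(h_max, total):
        y[t] = B @ har_row(y, t, lags) + eps[t]
    return y[-n:]

def fit_har(y, lags=LAGS):
    """Least squares estimates; row j of the result estimates beta_j."""
    h_max = max(lags)
    X = np.array([har_row(y, t, lags) for t in range(h_max, len(y))])
    coef, *_ = np.linalg.lstsq(X, y[h_max:], rcond=None)
    return coef.T

rng = np.random.default_rng(0)
y = simulate_har(B_TRUE, 1000, COV_IID, rng)
B_hat = fit_har(y)
```

Repeating the last three lines with fresh random draws and collecting `B_hat` gives the replicated estimates whose means and standard errors are summarized in the tables.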

We compute the LSEs of the parameters over 500 replications and report the sample means and standard errors of the 500 estimates in Table 1 and Table 2 for

n = 300, 600, 1000

. The tables show that the sample means move closer to the true parameter values and the standard errors decrease as n increases. Additionally, Figure 1 and Figure 2 plot the sample means (a), (b) and standard errors (c), (d) of the LSEs against n on the horizontal axis: Figure 1a,c and Figure 2a,c for the estimates of

β_{1}

and Figure 1b,d and Figure 2b,d for the estimates of

β_{2}

.

Table 1. Sample mean and standard error of 500 estimates in i.i.d. case.

Table 2. Sample mean and standard error of 500 estimates in correlated case.

Figure 1. Sample mean (a,b) and standard error (c,d) of the 500 estimates in i.i.d. case.

Figure 2. Sample mean (a,b) and standard error (c,d) of the 500 estimates in correlated case.

Confidence intervals using the normal approximations in Theorems 2 and 3 with confidence level

95 %

are constructed:

0.95 = P (\hat{β} - z_{0.975} \hat{s e} \leq β \leq \hat{β} + z_{0.975} \hat{s e})

, for each parameter component

β

, with its LSE

\hat{β}

and standard-error estimate

\hat{s e} = \hat{σ} / \sqrt{n}

, where

{\hat{σ}}^{2}

is the corresponding diagonal component of the estimated asymptotic covariance matrix in Theorems 2 and 3. The confidence intervals from the 500 samples, together with the empirical coverage probabilities and average lengths in the i.i.d. error case under the normal approximation, are shown in Figure 3, where sample size

n = 1000

and 500 replications are used. To illustrate the multivariate asymptotic normality, normal approximations for the estimates of some parameters are plotted in Figure 4, and the bivariate normality of some pairs of estimates can be seen in Figure 5. The three figures support the normality results established in the theory.
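Empirical coverage and average length can be computed from the replicated estimates as in the sketch below. The function name and the toy inputs are assumptions for illustration; in the study the inputs would be the 500 estimates of one parameter component and their estimated standard errors.

```python
from statistics import NormalDist
import numpy as np

def coverage_and_length(estimates, std_errors, true_value, level=0.95):
    """Share of normal-approximation intervals covering true_value, and mean length."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # z_{0.975} when level = 0.95
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    lower, upper = est - z * se, est + z * se
    covered = (lower <= true_value) & (true_value <= upper)
    return covered.mean(), (upper - lower).mean()

# Toy input: three intervals of half-width z around 0, 1 and 3; only the
# first two contain the true value 0.
cov, length = coverage_and_length([0.0, 1.0, 3.0], [1.0, 1.0, 1.0], 0.0)
```

If the normal approximation is accurate, the empirical coverage over many replications should be close to the nominal level.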

Figure 3. Empirical coverage and average length (in parentheses) of 95% confidence intervals for seven components of

β_{1}

(left column) and

β_{2}

(right column) in the i.i.d. error case. The horizontal dotted line indicates the true value of the parameter in each plot. Confidence intervals that do not contain the parameter are shown in red.

Figure 4. Normal approximation of estimates

{\hat{β}}_{11}^{(d)}, {\hat{β}}_{11}^{(w)}, {\hat{β}}_{11}^{(m)}

in the i.i.d. case (a–c) and the correlated case (d–f).

Figure 5. Bivariate normal approximation of some pairs of two chosen estimates

({\hat{β}}_{11}^{(d)}, {\hat{β}}_{21}^{(d)}),

({\hat{β}}_{11}^{(w)}, {\hat{β}}_{21}^{(w)}),

({\hat{β}}_{11}^{(m)}, {\hat{β}}_{21}^{(m)})

in the i.i.d. case (a–c) and the correlated case (d–f).

As for the exponentially weighted multivariate HAR(3,2) model with

β_{j k}^{(i)} = c_{j k} λ^{i - 1}

in Section 4, the estimates for the common rate are given in Table 3, from which we see that sample means of estimated values are close to the true ones with reasonable standard errors.

Table 3. Sample mean and standard error of 500 estimates in exponentially weighted HAR model.

6. Application

This section presents an empirical analysis of the Gold spot price and the S&P500 index. Their volatility is modeled by a bivariate HAR model, using three years of daily closing prices of Gold and S&P500, from 18 September 2017 to 17 September 2020. The price and index movements over the three years and their log returns are shown in Figure 6, while their volatility and autocorrelation functions (ACFs) are shown in Figure 7.

Figure 6. Gold spot price in USD and its return (in red) and S&P500 index and its return (in blue) against number of days starting from 18 September 2017 to 17 September 2020.

Figure 7. Volatility and ACF for Gold (in red) and S&P 500 (in blue).

We list the criteria MSE, R^{2}, AIC and BIC in Table 4 for the OLSEs (ordinary least squares estimators) of the bivariate HAR model of order

p = 3, 4

fitted to the volatility of Gold and S&P500. Conventionally, for orders

p = 3, 4

, the lags are taken to be a day (

h_{1} = 1

), a week

(h_{2} = 5)

, a month (

h_{3} = 22

) and a quarter (

h_{4} = 66

) in a HAR model. Nevertheless, in this work, we consider optimal lags in the sense of minimizing MSE of OLSEs for Gold and S&P500 volatility simultaneously. For order

p = 3, 4

, we choose lags

h_{2}, \dots, h_{p}

to satisfy the condition,

1 = h_{1} < h_{2} < \dots < h_{p} < \bar{h}

for some large

\bar{h}

, so that the sum of the two mean squared residuals of the OLSEs for the volatility of Gold and S&P500 is minimized. As a result, we found optimal lags

(h_{1}, h_{2}, h_{3}) = (1, 5, 6)

for

p = 3

with

(MSE_{Gold}, MSE_{S&P}) = (0.0461, 0.0387)

and

(h_{1}, h_{2}, h_{3}, h_{4}) = (1, 3, 6, 7)

for

p = 4

with

(MSE_{Gold}, MSE_{S&P}) = (0.0488, 0.0410)

. In Table 4, we compare our optimal lags with the conventional lags in different criteria. In all cases considered, our selection of optimal lags

(h_{1}, h_{2}, h_{3}) = (1, 5, 6)

with

p = 3

turns out to be the best.
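The lag search described above amounts to an exhaustive search over increasing lag tuples. The sketch below is one possible implementation, not the authors' procedure: it assumes regressors are averages of the past h observations, fits every candidate on the same observations (from time index h_bar onward) so that the MSEs are comparable, and uses hypothetical function names and a synthetic series in place of the Gold and S&P500 volatility data.

```python
from itertools import combinations
import numpy as np

def design(y, lags, start):
    """Rows (1, h-step averages of each series) for t = start, ..., len(y) - 1."""
    return np.array([[1.0] + [y[t - h:t, k].mean()
                              for k in range(y.shape[1]) for h in lags]
                     for t in range(start, len(y))])

def total_mse(y, lags, start):
    """Sum over series of the mean squared OLS residuals."""
    X = design(y, lags, start)
    Y = y[start:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return float((resid ** 2).mean(axis=0).sum())

def best_lags(y, p, h_bar):
    """Search all 1 = h_1 < h_2 < ... < h_p < h_bar for the smallest total MSE."""
    candidates = ((1,) + rest for rest in combinations(range(2, h_bar), p - 1))
    return min(candidates, key=lambda lags: total_mse(y, lags, h_bar))

# Synthetic bivariate series used only to exercise the search.
rng = np.random.default_rng(1)
y = np.zeros((300, 2))
for t in range(1, 300):
    y[t] = 0.5 * y[t - 1] + rng.normal(size=2)
lags = best_lags(y, p=3, h_bar=8)
```

The number of candidates grows combinatorially in h_bar and p, so an exhaustive search of this kind is practical only for moderate values of both.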

Table 4. OLSEs in the bivariate HAR(p, 2) model on volatility of Gold and S&P500.

Therefore we report estimation results of the HAR(3,2) model with

(h_{1}, h_{2}, h_{3}) = (1, 5, 6)

for the volatilities of Gold and S&P500 in Table 5, where coefficient estimates, standard errors and 95% confidence intervals are provided. In Figure 8, the bivariate HAR(3,2) model fitted by the OLSEs for this optimal case is plotted for the Gold and S&P500 volatility datasets along with the residuals. The fitted HAR(3,2) series are close to the observed volatility of both datasets, in line with the criteria reported in Table 4.

Table 5. Estimation results for Gold and S&P500 using HAR(3,2) with

(h_{1}, h_{2}, h_{3}) = (1, 5, 6)

.

Figure 8. Fitted HAR model and residual using OLSE for the volatility of Gold and S&P500.

7. Conclusions

Due to the cross-correlation of multiple assets and the spillover effect of volatility in the financial market, multivariate heterogeneous autoregressive-realized volatility (HAR-RV) models have attracted much attention recently. In this work, the stationarity of the multivariate HAR model is discussed and estimation problems are studied. We first investigate the strictly stationary solution of the multivariate HAR model and second develop the asymptotic normality theory for the least squares estimates (LSEs) with i.i.d. and correlated errors, respectively. Third, we propose an exponentially weighted multivariate HAR model and estimate its common exponential decay rate. In a Monte-Carlo experiment, the performance of the LSEs is illustrated numerically through the sample mean and standard error of the estimates as well as the empirical coverage and average length of confidence intervals based on the normal approximation. In addition, as a real data example, the volatilities of the Gold spot price and the S&P500 index over the most recent three years are analyzed with a bivariate HAR model. The coefficient estimates and confidence intervals are obtained for the bivariate HAR model of the volatility of Gold and S&P500, along with the choice of optimal lags, and it is shown that the bivariate HAR model with the proposed optimal lags matches the volatility of the financial data well.

We suggest some open problems on the multivariate HAR model. As proposed above, exponentially weighted HAR models with decay rates are of interest owing to the reduced number of parameters as well as the long-memory property. In modeling with the multivariate HAR model, testing whether the HAR coefficients have exponentially weighted decay rates, and furthermore whether multiple assets share a common decay rate, might provide statistically useful tools for analyzing the time series model. In follow-up studies we will deal with these hypothesis tests by constructing the test statistics and establishing their null limiting distributions. Finally, we mention that in a multivariate HAR model with heteroscedastic errors, the asymptotic properties of the estimates differ from the existing results, and thus we will derive the asymptotic theory for HAR models in the presence of dynamic heteroscedasticity. In that setting, financial market data can be represented more realistically, and volatility forecasting together with an assessment of forecast accuracy will be carried out in further research.

Author Contributions

Validation, W.-T.H., J.L. and E.H.; writing–review and editing, W.-T.H., J.L. and E.H.; funding acquisition, E.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Research Fund of Gachon University (GCU-2019-0299).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Corsi, F. A simple approximate long-memory model of realized volatility. J. Financ. Econom. 2009, 7, 174–196.
2. Chen, Y.; Härdle, W.K.; Pigorsch, U. Localized realized volatility modeling. J. Am. Stat. Assoc. 2010, 105, 1376–1393.
3. Hwang, E.; Shin, D.W. Infinite-order, long-memory heterogeneous autoregressive models. Comput. Stat. Data Anal. 2014, 76, 339–358.
4. Qu, H.; Ji, P. Modeling realized volatility dynamics with a genetic algorithm. J. Forecast. 2016, 35, 434–444.
5. Audrino, F.; Huang, C.; Okhrin, O. Flexible HAR model for realized volatility. Stud. Nonlinear Dyn. Econom. 2018, 23.
6. Bubák, V.; Kočenda, E.; Žikeš, F. Volatility transmission in emerging European foreign exchange markets. J. Bank. Financ. 2011, 35, 2829–2841.
7. Chiriac, R.; Voev, V. Modelling and forecasting multivariate realized volatility. J. Appl. Econom. 2011, 26, 922–947.
8. Souček, M.; Todorova, N. Realized volatility transmission between crude oil and equity futures markets: A multivariate HAR approach. Energy Econ. 2013, 40, 586–597.
9. Cubadda, G.; Guardabascio, B.; Hecq, A. A vector heterogeneous autoregressive index model for realized volatility measures. Int. J. Forecast. 2017, 33, 337–344.
10. Čech, F.; Baruník, J. On the modelling and forecasting of multivariate realized volatility: Generalized heterogeneous autoregressive (GHAR) model. J. Forecast. 2017, 36, 181–206.
11. Luo, J.; Chen, L. Realized volatility forecast with the Bayesian random compressed multivariate HAR model. Int. J. Forecast. 2020, 36, 781–799.
12. Taylor, N. Realized volatility forecasting in an international context. Appl. Econom. Lett. 2015, 22, 503–509.
13. Hall, P.; Heyde, C.C. Martingale Limit Theory and Its Application; Academic Press: Cambridge, MA, USA, 2014.
14. Hwang, E.; Shin, D.W. A CUSUM test for a long memory heterogeneous autoregressive model. Econ. Lett. 2013, 121, 379–383.

Table 1. Sample mean and standard error of 500 estimates in i.i.d. case.

Vector	Parameter	True value	Mean (s.e.), $n = 300$	Mean (s.e.), $n = 600$	Mean (s.e.), $n = 1000$
$β_{1}$	$β_{10}$	0.30	0.36 (0.17)	0.33 (0.12)	0.32 (0.08)
$β_{1}$	$β_{11}^{(d)}$	0.20	0.19 (0.07)	0.20 (0.04)	0.20 (0.03)
$β_{1}$	$β_{11}^{(w)}$	0.10	0.08 (0.14)	0.09 (0.10)	0.10 (0.07)
$β_{1}$	$β_{11}^{(m)}$	0.05	−0.05 (0.26)	0.00 (0.18)	0.02 (0.13)
$β_{1}$	$β_{12}^{(d)}$	0.10	0.10 (0.07)	0.10 (0.04)	0.10 (0.03)
$β_{1}$	$β_{12}^{(w)}$	0.05	0.06 (0.13)	0.06 (0.09)	0.06 (0.07)
$β_{1}$	$β_{12}^{(m)}$	0.02	0.04 (0.25)	0.02 (0.17)	0.02 (0.12)
$β_{2}$	$β_{20}$	−0.30	−0.37 (0.16)	−0.34 (0.12)	−0.32 (0.08)
$β_{2}$	$β_{21}^{(d)}$	0.07	0.06 (0.07)	0.07 (0.05)	0.07 (0.04)
$β_{2}$	$β_{21}^{(w)}$	0.04	0.06 (0.14)	0.05 (0.09)	0.05 (0.07)
$β_{2}$	$β_{21}^{(m)}$	0.01	0.01 (0.27)	0.01 (0.18)	0.01 (0.14)
$β_{2}$	$β_{22}^{(d)}$	0.25	0.24 (0.06)	0.25 (0.05)	0.25 (0.04)
$β_{2}$	$β_{22}^{(w)}$	0.07	0.05 (0.14)	0.06 (0.10)	0.06 (0.07)
$β_{2}$	$β_{22}^{(m)}$	0.10	0.00 (0.24)	0.05 (0.16)	0.08 (0.12)

Table 2. Sample mean and standard error of 500 estimates in correlated case.

Vector	Parameter	True value	Mean (s.e.), $n = 300$	Mean (s.e.), $n = 600$	Mean (s.e.), $n = 1000$
$β_{1}$	$β_{10}$	0.30	0.36 (0.30)	0.33 (0.20)	0.32 (0.15)
$β_{1}$	$β_{11}^{(d)}$	0.20	0.18 (0.09)	0.19 (0.07)	0.19 (0.05)
$β_{1}$	$β_{11}^{(w)}$	0.10	0.07 (0.22)	0.09 (0.15)	0.10 (0.12)
$β_{1}$	$β_{11}^{(m)}$	0.05	−0.04 (0.44)	0.00 (0.29)	0.01 (0.22)
$β_{1}$	$β_{12}^{(d)}$	0.10	0.11 (0.09)	0.11 (0.07)	0.11 (0.05)
$β_{1}$	$β_{12}^{(w)}$	0.05	0.05 (0.21)	0.05 (0.14)	0.04 (0.11)
$β_{1}$	$β_{12}^{(m)}$	0.02	0.03 (0.44)	0.03 (0.28)	0.04 (0.21)
$β_{2}$	$β_{20}$	−0.30	−0.38 (0.33)	−0.33 (0.21)	−0.31 (0.15)
$β_{2}$	$β_{21}^{(d)}$	0.07	0.06 (0.11)	0.06 (0.07)	0.06 (0.06)
$β_{2}$	$β_{21}^{(w)}$	0.04	0.04 (0.25)	0.05 (0.16)	0.05 (0.12)
$β_{2}$	$β_{21}^{(m)}$	0.01	0.06 (0.48)	0.02 (0.31)	0.01 (0.22)
$β_{2}$	$β_{22}^{(d)}$	0.25	0.25 (0.10)	0.25 (0.07)	0.26 (0.06)
$β_{2}$	$β_{22}^{(w)}$	0.07	0.05 (0.23)	0.06 (0.15)	0.06 (0.11)
$β_{2}$	$β_{22}^{(m)}$	0.10	−0.03 (0.47)	0.05 (0.30)	0.08 (0.21)

Table 3. Sample mean and standard error of 500 estimates in exponentially weighted HAR model.

Parameter	True value	Mean (s.e.), $n = 300$	Mean (s.e.), $n = 600$	Mean (s.e.), $n = 1000$
$β_{10}$	0.10	0.11 (0.09)	0.10 (0.06)	0.10 (0.05)
$c_{11}$	0.30	0.29 (0.05)	0.29 (0.04)	0.30 (0.03)
$c_{12}$	0.20	0.20 (0.05)	0.20 (0.04)	0.20 (0.03)
$β_{20}$	−0.30	−0.33 (0.09)	−0.32 (0.07)	−0.31 (0.05)
$c_{21}$	0.10	0.10 (0.05)	0.10 (0.04)	0.10 (0.03)
$c_{22}$	0.25	0.24 (0.05)	0.25 (0.04)	0.25 (0.03)
$λ$	0.50	0.49 (0.23)	0.52 (0.19)	0.50 (0.15)

Table 4. OLSEs in the bivariate HAR(p, 2) model on volatility of Gold and S&P500.

Order	Lags	Gold MSE	Gold R$^{2}$	Gold AIC	Gold BIC	S&P500 MSE	S&P500 R$^{2}$	S&P500 AIC	S&P500 BIC
$p = 3$	$(1, 5, 22)$	0.0492	0.790	−117.0	−84.68	0.0420	0.910	−236.6	−204.2
$p = 3$	$(1, 5, 6)$	0.0461 *	0.801 *	−169.8 *	−137.3 *	0.0387 *	0.916 *	−303.2 *	−270.7 *
$p = 4$	$(1, 5, 22, 66)$	0.0519	0.789	−73.30	−32.24	0.0448	0.909	−180.3	−139.3
$p = 4$	$(1, 3, 6, 7)$	0.0488	0.791	−128.1	−86.28	0.0410	0.912	−264.4	−222.6

* Denotes the best.

Table 5. Estimation results for Gold and S&P500 using HAR(3,2) with (h_{1}, h_{2}, h_{3}) = (1, 5, 6).

Parameter	Gold ($j = 1$): coeff. est. (s.e.)	Gold ($j = 1$): 95% C.I.	S&P ($j = 2$): coeff. est. (s.e.)	S&P ($j = 2$): 95% C.I.
$β_{j 0}$	0.0790 (0.016)	(0.047, 0.111)	−0.0088 (0.015)	(−0.038, 0.021)
$β_{j 1}^{(1)}$	0.9751 (0.038)	(0.901, 1.049)	0.0534 (0.034)	(−0.014, 0.121)
$β_{j 1}^{(2)}$	−1.0822 (0.187)	(−1.450, −0.714)	−0.4278 (0.172)	(−0.765, −0.091)
$β_{j 1}^{(3)}$	0.9301 (0.175)	(0.586, 1.274)	0.4458 (0.161)	(0.130, 0.761)
$β_{j 2}^{(1)}$	0.1478 (0.039)	(0.071, 0.225)	1.1153 (0.036)	(1.045, 1.186)
$β_{j 2}^{(2)}$	−0.6484 (0.198)	(−1.036, −0.261)	−1.1196 (0.181)	(−1.475, −0.764)
$β_{j 2}^{(3)}$	0.5879 (0.179)	(0.237, 0.939)	0.9338 (0.164)	(0.612, 1.256)

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
