Abstract
In this work, multivariate heterogeneous autoregressive-realized volatility (HAR-RV) models are discussed together with their least squares estimation. We consider multivariate HAR models of order p with q assets to explore the relationships between the volatilities of two or more assets. The strictly stationary solution of the HAR(p,q) model is investigated, and asymptotic normality of the least squares estimates is established in the cases of i.i.d. and correlated errors. In addition, an exponentially weighted multivariate HAR model with a common decay rate on the coefficients is discussed, together with estimation of the common rate. A Monte Carlo simulation is conducted to validate the estimators: the sample mean and standard error of the estimates, as well as the empirical coverage and average length of confidence intervals, are calculated. Lastly, the model is applied to real volatility data for the Gold spot price and the S&P500 index, and it is shown that the bivariate HAR model fitted with the selected optimal lags and estimated coefficients matches the volatility of the financial data well.
1. Introduction
Volatility in financial asset returns is one of the most important components in the financial market for optimization decisions such as portfolio selection, risk management and asset pricing. Over recent decades, statisticians and econometricians have analyzed volatility extensively using time series modeling. In particular, because multiple assets in the financial markets are correlated with each other, the cross-correlation of two or more asset returns and the spillover effect of volatility have been represented in multivariate rather than univariate time series models. Probabilistic properties such as the long memory of volatility, which means that historical volatility has a persistent impact on future volatility, have been incorporated efficiently into time series models through various statistical techniques.
In order to capture the long-memory persistence of volatility, Corsi [1] suggested a simple but remarkably efficient time series model with additive cascades defined on different time periods by incorporating daily, weekly and monthly volatility components. Corsi's model [1], called a heterogeneous autoregressive-realized volatility model of order 3 (HAR-RV(3) or simply HAR(3)), is a type of linear autoregression with heterogeneous regressors, namely the previous lag, the lagged weekly moving average, and the lagged monthly moving average. Since its introduction, various adapted versions of the HAR model have been used to analyze volatility in empirical data analysis. Here, we refer to [2,3,4,5] for univariate data and [6,7,8,9,10,11] for multivariate data.
In particular, Bubák et al. [6] used a multivariate extension of the HAR model to uncover volatility transmission between Central European currencies and the EUR/USD foreign exchange rate, whereas Souček and Todorova [8] employed a bivariate HAR model to explore the relationship between equity and oil market volatility. Cubadda et al. [9] proposed a vector HAR index model for detecting the presence of co-movements and analyzing the joint behavior in a set of daily realized volatility measures. Čech and Baruník [10] proposed a generalized HAR model for dynamic covariance matrix modelling and forecasting.
All of the references above dealt with multivariate HAR models only empirically and did not present theoretical results for the HAR model. Indeed, there are only a few theoretical analyses of the HAR model, even in the univariate case. Furthermore, the least squares estimate (LSE) method has been widely used as an estimation tool in time series data analysis. The LSEs are often used in the literature to analyze multivariate data in multivariate HAR models, and good volatility forecasting performance has been reported; see, for instance, [9,12]. However, for multivariate HAR models, a rigorous formulation of the asymptotic multivariate normality of the LSE has not yet been established, in spite of the frequent use of LSEs. The earlier studies motivate us to consider multivariate HAR models of order p with q assets and to look into the theoretical analysis of the LSEs. In this work we provide, as main results, the strict stationarity of the HAR(p,q) model and the asymptotic normality of the LSEs for the parameters in multivariate HAR models. The multivariate normality of the LSEs is established in two cases, i.i.d. and correlated errors. In addition, we consider an exponentially weighted multivariate HAR model and discuss the estimation of a common decay rate on the coefficients of this model.
A Monte Carlo experiment is conducted to illustrate the performance of the estimators. The sample mean and standard error of the LSEs, as well as the empirical coverage and average length of confidence intervals, are calculated. In addition, we model the volatility of the Gold spot price and the S&P500 index using a bivariate HAR model and estimate the coefficient parameters. The parameter estimates and confidence intervals are computed according to the asymptotic normality we derived. Finally, the bivariate HAR model is fitted to the market data with lags selected optimally under several criteria.
The remainder of the paper is organized as follows. In Section 2, the multivariate HAR model is presented with a discussion of its strictly stationary solution, and in Section 3 the LSEs and their multivariate normality are derived. Section 4 deals with the exponentially weighted multivariate HAR model. A Monte-Carlo experiment is performed in Section 5 and an application to market data is given in Section 6. We conclude in Section 7, where the main conclusions and a discussion of future work are stated.
2. Multivariate HAR(p,q) Models
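We consider the following HAR-RV model with multivariate data, which has been discussed in [8] as an extension of the HAR-RV(3) model in [1]:
$$Y_{j,t} = \beta_{j0} + \sum_{i=1}^{q}\left(\beta_{ji}^{(d)} Y_{i,t-1}^{(d)} + \beta_{ji}^{(w)} Y_{i,t-1}^{(w)} + \beta_{ji}^{(m)} Y_{i,t-1}^{(m)}\right) + \epsilon_{j,t}, \quad j = 1, \ldots, q, \tag{1}$$
where $\epsilon_{j,t}$ are random variables for error terms and $\beta_{j0}$, $\beta_{ji}^{(k)}$ are coefficient parameters. For each jth asset, $j = 1, \ldots, q$, the three volatility components $Y_{j,t}^{(k)}$, $k \in \{d, w, m\}$, corresponding to time horizons of one day (d), one week (w) and one month (m), are given by the realized volatility on day $t$ if $k = d$, and by the weekly and monthly averages of realized volatility if $k = w, m$: $Y_{j,t}^{(w)} = \frac{1}{5}\sum_{s=0}^{4} Y_{j,t-s}$ and $Y_{j,t}^{(m)} = \frac{1}{22}\sum_{s=0}^{21} Y_{j,t-s}$. Souček and Todorova [8] examined the volatility linkages of oil with three equity markets in three separate bivariate models, that is, in the bivariate case of assets $j = 1, 2$ with $q = 2$ in model (1).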
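Motivated by the above, we consider a general multivariate HAR-RV(p,q) model:
$$Y_{j,t} = \beta_{j0} + \sum_{i=1}^{q}\sum_{k=1}^{p} \beta_{ji}^{(k)} Y_{i,t-1}^{(k)} + \epsilon_{j,t}, \quad j = 1, \ldots, q, \tag{2}$$
where $\epsilon_{j,t}$ are random variables for errors with mean zero and finite variance, the lags $h_1, \ldots, h_p$ are positive integers assumed to satisfy $h_1 < h_2 < \cdots < h_p$, and $\beta_{j0}$, $\beta_{ji}^{(k)}$ are parameters to be estimated. The volatility components are defined in the same way as above: for $i = 1, \ldots, q$ and $k = 1, \ldots, p$, $Y_{i,t}^{(k)} = \frac{1}{h_k}\sum_{s=0}^{h_k-1} Y_{i,t-s}$.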
Model (2) is more general than model (1) and thus we consider model (2) in this work. As main results we investigate the strict stationarity of the model and develop the estimation of parameters.
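To make the regressors of a HAR-type model such as (2) concrete, the following Python sketch builds the moving-average components and the stacked regressor matrix. The function names `har_components` and `design_matrix`, the array layout, and the default horizons (1, 5, 22) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def har_components(y, lags=(1, 5, 22)):
    """Moving averages of each asset over every horizon in `lags`.

    y    : array of shape (n, q), one column per asset
    lags : increasing horizons h_1 < ... < h_p (defaults are illustrative)
    Returns an (n, q, p) array whose entry [t, i, k] is the mean of
    y[t-h_k+1], ..., y[t] for asset i, or NaN if fewer than h_k values exist.
    """
    y = np.asarray(y, dtype=float)
    n, q = y.shape
    out = np.full((n, q, len(lags)), np.nan)
    # csum[t] holds the sum of the first t rows, so window sums are differences.
    csum = np.vstack([np.zeros((1, q)), np.cumsum(y, axis=0)])
    for k, h in enumerate(lags):
        out[h - 1:, :, k] = (csum[h:] - csum[:-h]) / h
    return out

def design_matrix(y, lags=(1, 5, 22)):
    """Regressors [1, components at day t-1] and targets y[t], t = h_p, ..., n-1."""
    y = np.asarray(y, dtype=float)
    h_max = max(lags)
    comps = har_components(y, lags)
    # Row s pairs the components observed on day h_max-1+s with the target on day h_max+s.
    lagged = comps[h_max - 1:-1].reshape(len(y) - h_max, -1)
    X = np.hstack([np.ones((lagged.shape[0], 1)), lagged])
    return X, y[h_max:]

# Tiny example with two assets and horizons (1, 3); the data are artificial.
y_demo = np.column_stack([np.arange(10.0), np.arange(10.0) ** 2])
X_demo, T_demo = design_matrix(y_demo, lags=(1, 3))
print(X_demo.shape, T_demo.shape)  # one row per usable day
```

Each row of `X_demo` contains an intercept followed by the components of asset 1 and then asset 2, so a model with p horizons and q assets has 1 + pq regressors per equation.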
Let with , for . Model (2) can be written as follows:
where, for , (with ), , We write, for each , where and thus
where , and
Note that , is a matrix in and is a column vector in . Furthermore note that is written as
where, for ,
Therefore we have the following vector-autoregression of order :
which is written as
with lag operator L and the identity matrix .
We make the following assumption:
(A1) The roots of lie outside the complex unit circle (have modulus greater than one), or equivalently, the eigenvalues of the companion matrix (of dimension )
have modulus less than one.
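The eigenvalue form of (A1) can be checked numerically. The sketch below assumes the HAR coefficients are stored as an array `B` of shape (p, q, q), where `B[k][j, i]` multiplies the k-th component of asset i in the equation of asset j; this layout, the function names, and the example values are hypothetical. Writing each average out over its window spreads `B[k]` evenly over lags 1 to h_k, which gives the vector-autoregression lag matrices used to build the companion matrix.

```python
import numpy as np

def var_lag_matrices(B, lags):
    """Lag matrices of the vector autoregression implied by HAR coefficients."""
    B = np.asarray(B, dtype=float)
    q = B.shape[1]
    Phi = np.zeros((max(lags), q, q))
    for k, h in enumerate(lags):
        Phi[:h] += B[k] / h  # an h-day average puts weight 1/h on lags 1..h
    return Phi

def companion_matrix(Phi):
    """Companion matrix of dimension q*h_p for lag matrices Phi[0..h_p-1]."""
    h_max, q, _ = Phi.shape
    F = np.zeros((q * h_max, q * h_max))
    F[:q, :] = np.hstack(list(Phi))            # first block row holds the lag matrices
    F[q:, :-q] = np.eye(q * (h_max - 1))       # identity blocks shift the state
    return F

def satisfies_a1(B, lags):
    """True if all eigenvalues of the companion matrix have modulus below one."""
    F = companion_matrix(var_lag_matrices(B, lags))
    return bool(np.max(np.abs(np.linalg.eigvals(F))) < 1.0)

# Hypothetical bivariate coefficients for horizons (1, 5, 22).
B_demo = np.array([[[0.3, 0.1], [0.1, 0.3]],
                   [[0.2, 0.05], [0.05, 0.2]],
                   [[0.1, 0.05], [0.05, 0.1]]])
print(satisfies_a1(B_demo, lags=(1, 5, 22)))
```

For non-negative coefficients, the check passes when the summed coefficient matrix has spectral radius below one, which is the case for `B_demo`.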
Denote by the submatrix of a square matrix A, consisting of the first q rows and the first q columns of A, that is where . I is the identity matrix with compatible dimension.
Theorem 1.
Assume (A1) holds. Then there exists a unique strictly stationary solution with a finite first-order moment to the model and the solution has the form of
Proof of Theorem 1.
Let be the ith eigenvalue of the matrix and . Under (A1), we have . Furthermore it follows that exists and exists. Thus is strictly stationary and satisfies (5). It can be given straightforwardly.
Let denote column vector consisting of the first q components of a column vector A. Since , and , is also strictly stationary and
that is the desired result in (4). Now we show the uniqueness of the strictly stationary solution (4). Assume that is another strictly stationary solution of the vector autoregression (3) with . Then we have in the same way as above in (5). For any positive integer m we have , and we observe
where C is a constant independent of t and m. Since m is arbitrary, we see and thus almost surely. This completes the uniqueness. □
3. The Least Squares Estimation
We consider two cases on the error processes to discuss the least squares estimates: In the first case, we assume that are i.i.d. random variables with mean zero and variance for all , that is, . As a second but more general case, we adopt correlated error processes with where .
First, the case of i.i.d. error processes is stated as the following assumption, and the ordinary least squares estimator (OLSE) of the parameters in model (2) is investigated.
(A2) are i.i.d. random variables with mean zero and variance for all , that is, . Under (A1) and (A2), let with and , lag-ℓ autocovariance matrix function. Note that, by using (4),
Suppose that data are observed where with for .
Let
Then Equation (2) is, for each j, given by
and its matrix form is
where , matrix, and . The matrix form with is given by
where , and . The ordinary least squares estimator (OLSE) of is obtained by
Its transpose is given by
Equivalently, from (6) with , whose matrix form with , is given by
where , we obtain OLSE of for each j as follows.
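The equation-by-equation least squares computation described above can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the simulator, the function names, the horizons (1, 5, 22) and the coefficient values are hypothetical and are not the settings used in the paper.

```python
import numpy as np

def simulate_har(n, beta0, B, lags, chol, rng, burn=200):
    """Simulate a q-variate HAR series with Gaussian errors (illustrative only)."""
    h_max, q = max(lags), len(beta0)
    y = np.zeros((n + burn + h_max, q))
    for t in range(h_max, len(y)):
        y[t] = beta0 + chol @ rng.standard_normal(q)
        for k, h in enumerate(lags):
            y[t] += B[k] @ y[t - h:t].mean(axis=0)  # average of the previous h days
    return y[-n:]

def har_design(y, lags):
    """Regressor matrix [1, lagged averages] and targets y[t], t = h_p, ..., n-1."""
    n, h_max = len(y), max(lags)
    cols = [np.ones(n - h_max)]
    for i in range(y.shape[1]):
        for h in lags:
            avg = np.convolve(y[:, i], np.ones(h) / h, mode="valid")  # avg[s] = mean(y[s:s+h])
            cols.append(avg[h_max - h:n - h])                          # mean of y[t-h:t]
    return np.column_stack(cols), y[h_max:]

def ols_har(y, lags):
    """OLS estimates (row j = equation of asset j) and the residual matrix."""
    X, T = har_design(y, lags)
    coef = np.linalg.solve(X.T @ X, X.T @ T)  # normal equations for all q equations
    return coef.T, T - X @ coef

rng = np.random.default_rng(0)
lags = (1, 5, 22)
beta0 = np.array([0.1, 0.1])
B = np.array([[[0.3, 0.1], [0.1, 0.3]],
              [[0.2, 0.05], [0.05, 0.2]],
              [[0.1, 0.05], [0.05, 0.1]]])
y = simulate_har(2000, beta0, B, lags, np.eye(2), rng)
coef_hat, resid = ols_har(y, lags)
# True coefficients arranged in the same column order as the regressors.
true_coef = np.column_stack([beta0] + [B[k][:, i] for i in range(2) for k in range(3)])
print(np.round(coef_hat, 2))
```

Because every equation shares the same regressor matrix, a single solve of the normal equations returns the estimates for all assets at once.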
The following theorem states the asymptotic normality of the OLSE. We need the notations for the theorem:
Under the strictly stationary condition in (A1), for , let and
Furthermore let be the matrix with all components equal to one and $\mathrm{vec}$ the usual operator that stacks the columns of a given matrix.
Theorem 2.
Assume that (A1) and (A2) hold, and the matrix is of full rank. (a) For each , as we have
where Σ is given by
with and
(b) As we have
where ⊗ is the Kronecker product.
Proof of Theorem 2.
For (a), is written as
where
For the desired result, it suffices to show that as ,
First, we prove the convergence of .
where
It can be shown straightforwardly that the right-hand side of (9) converges to as by the weak law of large numbers (WLLN) for stationary sequences.
Second, we verify the convergence of in distribution. For this purpose, it is enough to show that
for any . Fix j and let and . Let . Note that and thus is a martingale sequence with respect to .
Similar to the proof of the convergence of , we can show that , which implies that for any
for some constant . Furthermore as . Hence, by the central limit theorem for martingale difference sequences (see [13]), the desired convergence of to in (10) in distribution holds. We complete the proof of (a).
For (b), we observe covariance matrix for , which is equal to
By (8),
which is zero because are independent. Thus the covariance matrix is zero, and therefore the desired asymptotic multivariate normality of is obtained with the covariance matrix . □
Now we adopt correlated error processes by assuming the following:
(A3) are correlated random variables with mean zero and , that is, the covariance matrix is assumed to be .
The covariance matrix is nonsingular and positive definite so there exists a nonsingular symmetric matrix and .
Under (A1) and (A3), note that lag-ℓ autocovariance matrix functions of are given by and
In this case the generalized least squares estimator (GLSE) is computed by minimizing the sum of squared standardized errors. The GLSE of is given as follows: From (7) we have
Let , and Note that and we have
The GLSE is obtained by
The matrix form of (11) with is given as where
The least squares estimator of is of the form , which is the estimator of . Furthermore note that . Hence the GLSE of is given as
that is of the same form as the OLSE. The following theorem presents the asymptotic normality of the GLSE.
Theorem 3.
Assume that (A1) and (A3) hold, and the matrix is of full rank.
(a) For each , as we have
where Σ is given as in Theorem 2 and .
(b) As we have
Theorem 3 includes the following case of the uncorrelated but heterogeneous variance of the error processes with if , and zero if .
Corollary 1.
Assume (A1) and if and , then as , and where is given by
Proof of Theorem 3.
The proof is similar to that of Theorem 2, except for the limit of the covariance matrix for . Proceeding as in the proof of Theorem 2, we get
which is equal to . Thus the desired asymptotic multivariate normality of is completed with the covariance matrix . □
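The observation in this section that the GLSE has the same form as the OLSE can be checked numerically when every equation shares one regressor matrix. The sketch below uses random placeholder data rather than HAR regressors, and the names `ols`, `gls_stacked` and `Sigma` are hypothetical; it stacks the q equations, weights them by the inverse error covariance, and compares the result with plain least squares.

```python
import numpy as np

def ols(X, T):
    """Equation-by-equation OLS; column j holds the estimates for equation j."""
    return np.linalg.solve(X.T @ X, X.T @ T)

def gls_stacked(X, T, Sigma):
    """GLS for the stacked system vec(T) = (I_q kron X) vec(coef) + vec(errors)."""
    n, q = T.shape
    Z = np.kron(np.eye(q), X)                     # regressors of the stacked system
    W = np.kron(np.linalg.inv(Sigma), np.eye(n))  # inverse of Cov(vec(errors)) = Sigma kron I_n
    t = T.flatten(order="F")                      # vec(T): stack the columns
    b = np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ t)
    return b.reshape(X.shape[1], q, order="F")

rng = np.random.default_rng(0)
n, m, q = 60, 4, 2
X = np.hstack([np.ones((n, 1)), rng.standard_normal((n, m - 1))])
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])        # arbitrary positive definite covariance
E = rng.standard_normal((n, q)) @ np.linalg.cholesky(Sigma).T
T = X @ rng.standard_normal((m, q)) + E
print(np.allclose(ols(X, T), gls_stacked(X, T, Sigma)))
```

The agreement follows from the Kronecker structure: the weighting matrix cancels in the GLS formula whenever the regressors are identical across equations, so only the asymptotic covariance changes.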
We have discussed parameter estimation by means of the least squares method and have established multivariate normality in multivariate HAR(p,q) models. For a univariate HAR model, Hwang and Shin [3] proposed an infinite-order, long-memory HAR model to capture the long-memory property and studied the asymptotic theory of the LSE. In [3,14], it was assumed that the HAR coefficients decrease exponentially, and it was shown that, under the exponential decay condition, the autocorrelation function of the HAR model decreases algebraically and thus the model has long memory. For this reason, we additionally consider the exponentially weighted multivariate HAR model with an exponential decay rate, and develop the asymptotic normality of the rate estimator. As a simple case we assume that multiple assets have a common rate on the coefficients, and the rate estimation problem is investigated in the following section. The general case of the exponentially weighted multivariate HAR model will be dealt with in a future study.
4. Exponentially Weighted Multivariate HAR Model
In this section, as a special case of model (2), we consider an exponentially weighted multivariate HAR model with common rate on the parameters of common regressors as follows: In model (2), for , and for , for some and . The parameters and are estimated using the LSE in Section 3 as follows:
We write , which is given in the asymptotic variances in Theorems 2 and 3, for , that is, is -component of . By Theorem 2 for each , we have for some which is the corresponding component of the matrix in Theorem 2. Indeed, is the same variance for all j, say, , and we can easily represent with . The following theorem states the asymptotic normality of the estimates for the common rate.
Theorem 4.
Assume (A1) and (A2) hold. In model (2) with as , we have
where with and following normal distribution with mean zero and variance
Proof of Theorem 4.
By Theorem 2 for each , we have for some which is the corresponding component of the matrix in Theorem 2, where . We may write
where s are normal random variables with mean zero and variance for each j and we have
Note that has asymptotically normal distribution with mean zero and variance . Therefore, the desired asymptotic normality of is obtained. □
Remark 1.
As pointed out by [3], the exponential decay condition is a condition for the long-memory property of HAR models. Ref. [3] discussed the HAR model of infinite order and its finite-order approximation, where it was shown that the exponential decay condition is equivalent to algebraically decaying autocorrelation functions under a mild lag condition. We are interested in testing whether or not the model is the exponentially weighted multivariate HAR model with common rate λ. For example, we construct the following hypothesis and test statistic: the null hypothesis for some and common rate , for each , versus the alternative hypothesis : the model either does not have a common rate or is not an exponentially weighted multivariate HAR model. For , let
and note that asymptotic normality of holds like Theorem 4 by that of the OLSEs. A collection of the differences is considered. In the collection, all distinct elements in absolute values are at most . These are relabelled as . Let or some related statistics of . Similarly to Theorem 4, we can find the limiting distribution of under the null hypothesis . The null might be rejected if is large or if is large. In a future work, a specific test statistic related to will be constructed and the limiting distribution of the test will be investigated.
5. A Monte-Carlo Study
We present simulation results for model (2) with and , which is a bivariate case of two assets, using parameters . We consider both independent errors and correlated ones for , : (i) i.i.d. normal distribution with and (ii) correlated normal distribution with with with . The bivariate HAR(3,2) processes are generated according to (2) with sample size .
We compute the LSEs for the parameters with 500 replications and give the sample means and standard errors of the 500 estimates in Table 1 and Table 2 with . The tables show that the sample means move closer to the true parameter values and the standard errors decrease as n increases. Additionally, Figure 1 and Figure 2 illustrate plots of sample means (a), (b) and standard errors (c), (d) of the LSEs of the parameters as n increases on the horizontal axis: Figure 1a,c and Figure 2a,c for the estimates of and Figure 1b,d and Figure 2b,d for the estimates of .
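A Monte-Carlo experiment of this kind can be sketched as follows. The settings here (40 replications, sample sizes 150 and 600, horizons (1, 5, 22), the coefficient values and the error correlation 0.5) are assumptions chosen to keep the example fast; they are not the values used for Tables 1 and 2, and the function names are hypothetical.

```python
import numpy as np

def simulate_har(n, beta0, B, lags, chol, rng, burn=100):
    """Simulate a q-variate HAR series; `chol` is a Cholesky factor of the error covariance."""
    h_max, q = max(lags), len(beta0)
    y = np.zeros((n + burn + h_max, q))
    for t in range(h_max, len(y)):
        y[t] = beta0 + chol @ rng.standard_normal(q)
        for k, h in enumerate(lags):
            y[t] += B[k] @ y[t - h:t].mean(axis=0)
    return y[-n:]

def har_design(y, lags):
    """Regressor matrix [1, lagged averages] and targets."""
    n, h_max = len(y), max(lags)
    cols = [np.ones(n - h_max)]
    for i in range(y.shape[1]):
        for h in lags:
            avg = np.convolve(y[:, i], np.ones(h) / h, mode="valid")
            cols.append(avg[h_max - h:n - h])
    return np.column_stack(cols), y[h_max:]

def monte_carlo(n, reps, beta0, B, lags, chol, seed=0):
    """Sample mean and standard deviation of the LSEs over `reps` simulated series."""
    rng = np.random.default_rng(seed)
    est = np.empty((reps, len(beta0), 1 + len(beta0) * len(lags)))
    for r in range(reps):
        X, T = har_design(simulate_har(n, beta0, B, lags, chol, rng), lags)
        est[r] = np.linalg.lstsq(X, T, rcond=None)[0].T
    return est.mean(axis=0), est.std(axis=0, ddof=1)

lags = (1, 5, 22)
beta0 = np.array([0.1, 0.1])
B = np.array([[[0.3, 0.1], [0.1, 0.3]],
              [[0.2, 0.05], [0.05, 0.2]],
              [[0.1, 0.05], [0.05, 0.1]]])
chol = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))  # correlated errors
truth = np.column_stack([beta0] + [B[k][:, i] for i in range(2) for k in range(3)])
mean_small, se_small = monte_carlo(150, 40, beta0, B, lags, chol)
mean_large, se_large = monte_carlo(600, 40, beta0, B, lags, chol)
print(np.round(se_small.mean(), 3), np.round(se_large.mean(), 3))
```

Comparing `se_small` with `se_large` mirrors the pattern reported in the tables, where the spread of the estimates shrinks as n grows.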
Table 1.
Sample mean and standard error of 500 estimates in i.i.d. case.
Table 2.
Sample mean and standard error of 500 estimates in correlated case.
Figure 1.
Sample mean (a,b) and standard error (c,d) of the 500 estimates in i.i.d. case.
Figure 2.
Sample mean (a,b) and standard error (c,d) of the 500 estimates in correlated case.
Confidence intervals based on the normal approximations in Theorems 2 and 3 with confidence level are constructed: , for each parameter component , with its LSE and standard-error estimate , where is the corresponding (diagonal) component of the estimate of the asymptotic covariance matrix in Theorems 2 and 3. The confidence intervals of 500 samples, together with the empirical coverage probabilities and average lengths in the i.i.d. error case, are shown in Figure 3, where sample size and replication number 500 are used. To illustrate the multivariate asymptotic normality, the normal approximations for estimates of some parameters are plotted in Figure 4, and the bivariate normality of some pairs of estimates can also be seen in Figure 5. The three figures support the normality results established in the theory.
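The coverage experiment can be sketched as follows for the i.i.d. error case. As an assumption, the standard errors below use the classical plug-in estimate (residual variance times the diagonal of the inverse cross-product matrix), which may differ in detail from the covariance estimate used in the paper; the quantile 1.96, the sample size 250, the 100 replications and all names are likewise illustrative.

```python
import numpy as np

def simulate_har(n, beta0, B, lags, chol, rng, burn=100):
    """Simulate a q-variate HAR series with Gaussian errors (illustrative only)."""
    h_max, q = max(lags), len(beta0)
    y = np.zeros((n + burn + h_max, q))
    for t in range(h_max, len(y)):
        y[t] = beta0 + chol @ rng.standard_normal(q)
        for k, h in enumerate(lags):
            y[t] += B[k] @ y[t - h:t].mean(axis=0)
    return y[-n:]

def har_design(y, lags):
    """Regressor matrix [1, lagged averages] and targets."""
    n, h_max = len(y), max(lags)
    cols = [np.ones(n - h_max)]
    for i in range(y.shape[1]):
        for h in lags:
            avg = np.convolve(y[:, i], np.ones(h) / h, mode="valid")
            cols.append(avg[h_max - h:n - h])
    return np.column_stack(cols), y[h_max:]

def ols_with_se(y, lags):
    """OLS estimates and plug-in standard errors, both of shape (q, 1 + p*q)."""
    X, T = har_design(y, lags)
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ T
    resid = T - X @ coef
    s2 = (resid ** 2).sum(axis=0) / (len(T) - X.shape[1])  # error variance per equation
    se = np.sqrt(np.outer(np.diag(xtx_inv), s2))
    return coef.T, se.T

def coverage(n, reps, beta0, B, lags, seed=0, z=1.96):
    """Empirical coverage and average length of normal-approximation intervals."""
    rng = np.random.default_rng(seed)
    q, p = len(beta0), len(lags)
    truth = np.column_stack([beta0] + [B[k][:, i] for i in range(q) for k in range(p)])
    hits, length = np.zeros_like(truth), np.zeros_like(truth)
    for _ in range(reps):
        coef, se = ols_with_se(simulate_har(n, beta0, B, lags, np.eye(q), rng), lags)
        hits += np.abs(coef - truth) <= z * se   # interval contains the true value
        length += 2 * z * se
    return hits / reps, length / reps

lags = (1, 5, 22)
beta0 = np.array([0.1, 0.1])
B = np.array([[[0.3, 0.1], [0.1, 0.3]],
              [[0.2, 0.05], [0.05, 0.2]],
              [[0.1, 0.05], [0.05, 0.1]]])
cov, avg_len = coverage(250, 100, beta0, B, lags)
print(np.round(cov, 2))
```

Each entry of `cov` is the fraction of replications whose interval contains the true coefficient, which is the quantity reported alongside the average length.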
Figure 3.
Empirical coverage and average length (in parentheses) of 95% confidence intervals for seven components of (left column) and (right column) in the i.i.d. error case. The horizontal dotted line indicates the true value of the parameter in each plot. Confidence intervals that do not contain the parameter are shown in red.
Figure 4.
Normal approximation of estimates in i.i.d. case (a–c) and correlated case (d–f).
Figure 5.
Bivariate normal approximation of some pairs of two chosen estimates ( in i.i.d. case (a–c) and correlated case (d–f).
As for the exponentially weighted multivariate HAR(3,2) model with in Section 4, the estimates for the common rate are given in Table 3, from which we see that sample means of estimated values are close to the true ones with reasonable standard errors.
Table 3.
Sample mean and standard error of 500 estimates in exponentially weighted HAR model.
6. Application
This section addresses empirical data analysis of the Gold spot price and the S&P500 index. Their volatility is modeled by a bivariate HAR model, using three years of daily closing prices of Gold and S&P500, from 18 September 2017 to 17 September 2020. The three years of Gold price and S&P500 index movements and their log returns are shown in Figure 6, while their volatility and autocorrelation functions (ACFs) are shown in Figure 7.
Figure 6.
Gold spot price in USD and its return (in red) and S&P500 index and its return (in blue) against number of days starting from 18 September 2017 to 17 September 2020.
Figure 7.
Volatility and ACF for Gold (in red) and S&P 500 (in blue).
We list the criteria MSE, $R^2$, AIC and BIC in Table 4 for the OLSEs (ordinary least squares estimators) of the bivariate HAR model of order by examining the volatility of Gold and S&P500. Conventionally lags of order are used for a day (), a week , a month () and a quarter () in a HAR model. Nevertheless, in this work, we consider optimal lags in the sense of minimizing the MSE of the OLSEs for Gold and S&P500 volatility simultaneously. For order , we choose lags to satisfy the condition, for some large , so that the sum of the two mean squared residuals of the OLSEs for the volatility of Gold and S&P500 is minimized. As a result, we found optimal lags for with (MSE, MSE and for with (MSE, MSE. In Table 4, we compare our optimal lags with the conventional lags under the different criteria. In all cases considered, our selection of optimal lags with turns out to be the best.
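The lag search described above can be sketched as an exhaustive search over increasing lag tuples. Several details are assumptions here: the search range `max_lag`, the use of a common estimation sample for all candidates, the Gaussian forms of AIC and BIC (up to additive constants), the toy data, and the function names; the paper's exact definitions may differ.

```python
import numpy as np
from itertools import combinations

def har_design(y, lags, start):
    """Regressors and targets for days t = start, ..., n-1 (start >= max(lags))."""
    n = len(y)
    cols = [np.ones(n - start)]
    for i in range(y.shape[1]):
        for h in lags:
            avg = np.convolve(y[:, i], np.ones(h) / h, mode="valid")  # avg[s] = mean(y[s:s+h])
            cols.append(avg[start - h:n - h])                          # mean of y[t-h:t]
    return np.column_stack(cols), y[start:]

def fit_criteria(y, lags, start):
    """Per-equation MSE, AIC and BIC of the OLS fit with the given lags."""
    X, T = har_design(y, lags, start)
    coef = np.linalg.lstsq(X, T, rcond=None)[0]
    mse = ((T - X @ coef) ** 2).mean(axis=0)
    n_eff, m = X.shape
    aic = n_eff * np.log(mse) + 2 * m              # one common Gaussian form
    bic = n_eff * np.log(mse) + m * np.log(n_eff)
    return {"lags": tuple(lags), "mse": mse, "aic": aic, "bic": bic}

def select_lags(y, p, max_lag):
    """Search h_1 < ... < h_p <= max_lag minimising the summed MSE over all assets."""
    best = None
    for cand in combinations(range(1, max_lag + 1), p):
        fit = fit_criteria(y, cand, start=max_lag)  # same sample for every candidate
        if best is None or fit["mse"].sum() < best["mse"].sum():
            best = fit
    return best

# Toy positive series standing in for two volatility series (not market data).
rng = np.random.default_rng(1)
e = rng.standard_normal((400, 2))
y = np.abs(e + 0.6 * np.roll(e, 1, axis=0))
best = select_lags(y, p=3, max_lag=12)
print(best["lags"], np.round(best["mse"], 4))
```

Fixing the first target day at `max_lag` keeps the number of observations identical across candidates, so the mean squared residuals are directly comparable.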
Table 4.
OLSEs in the bivariate HAR( model on volatility of Gold and S&P500.
Therefore we report estimation results of the HAR(3,2) model with for the volatilities of Gold and S&P500 in Table 5, where coefficient estimates, standard errors and 95% confidence intervals are provided. In Figure 8, two plots of the bivariate HAR(3,2) model fitted by the OLSEs for this optimal case on the Gold and S&P500 volatility datasets are shown along with the residuals. We see that the bivariate HAR(3,2) fits are similar to the real volatility plots of both datasets, as indicated by the criteria.
Table 5.
Estimation results for Gold and S&P500 using HAR(3,2) with .
Figure 8.
Fitted HAR model and residual using OLSE for the volatility of Gold and S&P500.
7. Conclusions
Due to the cross-correlation of multiple assets and the spillover effect of volatility in the financial market, the multivariate heterogeneous autoregressive-realized volatility (HAR-RV) model has attracted much attention recently. For the multivariate HAR model, its stationarity is discussed and estimation problems are studied. We first investigate the strictly stationary solution of the multivariate HAR model and second develop the asymptotic normality theory for the least squares estimates (LSEs) with i.i.d. and correlated errors, respectively. Third, we propose an exponentially weighted multivariate HAR model and estimate its common exponential decay rate. In a Monte-Carlo experiment, the performance of the LSEs is illustrated numerically with the sample mean and standard error of the estimates, as well as the empirical coverage and average length of confidence intervals based on the normal approximation. In addition, as a real data example, the volatilities of the Gold spot price and the S&P500 index over the most recent three years are analyzed with a bivariate HAR model. The coefficient estimates and confidence intervals are obtained for the bivariate HAR model of the volatility of Gold and S&P500, along with the choice of optimal lags, and it is shown that the bivariate HAR model with the proposed optimal lags matches the volatility of the financial data well.
We suggest some open problems on the multivariate HAR model. As proposed above, exponentially weighted HAR models with decay rates are of interest owing to their reduced number of parameters as well as the long-memory property. In modeling with the multivariate HAR model, testing whether the HAR coefficients have exponentially weighted decay rates and, furthermore, whether multiple assets have a common decay rate might provide statistically useful tools for analyzing the time series model. In follow-up studies we will deal with these hypothesis tests by constructing the test statistics and establishing the null limiting distributions. Finally, we mention that in a multivariate HAR model with heteroscedastic errors, the asymptotic properties of the estimates differ from the existing results, and thus we will derive the asymptotic theory for HAR models in the presence of dynamic heteroscedasticity. In this case, financial market data can be represented more accurately, and volatility forecasting together with an assessment of forecast accuracy will be carried out in further research.
Author Contributions
Validation, W.-T.H., J.L. and E.H.; writing–review and editing, W.-T.H., J.L. and E.H.; funding acquisition, E.H. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by Research Fund of Gachon University (GCU-2019-0299).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Corsi, F. A simple approximate long-memory model of realized volatility. J. Financ. Econom. 2009, 7, 174–196.
- Chen, Y.; Härdle, W.K.; Pigorsch, U. Localized realized volatility modeling. J. Am. Stat. Assoc. 2010, 105, 1376–1393.
- Hwang, E.; Shin, D.W. Infinite-order, long-memory heterogeneous autoregressive models. Comput. Stat. Data Anal. 2014, 76, 339–358.
- Qu, H.; Ji, P. Modeling realized volatility dynamics with a genetic algorithm. J. Forecast. 2016, 35, 434–444.
- Audrino, F.; Huang, C.; Okhrin, O. Flexible HAR model for realized volatility. Stud. Nonlinear Dyn. Econom. 2018, 23.
- Bubák, V.; Kočenda, E.; Žikeš, F. Volatility transmission in emerging European foreign exchange markets. J. Bank. Financ. 2011, 35, 2829–2841.
- Chiriac, R.; Voev, V. Modelling and forecasting multivariate realized volatility. J. Appl. Econom. 2011, 26, 922–947.
- Souček, M.; Todorova, N. Realized volatility transmission between crude oil and equity futures markets: A multivariate HAR approach. Energy Econ. 2013, 40, 586–597.
- Cubadda, G.; Guardabascio, B.; Hecq, A. A vector heterogeneous autoregressive index model for realized volatility measures. Int. J. Forecast. 2017, 33, 337–344.
- Čech, F.; Baruník, J. On the modelling and forecasting of multivariate realized volatility: Generalized heterogeneous autoregressive (GHAR) model. J. Forecast. 2017, 36, 181–206.
- Luo, J.; Chen, L. Realized volatility forecast with the Bayesian random compressed multivariate HAR model. Int. J. Forecast. 2020, 36, 781–799.
- Taylor, N. Realized volatility forecasting in an international context. Appl. Econom. Lett. 2015, 22, 503–509.
- Hall, P.; Heyde, C.C. Martingale Limit Theory and Its Application; Academic Press: Cambridge, MA, USA, 2014.
- Hwang, E.; Shin, D.W. A CUSUM test for a long memory heterogeneous autoregressive model. Econ. Lett. 2013, 121, 379–383.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).