Article

Exponentially Weighted Multivariate HAR Model with Applications in the Stock Market

1 Department of International Studies, Kyung Hee University, Yongin-si 17104, Korea
2 Department of Applied Statistics, Gachon University, Seongnam-si 13120, Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Entropy 2022, 24(7), 937; https://doi.org/10.3390/e24070937
Submission received: 30 May 2022 / Revised: 30 June 2022 / Accepted: 1 July 2022 / Published: 6 July 2022
(This article belongs to the Special Issue Applications of Statistical Physics in Finance and Economics)

Abstract

This paper considers a multivariate time series model for stock prices in the stock market. A multivariate heterogeneous autoregressive (HAR) model is adopted with exponentially decaying coefficients. This model is not only suitable for multivariate data with strong cross-correlation and long memory, but also represents a common structure of the joint data in terms of decay rates. Tests are proposed to identify the existence of the decay rates in the multivariate HAR model. The null limiting distributions are established as the standard Brownian bridge and are proven by means of a modified martingale central limit theorem. Simulation studies are conducted to assess the performance of tests and estimates. Empirical analysis with joint datasets of U.S. stock prices illustrates that the proposed model outperforms the conventional HAR models via OLSE and LASSO with respect to residual errors.

1. Introduction

Financial market data are often correlated with each other and need to be analyzed together. If two or more financial series belong to the same category or reveal a similar pattern with strong correlation, the bivariate or multivariate datasets should be modeled jointly by an appropriate multivariate time series model to obtain good performance. Multivariate financial data with characteristics such as strong correlation and long memory have attracted much attention from econometricians and statisticians. Among various time series models, one of the most popular and powerful models capturing such financial features is the heterogeneous autoregressive realized-volatility (HAR-RV) model [1]. Based on the HAR model, this paper considers a multivariate model to analyze joint time series data with strong cross-correlation.
The HAR-RV model was originally proposed and has been widely used to explore the predictability of realized volatility [2,3,4,5]. In particular, Andersen et al. [2] used HAR models for volatility prediction of stock prices, foreign exchange rates, and bond prices. Corsi et al. [3] discussed the volatility of realized volatility based on HAR models with non-Gaussianity and volatility clustering. McAleer and Medeiros [4] proposed an extension of the HAR model with a multiple-regime smooth transition which contains long memory and nonlinearity, and incorporates sign and size asymmetries. Hillebrand and Medeiros [5] considered log-linear and neural network HAR models of realized volatility. Tang and Chi [6] found that the HAR model showed better predictive ability than the ARFIMA-RV model. Clements et al. [7], Bollerslev et al. [8], Bianco et al. [9], and Asai et al. [10] investigated successful uses of HAR models for risk management with VaR measures, the risk-return tradeoff, serial correlation, implied volatility, and realized volatility errors. Luo et al. [11] incorporated jumps, leverage effects, and speculation effects into realized volatility modeling and showed that the portfolio using an infinite hidden Markov regime-switching HAR model achieves higher portfolio returns than the benchmark HAR model. Meanwhile, as applications of the HAR-RV model to various financial data such as oil, gold, and bitcoin realized volatility [12,13,14,15], we considered extensions of the model incorporating an associated-uncertainty index to obtain high forecasting gains.
Along with the success of the univariate HAR models above, multivariate HAR models have been adopted for financial analysis due to their usefulness with multivariate data. Many researchers have discussed the superiority of the multivariate HAR model. Busch et al. [16] used a vector HAR model to control possible endogeneity issues. Taylor [17] demonstrated that the multivariate HAR-RV model improved forecast accuracy of realized volatility in the international stock market. The claim in [17] was also verified by Hwang and Hong [18], who dealt with the multivariate HAR-RV model with heteroskedastic errors. Cech and Barunik [19] showed that the generalized HAR model offers better predictability than univariate models in commodity markets. Tang et al. [20,21] showed that the multivariate HAR-RV model is more accurate in out-of-sample forecasting and outperforms the univariate models.
In addition, the multivariate HAR model represents strong correlations between multiple-asset data and examines cross-market spillover effects. For instance, Bubak et al. [22] used a multivariate extension of the HAR model to analyze volatility transmission between currencies and foreign exchange rates, whereas Bauer and Vorkink [23] adopted a multivariate setting of the HAR model and showed how to ensure a positive definite covariance matrix without parameter restrictions. Soucek and Todorova [24] found instantaneous correlation between equity and energy futures by proposing a vector HAR model. Cubadda et al. [25] studied a vector HAR index model for detecting the presence of commonalities in a set of realized volatility measures, whereas Bollerslev et al. [26] proposed a scalar version of the vectorized HAR model for the variances and correlations separately. Luo and Ji [27] combined the HAR model with other models to identify time-varying volatility connectedness. Luo and Chen [28] employed a matrix log transformation to ensure the positive definiteness of covariance matrices and developed a Bayesian random compressed multivariate HAR model to forecast the realized covariance matrices of stock returns. Wilms et al. [29] showed that cross-market spillover effects embedded in multivariate HAR models have long-term forecasting power.
Even though the HAR model is widely used for volatility forecasting based on the realized volatility of intraday prices, it is not restricted to realized volatility; it can be applied to various time series data such as the stock price itself or other economic indices, because the HAR model is theoretically a linear AR model. Stock market forecasting techniques were surveyed in [30,31,32], covering stock returns, stock prices, and volatility via conventional time series methods and soft computing methods. Stock price modeling is mostly based on the efficient market hypothesis (EMH), random walk theory, and machine learning techniques, as in [33,34,35,36]. According to the EMH, the only relevant information about a stock is its current value. A promising application of the HAR model could therefore be stock price movement. A reason why the HAR model is expected to perform well in stock price modeling is that, in the model, the future value is driven by the current value itself and the current averages.
In this work, we propose a multivariate time series model for strongly correlated data and study its statistical inference, namely hypothesis testing and estimation, with an empirical analysis of joint data of financial assets. More specifically, we focus on the multivariate HAR model with exponentially decaying coefficients for application to stock prices in the stock market. Because two or more financial series may exhibit a similar pattern with strong correlation, a multivariate model should be adopted for the multiple assets, instead of univariate models for each asset. However, when the multivariate HAR model is employed to analyze the multiple data, there are many parameters to be estimated. For example, even with two assets, a bivariate HAR(3) model has 14 parameters including two intercept terms. For better performance, we need to reduce errors along with fewer parameters in the model. As a step in this direction, we consider the exponentially weighted multivariate HAR model, which has exponentially decaying coefficients. If decay rates can be imposed in the multivariate HAR model, the number of parameters decreases substantially and the proposed model may outperform the existing models while reducing the errors as well. This is one of the motivations for this work, in the spirit of the principle of parsimony. Moreover, the decay rates not only capture the long-memory effect as seen in [37], but also represent the commonality of the joint data, as we expect a common structure in multiple assets with strong correlation.
In order to employ the proposed model for joint time series data, the data need to be tested before fitting the model. To this end, we deal with a test problem based on the CUSUM test to identify the presence of decay coefficients in the multivariate HAR model of the fitted data. In general, the CUSUM test is a change-point test and is reasonable if parameter changes are expected within the time series. For example, Refs. [38,39] dealt with CUSUM(SQ) tests for mean and variance change-detection in univariate HAR($\infty$) models, and [40] proposed a CUSUM test for parameter change in dynamic panel models. However, the idea of the CUSUM tests can be applied to detect other dynamic structures. In this work, we suggest the use of such an idea to detect coefficient structure by generating a pseudo-time series of residuals in two versions. In other words, by applying the idea of the tests to the difference series of two types of residuals, rather than to the original data, the coefficient structure can be identified. That is, the CUSUM(SQ) tests of mean and variance change-detection in [38,39] are used for the pseudo-time series generated by the two residuals. The key point is that under the null hypothesis, the mean or variance of the difference series does not change over time, whereas under the alternative hypothesis there exist change-points in the mean or variance of the difference series of the two residuals. This idea is a novel attempt in that the CUSUM tests are used for other test problems in time series analysis, not limited to the conventional change-point detection of the raw data.
This work proposes two CUSUM-based tests to detect whether the underlying model has exponentially decaying coefficients. The first test is conducted to test whether the model has an exponential decay rate for each asset, and the second tests whether the exponentially weighted multivariate HAR model has a common decay rate for all the multiple assets. The null limiting distributions are developed as the standard Brownian bridge, and the theoretical results are proven by means of a modified version of a martingale central limit theorem. Additionally, easy-to-implement estimators of the decay rates are discussed.
A Monte Carlo experiment is carried out to examine the sample paths of our model and to validate the proposed statistical methods. The sample paths display the long-memory feature as well as the strong cross-correlation of the simulated data. Furthermore, various related series such as the difference series and the test statistics are depicted to justify our proposed tests under the null and alternative hypotheses. The simulation study not only strongly supports the proposed CUSUM tests with reasonable size and power performance, but also shows consistency of the estimates of the decay rates. To compare with the conventional HAR model, the root mean squared error (RMSE), mean absolute error (MAE), AIC, and BIC are evaluated in the models for several values of the fitting parameter, and the efficiency of the exponentially weighted HAR model relative to the benchmark HAR model is computed using the two metrics RMSE and MAE. Our proposed model with fewer parameters is shown to reduce the residual errors compared to the existing HAR models.
As an empirical application of this work, financial market stock prices with similar patterns are selected to suit the multivariate HAR model. It is interesting that the exponentially weighted multivariate HAR model is shown to be suitable for the joint data of U.S. stock prices, rather than the volatility. Our proposed CUSUM tests favor the existence of the decay rates in the multivariate HAR model of the stock prices, based on the computed test statistics. The decay-rate estimators for the stock prices are evaluated as well. The stock prices are well-matched to the exponentially weighted multivariate HAR model. To compare performance of the proposed model, RMSE, MAE, AIC, and BIC are evaluated along with those of the conventional univariate and multivariate HAR models via OLSE and LASSO. The exponentially weighted multivariate HAR model outperforms others in the chosen datasets of U.S. stock prices.
We summarize the main benefits of the exponentially weighted multivariate HAR model as follows: fewer parameters, reduction of the model-fitting errors, representation of a common structure through the decay rates, and suitability for joint datasets of stock prices with similar patterns. Our proposed model is suitable for strongly cross-correlated multivariate (bivariate) data with similar patterns because the decay rates yield a common structure in the joint data. Along with the high applicability of the HAR model, the proposed model can be used to analyze and forecast joint data with strong correlation and long memory, and its extension with an exogenous variable such as an associated-uncertainty index can be considered, as in [12,13,14,15]. The proposed model can help analysts build simpler and more efficient models by producing smaller prediction errors in financial time series. Furthermore, it has the potential to extend to dynamic time series models with error terms featuring heteroscedasticity, time-varying variance, non-Gaussianity, or heavy-tailed distributions, which are more realistic in real-world financial markets.
The remainder of the paper is organized as follows. In Section 2 we describe the model and develop the main results of the tests, and in Section 3 a simulation study is performed. In Section 4, empirical examples are given. Concluding remarks are stated in Section 5, and proofs are given in Appendix A.

2. Model and Main Results

We consider a multivariate HAR($p,q$) model $\{Y_{j,t}: t \in \mathbb{Z},\ j = 1, 2, \ldots, q\}$ of order $p$ with $q$ multiple assets, given by
$$Y_{j,t} = \beta_{j0} + \sum_{i=1}^{p}\beta_{j1}^{(i)} Y_{1,t-1}^{(i)} + \cdots + \sum_{i=1}^{p}\beta_{jq}^{(i)} Y_{q,t-1}^{(i)} + \epsilon_{j,t}, \qquad (1)$$
where $Y_{j,t-1}^{(i)} = \frac{1}{h_i}\left(Y_{j,t-1} + \cdots + Y_{j,t-h_i}\right)$ with positive integers $\{h_i,\ i = 1, 2, \ldots, p\}$ satisfying $1 = h_1 < h_2 < \cdots < h_p < \infty$; $\beta_{j0}$ and $\{\beta_{jk}^{(i)}: j, k \in \{1, \ldots, q\};\ i \in \{1, 2, \ldots, p\}\}$ are parameters to be estimated; and $\{\epsilon_{j,t},\ t \in \mathbb{Z},\ j = 1, \ldots, q\}$ are independent random variables with mean zero and finite variance.
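The lagged-average regressors $Y_{j,t-1}^{(i)}$ in (1) can be assembled directly from the raw series. The following minimal sketch (Python/NumPy; the function names are ours, and the default lag structure $h = (1, 5, 22)$ is the one used later in the paper) builds the HAR design matrix whose row $t$ contains the intercept and the averages of all $q$ assets.

```python
import numpy as np

def har_averages(y, h=(1, 5, 22)):
    """Lagged HAR averages: column i holds (y[t-1] + ... + y[t-h_i]) / h_i."""
    n, p, hp = len(y), len(h), max(h)
    out = np.full((n, p), np.nan)
    for t in range(hp, n):
        for i, hi in enumerate(h):
            out[t, i] = y[t - hi:t].mean()
    return out  # row t collects the regressors used to predict y[t]

def har_design(Y, h=(1, 5, 22)):
    """Design matrix X_{t-1} = (1, averages of asset 1, ..., averages of asset q)."""
    n, q = Y.shape
    cols = [np.ones((n, 1))] + [har_averages(Y[:, j], h) for j in range(q)]
    X = np.hstack(cols)
    keep = ~np.isnan(X).any(axis=1)      # drop the first h_p rows lost to averaging
    return X[keep], Y[keep]
```

With this design matrix, the OLSE $\hat{\beta}_j$ of each equation can be obtained by an ordinary least squares fit of the column $Y_{j,t}$ on the rows of the matrix, e.g., via np.linalg.lstsq.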
In this work, we are particularly concerned with the multivariate HAR model with exponentially decaying coefficients in order to place lesser weights on values farther in the past. In the conventional HAR model, the regressors are the previous value and the weekly and monthly averages of consecutive data, which are assigned coefficients in decreasing order to represent the long-memory features. For example, see [37], which introduced the (univariate) HAR($\infty$) model with coefficients decaying exponentially to capture genuine long memory. They showed that exponentially decaying coefficients yield algebraically decreasing autocovariance functions under appropriate lag conditions in the HAR($\infty$) model. Likewise, we consider the exponentially weighted coefficient version of model (1) with multiple assets, called the exponentially weighted multivariate HAR($p,q$) model. In our proposed model, the coefficients are assumed to be
$$\beta_{jk}^{(i)} = c_{jk}\,\lambda_j^{\,i-1} \quad \text{for some } c_{jk} \text{ and } 0 < |\lambda_j| < 1,$$
for $j, k \in \{1, \ldots, q\}$ and $i \in \{1, \ldots, p\}$. Here, $c_{jk}$ is the first coefficient, attached to the previous value $Y_{k,t-1}$ of the $k$th asset at the first lag $t-1$, and $\lambda_j$ is the decay rate for the subsequent coefficients. The exponentially weighted multivariate HAR model has long memory, as seen in Figure 1 and Figure 2 in the next section, where autocovariance functions as well as sample paths of the model with decay rates are shown. The decay rates $\lambda_j$ not only clearly represent the long-memory feature but also reduce the number of parameters to estimate. In this work, we mainly focus on detecting the existence of the decay rates in model (1) and additionally deal with estimating the decay rates.
In the multivariate HAR model, we first study the hypothesis testing problem of whether the underlying model is an exponentially weighted multivariate HAR model with decay rates, and secondly we consider easy-to-implement estimators of the decay rates.
For the hypothesis test problem, we consider two tests in (i) and (ii) as follows:
(i)
whether or not, in the multivariate HAR model, the $j$th asset has a decay rate $\lambda_j$ satisfying $\beta_{jk}^{(i)} = c_{jk}\lambda_j^{\,i-1}$ for some $c_{jk}$, for each $j$;
(ii)
whether or not the exponentially weighted multivariate HAR model has a common rate $\lambda$ for all the multiple assets, i.e., $\beta_{jk}^{(i)} = c_{jk}\lambda^{\,i-1}$ for some $0 < |\lambda| < 1$, for all $j, k$.
In test (i), each asset is first analyzed individually. Once test (i) has been conducted and favors the null, test (ii) is performed to detect a common rate. For test (i), the null hypothesis $H_{j,0}$ and the alternative hypothesis $H_{j,A}$ are, for each $j$, stated as
$$H_{j,0}: \beta_{jk}^{(i)} = c_{jk}\lambda_j^{\,i-1} \ \text{for some } c_{jk} \text{ and } 0 < |\lambda_j| < 1, \quad \text{vs.} \quad H_{j,A}: \text{the model is not the exponentially weighted HAR model}.$$
In order to introduce a test statistic, we adopt the ordinary least squares estimator (OLSE) of the multivariate HAR model. Suppose that we have observed $\{Y_{j,t}: -h_p + 1 \le t \le n,\ j = 1, 2, \ldots, q\}$ of sample size $n$. Let the OLSE of $\beta_j \equiv (\beta_{j0}, \beta_{j1}^{(1)}, \ldots, \beta_{j1}^{(p)}, \ldots, \beta_{jq}^{(1)}, \ldots, \beta_{jq}^{(p)})$ be denoted by
$$\hat{\beta}_j \equiv (\hat{\beta}_{j0}, \hat{\beta}_{j1}^{(1)}, \ldots, \hat{\beta}_{j1}^{(p)}, \ldots, \hat{\beta}_{jq}^{(1)}, \ldots, \hat{\beta}_{jq}^{(p)}).$$
The asymptotic property of the OLSE $\hat{\beta}_j$ in the multivariate HAR model was derived theoretically by Hong et al. [41]. From the OLSE, we first choose an estimate of $c_{jk}$ as $\hat{c}_{jk} = \hat{\beta}_{jk}^{(1)}$, and then consider a regression model with the decay rates as its coefficients under the null hypothesis, as in (2) and (4) below. To describe the regression model, we let
$$\eta_{j,t} = Y_{j,t} - \beta_{j0} - \sum_{k=1}^{q} c_{jk} Y_{k,t-1}^{(1)}, \qquad \hat{\eta}_{j,t} = Y_{j,t} - \hat{\beta}_{j0} - \sum_{k=1}^{q} \hat{c}_{jk} Y_{k,t-1}^{(1)}.$$
Note that under the null hypothesis,
$$\eta_{j,t} = c_{j1}\sum_{i=2}^{p}\lambda_j^{\,i-1} Y_{1,t-1}^{(i)} + \cdots + c_{jq}\sum_{i=2}^{p}\lambda_j^{\,i-1} Y_{q,t-1}^{(i)} + \epsilon_{j,t}.$$
We rewrite $\eta_{j,t}$ as follows:
$$\eta_{j,t} = \lambda_j W_{j,t-1}^{(2)} + \cdots + \lambda_j^{\,p-1} W_{j,t-1}^{(p)} + \epsilon_{j,t}, \qquad (2)$$
where $W_{j,t-1}^{(i)} = c_{j1} Y_{1,t-1}^{(i)} + \cdots + c_{jq} Y_{q,t-1}^{(i)}$ for $i = 2, 3, \ldots, p$. Let
$$\hat{W}_{j,t-1}^{(i)} = \hat{c}_{j1} Y_{1,t-1}^{(i)} + \cdots + \hat{c}_{jq} Y_{q,t-1}^{(i)}, \qquad (3)$$
and consider the following regression in (4) with coefficients $\lambda_{j,1}, \ldots, \lambda_{j,p-1}$, which is of a similar form to (2) but with the observable quantities $\hat{\eta}_{j,t}$ and $\hat{W}_{j,t-1}^{(k)}$, $k = 2, \ldots, p$:
$$\hat{\eta}_{j,t} = \lambda_{j,1}\hat{W}_{j,t-1}^{(2)} + \cdots + \lambda_{j,p-1}\hat{W}_{j,t-1}^{(p)} + \epsilon_{j,t}. \qquad (4)$$
From this regression we compute the OLSE $\hat{\Lambda}_{j,n}$ of the parameters $(\lambda_{j,1}, \ldots, \lambda_{j,p-1})$, denoted $\hat{\Lambda}_{j,n} = (\hat{\lambda}_{j,1}, \ldots, \hat{\lambda}_{j,p-1})$. Note that under the null hypothesis with $\beta_{jk}^{(i)} = c_{jk}\lambda_j^{\,i-1}$, it follows that
$$\left|\hat{\lambda}_{j,i-1} - \hat{\lambda}_{j,1}^{\,i-1}\right| \xrightarrow{p} 0. \qquad (5)$$
Thus, to construct a test statistic, two types of residuals $\{\hat{\epsilon}_{j,t}\}$ and $\{\tilde{\epsilon}_{j,t}\}$ are respectively defined by
$$\hat{\epsilon}_{j,t} = \hat{\eta}_{j,t} - \left(\hat{\lambda}_{j,1}\hat{W}_{j,t-1}^{(2)} + \hat{\lambda}_{j,2}\hat{W}_{j,t-1}^{(3)} + \cdots + \hat{\lambda}_{j,p-1}\hat{W}_{j,t-1}^{(p)}\right), \qquad
\tilde{\epsilon}_{j,t} = \hat{\eta}_{j,t} - \left(\hat{\lambda}_{j,1}\hat{W}_{j,t-1}^{(2)} + \hat{\lambda}_{j,1}^{2}\hat{W}_{j,t-1}^{(3)} + \cdots + \hat{\lambda}_{j,1}^{\,p-1}\hat{W}_{j,t-1}^{(p)}\right).$$
To construct a test statistic, we use the difference series of the two types of residuals (not the original time series). Let
$$D_{j,t} \equiv D_{j,t,n} = \tilde{\epsilon}_{j,t} - \hat{\epsilon}_{j,t} = (\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^{2})\hat{W}_{j,t-1}^{(3)} + \cdots + (\hat{\lambda}_{j,p-1} - \hat{\lambda}_{j,1}^{\,p-1})\hat{W}_{j,t-1}^{(p)}. \qquad (6)$$
Let $S_{j,n} = \frac{1}{\hat{\sigma}_{j,D}\sqrt{n}}\sum_{t=1}^{n} D_{j,t}$, where $\hat{\sigma}_{j,D}^2$ is a consistent estimator of $\mathrm{Var}(D_{j,t})$, for example,
$$\hat{\sigma}_{j,D}^2 = \frac{1}{n}\sum_{t=1}^{n} D_{j,t}^2 \quad \text{or} \quad \hat{\sigma}_{j,D}^2 = \frac{1}{n}\sum_{t=1}^{n} D_{j,t}^2 - \left(\frac{1}{n}\sum_{t=1}^{n} D_{j,t}\right)^2, \qquad (7)$$
noting that $E[D_{j,t}] \to 0$ as $n \to \infty$ under the null hypothesis. Now we define a CUSUM test statistic $\hat{T}_{j,n}(z)$ as follows: for $0 \le z \le 1$,
$$\hat{T}_{j,n}(z) = \frac{1}{\hat{\sigma}_{j,D}\sqrt{n}}\left(\sum_{t=1}^{[nz]} D_{j,t} - z\sum_{t=1}^{n} D_{j,t}\right). \qquad (8)$$
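As an illustration of how the statistic in (8) might be computed in practice, the following sketch (Python/NumPy; the function names are ours, and evaluating the supremum on the grid $z = t/n$ is an assumption) forms the difference series in (6) from the estimated coefficients and evaluates $\sup_{0\le z\le 1}|\hat{T}_{j,n}(z)|$ using the first variance estimator in (7).

```python
import numpy as np

def diff_series(lam_hat, W_hat):
    """D_{j,t} in (6). lam_hat = (lambda_hat_{j,1}, ..., lambda_hat_{j,p-1});
    the columns of W_hat are W_hat^{(2)}, ..., W_hat^{(p)} from (3)."""
    lam_hat = np.asarray(lam_hat, dtype=float)
    coef = lam_hat[1:] - lam_hat[0] ** np.arange(2, len(lam_hat) + 1)
    return W_hat[:, 1:] @ coef

def cusum_stat(D):
    """sup_z |T_hat_{j,n}(z)| from (8), evaluated on the grid z = t/n."""
    D = np.asarray(D, dtype=float)
    n = len(D)
    sigma = np.sqrt(np.mean(D ** 2))          # first estimator in (7)
    partial = np.cumsum(D)
    z = np.arange(1, n + 1) / n
    T = (partial - z * partial[-1]) / (sigma * np.sqrt(n))
    return np.max(np.abs(T))
```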
The following theorem states the asymptotic distributions of both statistics. It provides critical values of the test for $H_{j,0}$.
Theorem 1.
We assume $E|Y_{j,t}|^{2+\delta} < \infty$ for some $\delta > 0$, for all $j, t$. If the multivariate HAR($p,q$) model has an exponential decay rate $\lambda_j$ with $\beta_{jk}^{(i)} = c_{jk}\lambda_j^{\,i-1}$ for some $0 < |\lambda_j| < 1$ for each $j$, then we have, as $n \to \infty$,
$$S_{j,n} \xrightarrow{d} N(0,1) \quad \text{and} \quad \sup_{0\le z\le 1}\left|\hat{T}_{j,n}(z)\right| \xrightarrow{d} \sup_{0\le z\le 1}\left|B^{0}(z)\right|,$$
where $B^{0}(z) = B(z) - zB(1)$ is the standard Brownian bridge with Brownian motion $B(z)$.
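The critical values used later in the paper (1.358 at the 5% level and 1.224 at the 10% level) are quantiles of $\sup_{0\le z\le 1}|B^0(z)|$. As a rough check, such quantiles can be approximated by simulating the Brownian bridge on a fine grid; the sketch below (Python/NumPy; the grid size, replication number, and function name are our own choices) is one way to do so.

```python
import numpy as np

def brownian_bridge_sup_quantile(alpha=0.05, n_grid=2000, n_rep=20000, seed=0):
    """Approximate the (1 - alpha) quantile of sup_z |B^0(z)| by simulation."""
    rng = np.random.default_rng(seed)
    sups = np.empty(n_rep)
    z = np.arange(1, n_grid + 1) / n_grid
    for r in range(n_rep):
        steps = rng.standard_normal(n_grid) / np.sqrt(n_grid)
        B = np.cumsum(steps)            # Brownian motion on a grid
        bridge = B - z * B[-1]          # B^0(z) = B(z) - z B(1)
        sups[r] = np.abs(bridge).max()
    return np.quantile(sups, 1 - alpha)

# Expect values near 1.36 for alpha = 0.05 and near 1.22 for alpha = 0.10.
```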
Remark 1.
In order to test $H_{j,0}$ vs. $H_{j,A}$, we adopt the CUSUM test statistic $\hat{T}_{j,n}(z)$ rather than $S_{j,n}$, and the null hypothesis is rejected if $|\hat{T}_{j,n}(z)|$ is large. The reason is as follows. The difference series $\{D_{j,t}\}$ has coefficients $\{\hat{\lambda}_{j,i} - \hat{\lambda}_{j,1}^{\,i}: i = 2, 3, \ldots, p-1\}$ in the linear combination of $\{\hat{W}_{j,t-1}^{(i)}: i = 3, \ldots, p\}$. Note that the pseudo-time series $\{D_{j,t} = D_{j,t,n}: t = 1, 2, \ldots, n\}$ is a triangular array; under the null hypothesis the coefficients are asymptotically zero, whereas under the alternative hypothesis the coefficients change over time without vanishing asymptotically, which creates a change-point in the mean or variance of the difference series. This is why we adopt the CUSUM-based test for our goal of detecting the exponential decay rates. Sample paths of the series $\hat{W}_{j,t-1}^{(i)}$ and $D_{j,t}$ under both $H_{j,0}$ and $H_{j,A}$ can be seen in Figure 3 and Figure 4, along with values of $|\hat{T}_{j,n}(z)|$. Figure 3 and Figure 4 show that the difference series under the null is asymptotically constant due to the asymptotically zero coefficients, whereas under the alternative it fluctuates with large variance; that is, there are change-points in mean or variance. On the other hand, one might use the full sum $S_{j,n}$ as a test statistic from a theoretical point of view. However, as seen in Figure 3 and Figure 4, even under $H_{j,0}$ the sum does not take small values, for the following reason: $S_{j,n}$ can be expressed as a linear combination of $\{\sum_{t=1}^{n}\hat{W}_{j,t-1}^{(i)}/\sqrt{n}: i = 3, \ldots, p\}$ and thus as a linear combination of $\{\sum_{t=1}^{n}\hat{c}_{jk} Y_{k,t-1}^{(i)}/\sqrt{n}: i = 3, \ldots, p;\ k = 1, \ldots, q\}$. Note that for each $k$, $\sum_{t=1}^{n}(Y_{k,t-1}^{(i)} - E[Y_{k,t-1}^{(i)}])/\sqrt{n}$ converges to a normal distribution with asymptotic mean zero. Thus $S_{j,n}$ carries an asymptotic bias of the form $\sqrt{n}\,(\hat{\lambda}_{j,k} - \hat{\lambda}_{j,1}^{\,k})\,E[Y_{k,t-1}^{(i)}]$ in a finite sample. Because this bias is not negligible, even though $\sqrt{n}\,(\hat{\lambda}_{j,k} - \hat{\lambda}_{j,1}^{\,k})$ tends to a normal distribution with mean zero under the null hypothesis, the sum $S_{j,n}$ takes rather large values and thus cannot significantly distinguish the two hypotheses. Therefore, this work adopts the test statistic $\hat{T}_{j,n}(z)$ to resolve our problem.
Now we would further like to test whether or not the exponentially weighted multivariate HAR model has a common exponential decay rate $\lambda$ for all the multiple assets. That is, in the exponentially weighted multivariate HAR($p,q$) model with $\beta_{jk}^{(i)} = c_{jk}\lambda_j^{\,i-1}$ for some $0 < |\lambda_j| < 1$ for all $j, k$, after the first test has been performed, we test the null hypothesis $H_0^*$ versus the alternative hypothesis $H_A^*$ as follows:
$$H_0^*: \lambda_1 = \cdots = \lambda_q = \lambda \ \text{with a common rate } \lambda \text{ for all assets } j, \qquad H_A^*: \text{not } H_0^*.$$
Similar to the above, under the null $H_0^*$ we have, for all $j$, $\eta_{j,t} = \lambda W_{j,t-1}^{(2)} + \cdots + \lambda^{p-1} W_{j,t-1}^{(p)} + \epsilon_{j,t}$ instead of (2), but we use a consistent estimate $\hat{\lambda}_j$ of $\lambda_j$ for $\{Y_{j,t}\}$. The estimation of the rates $\lambda_j$, $j = 1, \ldots, q$, is discussed below in Remark 2. Using the estimate $\hat{\lambda}_j$, we compute the residuals $\hat{\epsilon}_{j,t}^* = \hat{\eta}_{j,t} - (\hat{\lambda}_j\hat{W}_{j,t-1}^{(2)} + \cdots + \hat{\lambda}_j^{\,p-1}\hat{W}_{j,t-1}^{(p)})$ for each $j$. Now, we let
$$\epsilon_t^*(j) = \sup_{k \ne j}\left(\hat{\eta}_{j,t} - \hat{\lambda}_k\hat{W}_{j,t-1}^{(2)} - \cdots - \hat{\lambda}_k^{\,p-1}\hat{W}_{j,t-1}^{(p)}\right)$$
and let $d_{j,t} = \hat{\epsilon}_{j,t}^{*\,2} - \epsilon_t^{*}(j)^2$ and $\tilde{D}_{j,t} = d_{j,t}^2 - \hat{\sigma}_{j,d}^2$, where $\hat{\sigma}_{j,d}^2 = \sum_{t=1}^{n} d_{j,t}^2/n$. Also, let $D_t^* = \sum_{j=1}^{q}\tilde{D}_{j,t}$. We construct a test statistic for testing whether the HAR model has a common rate as follows: for $0 \le z \le 1$, let $\hat{T}_n^*(z) = \sum_{t=1}^{[nz]} D_t^*/(\hat{\sigma}_{D^*}\sqrt{n})$, which can be rewritten as
$$\hat{T}_n^*(z) = \frac{1}{\hat{\sigma}_{D^*}\sqrt{n}}\sum_{j=1}^{q}\left(\sum_{t=1}^{[nz]} d_{j,t}^2 - z\sum_{t=1}^{n} d_{j,t}^2\right),$$
where $\hat{\sigma}_{D^*}^2$ is a consistent estimator of $\mathrm{Var}(D_t^*)$, such as $\frac{1}{n}\sum_{t=1}^{n} D_t^{*\,2}$. The following theorem provides the null limiting distribution of the test statistic.
Theorem 2.
We assume $E|Y_{j,t}|^{2+\delta} < \infty$ for some $\delta > 0$, for all $j, t$. If the multivariate HAR($p,q$) model has a common exponential decay rate $\lambda$ with $\beta_{jk}^{(i)} = c_{jk}\lambda^{\,i-1}$ for some $0 < |\lambda| < 1$ for all $j, k$, then we have, as $n \to \infty$,
$$\sup_{0\le z\le 1}\left|\hat{T}_n^*(z)\right| \xrightarrow{d} \sup_{0\le z\le 1}\left|B^0(z)\right|.$$
Note that under the null hypothesis $H_0^*$, the difference series $\{d_{j,t}\}$ take small values with small variance, whereas under the alternative hypothesis $H_A^*$ they take large values whose variance changes over time. Thus we use the CUSUMSQ test on the difference series $\{d_{j,t}\}$ (not the original data) to detect a change-point in the variance. The suitability of the CUSUMSQ test is justified in the next section, where sample paths of the squared differences, $d_{j,t}^2$, and the values of the test statistic in absolute value, $|\hat{T}_n^*(z)|$, are depicted under both hypotheses.
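For completeness, a sketch of the common-rate statistic analogous to the one given above for (8) (Python/NumPy; again, the grid $z = t/n$ and the function name are our own choices) is shown below; it takes the per-asset difference series $d_{j,t}$ as the columns of a matrix.

```python
import numpy as np

def cusumsq_common_rate(d):
    """sup_z |T_hat*_n(z)| from the difference series d of shape (n, q),
    where d[:, j] holds d_{j,t} = eps_hat*_{j,t}^2 - eps*_t(j)^2."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    d2 = d ** 2
    D_star = (d2 - d2.mean(axis=0)).sum(axis=1)   # sum_j (d_{j,t}^2 - sigma_hat^2_{j,d})
    sigma = np.sqrt(np.mean(D_star ** 2))         # consistent estimator of Var(D*_t)
    T = np.cumsum(D_star) / (sigma * np.sqrt(n))
    return np.max(np.abs(T))
```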
Once the first test in Theorem 1 has been conducted on the datasets of the multiple assets, we obtain estimates of the decay rates by using (9); the second test in Theorem 2 is then conducted to see whether the datasets have a common rate. Finally, the estimate (10) is used to find the common rate.
Remark 2.
The following concerns the estimation of the decay rates. In the exponentially weighted multivariate HAR model (1) with coefficients $\beta_{jk}^{(i)} = c_{jk}\lambda_j^{\,i-1}$, estimators of the decay rates $\lambda_j$ can be obtained in a simple way. From the OLSE of the parameter vector $\beta_j$, we construct an easy-to-implement estimator of $\lambda_j$ as follows:
$$\hat{\lambda}_j = \frac{\sum_{k=1}^{q}\hat{\beta}_{jk}^{(2)}}{\sum_{k=1}^{q}\hat{\beta}_{jk}^{(1)}}. \qquad (9)$$
Furthermore, in the case of a common rate with $\beta_{jk}^{(i)} = c_{jk}\lambda^{\,i-1}$, the common rate $\lambda$ is estimated by
$$\hat{\lambda} = \frac{\sum_{j=1}^{q}\sum_{k=1}^{q}\hat{\beta}_{jk}^{(2)}}{\sum_{j=1}^{q}\sum_{k=1}^{q}\hat{\beta}_{jk}^{(1)}}. \qquad (10)$$
In the decay-rate estimates (9) and (10), only the first and second coefficient estimates, i.e., $\hat{\beta}_{jk}^{(1)}$ and $\hat{\beta}_{jk}^{(2)}$, are used. This is because these two estimates have comparatively smaller standard errors than the others. To see their performance, sample means and standard errors of the estimates in (9) and (10) are computed and compared in the next section.
In the conventional multivariate HAR($p,q$) model there are a total of $(1+pq)q$ coefficient parameters to estimate, whereas in the exponentially weighted multivariate HAR($p,q$) model the number of parameters decreases to $(2+q)q$: for each $j = 1, 2, \ldots, q$, the equation for $Y_{j,t}$ has one intercept, $q$ coefficients for the previous-lag values of the $q$ assets, and one decay rate. For a simple case with $p = 3$ and $q = 2$, the number of parameters is reduced from 14 to 8. This implies that model selection measures such as AIC and BIC may improve considerably. This improvement is shown in the following sections with simulated data and real data examples. In the multivariate HAR($p,q$) model, the asymptotic normality of the OLSE $\hat{\beta}_{j,O}$ ($\equiv \hat{\beta}_{j,OLSE}$) of $\beta_j$ was established by Hong et al. [41]: $\sqrt{n}\,(\hat{\beta}_{j,O} - \beta_j) \xrightarrow{d} N(0, \Sigma)$ as $n \to \infty$, where $\Sigma$ is some $(1+pq)\times(1+pq)$ covariance matrix.
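As a small illustration of (9) and (10), the sketch below (Python/NumPy; the coefficient vector is assumed to follow the ordering $\beta_j = (\beta_{j0}, \beta_{j1}^{(1)}, \ldots, \beta_{j1}^{(p)}, \ldots, \beta_{jq}^{(1)}, \ldots, \beta_{jq}^{(p)})$, and the function names are ours) extracts the first- and second-lag coefficient estimates and forms the ratio estimators.

```python
import numpy as np

def decay_rate(b, p, q):
    """lambda_hat_j in (9) from one equation's OLSE vector
    b = (b_j0, b_j1^(1),...,b_j1^(p),...,b_jq^(1),...,b_jq^(p))."""
    b = np.asarray(b, dtype=float)
    first = np.array([b[1 + k * p] for k in range(q)])    # beta_hat_{jk}^{(1)}
    second = np.array([b[2 + k * p] for k in range(q)])   # beta_hat_{jk}^{(2)}
    return second.sum() / first.sum()

def common_decay_rate(B, p, q):
    """lambda_hat in (10): pool the first- and second-lag coefficients over all q equations."""
    num = sum(B[j][2 + k * p] for j in range(q) for k in range(q))
    den = sum(B[j][1 + k * p] for j in range(q) for k in range(q))
    return num / den
```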
Remark 3.
The following concerns the bias adjustment for a finite sample. In our exponentially weighted multivariate HAR model with $\beta_{jk}^{(i)} = c_{jk}\lambda_j^{\,i-1}$, which are the components of $\beta_j = (\beta_{j0}, \beta_{j1}^{(1)}, \ldots, \beta_{j1}^{(p)}, \ldots, \beta_{jq}^{(1)}, \ldots, \beta_{jq}^{(p)})$, we construct an estimator $\tilde{\beta}_{j,\Lambda}$ of $\beta_j$, called the rate-adopted estimator (RE), as follows: $\tilde{\beta}_{j,\Lambda} = (\tilde{\beta}_{j0}, \tilde{\beta}_{j1}^{(1)}, \ldots, \tilde{\beta}_{j1}^{(p)}, \ldots, \tilde{\beta}_{jq}^{(1)}, \ldots, \tilde{\beta}_{jq}^{(p)})$, where
$$\tilde{\beta}_{j0} = \hat{\beta}_{j0}, \quad \tilde{\beta}_{jk}^{(1)} = \hat{c}_{jk}\ (= \hat{\beta}_{jk}^{(1)}) \quad \text{and} \quad \tilde{\beta}_{jk}^{(i)} = \hat{c}_{jk}\hat{\lambda}_j^{\,i-1} \ \text{for } i \ge 2,$$
with $\hat{\lambda}_j$ in (9). It is obvious that $\hat{\beta}_{jk}^{(i)} - \tilde{\beta}_{jk}^{(i)} = \hat{\beta}_{jk}^{(i)} - \hat{c}_{jk}\hat{\lambda}_j^{\,i-1} \xrightarrow{p} 0$ as $n \to \infty$. Here we need to examine the residuals for the empirical analysis with a finite sample. We rewrite model (1) as $Y_{j,t} = \beta_j^{\top} X_{t-1} + \epsilon_{j,t}$, where
$$X_{t-1} = \left(1,\ Y_{1,t-1}^{(1)}, \ldots, Y_{1,t-1}^{(p)},\ Y_{2,t-1}^{(1)}, \ldots, Y_{2,t-1}^{(p)}, \ldots,\ Y_{q,t-1}^{(1)}, \ldots, Y_{q,t-1}^{(p)}\right)^{\top} \in \mathbb{R}^{1+pq}.$$
Let $\hat{\epsilon}_{j,t,O}$ and $\tilde{\epsilon}_{j,t,\Lambda}$ be the residuals from the OLSE and the RE, respectively:
$$\hat{\epsilon}_{j,t,O} = Y_{j,t} - \hat{\beta}_{j,O}^{\top} X_{t-1} \quad \text{and} \quad \tilde{\epsilon}_{j,t,\Lambda} = Y_{j,t} - \tilde{\beta}_{j,\Lambda}^{\top} X_{t-1}.$$
Note that
$$\tilde{\epsilon}_{j,t,\Lambda} = Y_{j,t} - \hat{\beta}_{j,O}^{\top} X_{t-1} + (\hat{\beta}_{j,O} - \tilde{\beta}_{j,\Lambda})^{\top} X_{t-1} = (\beta_j - \hat{\beta}_{j,O})^{\top} X_{t-1} + (\hat{\beta}_{j,O} - \tilde{\beta}_{j,\Lambda})^{\top} X_{t-1} + \epsilon_{j,t}.$$
Let $\bar{\mu}_{j,\Lambda} = \frac{1}{n}\sum_{t=1}^{n}\tilde{\epsilon}_{j,t,\Lambda}$. Then, by the asymptotic normality of $\sqrt{n}\,(\beta_j - \hat{\beta}_{j,O})$ with asymptotic mean zero, and by noticing that
$$(\hat{\beta}_{j,O} - \tilde{\beta}_{j,\Lambda})^{\top} = \left(0;\ 0,\ \hat{\beta}_{j1}^{(2)} - \hat{c}_{j1}\hat{\lambda}_j, \ldots, \hat{\beta}_{j1}^{(p)} - \hat{c}_{j1}\hat{\lambda}_j^{\,p-1};\ 0,\ \hat{\beta}_{j2}^{(2)} - \hat{c}_{j2}\hat{\lambda}_j, \ldots;\ \ldots;\ 0,\ \hat{\beta}_{jq}^{(2)} - \hat{c}_{jq}\hat{\lambda}_j, \ldots, \hat{\beta}_{jq}^{(p)} - \hat{c}_{jq}\hat{\lambda}_j^{\,p-1}\right),$$
we have
$$\bar{\mu}_{j,\Lambda} = \frac{1}{n}\sum_{t=1}^{n}(\hat{\beta}_{j,O} - \tilde{\beta}_{j,\Lambda})^{\top} X_{t-1} + o_p(1) = \frac{1}{n}\sum_{t=1}^{n}\sum_{k=1}^{q}\sum_{i=2}^{p}\left(\hat{\beta}_{jk}^{(i)} - \hat{c}_{jk}\hat{\lambda}_j^{\,i-1}\right)Y_{k,t-1}^{(i)} + o_p(1).$$
Even though $|\hat{\beta}_{jk}^{(i)} - \hat{c}_{jk}\hat{\lambda}_j^{\,i-1}| \xrightarrow{p} 0$ under the null hypothesis, $\sum_{k=1}^{q}\sum_{i=2}^{p}(\hat{\beta}_{jk}^{(i)} - \hat{c}_{jk}\hat{\lambda}_j^{\,i-1})Y_{k,t-1}^{(i)}$ is not negligible in a small finite sample. Thus we need a bias adjustment when fitting the model in a finite sample. When we fit the exponentially weighted HAR model to real datasets, especially ones with small sample size, the error performance can be improved by means of the bias adjustment. For instance, one way is to shift the fitted model by the residual mean $\bar{\mu}_{j,\Lambda}$, which is a constant. An alternative is to shift the model by a moving average of residuals, which is a time-varying process defined as follows. For a positive integer $m$ and $t = 1, 2, \ldots, n$, let
$$\bar{\omega}_{m,t} = \frac{1}{\tau_2 - \tau_1 + 1}\sum_{s=\tau_1}^{\tau_2}\tilde{\epsilon}_{j,s,\Lambda}, \qquad (11)$$
where $\tau_1 \equiv \tau_1(t,m) = \max\{1, t-m\}$ and $\tau_2 \equiv \tau_2(t,m) = \min\{n, t+m\}$ ($j$ is omitted in $\bar{\omega}_{m,t}$ for notational simplicity). The time-varying process $\{\bar{\omega}_{m,t},\ t = 1, 2, \ldots, n\}$ determines the error performance of the fitted model shifted by $\{\bar{\omega}_{m,t}\}$. The fitted model with exponential decay rates is now given by
$$Y_{j,t} = \hat{F}_{j,\Lambda}(X_{t-1}) + \varepsilon_{j,t}, \qquad (12)$$
where $\hat{F}_{j,\Lambda}(X_{t-1}) = \tilde{\beta}_{j,\Lambda}^{\top} X_{t-1} + \bar{\omega}_{m,t}$ and $\varepsilon_{j,t} = \tilde{\epsilon}_{j,t,\Lambda} - \bar{\omega}_{m,t}$. Note that $\frac{1}{n}\sum_{t=1}^{n}\varepsilon_{j,t} = o_p(1)$. The effect of $m$, called the fitting parameter, on the error performance of (12) is discussed in the next section.
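A minimal sketch of the moving-average adjustment (11) is given below (Python/NumPy; the function name is ours). It returns the sequence $\bar{\omega}_{m,t}$ from the RE residuals; the shifted fit in (12) is then the RE fitted value plus this sequence.

```python
import numpy as np

def moving_average_adjustment(resid, m):
    """omega_bar_{m,t} in (11): moving average of the RE residuals over [t-m, t+m],
    truncated at the sample boundaries."""
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    omega = np.empty(n)
    for t in range(n):
        lo, hi = max(0, t - m), min(n - 1, t + m)
        omega[t] = resid[lo:hi + 1].mean()
    return omega

# Fitted values of (12): F_hat = X @ beta_tilde + omega; residuals eps = resid - omega.
```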

3. Monte Carlo Simulation

In this section, we first examine plots of sample paths of the proposed model and their autocorrelation functions (ACFs). Secondly, the finite-sample validity of the proposed tests is investigated along with plots of various related series used to justify the tests. Thirdly, the estimates of the decay rates are computed, and finally comparisons with conventional HAR models are addressed in terms of measures such as RMSE, MAE, AIC, and BIC. Moreover, the efficiency of the proposed model relative to the benchmark HAR model is discussed.
In the simulation experiment, to inspect the plots of the proposed model, simulated data are generated by bivariate exponentially weighted HAR models of order $p = 3$, i.e., HAR(3,2) models, with lag structure $h = (h_1, \ldots, h_p) = (1, 5, 22)$, using i.i.d. standard normal $N(0,1)$ errors $\{\epsilon_{j,t}\}$ and size $n = 400$. In order to avoid the effect of the selected initial values, data of size 600 are generated and the first 200 observations are discarded to obtain $n = 400$. Figure 1 and Figure 2 depict sample paths with parameters $(\beta_{10}, c_{11}, c_{12}) = (1.0, 0.6, 0.45)$, $(\beta_{20}, c_{21}, c_{22}) = (3, 0.2, 0.35)$, together with their ACFs; Figure 1 uses individual decay rates $\lambda_1 = 0.1$, $\lambda_2 = 0.3$, whereas Figure 2 uses the common rate $\lambda = 0.15$. We see that the simulated data are strongly correlated with each other and reveal the long-memory feature. In Figure 1, the two datasets have a correlation coefficient of 0.7856, and in Figure 2, the correlation coefficient is 0.6748.
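A sketch of this data-generating step, under the stated setup ($p = 3$, $h = (1, 5, 22)$, standard normal errors, burn-in of 200) and with the Figure 1 parameter values, might look as follows (Python/NumPy; the function name, the random seed, and the start-up values for the first $h_p$ observations are our own choices).

```python
import numpy as np

def simulate_exp_har(n=400, burn=200, h=(1, 5, 22),
                     beta0=(1.0, 3.0), c=((0.6, 0.45), (0.2, 0.35)),
                     lam=(0.1, 0.3), seed=1):
    """Simulate the exponentially weighted bivariate HAR(3,2) model with
    coefficients beta_{jk}^{(i)} = c_{jk} * lam_j^{i-1} and i.i.d. N(0,1) errors."""
    rng = np.random.default_rng(seed)
    q, p, hp = 2, len(h), max(h)
    N = n + burn + hp
    Y = np.zeros((N, q))
    Y[:hp] = rng.standard_normal((hp, q))       # arbitrary start-up values
    for t in range(hp, N):
        # avg[k, i] = mean of the last h_{i+1} values of asset k
        avg = np.array([[Y[t - hi:t, k].mean() for hi in h] for k in range(q)])
        for j in range(q):
            mean = beta0[j] + sum(c[j][k] * lam[j] ** i * avg[k, i]
                                  for k in range(q) for i in range(p))
            Y[t, j] = mean + rng.standard_normal()
    return Y[-n:]                               # drop burn-in and start-up rows
```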
Figure 1. Sample paths of the exponentially weighted bivariate HAR model and their ACFs, with decay rates $\lambda_1 = 0.1$, $\lambda_2 = 0.3$; $(\beta_{10}, c_{11}, c_{12}) = (1.0, 0.6, 0.45)$, $(\beta_{20}, c_{21}, c_{22}) = (3, 0.2, 0.35)$; $n = 400$. The simulated data of the exponentially weighted bivariate HAR model are characterized by strong cross-correlation and long memory.
Figure 2. Sample paths of the exponentially weighted bivariate HAR model and their ACFs, with common decay rate $\lambda_1 = \lambda_2 = 0.15$; $(\beta_{10}, c_{11}, c_{12}) = (1.0, 0.6, 0.45)$, $(\beta_{20}, c_{21}, c_{22}) = (3, 0.2, 0.35)$; $n = 400$. The simulated data of the exponentially weighted bivariate HAR model are characterized by strong cross-correlation and long memory.
To verify Theorems 1 and 2, we compute the test statistics in the HAR(3,2) model and report their rejection rates in Table 1 and Table 2, respectively. To validate Theorem 1, four combinations of the two datasets are considered as follows:
Case I: Both follow exponentially weighted models with $\lambda_1 = 0.5$ and $\lambda_2 = 0.4$.
Case II: The first dataset follows an exponentially weighted model with $\lambda_1 = 0.5$, whereas the second does not.
Case III: The second follows an exponentially weighted model with $\lambda_2 = 0.4$, whereas the first does not.
Case IV: Neither follows an exponentially weighted model.
For the null hypotheses of Cases I, II, and III, $(\beta_{10}, c_{11}, c_{12}) = (1, 0.1, 0.25)$, $(\beta_{20}, c_{21}, c_{22}) = (3, 0.2, 0.15)$, $\lambda_1 = 0.5$, and $\lambda_2 = 0.4$ are used. For Case IV, there are a total of 14 (unrestricted) parameters; their values, along with those under the alternative hypothesis in Cases II and III, are omitted here but are available upon request.
Prior to reporting the test results, some plots of the related series are illustrated in order to justify the suitability of the CUSUM tests. In particular, for Cases I, III, and IV, sample paths of $\hat{W}_{j,t-1}^{(i)}$, $D_{j,t}$, and $|\hat{T}_{j,n}(z)|$ in (3), (6), and (8), respectively, are depicted for $n = 500, 1000$ in Figure 3 and Figure 4, where the values of $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$ in Figure 5 are used to compute $D_{j,t}$ together with $\hat{W}_{j,t-1}^{(i)}$. In Figure 5, we see that $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$ tends to zero in Case I ($j = 1, 2$) and Case III ($j = 2$), as the theory in (5) indicates under the null hypothesis. However, Figure 5 shows that Case III ($j = 1$) and Case IV ($j = 1, 2$) deviate from zero under the alternative. In Figure 3 and Figure 4, under the null hypothesis with decay rates, the difference series $D_{j,t}$ does not fluctuate, owing to the asymptotically zero $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$, which can be interpreted as constant coefficients in the linear combination of $\hat{W}_{j,t-1}^{(i)}$, whereas under the alternative the plots of $D_{j,t}$ are dynamic with large variance because of nonzero $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$ (see Equation (6)). This yields higher values of the CUSUM test statistic in absolute value, $|\hat{T}_{j,n}(z)|$, as seen in the third columns of Figure 3 and Figure 4.
Figure 3. Sample paths of $\hat{W}_{j,t-1}^{(i)}$, $D_{j,t}$, and $|\hat{T}_{j,n}(z)|$ ($j = 1, 2$; $i = 2, 3$) in Cases I, III, and IV of Theorem 1 with $n = 500$. The difference series $D_{j,t}$ in the second column are obtained by multiplying $\hat{W}_{j,t-1}^{(3)}$ in the first column by $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$ given in Figure 5. The test statistics in absolute value, $|\hat{T}_{j,n}(z)|$ ($0 \le z \le 1$), are computed using the difference series $D_{j,t}$ in the second column. In the first row, for Case I with both $H_{1,0}$ and $H_{2,0}$, $D_{j,t}$ has no change in mean and thus $|\hat{T}_{j,n}(z)|$ takes small values for all $z$. In the second row, for Case III with $H_{2,0}$, $j = 2$ (in red), the same interpretation applies.
Figure 4. Sample paths of $\hat{W}_{j,t-1}^{(i)}$, $D_{j,t}$, and $|\hat{T}_{j,n}(z)|$ ($j = 1, 2$; $i = 2, 3$) in Cases I, III, and IV of Theorem 1 with $n = 1000$. The difference series $D_{j,t}$ in the second column are obtained by multiplying $\hat{W}_{j,t-1}^{(3)}$ in the first column by $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$ given in Figure 5. The test statistics in absolute value, $|\hat{T}_{j,n}(z)|$ ($0 \le z \le 1$), are computed using the difference series $D_{j,t}$ in the second column. In the first row, for Case I with both null hypotheses $H_{1,0}$ and $H_{2,0}$, $D_{j,t}$ has no change in mean and thus $|\hat{T}_{j,n}(z)|$ takes small values for all $z$. In the second row, for Case III with $H_{2,0}$, $j = 2$ (in red), the same interpretation applies.
Figure 5. $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$ ($j = 1, 2$) with sample size $n = 500, 501, \ldots, 1000$ on the horizontal axis, in Cases I, III, and IV of Theorem 1. In Case I ($j = 1, 2$) and Case III ($j = 2$) with the null hypothesis $H_{j,0}$, $\hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^2$ tends to zero as $n \to 1000$.
As for the CUSUMSQ test in Theorem 2, Figure 6 shows the series of squared differences, $d_{j,t}^2$, and the test statistic values in absolute value, $|\hat{T}_n^*(z)|$, under the two hypotheses for $n = 500$ and $1000$, respectively. It is shown that under $H_0^*$ the difference series $d_{j,t}$ has small variance with small values, whose squares are less than 0.025 for $n = 500$ and 0.006 for $n = 1000$, whereas under $H_A^*$ it takes large values with squares between 0 and 2500. Therefore we apply the idea of a change-point test for the variance to the difference series $d_{j,t}$, which yields a solution to detecting the existence of the exponential decay rate.
Throughout the simulation study of the CUSUM tests, a replication number of 1000, significance level $\alpha = 0.05$, and sample sizes $n = 100, 200, 500$, and 1000 are used.
Table 1 and Table 2 display the evaluated rejection rates for Theorems 1 and 2, respectively. In Table 1, the rejection rates of the two tests of $H_{j,0}$, $j = 1, 2$, in Theorem 1 are reported for the four cases. To compute the test statistic $\hat{T}_{j,n}(z)$ in Theorem 1, a consistent estimate $\hat{\sigma}_{j,D}$ of the standard deviation is needed. Recall the two choices of $\hat{\sigma}_{j,D}^2$ in (7), which are estimators of $\mathrm{Var}(D_{j,t}) = E[D_{j,t}^2] - E[D_{j,t}]^2$. Because $E[D_{j,t}] \to 0$ as $n \to \infty$ due to $\hat{\lambda}_{j,k} - \hat{\lambda}_{j,1}^{\,k} \xrightarrow{p} 0$ under the null hypothesis $H_{j,0}$, either choice in (7) may be used for $\hat{\sigma}_{j,D}^2$. However, using these estimates incurs slow convergence rates because of the bias problem discussed in Remark 1: if the mean has a large bias, the convergence rate tends to be slow, as it is affected by the bias. Thus we adopt the two estimates partially in order to adjust the convergence rate. In particular, so as to visualize the convergence to the nominal level with increasing $n$, we partially use the second (sample-variance) form in (7), $\frac{1}{n}\sum_{t=1}^{n} D_{j,t}^2 - (\bar{D}_j)^2$ with $\bar{D}_j = \frac{1}{n}\sum_{t=1}^{n} D_{j,t}$, where $(\bar{D}_j)^2$ converges to zero in probability since $|\frac{1}{n}\sum_{t=1}^{n} D_{j,t} - E[D_{j,t}]| \xrightarrow{p} 0$ as $n \to \infty$. To use it partially, we construct a consistent estimator with the following threshold: $\tilde{\sigma}_{j,D}^2 = \frac{1}{n}\sum_{t=1}^{n} D_{j,t}^2 - \delta\,(\bar{D}_j)^2$, where $\delta = I(|\bar{D}_j| < th^*)$ with an indicator function $I(\cdot)$ and a threshold $th^*$. In other words, we use either the first or the second estimator in (7), depending on the magnitude of the mean $\bar{D}_j$. By doing this, we can adjust the convergence rate, setting the value $\delta$ of the indicator function to zero in case of a large bias. In this simulation, the threshold $th^*$ is chosen empirically such that $P(|\bar{D}_j| < th^*) = 0.05$; that is, if $|\bar{D}_j| \ge th^*$, which occurs with probability 0.95, the first estimator in (7) is used, and otherwise, with probability 0.05, the second is used for $\tilde{\sigma}_{j,D}^2$. This is because the probability of having the bias is high, as noted in Remark 1. In Table 1, the rejection rates obtained with the estimate $\tilde{\sigma}_{j,D}$ and the chosen threshold $th^*$ are seen to converge to the nominal level $\alpha = 0.05$ as $n$ increases under the null hypothesis. For Table 2 with Theorem 2, a similar argument applies.
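A sketch of this threshold rule (Python/NumPy; the function name and the way the threshold is passed in are our own choices) is given below.

```python
import numpy as np

def threshold_variance(D, th_star):
    """Threshold estimator: subtract the squared mean only when |D_bar| is below th_star."""
    D = np.asarray(D, dtype=float)
    d_bar = D.mean()
    delta = 1.0 if abs(d_bar) < th_star else 0.0
    return np.mean(D ** 2) - delta * d_bar ** 2
```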
Table 1. Validation of Theorem 1. Rejection rates for the hypotheses $H_{j,0}$ and $H_{j,A}$ at level $\alpha = 0.05$, $j = 1, 2$, in exponentially weighted bivariate HAR models of order $p = 3$; $h = (1, 5, 22)$; $(\beta_{10}, c_{11}, c_{12}) = (1, 0.1, 0.25)$, $(\beta_{20}, c_{21}, c_{22}) = (3, 0.2, 0.15)$, $\lambda_1 = 0.5$, $\lambda_2 = 0.4$ in the null hypothesis $H_{j,0}$ of Cases I, II, III; replication number = 1000. (Other parameter values used are available upon request.)

 n     | Case I          | Case II         | Case III        | Case IV
       | H_1,0    H_2,0  | H_1,0    H_2,A  | H_1,A    H_2,0  | H_1,A    H_2,A
 100   | 0.066    0.056  | 0.083    0.198  | 0.547    0.058  | 0.640    0.776
 200   | 0.053    0.058  | 0.057    0.236  | 0.767    0.054  | 0.789    0.897
 500   | 0.051    0.050  | 0.057    0.313  | 0.946    0.053  | 0.826    0.920
 1000  | 0.050    0.050  | 0.052    0.408  | 0.985    0.054  | 0.823    0.936
Table 2. Validation of Theorem 2. Rejection rates for the hypotheses $H_0^*$ and $H_A^*$ at level $\alpha = 0.05$ in exponentially weighted bivariate HAR models of order $p = 3$; $h = (1, 5, 22)$, with decay rates $\lambda_j \in \{0.2, 0.4, 0.5, 0.8\}$, $j = 1, 2$; $(\beta_{10}, c_{11}, c_{12}) = (1.0, 0.12, 0.01)$, $(\beta_{20}, c_{21}, c_{22}) = (1.0, 0.07, 0.01)$; replication number = 1000.

 n     | λ1 = 0.2               | λ1 = 0.4               | λ1 = 0.5
       | λ2 = 0.2    λ2 = 0.8   | λ2 = 0.4    λ2 = 0.8   | λ2 = 0.5    λ2 = 0.8
       | H_0*        H_A*       | H_0*        H_A*       | H_0*        H_A*
 100   | 0.027       0.301      | 0.021       0.340      | 0.028       0.326
 200   | 0.033       0.288      | 0.032       0.299      | 0.043       0.274
 500   | 0.050       0.284      | 0.045       0.266      | 0.048       0.258
 1000  | 0.048       0.285      | 0.051       0.265      | 0.049       0.270
Table 1 shows that Case I favors both null hypotheses in Theorem 1; i.e., the models are exponentially weighted, with small rejection rates of the null hypothesis. Moreover, Table 2 displays reasonable rejection rates for the test in Theorem 2 for large sample sizes under both null and alternative hypotheses. Note that in Table 2, the null hypothesis $H_0^*$ indicates that $\lambda_1$ and $\lambda_2$ take the same value, i.e., the common rate $\lambda = \lambda_1 = \lambda_2 \in \{0.2, 0.4, 0.5\}$, for which small rejection rates are reported. The rejection rates in Table 1 and Table 2 tend to the nominal level $\alpha = 0.05$ as $n$ increases under the null hypothesis.
Next, we examine the size and power properties of our proposed CUSUM test in Theorem 1. To assess the performance of the test, we use bivariate HAR models of orders $p = 3, 6$. We set $h = (1, 5, 22), (1, 7, 14)$ if $p = 3$, and $h = (1, 5, 7, 9, 14, 22), (1, 7, 14, 19, 22, 25)$ if $p = 6$. The sizes of the proposed test in the HAR($p$, 2) models with $\lambda_1 \in \{0.5, 0.8\}$ and $\lambda_2 \in \{0.1, 0.4\}$ are reported in Table 3. Most cases show very small type I errors, consistent with the nominal size of the test. The powers of the CUSUM test are displayed in Table 4, where we see comparatively reasonable power results.
In Table 5, the estimates of the decay rates in (9) and (10) are computed to obtain sample means and standard errors. Cases I–III have the same parameters as those in Table 1, whereas Cases IV*–VI* use common rates given as follows:
Case IV*: common rate $\lambda_1 = \lambda_2 = \lambda = 0.1$.
Case V*: common rate $\lambda_1 = \lambda_2 = \lambda = 0.5$.
Case VI*: common rate $\lambda_1 = \lambda_2 = \lambda = 0.9$.
Table 5 reports that the estimates are consistent across the sample sizes. For Cases IV*, V*, and VI*, the estimates $\hat{\lambda}_j$ in (9) are used for $\lambda_j$, $j = 1, 2$, and the estimate $\hat{\lambda}$ in (10) is used for $\lambda$. We notice that in the common-rate cases, Equation (10) gives smaller standard errors of the estimates.
Table 3. Size of the CUSUM test in Theorem 1 for HAR($p$, 2), $p = 3, 6$; $(\beta_{10}, c_{11}, c_{12}) = (1.0, 0.12, 0.10)$, $(\beta_{20}, c_{21}, c_{22}) = (1.0, 0.07, 0.25)$; replication number = 1000, $\alpha = 0.05$.

 p      h                        n     | λ1 = 0.5                          | λ1 = 0.8
                                       | λ2 = 0.1          λ2 = 0.4        | λ2 = 0.1          λ2 = 0.4
                                       | H_1,0    H_2,0    H_1,0    H_2,0  | H_1,0    H_2,0    H_1,0    H_2,0
 p = 3  (1, 5, 22)               100   | 0.087    0.078    0.080    0.067  | 0.091    0.090    0.094    0.078
                                 200   | 0.071    0.071    0.074    0.059  | 0.071    0.075    0.061    0.061
                                 500   | 0.054    0.058    0.050    0.054  | 0.052    0.054    0.051    0.050
                                 1000  | 0.050    0.050    0.050    0.053  | 0.051    0.052    0.050    0.050
 p = 6  (1, 5, 7, 9, 14, 22)     100   | 0.098    0.089    0.108    0.078  | 0.076    0.065    0.054    0.053
                                 200   | 0.064    0.084    0.058    0.071  | 0.058    0.053    0.052    0.052
                                 500   | 0.054    0.076    0.049    0.069  | 0.051    0.056    0.051    0.050
                                 1000  | 0.049    0.055    0.045    0.058  | 0.047    0.056    0.050    0.051
 p = 3  (1, 7, 14)               100   | 0.112    0.091    0.105    0.081  | 0.099    0.093    0.095    0.076
                                 200   | 0.070    0.064    0.060    0.061  | 0.071    0.072    0.069    0.059
                                 500   | 0.054    0.051    0.052    0.053  | 0.052    0.056    0.050    0.053
                                 1000  | 0.050    0.051    0.052    0.053  | 0.050    0.050    0.050    0.052
 p = 6  (1, 7, 14, 19, 22, 25)   100   | 0.091    0.090    0.077    0.076  | 0.071    0.064    0.057    0.053
                                 200   | 0.076    0.076    0.073    0.069  | 0.061    0.053    0.056    0.050
                                 500   | 0.053    0.065    0.065    0.055  | 0.048    0.048    0.047    0.049
                                 1000  | 0.054    0.062    0.055    0.051  | 0.048    0.052    0.050    0.051
Table 4. Power of the CUSUM test in Theorem 1 for HAR($p$, 2), $p = 3, 6$; replication number = 1000, $\alpha = 0.05$. (Parameter values used in Power Models 1 and 2 are available upon request.)

 n     | Power Model 1                            | Power Model 2
       | p = 3             p = 6                  | p = 3             p = 6
       | (1, 5, 22)        (1, 5, 7, 9, 14, 22)   | (1, 7, 14)        (1, 7, 14, 19, 22, 25)
       | H_1,A    H_2,A    H_1,A    H_2,A         | H_1,A    H_2,A    H_1,A    H_2,A
 100   | 0.664    0.807    0.716    0.518         | 0.812    0.727    0.691    0.722
 200   | 0.794    0.906    0.849    0.642         | 0.901    0.823    0.679    0.771
 500   | 0.850    0.967    0.948    0.806         | 0.937    0.863    0.738    0.852
 1000  | 0.907    0.987    0.976    0.841         | 0.948    0.887    0.802    0.952
Finally, we discuss the fitting of the exponentially weighted HAR models in (12) and compare them with the conventional HAR model fitted by OLSE. The simulated data in Figure 1 are used to compute the OLSEs $\hat{\beta}_{j,O}$ of the coefficients and the estimates $\hat{\lambda}_j$ of the decay rates in (9). From the estimated rates $\hat{\lambda}_j$, the rate-adopted estimators (REs) $\tilde{\beta}_{j,\Lambda}$ of the coefficients are evaluated as a further step. To compare the models fitted by the OLSEs and the REs, Table 6 presents criteria such as the root mean square error (RMSE), mean absolute error (MAE), AIC, and BIC. For the exponentially weighted bivariate HAR models, the fitting parameter $m$ is set to $m = 5, 10, 20, 100$. Because the conventional bivariate HAR model has 14 parameters whereas our proposed model has 8, the AIC and BIC of the latter are smaller than those of the former. Intuitively, a small choice of $m$ yields small RMSE and MAE because the average $\bar{\omega}_{m,t}$ in (11) over the interval $[t-m, t+m]$ is closer to $\tilde{\epsilon}_{j,t,\Lambda}$ for smaller $m$, which makes the error term $\varepsilon_{j,t}$ in (12) smaller. In the latter case, (12) is applied with $m = 5$ for the bias adjustment. The RMSE and MAE of the OLSE residuals in fitting the conventional HAR model are 0.9971 and 0.8253 for $j = 1$ and 0.9644 and 0.7596 for $j = 2$, and those of the RE residuals in fitting our proposed model are 0.9424 and 0.7805 for $j = 1$ and 0.9226 and 0.7381 for $j = 2$.
Table 5. Sample means and standard errors (s.e.) of the estimates for the decay rates $\lambda_1$, $\lambda_2$ of exponentially weighted HAR(3,2) models; replication number = 1000. Note that in the common-rate cases marked with *, Cases IV*, V*, and VI*, the estimates $\hat{\lambda}_j$ in (9) are used for $\lambda_j$, $j = 1, 2$, while the estimate $\hat{\lambda}$ in (10) is used for $\lambda$.

                         n = 500              n = 1000             n = 2000
                         Sample Mean (s.e.)   Sample Mean (s.e.)   Sample Mean (s.e.)
 Case I     λ1 = 0.5     0.458 (0.038)        0.493 (0.017)        0.528 (0.008)
            λ2 = 0.4     0.418 (0.022)        0.452 (0.012)        0.419 (0.005)
 Case II    λ1 = 0.5     0.764 (0.143)        0.625 (0.021)        0.505 (0.007)
            -            -                    -                    -
 Case III   -            -                    -                    -
            λ2 = 0.4     0.497 (0.034)        0.445 (0.014)        0.413 (0.006)
 Case IV*   λ1 = 0.1     0.188 (0.037)        0.121 (0.019)        0.104 (0.011)
            λ2 = 0.1     -0.022 (0.164)       0.101 (0.011)        0.103 (0.007)
            λ = 0.1      0.118 (0.014)        0.084 (0.009)        0.095 (0.006)
 Case V*    λ1 = 0.5     0.684 (0.044)        0.579 (0.019)        0.525 (0.012)
            λ2 = 0.5     0.556 (0.019)        0.526 (0.012)        0.518 (0.008)
            λ = 0.5      0.507 (0.015)        0.511 (0.010)        0.506 (0.007)
 Case VI*   λ1 = 0.9     0.768 (0.652)        0.934 (0.021)        0.921 (0.014)
            λ2 = 0.9     0.983 (0.021)        0.928 (0.014)        0.922 (0.010)
            λ = 0.9      0.930 (0.018)        0.888 (0.011)        0.900 (0.008)
Table 6. Comparison of the exponentially weighted HAR(3,2) fitted models from the simulated data in Figure 1.

 Model              Fitting       j = 1                                 j = 2
                    parameter m   RMSE     MAE      AIC       BIC       RMSE     MAE      AIC       BIC
 Conventional HAR   -             0.9971   0.8253   1098.49   1153.57   0.9644   0.7596   1073.38   1128.47
 Exp. HAR           m = 5         0.9424   0.7805   1043.88   1075.36   0.9226   0.7381   1027.80   1059.28
 Exp. HAR           m = 10        0.9898   0.8147   1080.84   1112.32   0.9443   0.7535   1045.34   1076.82
 Exp. HAR           m = 20        0.9949   0.8180   1084.83   1116.31   0.9709   0.7621   1066.24   1097.71
 Exp. HAR           m = 100       1.0105   0.8334   1096.23   1127.70   0.9827   0.7745   1075.59   1107.07
Furthermore, to elaborate on the comparison with the conventional HAR model, the efficiency of the proposed model relative to the conventional one is computed using the two metrics RMSE and MAE. The Exp. HAR model efficiency, relative to the benchmark HAR model, is defined by
$$\mathrm{Effi\_RMSE} = \frac{RMSE_0 - RMSE_1}{RMSE_1}\times 100, \qquad \mathrm{Effi\_MAE} = \frac{MAE_0 - MAE_1}{MAE_1}\times 100,$$
where $RMSE_0$ and $MAE_0$ are the RMSE and MAE of the conventional HAR model, respectively, and $RMSE_1$ and $MAE_1$ are those of the exponentially weighted HAR model. Table 7 displays the Exp. HAR model efficiency for the first case of $(\lambda_1, \lambda_2)$ in Table 3. We see that all values are positive, with a highest value of 7.0817 percent, which means that the proposed model with the RE fitting improves on the conventional HAR model with respect to residual errors.
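For example, a direct computation of these percentages (a trivial sketch; the function name is ours) applied to the $j = 1$ RMSE/MAE values reported in Table 6 gives roughly 5.8% and 5.7%.

```python
def efficiency(rmse0, rmse1, mae0, mae1):
    """Exp. HAR efficiency relative to the benchmark HAR model, in percent."""
    return 100 * (rmse0 - rmse1) / rmse1, 100 * (mae0 - mae1) / mae1

print(efficiency(0.9971, 0.9424, 0.8253, 0.7805))  # approx. (5.80, 5.74)
```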
Table 7. Comparison with the conventional HAR model by computing the Exp. HAR model efficiency, defined by $\mathrm{Effi\_RMSE} = 100 \times (RMSE_0 - RMSE_1)/RMSE_1$ and $\mathrm{Effi\_MAE} = 100 \times (MAE_0 - MAE_1)/MAE_1$ of the exponentially weighted HAR model, where $RMSE_0$, $MAE_0$ are the root mean square error (RMSE) and mean absolute error (MAE) of the conventional HAR model, respectively, and $RMSE_1$, $MAE_1$ are those of the Exp. HAR model.

                  n = 500                                 n = 1000
 Rates            Effi_RMSE           Effi_MAE            Effi_RMSE           Effi_MAE
 (λ1, λ2)         j = 1    j = 2      j = 1    j = 2      j = 1    j = 2      j = 1    j = 2
 (0.5, 0.1)       4.1635   6.3361     5.2112   4.3094     4.8475   4.9433     7.0817   6.9621
 (0.5, 0.4)       5.1794   6.8979     5.0989   2.8984     5.8510   5.4411     5.3964   5.4328
 (0.8, 0.1)       5.9158   4.4523     4.5951   5.1982     5.3078   6.3053     4.7309   6.9481
 (0.8, 0.4)       5.7379   3.6783     6.3833   3.7651     5.1028   4.4709     5.7066   6.0533
The HAR model has high applicability in financial markets. In particular, it is very powerful for realized volatility forecasting [1,17,18,29]. However, besides volatility forecasting, as a theoretically linear AR model, it has many applications to various time series data, as in [12,13,14,42]. The exponentially weighted multivariate HAR model, a special case of the HAR model, is suitable for joint data with strong cross-correlation. The decay rate of the model plays a key role in the common structure of the joint data with strong correlation. In economics and finance, there are many strongly correlated time series that are important for policy decisions to improve the global economy and human society. For example, stock prices in the same category tend to share the same pattern. Stock price modelling is known to be based on the efficient market hypothesis (EMH), according to which the only relevant information about a stock is its current value. The proposed model may be appropriate for stock price modelling because the current value and the current averages (with exponentially decaying coefficients) are used as regressor variables in the model. The following section presents an empirical analysis of joint data of strongly correlated stock prices to confirm this intuition.

4. Empirical Analysis

In this section, we provide empirical examples of U.S. stock prices that are applied to the exponentially weighted multivariate HAR models. Note that realized volatility is not well suited to our model, whereas the stock price itself may be suitable for the proposed model with exponentially decaying coefficients. To this end, we choose several datasets of U.S. stock prices and conduct our proposed CUSUM tests. For bivariate joint data ( q = 2 ), the stock prices of Amazon.com Inc. (AMZN) (Seattle, WA, USA) and Netflix Inc. (NFLX) (Los Gatos, CA, USA) are used, and for trivariate joint data ( q = 3 ), those of Apple Inc. (AAPL) (Cupertino, CA, USA), Microsoft Corporation (MSFT) (Redmond, WA, USA) and Facebook Inc. (FB) (Menlo Park, CA, USA), all selected from 7 May 2020 to 6 May 2021. In the analysis, the closing price is used because it reflects all the trading activity of the day. Plots of these stock prices are shown in Figure 7 and Figure 8, where the time series reveal rather similar patterns for the pair (AMZN, NFLX) and for the triple (FB, AAPL, MSFT).
First, bivariate HAR(3,2) models ( q = 2 ) are adopted for the pairs (AMZN, NFLX), (AAPL, MSFT), (FB, AAPL) and (FB, MSFT), and second, a multivariate HAR(3,3) model ( q = 3 ) is adopted for the triple (FB, AAPL, MSFT). Order p = 3 and lag h = ( 1 , 5 , 22 ) are used. Results of tests and estimates, as well as correlation coefficients, are reported in Table 8, where the suprema of the test statistics and the decay rate estimates are computed by Theorem 1 and Equation (9), respectively. More specifically, the CUSUM test of Theorem 1 is conducted to detect the presence of the exponential decay rates, and the test statistics are evaluated as follows. In the case of (AMZN, NFLX), for detecting the existence of λ_AMZN and λ_NFLX, the CUSUM test statistics are computed as 0.3986 for AMZN and 0.3489 for NFLX. In the case of (AAPL, MSFT), the CUSUM test statistics are computed as 0.7508 for AAPL and 0.4979 for MSFT. These values imply that the null hypothesis is not rejected, because the critical values of the standard Brownian bridge are 1.224 at level α = 0.1 and 1.358 at level α = 0.05. On the other hand, when the test of Theorem 2 for a common rate is conducted, the test statistics exceed 2, so the null hypothesis of a common rate is rejected.
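The testing step can be illustrated with a generic Brownian-bridge-type CUSUM statistic. The sketch below is not the exact statistic T̂_{j,n}(z) of Theorem 1, which is built from the specific residual-difference series defined earlier in the paper; it only shows, under that simplification, how a supremum-type statistic is computed from a given series and compared with the asymptotic critical values 1.224 (10% level) and 1.358 (5% level) of the standard Brownian bridge.

```python
import numpy as np

def cusum_sup(series):
    """Supremum of a centered, scaled partial-sum process (Brownian-bridge type)."""
    d = np.asarray(series, dtype=float)
    n = d.size
    partial = np.cumsum(d - d.mean())              # centered partial sums
    return float(np.max(np.abs(partial)) / (d.std() * np.sqrt(n)))

def brownian_bridge_decision(stat, alpha=0.05):
    """Compare a sup-type statistic with asymptotic Brownian bridge critical values."""
    critical = {0.10: 1.224, 0.05: 1.358}
    return "reject H0" if stat > critical[alpha] else "do not reject H0"
```

For instance, a statistic of 0.3986 (AMZN) or 0.3489 (NFLX) would lead to non-rejection at both levels, in line with the decisions described above.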
Now comparisons with the conventional HAR models are presented for the two pairs (AMZN, NFLX) and (AAPL, MSFT), and for the triple (FB, AAPL, MSFT). We compare the performance of these datasets under the univariate HAR model, the (conventional) multivariate HAR model, and the exponentially weighted multivariate HAR (Exp. HAR) model. For the conventional HAR models, the two estimation methods OLSE and LASSO [43] are adopted. LASSO estimates are computed by LassoLarsCV in sklearn.linear_model with Python version 3.8.6. For the Exp. HAR model, the RE is computed. For the two joint datasets ( q = 2 ), the conventional HAR(3,2) model has 14 parameters, whereas the exponentially weighted HAR(3,2) model has 8; each univariate HAR model has 4 parameters, so the total is 8. Table 9 reports the comparison results for the HAR(3,2) models. The CUSUM test favors the existence of exponential decay rates, and thus the exponential bivariate HAR(3,2) model is fitted with the rate estimates. The decay rates for AMZN and NFLX are estimated as ( λ_AMZN , λ_NFLX ) = ( 0.2983 , 0.1144 ), whereas those for AAPL and MSFT are ( λ_AAPL , λ_MSFT ) = ( 0.02175 , 0.01483 ), as presented in Table 8. Four measures, RMSE, MAE, AIC, and BIC, are compared across the three models via OLSE, LASSO, and RE; in Table 9, the best values are displayed in bold. The exponential bivariate HAR(3,2) model has the best performance on RMSE and MAE, whereas the univariate HAR model with LASSO has the best performance on AIC and BIC. Figure 7 depicts the actual stock prices of (AMZN, NFLX) and the Exp. HAR(3,2) model fitted by the REs, along with their residuals. The stock prices of (AMZN, NFLX) are well matched by the fitted model.
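As a concrete example of the conventional HAR fits used for comparison, the following sketch builds a standard bivariate HAR(3,2) design with windows h = (1, 5, 22) and fits it with LassoLarsCV from sklearn.linear_model, the estimator named above. The data here are synthetic stand-ins; loading the actual AMZN and NFLX closing prices is left to the reader, and the helper function is our own simplification of the design construction.

```python
import numpy as np
from sklearn.linear_model import LassoLarsCV

def har_averages(y, windows=(1, 5, 22)):
    """Window averages of the w most recent observations, aligned so row i explains y[max(windows) + i]."""
    y = np.asarray(y, dtype=float)
    m = max(windows)
    cols = [np.array([y[t - w:t].mean() for t in range(m, len(y))]) for w in windows]
    return np.column_stack(cols), y[m:]

rng = np.random.default_rng(0)
y1 = 3200.0 + np.cumsum(rng.normal(size=252))   # synthetic stand-in for AMZN closing prices
y2 = 500.0 + np.cumsum(rng.normal(size=252))    # synthetic stand-in for NFLX closing prices

X1, target = har_averages(y1)
X2, _ = har_averages(y2)
X = np.hstack([X1, X2])                          # own and cross window averages (asset 1 equation)

lasso = LassoLarsCV(cv=5).fit(X, target)
resid = target - lasso.predict(X)
print(np.sqrt(np.mean(resid**2)), np.mean(np.abs(resid)))   # RMSE and MAE of the fit
```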
Table 8. Results of tests and estimates for the pairs (AMZN, NFLX), (AAPL, MSFT), (FB, AAPL), (FB, MSFT) and the triple (FB, AAPL, MSFT) in exponentially weighted HAR( 3 , q ) models, q = 2 , 3 .
                    Correlation Coefficient   q   Test Statistics sup_{0≤z≤1} |T̂_{j,n}(z)|, j = 1, ..., q   Rate Estimates λ̂_j, j = 1, ..., q
(AMZN, NFLX)        0.8268                    2   (0.3986, 0.3489)              (0.2983, 0.1144)
(AAPL, MSFT)        0.8396                    2   (0.7508, 0.4979)              (0.02175, 0.01483)
(FB, AAPL)          0.8185                    2   (0.4644, 0.7925)              (0.06340, 0.03158)
(FB, MSFT)          0.8203                    2   (0.4549, 0.4794)              (0.01255, 0.01322)
(FB, AAPL, MSFT)    -                         3   (0.4889, 0.7424, 0.4762)      (0.03302, 0.03097, 0.02733)
Table 9. Comparison of the univariate HAR, bivariate HAR, and exponentially weighted bivariate HAR models for (AMZN, NFLX) and (AAPL, MSFT) stock prices from 7 May 2020 to 6 May 2021; p = 3 , h = ( 1 , 5 , 22 ) ; * ( λ_AMZN , λ_NFLX ) = ( 0.2983 , 0.1144 ), * ( λ_AAPL , λ_MSFT ) = ( 0.02175 , 0.01483 ).
                              Univariate HAR(3)         Bivariate HAR(3,2)        Exp. Bi. HAR(3,2) *
Total # of Parameters         8                         14                        8
Estimator                     OLSE        LASSO         OLSE        LASSO         RE
AMZN   RMSE                   61.6182     62.9610       61.5780     62.9610       61.2005
       MAE                    47.1019     47.6977       47.1447     47.6977       46.2293
       AIC                    2556.35     2321.29       2576.05     2341.28       2561.21
       BIC                    2570.10     2334.64       2624.18     2388.01       2588.71
NFLX   RMSE                   13.3562     13.6602       13.3174     13.8798       12.3834
       MAE                    9.0379      9.2866        9.0565      9.9867        8.4594
       AIC                    1853.02     1685.89       1871.68     1712.52       1826.23
       BIC                    1866.77     1699.25       1919.82     1759.25       1853.73
AAPL   RMSE                   2.6255      2.6866        2.6061      2.6609        2.4899
       MAE                    1.9478      1.9877        1.9346      1.9709        1.8520
       AIC                    1104.74     1009.34       1121.31     1025.34       1088.24
       BIC                    1118.48     1022.69       1269.45     1072.06       1115.75
MSFT   RMSE                   3.8234      3.8979        3.8208      3.8975        3.5746
       MAE                    2.9249      2.9625        2.9315      2.9633        2.7488
       AIC                    1277.63     1164.14       1297.33     1184.10       1254.68
       BIC                    1291.39     1177.49       1345.46     1230.83       1282.19
Table 10 reports the performances of the HAR(3,3) models for (FB, AAPL, MSFT). The conventional HAR(3,3) model has 30 parameters, whereas the exponentially weighted HAR(3,3) model has 15. The decay rate estimates for (FB, AAPL, MSFT) are ( λ_FB , λ_AAPL , λ_MSFT ) = ( 0.03302 , 0.03097 , 0.02733 ), from which the Exp. HAR(3,3) model is fitted in Figure 8. As seen in Table 10, our proposed model performs better than the others with respect to RMSE, MAE, and AIC, whereas the univariate HAR model with LASSO performs best on BIC; the best values are indicated by bold numbers in Table 10. Consequently, the proposed model not only has fewer parameters than the conventional HAR models, but also yields the best performance on loss errors such as RMSE and MAE. The exponentially weighted HAR model with decay rates is thus suitable for the stock prices of joint financial assets with strong cross-correlation in the stock market, rather than for volatility.
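The four comparison measures in Tables 9 and 10 can be computed from the residuals of each fitted model. The sketch below uses the common Gaussian-likelihood conventions AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n); since the paper does not spell out its exact AIC/BIC formulas, this is an assumption, and the reported values need not be reproduced to the digit.

```python
import numpy as np

def fit_measures(resid, n_params):
    """RMSE, MAE, and Gaussian-likelihood AIC/BIC computed from a residual series."""
    e = np.asarray(resid, dtype=float)
    n = e.size
    rss = float(np.sum(e ** 2))
    return {
        "RMSE": np.sqrt(rss / n),
        "MAE": float(np.mean(np.abs(e))),
        "AIC": n * np.log(rss / n) + 2 * n_params,
        "BIC": n * np.log(rss / n) + n_params * np.log(n),
    }
```

Here n_params would be the number of free coefficients of the corresponding equation, for example 4 for a univariate HAR(3) equation as counted in Table 9.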

5. Concluding Remark

This work presents the exponentially weighted multivariate HAR models with exponentially decaying coefficients. The models represent well two main features of financial market data: long memory and strong cross-correlation. The common structure of multivariate data with these features can be expressed by the existence of decay rates in the model coefficients. To detect the existence of the decay rates in the multivariate HAR models, CUSUM-based tests are established in two stages: the first tests whether the multivariate HAR model has an exponential decay rate for each asset, and the second tests whether the model has a common rate for all assets. To test the presence of the rates, difference series are generated from two types of residuals, and change-points in the mean or variance of these pseudo-time series, rather than of the raw data, are detected. The null limiting distributions of the test statistics are derived to be the standard Brownian bridge and provide the asymptotic critical values for rejecting the hypotheses. Easy-to-implement estimators of the decay rates are computed. A Monte Carlo simulation study verifies the proposed tests by illustrating the related series and evaluating the finite-sample size and power of the tests. Empirical examples show the usefulness of the proposed models in the stock market, in particular for stock prices rather than volatility, with fewer parameters and smaller residual errors.
The exponentially weighted multivariate HAR models have four main advantages: (i) a decreased number of parameters; (ii) smaller model-fitting errors; (iii) representation of a common structure via decay rates; and (iv) suitability for modelling stock price movements. These advantages help to provide more efficient models with smaller prediction errors in financial time series modelling. In economics and finance, many strongly correlated data of joint assets play a crucial role in policymaking on economic and social regulations. The multivariate feature of the proposed model could improve forecasting accuracy for such financial assets, and thus make it possible to fine-tune policymaking on these asset classes.
The HAR model has very high applicability in the financial market. An extension of our proposed model can be useful for joint multivariate data with strong correlation and long memory. In particular, as in [12,13,14,15], an extended model incorporating exogenous variables such as an associated-uncertainty index would be a good prediction model with high forecasting gains. For example, joint datasets such as gold and silver, oil and exchange rates, or several stock prices in the same sector are affected together by global issues such as COVID-19 and the Ukrainian War in the current decade. Such an uncertainty-related index can be added to the model as a regressor, and the extended multivariate HAR model is expected to give a good performance.
Also, an extension of the proposed model can be established as a dynamic time series model that is more applicable to real-world market data, for example, with time-varying variance, non-Gaussianity, or heavy-tailed distributions. Recently, Ref. [18] analyzed a multivariate HAR-RV model with GARCH errors, for which a weighting scheme based on the conditional variances of the errors is used to construct weighted least squares estimates. An extension of this work can be linked to heteroscedasticity. Exponentially weighted multivariate HAR models with time-varying variances, such as GARCH errors or ARCH errors without intercept (see Ref. [44]), would be interesting topics. Reference [44] proposed a double AR model without an intercept (DARWIN model) as a modification of an AR-ARCH model: $y_t = \phi y_{t-1} + \eta_t \sqrt{\alpha y_{t-1}^2}$, where $\phi \in \mathbb{R}$, $\alpha > 0$, and $\{\eta_t\}$ is i.i.d. with zero mean and unit variance and independent of $\{y_j : j < t\}$. The DARWIN model is nonstationary and heteroscedastic regardless of the sign of the Lyapunov exponent, and hence it provides a new way to model nonstationary heteroscedastic time series. Analysis of a nonstationary, exponentially weighted HAR model combined with the DARWIN model will be an interesting topic in modelling heteroscedastic time series data. In the exponentially weighted multivariate HAR model with DARWIN errors, statistical methods for detecting and estimating the decay rates, along with estimation of the DARWIN parameters, will be a challenging study.
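To illustrate the DARWIN errors mentioned above, the following sketch simulates the model y_t = φ y_{t−1} + η_t √(α y_{t−1}²) of [44] with standard normal innovations; the parameter values are arbitrary and chosen only for illustration.

```python
import numpy as np

def simulate_darwin(n, phi=0.3, alpha=0.5, y0=1.0, seed=0):
    """Simulate the double AR model without intercept (DARWIN):
    y_t = phi * y_{t-1} + eta_t * sqrt(alpha * y_{t-1}**2), with eta_t i.i.d. N(0, 1)."""
    rng = np.random.default_rng(seed)
    y = np.empty(n)
    y[0] = y0
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.normal() * np.sqrt(alpha * y[t - 1] ** 2)
    return y

path = simulate_darwin(1000)   # a heteroscedastic, possibly nonstationary sample path
```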

Author Contributions

Writing—original draft, W.-T.H. and E.H.; Writing—review & editing, W.-T.H. and E.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Research Foundation of Korea (NRF-2018R1D1A1B07048745) and (2020R1G1A1101911).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs

Proof of Theorem 1.
From (6), we write $D_{j,t} = \hat{\Delta}_j \tilde{W}_{j,t-1}$ for simplicity, where
$$\hat{\Delta}_j = \left( \hat{\lambda}_{j,2} - \hat{\lambda}_{j,1}^{2}, \ \ldots, \ \hat{\lambda}_{j,p-1} - \hat{\lambda}_{j,1}^{p-1} \right), \qquad \tilde{W}_{j,t-1} = \left( \hat{W}_{j,t-1}^{(3)}, \ \ldots, \ \hat{W}_{j,t-1}^{(p)} \right).$$
We fix $j$ and first prove
$$S_{j,n} = \frac{1}{\hat{\sigma}_{j,D}\sqrt{n}} \sum_{t=1}^{n} D_{j,t} \ \overset{d}{\longrightarrow}\ N(0,1). \qquad (\mathrm{A1})$$
Let $\sigma_{j,D}^2 = \lim_{n \to \infty} \hat{\sigma}_{j,D}^2$ in probability, and let
$$X_{n,t} = \frac{1}{\sigma_{j,D}\sqrt{n}} D_{j,t} = \frac{1}{\sigma_{j,D}\sqrt{n}} \hat{\Delta}_j \tilde{W}_{j,t-1}.$$
$\{ X_{n,t} : t = 1, 2, \ldots, n; \ n = 1, 2, \ldots \}$ is a triangular array and it depends on $j$, but the subscript is omitted for notational simplicity. It is clear that
$$\sum_{t=1}^{n} X_{n,t}^2 \ \overset{p}{\longrightarrow}\ 1. \qquad (\mathrm{A2})$$
Note that for each $k, j \in \{1, \ldots, p\}$, $\hat{\lambda}_{j,k} - \hat{\lambda}_{j,1}^{k} = O_p(1/\sqrt{n})$ because the OLSE $\hat{\lambda}_{j,k}$ of $\lambda_{j,k}$ satisfies the asymptotic normality of $\sqrt{n}(\hat{\lambda}_{j,k} - \lambda_{j,k})$ with asymptotic mean zero. Moreover, we may have the asymptotic normality $\sqrt{n}(\hat{\lambda}_{j,k} - \hat{\lambda}_{j,1}^{k}) \overset{d}{\longrightarrow} N(0, \upsilon_k^2)$ for some $\upsilon_k^2 > 0$. From this, and from the boundedness of $\hat{W}_{j,t-1}^{(k)}$ under the condition $E|Y_{j,t}|^{2+\delta} < \infty$, we have
$$E\Big( \max_{t} |X_{n,t}| \Big) \to 0. \qquad (\mathrm{A3})$$
For (A1), we prove $S_n \equiv S_{j,n} = \sum_{t=1}^{n} X_{n,t} \overset{d}{\longrightarrow} N(0,1)$. It suffices to show
$$E[\exp(\iota x S_n)] \to \exp\!\left( -\frac{x^2}{2} \right), \qquad (\mathrm{A4})$$
where $\iota = \sqrt{-1}$ and $x \in \mathbb{R}$.
Let $Z_{n,1} = X_{n,1}$ and $Z_{n,t} = X_{n,t} \, I\!\left( \sum_{s=1}^{t-1} X_{n,s}^2 \le 2 \right)$ for $2 \le t \le n$, where $I(\cdot)$ is the indicator function. Let $J = \inf\{ t : \sum_{s=1}^{t} X_{n,s}^2 > 2 \} \wedge n$. Then
$$P\big( X_{n,t} \ne Z_{n,t} \ \text{for some} \ t \le n \big) = P( J \le n-1 ) \le P\!\left( \sum_{s=1}^{n} X_{n,s}^2 > 2 \right) \to 0 \qquad (\mathrm{A5})$$
by (A2). Now we use the expansion $\exp(\iota x) = (1 + \iota x) \exp\!\left( -\frac{x^2}{2} + R(x) \right)$, where $R(x)$ is some function with $|R(x)| \le |x|^3$. Let
$$V_n = \prod_{t=1}^{n} (1 + \iota x Z_{n,t}) \quad \text{and} \quad U_n = \exp\!\left( -\frac{x^2}{2} \sum_{t=1}^{n} Z_{n,t}^2 + \sum_{t=1}^{n} R(x Z_{n,t}) \right).$$
Note that $V_n U_n = \exp(\iota x S_n)$ and $U_n \overset{p}{\longrightarrow} \exp\!\left( -\frac{x^2}{2} \right) =: a$ by (A2) and (A5). Because $V_n U_n = V_n (U_n - a) + a V_n$, we may show that $|V_n|$ is uniformly integrable and $E(V_n) \to 1$.
We observe
$$|V_n| = \prod_{t=1}^{n} |1 + \iota x Z_{n,t}| = \prod_{t=1}^{J-1} (1 + x^2 X_{n,t}^2)^{1/2} \cdot (1 + x^2 X_{n,J}^2)^{1/2}$$
$$\le \exp\!\left( \frac{x^2}{2} \sum_{t=1}^{J-1} X_{n,t}^2 \right) \left( 1 + |x|\,|X_{n,J}| \right) \le \exp(x^2)\left( 1 + |x| \max_{t} |X_{n,t}| \right),$$
where the inequality $|1 + \iota x z|^2 = 1 + x^2 z^2 \le \exp(x^2 z^2)$ is used. Thus, by (A3), $V_n$ is uniformly integrable. Finally, we show that $E(V_n) \to 1$. Let
$$\mathcal{F}_{n,t} = \sigma\!\left( \hat{\Delta}_j, \ \tilde{W}_{j,s} : -\infty < s \le t, \ j = 1, \ldots, q \right)$$
with $\mathcal{F}_{n,t-1} \subseteq \mathcal{F}_{n,t}$. Set $I_{t-1} := I\!\left( \sum_{s=1}^{t-1} X_{n,s}^2 \le 2 \right)$. We have
$$E(V_n) = E\!\left[ \prod_{t=1}^{n} \left( 1 + \iota x X_{n,t} I\!\left( \sum_{s=1}^{t-1} X_{n,s}^2 \le 2 \right) \right) \right] = E\!\left[ E\!\left( \prod_{t=1}^{n} \left( 1 + \frac{\iota x}{\sigma_{j,D}\sqrt{n}} \hat{\Delta}_j \tilde{W}_{j,t-1} I_{t-1} \right) \,\Bigg|\, \mathcal{F}_{n,n-1} \right) \right]$$
$$= E\!\left[ \prod_{t=1}^{n-1} \left( 1 + \frac{\iota x}{\sigma_{j,D}\sqrt{n}} \hat{\Delta}_j \tilde{W}_{j,t-1} I_{t-1} \right) E\!\left( 1 + \frac{\iota x}{\sigma_{j,D}\sqrt{n}} \hat{\Delta}_j \tilde{W}_{j,n-1} I_{n-1} \,\Big|\, \mathcal{F}_{n,n-1} \right) \right]. \qquad (\mathrm{A6})$$
We also observe
$$E\!\left( 1 + \frac{\iota x}{\sigma_{j,D}\sqrt{n}} \hat{\Delta}_j \tilde{W}_{j,n-1} I_{n-1} \,\Big|\, \mathcal{F}_{n,n-1} \right) = E\!\left( 1 + \frac{\iota x}{\sigma_{j,D}\sqrt{n}} \sum_{k=2}^{p-1} (\hat{\lambda}_{j,k} - \hat{\lambda}_{j,1}^{k}) \hat{W}_{j,n-1}^{(k+1)} I_{n-1} \,\Big|\, \mathcal{F}_{n,n-1} \right)$$
$$= 1 + \frac{\iota x}{\sigma_{j,D}\sqrt{n}} \sum_{k=2}^{p-1} E\!\left( (\hat{\lambda}_{j,k} - \hat{\lambda}_{j,1}^{k}) \hat{W}_{j,n-1}^{(k+1)} I_{n-1} \,\Big|\, \mathcal{F}_{n,n-1} \right)$$
$$= 1 + \frac{\iota x}{\sigma_{j,D}\sqrt{n}} \sum_{k=2}^{p-1} E\!\left( \frac{1}{\sqrt{n}} \left( \Gamma_k + o_p(1) \right) \hat{W}_{j,n-1}^{(k+1)} I_{n-1} \,\Big|\, \mathcal{F}_{n,n-1} \right), \qquad (\mathrm{A7})$$
where $\Gamma_k$ is a normal random variable (with mean $E[\Gamma_k] = 0$) which is independent of $\hat{W}_{j,n-1}^{(k+1)}$. The expectation in (A7) can be written as
$$\frac{1}{\sqrt{n}} E\!\left( \Gamma_k \hat{W}_{j,n-1}^{(k+1)} I_{n-1} \,\Big|\, \mathcal{F}_{n,n-1} \right) + o_p(1/\sqrt{n}) = \frac{1}{\sqrt{n}} E[\Gamma_k]\, E\!\left( \hat{W}_{j,n-1}^{(k+1)} I_{n-1} \,\Big|\, \mathcal{F}_{n,n-1} \right) + o_p(1/\sqrt{n}) = o_p(1/\sqrt{n}).$$
Thus, the last expression in (A7) is $1 + o_p(1/n)$. Proceeding with this successive conditioning argument in (A6), we obtain $E[V_n] = E\!\left[ \prod_{t=1}^{n} \left( 1 + o_p(1/n) \right) \right] = \exp(o_p(1)) \to 1$. Therefore,
$$E[\exp(\iota x S_n)] = E[V_n U_n] = E[V_n (U_n - a)] + a E[V_n] \to a := \exp\!\left( -\frac{x^2}{2} \right),$$
which is (A4), and the asymptotic normality $\frac{1}{\sigma_{j,D}\sqrt{n}} \sum_{t=1}^{n} D_{j,t} \overset{d}{\longrightarrow} N(0,1)$ is established. Because $\sigma_{j,D}^2 = \lim_{n\to\infty} \hat{\sigma}_{j,D}^2$, (A1) holds. Similarly, we can show that $S_{[nz]} := \sum_{t=1}^{[nz]} X_{n,t}$ converges to the Brownian motion $B(z)$. It follows that the desired result in Theorem 1 is obtained. □
Proof of Theorem 2.
The proof is similar to that of Theorem 1. A key difference between the proofs of Theorems 1 and 2 is the following. Let $X_{n,t}^{*} = \frac{1}{\sigma_{D}^{*}\sqrt{n}} \sum_{j=1}^{q} \left( d_{j,t}^2 - \hat{\sigma}_{j,d}^2 \right)$, where $\sigma_{D}^{*2} = \lim_{n\to\infty} \hat{\sigma}_{D}^{*2}$ in probability, and let $S_n^{*} = \sum_{t=1}^{n} X_{n,t}^{*}$. By Theorem 4 of [41], we have the asymptotic normality of $\sqrt{n}(\hat{\lambda}_j - \lambda)$ with asymptotic mean zero for all $j = 1, 2, \ldots, q$ under the null hypothesis with common rate $\lambda$. Thus $\hat{\epsilon}_{j,t}^{*} - \epsilon_t^{*}(j) = O_p(1/\sqrt{n})$ as well as $d_{j,t} := \hat{\epsilon}_{j,t}^{*\,2} - \epsilon_t^{*}(j)^2 = O_p(1/\sqrt{n})$ for all $j$, so $\max_t |X_{n,t}^{*}| = o_p(1/\sqrt{n}) \overset{p}{\longrightarrow} 0$. Replacing $X_{n,t}$ and $S_n$ in the proof of Theorem 1 by $X_{n,t}^{*}$ and $S_n^{*}$, we obtain the same results, along with $V_n^{*} := \prod_{t=1}^{n} (1 + \iota x Z_{n,t}^{*})$ and $E[V_n^{*}] \to 1$, where $Z_{n,t}^{*}$ is defined from $X_{n,t}^{*}$ in the same way. The asymptotic normality of $S_n^{*}$ holds, and the desired limiting distribution in Theorem 2 follows. □

References

1. Corsi, F. A simple approximate long-memory model of realized volatility. J. Financ. Econom. 2009, 7, 174–196.
2. Andersen, T.G.; Bollerslev, T.; Diebold, F.X. Roughing it up: Including jump components in the measurement, modeling, and forecasting of return volatility. Rev. Econ. Stat. 2007, 89, 701–720.
3. Corsi, F.; Mittnik, S.; Pigorsch, C.; Pigorsch, U. The volatility of realized volatility. Econom. Rev. 2008, 27, 46–78.
4. McAleer, M.; Medeiros, M.C. A multiple regime smooth transition heterogeneous autoregressive model for long memory and asymmetries. J. Econom. 2008, 147, 104–119.
5. Hillebrand, E.; Medeiros, M.C. The benefits of bagging for forecast models of realized volatility. Econom. Rev. 2010, 29, 571–593.
6. Tang, Y.; Chi, Y. HAR model and long memory in financial market. In Proceedings of the 2010 2nd International Workshop on Database Technology and Applications, Wuhan, China, 27–28 November 2010; pp. 1–5.
7. Clements, M.P.; Galvão, A.B.; Kim, J.H. Quantile forecasts of daily exchange rate returns from forecasts of realized volatility. J. Empir. Financ. 2008, 15, 729–750.
8. Bollerslev, T.; Tauchen, G.; Zhou, H. Expected stock returns and variance risk premia. Rev. Financ. Stud. 2009, 22, 4463–4492.
9. Bianco, S.; Corsi, F.; Reno, R. Intraday LeBaron effects. Proc. Natl. Acad. Sci. USA 2009, 106, 11439–11443.
10. Asai, M.; McAleer, M.; Medeiros, M.C. Modelling and forecasting noisy realized volatility. Comput. Stat. Data Anal. 2012, 56, 217–230.
11. Luo, J.; Klein, T.; Ji, Q.; Hou, C. Forecasting realized volatility of agricultural commodity futures with infinite Hidden Markov HAR models. Int. J. Forecast. 2022, 38, 51–73.
12. Bouri, E.; Demirer, R.; Gupta, R.; Pierdzioch, C. Infectious diseases, market uncertainty and oil market volatility. Energies 2020, 13, 4090.
13. Bouri, E.; Gkillas, K.; Gupta, R.; Pierdzioch, C. Forecasting power of infectious diseases-related uncertainty for gold realized variance. Financ. Res. Lett. 2021, 42, 101936.
14. Bouri, E.; Gkillas, K.; Gupta, R.; Pierdzioch, C. Forecasting realized volatility of bitcoin: The role of the trade war. Comput. Econ. 2021, 57, 29–53.
15. Bouri, E.; Gupta, R.; Pierdzioch, C.; Salisu, A.A. El Niño and forecastability of oil-price realized volatility. Theor. Appl. Climatol. 2021, 144, 1173–1180.
16. Busch, T.; Christensen, B.J.; Nielsen, M.Ø. The role of implied volatility in forecasting future realized volatility and jumps in foreign exchange, stock, and bond markets. J. Econom. 2011, 160, 48–57.
17. Taylor, N. Realized volatility forecasting in an international context. Appl. Econ. Lett. 2015, 22, 503–509.
18. Hwang, E.; Hong, W.T. A multivariate HAR-RV model with heteroscedastic errors and its WLS estimation. Econ. Lett. 2021, 203, 109855.
19. Čech, F.; Barunik, J. On the modelling and forecasting of multivariate realized volatility: Generalized heterogeneous autoregressive (GHAR) model. J. Forecast. 2017, 36, 181–206.
20. Tang, Y.; Ma, F.; Zhang, Y.; Wei, Y. Forecasting the oil price realized volatility: A multivariate heterogeneous autoregressive model. Int. J. Financ. Econ. 2020, 1–14.
21. Tang, Y.; Xiao, X.; Wahab, M.; Ma, F. The role of oil futures intraday information on predicting US stock market volatility. J. Manag. Sci. Eng. 2021, 6, 64–74.
22. Bubák, V.; Kočenda, E.; Žikeš, F. Volatility transmission in emerging European foreign exchange markets. J. Bank. Financ. 2011, 35, 2829–2841.
23. Bauer, G.H.; Vorkink, K. Forecasting multivariate realized stock market volatility. J. Econom. 2011, 160, 93–101.
24. Souček, M.; Todorova, N. Realized volatility transmission between crude oil and equity futures markets: A multivariate HAR approach. Energy Econ. 2013, 40, 586–597.
25. Cubadda, G.; Guardabascio, B.; Hecq, A. A vector heterogeneous autoregressive index model for realized volatility measures. Int. J. Forecast. 2017, 33, 337–344.
26. Bollerslev, T.; Patton, A.J.; Quaedvlieg, R. Modeling and forecasting (un)reliable realized covariances for more reliable financial decisions. J. Econom. 2018, 207, 71–91.
27. Luo, J.; Ji, Q. High-frequency volatility connectedness between the US crude oil market and China’s agricultural commodity markets. Energy Econ. 2018, 76, 424–438.
28. Luo, J.; Chen, L. Realized volatility forecast with the Bayesian random compressed multivariate HAR model. Int. J. Forecast. 2020, 36, 781–799.
29. Wilms, I.; Rombouts, J.; Croux, C. Multivariate volatility forecasts for stock market indices. Int. J. Forecast. 2021, 37, 484–499.
30. Atsalakis, G.S.; Valavanis, K.P. Surveying stock market forecasting techniques–Part II: Soft computing methods. Expert Syst. Appl. 2009, 36, 5932–5941.
31. Atsalakis, G.S.; Valavanis, K.P. Surveying stock market forecasting techniques–Part I: Conventional methods. J. Comput. Optim. Econ. Financ. 2010, 2, 45–92.
32. Buche, A.; Chandak, M. Stock market forecasting techniques: A survey. J. Eng. Appl. Sci. 2019, 14, 1649–1655.
33. Fama, E.F. The behavior of stock-market prices. J. Bus. 1965, 38, 34–105.
34. Fama, E.F. Random walks in stock market prices. Financ. Anal. J. 1995, 51, 75–80.
35. Weng, B.; Ahmed, M.A.; Megahed, F.M. Stock market one-day ahead movement prediction using disparate data sources. Expert Syst. Appl. 2017, 79, 153–163.
36. Parray, I.R.; Khurana, S.S.; Kumar, M.; Altalbe, A.A. Time series data analysis of stock price movement using machine learning techniques. Soft Comput. 2020, 24, 16509–16517.
37. Hwang, E.; Shin, D.W. Infinite-order, long-memory heterogeneous autoregressive models. Comput. Stat. Data Anal. 2014, 76, 339–358.
38. Hwang, E.; Shin, D.W. A CUSUM test for a long memory heterogeneous autoregressive model. Econ. Lett. 2013, 121, 379–383.
39. Hwang, E.; Shin, D.W. A CUSUMSQ test for structural breaks in error variance for a long memory heterogeneous autoregressive model. Stat. Probab. Lett. 2015, 99, 167–176.
40. Jo, M.; Lee, S. On CUSUM test for dynamic panel models. Stat. Methods Appl. 2021, 30, 515–542.
41. Hong, W.T.; Lee, J.; Hwang, E. A note on the asymptotic normality theory of the least squares estimates in multivariate HAR-RV models. Mathematics 2020, 8, 2083.
42. Hwang, E.; Yu, S. Modeling and forecasting the COVID-19 pandemic with heterogeneous autoregression approaches: South Korea. Results Phys. 2021, 29, 104631.
43. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 1996, 58, 267–288.
44. Li, D.; Guo, S.; Zhu, K. Double AR model without intercept: An alternative to modeling nonstationarity and heteroscedasticity. Econom. Rev. 2019, 38, 319–331.
Figure 6. Sample paths of d_{j,t}², (j = 1, 2), and T̂_n*(z) under H_0* and H_A* in Theorem 2 with n = 500, 1000. Under H_0* (first column), the squared differences d_{j,t}² are small (less than 0.025 for n = 500 and 0.006 for n = 1000), so the test statistic |T̂_n*(z)| stays below 1; under H_A* (second column), d_{j,t}² ranges between 0 and 2500, so a variance change-point is present and |T̂_n*(z)| takes large values, with sup_z |T̂_n*(z)| exceeding 1.5.
Figure 7. (AMZN, NFLX) stock prices and exponentially weighted HAR(3,2) models fitted by REs with their residuals.
Figure 8. (FB, AAPL, MSFT) stock prices and exponentially weighted HAR(3,3) models fitted by REs with their residuals.
Table 10. Comparison of the univariate HAR, multivariate HAR, and exponentially weighted multivariate HAR models for (FB, AAPL, MSFT) stock prices from 7 May 2020 to 6 May 2021; p = 3 , h = ( 1 , 5 , 22 ) ; * ( λ_FB , λ_AAPL , λ_MSFT ) = ( 0.03302 , 0.03097 , 0.02733 ).
                              Univariate HAR(3)         Multivariate HAR(3,3)     Exp. Multi. HAR(3,3) *
Total # of Parameters         12                        30                        15
Estimator                     OLSE        LASSO         OLSE        LASSO         RE
FB     RMSE                   6.1757      6.1953        6.0512      6.1571        5.7108
       MAE                    4.5110      4.5899        4.5443      4.5703        4.2440
       AIC                    1498.20     1356.83       1540.83     1406.19       1345.09
       BIC                    1511.95     1370.18       1643.97     1506.32       1395.15
AAPL   RMSE                   2.6255      2.6866        2.5978      2.6884        2.5393
       MAE                    1.9478      1.9877        1.9334      1.9831        1.8996
       AIC                    1104.74     1009.34       1151.86     1061.76       1007.94
       BIC                    1118.48     1022.69       1255.01     1161.63       1058.00
MSFT   RMSE                   3.8234      3.8979        3.8057      3.9024        3.6608
       MAE                    2.9249      2.9625        2.9316      2.9445        2.8092
       AIC                    1277.63     1164.14       1327.49     1216.41       1160.11
       BIC                    1291.39     1177.49       1430.64     1316.53       1210.17
