Asymptotic and Finite Sample Properties for Multivariate Rotated GARCH Models

This paper derives the statistical properties of a two-step approach to estimating multivariate rotated GARCH-BEKK (RBEKK) models. From the definition of RBEKK, the unconditional covariance matrix is estimated in the first step to rotate the observed variables in order to have the identity matrix for its sample covariance matrix. In the second step, the remaining parameters are estimated by maximizing the quasi-log-likelihood function. For this two-step quasi-maximum likelihood (2sQML) estimator, this paper shows consistency and asymptotic normality under weak conditions. While second-order moments are needed for the consistency of the estimated unconditional covariance matrix, the existence of the finite sixth-order moments is required for the convergence of the second-order derivatives of the quasi-log-likelihood function. This paper also shows the relationship between the asymptotic distributions of the 2sQML estimator for the RBEKK model and variance targeting quasi-maximum likelihood estimator for the VT-BEKK model. Monte Carlo experiments show that the bias of the 2sQML estimator is negligible and that the appropriateness of the diagonal specification depends on the closeness to either the diagonal BEKK or the diagonal RBEKK models. An empirical analysis of the returns of stocks listed on the Dow Jones Industrial Average indicates that the choice of the diagonal BEKK or diagonal RBEKK models changes over time, but most of the differences between the two forecasts are negligible.


Introduction
The BEKK model of Baba et al. (1985) and Engle and Kroner (1995) is widely used for estimating and forecasting time-varying conditional covariance dynamics, especially in the empirical analysis of the multiple asset returns of financial time series (Bauwens et al. 2006; Laurent et al. 2012; McAleer 2005; Silvennoinen and Teräsvirta 2009). The BEKK model is a natural extension of the ARCH/GARCH models of Engle (1982) and Bollerslev (1986). One of the features of the BEKK model is that it guarantees the positive definiteness of the covariance matrix. However, as it does not satisfy suitable regularity conditions, the corresponding estimators do not possess asymptotic properties, except under restrictive conditions (Chang and McAleer 2019; Comte and Lieberman 2003; McAleer et al. 2008). To overcome this problem, Hafner and Preminger (2009) showed the asymptotic properties of the quasi-maximum likelihood (QML) estimator under moderate regularity conditions. Similar to other multivariate GARCH models, a drawback of the BEKK model is that it contains a large number of parameters, even for moderate dimensions. To reduce the number of parameters, the so-called scalar BEKK and diagonal BEKK specifications are used in empirical analyses (Chang and McAleer 2019). Recently, Noureldin et al. (2014) suggested the rotated BEKK (RBEKK) model to handle the high-dimensional BEKK model. They proposed the estimation of the unconditional covariance matrix of the observed variables in the first step to rotate the variables in order to have unit sample variance and zero sample correlation coefficients. In the second step, they considered simplified BEKK models for the QML estimation. We call this procedure the two-step QML (2sQML) estimation.
One of the major advantages of the RBEKK model is that it can reduce the number of parameters in the optimization step, while another is that it is more natural to consider simplified specifications after the rotation than to simplify the structure directly without the rotation.
2sQML is closely related to the variance targeting (VT) specification analyzed by Francq et al. (2011) and Pedersen and Rahbek (2014), among others. The VT-QML estimation also uses the estimated unconditional covariance matrix in the first step to reduce the number of parameters in the QML maximization step. Pedersen and Rahbek (2014) showed the consistency and asymptotic normality of the VT-QML estimator under finite sixth-order moments. As Noureldin et al. (2014) discussed the general framework for the asymptotic distribution of the 2sQML estimator for the RBEKK model, it is thus worth examining the detailed moment condition, as in Pedersen and Rahbek (2014).
In this study, we show the consistency and asymptotic normality of the 2sQML estimator for the RBEKK model by extending the approach of Pedersen and Rahbek (2014). For asymptotic normality, we need to impose sixth-order moment restrictions, as in Hafner and Preminger (2009) and Pedersen and Rahbek (2014). We also derive the asymptotic relationship between the VT-QML estimator for the BEKK model and the 2sQML estimator for the RBEKK model. We conduct Monte Carlo experiments to check the finite sample properties of the 2sQML estimator and compare the performance of the estimated diagonal BEKK and diagonal RBEKK models. The proofs of the propositions and corollaries are given in Appendix A. We present an empirical result based on the returns of stocks listed on the Dow Jones Industrial Average.
There are several works related to the idea of Noureldin et al. (2014). First, the transformation suggested by Noureldin et al. (2014) is related to the (generalized) orthogonal GARCH models of Alexander (2001), van der Weide (2002), and Lanne and Saikkonen (2007). While these authors attempt to find orthogonal or unconditionally uncorrelated components in the raw returns, which can then be modeled individually through univariate variance models, Noureldin et al. (2014) suggest fitting flexible multivariate models to the rotated returns using the VT approach. Second, the symmetric square root rotation of Noureldin et al. (2014) is not the most general type of rotation one could use; see, for instance, the hyper-rotation suggested by Asai and McAleer (2020) and the structural multivariate GARCH approach of Hafner et al. (2020). Both use generalized rotations that are not necessarily symmetric. From the structure, we can infer that these works are not based on the VT approach.

We use the following notation throughout the paper. For a matrix $A$, we define $A^{\otimes 2} = (A \otimes A)$. With $\xi_1, \ldots, \xi_n$ the $n$ eigenvalues of a matrix $A$, $\rho(A) = \max_{i \in \{1,\ldots,n\}} |\xi_i|$ is the spectral radius of $A$. The Frobenius norm of the matrix, or vector, $A$ is defined as $\|A\| = \sqrt{\mathrm{tr}(A'A)}$. For a positive definite matrix $A$, we define the square root, $A^{1/2}$, by the spectral decomposition of $A$. For an $m \times n$ matrix $A$, the $mn \times mn$ commutation matrix $C_{mn}$ has the property $C_{mn}\mathrm{vec}(A) = \mathrm{vec}(A')$.
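The notation above can be checked numerically. The following is a minimal sketch, assuming NumPy and a column-stacking `vec`; the helper names `vec`, `commutation_matrix`, `spectral_radius`, and `frobenius_norm` are illustrative and do not come from the paper.

```python
import numpy as np

def vec(a):
    """Stack the columns of a matrix into a 1-D vector (column-major order)."""
    return a.reshape(-1, order="F")

def commutation_matrix(m, n):
    """Return the mn x mn matrix C with C @ vec(A) == vec(A.T) for any m x n A."""
    c = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] sits at index i + j*m in vec(A) and at j + i*n in vec(A.T).
            c[j + i * n, i + j * m] = 1.0
    return c

def spectral_radius(a):
    """Largest eigenvalue modulus of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(a)))

def frobenius_norm(a):
    """Square root of tr(A'A) for a 2-D array."""
    return np.sqrt(np.trace(a.T @ a))

A = np.arange(6.0).reshape(2, 3)   # arbitrary 2 x 3 example matrix
C = commutation_matrix(2, 3)
```

Here `C @ vec(A)` reproduces `vec(A.T)`, and `frobenius_norm(A)` agrees with NumPy's default matrix norm.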

RBEKK-GARCH Model
As in Hafner and Preminger (2009) and Pedersen and Rahbek (2014), we focus on a simple specification of the BEKK model defined by:

$$X_t = H_t^{1/2} Z_t, \tag{1}$$

$$H_t = C^{*} + A^{*} X_{t-1} X_{t-1}' A^{*\prime} + B^{*} H_{t-1} B^{*\prime}, \tag{2}$$

where $t = 1, \ldots, T$; $A^{*}$ and $B^{*}$ are $d$-dimensional square matrices; $C^{*}$ is a $d$-dimensional positive definite matrix; and $Z_t$ ($d \times 1$) is an i.i.d.$(0, I_d)$ sequence of random variables. We begin with the following assumption.

Assumption 1. (a) The distribution of $Z_t$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^d$, and zero is an interior point of the support of the distribution. (b) $\rho\left((A^{*} \otimes A^{*}) + (B^{*} \otimes B^{*})\right) < 1$.
From Theorem 2.4 of Boussama et al. (2011), Assumption 1 implies the existence of a unique stationary and ergodic solution to the model in (1) and (2). Furthermore, the stationary solution has finite second-order moments, $E\|X_t\|^2 < \infty$, and variance $V(X_t) = E(H_t) = \Omega$, with positive definite $\Omega$, which is the solution to:

$$\Omega = C^{*} + A^{*} \Omega A^{*\prime} + B^{*} \Omega B^{*\prime}. \tag{3}$$

Lemma 2.4 and Proposition 4.3 of Boussama et al. (2011) indicate that the necessary and sufficient condition for (3) to have a positive definite solution is Assumption 1(b). As in Pedersen and Rahbek (2014), we obtain the VT specification by substituting $C^{*}$ in (3) into the model (2), giving:

$$H_t = \Omega - A^{*} \Omega A^{*\prime} - B^{*} \Omega B^{*\prime} + A^{*} X_{t-1} X_{t-1}' A^{*\prime} + B^{*} H_{t-1} B^{*\prime}. \tag{4}$$

Based on this specification, Noureldin et al. (2014) suggested the RBEKK model, which is obtained by setting $A^{*} = \Omega^{1/2} A \Omega^{-1/2}$ and $B^{*} = \Omega^{1/2} B \Omega^{-1/2}$ in (2), where $A$ and $B$ are $d$-dimensional square matrices. The transformation yields:

$$H_t = \Omega^{1/2} \tilde{H}_t \Omega^{1/2}, \qquad \tilde{H}_t = I_d - AA' - BB' + A \tilde{X}_{t-1} \tilde{X}_{t-1}' A' + B \tilde{H}_{t-1} B', \tag{5}$$

with the rotated vector $\tilde{X}_t = \Omega^{-1/2} X_t$, which gives $E(\tilde{X}_t \tilde{X}_t') = I_d$. As discussed by Noureldin et al. (2014), the specification provides a natural interpretation for considering the diagonal matrices $A$ and $B$ to reduce the number of parameters. Rather than the special case with these diagonal matrices, we consider general $A$ and $B$ for asymptotic theory. With respect to the initial values, we consider the estimation conditional on the initial values $X_0$ and $H_0 = h$, where $h$ is a positive definite matrix. Under this structure, it is natural to replace Assumption 1(b) with the following assumption.

Assumption 2. $\rho\left((A \otimes A) + (B \otimes B)\right) < 1$.
Lemma A2 in Appendix A.2 shows that Assumption 2 is equivalent to Assumption 1(b).
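The equivalence of the two spectral radius conditions can be illustrated numerically. The sketch below assumes NumPy; the matrices `omega`, `A`, and `B` are made-up illustrative values, not parameters from the paper, and the helper names are hypothetical.

```python
import numpy as np

def sym_sqrt(omega):
    """Symmetric square root of a positive definite matrix via its spectral decomposition."""
    vals, vecs = np.linalg.eigh(omega)
    return (vecs * np.sqrt(vals)) @ vecs.T

def spectral_radius(a):
    """Largest eigenvalue modulus of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(a)))

# Illustrative (assumed) parameter values for d = 2.
omega = np.array([[1.0, 0.4], [0.4, 2.0]])
A = np.array([[0.30, 0.05], [0.02, 0.25]])
B = np.array([[0.90, 0.01], [0.03, 0.85]])

# Map the rotated parameters to the BEKK parameters A* and B*.
root = sym_sqrt(omega)
root_inv = np.linalg.inv(root)
A_star = root @ A @ root_inv
B_star = root @ B @ root_inv

rho_rotated = spectral_radius(np.kron(A, A) + np.kron(B, B))
rho_bekk = spectral_radius(np.kron(A_star, A_star) + np.kron(B_star, B_star))
```

The two radii coincide up to floating-point error because the two Kronecker sums are similar matrices, so checking the condition on $(A, B)$ is enough.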
In the next section, we consider the 2sQML estimation for the RBEKK model (1) and (5), as in Noureldin et al. (2014) and Pedersen and Rahbek (2014).
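As mentioned above, we consider the 2sQML estimation, which comprises two steps. In the first step, we estimate $\omega$ using the sample covariance matrix, while the second step conducts the QML estimation by optimizing the log-likelihood function for $\lambda$ conditional on the estimates of $\omega$, where $\omega = \mathrm{vec}(\Omega)$ and $\lambda$ collects the elements of $A$ and $B$. For the RBEKK model, the Gaussian log-likelihood function is given by:

$$L_{T,h}(\omega, \lambda) = \sum_{t=1}^{T} l_{t,h}(\omega, \lambda),$$

with the $t$th contribution to the log-likelihood given as:

$$l_{t,h}(\omega, \lambda) = -\frac{1}{2} \log \det\left(H_{t,h}(\omega, \lambda)\right) - \frac{1}{2} X_t' H_{t,h}^{-1}(\omega, \lambda) X_t,$$

excluding the constant. In the first step, we estimate the unconditional covariance matrix by:

$$\hat{\Omega} = \frac{1}{T} \sum_{t=1}^{T} X_t X_t',$$

to rotate $X_t$ as $\tilde{X}_t = \hat{\Omega}^{-1/2} X_t$ and, correspondingly, $H_{t,h}(\omega, \lambda)$.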
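The first step can be sketched in a few lines. This is a minimal illustration assuming NumPy and zero-mean data stored as a $T \times d$ array; the function names `inv_sym_sqrt` and `rotate` are hypothetical, and the simulated input is placeholder Gaussian data rather than a GARCH process. The second (likelihood maximization) step is not shown.

```python
import numpy as np

def inv_sym_sqrt(omega):
    """Inverse symmetric square root of a positive definite matrix."""
    vals, vecs = np.linalg.eigh(omega)
    return (vecs / np.sqrt(vals)) @ vecs.T

def rotate(x):
    """First step: estimate Omega by T^{-1} sum_t X_t X_t' and rotate each X_t.

    x is a T x d array whose rows are the observations X_t'.
    """
    T = x.shape[0]
    omega_hat = x.T @ x / T
    # Omega_hat^{-1/2} is symmetric, so row t of the product is (Omega_hat^{-1/2} X_t)'.
    x_rot = x @ inv_sym_sqrt(omega_hat)
    return omega_hat, x_rot

rng = np.random.default_rng(0)
mix = np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [-0.3, 0.2, 1.0]])
x = rng.standard_normal((500, 3)) @ mix.T   # placeholder correlated data
omega_hat, x_rot = rotate(x)
sample_cov_rot = x_rot.T @ x_rot / x.shape[0]
```

By construction, `sample_cov_rot` equals the identity matrix up to rounding error, which is the property the rotation is designed to deliver.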
Note that the sample covariance matrix is positive definite, since the structure confirms the positive semi-definiteness, and since $T > d$ guarantees that the rank of $[X_1 \cdots X_T]$ is $d$. By definition, we have $T^{-1} \sum_{t=1}^{T} \tilde{X}_t \tilde{X}_t' = I_d$. The conditional log-likelihood function is given by: which is equivalent to $L_{T,h}(\omega, \lambda) + 0.5\,T \log(\det(\Omega))$. Hence, the second step of the estimator is given by: We derive the asymptotic theory for the 2sQML estimator, which consists of (10) and (11). Following Comte and Lieberman (2003), Hafner and Preminger (2009), and Pedersen and Rahbek (2014), we make the following conventional assumptions.

Assumption 3. (a) The process $\{X_t\}$ is strictly stationary and ergodic.
For Assumption 3(a), Assumptions 1(a) and 2 imply the existence of a strictly stationary ergodic solution {X t } in the RBEKK model. Regarding Assumption 3(c), one of the conditions is that the first element in the matrices A and B should be strictly positive, which is a sufficient condition for the parameter identification, as shown by Engle and Kroner (1995).
We now state the following result on the consistency of the 2sQML estimator, $\hat{\theta} = (\hat{\omega}', \hat{\lambda}')'$.
Assumptions 1(a) and 2 imply the finite second-order moments of X t , which are necessary for estimating Ω with the sample covariance matrix. As shown by Hafner and Preminger (2009), the consistency of the QML estimator for the BEKK model (1) and (2) does not require the finite second-order moment of X t .
We make the following assumption about the asymptotic normality of the 2sQML estimator.

As in Pedersen and Rahbek (2014), we need to assume finite sixth-order moments to show that the second-order derivatives of the log-likelihood function converge uniformly on the parameter space. This is different from the univariate case, which only requires finite fourth-order moments (Francq et al. 2011).

Proposition 2. Under Assumptions 1(a) and 2-4, as T → ∞,
with the matrix $K_0$ and non-singular matrices $J_0$ and $\Gamma_0$ defined by:

Remark 1. The structure of the asymptotic covariance matrix is similar to that of the VT-QML estimator for Equations (2) and (4), derived by Pedersen and Rahbek (2014). The major difference comes from the model structure, as the RBEKK further assumes that $A^{*}$ and $B^{*}$ depend on a non-linear function of $\Omega$.
Remark 2. We can estimate $\Gamma_0$, $K_0$, and $J_0$ using the sample outer product of the gradient and Hessian matrices as follows:

Given the asymptotic distribution of $\hat{\theta}$, we can show the asymptotic distribution of the 2sQML estimator of $(\Omega, A^{*}, B^{*})$ in the VT representation of the BEKK. Define $\theta = (\omega', \lambda^{*\prime})'$, where $\lambda^{*} = (\alpha^{*\prime}, \beta^{*\prime})'$.
Remark 3. The difference in the asymptotic covariance matrix for the 2sQML and VT-QML estimators depends on R 0 and R * 0 . While R * 0 is a symmetric matrix, R 0 is a square matrix in general.
From Proposition 3 and the delta method, we provide the asymptotic distribution of the 2sQML estimator of $(C^{*}, A^{*}, B^{*})$.

Remark 4. From Proposition 1, we can estimate $S_0$ and $R_0$ using the 2sQML estimate, $\hat{\theta}$.

Experimental Framework
In this section, we illustrate the theoretical results presented in the previous section using Monte Carlo experiments for d-dimensional rotated diagonal GARCH models. We use the diagonal GARCH model for the model comparison because the number of parameters to be estimated is the same. It is difficult to estimate the fully parametrized BEKK model when d takes a higher value such as d = 30, which gives the number of parameters for $A^{*}$ and $B^{*}$ as $2d^2 = 1800$. By contrast, rotated and unrotated diagonal GARCH models use $2d = 60$ parameters. We consider four experiments to examine the (i) finite sample property of the 2sQML estimator, (ii) difference between the true and estimated covariance matrices, (iii) approximation of the fully parametrized BEKK model via the diagonal models, and (iv) finite sample property of the 2sQML estimator when fourth moments of $X_t$ do not exist. We generate $Z_t$ from $N(0, I_d)$ for the first three experiments; the last experiment replaces it with a multivariate standardized t distribution.

Performance of the 2sQML Estimator
In the first Monte Carlo experiment, we consider the bivariate case (d = 2) for the data-generating processes (DGPs) with the following structure in (5): We consider two types of parameter sets: which are used to obtain (C * 0 , A * 0 , B * 0 ) for the DGPs from (1) and (2). We use H 1 = I 2 for the initial value to generate T = 500 observations. We set the number of replications to 2000. Tables 1 and 2 provide the values of (Ω 0 , A 0 , B 0 ) and corresponding values of (C * 0 , A * 0 , B * 0 ), respectively. While DGP1 describes the positive unconditional correlation, DGP2 uses the negative correlation. Using this specification, we can verify that From this setting, we examine the finite sample property of the 2sQML estimator for (Ω, A, B).  Table 1 shows the sample mean, standard deviations, and root mean squared error of the 2sQML estimates. The table indicates that the bias of the estimators is negligible, even for T = 500. The standard deviation for the sample covariance matrix is relatively large for DGP1, which is expected to decrease as the sample size increases. Figure 1 shows the histograms and QQ Plots of 2sQML estimates for representative parameters. As the estimate of Ω 11 is obtained from the sample covariance matrix, its distribution is skewed to the right. By contrast, the distributions of the 2sQML estimates of A 11 and B 11 are close to the normal distribution.
We also check the effects of the transformation from $(\hat{\Omega}, \hat{A}, \hat{B})$ to $(\hat{C}^{*}, \hat{A}^{*}, \hat{B}^{*})$, as shown in Corollary 1. Table 2 shows the sample mean, standard deviation, and root mean squared error of the transformed estimator. As in Table 1, the bias of the estimators is negligible. Figure 2 indicates that the transformed estimators for the full BEKK specification can be approximated using the normal distribution.
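The transformation from the rotated parameters to the BEKK parameters can be sketched as follows. This is an illustration under two stated assumptions: the mapping $A^{*} = \Omega^{1/2} A \Omega^{-1/2}$, $B^{*} = \Omega^{1/2} B \Omega^{-1/2}$ given earlier, and the usual BEKK moment relation $\Omega = C^{*} + A^{*}\Omega A^{*\prime} + B^{*}\Omega B^{*\prime}$ to back out $C^{*}$. The function name `rbekk_to_bekk` and the numerical values are hypothetical.

```python
import numpy as np

def sym_sqrt(omega):
    """Symmetric square root of a positive definite matrix."""
    vals, vecs = np.linalg.eigh(omega)
    return (vecs * np.sqrt(vals)) @ vecs.T

def rbekk_to_bekk(omega, A, B):
    """Map (Omega, A, B) to (C*, A*, B*) under the assumed relations above."""
    root = sym_sqrt(omega)
    root_inv = np.linalg.inv(root)
    A_star = root @ A @ root_inv
    B_star = root @ B @ root_inv
    # C* is whatever remains of Omega after removing the ARCH and GARCH parts.
    C_star = omega - A_star @ omega @ A_star.T - B_star @ omega @ B_star.T
    return C_star, A_star, B_star

# Illustrative (assumed) values: a diagonal RBEKK with positive correlation.
omega = np.array([[1.0, 0.5], [0.5, 1.2]])
A = np.diag([0.30, 0.25])
B = np.diag([0.90, 0.92])
C_star, A_star, B_star = rbekk_to_bekk(omega, A, B)
```

Even though `A` and `B` are diagonal here, `A_star` and `B_star` are generally full matrices, which is why a diagonal RBEKK is not the same model as a diagonal BEKK.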

Performance of Conditional Covariance Matrix Estimator
In the second experiment, we consider higher-dimensional cases with d = {5, 10, 30}. Denoting by U(a, b) the uniform distribution on [a, b], we simulate the parameters as: We discard the parameters which do not satisfy the stationarity condition. Based on the simulated parameters, we generate $X_t$ for the sample size T = {500, 1000}, and estimate the diagonal RBEKK and diagonal BEKK models to calculate the average of the Frobenius norm of the difference in the conditional covariance matrices:

$$\frac{1}{T} \sum_{t=1}^{T} \left\| \hat{H}_t - H_{t,0} \right\|,$$

where $\hat{H}_t$ is the estimated covariance matrix and $H_{t,0}$ is the true covariance matrix. While $\hat{H}_t = H_{t,h}(\hat{\theta})$ for the diagonal RBEKK model, $\hat{H}_t$ is similarly defined for the diagonal BEKK model. We use the common random parameters for the two models and repeat the procedure 100 times with different random parameters. Table 3 shows the sample means of the average distances. The values of the measure for d = 10 are expected to be approximately four times larger than those for d = 5, as implied by the dimension of $\hat{H}_t$ ($d \times d$). Table 3 supports this result. As the sample size increases, the average distance decreases. Compared with the diagonal BEKK model, the diagonal RBEKK model has a smaller distance measure.
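To make the distance measure concrete, the sketch below simulates a diagonal RBEKK path and evaluates the average Frobenius distance between two covariance paths. It is an illustration only: it assumes NumPy, Gaussian innovations, the rotated recursion $\tilde{H}_t = I_d - AA' - BB' + A\tilde{X}_{t-1}\tilde{X}_{t-1}'A' + B\tilde{H}_{t-1}B'$, and a Cholesky factor in place of the symmetric square root of $\tilde{H}_t$ (which leaves the conditional covariance unchanged). The parameter values and function names are made up, and no estimation is performed.

```python
import numpy as np

def sym_sqrt(omega):
    """Symmetric square root of a positive definite matrix."""
    vals, vecs = np.linalg.eigh(omega)
    return (vecs * np.sqrt(vals)) @ vecs.T

def simulate_rbekk(omega, A, B, T, rng):
    """Simulate X_t and H_t from the assumed rotated BEKK recursion."""
    d = omega.shape[0]
    root = sym_sqrt(omega)
    intercept = np.eye(d) - A @ A.T - B @ B.T
    h_rot = np.eye(d)                      # rotated covariance starts at the identity
    x = np.empty((T, d))
    h = np.empty((T, d, d))
    for t in range(T):
        h[t] = root @ h_rot @ root         # conditional covariance of X_t
        x_rot = np.linalg.cholesky(h_rot) @ rng.standard_normal(d)
        x[t] = root @ x_rot
        h_rot = intercept + A @ np.outer(x_rot, x_rot) @ A.T + B @ h_rot @ B.T
    return x, h

def average_frobenius_distance(h_est, h_true):
    """T^{-1} sum_t ||H_est,t - H_true,t|| using the Frobenius norm."""
    return np.mean(np.linalg.norm(h_est - h_true, axis=(1, 2)))

rng = np.random.default_rng(1)
omega = np.array([[1.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 0.8]])
A = np.diag([0.30, 0.25, 0.20])
B = np.diag([0.90, 0.92, 0.95])
x, h_true = simulate_rbekk(omega, A, B, T=200, rng=rng)
```

In an actual experiment, `h_est` would be the fitted covariance path from either diagonal model, and the distance would be averaged over replications.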

Effects of Diagonal Specification
In the third experiment, we examine the effects of the diagonal specification for the BEKK and RBEKK models when the true model is full BEKK. For this purpose, we consider several measures to check the distance from the diagonal BEKK and RBEKK models to the full BEKK model. The non-diagonal indices are defined as: where diag(Y) creates a diagonal matrix from the square matrix Y. Using the non-diagonal indices, we can calculate the theoretical distance of the diagonal BEKK and RBEKK models.
For the remaining measures, we use the estimated values of the parameters of these models. The maximized log-likelihood L T,h (θ) is used, as is the average of the Frobenius norm of the difference of the conditional covariance matrices, as explained above.
Using these measures, the following Monte Carlo simulations investigate the effects of the diagonal specification of the BEKK and RBEKK models when the true model is bivariate full BEKK. For this purpose, we consider the specification for (4): for 0 ≤ w ≤ 1, where D a1 , D a2 , D b1 , and D b2 are diagonal matrices. When w = 1, the specification reduces to the diagonal BEKK model, while it becomes the diagonal RBEKK model for w = 0. Except for these endpoints, the full BEKK specification provides a non-diagonal structure for A * 0 and B * 0 in (4) and A 0 and B 0 in (5). For the specification in (4), the non-diagonal indices provide the linear functions of w: to calculate the theoretical distances. Consider the parameter settings for the DGPs as: (14).
Set w = 0, 0.1, . . . , 1 to examine 11 cases, with T = 500 and the number of replications set to 100. We estimate the diagonal RBEKK model using the 2sQML method, while VT-QML is used for the diagonal BEKK model. Figures 3 and 4 show the sample means of the average bias for the conditional covariance matrices and sample means of the maximized log-likelihood function for DGP3 and DGP4, respectively. As expected from the structure, the superiority of the diagonal models depends on the structure of the true BEKK model. If w is closer to zero, the diagonal RBEKK model is preferred. The non-diagonality indices are $\gamma_w = 0.0212(1 - w)$ and $\gamma^{r}_w = 0.0309w$ for DGP3, crossing at $w^{\dagger} = 0.407$, and $\gamma_w = 0.0563(1 - w)$ and $\gamma^{r}_w = 0.0614w$ for DGP4, crossing at $w^{\dagger} = 0.479$. These theoretical values of $w^{\dagger}$ correspond to the intersections shown in Figures 3b and 4b, respectively. The Akaike information criterion and Bayesian information criterion lead to the same conclusion, as the numbers of parameters in these two models are the same.
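The crossing points follow from equating the two linear indices: $c(1 - w) = c^{r} w$ gives $w^{\dagger} = c / (c + c^{r})$. A short check, using only the slope coefficients reported above (the function name is hypothetical):

```python
def crossing_point(c_bekk, c_rbekk):
    """Solve c_bekk * (1 - w) = c_rbekk * w for w."""
    return c_bekk / (c_bekk + c_rbekk)

# Slope coefficients as reported in the text for the two DGPs.
w_first = crossing_point(0.0212, 0.0309)
w_second = crossing_point(0.0563, 0.0614)
```

The first value rounds to 0.407; the second comes out near 0.478, which matches the reported 0.479 up to the rounding of the published coefficients.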

Heavy Tails and Moment Conditions
The last experiment uses DGPs 1 and 2 with the multivariate standardized t distribution and the degree-of-freedom parameter ν (denoted by St(ν)), instead of the multivariate standard normal distribution. We consider three cases: (i) a heavy-tailed distribution (ν = 7), which satisfies Assumption 4(a); (ii) DGPs in which the sixth moments are not finite (ν = 5); and (iii) DGPs in which the fourth moments are not finite (ν = 3). The latter two cases imply that the 2sQML estimator is consistent, but its asymptotic normality is not guaranteed. As an alternative approach, we may use the parameter condition derived from Theorem 3 of Hafner (2003). To save space, we omitted the tables for the sample mean, standard deviations, and root mean squared error of the 2sQML estimates, which are available in the Supplementary Materials. Figure 5 shows the histograms and QQ plots of 2sQML estimates for DGP1 with St(7). The Monte Carlo results indicate that the bias is negligible, and the standard deviations are larger than in the case of standard normal distribution. The distributions of the estimates of $\Omega_{11}$, $A_{11}$, and $B_{11}$ are similar to those of Figure 1. This result supports Proposition 2.

For the DGPs with St(5), although the sixth moments are not finite, the distributions of the estimates of $\Omega_{11}$, $A_{11}$, and $B_{11}$ are close to those of Figure 5. This result suggests that it may be possible to relax Assumption 4 while retaining asymptotic normality for the second-step estimator.
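The three cases can be read off from the standard fact that a Student t variable with ν degrees of freedom has a finite moment of order k only when k < ν. The sketch below, assuming NumPy, encodes that rule for the innovation distribution only (moments of $X_t$ also depend on the GARCH parameters) and draws standardized t vectors by rescaling a normal over chi-square mixture; the function names are hypothetical and this is not the authors' simulation code.

```python
import numpy as np

def has_finite_moment(nu, order):
    """A Student t variable with nu degrees of freedom has a finite
    moment of the given order if and only if order < nu."""
    return order < nu

def standardized_t(nu, size, d, rng):
    """Draw `size` d-dimensional t vectors rescaled to identity covariance (needs nu > 2)."""
    z = rng.standard_normal((size, d))
    chi2 = rng.chisquare(nu, size=size)
    t = z / np.sqrt(chi2 / nu)[:, None]      # multivariate t with covariance nu/(nu-2) * I
    return t * np.sqrt((nu - 2.0) / nu)      # rescale so the covariance is I

rng = np.random.default_rng(0)
draws = standardized_t(7, 200_000, 2, rng)
sample_cov = draws.T @ draws / draws.shape[0]
```

With ν = 7 the sixth moment exists, with ν = 5 only moments up to the fourth exist, and with ν = 3 the fourth moment is already infinite while the variance is still finite.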
For the DGPs with St(3), the Monte Carlo result shown in the Supplementary Materials indicates that the bias of the estimators for Ω and A is relatively small, compared with those of $B_{11}$ and $B_{22}$. Figure 7 shows the histograms and QQ plots of 2sQML estimates for DGP1 with St(3). The QQ plot of Figure 7a shows the instability of the first-step estimator. The instability affects the standard deviations of the second-step estimator. Figure 7e implies that the effects are more serious on the estimates of $B_{11}$, with a pressure towards zero. The result indicates that a larger sample size is required in order to improve the estimates for the parameters of B.

The Monte Carlo experiments show that the finite sample properties of the 2sQML estimator are satisfactory and that the average distance between the true and estimated covariance matrices indicates that the difference between the two diagonal BEKK models is not negligible. Because it is difficult to estimate the fully parametrized BEKK model for higher d, it is necessary to examine the model specifications, as in Noureldin et al. (2014), using parsimonious specifications. The Monte Carlo experiments also suggest that the sixth-moment assumption might be relaxed for the second-step estimator.

Empirical Analysis
In this section, we assess the diagonal specification of the RBEKK model compared with the diagonal BEKK model. For this purpose, we focus on out-of-sample forecast evaluation, adopting the approach of Engle and Colacito (2006), for the mean-variance portfolio. We calculate the returns of stocks listed on the Dow Jones Industrial Average, except for Dow Inc. (d = 29), for the period starting from 18 February 2010 to 23 January 2020, yielding 2500 observations. We exclude Dow Inc. since it went public on 1 April 2019. Fixing the sample size as T = 2000, we use rolling windows to obtain one-step-ahead forecasts for the last 500 observations for model i (i = 1, 2), denoted by $\hat{H}$, where ι is the d × 1 vector of ones. Engle and Colacito (2006) show that the realized portfolio volatility is the smallest one when the variance-covariance matrices are correctly specified. We define the distance based on the difference of the squared returns of the two portfolios as: Since the portfolio variances are the same if the forecasts of the covariance matrices are the same, we examine the null hypothesis $H_0: E(e_t) = 0$ using the Diebold and Mariano (1995) test. In this case, the test can be constructed in the following manner. Consider the linear regression model given by $e_t = \mu_e + u_{et}$ with $E(u_{et}) = 0$, and test $H_0: \mu_e = 0$ using the heteroskedasticity- and autocorrelation-consistent standard errors. If the mean of $e_t$ is negative (positive), the diagonal RBEKK (the diagonal BEKK) model is preferred. Table 4 indicates that the Engle and Colacito (2006) test fails to reject the null hypothesis that the two forecasts are equivalent. Figure 8 shows the difference of squared portfolio returns, defined by Equation (15), accompanied by the 95% confidence interval. With a few exceptions, there is no significant difference between the two portfolio weights calculated by the forecasts of the two different models.
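The regression form of the test amounts to a t-statistic on the sample mean of $e_t$ with a long-run variance estimate. The sketch below is one possible implementation, assuming NumPy and a Newey-West estimator with a Bartlett kernel; the kernel, the truncation lag `lags`, and the function name `dm_statistic` are assumptions for illustration, since the text does not specify which HAC estimator is used.

```python
import numpy as np

def dm_statistic(e, lags):
    """t-statistic for H0: E(e_t) = 0 using a Newey-West (Bartlett kernel) variance.

    e    : 1-D array of loss differentials e_t
    lags : truncation lag of the kernel (an assumed tuning choice)
    """
    e = np.asarray(e, dtype=float)
    T = e.size
    mean = e.mean()
    u = e - mean                                # residuals of the regression on a constant
    lrv = u @ u / T                             # lag-0 autocovariance
    for k in range(1, lags + 1):
        gamma_k = u[k:] @ u[:-k] / T            # lag-k autocovariance
        lrv += 2.0 * (1.0 - k / (lags + 1.0)) * gamma_k
    return mean / np.sqrt(lrv / T)

rng = np.random.default_rng(0)
e = rng.standard_normal(500) * 0.1              # placeholder loss differentials
stat = dm_statistic(e, lags=5)
```

The statistic is compared with standard normal critical values; under the sign convention in the text, a significantly negative value would favor the diagonal RBEKK forecasts.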
For instance, at time t = 2178, the forecast by the diagonal RBEKK is preferred. As discussed in Section 4.4, the choice of the diagonal RBEKK or the diagonal BEKK model depends on the true structure of the full BEKK. Figure 8 supports the result of the Engle and Colacito (2006) test in Table 4. To improve the forecasting performance, we may consider a more general rotation matrix, as in Asai and McAleer (2020) and Hafner et al. (2020).

Conclusions
For the RBEKK-GARCH model, we show the consistency and asymptotic normality of the 2sQML estimator under weak conditions. The 2sQML estimation uses the unconditional covariance matrix for the first step and rotates the observed vector to have the identity matrix for its sample covariance matrix. The second step conducts the QML estimation for the remaining parameters. While we require second-order moments for consistency because of the estimation of the covariance matrix, we need finite sixth-order moments for asymptotic normality, as in Pedersen and Rahbek (2014). We also show the asymptotic relation of the 2sQML estimator for the RBEKK model and the VT-QML estimator for the VT-BEKK model. The Monte Carlo results show that the finite sample properties of the 2sQML estimator are satisfactory, and that the adequacy of the diagonal RBEKK depends on the structure of the true parameters. The empirical result for the returns of stocks listed on the DOW30 indicates that the diagonal RBEKK and diagonal BEKK models are competitive, with the superiority of each model changing over time.
As an extension of the dynamic conditional correlation (DCC) model of Engle (2002), Noureldin et al. (2014) suggested rotated DCC models (for a caveat about the regularity conditions underlying DCC, see McAleer (2018)). We can apply the rotation to the different kinds of correlation models suggested by McAleer et al. (2008) and Tse and Tsui (2002). Together with such extensions, the derivation of asymptotic theory for the rotated DCC models is an important direction for future research.

Acknowledgments:
The authors are most grateful to the editor, four anonymous reviewers, and Yoshihisa Baba for very helpful comments and suggestions. The first author acknowledges the financial support of the Japan Ministry of Education, Culture, Sports, Science and Technology, Japan Society for the Promotion of Science, and the Australian Academy of Science. The second author thanks the Ministry of Science and Technology (MOST) for financial support. The third author is most grateful for the financial support of the Australian Research Council, Ministry of Science and Technology (MOST), Taiwan, and the Japan Society for the Promotion of Science.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations
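The following abbreviations are used in this manuscript:

2sQML: two-step quasi-maximum likelihood
DCC: dynamic conditional correlation
DGP: data-generating process
QML: quasi-maximum likelihood
RBEKK: rotated BEKK
VT: variance targeting

Although Pedersen and Rahbek (2014) demonstrated the derivatives with respect to $\Omega$, $A^{*}$, and $B^{*}$, they are not applicable, as $A^{*}$ and $B^{*}$ in (2) depend on $\Omega^{1/2}$ and $\Omega^{-1/2}$ in the RBEKK models (6) and (7), respectively. Related to this issue, we need the following lemma to show the derivatives of the log-likelihood function.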
Because $\Omega^{1/2}$ is positive definite, we obtain the result. A similar application produces the following: $\partial\,\mathrm{vec}(\Omega^{-1}) / \partial \omega$. From the derivative of the inverse of the symmetric matrix shown by 10.6.1(1) of Lütkepohl (1996), we obtain the second result.

The gradient and Hessian matrices of the log-likelihood function are given by: Applying the chain rule and product rule, we obtain: where $\theta_i$ ($i = 1, \ldots, 3d^2$) is the $i$th element of $\theta$: The first equation of (A2) uses 10.3.2(23) and 10.3.3(10) of Lütkepohl (1996), while we apply 10.6.1(1) for the second equation.

From Lemma A1, the product rule, and the chain rule, we obtain the first derivatives: and where $C_{dd}$ is the commutation matrix, which consists of ones and zeros and satisfies $\mathrm{vec}(A') = C_{dd}\,\mathrm{vec}(A)$. Similarly, the second derivatives of $H_t$ are given by an expression ending in

$$+ \left(\Omega^{1/2}\right)^{\otimes 2} \frac{\partial^2 \mathrm{vec}(\tilde{H}_t)}{\partial \lambda_i\, \partial \omega_j} \qquad (i = 1, \ldots, 2d^2,\; j = 1, \ldots, d^2),$$

where $e^{(j)}$ is a $d^2 \times 1$ vector of zeros except for the $j$th element, which takes one. We omit the derivatives of $\tilde{H}_t$.

Before we proceed, we show the equivalence of Assumptions 1(b) and 2.
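Lemma A2. For the RBEKK model defined by (4) and (5), it can be shown that

$$\rho\left((A^{*} \otimes A^{*}) + (B^{*} \otimes B^{*})\right) = \rho\left((A \otimes A) + (B \otimes B)\right).$$

Proof. Since $A^{*} = \Omega^{1/2} A \Omega^{-1/2}$ and $B^{*} = \Omega^{1/2} B \Omega^{-1/2}$, the mixed-product property of the Kronecker product gives

$$(A^{*} \otimes A^{*}) + (B^{*} \otimes B^{*}) = \left(\Omega^{1/2} \otimes \Omega^{1/2}\right)\left[(A \otimes A) + (B \otimes B)\right]\left(\Omega^{1/2} \otimes \Omega^{1/2}\right)^{-1}.$$

The two matrices are therefore similar, and Lütkepohl (1996) indicates that the eigenvalues of $(A^{*} \otimes A^{*}) + (B^{*} \otimes B^{*})$ are the same as those of $(A \otimes A) + (B \otimes B)$, which proves the lemma.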
with Q 0 and Γ 0 as stated in Proposition 2.

Lemma A4. Under the assumptions of Proposition 2, the $\Gamma_0$ stated in Proposition 2 can be written as

Proof of Lemma A4. Using an argument similar to the proof of Lemma B.8 in Pedersen and Rahbek (2014), we can show which implies the equivalence of the asymptotic covariance matrices on both sides.