What They Did Not Tell You About Algebraic (Non-)Existence, Mathematical (IR-)Regularity and (Non-)Asymptotic Properties of the Full BEKK Dynamic Conditional Covariance Model

Persistently high negative covariances between risky assets and hedging instruments are intended to mitigate against risk and subsequent financial losses. In the event of having more than one hedging instrument, multivariate covariances need to be calculated. Optimal hedge ratios are unlikely to remain constant using high frequency data, so it is essential to specify dynamic covariance models. These values can either be determined analytically or numerically on the basis of highly advanced computer simulations. Analytical developments are occasionally promulgated for multivariate conditional volatility models. The primary purpose of the paper is to analyse purported analytical developments for the most widely-used multivariate dynamic conditional covariance model to have been developed to date, namely the Full BEKK model of Baba et al. (1985), which was published as Engle and Kroner (1995). Dynamic models are not straightforward (or even possible) to translate in terms of the algebraic existence, underlying stochastic processes, specification, mathematical regularity conditions, and asymptotic properties of consistency and asymptotic normality, or the lack thereof. The paper presents a critical analysis, discussion, evaluation and presentation of caveats relating to the Full BEKK model, and an emphasis on the numerous dos and don'ts in implementing Full BEKK in practice.


Introduction
Persistently high negative covariances between risky assets and hedging instruments are intended to mitigate against risk and subsequent financial losses. It is possible to hedge against risky assets using one or more hedging instruments as the benchmark, which requires the calculation of multivariate covariances. As optimal hedge ratios are unlikely to remain constant using high frequency data, it is essential to specify dynamic models of covariances.
Modeling, forecasting and evaluating dynamic covariances between hedging instruments and risky financial assets requires the specification and estimation of multivariate models of covariances. These values can either be determined analytically or numerically on the basis of highly advanced computer simulations. High frequency time periods, such as daily data, can lead to models of conditional volatility, where analytical developments are occasionally promulgated.
The primary purpose of the paper is to analyze purported analytical developments for the most widely-used multivariate dynamic conditional covariance model to have been developed to date, namely the Full BEKK model of Baba et al. (1985), which was published as Engle and Kroner (1995).
Dynamic models are not straightforward (or even possible) to translate in terms of the algebraic existence, underlying stochastic processes, specification, mathematical regularity conditions, and asymptotic properties of consistency and asymptotic normality, or the lack thereof. The paper presents a critical analysis, discussion, evaluation and presentation of caveats relating to the Full BEKK model, and an emphasis on the numerous dos and don'ts for implementing Full BEKK in practice.
For the variety of detailed possible outcomes mentioned above, where problematic issues arise constantly, and sometimes unexpectedly, a companion paper by the author evaluates the recent developments in modeling dynamic conditional correlations on the basis of the Dynamic Conditional Correlation (DCC) model (see McAleer 2019). Both papers are intended as Topical Collections to bring the known and unknown results pertaining to Full BEKK and associated non-diagonal multivariate conditional volatility models, such as Triangular BEKK and Hadamard BEKK, into a single collection.
The remainder of the paper is organized as follows. The Full BEKK model is presented in Section 2, which enables the critical analysis in Section 3, namely a discussion, evaluation and presentation of caveats regarding the numerous dos and don'ts in implementing the Full BEKK model in practice.

Model Specification
Some of the results in this section, though not all, are available in the extant literature, but the interpretation of the models and their non-existent underlying stochastic processes, as well as the discussions and caveats in the following section, are not available. Much of the basic material relating to the univariate and multivariate specifications in Sections 2.1 and 2.3 overlaps with the presentation in McAleer (2019).
The first step in estimating Full BEKK is to estimate the standardized shocks from the univariate conditional mean returns shocks. The most widely used univariate conditional volatility model, namely the Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) model, will be presented briefly, followed by Full BEKK. Consider the conditional mean of financial returns, as follows:
$$y_t = E(y_t \mid I_{t-1}) + \varepsilon_t, \tag{1}$$
where the returns, $y_t = \Delta \log P_t$, represent the log-difference in financial asset prices ($P_t$), $I_{t-1}$ is the information set at time $t-1$, and $\varepsilon_t$ is a conditionally heteroskedastic returns shock that has the same unit of measurement as the returns. In order to derive conditional volatility specifications, it is necessary to specify, wherever possible, the stochastic processes underlying the returns shocks, $\varepsilon_t$.
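The definition of returns as the log-difference in prices can be illustrated with a minimal sketch. The function name `log_returns` and the price values are hypothetical and serve only as an illustration; they are not taken from any data set used in the paper.

```python
import numpy as np

def log_returns(prices):
    """Return y_t = log(P_t) - log(P_{t-1}) for a 1-D series of positive prices."""
    prices = np.asarray(prices, dtype=float)
    if prices.ndim != 1 or prices.size < 2:
        raise ValueError("need a 1-D series with at least two prices")
    if np.any(prices <= 0):
        raise ValueError("prices must be strictly positive")
    # np.diff computes consecutive differences of the log-prices
    return np.diff(np.log(prices))

# Hypothetical prices, for illustration only.
example_prices = [100.0, 101.0, 99.5, 100.2]
example_returns = log_returns(example_prices)
```

A series of T prices yields T - 1 returns, and the returns sum to the log of the ratio of the last price to the first.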

Univariate Conditional Volatility Models
Univariate conditional volatilities can be used to standardize the conditional covariances in alternative multivariate conditional volatility models to estimate conditional correlations, which are particularly useful in developing dynamic hedging strategies. The most widely-used univariate model, GARCH, is presented below as an illustration because the focus of the paper is on estimating and testing Full BEKK.

Random Coefficient Autoregressive Process and GARCH
Consider the random coefficient autoregressive process of order one:
$$\varepsilon_t = \phi_t \varepsilon_{t-1} + \eta_t, \tag{2}$$
where
$$\phi_t \sim \mathrm{iid}(0, \alpha), \quad \eta_t \sim \mathrm{iid}(0, \omega),$$
and
$$\eta_t = \varepsilon_t / \sqrt{h_t}.$$
The standardized residual, $\eta_t$, is unit-free of measurement, and is a financial fundamental as it represents a riskless asset. Tsay (1987) derived the ARCH(1) model of Engle (1982) from Equation (2) as:
$$h_t = E(\varepsilon_t^2 \mid I_{t-1}) = \omega + \alpha \varepsilon_{t-1}^2, \tag{3}$$
where $h_t$ is conditional volatility, and $I_{t-1}$ is the information set available at time $t-1$. The mathematical regularity condition of invertibility is used to relate the conditional variance, $h_t$, in Equation (3) to the returns shocks, $\varepsilon_t$, which have the same unit of measurement as $y_t$ in Equation (1), thereby yielding a valid likelihood function of the parameters given the data. The use of an infinite lag length for the random coefficient autoregressive process in Equation (2), with appropriate geometric restrictions (or stability conditions) on the random coefficients, leads to the GARCH model of Bollerslev (1986). From the specification of Equation (2), it is clear that both $\omega$ and $\alpha$ should be positive as they are the unconditional variances of two independent stochastic processes. The GARCH model is given as:
$$h_t = \omega + \alpha \varepsilon_{t-1}^2 + \beta h_{t-1},$$
where $\alpha$ is the short run ARCH effect, and $\beta$, which lies in the range $(-1, 1)$, is the GARCH contribution to the long run persistence of returns shocks.
It should be emphasized that the random coefficient autoregressive process is a sufficient condition to derive ARCH, but to date the ARCH specification has not been derived from any other underlying stochastic process.
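The step from the random coefficient autoregressive process to ARCH(1) can be written out explicitly. The sketch below assumes that the process takes the standard form $\varepsilon_t = \phi_t \varepsilon_{t-1} + \eta_t$, with $\phi_t \sim \mathrm{iid}(0, \alpha)$ and $\eta_t \sim \mathrm{iid}(0, \omega)$ mutually independent and independent of the information set $I_{t-1}$; these are the assumptions under which the cross term vanishes.

```latex
\begin{aligned}
E(\varepsilon_t^2 \mid I_{t-1})
  &= E\big[(\phi_t \varepsilon_{t-1} + \eta_t)^2 \mid I_{t-1}\big] \\
  &= E(\phi_t^2)\,\varepsilon_{t-1}^2
     + 2\,E(\phi_t)\,E(\eta_t)\,\varepsilon_{t-1}
     + E(\eta_t^2) \\
  &= \alpha\,\varepsilon_{t-1}^2 + 0 + \omega \\
  &= \omega + \alpha\,\varepsilon_{t-1}^2 = h_t .
\end{aligned}
```

The second line uses the fact that $\varepsilon_{t-1}$ is known given $I_{t-1}$, and the third line uses the zero means and the variances $\alpha$ and $\omega$ of the two processes, which is why both parameters are interpreted as unconditional variances.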

Multivariate Conditional Volatility Models
Multivariate conditional volatility GARCH models are often used to analyze the interaction between the second moments of returns shocks to a portfolio of assets, and can model the possible risk transmission or spillovers among different assets.
In order to establish volatility spillovers in a multivariate framework, it is useful to define the multivariate extension of the relationship between the returns shocks and the standardized residuals, that is, $\eta_t = \varepsilon_t / \sqrt{h_t}$. The multivariate extension of Equation (1), namely:
$$y_t = E(y_t \mid I_{t-1}) + \varepsilon_t,$$
can remain unchanged by assuming that the three components in the above equation are now $m \times 1$ vectors, where $m$ is the number of financial assets.
The multivariate definition of the relationship between $\varepsilon_t$ and $\eta_t$ is given as:
$$\varepsilon_t = D_t^{1/2} \eta_t, \tag{4}$$
where $D_t = \mathrm{diag}(h_{1t}, h_{2t}, \ldots, h_{mt})$ is a diagonal matrix comprising the univariate conditional volatilities. Define the conditional covariance matrix of $\varepsilon_t$ as $Q_t$. As the $m \times 1$ vector, $\eta_t$, is assumed to be iid for all $m$ elements, the conditional correlation matrix of $\varepsilon_t$, which is equivalent to the conditional correlation matrix of $\eta_t$, is given by $\Gamma_t$.
Therefore, the conditional expectation of the process in Equation (4) is defined as:
$$Q_t = D_t^{1/2} \Gamma_t D_t^{1/2}. \tag{5}$$
Equivalently, the conditional correlation matrix, $\Gamma_t$, can be defined as:
$$\Gamma_t = D_t^{-1/2} Q_t D_t^{-1/2}. \tag{6}$$
Equation (5) is useful if a model of $\Gamma_t$ is available for purposes of estimating the conditional covariance matrix, $Q_t$, whereas Equation (6) is useful if a model of $Q_t$ is available for purposes of estimating the conditional correlation matrix, $\Gamma_t$.
Both Equations (5) and (6) are instructive for a discussion of asymptotic properties. As the elements of $D_t$ are consistent and asymptotically normal, the consistency of $Q_t$ in Equation (5) depends on consistent estimation of $\Gamma_t$, whereas the consistency of $\Gamma_t$ in Equation (6) depends on consistent estimation of $Q_t$. As both $Q_t$ and $\Gamma_t$ are products of matrices, and the inverse of the matrix $D_t$ is not asymptotically normal, even when $D_t$ is asymptotically normal, neither the QMLE of $Q_t$ nor that of $\Gamma_t$ will be asymptotically normal, especially based on the definitions that relate the conditional covariances and conditional correlations given in Equations (5) and (6).
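The two-way relationship between the conditional covariance and conditional correlation matrices can be illustrated numerically. The sketch below assumes that Equations (5) and (6) take the standard forms $Q_t = D_t^{1/2} \Gamma_t D_t^{1/2}$ and $\Gamma_t = D_t^{-1/2} Q_t D_t^{-1/2}$, with $D_t$ holding the conditional variances on its diagonal. The function names and the numerical values are hypothetical.

```python
import numpy as np

def cov_to_corr(Q):
    """Gamma = D^{-1/2} Q D^{-1/2}, where D holds the diagonal (variances) of Q."""
    Q = np.asarray(Q, dtype=float)
    d = np.sqrt(np.diag(Q))            # conditional standard deviations
    return Q / np.outer(d, d)          # elementwise q_ij / (d_i * d_j)

def corr_to_cov(Gamma, h):
    """Q = D^{1/2} Gamma D^{1/2}, where h holds the conditional variances."""
    d = np.sqrt(np.asarray(h, dtype=float))
    return np.asarray(Gamma, dtype=float) * np.outer(d, d)

# Hypothetical 2-asset conditional covariance matrix, for illustration only.
Q_example = np.array([[4.0, -1.2],
                      [-1.2, 1.0]])
Gamma_example = cov_to_corr(Q_example)
Q_roundtrip = corr_to_cov(Gamma_example, np.diag(Q_example))
```

Because $D_t$ is diagonal, pre- and post-multiplication by $D_t^{\pm 1/2}$ reduces to elementwise scaling by the outer product of the standard deviations, which is what the two functions exploit.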

Full BEKK and Diagonal BEKK Models
The vector random coefficient autoregressive process of order one is the multivariate extension of Equation (2), and is given as:
$$\varepsilon_t = \Phi_t \varepsilon_{t-1} + \eta_t, \tag{7}$$
where $\varepsilon_t$ and $\eta_t$ are $m \times 1$ vectors, $\Phi_t$ is an $m \times m$ matrix of random coefficients, $\Phi_t \sim \mathrm{iid}(0, A)$, and $\eta_t \sim \mathrm{iid}(0, QQ')$ (for the structural and statistical properties of finite-order random coefficient autoregressive processes, see Nicholls and Quinn (1980, 1981, 1982)).
Technically, a vectorization of a full (that is, non-diagonal) matrix $A$ to vec $A$ can have dimensions as high as $m^2 \times m^2$, whereas a half-vectorization of a symmetric matrix $A$ to vech $A$ can have dimensions as low as $m(m-1)/2 \times m(m-1)/2$. The matrix $A$ is crucial in the interpretation of symmetric and asymmetric weights attached to the returns shocks.
As the dimension of the unconditional variance of $\varepsilon_t$ in Equation (7) is $m$, if the variance matrix is not restricted parametrically, the dynamic conditional covariance matrix of (7) would depend on the product of the variance of $\Phi_t$, with a dimension that lies between $m(m-1)/2$ and $m^2$, neither of which would be conformable with the dimension of $\varepsilon_{t-1}$.
Where $A$ is either a diagonal matrix, or the special case of a scalar matrix, $A = aI_m$, McAleer et al. (2008) showed that the multivariate extension of GARCH(1,1) from Equation (7), incorporating an infinite geometric lag in terms of the returns shocks, is given as the diagonal (or scalar) BEKK model, namely:
$$Q_t = QQ' + A \varepsilon_{t-1} \varepsilon_{t-1}' A' + B Q_{t-1} B', \tag{8}$$
where $A$ and $B$ are diagonal (or scalar) matrices.
As in the univariate case, it should be emphasized that the vector random coefficient autoregressive process is a sufficient condition to derive diagonal BEKK, but to date the diagonal BEKK specification has not been derived from any other underlying multivariate stochastic process.
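The diagonal BEKK recursion can be sketched in a few lines. The code assumes that Equation (8) takes the standard form $Q_t = QQ' + A \varepsilon_{t-1} \varepsilon_{t-1}' A' + B Q_{t-1} B'$ with diagonal $A$ and $B$. The function name `diagonal_bekk_filter`, the argument `C` (which plays the role of the intercept factor written as $Q$ in the text, renamed to avoid a clash with $Q_t$), and all numerical values are hypothetical and for illustration only. This is a filter for given parameters, not an estimation routine.

```python
import numpy as np

def diagonal_bekk_filter(eps, C, a, b, Q0):
    """Run Q_t = C C' + A e_{t-1} e_{t-1}' A' + B Q_{t-1} B', A = diag(a), B = diag(b).

    eps : (T, m) array of returns shocks
    C   : (m, m) matrix, so that C C' is the intercept term
    a, b: length-m vectors holding the diagonals of A and B
    Q0  : (m, m) initial conditional covariance matrix
    Returns an array of shape (T, m, m); entry t uses shocks up to t-1.
    """
    eps = np.asarray(eps, dtype=float)
    T, m = eps.shape
    A, B = np.diag(a), np.diag(b)
    intercept = C @ C.T
    Q = np.empty((T, m, m))
    Q[0] = Q0
    for t in range(1, T):
        e = eps[t - 1][:, None]              # m x 1 column vector
        Q[t] = intercept + A @ e @ e.T @ A.T + B @ Q[t - 1] @ B.T
    return Q

# Hypothetical inputs, for illustration only.
rng = np.random.default_rng(0)
eps_example = rng.standard_normal((50, 2))
C_example = np.array([[0.3, 0.0], [0.1, 0.2]])
a_example = np.array([0.3, 0.25])
b_example = np.array([0.9, 0.92])
Q_path = diagonal_bekk_filter(eps_example, C_example,
                              a_example, b_example, Q0=np.eye(2))
```

With a nonsingular intercept factor and a positive definite starting value, each $Q_t$ produced by this recursion is symmetric and positive definite by construction, since it is the sum of a positive definite matrix and two positive semi-definite matrices.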
Although the Full BEKK model is always presented in the form of Equation (8), with $A$ and $B$ given as full matrices, as stated above, the specification is not consistent with Equation (8) as the matrices $A$ and $B$ for Full BEKK would have dimensions that lie between $m(m-1)/2$ and $m^2$, which would not be conformable for multiplication with the dimension of the vector $\varepsilon_{t-1}$. McAleer et al. (2008) showed that the QMLE of the parameters of the Diagonal BEKK models are consistent and asymptotically normal, so that standard statistical inference on testing hypotheses is valid. The theoretical results can also be obtained from Nicholls and Quinn (1980, 1981, 1982). Moreover, as $Q_t$ in Equation (8) can be estimated consistently, $\Gamma_t$ in Equation (6) can also be estimated consistently. However, as explained above, asymptotic normality cannot be proved given the definitions in Equations (5) and (6).
However, other special cases of Full BEKK, such as Triangular BEKK and Hadamard BEKK, cannot be obtained from a vector random coefficient autoregressive process, so that any purported asymptotic properties of the QMLE do not exist.
It should be emphasized that the QMLE of the parameters in the conditional means and the conditional variances for univariate GARCH, Diagonal BEKK and Full BEKK will differ as the multivariate models are estimated jointly, whereas the univariate models are estimated individually. The QMLE of the parameters of the conditional means and the conditional variances of Diagonal BEKK and Full BEKK will differ as Diagonal BEKK imposes parametric restrictions on the off-diagonal terms of the conditional covariance matrix of the Full BEKK model.

Discussion and Caveats of Dos and Don'ts Regarding Full BEKK
The results in the previous section allow a clear discussion of the caveats associated with the widely-used Full BEKK. The deficiencies and limitations in virtually all published papers that use the deeply flawed Full BEKK model are given below. The discussion and caveats are presented in a clear and entirely straightforward manner that would seem to need no further elaboration.
(1) Engle (1982) developed an autoregressive model of conditional heteroskedasticity, ARCH, based on the conditional returns shocks.
(2) Bollerslev (1986) extended ARCH by adding a lagged dependent variable to obtain Generalized ARCH, GARCH.
(3) The GARCH(1,1) parameters must satisfy the regularity conditions of positivity as they are the unconditional variances from a univariate random coefficient autoregressive process (see Tsay 1987; McAleer 2014).
(4) However, the coefficient of the arbitrary lagged conditional variance is a positive or negative fraction (see Bollerslev 1986).
(5) The Full BEKK model was proposed in Baba et al. (1985), after whom the model is named.
(6) The Full BEKK model was published ten years later in Engle and Kroner (1995).
(7) The Full BEKK model does not satisfy the definition of a conditional covariance matrix, as the purported conditional covariances do not satisfy the definition of a covariance, except by an untenable assumption.
(8) There is no underlying stochastic process that leads to the Full BEKK model, so that there are no regularity conditions relating to its specification.
(9) The regularity conditions include invertibility, which is essential in relating the iid standardized residuals to the returns data.
(10) It follows that there is no likelihood function.
(11) Consequently, there are no derivatives that would enable the derivation of asymptotic properties for the Quasi-Maximum Likelihood Estimates (QMLE) of the estimated parameters.
(12) Therefore, any statements regarding the purported "statistical significance" of the estimated parameters are meaningless and lack statistical validity (see Chang and McAleer (2019) for a critical analysis).
(13) It follows that any empirical results based on the Full BEKK estimates are fatally flawed and lack statistical validity.
(14) As Full BEKK does not satisfy appropriate regularity conditions, the QMLE do not possess asymptotic properties.
(15) The only exceptions to the non-existence of asymptotic properties of the QMLE of Full BEKK are under highly restrictive and untestable assumptions (see Chang and McAleer 2019; Comte and Lieberman 2003; Hafner and Preminger 2009; McAleer et al. 2008).
(16) The novel results in Tsay (1987) were extended to a vector random coefficient stochastic process, which is a sufficient condition to derive Diagonal BEKK, in McAleer et al. (2008).
(17) McAleer et al. (2008) demonstrate that the Diagonal BEKK model has an underlying stochastic process that leads to its specification, and hence satisfies the regularity conditions, including invertibility.
(18) Consequently, the QMLE of the estimated parameters of Diagonal BEKK are consistent and asymptotically normal.
(19) Other special cases of Full BEKK, such as Triangular BEKK and Hadamard BEKK, cannot be obtained from any known underlying vector random coefficient autoregressive process, so that any purported asymptotic properties of the QMLE do not exist.
(20) It should come as little or no surprise that, when the Full BEKK model is estimated using real data, there are always difficulties in terms of computational convergence, especially when $m > 4$, and these difficulties include the fact that the model does not actually exist!
(21) The computational difficulties are almost certainly associated with the fact that the model does not actually exist!
(22) Moreover, such computational outcomes would almost certainly arise from the addition of between $m(m-1)/2$ and $m^2$ parameters when $p = 1$, especially when the value of $m$ is high for the large (such as $m > 100$) financial portfolios that are observed in practice.
(23) In short, Diagonal BEKK is mathematically and statistically preferable to the fatally flawed Full BEKK and the related non-Diagonal BEKK models, such as Triangular BEKK and Hadamard BEKK.
(24) If Full BEKK is to be considered at all, except in connection with the algebraic non-existence, absence of an underlying stochastic process, mathematical irregularity, and unknown asymptotic statistical properties, or alternatively, in the presence of problems that should be avoided at all costs, it is advisable that the Full BEKK specification be used with extreme and utter caution in empirical practice.