The Realized Hierarchical Archimedean Copula in Risk Modelling

Ostap Okhrin 1 and Anastasija Tetereva 2,*
1 Chair of Econometrics and Statistics esp. Transportation, Institute of Economics and Transport, Faculty of Transportation, Dresden University of Technology, Helmholtzstraße 10, 01069 Dresden, Germany; ostap.okhrin@tu-dresden.de
2 Chair of Mathematics and Statistics, University of St Gallen, Bodanstrasse 6, 9000 St Gallen, Switzerland
* Correspondence: anastasija.tetereva@unisg.ch; Tel.: +41-71-224-2183


Introduction
One of the main objectives of quantitative research is the modelling and approximation of multivariate distributions. A multivariate model should be flexible enough to capture the stylized facts of empirical finance. Moreover, the increasing interest in short-term quantitative risk management requires such models to be time-varying. The current paper builds on two actively developing areas of financial econometrics: copulae and high-frequency data. On the one hand, copulae are a helpful tool for analysing complex dependence structures and evaluating risk, and are therefore widely used to price financial derivatives, see Embrechts et al. (2003), Rodriguez (2007), Hofert and Scherer (2011), Krämer et al. (2013). On the other hand, models based on high-frequency data yield superior predictions in comparison to approaches based on daily data. Among others, Andersen et al. (2002) and Zhang et al. (2005) made it possible to compute daily realized covariances from high-frequency data. Many researchers have implemented the obtained realized measures to model financial time series. Most of those studies, however, employ models where the realized correlation matrix directly characterizes the multivariate distribution, see, for example, Bauer and Vorkink (2011), Chiriac and Voev (2011), Jin and Maheu (2012), or address GARCH-type models, for example, Hansen et al. (2014), Bauwens et al. (2012), Noureldin et al. (2012), Bollerslev et al. (2016). There are only a limited number of studies which discuss the implementation of high-frequency data in copula models. Breymann et al. (2003) and Dias and Embrechts (2004) employ copulae to study the properties of intraday log-returns. Creal et al. (2013) consider an autoregressive updating equation for the copula parameter, and Salvatierra and Patton (2015) improve its predictive power by including the lagged realized volatility in the updating equation.
To the best of our knowledge, the only model that parameterizes the whole Archimedean copula (AC) by the realized variance-covariance matrix is that of Fengler and Okhrin (2016), who introduced the realized copula parameter. The authors suggested capturing time-varying dependence by using high-frequency intraday data to estimate the parameter of an AC daily. It has been demonstrated empirically that the realized copula model outperforms a set of benchmark models in one-day-ahead out-of-sample VaR prediction. The realized copula model of Fengler and Okhrin (2016) has, however, several limitations. First, their realized copula is driven by one single parameter, which limits the flexibility of the model. Second, the estimation procedure is performed by applying a method-of-moments-type estimator, which suffers from the curse of dimensionality.
We propose to extend the work of Fengler and Okhrin (2016) by introducing the realized hierarchical Archimedean copula (rHAC), which allows more flexibility and is applicable to managing high-dimensional portfolios. We adapt the estimation procedures described in Segers and Uyttendaele (2014) and Górecki et al. (2016a) to high-frequency data, which allows estimating the structure and the parameters of a copula based only on a realized covariance matrix. As a result, the estimate does not suffer from microstructure noise or jumps. Moreover, it can be applied to high-dimensional portfolios, since the computationally expensive optimization procedure proposed in Fengler and Okhrin (2016) is reduced to a set of simple tasks. This result is of particular importance in many financial applications, especially in risk management. This paper is structured as follows. Section 2 contains a literature review of copula theory and introduces the concept of a realized copula. An estimator of the structure and the parameters of an rHAC is presented in Section 3. Simulation studies and a comparison with the benchmark models are provided in Section 4. Section 5 discusses the construction of the rHAC and gives a short summary of competing models. Section 6 describes an application of the proposed models to one-day-ahead VaR prediction for a multidimensional portfolio. Finally, we summarize the main contribution of the paper.

The Concept of the Realized Copula
The concept of the copula was introduced to the statistical literature by Sklar (1959) and further popularized in the world of finance by Embrechts et al. (1999) in the context of risk management. Sklar's theorem, see Sklar (1959), states that a d-dimensional distribution function $F(x_1, \ldots, x_d)$ with marginals $F_1, \ldots, F_d$ can be represented as

$$F(x_1, \ldots, x_d) = C_d\{F_1(x_1), \ldots, F_d(x_d)\}, \quad (1)$$

where $C_d(u_1, \ldots, u_d)$ is a d-dimensional copula. In addition, it states that the continuity of the marginal distributions $F_1, \ldots, F_d$ ensures the uniqueness of the copula. While there is a huge number of classes of bivariate copulae, see Nelsen (2007), there is still a lack of multivariate ones. The most popular classes of multivariate copulae are currently the elliptical, factor, pair-copula constructions, and HAC. The first class is often used in practice due to its simplicity and intuitive interpretation. However, elliptical copulae are not able to capture the stylized facts observed in financial data. The factor approach overcomes this limitation and has attracted attention in the copula literature over the last decade, see, for example, Andersen and Sidenius (2004), Van der Voort (2007), Krupskii and Joe (2013), Oh and Patton (2017). The limitation of the factor copula models is that the likelihood function is often not known in closed form, which complicates the estimation of the parameters. Pair-copula constructions are discussed in more detail by Joe (1996), Bedford and Cooke (2001), Czado (2010), and Kurowicka (2011), and are increasing in popularity. Another popular copula class is the AC, which contains, among others, the Clayton, Gumbel and Frank copulae. The AC parametrized by the parameter $\theta$ is defined as

$$C_d(u_1, \ldots, u_d; \theta) = \psi_\theta\{\psi_\theta^{[-1]}(u_1) + \ldots + \psi_\theta^{[-1]}(u_d)\}, \quad u_1, \ldots, u_d \in [0, 1],$$

with the generator $\psi_\theta$ being completely monotone, i.e., $(-1)^j \psi_\theta^{(j)}(t) \geq 0$ for $t > 0$, $j \in \mathbb{N}$, which implies in particular that $\psi_\theta$ is non-increasing and convex on $[0, \infty)$. Furthermore, $\psi_\theta(0) = 1$, $\psi_\theta(\infty) = 0$, and the pseudo-inverse is defined as $\psi_\theta^{[-1]}(t) = \psi_\theta^{-1}(t)$ for $0 \leq t \leq \psi_\theta(0)$ and $0$ otherwise. The generators and the densities of some ACs are given in Appendix A.
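To make the construction concrete, the following sketch (our own illustration, not part of the paper's methodology) evaluates a d-dimensional AC with the Gumbel generator $\psi_\theta(t) = \exp(-t^{1/\theta})$, $\theta \geq 1$, one of the families listed in Appendix A:

```python
import math

def psi_gumbel(t, theta):
    # Gumbel generator: psi_theta(t) = exp(-t**(1/theta)), theta >= 1.
    return math.exp(-t ** (1.0 / theta))

def psi_gumbel_inv(u, theta):
    # Inverse generator on (0, 1]: psi_theta^{-1}(u) = (-log u)**theta.
    return (-math.log(u)) ** theta

def archimedean_copula(us, theta):
    # C_d(u_1, ..., u_d; theta) = psi_theta{ sum_i psi_theta^{-1}(u_i) }.
    return psi_gumbel(sum(psi_gumbel_inv(u, theta) for u in us), theta)
```

For $\theta = 1$ the Gumbel family reduces to the independence copula, so `archimedean_copula([0.5, 0.5], 1.0)` returns 0.25, and setting any argument to 1 leaves the copula unchanged in the remaining arguments, as required of any copula.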
Due to the lack of flexibility of the AC, caused by the fact that the whole copula is driven by just one parameter θ, generalizations such as nested copulae have been introduced. This paper employs a flexible multivariate copula family, the HAC, a special case of which may be defined recursively in the following way:

$$C_d(u_1, \ldots, u_d; s, \boldsymbol{\theta}) = \psi_{\theta_{d-1}}\left[ \psi_{\theta_{d-1}}^{[-1]}\{C_{d-1}(u_1, \ldots, u_{d-1}; s_{d-1}, \theta_1, \ldots, \theta_{d-2})\} + \psi_{\theta_{d-1}}^{[-1]}(u_d) \right], \quad (2)$$

where $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_{d-1})$ is the parameter vector of the HAC and $s$ is the structure of the HAC. As is evident from (2), the current study assumes that all generators of the HAC belong to the same parametric family and each of them depends on one single parameter. For simplicity, we compress the notation of (2) and denote the d-dimensional HAC with $k$ generators, parametrized by the structure $s$ and the parameter vector $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_k)$, as $C_d(u_1, \ldots, u_d; s, \boldsymbol{\theta})$. The structure $s$ is the merging ordering $s = (\ldots(qr)s\ldots)$, where $q, r, s \in \{1, \ldots, d\}$, $q \neq r \neq s$, i.e., a reordering of the indices of the variables $X_i$, $i = 1, \ldots, d$. The structure $s$ of a d-dimensional HAC can be seen as a tree with $k \leq d - 1$ non-leaf nodes that correspond to the generators and $d$ leaves representing the variables $X = (X_1, X_2, \ldots, X_d)$. The leaves correspond to the lowest level of the tree. The root, corresponding to the variable $C_d(u_1, \ldots, u_d; s, \boldsymbol{\theta})$, is assumed to be the highest level of the tree. The nodes which are not leaves are called internal nodes; each of them corresponds to a generator. A node which is directly connected to another node when moving away from the root is called a child node. A node which is directly connected to another node when moving from the leaves to the root is called a parent node. The descendants of a node are its children, the children of those children, etc. The set of ancestors includes the parent of the node, the parents of the parents, etc. The structure of the HAC is called binary if it corresponds to a binary tree, i.e., if each internal node has exactly two children.
Further on, we denote the nodes associated with the generators by $D_{\mathcal{X}_i}$, where $\mathcal{X}_i$ is the set of leaves (variables) that are descendant nodes of the node $D_{\mathcal{X}_i}$, $i = 1, \ldots, k$. With this notation, the node $D_{\mathcal{X}_i}$ is an ancestor of the node $D_{\mathcal{X}_j}$ (of the leaf associated with the variable $X_l$) if $\mathcal{X}_j \subset \mathcal{X}_i$ ($X_l \subset \mathcal{X}_i$), $l = 1, \ldots, d$, $i, j = 1, \ldots, k$. Another concept that will be used later on is that of the lowest common ancestor (lca). The lca of the nodes $D_{\mathcal{X}_i}$ and $D_{\mathcal{X}_j}$ (of the leaves $X_q$ and $X_r$) is the lowest node $D_{\mathcal{X}_l}$ satisfying $\mathcal{X}_i \subset \mathcal{X}_l$ ($X_q \subset \mathcal{X}_l$) and $\mathcal{X}_j \subset \mathcal{X}_l$ ($X_r \subset \mathcal{X}_l$), $q, r = 1, \ldots, d$, $i, j, l = 1, \ldots, k$.
To clarify the above-mentioned definitions and avoid introducing the comprehensive notation of graph theory, we illustrate the above-named concepts by an example. Consider the 5-dimensional copula $C_5(u_1, u_2, u_3, u_4, u_5; s = ((12)(34)5), \boldsymbol{\theta} = (4, 2.5, 2, 1.5))$, where $u_i = F_i(x_i; \nu_i)$ with $\nu_i$ being the parameters of the marginal distributions $F_i(\cdot)$, $i = 1, \ldots, 5$. The tree corresponding to this copula is presented in Figure 1. This copula has the binary structure $s = ((12)(34)5)$. There are $k = 4$ non-leaf (internal) nodes. The leaves, which correspond to the lowest level of the copula tree, are given by the variables $X_1, X_2, X_3, X_4$ and $X_5$. The root $D_{\mathcal{X}_4}$, which represents the highest level of the copula tree, corresponds to the variable $C_5(u_1, u_2, u_3, u_4, u_5; s, \boldsymbol{\theta})$, where $\mathcal{X}_4 = (X_1, X_2, X_3, X_4, X_5)$. The root node is the parent node of the node corresponding to the variable $X_5$ and of the node $D_{\mathcal{X}_3}$ associated with the variable generated by $C_4(u_1, u_2, u_3, u_4; s = ((12)(34)), \boldsymbol{\theta} = (4, 2.5, 2))$, where $\mathcal{X}_3 = (X_1, X_2, X_3, X_4)$. The root node is an ancestor of all other nodes of the given copula tree. The lca of the nodes associated with the variables $X_1$ and $X_2$ is the node $D_{\mathcal{X}_1}$ that corresponds to the variable $C_2(u_1, u_2; s = (12), \theta = 4)$, where $\mathcal{X}_1 = (X_1, X_2)$. The lca of the nodes corresponding to the variables $X_1$ and $X_5$ is the root node $D_{\mathcal{X}_4}$, as it is the lowest node $D_{\mathcal{X}_l}$ satisfying $X_1 \subset \mathcal{X}_l$ and $X_5 \subset \mathcal{X}_l$. Although copula models are flexible enough to capture nonlinear dependencies, many empirical applications require time variability of the parameters (and the structure) of the whole copula. For example, empirical evidence makes it reasonable to assume that the dependence between asset log-returns gets stronger during periods of financial turbulence.
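The lca lookup in this example can be mimicked in a few lines of code. The nested-tuple encoding of the structure and the helper functions below are our own illustration, not part of the rHAC methodology:

```python
def leaves(node):
    # Collect the leaf labels of a nested-tuple tree node.
    if isinstance(node, tuple):
        return [leaf for child in node for leaf in leaves(child)]
    return [node]

def lca(tree, a, b):
    # Lowest common ancestor: descend into the deepest subtree
    # that still contains both leaves a and b.
    if isinstance(tree, tuple):
        for child in tree:
            if a in leaves(child) and b in leaves(child):
                return lca(child, a, b)
    return tree

# The structure s = ((12)(34)5) of the 5-dimensional example copula:
# the root has the children ((12)(34)) and the leaf 5.
s = (((1, 2), (3, 4)), 5)
```

Here `lca(s, 1, 2)` returns the subtree `(1, 2)`, matching the node $D_{\mathcal{X}_1}$ of the example, while `lca(s, 1, 5)` returns the whole tree, matching the root $D_{\mathcal{X}_4}$.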
A vast amount of literature is devoted to dynamic copula models, including the parsimonious rolling-window approach and more sophisticated models, such as, for example, the local change point procedure of Härdle et al. (2013). Recent developments in time-varying copula models take advantage of the rapidly growing availability of high-frequency observations and include the realized measures (volatility and correlations) in copula models to improve their predictive power, see, for example, Salvatierra and Patton (2015). The improvement is obtained due to the fact that, when high-frequency observations are available, the actual realizations of the volatility of log-returns, which are not directly observable, can be estimated by the sum of finely sampled squared log-returns over a fixed time interval. Such a nonparametric ex-post measurement of the log-return variation is called the realized volatility. Analogously, the realized covariances are defined by summing the cross products of intraday log-returns. The formal definition of the realized measures is given in Appendix B. Despite the constantly growing research on incorporating realized measures into multivariate Gaussian models, discussed in Chiriac and Voev (2011) and Bauer and Vorkink (2011), and into GARCH-type models, for example, Hansen et al. (2014) and Bollerslev et al. (2016), there is still a gap in the literature on how the parameters of a non-Gaussian copula can be estimated daily based on high-frequency observations. It is important to note here that standard copula estimation techniques, such as the Maximum Likelihood (ML) method or the inversion of Kendall's τ, cannot be directly applied to tick-by-tick observations. Estimating the copula by applying these approaches to high-frequency data would estimate the multivariate distribution of high-frequency log-returns, which in general does not coincide with the multivariate distribution of daily log-returns.
Such a model would estimate the intraday dependence and produce a forecast of the multivariate distribution of log-returns in the next second; it could not be used for one-day-ahead VaR forecasts. For further details on the standard estimation procedures, refer to Nelsen (2007), Trivedi and Zimmer (2007), Jaworski et al. (2013), Cherubini et al. (2011), Joe (2014) and Durante and Sempi (2015). In contrast to the direct application of the ML approach to tick-by-tick data or a high-frequency estimator of Kendall's τ, there is a considerable literature discussing how to estimate the correlation matrix of daily log-returns via a realized correlation matrix or similar methods, see, for example, Zhang et al. (2005), Hayashi and Yoshida (2005), De Pooter et al. (2008). The idea of using the information concentrated in the realized covariance matrix to estimate the parameters of a copula daily has been employed by Fengler and Okhrin (2016), who used a combination of the results from a lemma of Hoeffding (1940) and Sklar's theorem (1) to express the covariance $\sigma_{ij}$ between two random variables $X_i$ and $X_j$ in terms of the marginal distributions $F_i(\cdot)$ and $F_j(\cdot)$ and the copula $C_2(\cdot, \cdot; \theta)$:

$$\sigma_{ij} = \int_{\mathbb{R}} \int_{\mathbb{R}} \left[ C_2\{F_i(x_i; \nu_i), F_j(x_j; \nu_j); \theta\} - F_i(x_i; \nu_i) F_j(x_j; \nu_j) \right] dx_i \, dx_j, \quad (3)$$

where $\theta$ is the parameter of the copula and $\nu_i, \nu_j$ are the parameters of the marginal distributions $F_i(\cdot)$ and $F_j(\cdot)$. In the high-frequency framework, the covariance $\sigma_{ij}$ in (3) is replaced by the element $r_{ij,t}$ of the realized covariance matrix $R_t$ computed at day $t$. From now on, we denote the diagonal elements of the matrix $R_t$ by $r_{i,t}$ instead of $r_{ii,t}$, $i = 1, \ldots, d$. As has been shown in Breymann et al. (2003) and discussed in more detail in Hautsch (2011), with an increasing sampling frequency, the marginal distributions of log-returns can be assumed to be Gaussian with zero mean and standard deviation equal to $\sqrt{r_{i,t}}$, $i = 1, \ldots, d$; this leads us to assume throughout this study that the margins are $N(0, r_{i,t})$.
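Hoeffding's lemma can be sanity-checked numerically in a toy setting. The sketch below (our own, with hypothetical function names) uses Uniform(0, 1) margins instead of the Gaussian margins of the paper, so that $F_i(x) = x$ and the double integral runs over the unit square; for the comonotone copula $C(u, v) = \min(u, v)$ the identity must return the variance of a uniform variable, $1/12$, and for the independence copula it must return zero:

```python
def hoeffding_cov(C, n=400):
    # Midpoint-rule evaluation of Hoeffding's identity for Uniform(0,1)
    # margins: cov(X, Y) = int int [C(u, v) - u*v] du dv over [0, 1]^2.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += C(u, v) - u * v
    return total * h * h
```

Calling `hoeffding_cov(min)` gives a value close to $1/12 \approx 0.0833$, while `hoeffding_cov(lambda u, v: u * v)` is exactly zero, since the integrand vanishes under independence.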
Thus, if the realized covariance matrix $R_t$ can be computed, then, according to Fengler and Okhrin (2016), it can be assumed that for the Archimedean copula driven by one single parameter $\theta$ the integral in (3) depends on just the parameter of the copula, which belongs to some parametric family $\mathcal{C} = \{C(\cdot; \theta), \theta \in \Theta\}$. Therefore, after replacing the covariances in (3) by their realized counterparts and standardizing the variables, the expression (3) can be rewritten for the realized correlations as

$$\rho_{ij,t} = \int_{\mathbb{R}} \int_{\mathbb{R}} \left[ C_2\{\Phi(x_i), \Phi(x_j); \theta\} - \Phi(x_i) \Phi(x_j) \right] dx_i \, dx_j = f(\theta), \quad (4)$$

where $\Phi(\cdot)$ is the cdf of the standard normal distribution and $\rho_{ij,t} = r_{ij,t} / \sqrt{r_{i,t} \cdot r_{j,t}}$ is the element of the realized correlation matrix $P_t$ calculated at day $t$, $t = 1, \ldots, T$. According to (4), the realized correlations depend solely on the copula parameter, under the assumption of some parametric family. Based on (4), the parameter of the copula can be estimated from just the realized correlation matrix:

$$\hat{\theta}_t = \arg\min_{\theta \in \Theta} \, g_t(\theta)^\top W \, g_t(\theta), \quad (5)$$

where $g_t(\theta)$ is a vector of length $\frac{d(d-1)}{2}$ in which all the $g_{ij,t}(\theta) = \rho_{ij,t} - f(\theta)$ are stacked together and $W$ is a $\frac{d(d-1)}{2} \times \frac{d(d-1)}{2}$-dimensional positive definite weighting matrix. When the copula parameter is estimated from (5) and the diagonal elements of the realized covariance matrix $R_t$ are calculated, the multivariate distribution of $X = (X_1, X_2, \ldots, X_d)$ is fully specified. It is important to note that Fengler and Okhrin (2016) consider the restrictive setting of the AC. Therefore, all bivariate copulae in (4) coincide and are driven by one single parameter $\theta$.
In practice, one is usually interested in predicting a multivariate distribution, rather than just estimating it. This can be done in two ways. The parameter of the realized copula can be estimated daily and predicted using some time-series model. Alternatively, the realized correlation matrix can be predicted and the parameter of the copula can be estimated from $P_{t+1|t}$, the one-day-ahead prediction of the realized correlation matrix $P_{t+1}$ obtained by applying a specific time-series model in the spirit of Bauer and Vorkink (2011) or Chiriac and Voev (2011). The limitation of both approaches comes from the estimation procedure (5), which suffers from the curse of dimensionality and enables the estimation of the realized copula only in moderate dimensions. Moreover, as was mentioned earlier, the whole realized copula in Fengler and Okhrin (2016) is driven by just one parameter θ, which might be too restrictive for multivariate portfolios.
We propose to overcome these limitations by using the HAC instead of the simple AC. This extension is not straightforward, as in addition to the parameter vector $\boldsymbol{\theta}$ of $C_d(u_1, \ldots, u_d; s, \boldsymbol{\theta})$, the structure of the copula $s$ needs to be estimated. The estimation of the parameter vector $\boldsymbol{\theta}$ of a d-dimensional copula $C_d(u_1, \ldots, u_d; s, \boldsymbol{\theta})$ should be addressed as well. The procedure of Fengler and Okhrin (2016) allows the estimation of the parameters at the bottom level of the copula. The estimation of the parameters of the higher levels is not trivial, as the realized correlation among the original variables and the variables determined by the copulae of the bottom levels cannot be specified. This motivates the estimation of the structure and the parameters of the hierarchical copula based just on the realized correlation matrix. Recent studies in the copula literature address the question of how the structure (or the structure and the parameters) of a hierarchical copula can be estimated based on Kendall's τ correlation matrix, see, for example, Segers and Uyttendaele (2014), Górecki et al. (2016a, 2016b), Uyttendaele (2016). We propose to combine the methods discussed in Segers and Uyttendaele (2014) and Górecki et al. (2016a) and adapt them to the realized correlation matrix, with the final goal of improving one-day-ahead VaR prediction for multivariate portfolios.

Estimating the Realized Hierarchical Archimedean Copula
This section discusses how the structure and the parameters of an HAC can be estimated based on the realized correlation matrix P t only. From now on, we refer to such a copula as an rHAC. In this section, the subindex t is dropped to simplify the notation. We suggest generalizing the clustering method proposed by Górecki et al. (2016a) by applying an adaptation of the algorithm introduced in Segers and Uyttendaele (2014) in order to estimate the structure of an HAC. Consequently, the parameters can be estimated by applying (4) to the specific average of the realized correlations. We restrict ourselves to the case when all the generators of the copula belong to the same Archimedean family and satisfy the nesting condition. A brief discussion of this will be provided later in this section.

Estimating the Structure
Analogously to the method proposed in Górecki et al. (2016a) for Kendall's τ, we suggest defining the distance between two variables $X_i$ and $X_j$ as

$$h_{ij} = 1 - \rho_{ij}, \quad (6)$$

where $\rho_{ij}$ is the realized correlation between $X_i$ and $X_j$, $i, j = 1, \ldots, d$. Next, the dependence-based distance matrix is used as the input for an agglomerative cluster analysis. The obtained hierarchical clustering dendrogram corresponds to the estimated structure of the HAC. This approach is, however, valid only for HACs with a binary (bivariate) structure. The introduction of an additional merging parameter that allows collapsing a binary structure into a general one is discussed in Uyttendaele (2016). The optimal choice of such a parameter still needs to be addressed in the literature. To reduce the computational costs, we adapt the method proposed in Segers and Uyttendaele (2014) to the distance (6) to recover the general structure of an rHAC.
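As a sketch of this step, the agglomerative cluster analysis can be run with SciPy on a hypothetical 4-dimensional realized correlation matrix, assuming the distance $h_{ij} = 1 - \rho_{ij}$; the matrix values below are invented for the illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

# Hypothetical realized correlation matrix for d = 4 assets:
# assets 0 and 1 are strongly dependent, as are assets 2 and 3.
P = np.array([[1.0, 0.8, 0.3, 0.3],
              [0.8, 1.0, 0.3, 0.3],
              [0.3, 0.3, 1.0, 0.7],
              [0.3, 0.3, 0.7, 1.0]])

H = 1.0 - P  # dependence-based distance matrix (zero diagonal)

# Condensed distance vector -> complete-linkage agglomerative clustering.
Z = linkage(squareform(H, checks=False), method='complete')
```

The linkage matrix `Z` merges assets 0 and 1 first (distance 0.2), then 2 and 3 (distance 0.3), and finally the two clusters (distance 0.7); `scipy.cluster.hierarchy.dendrogram(Z)` would draw the corresponding tree, i.e., the candidate binary HAC structure ((01)(23)).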

Segers' and Uyttendaele's Algorithm
According to Segers and Uyttendaele (2014), the structure $s$ of a nested HAC can be uniquely recovered from the structures of the set of $\binom{d}{3}$ triples $(X_q, X_r, X_s)$ with distinct $q, r, s = 1, \ldots, d$ using the concept of the lca. According to the definition given in Section 2, the lca of $X_q$ and $X_r$ is the lowest node that has both $X_q$ and $X_r$ as descendants, $q, r = 1, \ldots, d$. In the first step, the structures of the triples are estimated and the lcas of all pairs of variables in each triple are found. For a given tree, there are $d - 2$ lcas that correspond to all possible pairs $(X_q, X_r)$, $q, r = 1, \ldots, d$. In the second step, the pairs of variables which belong to the same equivalence class are merged together step by step, resulting in the tree of the HAC. Two pairs of variables $(X_q, X_r)$ and $(X_p, X_s)$ are said to belong to the same equivalence class if they share the same lca in the tree $s$.
As an example, we consider the 4-dimensional HAC with the predefined structures of the triples presented in Figure 2. Consider the first triple $(U_1, U_2, U_3)$ with the structure $((12)3)$. The lca of $(U_1, U_2)$ is the node $D_{U_1 U_2}$; for simplicity of notation, we write $D_{12}$ instead of $D_{U_1 U_2}$. The parent node of $U_1$ and $U_2$ is $D_{12}$. The ancestor nodes of $U_1$ and $U_2$ are the nodes $D_{12}$ and $D_{123}$. Therefore, the lca of $(U_1, U_2)$ in the structure $((12)3)$ is the node $D_{12}$ and the lca of $(U_1, U_3)$ is the node $D_{123}$.
The lcas of each pair are: $\mathrm{lca}(U_1, U_2) = D_{12}$, $\mathrm{lca}(U_3, U_4) = D_{34}$, and $\mathrm{lca}(U_1, U_3) = \mathrm{lca}(U_1, U_4) = \mathrm{lca}(U_2, U_3) = \mathrm{lca}(U_2, U_4) = D_{1234}$. In the given example, the pairs $(U_1, U_2)$ and $(U_3, U_4)$ do not share lcas with any other pair. Therefore, $U_1$ and $U_2$ belong to the same equivalence class and are merged together in the first step. The same is true for the pair $(U_3, U_4)$. Consequently, it is observed that the pairs $(U_1, U_3)$, $(U_1, U_4)$, $(U_2, U_3)$ and $(U_2, U_4)$ belong to the same equivalence class and are merged together in the second step. The final structure of the copula is $s = ((12)(34))$. For further examples on how the structure of an HAC can be recovered by applying the concept of an lca, we refer to Segers and Uyttendaele (2014).
In this method, the structure of the individual triples should be found first. Each triple can have a binary or a trivial structure. The structure of a triple is called trivial if all three variables are merged together in one step, and binary otherwise. Formally speaking, for each triple of variables $(X_q, X_r, X_s)$, $q, r, s = 1, \ldots, d$, we aim to test the null hypothesis $H_0$: 'the structure is trivial $(q, r, s)$' against $H_1$: 'the structure is binary $((q, r), s)$'. Segers and Uyttendaele (2014) suggest estimating the individual triples using a rank-based method. Let $K_{qr}(w) = \mathrm{P}\{C_2(X_q, X_r) \leq w\}$ be the Kendall distribution between $X_q$ and $X_r$. Its empirical counterpart is then

$$\hat{K}_{qr}(w) = \frac{1}{n} \sum_{m=1}^{n} I(w_{m,qr} \leq w), \quad (7)$$

where $0 < w < 1$, $w_{m,qr} = \frac{1}{n+1} \sum_{l=1}^{n} I(x_{lq} < x_{mq}, x_{lr} < x_{mr})$ and $I(\cdot)$ is the indicator function. The distance between the empirical Kendall distributions of the pairs $(X_s, X_q)$ and $(X_s, X_r)$ is defined as

$$\delta = \frac{1}{n} \sum_{m=1}^{n} \left| \hat{K}_{sq}\{w_{(m),sq}\} - \hat{K}_{sr}\{w_{(m),sr}\} \right|, \quad (9)$$

where $w_{(1),ij}, \ldots, w_{(n),ij}$ are the ordered pseudo-observations $w_{1,ij}, \ldots, w_{n,ij}$. Segers and Uyttendaele (2014) point out that a trivial trivariate structure usually results in three distances which are approximately the same, whereas a binary structure results in one small distance and two larger, approximately equal distances. In order to calibrate the test statistic, Segers and Uyttendaele (2014) suggest drawing K samples from the nonparametrically estimated trivariate Archimedean copula. As the present paper addresses the framework where the copula family is assumed to be known, we modify the algorithm proposed in Segers and Uyttendaele (2014) and simulate from a copula coming from a predefined class. The test statistic is simulated under the assumption that the structure is trivial; therefore, the parameter of the copula can be found by inversion of the average empirical counterpart of Kendall's τ, i.e., $\hat{\theta} = v^{-1}(\tau_{avg})$, where $\tau_{avg} = (\tau_{qr} + \tau_{qs} + \tau_{rs})/3$, $q, r, s = 1, \ldots, d$.
The inverse $v^{-1}(\tau_{avg})$ corresponds to the solution in $\theta$ of the equation

$$4 \int_{[0,1]^2} C_2(u, v; \theta) \, dC_2(u, v; \theta) - 1 = \tau_{avg}, \quad (8)$$

where Kendall's τ is defined as $\tau_{ij} = 2 \, \mathrm{P}\{(X_i - X_j)(Y_i - Y_j) > 0\} - 1$, with $(X_i, Y_i)$ and $(X_j, Y_j)$ being independent draws from $(X, Y)$. For some copula functions, the integral in (8) is known in closed form as a function of $\theta$; for example, for the Gumbel and Clayton copulae $\theta_{Gumbel}(\tau) = \frac{1}{1-\tau}$ and $\theta_{Clayton}(\tau) = \frac{2\tau}{1-\tau}$, respectively. To sum up, the modification of the algorithm of Segers and Uyttendaele (2014) which allows identifying the structure of an HAC based on Kendall's distance is summarized in Algorithm 1.
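The closed-form inversions mentioned above can be written down directly; the function names below are ours:

```python
def theta_gumbel(tau):
    # Gumbel: tau(theta) = 1 - 1/theta  =>  theta = 1 / (1 - tau).
    return 1.0 / (1.0 - tau)

def theta_clayton(tau):
    # Clayton: tau(theta) = theta / (theta + 2)  =>  theta = 2*tau / (1 - tau).
    return 2.0 * tau / (1.0 - tau)

def tau_average(tau_qr, tau_qs, tau_rs):
    # Average Kendall's tau over the three pairs of a triple,
    # used under H_0 (trivial structure) to calibrate the simulation.
    return (tau_qr + tau_qs + tau_rs) / 3.0
```

For instance, an average Kendall's τ of 0.5 maps to $\theta = 2$ under both the Gumbel and the Clayton parameterization, although the two copulae imply very different tail behaviour for the same τ.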
Algorithm 1 Adaptation of the algorithm of Segers and Uyttendaele (2014).
for each triple $(X_q, X_r, X_s)$, $q, r, s = 1, \ldots, d$, do
  Compute the test statistic $\delta$ for the observed sample according to (9).
  Estimate the parameter $\hat{\theta} = v^{-1}(\tau_{avg})$ of the trivial copula.
  for $k = 1, \ldots, K$ do
    Simulate a sample from the trivial copula $C_3(\cdot; \hat{\theta})$ of the same size as the original sample.
    Compute $\delta^{(k)}$ for the simulated sample $k$ in analog to (9).
  end for
  Compute $\delta_{crit}$ by taking the $(1 - \tilde{\alpha})$ quantile of the empirical distribution of $\delta^{(k)}$, $k = 1, \ldots, K$.
  if $\delta > \delta_{crit}$ then reject $H_0$: the true trivariate structure is the trivial structure.
  end if
end for
Recover the full structure of the d-dimensional HAC from the set of $\binom{d}{3}$ triples of variables using the concept of the lowest common ancestor (lca).
Return: the estimated structure of the HAC $\hat{s}$.
The significance level of the individual tests $\tilde{\alpha}$ should be selected with the multiple testing problem in mind. For the overall significance level of the test to be $\alpha$, the significance level of the individual tests can be adjusted accordingly, e.g., via the Bonferroni correction over the $\binom{d}{3}$ triples. However, this approach is not recommended for high-dimensional samples. Therefore, in the empirical part of the paper, we use the rule of thumb proposed in Uyttendaele (2016) and choose the significance level of the individual tests to be smaller than or equal to the overall significance level. It is worth noting that the method of Segers and Uyttendaele (2014) is much more general, as no prior specification of the copula generators is necessary and the generators may differ from level to level of the hierarchy. In contrast, our method assumes that the generators on all levels of the hierarchy belong to the same predefined family. However, the method proposed in Segers and Uyttendaele (2014) and its modification described in Algorithm 1 are not applicable to the case of high-frequency data because of the absence of a high-frequency estimator of Kendall's τ and Kendall's distribution. The computation of the empirical Kendall's distribution (7) involves realizations of $X_1, \ldots, X_d$. Therefore, the estimation of a multivariate distribution of daily observations would require data over a longer time horizon in comparison to the case when the copula is parameterized solely by the realized correlation matrix. The structure and the parameters would have to be fixed within some time window, resulting in reduced time flexibility of the estimated multivariate distribution. Moreover, Algorithm 1 employs Kendall's distance as the test statistic, which leads to large computational costs in higher dimensions.

Clustering Estimator of the Structure
We propose to proceed analogously to Segers and Uyttendaele (2014) and recover the full structure of an HAC from the set of triples of variables. The estimation of the structure of the individual triples is made using a test that, in contrast to Segers and Uyttendaele (2014), does not involve the observations themselves and is based solely on pairwise correlations.
Consider the triple $(X_q, X_r, X_s)$ and assume that the estimated distance $\hat{h}_{qr} = \min(\hat{h}_{qr}, \hat{h}_{qs}, \hat{h}_{rs})$, where $h_{qr}$ is defined in (6). Therefore, the variables $X_q$ and $X_r$ are merged into the cluster $(X_q, X_r)$ in the first step. The distance between the cluster $(X_q, X_r)$ and $X_s$ is calculated according to the complete linkage rule:

$$h_{qr,s} = \max(h_{qs}, h_{rs}). \quad (10)$$

Preliminary simulation studies have shown that the choice of the clustering algorithm is of minor importance. We refer to Kaufman and Rousseeuw (2005) and Hastie et al. (2009) for more details on cluster analysis.
It can be observed that the difference between the merging distances $h_{qr,s}$ and $h_{qr}$ is generally bigger if the trivariate copula has a binary structure. Therefore, the measure

$$\Delta h = h_{qr,s} - h_{qr} \quad (11)$$

can be chosen as the test statistic to distinguish between the trivial and binary structure of a triple. To sum up, the testing procedure is performed in the following way: for each triple, it is assumed that the structure is trivial, and the average correlation $\rho_{avg}$ is computed and inverted to the parameter of the trivial copula $\hat{\theta} = f^{-1}(\rho_{avg})$ according to (4). The null distribution of the test statistic is obtained by simulating $k = 1, \ldots, K$ samples from the trivial copula and calculating $K$ distances $\Delta h^{(k)}$ according to (11). The size of each simulated sample corresponds to the size of the original sample. Finally, the empirical difference of the merging distances is compared to the quantile of the simulated ones. The proposed procedure is briefly summarized in Algorithm 2.
Algorithm 2 Structure determination using cluster analysis.
for each triple $(X_q, X_r, X_s)$, $q, r, s = 1, \ldots, d$, do
  Compute $\Delta h$ for the observed realized correlations according to (11).
  Estimate the parameter $\hat{\theta} = f^{-1}(\rho_{avg})$ of the trivial copula.
  for $k = 1, \ldots, K$ do
    Simulate a sample from the trivial copula $C_3(\cdot; \hat{\theta})$ of the same size as the original sample.
    Compute $\Delta h^{(k)}$ for the simulated sample $k$ according to (11).
  end for
  Compute $h_{crit}$ by taking the $(1 - \tilde{\alpha})$ quantile of the empirical distribution of $\Delta h^{(k)}$, $k = 1, \ldots, K$.
  if $\Delta h > h_{crit}$ then reject $H_0$: the true trivariate structure is the trivial structure.
  end if
end for
Recover the full structure of the d-dimensional rHAC from the set of $\binom{d}{3}$ triples of variables using the concept of the lowest common ancestor (lca).
Return: the estimated structure of the HAC $\hat{s}$.
It is important to note that the estimation of the marginal distributions $F_i(\cdot)$ is a trivial task, as the distribution of the high-frequency log-returns can be assumed to be Gaussian $N(0, r_i)$, $i = 1, \ldots, d$, based on the results described in Hautsch (2011).
The left plot in Figure 3 illustrates the dendrogram of the hierarchical cluster analysis based on the distance (6) and the complete linkage merging rule for a random sample of size 100 from the trivial Gumbel copula. The central part of Figure 3 shows the dendrogram for the binary trivariate Gumbel copula. It can be observed that the difference between the merging distances $h_{12,3} - h_{12}$ is much smaller for the trivial copula. We simulated $k = 1, \ldots, 100$ random samples from each of the above-mentioned copulae and each time calculated $\Delta h^{(k)}$ according to (11). The kernel density estimate of $\Delta h$ based on the 100 random samples is presented in the right part of Figure 3. For the given copulae, the density estimate of $\Delta h$ for the trivial copula is more concentrated. This example only illustrates the validity of the proposed test statistic. The distance between these two distributions is influenced by the values of the parameters, and more research should be done to establish the asymptotic properties of the proposed test. Figure 3. Dendrograms for the trivial Gumbel copula $C_3(u_1, u_2, u_3; s = (123), \theta = 1.4)$ (left) and the binary Gumbel copula $C_3(u_1, u_2, u_3; s = ((12)3), \boldsymbol{\theta} = (1.7, 1.2))$ (center), and the kernel density estimate of $\Delta h = h_{12,3} - h_{12}$, where $h_{12,3} = \max(h_{13}, h_{23})$ (right); blue for the trivial structure and green for the binary structure.

Benchmark Models
Many recent studies have addressed the question of estimating the structure of an HAC, for example, Okhrin et al. (2013, 2015), Górecki et al. (2014, 2016b) and Uyttendaele (2016). Most of these studies illustrate the performance of the proposed methods by means of simulations; the consistency of the structure estimator still has to be addressed in the literature. Some of these studies are much more general than Algorithm 2. However, they are not applicable in the current framework, where the observations cannot be used directly, as discussed in the previous section. Moreover, in the overwhelming majority of cases, the methods perform similarly for large samples. To illustrate the validity of Algorithm 2, it will be compared, by means of simulations, to the recursive procedure proposed in Okhrin et al. (2013), further improved by Górecki et al. (2014) and implemented in the R package HAC by Okhrin and Ristig (2014). The idea of the method is to construct a binary tree by recursively merging the variables with the largest values of the estimated parameter; subsequently, the obtained tree is collapsed using a predefined merging parameter. As is the case with many others, this method cannot be applied to high-frequency data. However, it provides an opportunity to evaluate the loss of precision and the gain in computational speed when the general structure is estimated based solely on the realized correlation matrix.

Estimating the Parameters
As was mentioned in Section 2, the parameters of the copula can be estimated by the inversion of the realized correlation according to (4). However, this is usually done only for the correlation between two variables. Some generalizations for Kendall's τ have already been addressed in the literature. Nelsen (1996) discusses how the parameter of a three-dimensional binary copula can be found by inverting the average coefficient of agreement. The average Kendall's τ based approach has also been described for trivial copulae with an odd number of parameters. Górecki et al. (2016a) mention the estimation of the parameters of a binary HAC based on the Kendall's τ correlation matrix and discuss an extension to HACs with general structures in Górecki et al. (2016b).
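For Kendall's τ the inversion step has well-known closed forms for some families (Clayton: τ = θ/(θ + 2); Gumbel: τ = 1 − 1/θ), whereas the inversion of the linear correlation in (4) generally requires a numerical root search. A minimal sketch of the τ-inversion applied to an averaged coefficient, shown only to illustrate the inversion idea:

```python
def theta_from_tau(tau, family):
    """Invert Kendall's tau for two common Archimedean families.

    Closed forms: Clayton tau = theta/(theta + 2), Gumbel tau = 1 - 1/theta.
    The paper inverts the *linear* realized correlation instead, which in
    general needs a numerical root search; the closed forms here only
    illustrate the inversion step."""
    if family == "clayton":
        return 2.0 * tau / (1.0 - tau)
    if family == "gumbel":
        return 1.0 / (1.0 - tau)
    raise ValueError("unknown family")

# Average tau over the pairs sharing a given lca, then invert once.
taus = [0.42, 0.38, 0.40]
avg = sum(taus) / len(taus)
print(theta_from_tau(avg, "gumbel"))  # theta = 1/(1 - 0.4), about 1.67
```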
We suggest following the idea of averaging the correlation coefficient ρ ij , i, j = 1, . . . , d over some given set of variables to estimate the parameters of the rHAC. The question whether the procedure based on the average realized correlation gives a valid estimate has not been addressed in the literature.
Suppose that the k parameters θ_i, i = 1, . . . , k, of the HAC, corresponding to its k merging nodes, need to be estimated. Let ρ(X_i) be the average correlation of the pairs of variables with the lca at node D_{X_i}. The parameter θ_i of the HAC may then be estimated by inverting the average correlation measure ρ(X_i), i = 1, . . . , k. For the HAC with the structure presented in Figure 1, the node associated with the parameter θ_3 = 2 is the node D_{1234}. The children of the node D_{1234} are the nodes D_{12} and D_{34}; the node D_{12} is associated with the parameter θ_1 = 4 and the node D_{34} with the parameter θ_2 = 2.5. Moreover, the node D_{1234} is the ancestor of the nodes associated with the variables X_1, X_2, X_3 and X_4. The lca of the pair (X_1, X_2) is the node D_{12} and the lca of the pair (X_3, X_4) is the node D_{34}. Therefore, the pairs of variables with the lca at node D_{1234} are (X_1, X_3), (X_1, X_4), (X_2, X_3) and (X_2, X_4), and the average correlation corresponding to the parameter θ_3 is given by ρ(X_1, X_2, X_3, X_4) = (1/4){ρ_{13} + ρ_{23} + ρ_{14} + ρ_{24}}. The parameter θ_3 is estimated by inverting this average correlation. A summary of the estimation procedure is given in Algorithm 3.

Algorithm 3 Average correlation estimator
Input: the realized correlation matrix P, the estimated structure s from Algorithm 2, parametric family of the HAC. Let θ i , i = 1, . . . , k be the set of the HAC parameters to be estimated. Let X i , i = 1, . . . , k be the set of the descendants of the node D X i ; X is the set of all variables.
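The averaging step of Algorithm 3 can be sketched as follows. The node labels, index pairs and correlation values are illustrative; the subsequent inversion of each average correlation into a parameter θ_i proceeds per copula family as described above.

```python
import numpy as np

def average_node_correlations(P, node_pairs):
    """Sketch of the averaging step of Algorithm 3: average the realized
    correlations over the pairs of variables whose lowest common ancestor
    (lca) is each merging node.

    `node_pairs` maps an (illustrative) node label to the list of 0-based
    (i, j) index pairs with their lca at that node."""
    return {node: float(np.mean([P[i, j] for i, j in pairs]))
            for node, pairs in node_pairs.items()}

# Structure ((12)(34)) as in Figure 1: the root D1234 is the lca of the
# four "cross" pairs, while D12 and D34 cover one pair each.
P = np.array([[1.0, 0.8, 0.4, 0.4],
              [0.8, 1.0, 0.4, 0.4],
              [0.4, 0.4, 1.0, 0.7],
              [0.4, 0.4, 0.7, 1.0]])
pairs = {"D12": [(0, 1)],
         "D34": [(2, 3)],
         "D1234": [(0, 2), (0, 3), (1, 2), (1, 3)]}
print(average_node_correlations(P, pairs))
```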
Simulation studies show that the proposed estimator is asymptotically unbiased and follows a Gaussian distribution. Consider the case where the realized correlation is replaced by Kendall's correlation and the parameter is estimated by applying (8) to the average Kendall's τ. Let τ(U_i) be the average empirical Kendall's τ of the pairs of variables with the lca at node D_{U_i}, defined analogously to (12), and let L_i be the set of the pairs of variables with the lca at node D_{U_i}, i = 1, . . . , k. The asymptotic variance of the average Kendall's τ associated with the node D_{U_i} and the parameter θ_i can then be estimated from the pairwise covariances n cov{τ_jl, τ_pq}, whose limits involve the survival copula, with |L| denoting the cardinality of the set L. Combined with the expression (8), this yields the asymptotic variance of the parameter estimator. The estimator of the variance is a straightforward application of known results on the asymptotic distribution of the empirical Kendall's τ.

Simulation Results
In this section, we show the validity of the clustering estimator (CE) presented in Algorithms 2 and 3 and compare it to the adaptation of the method of Segers and Uyttendaele (2014) (SU) and the approach of Okhrin et al. (2013) (OOS), which was improved by Górecki et al. (2014) and implemented in the R package HAC by Okhrin and Ristig (2014). We compare the introduced estimator only to a couple of currently available approaches and leave the recent advances discussed in, for example, Górecki et al. (2014, 2016b), Uyttendaele (2016) and Okhrin et al. (2015) outside the scope of this study, since the objective of the simulation studies is to answer the question of whether the proposed algorithm is valid in the case of linear correlation rather than to find the best possible estimator of an HAC. We are aware of the fact that the linear correlation based estimator might not be as efficient as an ML approach or a nonlinear correlation based estimator, as it contains information only about linear dependencies among the variables. However, in the framework of high-frequency data, this is so far the only feasible way to proceed. Moreover, we aim to define a minimal recommended sample size.
In the current simulation study, no high-frequency observations are present. In order to compare the different methods, CE is applied to the Kendall's correlation matrix and to the linear correlation matrix estimated in the usual manner over the whole sample path, which corresponds to the correlation matrices of daily log-returns. In the case of the SU estimator, the parameters are estimated by the sequential inversion of Kendall's τ. For the estimation of the structure according to Algorithms 1 and 2, we set K = 500 and α = 0.01. A full ML is applied to the structures estimated by OOS. For illustrative purposes, the 5-dimensional copula structures presented in Figure 4 are considered. For each structure, Clayton, Gumbel and Frank copulae are analysed with the parameters corresponding to τ = (0.40, 0.25, 0.10) and τ = (0.45, 0.35, 0.25, 0.10). The marginal distributions are assumed to be known. For each of the above-mentioned estimators, we proceed as follows: a sample of size n is simulated from the copula and the structure is estimated; if the estimated structure coincides with the true one, the parameters are estimated. The procedure is repeated m times until 200 structures are estimated correctly. Thus, the estimators of the structure are compared in terms of the proportion of correctly estimated structures, 200/m. For the comparison of the parameter estimates, we introduce the characteristic E = ‖θ̂ − θ‖, the Euclidean norm of the difference between the vector of true parameters and the estimated one. Tables 1 and 2 present the mean E, the variance Var(E) and the 25% q_0.25(E), 50% q_0.5(E) and 75% q_0.75(E) quantiles of E for different structures. Table 1 shows the simulation results for the 5-dimensional Clayton copula presented in Figure 4 with sample sizes n = 30, 50, 70, 100, 200, 300, 500, 800, 1000.
The results make it evident that the OOS method outperforms all the competitors in small samples for the Clayton copula with the structure s = ((123)(45)). However, there are some outliers, which can be seen from the sample variance of E: the full ML estimate had a large deviation from the true value of the parameter for a few samples. The interquantile range q_0.75(E) − q_0.25(E) is still smaller for the ML in small samples. The same behaviour of the variance is observed for the CE ρ; therefore, this estimator is not recommended for small samples. In contrast, Table 2 shows that for the structure s = (((12)3)(45)), OOS is not the best method for estimating the structure in small samples. This is due to the fact that the performance of this estimator depends on the choice of the merging parameter. The results for the other copulae are presented in Appendix C and show that there is no leading method in terms of estimating the structure: the method to choose depends on the type of the copula and the values of the parameters. For a large enough sample, all the methods perform similarly. The general conclusion to be drawn for the estimation of the parameters is that the variance of the CE r estimator is the highest in small samples and that the full ML has the smallest variance, although some exceptions are observed. It is worth noting that the simulation results are used only for comparison purposes, as the difference in the parameters influences the proportion of correctly estimated structures more severely than does the type of the copula. Additionally, the dimension of the copula should always be taken into consideration in order to select the minimal sufficient sample size. The question of the convergence of the estimator to the true structure still needs to be addressed in the literature.

Table 2. Simulation results for the Clayton copula with the structure (((12)3)(45)) and θ = (1.67, 1.07, 0.67, 0.22).
In Figure 5, we take a closer look at the individual components of θ. We compare only the CE based on Kendall's correlation and the full ML, as CE ρ and SU behave very similarly in terms of the properties of θ̂. It is evident that both estimators are asymptotically unbiased; however, CE has a higher variance. In addition to the kernel density estimates of CE and ML, we add a kernel density estimate of a Gaussian sample (blue line) with mean θ and the variance estimated from (14) and observe that it coincides with the kernel density estimate of CE.

Figure 5. Kernel density estimates of the individual parameter estimates for the Clayton copula with the structure ((123)(45)) and θ = (1.67, 1.33, 1.11).
It is worth noting that the computational advantage is on the side of CE. Figure 6 shows the average computational time in seconds for all the above mentioned estimators over 100 trials. The difference in the computational time becomes crucial with growing dimensions, for example, in Segers and Uyttendaele (2014), the SU estimation of a 7-dimensional copula needs roughly 20 min versus 15 s for the proposed clustering estimator (CE).
The main conclusion of this section is that the linear correlation based clustering estimator is applicable in practice and can be applied to high-frequency data, where samples of moderate size are typical.

Predicting rHAC
The model introduced in this section extends the work of Fengler and Okhrin (2016) to higher dimensions. The computationally expensive estimating procedure (5) is reduced to a set of simple tasks of the form (13). Moreover, this procedure allows avoiding the question of the optimal choice of the weighting matrix W in (5).
As mentioned in Section 2, the combination of a lemma of Hoeffding (1940) and Sklar's theorem (1) allows one to express the pairwise covariances in terms of the copula and the corresponding marginal distributions. Under the assumption that the marginal distributions F_i(x_i, r_{i,t+1}), i = 1, . . . , d, are Gaussian N(0, r_{i,t+1}), the multivariate distribution of daily log-returns X_{t+1} | F_t ∼ F(·; R_{t+1}) is parametrized solely by an F_t-measurable covariance matrix R_{t+1}. This is due to the fact that the structure s_{t+1} and the parameters θ_{t+1} of the HAC are estimated from the realized correlation matrix P_{t+1} using Algorithms 2 and 3, and the margins are fully specified by the realized volatilities r_{i,t+1}, i = 1, . . . , d, i.e., F(x; R_{t+1}) = C{F_1(x_1, r_{1,t+1}), . . . , F_d(x_d, r_{d,t+1}); s_{t+1}, θ_{t+1}}, where x = (x_1, x_2, . . . , x_d)^T. The prediction of the multivariate distribution of daily log-returns is based on the predicted realized covariance matrix R_{t+1|t} obtained by the Heterogeneous Autoregressive (HAR) model introduced by Corsi (2009) and applied in the spirit of Bauer and Vorkink (2011). First, the individual elements of the realized covariance matrix are stacked together into a joint matrix. Then, the matrix logarithm A_t = logm(R_t) is calculated, which guarantees that the back-transformed forecast is positive definite.
In the next step, the covariances are stacked into one vector a_t = vech(A_t) and modeled using the logarithmic version of the HAR model a_{t+1} = β_0 + β_d a_t + β_w ā_t^(w) + β_m ā_t^(m) + ε_{t+1}, where ā_t^(w) = (1/5)∑_{i=0}^{4} a_{t−i} and ā_t^(m) = (1/22)∑_{i=0}^{21} a_{t−i} are the weekly and monthly averages and ε_{t+1} is an error term. When the coefficients in (16) are estimated using ordinary least squares, the prediction a_{t+1|t} is obtained. The prediction R_{t+1|t} is obtained by applying the inverse vech-operator to a_{t+1|t} and taking the matrix exponential R_{t+1|t} = expm(A_{t+1|t}). The prediction of the realized correlation matrix P_{t+1|t} is obtained by dividing the elements of R_{t+1|t} by the product of the square roots of the corresponding predicted realized variances. Since we consider only the one-day-ahead prediction, we assume that the prediction bias caused by the nonlinear transformation is small and omit the bias adjustment, analogously to Chiriac and Voev (2011).
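The logm/vech/HAR/expm pipeline can be sketched as follows. This is a minimal element-wise OLS implementation of the HAR recursion, assuming the standard daily/weekly/monthly horizons (1, 5, 22); the symmetric-matrix log and exp are computed via the eigendecomposition.

```python
import numpy as np

def logm_sym(R):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(R)
    return V @ np.diag(np.log(w)) @ V.T

def expm_sym(A):
    """Matrix exponential of a symmetric matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.exp(w)) @ V.T

def vech(A):
    """Stack the lower triangle (including the diagonal) into a vector."""
    return A[np.tril_indices(A.shape[0])]

def unvech(v, d):
    """Inverse of vech for a symmetric d x d matrix."""
    A = np.zeros((d, d))
    A[np.tril_indices(d)] = v
    return A + np.tril(A, -1).T

def har_forecast(a, w=5, m=22):
    """One-step HAR forecast for each vech element, sketching (16):
    a_{t+1} = b0 + bd a_t + bw mean(a_{t-4..t}) + bm mean(a_{t-21..t}),
    with coefficients estimated element-wise by OLS."""
    T, p = a.shape
    ts = range(m - 1, T - 1)
    pred = np.empty(p)
    for j in range(p):
        X = np.array([[1.0, a[t, j],
                       a[t - w + 1:t + 1, j].mean(),
                       a[t - m + 1:t + 1, j].mean()] for t in ts])
        y = np.array([a[t + 1, j] for t in ts])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        x_last = np.array([1.0, a[-1, j], a[-w:, j].mean(), a[-m:, j].mean()])
        pred[j] = x_last @ beta
    return pred
```

Given a time series of realized covariance matrices R_1, . . . , R_T, one would stack a_t = vech(logm_sym(R_t)) into the rows of `a`, call `har_forecast`, and back-transform via `expm_sym(unvech(pred, d))`.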
We stress once again that only the realized correlation matrix is used for the estimation procedure. The computational costs of such an estimator are low, and the rHAC model still shows excellent forecasting properties.

Competitor Models
In order to show a competitive advantage of the rHAC, we apply it to one-day-ahead VaR prediction for a multidimensional portfolio and compare the performance of the rHAC to three classes of benchmark models:

• Rolling window copula models
• Dynamic copula models
  - Copula DCC model of Engle (2002)
  - Dynamic copula model by Patton (2004)
  - GAS and GRAS models by Creal et al. (2013) and Salvatierra and Patton (2015)
• Realized covariance model by Bauer and Vorkink (2011)

The first class employs copula models with parameters fixed over a given time interval. The second includes dynamic copula models which assume that the parameter of the copula follows some autoregressive process. The third class, which is both popular and successful, comprises the realized volatility models. A more detailed description of the benchmark models is given in Appendix E.

Application
This section illustrates the rHAC model using high-frequency log-returns of stocks traded on the New York Stock Exchange. First, we give a description of the data used in the empirical part of this section. Thereafter, we apply the rHAC and the above mentioned competing models to one-day-ahead VaR prediction. The interpretation of the results is provided at the end of this section.

Value-At-Risk Prediction
The selected data set consists of the tick-by-tick prices of 6 assets obtained from TickData: AA (Alcoa Inc.), AXP (American Express), BAX (Baxter International Inc.), C (Citigroup Inc.), INTC (Intel Corporation) and KO (Coca-Cola Co.). The selection of the number of assets was motivated by the computational intensity of some of the competing models. A well-diversified portfolio was chosen: the selected companies represent the following industrial sectors: consumer products, technology, financial services, chemicals, health care, communications, and energy. The considered time period is from January 2005 to March 2010, which corresponds to T = 1346 trading days. This choice stems from the fact that the correlations among the log-returns increased during the financial crisis, and we are interested in testing whether the rHAC model is able to capture the crashes appearing in 2008 and 2009. To answer this question, we compare the VaR level α to the exceedance ratio α̂ = N/T, where the VaR is defined as the quantile of the profit and loss (P&L) distribution of the portfolio. Here, a_{j,t} S_{j,t} is the value invested in asset j at time t, a_{j,t} are some weights, S_{j,t} is the jth asset's closing price at day t, x_{j,t+1} is the jth asset's log-return at day t + 1, d is the number of assets in the portfolio, T is the sample size, and N = ∑_{t=1}^{T} I{l_t < VaR_t(α)} is the number of exceedances of the realized P&L l_t. From now on, portfolios with equal wealth allocation are considered. Before applying the models, the dataset was cleaned according to Brownlees and Gallo (2006): namely, only the quotes with normal trading conditions, positive price and volume, and a timestamp within the office trading hours of the NYSE are used. Then, outliers were removed according to a specific bid-ask spread rule.
After the dataset was cleaned, the log-returns were aggregated to the 1-minute frequency and the realized volatilities and correlations were obtained using the realized kernel estimator, which allows reducing the microstructure noise. More details on this estimator are given in Appendix B.
The prediction of the realized volatilities and the realized correlations is made using the HAR model (16). The realized volatilities of the selected assets and their out-of-sample predictions are given in Appendix D, Figure A1. The time series of the selected realized correlations together with the predicted values are given in Appendix D, Figure A2. The results coincide with the conclusions of Audrino and Corsi (2010), who state that the prediction of the realized correlations is more difficult than the prediction of the realized volatility due to their large variance. When the realized correlations and the realized volatilities are estimated and the forecast is made, the out-of-sample prediction of the one-day-ahead VaR at the 0.5%, 1%, 5% and 10% levels can be made using the clustering estimator according to Algorithm 4.
In the VaR modelling, it is required that the exceedances are independent and the percentage of the exceedances corresponds to the predefined VaR level. Three backtesting procedures have been used to test these properties. The first testing procedure is the unconditional coverage testing due to Kupiec (1995), which compares the exceedance ratio to the VaR level. The second procedure is the VaR duration test of Christoffersen and Pelletier (2004), which checks the independence of the exceedances. This backtesting tool is based on the number of days between the violations of the risk metric.
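The unconditional coverage test can be sketched as follows. This is the standard textbook form of the Kupiec likelihood ratio statistic, not code from the paper; under H_0 it is asymptotically chi-squared with one degree of freedom.

```python
import math

def kupiec_pof(N, T, alpha):
    """Kupiec (1995) proportion-of-failures LR statistic for N exceedances
    in T days against the VaR level alpha.

    Asymptotically chi-squared with one degree of freedom under H0."""
    pi_hat = N / T
    if N == 0:
        log_l1 = T * math.log(1.0 - pi_hat)        # degenerate term dropped
    elif N == T:
        log_l1 = N * math.log(pi_hat)              # degenerate term dropped
    else:
        log_l1 = (T - N) * math.log(1.0 - pi_hat) + N * math.log(pi_hat)
    log_l0 = (T - N) * math.log(1.0 - alpha) + N * math.log(alpha)
    return -2.0 * (log_l0 - log_l1)

# An exceedance ratio matching the VaR level gives a statistic of zero.
print(kupiec_pof(10, 1000, 0.01))
```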
The dynamic quantile (DQ) test of Engle and Manganelli (2004) enables testing the two required properties simultaneously. In the most widespread version of the test, the demeaned exceedances Hit_t = I{l_t < VaR_t(α)} − α are regressed on their first lag and the lagged values of the VaR, Hit_t = γ_0 + γ_1 Hit_{t−1} + γ_2 VaR_{t−1}(α) + u_t. The null hypothesis of independence and correct conditional coverage is given by H_0: γ_0 = 0, γ_1 = 0 and γ_2 = 0.
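A sketch of the DQ statistic follows. The regressor set (an intercept, the lagged hit and the lagged VaR) matches the description in the text; implementations of the test often add further lags, so treat this as a minimal variant. Under H_0 the statistic is asymptotically chi-squared with 3 degrees of freedom.

```python
import numpy as np

def dq_statistic(exceed, var, alpha):
    """Minimal dynamic quantile statistic in the spirit of Engle and
    Manganelli (2004): regress Hit_t = I{l_t < VaR_t(alpha)} - alpha on an
    intercept, Hit_{t-1} and VaR_{t-1}(alpha), then form the Wald-type
    statistic g' X'X g / (alpha (1 - alpha))."""
    hit = np.asarray(exceed, dtype=float) - alpha
    X = np.column_stack([np.ones(len(hit) - 1),
                         hit[:-1],
                         np.asarray(var, dtype=float)[:-1]])
    y = hit[1:]
    g = np.linalg.lstsq(X, y, rcond=None)[0]
    return float(g @ (X.T @ X) @ g) / (alpha * (1.0 - alpha))
```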
Algorithm 4 One-day-ahead VaR prediction
Predict R_{t+1|t} using the HAR model and compute P_{t+1|t}. Estimate the structure s_{t+1|t} and the parameter vector θ_{t+1|t} of the rHAC from P_{t+1|t} using Algorithms 2 and 3 with α = 0.01. Simulate samples u_{t+1|t} from C_d(·; s_{t+1|t}, θ_{t+1|t}), transform them through the Gaussian margins into log-returns, and compute the empirical α-quantile of the simulated P&L.
To verify this method, the results are compared to the benchmark models described in Section 5.2. The backtesting results of the unconditional coverage and independence tests are presented in Table 3. The p-values indicate that the copula models give a more accurate prediction for the AA-AXP-BAX-C-INTC-KO portfolio at the 0.5%, 1% and 5% levels, but do not match the 10% quantile well. The unconditional coverage test supports both the rolling window and rHAC models; however, the independence test of Christoffersen and Pelletier (2004) speaks in favor of the rHAC model.
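The simulation step of this VaR prediction can be sketched for the exchangeable (trivial) Clayton case; a full rHAC would require nested sampling over the estimated structure. Everything below (function names, the equal one-share positions, the parameter values) is illustrative, not the paper's implementation.

```python
import numpy as np
from statistics import NormalDist

def sample_clayton(n, d, theta, rng):
    """Marshall-Olkin frailty sampler for the trivial d-dimensional
    Clayton copula: V ~ Gamma(1/theta), U_i = (1 + E_i/V)^(-1/theta)."""
    V = rng.gamma(1.0 / theta, 1.0, size=(n, 1))
    E = rng.exponential(1.0, size=(n, d))
    return (1.0 + E / V) ** (-1.0 / theta)

def simulate_var(prices, variances, theta, alpha, n=20000, seed=7):
    """Monte Carlo one-day VaR sketch: draw uniforms from the copula, map
    them through Gaussian margins N(0, r_i) to log-returns, and take the
    alpha-quantile of the simulated portfolio P&L (one share held in each
    asset, for illustration)."""
    rng = np.random.default_rng(seed)
    U = sample_clayton(n, len(prices), theta, rng)
    inv = np.vectorize(NormalDist().inv_cdf)
    X = inv(U) * np.sqrt(np.asarray(variances))   # simulated log-returns
    pnl = (np.asarray(prices) * (np.exp(X) - 1.0)).sum(axis=1)
    return float(np.quantile(pnl, alpha))
```

With daily variances of 0.0004 (a 2% daily volatility) for three assets priced at 100, the 1% VaR is a markedly more negative quantile than the 10% VaR, as expected.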
The time series of the P&L for the given portfolio and the corresponding VaR bounds are illustrated in Figure 7. The rHAC method has been found to be effective in handling the 1% and 0.5% quantiles, which is especially important in risk management; no models with a similar predictive power have been found, and the hitting ratios of the dynamic copula and the realized covariance approaches are disappointing. The results for the larger portfolio are illustrated in Figure 8, and the p-values for the three considered backtesting procedures can be found in Table 4. It is evident that the multidimensional realized copula model does not suffer from the curse of dimensionality and performs satisfactorily in the sense of unconditional coverage for moderate α levels in higher dimensions. The null hypothesis of the unconditional coverage test for the Gaussian model of Bauer and Vorkink (2011) is rejected at all VaR levels.

Conclusions
The concept of the realized hierarchical Archimedean copula has been introduced. This model allows combining the flexibility of copula models with the additional information contained in high-frequency data. It has been suggested to combine the estimation procedures described in Segers and Uyttendaele (2014) and Górecki et al. (2016a) and adapt them to high-frequency data. This estimator is of particular importance in short-term financial risk management, as the structure and the parameters of the copula are estimated daily based solely on the realized correlation matrix.
Based on the simulation results, it has been concluded that the linear correlation matrix based estimator performs well for large enough samples; it is unbiased but less efficient than the full maximum likelihood estimator. However, it is less computationally intensive than the benchmark models and does not suffer from the curse of dimensionality.
In the empirical part of the study, the proposed estimator has been applied to predict the VaR based on high-frequency data for two portfolios, one of 6 and the other of 17 assets. The results have been compared to benchmark approaches including dynamic copulas and realized covariance models. Based on three tests (Kupiec, Christoffersen, DQ), it has been concluded that the VaR regions obtained by the high-dimensional realized copula models outperform the benchmark models in higher dimensions, especially for lower VaR levels.

Acknowledgments: We thank Professor Francesco Audrino and Professor Matthias Fengler, who gave us much valuable advice in the early stages of this work.
Author Contributions: Both authors contributed equally to the paper.

Conflicts of Interest:
The authors declare no conflicts of interest.

Table A1. Archimedean copulae: Gumbel, Clayton and Frank.

Copula | Generator φ_θ(t) | Bivariate distribution C(u_1, u_2; θ) | Parameter
Gumbel | (−ln t)^θ | exp[−{(−ln u_1)^θ + (−ln u_2)^θ}^{1/θ}] | θ ∈ [1, ∞)
Clayton | θ^{−1}(t^{−θ} − 1) | (u_1^{−θ} + u_2^{−θ} − 1)^{−1/θ} | θ ∈ (0, ∞)
Frank | −ln{(e^{−θt} − 1)/(e^{−θ} − 1)} | −θ^{−1} ln[1 + (e^{−θu_1} − 1)(e^{−θu_2} − 1)/(e^{−θ} − 1)] | θ ∈ ℝ\{0}

Appendix B. Realized Covariance and Realized Kernel Estimator
Assume that the d-dimensional log-price process X follows a Brownian semimartingale X_t = ∫_0^t σ_u dW_u, where [t − 1; t] is a period corresponding to one trading day, σ_t is a càdlàg volatility matrix process and W_t is a d-dimensional vector of independent Brownian motions. It is important to note that the price process is superimposed by the market microstructure noise U_{τ_i}, i.e., one observes P_{τ_i} = X_{τ_i} + U_{τ_i}. The realized covariance over the time interval [t − 1; t] is defined as the sample analog of the quadratic variation of X given by [X]_{t,t−1} = ∫_{t−1}^{t} Σ_u du with Σ = σσ^T and is denoted by R_t in Section 2. One of the estimators which reduces the effect of microstructure noise is the realized kernel estimator proposed by Barndorff-Nielsen et al. (2008). As the realized covariances are obtained by summing all the cross products of log-returns that have a non-zero overlap of their respective time spans, the data should be synchronized first. The procedure called refresh time sampling, described in Hautsch (2011), is applied to synchronize the data. The first refresh time is defined as τ*_1 = max{τ_{1,1}, . . . , τ_{d,1}} and τ*_{i+1} = min{τ_{j,k_j} | τ_{j,k_j} > τ*_i, k_j = 1, . . . , N_j; j = 1, . . . , d}, where N_j is the number of price observations for asset j. As a result, a new high-frequency vector of returns p_i = P_{τ*_i} − P_{τ*_{i−1}} is produced, where i = 1, . . . , n, and n is the number of synchronized observations. The multivariate realized kernel estimator is given by

K(P) = ∑_{h=−H}^{H} k(h/(H + 1)) Γ_h,

where Γ_h is the autocovariance matrix defined as Γ_h = ∑_{j=|h|+1}^{n} p_j p_{j−h}^T for h ≥ 0 and Γ_h = Γ_{−h}^T for h < 0, and k(y) is the Parzen kernel

k(y) = 1 − 6y^2 + 6y^3 for 0 ≤ y ≤ 1/2, k(y) = 2(1 − y)^3 for 1/2 ≤ y ≤ 1, k(y) = 0 for y > 1.
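The estimator can be sketched directly from these formulas. The bandwidth choice and refresh-time synchronization are omitted; the input is assumed to be already synchronized return vectors, one row per refresh time.

```python
import numpy as np

def parzen(y):
    """Parzen kernel weight k(y)."""
    y = abs(y)
    if y <= 0.5:
        return 1.0 - 6.0 * y ** 2 + 6.0 * y ** 3
    if y <= 1.0:
        return 2.0 * (1.0 - y) ** 3
    return 0.0

def gamma_h(p, h):
    """Autocovariance matrix Gamma_h of the synchronized return vectors p
    (one row per refresh time); Gamma_{-h} is the transpose of Gamma_h."""
    if h < 0:
        return gamma_h(p, -h).T
    n = p.shape[0]
    return sum(np.outer(p[j], p[j - h]) for j in range(h, n))

def realized_kernel(p, H):
    """Multivariate realized kernel K(P) = sum_{h=-H}^{H} k(h/(H+1)) Gamma_h."""
    d = p.shape[1]
    K = np.zeros((d, d))
    for h in range(-H, H + 1):
        K += parzen(h / (H + 1.0)) * gamma_h(p, h)
    return K
```

With H = 0 the weight k(0) = 1 and the estimator collapses to the plain realized covariance Γ_0, which provides a quick sanity check.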
The multivariate bandwidth parameter H is selected according to Barndorff-Nielsen et al. (2008).
In the benchmark realized covariance model of Bauer and Vorkink (2011), the forecast of the log-covariance matrix is transformed back by taking the matrix exponential, i.e., R_{t+1} = expm(A_{t+1}). The predicted realized covariance matrix is used as the input for a multivariate Gaussian distribution. Another realized volatility model, which uses the Cholesky decomposition instead of the logarithmic transformation, is addressed in Chiriac and Voev (2011). As it performs similarly to that of Bauer and Vorkink (2011), we do not use it in the empirical part of the study.