A New Kernel Estimator of Copulas Based on Beta Quantile Transformations

Abstract: A copula is a multivariate cumulative distribution function with Uniform(0, 1) marginal distributions. For this reason, a classical kernel estimator does not work and needs to be corrected at the boundaries, which increases the difficulty of the estimation and, in practice, the boundary bias correction might not provide the desired improvement. A quantile transformation of the marginals is a way to improve the classical kernel approach. This paper shows a Beta quantile transformation to be optimal and analyses a kernel estimator based on this transformation. Furthermore, the basic properties that allow the new estimator to be used for inference on extreme value copulas are tested. The results of a simulation study show how the new nonparametric estimator improves on alternative kernel estimators of copulas. We illustrate our proposal with a financial risk data analysis.


Introduction
Based on the kernel method and transformations, we present a new nonparametric estimator of a multivariate copula that improves the empirical copula and the most prominent kernel estimators (see reference [1], for a detailed review). We use the new estimator to analyse and test the extreme value dependence between the losses in the Spanish stock market index and different stock market indexes of Europe, USA and China.
The copula model allows us to represent the dependence structure of a multivariate random vector of continuous variables $X = (X_1, \dots, X_J)$, which combines with the marginal distributions to give the multivariate distribution. This idea was established in the fundamental theorem proposed by Sklar [2]. This theorem shows that a multivariate cumulative distribution function (cdf) $H$ of the random vector $X$, with marginal distribution functions $F_1, \dots, F_J$, has an associated copula $C$, so that:
$$H(x_1, \dots, x_J) = C\left(F_1(x_1), \dots, F_J(x_J)\right). \tag{1}$$
In practice, the dependence structure and the marginal distributions are unknown and both need to be fitted. We assume that the marginal cdfs can be easily adjusted using parametric distributions or nonparametric methods and we focus on fitting the dependence structure using a copula. It is often difficult to select the appropriate dependence structure and, therefore, the right copula model by visualizing the data. Alternatively, a nonparametric estimation of a copula can be obtained, whose results can be used for estimating joint probabilities or for testing the adequacy of a copula family, for example, the extreme value copula family. In this paper, these two aims of our new nonparametric estimator are analysed through a simulation study.
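As a small illustration of Sklar's theorem, the following Python sketch builds a bivariate cdf from a copula and two marginal cdfs. The Gumbel copula, the exponential marginals and all function names are assumptions chosen for the example, not choices made in this paper.

```python
import math


def gumbel_copula(u1, u2, theta=2.0):
    """Gumbel copula C(u1, u2); theta >= 1 and theta = 1 gives independence."""
    if u1 <= 0.0 or u2 <= 0.0:
        return 0.0
    s = (-math.log(u1)) ** theta + (-math.log(u2)) ** theta
    return math.exp(-s ** (1.0 / theta))


def exponential_cdf(x, rate=1.0):
    """Illustrative marginal cdf (an assumption made for this example)."""
    return 1.0 - math.exp(-rate * x) if x > 0.0 else 0.0


def joint_cdf(x1, x2, copula, cdf1, cdf2):
    """Sklar's representation: H(x1, x2) = C(F1(x1), F2(x2))."""
    return copula(cdf1(x1), cdf2(x2))
```

With `theta = 1` the joint cdf reduces to the product of the marginals, and letting one argument grow recovers the other marginal, which is a quick sanity check of the construction.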
Because there are many dependence structures represented by different copula families, specific tests for choosing the best copula are useful. The approach for developing a test for the adequacy of copulas takes its lead from, for example, the proposal of Genest and Rivest [3] for bivariate Archimedean copulas; the test of Scaillet [4] on inference for the positive quadrant dependence hypothesis; the test for equality between two copulas of [5]; or the test of symmetry for bivariate copulas of Genest et al. [6].
On inference for extreme value copulas, alternative types of tests have been proposed, among which the most well known are the test of Genest et al. [7] based on a Cramér-von Mises statistic, the test analysed by Ghorbal et al. [8] based on a U-statistic and the test of Kojadinovic et al. [9] that uses the max-stable property and is also based on a Cramér-von Mises statistic (see also [10] for complete properties of the test based on the max-stable property).
The test proposed by Kojadinovic et al. [9] is based on the empirical copula that is equivalent to the multivariate empirical distribution. However, the empirical copula is inefficient for certain shapes of distribution, for example, when the marginal cdfs are associated with extreme value distributions. Alternatively, Omelka et al. [1] analyse how testing extreme value copula can be based on different kernel estimators. The main difficulty of a classical kernel estimator is its bias on the boundaries when the function values at these points are positive. Based on this concern, Chen and Huang [11] analyse the kernel copula estimator with local linear boundary correction which the authors proved reduces bias and variance. Alternatively, Omelka et al. [1] propose the transformation of a kernel copula estimator based on standard normal inverse distribution function transformations, which is very easy to implement and has the same weak convergence properties as the previous proposal. In this paper, an improved transformed kernel estimator is proposed that has the same weak convergence properties and is useful for the inference on extreme value copulas. The theoretical results are shown for the bivariate case, but they are easily extrapolated to the multivariate case.
In Section 2, we present the background on kernel estimation of copulas, the new estimator and its theoretical asymptotic properties for testing the max-stable property of extreme value copulas. Section 3 presents the simulation results that allow us to analyse finite-sample properties and inference errors type 1 and 2. As an illustration, in Section 4, a financial risk analysis is carried out where the extreme value copula family hypothesis between the Spanish stock market index and different neighbouring and non-neighbouring countries is tested. Finally, we conclude in Section 5.

Kernel Estimation of Copulas
Let $(X_{i1}, X_{i2})$, $i = 1, \dots, n$, be a sample of $n$ independent and identically distributed (i.i.d.) bivariate data. The product kernel estimator of the bivariate cdf can be expressed as:
$$\hat H(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{x_1 - X_{i1}}{b_1}\right) K\left(\frac{x_2 - X_{i2}}{b_2}\right), \tag{2}$$
where $K$ is the cdf associated with the kernel function $k$, which is a bounded or asymptotically bounded and symmetric probability density function (pdf) (see [12,13] for a review on kernel estimation of the multivariate distribution function). Examples of such functions are the Epanechnikov and the Gaussian kernels (see [14]). The parameters $b_1 > 0$ and $b_2 > 0$, known as the bandwidths or smoothing parameters, control the smoothness of the estimation: the larger the values of $b_1$ and $b_2$, the smoother the resulting function. Their values depend on the sample size $n$ (the larger the sample size, the smaller the smoothing parameters), but obtaining optimal values for these smoothing parameters is one of the greatest difficulties posed by kernel estimation. Based on Sklar's theorem, from (2) we specify the kernel estimator of the copula as:
$$\hat C\left(\hat F_1(x_1), \hat F_2(x_2)\right) = \hat H(x_1, x_2), \tag{3}$$
where $\hat F_j(x_j)$, $j = 1, 2$, are estimators of the marginal cdfs that, in practice, can be obtained based on a parametric distribution or with a nonparametric estimator. Given that copulas allow us to separate the dependence structure from the marginal distributions, we focus on estimating the former; so, the aim is to estimate a multivariate cdf with Uniform(0, 1) marginal distributions, whose kernel estimator for the bivariate case is expressed as:
$$\hat C(u_1, u_2) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{u_1 - U_{i1}}{b}\right) K\left(\frac{u_2 - U_{i2}}{b}\right), \tag{4}$$
where, unlike (2), given that the marginal distributions are Uniform(0, 1), we assume $b_1 = b_2 = b$ and $b \to 0$ as $n \to \infty$; to reflect the relationship between $b$ and $n$, hereinafter we denote it by $b_n$. In practice, we need to define the observations $(U_{i1}, U_{i2})$, $i = 1, \dots, n$; the values of the marginal empirical distributions $U_{ij} = \frac{1}{n} \sum_{k=1}^{n} I(X_{kj} \le X_{ij})$, $j = 1, 2$ and $i = 1, \dots, n$, are a natural choice.
However, the empirical distribution takes the value 1 at the maximum observed value, and most of the commonly used copulas (Gumbel, Clayton, Gaussian and Student's t) do not have finite derivatives (copula density values) at the corners of the unit square. These empirical distributions are therefore replaced by corrected versions, known as pseudo-data, which can be defined as $\hat U_{ij} = \frac{1}{n+1} \sum_{k=1}^{n} I(X_{kj} \le X_{ij})$ or, as Chen and Huang [11] suggested, $\hat U_{ij} = \frac{1}{n} \sum_{k=1}^{n} I(X_{kj} \le X_{ij}) - \frac{1}{2n}$. So, the kernel estimator of a copula is defined as:
$$\hat C(u_1, u_2) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{u_1 - \hat U_{i1}}{b_n}\right) K\left(\frac{u_2 - \hat U_{i2}}{b_n}\right). \tag{5}$$
To obtain the estimator defined in (5), a kernel function $K$ needs to be selected, a choice that has minimal effect on the results, and the bandwidth $b_n$ needs to be calculated, whose value has an important effect on the estimated copula. The bandwidth $b_n$ can be calculated using a cross-validation or plug-in method, or using the rule-of-thumb proposed by Silverman [14] for the kernel estimator of a pdf adapted to the kernel estimator of a cdf (see [12,15]).
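The pseudo-data and the classical kernel estimator of the copula described above can be sketched in a few lines of Python. This is a minimal illustration assuming the Epanechnikov kernel, no ties in the data and a bandwidth supplied by the user; the function names are hypothetical.

```python
def epanechnikov_cdf(t):
    """Integrated Epanechnikov kernel K(t), supported on [-1, 1]."""
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 0.5 + 0.75 * t - 0.25 * t ** 3


def pseudo_observations(x):
    """Rescaled ranks: (number of x_k <= x_i) / (n + 1); assumes no ties."""
    n = len(x)
    return [sum(1 for xk in x if xk <= xi) / (n + 1) for xi in x]


def kernel_copula(u1, u2, pu1, pu2, b):
    """Classical product kernel estimator of C(u1, u2) from pseudo-data."""
    n = len(pu1)
    total = 0.0
    for a, c in zip(pu1, pu2):
        total += epanechnikov_cdf((u1 - a) / b) * epanechnikov_cdf((u2 - c) / b)
    return total / n
```

Evaluating `kernel_copula(0.0, 0.5, ...)` on a small sample typically returns a strictly positive value although any copula satisfies C(0, u2) = 0, which is a simple way to see the boundary problem of the untransformed estimator.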
The properties of a kernel estimator depend on some smoothness characteristics of the cdf; in our context in particular, the first two derivatives are required to take finite values different from zero. Furthermore, when the distribution has a bounded domain and the density at the boundary takes positive values, as in the case of the bivariate copula with domain $[0, 1]^2$, the estimator defined in (5) has boundary bias. This means that the kernel estimator is not consistent at the boundary (see [16] pp. 46-47, for a clear description in the kernel density estimator context). This is problematic since our aim is to test whether our data are generated by an extreme value copula. There are three alternative proposals to achieve consistency at the boundary of a kernel estimator of a copula. Boundary kernel methods are the most common techniques proposed in the context of kernel regression and density estimation (see [17,18]); the main difficulty with this type of kernel is that it does not integrate to one, which, in practice, could be inconvenient. Chen and Huang [11] proposed a kernel estimator of copulas with linear boundary correction; the weakness of their method is that for many common families of copulas (e.g., Clayton, Gumbel, Gaussian and Student's t) the bias at some of the corners of the unit square is only of order $O(b_n)$, versus the $O(b_n^2)$ that is reached in the central values of the domain, where $O(\cdot)$ is the asymptotic order operator. Another way to correct boundary bias is the mirror-reflection kernel estimator, proposed by Gijbels and Mielniczuk [19] to estimate the density of the copula. In all cases, the main difficulty of a kernel estimator, with or without boundary bias correction, is calculating the smoothing parameter, whose value greatly affects the results.
An alternative strategy to avoid boundary bias and to calculate the smoothing parameter easily is to transform the Uniform(0, 1) marginal distributions of the copula so that the kernel estimator of the new marginal distributions does not have boundary bias and their shapes allow us to minimise the bias of the kernel estimator. This idea also addresses the problems of the estimator defined in (3) based on the original scale of the data. On the one hand, although the marginal distributions are not uniform, they can have shapes that are also subject to inconsistency at the boundaries, i.e., the distribution could have a bounded domain on one or both sides with positive density. On the other hand, the problems associated with the kernel estimator defined in (3) are widely known when the distribution to be estimated has one or two long tails (see [20-22]).
The transformed estimator of the copula is based on the equality:
$$C(u_1, u_2) = C_T\left(T(u_1), T(u_2)\right),$$
i.e., the values of the copula function $C$ evaluated on the original Uniform(0, 1) scale are equal to the values of the function $C_T$ evaluated on the transformed scale. So, the transformed kernel estimator (TKE) of a copula is defined as:
$$\hat C_T(u_1, u_2) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{T(u_1) - T(\hat U_{i1})}{b_n}\right) K\left(\frac{T(u_2) - T(\hat U_{i2})}{b_n}\right), \tag{6}$$
where $T(\cdot)$ is a transformation equal to the inverse of a given continuous cdf. The estimator defined in (6) has a fundamental advantage over the kernel estimator defined in (5) and its versions that incorporate boundary bias reduction: given that the function $T(\cdot)$ is the inverse of a given cdf, we know the marginal distributions of $C_T$ and the bandwidth can be calculated based on these distributions. Omelka et al. [1] proposed $T = \Phi^{-1}$, where $\Phi$ is the cdf of the standard normal distribution. This standard normal transformation is based on the idea that the normal distribution does not have boundary bias problems and can be estimated easily using a classical kernel estimator. This transformed estimator is called the Gaussian transformed kernel estimator and is defined as:
$$\hat C_G(u_1, u_2) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{\Phi^{-1}(u_1) - \Phi^{-1}(\hat U_{i1})}{b_n}\right) K\left(\frac{\Phi^{-1}(u_2) - \Phi^{-1}(\hat U_{i2})}{b_n}\right). \tag{7}$$
In practice, in this case the bandwidth can be calculated using the rule-of-thumb idea of Silverman [14] applied to the standard normal marginal cdfs, that is, $b_n = 3.572\, n^{-1/3}$. In the simulation study presented in Section 3, we show the difference between the mean integrated squared error of a copula estimator, $\mathrm{MISE} = E\left[\int_0^1 \int_0^1 \left(\hat C(u_1, u_2) - C(u_1, u_2)\right)^2 du_1\, du_2\right]$, using the optimal $b_n$ and using the proposed rule-of-thumb. We propose an alternative estimator to the one defined in (7) using a transformation $T$ that is better than $\Phi^{-1}$. Our proposal is based on the second-order approximation properties of the univariate kernel estimator of the marginal distributions.
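The Gaussian transformed kernel estimator can be sketched as follows. This is an illustrative implementation under stated assumptions: the Epanechnikov kernel, pseudo-data strictly inside (0, 1), and the rule-of-thumb bandwidth 3.572 n^(-1/3) quoted in the text; `transformed_kernel_copula` and `gaussian_tke` are hypothetical names.

```python
from statistics import NormalDist


def epanechnikov_cdf(t):
    """Integrated Epanechnikov kernel K(t), supported on [-1, 1]."""
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 0.5 + 0.75 * t - 0.25 * t ** 3


def transformed_kernel_copula(u1, u2, pseudo, transform, b):
    """Kernel estimator evaluated on the scale given by `transform`.

    `pseudo` is a list of (u_i1, u_i2) pairs strictly inside (0, 1).
    """
    t1, t2 = transform(u1), transform(u2)
    total = 0.0
    for a, c in pseudo:
        total += (epanechnikov_cdf((t1 - transform(a)) / b)
                  * epanechnikov_cdf((t2 - transform(c)) / b))
    return total / len(pseudo)


def gaussian_tke(u1, u2, pseudo):
    """Gaussian transformation with the rule-of-thumb bandwidth from the text."""
    b = 3.572 * len(pseudo) ** (-1.0 / 3.0)
    # NormalDist().inv_cdf requires arguments strictly between 0 and 1.
    return transformed_kernel_copula(u1, u2, pseudo, NormalDist().inv_cdf, b)
```

Passing the transformation as an argument keeps the sketch reusable: any increasing quantile function can replace `NormalDist().inv_cdf` without touching the estimator itself.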
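When $b_n \to 0$ as $n \to \infty$, $f$ is a continuous pdf and the first derivative $f'$ exists, the bias and variance of the kernel estimator $\hat F$ of a cdf $F$ are (see [15,23,24]):
$$E\left[\hat F(y)\right] - F(y) = \frac{b_n^2}{2} f'(y) \int t^2 k(t)\, dt + o\left(b_n^2\right) \tag{8}$$
and
$$V\left[\hat F(y)\right] = \frac{F(y)\left[1 - F(y)\right]}{n} - \frac{b_n}{n} f(y) \int K(t)\left[1 - K(t)\right] dt + o\left(\frac{b_n}{n}\right). \tag{9}$$
By addition of the integrated variance and the integrated squared bias, we can approximate the MISE of the kernel estimation of the marginal distributions as:
$$\mathrm{MISE}\left(\hat F\right) \approx \frac{1}{n} \int F(y)\left[1 - F(y)\right] dy - \frac{b_n}{n} \int K(t)\left[1 - K(t)\right] dt + \frac{b_n^4}{4} \left(\int t^2 k(t)\, dt\right)^2 \int \left[f'(y)\right]^2 dy, \tag{10}$$
where the integral limits are given by the domain of the argument variable $Y$. From expression (10) it is easy to deduce that the distribution that minimises the MISE also minimises the functional $\int [f'(y)]^2\, dy = \int [F''(y)]^2\, dy$. Terrell [25] found the pdf family that minimises functionals of the type $\int [f^{(p)}(y)]^2\, dy$, where $p$ is the order of the derivative. This principle was applied to cdf and quantile kernel estimation by Alemany et al. [21], who showed how the Beta(3, 3) distribution, whose pdf and cdf are:
$$m(x) = \frac{15}{16}\left(1 - x^2\right)^2, \qquad M(x) = \frac{3}{16} x^5 - \frac{5}{8} x^3 + \frac{15}{16} x + \frac{1}{2}, \qquad -1 \le x \le 1, \tag{11}$$
minimises the functional $\int [F_j''(t_j)]^2\, dt_j$, $j = 1, 2$, and therefore minimises the integrated bias of the classical kernel estimator of a cdf. Section 2 includes the theoretical results on testing extreme value copulas, and Theorem 1 shows that the cdf $M$ has the properties that allow us to conclude that the kernel estimator of the Beta(3, 3) does not have boundary bias (see [1]). So, the Beta transformed kernel estimator of a copula is:
$$\hat C_B(u_1, u_2) = \frac{1}{n} \sum_{i=1}^{n} K\left(\frac{M^{-1}(u_1) - M^{-1}(\hat U_{i1})}{b_n}\right) K\left(\frac{M^{-1}(u_2) - M^{-1}(\hat U_{i2})}{b_n}\right), \tag{12}$$
where $b_n$ can be calculated using the rule-of-thumb applied to the Beta(3, 3) distribution, that is, $b_n = 3^{1/3} n^{-1/3}$. In the simulation study shown in Section 3, the MISE calculated with this bandwidth is compared to the one obtained with the bandwidth that minimises the MISE. Next, we present some theoretical results related to the weak convergence to a Gaussian process $G$ of the estimator defined in (12) and the max-stable property for testing extreme value copulas.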
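The rule-of-thumb constants can be checked numerically. The sketch below assumes the Epanechnikov kernel, the Beta(3, 3) distribution rescaled to [-1, 1], and the usual asymptotically optimal bandwidth for kernel cdf estimation, b = [psi / (mu2^2 R)]^(1/3) n^(-1/3), where psi is the integral of K(1 - K), mu2 the second moment of the kernel and R the integral of the squared derivative of the reference density. Under these assumptions the constants 3.572 (standard normal) and 3^(1/3) (Beta(3, 3)) quoted in this paper are recovered. The bisection-based quantile function and all names are illustrative.

```python
import math


def beta33_cdf(x):
    """cdf M(x) of the Beta(3, 3) distribution rescaled to [-1, 1]."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 3.0 / 16.0 * x ** 5 - 5.0 / 8.0 * x ** 3 + 15.0 / 16.0 * x + 0.5


def beta33_quantile(u, tol=1e-12):
    """M^{-1}(u) by bisection; M is strictly increasing on [-1, 1]."""
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta33_cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rule_of_thumb_constant(roughness):
    """Constant c in b_n = c * n^(-1/3) for the Epanechnikov kernel."""
    psi = 9.0 / 35.0  # integral of K(t)(1 - K(t)) over [-1, 1]
    mu2 = 1.0 / 5.0   # second moment of the Epanechnikov kernel
    return (psi / (mu2 ** 2 * roughness)) ** (1.0 / 3.0)


# Roughness of the reference densities: integral of the squared derivative.
normal_constant = rule_of_thumb_constant(1.0 / (4.0 * math.sqrt(math.pi)))
beta_constant = rule_of_thumb_constant(15.0 / 7.0)
```

Using `beta33_quantile` as the transformation in a transformed kernel estimator, together with the bandwidth `beta_constant * n ** (-1/3)`, gives one possible implementation of the Beta transformed kernel estimator.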

Theoretical Results
We use the result from Fermanian et al. [26] for the weak convergence of the kernel estimator of a copula defined in (5) to a Gaussian process $G$ in the space of all bounded real-valued functions on $[0, 1]^2$, i.e., $\ell^\infty([0, 1]^2)$, which is expressed as follows:
$$\sqrt{n}\left(\hat C(u_1, u_2) - C(u_1, u_2)\right) \rightsquigarrow G(u_1, u_2) = B(u_1, u_2) - \partial_1 C(u_1, u_2)\, B(u_1, 1) - \partial_2 C(u_1, u_2)\, B(1, u_2), \tag{13}$$
where $\partial_j C(u_1, u_2)$, $j = 1, 2$, are the partial derivatives of the function $C$ with respect to $u_j$, $\rightsquigarrow$ indicates weak convergence and $B$ is a Brownian bridge on $[0, 1]^2$ with covariance function:
$$E\left[B(u_1, u_2)\, B(u_1', u_2')\right] = C(u_1 \wedge u_1', u_2 \wedge u_2') - C(u_1, u_2)\, C(u_1', u_2'),$$
where $\wedge$ is the minimum. The weak convergence defined in (13) requires that the copula has continuous partial derivatives. Furthermore, Omelka et al. [1] proved the weak convergence of the local linear, mirror-reflection and Gaussian transformed kernel estimators of a copula. These authors remark that it is sufficient to assume that the first partial derivatives are continuous on $(0, 1)^2$, i.e., we can exclude the corners. This is an important result, given that most of the commonly used copulas (Clayton, Gumbel, Normal and Student's t) do not have finite partial derivatives at the corners.
The weak convergence of our Beta transformed kernel estimator is defined in the following theorem.
Theorem 1. Let us suppose a continuous copula $C$, with continuous first order partial derivatives and bounded second order partial derivatives on $(0, 1)^2$ that satisfies the following asymptotic properties: the Beta transformed kernel estimator $\hat C_B$ meets the weak convergence defined in (13).
Proof of Theorem 1. Let $J(t) = T^{-1}(t)$ be the inverse transformation function in $C_T$. The proof of Theorem 1 comes directly from the results of Theorem 2 in Omelka et al. [1], who proved that, if the first derivative $J'(t)$ and $[J'(t)]^2 / J(t)$ are bounded, then $\hat C_T$ converges weakly to the Gaussian process $G$.
The weak convergence of Theorem 1 allows us to use $\hat C_B$ for inference on copulas. We focus on an extreme value copula test based on the proposal of Kojadinovic et al. [9], which analyses the max-stable property associated with this family of copulas (see, for example, [27]). A copula is max-stable if, for all $r > 0$ and all $u_1, u_2 \in [0, 1]$, $C(u_1, u_2) = C^r(u_1^{1/r}, u_2^{1/r})$. Accordingly, for a given $r$, the null hypothesis $H_0^r : C(u_1, u_2) = C^r(u_1^{1/r}, u_2^{1/r})$ is tested against the alternative $H_1^r : C(u_1, u_2) \ne C^r(u_1^{1/r}, u_2^{1/r})$. In practice, we test the max-stable hypothesis using some values of $r \ge 1$ (see [9]): $H_0 : \bigcap_{r \ge 1} H_0^r$ versus $H_1 : \bigcup_{r \ge 1} H_1^r$. To test the previous hypotheses we propose estimating $D_r(u_1, u_2) = \sqrt{n}\left[C(u_1, u_2) - C^r(u_1^{1/r}, u_2^{1/r})\right]$ using the Beta transformed kernel estimator of the copula, i.e., $\hat D_r(u_1, u_2) = \sqrt{n}\left[\hat C_B(u_1, u_2) - \hat C_B^r(u_1^{1/r}, u_2^{1/r})\right]$.
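The max-stable property can be illustrated with closed-form copulas. In the sketch below, the Gumbel copula (an extreme value copula) satisfies the property up to rounding error, whereas the Clayton copula does not. The parameter values and function names are assumptions made for the example; in the test described above the true copula is replaced by its estimator and the gap is scaled by the square root of n.

```python
import math


def gumbel_copula(u1, u2, theta):
    """Gumbel copula for u1, u2 in (0, 1] and theta >= 1."""
    s = (-math.log(u1)) ** theta + (-math.log(u2)) ** theta
    return math.exp(-s ** (1.0 / theta))


def clayton_copula(u1, u2, theta):
    """Clayton copula for u1, u2 in (0, 1] and theta > 0."""
    return (u1 ** (-theta) + u2 ** (-theta) - 1.0) ** (-1.0 / theta)


def max_stable_gap(copula, u1, u2, r):
    """C(u1, u2) - C(u1^(1/r), u2^(1/r))^r, which is zero under max-stability."""
    return copula(u1, u2) - copula(u1 ** (1.0 / r), u2 ** (1.0 / r)) ** r


gumbel_gap = max_stable_gap(lambda a, b: gumbel_copula(a, b, 2.0), 0.5, 0.7, 3.0)
clayton_gap = max_stable_gap(lambda a, b: clayton_copula(a, b, 2.0), 0.5, 0.5, 2.0)
```

Here `gumbel_gap` is zero up to floating-point error, while `clayton_gap` is clearly different from zero.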
Proof of Proposition 1. The result in Proposition 1 is obtained from: Using the convergence of Theorem 1: We now need to prove the weak convergence of $\sqrt{n}\left[\hat C_B^r(u_1^{1/r}, u_2^{1/r}) - C^r(u_1^{1/r}, u_2^{1/r})\right]$. To this end, we use the result of Kojadinovic et al. [9], who proved the weak convergence of this difference for the empirical copula (see also [10]). In general, this result can be directly extrapolated to the kernel estimator and, in particular, to the Beta transformed kernel estimator, considering that $C^r(u_1^{1/r}, u_2^{1/r}) = C_B^r\left(M^{-1}(u_1^{1/r}), M^{-1}(u_2^{1/r})\right)$. Then, under $H_0$, $D_r(u_1, u_2) = 0$ and $\hat D_r(u_1, u_2)$ converges weakly to the process (14).
For hypothesis testing given a fixed $r$, we use a Cramér-von Mises statistic: and for a range of values $r_1, \dots, r_t$, the following statistic can be considered: For implementing the test based on $S_{r_1, \dots, r_t}$, we use the numerical approximation proposed by Kojadinovic et al. [9], replacing the empirical copula by the Beta transformed kernel estimator of the copula. The procedure is as follows: 1.

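2. $R$ independent copies of $\hat D_{r_l}$, denoted $\hat D_{r_l,(1)}, \dots, \hat D_{r_l,(R)}$, are generated, such that
$$\left(\hat D_{r_l}, \hat D_{r_l,(1)}, \dots, \hat D_{r_l,(R)}\right) \rightsquigarrow \left(D_{r_l}, D_{r_l,(1)}, \dots, D_{r_l,(R)}\right),$$
where $D_{r_l,(1)}, \dots, D_{r_l,(R)}$ are independent copies of $D_{r_l}$. The process of obtaining these independent copies of $D_{r_l}$ is described in Appendix A.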

3. The copies of $S_{r_l}$ are calculated as $S_{r_l,(k)} = \frac{1}{m} \sum_{j=1}^{m} \left[\hat D_{r_l,(k)}(u_{j1}, u_{j2})\right]^2$, and the p-value of the statistic is obtained as $\frac{1}{R} \sum_{s=1}^{R} I\left(S_{r_l,(s)} \ge S_{r_l}\right)$.
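The last step of the procedure can be sketched as follows, assuming the Cramér-von Mises form of the statistic (the squared process averaged over the m grid points) and that the replicated processes have already been generated as described in Appendix A. The function names and the input layout are hypothetical.

```python
def cvm_statistic(d_values):
    """Grid approximation of the statistic: mean of the squared process values."""
    return sum(d * d for d in d_values) / len(d_values)


def approximate_p_value(statistic, replicated_statistics):
    """Share of replicated statistics at least as large as the observed one."""
    count = sum(1 for s in replicated_statistics if s >= statistic)
    return count / len(replicated_statistics)
```

For example, `approximate_p_value(cvm_statistic(d_hat), [cvm_statistic(d) for d in copies])` would return the p-value for one value of r, where `d_hat` and `copies` hold the process evaluated on the grid.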

Simulation Study
We summarise the results of our simulation study, which aims to evaluate the finite sample properties of our Beta transformed kernel estimator in (12) and to compare it with the empirical copula, with the classical kernel estimator in (5) and with the Gaussian transformed kernel estimator in (7). We also obtained some results using a boundary kernel, but the computational times are longer and, in our simulation study, the results are no better than those obtained with the classical kernel.
We show two types of results; in the former, the errors between the estimations and true copulas are compared and, in the latter, the differences between the extreme value copula tests obtained with the empirical copula and with the Beta transformed kernel estimator are analysed.

Analysing the Errors of Kernel Estimators
To carry out the study, we simulate 500 samples of size n = 50 and n = 500 from the copula families and parameters indicated in the tables with the simulation results shown in this section. The alternative estimators are compared by approximating $\mathrm{MISE} = E\left[\int_0^1 \int_0^1 \left(\hat C(u_1, u_2) - C(u_1, u_2)\right)^2 du_1\, du_2\right]$ using a grid of 99 × 99 uniformly spaced points on $(0, 1)^2$. The Epanechnikov kernel is used in all cases. Furthermore, the bandwidth $b_n$ needs to be calculated, and its value has an important impact on the results. Sometimes, the calculation of $b_n$ requires long optimization processes based on leave-one-out estimators (see [16] for a review on kernel density estimation). Alternatively, a rule-of-thumb similar to the proposal of Silverman [14] can be used; however, to calculate the rule-of-thumb smoothing parameter we would need to use a parametric copula with a given parameter. A direct alternative, based on the product kernel estimator, consists of using the rule-of-thumb based on independent marginal reference distributions. The difficulty with Uniform(0, 1) marginals is that $\int [F_j''(t)]^2\, dt = 0$, $j = 1, 2$, so the rule-of-thumb smoothing parameter based on the proposal of Silverman [14] cannot be calculated. In this case, using the standard normal bandwidth is an easy solution (see [20] for an example on kernel density estimation of Uniform(0, 1) transformed data). In our simulation study, two types of results are shown: on the one hand, those obtained with the rule-of-thumb bandwidth, based on the standard normal distribution for the classical kernel and Gaussian transformed kernel estimators and on the Beta(3, 3) for the Beta transformed kernel estimator; on the other hand, those obtained by optimizing the approximate MISE over a grid of values for $b_n$.
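The grid approximation of the integrated squared error for a single sample can be sketched as below; averaging this quantity over the simulated samples approximates the MISE. This is an illustrative helper, assuming the error is averaged over the 99 × 99 interior grid points; `approximate_ise` and its arguments are hypothetical names.

```python
def approximate_ise(estimate, truth, points=99):
    """Mean squared difference between two copulas on a points x points grid.

    The grid is uniformly spaced on the open unit square:
    u = 1/(points + 1), ..., points/(points + 1) in each coordinate.
    """
    total = 0.0
    for i in range(1, points + 1):
        for j in range(1, points + 1):
            u1 = i / (points + 1)
            u2 = j / (points + 1)
            diff = estimate(u1, u2) - truth(u1, u2)
            total += diff * diff
    return total / (points * points)


# Example: squared distance between the comonotone and independence copulas.
example = approximate_ise(lambda a, b: min(a, b), lambda a, b: a * b)
```

In a simulation, `estimate` would be a fitted kernel estimator and `truth` the copula used to generate the sample.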
To facilitate the interpretation of the results, we calculate the quotient between the MISE obtained for each kernel estimator and the one obtained using the empirical copula. The reference values of the MISE for the empirical copula are shown in Table A1 in Appendix B. Tables 1 and 2 contain, respectively, the quotients for the analysed Archimedean and elliptical copulas. These results show how, using an adequate smoothing parameter, the analysed kernel estimators improve on the empirical copula. Focusing on the Archimedean copulas in Table 1, it can be seen that, if the optimal smoothing parameter is used, in all cases the best results are obtained with the Beta transformed kernel estimator ($\hat C_B$). With the optimal bandwidth, the Gaussian transformed kernel estimator ($\hat C_G$) only slightly improves on the classical kernel estimator ($\hat C$) for the Gumbel copula with dependence parameter equal to 3 and 4, i.e., for the copulas with the strongest extreme value dependence. In Table 2, for the elliptical copulas, the results with the optimal smoothing parameter are similar: $\hat C_B$ is the best and $\hat C_G$ improves on $\hat C$ when the data are generated by the copula with the strongest extreme value dependence, the Student's t with dependence parameter equal to 0.9.
In practice, we will not know what the optimal smoothing parameter is, and having estimators that allow this parameter to be obtained in a direct and simple way is essential. As shown in Tables 1 and 2, C B and C G have this characteristic; in both cases the results with the rule-of-thumb smoothing parameter are near the optimal results and the lowest MISEs are obtained with C B .

Test for Extreme Value Copula
We show the results of a reduced simulation study (to avoid long computing times) that allows us to compare the type 1 and type 2 errors of the extreme value copula test on smaller-sized samples, calculated from the empirical copula (proposed by [9]) and from the Beta transformed kernel estimator ($\hat C_B$) proposed in this paper, where the optimal bandwidth is used. For this experiment, we use 100 samples of size n = 50. Both tests are implemented using a uniformly spaced grid of 99 × 99 points on $(0, 1)^2$ for calculating $S_r$ and with $R = 100$ estimated copies $\hat D_{r_l,(k)}$, $k = 1, \dots, 100$. The results for type 1 errors are shown in Table 3 for theoretical extreme value copulas, for which $H_0$ is true. Table 4 shows the type 2 errors for copulas without extreme value dependence. They indicate that using $\hat C_B$ we reduce the type 1 error at the cost of increasing the type 2 error, although for n = 50 this error is already high for the test based on the empirical copula. In general, a larger sample is required to reduce the type 2 error.
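The empirical error rates of such an experiment can be computed from the simulated p-values as in the following sketch. It assumes one p-value per simulated sample; the function name and the example values are hypothetical.

```python
def rejection_rate(p_values, alpha):
    """Share of simulated samples in which the test rejects at level alpha."""
    return sum(1 for p in p_values if p < alpha) / len(p_values)


# When the null hypothesis is true, the rejection rate estimates the type 1
# error; when it is false, one minus the rejection rate estimates the type 2
# error.
example_rate = rejection_rate([0.01, 0.2, 0.04, 0.5], alpha=0.05)
```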
The results in Table 3 imply that, if the data are generated by an extreme value copula, the test based on $\hat C_B$ practically ensures a correct result. This is fundamental in the context of risk quantification, given that the consequences of not detecting extreme dependence could be more serious than those of not detecting the opposite, i.e., the absence of extreme value dependence.
Table 3. Type 1 error calculated for different significance levels α.

Data Analysis
To illustrate the usefulness of our proposed estimator $\hat C_B$, we analyse the dependence between the Spanish stock market index (IBEX35) and the stock market indexes of some neighbouring European countries, namely Germany (DAX), France (CAC40), Italy (FTSE MIB), Portugal (PSI20) and the United Kingdom (FTSE100), as well as the two principal stock market indexes of the USA (DOWJONES and S&P500) and the Hong Kong stock market index (HANG SENG) (see [28] for an analysis of extreme dependence between markets). Two types of results are shown:
1. The fit of nonparametric copulas to estimate the probability that the observed losses of two stock market indexes together exceed some percentiles, i.e., we estimate the value of 1 − C(q, q), q = 0.9, 0.95, 0.99, 0.995, with the analysed kernel estimators.
2. The test to analyse whether the data are generated by an extreme value copula.
To carry out the analysis we use a database of the monthly losses of the stock market indexes from January 2000 to March 2021. These losses are calculated from the quotes of the analysed indexes, which are public and can be downloaded, for example, from Investing.com. Throughout the period analysed, three major events influenced market performance, leading to higher losses than in periods of stability: the Lehman Brothers crisis that began in September 2008, the referendum on Brexit on 23 June 2016, and the ongoing COVID-19 crisis, which started in March 2020. The three events are considered systemic risks that affect all markets and, if this effect is simultaneous, the data should be generated by an extreme value copula. In Table 5, the main descriptive statistics of the losses in percentage are shown. Furthermore, normality tests and a positive skewness test are carried out and, in all cases, the normality hypothesis is rejected and skewness greater than zero cannot be rejected, i.e., in absolute value, positive losses are bigger than negative ones. The losses of the Spanish stock market index are plotted together with the indexes of the countries listed for comparison. In Figure 1, we compare Spain (in blue) with four countries that also currently belong to the European Union (in black) and, in Figure 2, the comparison is made with the other countries (in black).
To obtain the Beta transformed kernel estimator we need the data to be i.i.d.; with this in mind, we analyse whether the monthly losses of the stock market indexes have some kind of time dependence in the mean or in the variance. The simple and partial autocorrelation functions of the series and of the squared series allow us to find the ARMA(p, q)-GARCH(P, Q) model used to filter the series and to obtain i.i.d. data (see, for example, [29]). The filter models used are shown in Table A2 of Appendix C.
In Table 6, we show the results of 1 − C(q, q) for q = 0.9, 0.95, 0.99, 0.999 estimated with $\hat C_B$, i.e., the probability of jointly exceeding a given extreme quantile. The upper tail dependence can be approximated as . In Appendix C, we show that the results obtained with the empirical copula and $\hat C_G$ provide lower values than $\hat C_B$. It should be noted that the empirical copula tends to underestimate the probability of the tail when extreme values exist. Furthermore, in the simulation study $\hat C_B$ improves on $\hat C_G$ for all the compared copulas. We obtain the results of the extreme value copula test of Kojadinovic et al. [9] based on the empirical copula, and of the same test based on the Beta transformed kernel estimator analysed in this paper, using the asymptotically optimal smoothing parameter $b_n = 3^{1/3} n^{-1/3}$ and a grid of 4 values around it. As expected, all the results indicate that all the analysed bivariate data have a dependence structure generated by an extreme value copula. This behaviour has been accentuated by the COVID-19 crisis, which has led to greater losses and a systemic risk that lasts over time (see [30,31] for a review on the effect of COVID-19 on market returns and volatility). In Figures 3 and 4, pairs of pseudo-data are plotted; in all cases some accumulation of points is detected near the corner (1, 1), which is an indicator of extreme value dependence.

Conclusions
A new kernel estimator of a copula based on a transformation is analysed. The asymptotic theoretical properties that allow it to be used for inference are proved. A simulation study shows that the proposed estimator improves on the alternative estimators for the most common copulas.
The new estimator allows us to reduce the type 1 error associated with the extreme value copula test, while the type 2 error increases slightly. A future line of work would be to investigate how to reduce the type 2 error with a small sample size of the tests based on the max-stable property.
The financial data analysis shows that the new estimator is useful for the risk analysis linked to the dependence between stock markets.

Appendix C. Application
Results of application. Table A2 shows ARMA-GARCH models and Table A3 shows extreme probabilities estimated with the empirical copula and Gaussian transformed kernel estimator.