This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
Two goodness-of-fit tests for copulas are investigated. The first deals with elliptical copulas and the second with independent copulas. Both tests result from an extension of the projection pursuit methodology introduced in the present article. This method enables us to determine on which axis system these copulas lie, as well as the exact value of these copulas in the basis formed by the axes so determined, irrespective of their value in their canonical basis. Simulations are also presented, as well as an application to real datasets.
The need to describe the dependency between two or more random variables triggered the concept of copulas. We consider a joint cumulative distribution function (cdf)
The objective of projection pursuit is to generate one or several projections providing as much information as possible about the structure of the dataset regardless of its size. Once a structure has been isolated, the corresponding data are transformed through a Gaussianization. Through a recursive approach, this process is iterated to find another structure in the remaining data, until no further structure can be evidenced in the data left at the end. Friedman [
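The Gaussianization step described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `project` and `gaussianize`, the rank-based normal-score transform, and the synthetic sample are all hypothetical choices and are not taken from the article or from Friedman's procedure.

```python
import random
from statistics import NormalDist

def project(points, direction):
    """Project each d-dimensional point onto the axis spanned by `direction`."""
    norm = sum(c * c for c in direction) ** 0.5
    unit = [c / norm for c in direction]
    return [sum(p_i * u_i for p_i, u_i in zip(p, unit)) for p in points]

def gaussianize(values):
    """Map a 1-D sample to approximately standard normal scores.

    Each value is replaced by Phi^{-1}(rank / (n + 1)), where the rank comes
    from the empirical cdf; dividing by n + 1 keeps the argument in (0, 1).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores

random.seed(0)
# Hypothetical data: first coordinate exponential (non-Gaussian), second Gaussian.
sample = [(random.expovariate(1.0), random.gauss(0.0, 1.0)) for _ in range(500)]
projected = project(sample, (1.0, 0.0))   # the axis carrying the structure
gaussianized = gaussianize(projected)     # same ordering, Gaussian-like values
```

The transform is monotone, so the ordering of the projected points is preserved while their marginal law along the chosen axis becomes close to standard normal.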
This paper is organised as follows: Section 2 contains preliminary definitions and properties. In Section 3, we present in detail the
In this section, we recall the concept of copula. We will also define the family of elliptical copulas through a brief reminder of elliptical distributions—see
First, let us define a copula in ℝ
A d-dimensional copula is a joint cumulative distribution function
The following theorem explains to what extent a copula describes the dependency between two or more random variables.
Let
If the marginal cumulative distributions are continuous, then the copula is unique. Otherwise, the copula is unique on the range of values of the marginal cumulative distributions.
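To make the role of the copula concrete, here is a minimal sketch of how it can be estimated from data once each margin has been replaced by its ranks. The function names `pseudo_observations` and `empirical_copula`, the sample size, and the simulated variables are assumptions made for illustration only; they do not come from the article.

```python
import random

def pseudo_observations(column):
    """Rank-transform one coordinate: value -> rank / (n + 1), in (0, 1)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

def empirical_copula(u_columns, point):
    """Fraction of pseudo-observations that are componentwise <= point."""
    n = len(u_columns[0])
    hits = sum(
        all(col[i] <= p for col, p in zip(u_columns, point)) for i in range(n)
    )
    return hits / n

random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
y = [xi + 0.5 * random.gauss(0.0, 1.0) for xi in x]   # built from x, so dependent
z = [random.expovariate(2.0) for _ in range(2000)]    # drawn independently of x

u_x, u_y, u_z = (pseudo_observations(c) for c in (x, y, z))
c_dep = empirical_copula([u_x, u_y], (0.5, 0.5))   # exceeds 0.25 under positive dependence
c_ind = empirical_copula([u_x, u_z], (0.5, 0.5))   # near 0.5 * 0.5 under independence
```

Because only ranks are used, the estimate does not depend on the marginal laws (Gaussian for `x`, exponential for `z`), which is the practical content of the theorem above.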
First, for any copula
We set the independent copula Π as
Moreover, we define the density of a copula as the density associated with the cdf
Whenever it exists, the density of
Finally, let us present several examples of copulas (see also
The Gaussian copula
Defining
The Student copula
Defining
The elliptical copula:
As above, elliptical copulas are the copulas of elliptical distributions (an overview is provided in
Let us first introduce the concept of
Let
Throughout this article, we will also assume that
Let
In our second step, we replace
First, to obtain an approximation of
Second, we get
Finally, the specific form of the relationship (
The main steps of the present algorithm have been summarized in
Let us now study the following example:
Let
Since
To recapitulate our method, if
In the remainder of our study of the algorithm, after having clarified the choice of
Let
We define
For simplicity, let us assume that the family {
The very definition of
Let us assume that
Consequently, lemma C.1 and the fact that the conditional densities of elliptical distributions are also elliptical, as well as the above relationship, lead us to infer that
Now, if the family {
The end of our algorithm implies that
In summary, the following proposition clarifies our choice of
With the above notations,
More generally, the above proposition defines the co-support of
Let
Any (
Let
Let ℙ
As defined in Section 2.2, we consider the following sequences (
The stochastic setup of the algorithm uses
Now, from the second step and as defined in Section 2.2, we derive the fact that the density
All estimates of
Let ℙ
The stochastic setup of the algorithm uses
Proceeding in this way, we obtain a sequence (
Let us now summarize the main steps of the stochastic implementation of our algorithm (the dual representation of the estimators will be further detailed in
In this paragraph, we define the set of hypotheses on
In the remainder of this section, for legibility, we replace
As in chapter
(
(
(
(
Putting
(
(
(
(
Let ℛ be the class of all positive functions
There exists a vector a belonging to
Following Broniatowski [
Let
Then,
Let us also introduce the following sequences (
We also note that
Convergence Study at the
In this paragraph, we show that the sequence (
Let
Both sup_{a∈Θ} ‖
Finally, the following theorem shows that
It holds
In this paragraph, through a test of our criteria, namely
For
Note that
We have
Hence, we propose the test of the null hypothesis
Based on this result, we stop the algorithm, then, defining
Consequently, the following corollary provides us with a confidence region for the above test:
ℰ
Let
Hence, since lemma C.1 page 110 implies that
Consequently,
More generally, if
Finally, putting
With the above notations, should a sequence (
Let
Consequently, keeping the notations introduced in Section 5.1, we perform a statistical test of the null hypothesis
Since, under (
The set ℰ_{d} is a confidence region for the test of the null hypothesis (
(1) If
(2) If the
Let
Since, under (
Keeping the notations of Section 5.2, the set ℰ_{d} is a confidence region for the test of the null hypothesis (
(1) As explained in Section 5.2, if
(2) If the
Thus, our method enables us to determine if the copula of
Let
First, we have:
Moreover, defining
Hence, we can infer that
The following theorem explicitly describes the form of the
Defining
If there exists
Now, using relationship 5.2 and remark 5.3, the following corollary gives us the copula of
In the case where, for any
Let us examine three simulations and an application to real datasets. The first simulation studies the elliptical copula and the second studies the independent copula. In each simulation, our program will aim at creating a sequence of densities (
We are in dimension 2 (=
Let us generate then a Gaussian random variable
We theoretically obtain
To get this result, we perform the following test:
Then, theorem 5.1 enables us to verify (
Results of this optimisation can be found in
Therefore, we can conclude that
We are in dimension 2 (=
Let us consider a sample of 50 (=
Let
We theoretically obtain
Then, theorem 5.2 enables us to verify (
Results of this optimisation can be found in
Therefore, we can conclude that
(On the choice of a
We now consider a sample of
where the Gumbel distribution parameters are (1, 2) and where the Laplace distribution parameters are 4 and 3. In theory, we get
|  | Outliers = 0 | Time | Outliers = 2 | Time |
|---|---|---|---|---|
| Relative Entropy | (0.10, 0.83) (1.13, 0.11) | 30 min | (0.1, 0.8) (0.80, 0.024) | 43 min |
|  | (0, 0.8) (1.021, 0.09) | 22 min | (0.12, 0.79) (0.867, −0.104) | 31 min |
| Hellinger distance | (0.1, 0.9) (0.91, 0.15) | 35 min | (0.1, 0.85) (0.81, 0.14) | 46 min |
|  | Outliers = 0 | Time | Outliers = 5 | Time |
|---|---|---|---|---|
| Relative Entropy | (0.09, 0.89) (1.102, 0.089) | 50 min | (0.1, 0.88) (1.15, 0.144) | 60 min |
|  | (0, 0.9) (0.97, −0.1) | 43 min | (−0.1, 0.9) (0.87, 0.201) | 52 min |
| Hellinger distance | (0.1, 0.91) (0.93, −0.11) | 57 min | (−0.05, 1.1) (0.79, 0.122) | 62 min |
|  | Outliers = 0 | Time | Outliers = 25 | Time |
|---|---|---|---|---|
| Relative Entropy | (0, 1.07) (1.1, −0.05) | 107 min | (0.13, 0.75) (0.79, 0.122) | 121 min |
|  | (0, 0.95) (1.12, −0.02) | 91 min | (0.15, 0.814) (0.922, 0.147) | 103 min |
| Hellinger distance | (−0.01, 0.95) (1.01, −0.073) | 100 min | (−0.17, 1.3) (0.973, 0.206) | 126 min |
We worked on a computer with the following characteristics:
Processor: Mobile AMD 3000+,
RAM: 512 DDR,
Operating system: Windows XP.
Our method, which uses the
This results from the fact that the projection index (or criteria) of
Let us, for instance, study the moves in the stock prices of Renault and Peugeot from January 4, 2010 to July 25, 2010. We thus gather 140 (=
Let us also consider
Consequently,
Let
We first assume that there exists a vector
In order to verify this hypothesis, our reasoning will be the same as in Simulation 6.1. Indeed, we assume that this vector is a cofactor of
Numerical results of the first projection are summarized in
Therefore, our first hypothesis is confirmed.
However, our goal is to study the copula of (
In order to verify this hypothesis, we use the same reasoning as above. Indeed, we assume that this vector is a cofactor of
Therefore, our second hypothesis is confirmed.
In conclusion, as explained in corollary 5.1, the copula of
This result is illustrated in
In the case where
Moreover, we choose the
This has also been the case in our application to real datasets.
Finally, the shape of the copula in the case of real datasets in the {
Projection pursuit is useful for revealing characteristic structures, as well as one-dimensional projections and their associated distributions, in multivariate data. This article clearly demonstrates the efficiency of the
Graph of the estimate of (
Graph of the independent copula estimate.
Graph of the copula of (
Graph of the copula of (
Graph of the copula of (
Proposal.
0.  We define 
 
We perform the goodness-of-fit test  
• Should this test be passed, we derive  
And the algorithm stops.  
• Should this test not be verified, and should we look to approximate  
Otherwise, let us define a vector  
 
Then we replace 
Stochastic outline of the algorithm.
0.  We define 
 
Given
 
And we set
 
Then we replace

Simulation 1: Numerical results of the optimisation.
Our Algorithm  

Projection Study 0:  minimum : 0.445199 
at point : (1.0171,0.0055)  
Test:  
Projection Study 1:  minimum : 0.009628 
at point : (0.0048,0.9197)  
Test:  
3.57809 
Simulation 2: Numerical results of the optimisation.
Our Algorithm  

Projection Study 0 :  minimum : 0.057833 
at point : (0.9890,0.1009)  
Test :  
Projection Study 1 :  minimum : 0.02611 
at point : (−0.1105,0.9290)  
Test :  
1.25945 
Numerical results: First projection.
Our Algorithm  

Projection Study 0:  minimum : 0.02087685 
at point :  
P-value : 0.748765  
Test:  
K(Kernel Estimation of 
4.3428735 
Numerical results: Second projection.
Our Algorithm  

Projection Study 1:  minimum : 0.0198753 
at point :  
Test:  
K(Kernel Estimation of 
4.38475324 
Stock prices of Renault and Peugeot.
23/07/10  34.9  24.2  22/07/10  34.26  24.01  21/07/10  33.15  23.3 
20/07/10  32.69  22.78  19/07/10  33.24  23.36  16/07/10  33.92  23.77 
15/07/10  34.44  23.71  14/07/10  35.08  24.36  13/07/10  35.28  24.37 
12/07/10  33.84  23.16  09/07/10  33.46  23.13  08/07/10  33.08  22.65 
07/07/10  32.15  22.19  06/07/10  31.12  21.56  05/07/10  30.02  20.81 
02/07/10  30.17  20.85  01/07/10  29.56  20.05  30/06/10  30.78  21.07 
29/06/10  30.55  20.97  28/06/10  32.34  22.3  25/06/10  31.35  21.68 
24/06/10  32.29  22.25  23/06/10  33.58  22.47  22/06/10  33.84  22.77 
21/06/10  34.06  23.25  18/06/10  32.89  22.7  17/06/10  32.08  22.31 
16/06/10  31.87  21.92  15/06/10  32.03  22.12  14/06/10  31.45  22.2 
11/06/10  30.62  21.42  10/06/10  30.42  20.93  09/06/10  29.27  20.34 
08/06/10  28.48  19.73  07/06/10  28.92  20.15  04/06/10  29.19  20.27 
03/06/10  30.35  20.46  02/06/10  29.33  19.53  01/06/10  28.87  19.45 
31/05/10  29.39  19.54  28/05/10  29.16  19.55  27/05/10  29.18  19.81 
26/05/10  27.5  18.5  25/05/10  26.76  18.08  24/05/10  28.75  18.81 
21/05/10  28.78  18.82  20/05/10  28.53  18.84  19/05/10  29.49  19.25 
18/05/10  30.95  19.76  17/05/10  30.92  19.35  14/05/10  31.35  19.34 
13/05/10  33.65  20.76  12/05/10  33.63  20.52  11/05/10  33.38  20.34 
10/05/10  33.28  20.3  07/05/10  31  19.24  06/05/10  32.4  20.22 
05/05/10  32.95  20.45  04/05/10  33.3  21.03  03/05/10  35.58  22.63 
30/04/10  35.41  22.45  29/04/10  35.53  22.36  28/04/10  34.75  22.33 
27/04/10  36.2  22.9  26/04/10  37.65  23.73  23/04/10  36.72  23.5 
22/04/10  34.36  22.72  21/04/10  35.01  22.86  20/04/10  35.62  22.88 
19/04/10  34.08  21.77  16/04/10  34.46  21.71  15/04/10  35.16  22.22 
14/04/10  35.1  22.22  13/04/10  35.28  22.45  12/04/10  35.17  21.85 
09/04/10  35.76  21.9  08/04/10  35.67  21.67  07/04/10  36.5  21.89 
06/04/10  36.87  22  01/04/10  35.5  21.97  31/03/10  34.7  21.8 
30/03/10  34.8  22.24  29/03/10  35.7  22.73  26/03/10  35.54  22.58 
25/03/10  35.53  22.73  24/03/10  33.8  21.82  23/03/10  34.1  21.58 
22/03/10  33.73  21.64  19/03/10  34.12  21.68  18/03/10  34.44  21.75 
17/03/10  34.68  21.98  16/03/10  34.33  21.88  15/03/10  33.57  21.53 
12/03/10  33.9  21.86  11/03/10  33.27  21.58  10/03/10  33.12  21.47 
09/03/10  32.69  21.54  08/03/10  32.99  21.66  05/03/10  32.89  21.85 
04/03/10  31.64  21.26  03/03/10  31.65  20.7  02/03/10  31.05  20.2 
01/03/10  30.26  19.54  26/02/10  30.2  19.39  25/02/10  29.42  18.98 
24/02/10  30.9  19.49  23/02/10  30.54  19.74  22/02/10  31.89  20.06 
19/02/10  32.29  20.67  18/02/10  32.26  20.41  17/02/10  31.69  20.31 
16/02/10  31.08  19.8  15/02/10  30.25  19.66  12/02/10  29.56  19.57 
11/02/10  31  20.4  10/02/10  32.78  21.21  09/02/10  33.31  22.31 
08/02/10  32.63  21.95  05/02/10  32.15  22.33  04/02/10  33.72  22.86 
03/02/10  35.32  23.93  02/02/10  35.29  23.8  01/02/10  35.31  24.05 
29/01/10  34.26  23.64  28/01/10  33.94  23.31  27/01/10  33.85  23.88 
26/01/10  34.97  24.86  25/01/10  35.06  24.35  22/01/10  35.7  24.95 
21/01/10  36.1  25  20/01/10  36.92  25.35  19/01/10  38.4  25.81 
18/01/10  39.28  25.95  15/01/10  38.6  25.7  14/01/10  39.56  26.67 
13/01/10  39.49  26.13  12/01/10  38.36  25.98  11/01/10  39.21  26.65 
08/01/10  39.38  26.5  07/01/10  39.69  26.7  06/01/10  39.25  26.32 
05/01/10  38.31  24.74  04/01/10  38.2  24.52 
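As a hedged illustration of how the "moves" in the prices above could be computed, the sketch below takes the five most recent rows of the table and forms day-to-day differences. Two assumptions are made here and are not confirmed by the article: that the first price column is Renault and the second Peugeot, and that a "move" is a simple first difference of consecutive closing prices. The function name `daily_moves` is hypothetical.

```python
from datetime import datetime

# Five most recent rows of the table: (date, first price, second price).
# Assumption: first price column = Renault, second = Peugeot.
rows = [
    ("23/07/10", 34.90, 24.20),
    ("22/07/10", 34.26, 24.01),
    ("21/07/10", 33.15, 23.30),
    ("20/07/10", 32.69, 22.78),
    ("19/07/10", 33.24, 23.36),
]

def daily_moves(rows):
    """Sort rows chronologically and return day-to-day price differences."""
    ordered = sorted(rows, key=lambda r: datetime.strptime(r[0], "%d/%m/%y"))
    moves = []
    for prev, curr in zip(ordered, ordered[1:]):
        # Assumption: a "move" is the plain difference of consecutive prices.
        moves.append((curr[0], curr[1] - prev[1], curr[2] - prev[2]))
    return moves

moves = daily_moves(rows)
```

Each entry of `moves` holds the date and the two price changes relative to the previous trading day; the resulting bivariate sample is the kind of input on which a copula study can be run.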
All the proofs of this article have been gathered in the Technical Report [
There exist many copula families. Let us present the most important among them.
The Gaussian copula can be used in several fields. For example, many credit models are built from this copula, which also has the property that extreme values (minimal or maximal) are independent in the limit; see Joe [
Let us begin with defining the class of elliptical distributions and its properties—see also Cambanis [
where
where
with
(1) For any
Therefore, any marginal density of a multivariate elliptical distribution is elliptical,
Landsman [
Let
Consider two Gaussian densities
Finally, let us introduce the definition of an elliptical copula which generalizes the above overview of the Gaussian copula:
Elliptical copulas are the copulas of elliptical distributions.
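As a concrete instance of this definition, the following sketch draws from a bivariate Gaussian copula by sampling a correlated Gaussian pair and applying the standard normal cdf to each coordinate. The function names, the correlation value 0.7, and the sample size are illustrative assumptions. The check against Kendall's tau uses the known relation tau = (2/π) arcsin(ρ), which holds for elliptical copulas.

```python
import math
import random
from statistics import NormalDist

def sample_gaussian_copula(n, rho, seed=0):
    """Draw n points (u1, u2) from the bivariate Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    points = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Cholesky step: z2 is standard normal with corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        points.append((phi(z1), phi(z2)))
    return points

def kendall_tau(points):
    """Kendall's tau by direct pair counting (quadratic in the sample size)."""
    concordant = discordant = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            s = (points[i][0] - points[j][0]) * (points[i][1] - points[j][1])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)

rho = 0.7
draws = sample_gaussian_copula(400, rho)
tau_hat = kendall_tau(draws)
tau_theory = 2.0 / math.pi * math.asin(rho)
```

Replacing the Gaussian pair by another elliptical pair (for example a Student pair) and the normal cdf by the corresponding marginal cdf would give the matching elliptical copula in the same way.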
These copulas exhibit a simple form as well as properties such as associativity. They also present a variety of dependence structures. They can generally be defined in the following form
Let us now present several examples:
The Clayton copula is an asymmetric Archimedean copula, displaying greater dependency in the negative tail than in the positive tail. Let us define
And its generator is:
For
The Gumbel copula (Gumbel-Hougaard copula) is an asymmetric Archimedean copula, presenting greater dependency in the positive tail than in the negative tail. This copula is given by:
The Frank copula is a symmetric Archimedean copula given by:
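The three Archimedean families above can be sketched in code through their generators, using C(u, v) = φ⁻¹(φ(u) + φ(v)). The parametrizations below are the commonly used bivariate ones and are an assumption here: the article's own formulas may follow a different convention. The helper names are hypothetical, and the functions expect arguments in (0, 1].

```python
import math

def archimedean(generator, inverse):
    """Return C(u, v) = inverse(generator(u) + generator(v)) for u, v in (0, 1]."""
    def copula(u, v):
        return inverse(generator(u) + generator(v))
    return copula

def clayton(theta):
    """Clayton copula, theta > 0 (assumed parametrization)."""
    return archimedean(
        lambda t: (t ** -theta - 1.0) / theta,
        lambda s: (1.0 + theta * s) ** (-1.0 / theta),
    )

def gumbel(theta):
    """Gumbel-Hougaard copula, theta >= 1 (assumed parametrization)."""
    return archimedean(
        lambda t: (-math.log(t)) ** theta,
        lambda s: math.exp(-(s ** (1.0 / theta))),
    )

def frank(theta):
    """Frank copula, theta != 0 (assumed parametrization)."""
    denom = math.expm1(-theta)  # e^{-theta} - 1
    return archimedean(
        lambda t: -math.log(math.expm1(-theta * t) / denom),
        lambda s: -math.log1p(math.exp(-s) * denom) / theta,
    )

clayton_value = clayton(2.0)(0.5, 0.5)   # closed form: (u^-2 + v^-2 - 1)^(-1/2)
gumbel_indep = gumbel(1.0)(0.3, 0.6)     # theta = 1 reduces to the product u * v
frank_value = frank(5.0)(0.3, 0.7)
```

Writing each family through its generator keeps the three definitions parallel and makes the boundary property C(u, 1) = u easy to check, since every generator vanishes at 1.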
In 2005, Alfonsi and Brigo [
Let us call
We define a
The most used distances (Kullback, Hellinger or
with the Kullback-Leibler divergence, we associate
with the Hellinger distance, we associate
with the
more generally, with power divergences, we associate
and, finally, with the
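For discrete distributions, a divergence built from a convex function φ can be computed as D_φ(Q, P) = Σ p_i φ(q_i / p_i). The sketch below uses convex functions commonly associated with the Kullback-Leibler divergence, the Hellinger distance and the χ² distance. These normalisations are an assumption and may differ from those used in the article; the dictionary `PHI`, the function `phi_divergence` and the two probability vectors are hypothetical names and values chosen for illustration.

```python
import math

# Convex functions phi with phi(1) = 0 (assumed normalisations).
PHI = {
    "kullback_leibler": lambda x: x * math.log(x) - x + 1.0,
    "hellinger": lambda x: 2.0 * (math.sqrt(x) - 1.0) ** 2,
    "chi2": lambda x: 0.5 * (x - 1.0) ** 2,
}

def phi_divergence(phi, q, p):
    """D_phi(Q, P) = sum_i p_i * phi(q_i / p_i) for strictly positive probability vectors."""
    return sum(p_i * phi(q_i / p_i) for q_i, p_i in zip(q, p))

# Hypothetical discrete distributions on three points.
q = [0.2, 0.5, 0.3]
p = [0.3, 0.4, 0.3]

divergences = {name: phi_divergence(phi, q, p) for name, phi in PHI.items()}
kl_direct = sum(q_i * math.log(q_i / p_i) for q_i, p_i in zip(q, p))
```

With this choice of φ, the Kullback-Leibler entry coincides with the usual relative entropy Σ q_i ln(q_i / p_i), because the extra terms −q_i + p_i sum to zero for probability vectors.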
Let us now state some well-known properties of divergences.
We have
The divergence function
Finally, we will also use the following property derived from the first part of corollary (1.29) page 19 of Liese and Vajda [
If
For any
We have
Should there exist a family (
Whenever there exists
For any continuous density
Let
Let us consider
Our objective is to estimate the minimum of
Let us consider now a positive sequence
We then generate
The vectors meeting these conditions will be called
Consequently, the next proposition provides us with the condition required to derive our estimates:
Using the notations introduced in Broniatowski [
With the Kullback-Leibler divergence, we can take for
Not all hypotheses will be used simultaneously.
Hypotheses (
As shown in the subsection below for relative entropy, hypothesis (
Hypotheses (
Hypothesis (
Hypothesis (
Let us work with the Kullback-Leibler divergence and with
For all
This hypothesis consists in the following assumptions:
(0) We work with the Kullback-Leibler divergence,
(1) We have
Shows that
Thus, our hypothesis enables us to derive
Shows that
Thus, our hypothesis enables us to derive
We can consequently conclude as above.
Let us now verify (
We have
Thus, the preliminary studies (