Multivariate Extension of Raftery Copula

: This paper introduces a multivariate extension of Raftery copula. The proposed copula is exchangeable and expressed in terms of order statistics. Several properties of this copula are established. In particular, the multivariate Kendall’s tau and Spearman’s rho, as well as the density function, of the suggested copula are derived. The lower and upper tail dependence of the proposed copula are also established. The dependence parameter estimator of this new copula is examined based on the maximum likelihood procedure. A simulation study shows a satisfactory performance of the presented estimator. Finally, the proposed copula is successfully applied to a real data set on black cherry trees.


Introduction
The bivariate exponential distribution has been receiving more attention in research studies for many decades [1].It has been applied in a variety of statistical practices, such as reliability theory, queuing theory and physics.
In [19], Raftery has introduced another class of bivariate exponential distribution that shows an interesting interpretation in physics.Unlike the Marshall-Olkin model, Fréchet's upper bound belongs to the class of Raftery's bivariate exponential distributions, which allows the correlation to be modeled broadly.Moreover, this distribution is continuous without any singular part, which makes the dependence parameter estimation more tractable [20].
As outlined in [19], there are several versions of Raftery bivariate and multivariate exponential distributions.An important version of these kinds of distributions is defined as follows.Consider Z 1 , Z 2 , and Z 3 identical and independent exponential random variables with parameter λ.Assume J is a Bernoulli random variable with parameter θ ∈ (0, 1) assumed independent of Z 1 , Z 2 and Z 3 .The random pair (X, Y) defined by has a Raftery bivariate exponential distribution with parameters λ > 0 and θ ∈ (0, 1).As shown in [19], the marginal random variables X and Y are exponentially distributed with parameter λ.This is a simple and efficient model in the context of its ability to model the full range of positive correlations using only one dependence parameter, namely θ ∈ (0, 1).In fact, it is easily seen that the model includes the bivariate Fréchet distribution when θ tends to 1 and reaches the independence case when θ tends to 0. Furthermore, it could be seen that the random vector (X, Y) is exchangeable, so the model can only be used to describe an exponential random pair with the same marginal distributions.
To avoid this limitation and considering [7], one can adopt the comonotonic shocks method introduced in [9] in order to adapt Model (1) for no exchangeable random pair (X, Y).The idea behind this method is to replace the common shock Z 3 with the pair of shocks (Z 3 , Z 4 ), such that Z 3 and Z 4 are perfectly positively dependent and exponentially distributed with parameters λ 1 and λ 2 , respectively.In other words, the distribution of the random vector (Z 3 , Z 4 ) is the upper Fréchet bound expressed by M(x, y) = min{1 − exp(−λ 1 x), 1 − exp(−λ 2 x)}.This means that there exists an exponential random variable Z with parameter 1 such that Z 3 = λ 1 Z and Z 4 = λ 2 Z.To define the proposed alternative model, let Z 1 , Z 2 and Z be exponential random variables with parameters λ 1 , λ 2 and 1, respectively.Let J be Bernoulli random variable with parameter θ ∈ (0, 1) and presume that the random variables Z 1 , Z 2 , Z and J are independent.Hence, the suggested bivariate exponential random pair (X 1 , Y 2 ) is defined by It can be shown that X 1 and X 2 are exponentially distributed with different parameters λ 1 and λ 2 , correspondingly.Finally, in contrast to Model (1), Representation (2) provides a bivariate exponential random vector that is not necessarily exchangeable.Similarly, the family of random pairs generated by Representation (2) contains the Fréchet upper bound, so it models the full range of positive correlation, namely [0, 1].Moreover, Model (1) is a special case of Model (2) obtained when λ 1 = λ 2 = λ.Furthermore, the joint survival function of (X , Y ) is given, for all (x, y) ∈ R + × R + , by Note that both of the random vectors generated by Models (1) and ( 2) have the same survival copula expressed, for all (u, v) ∈ [0, 1] 2 , by The latter is called the Raftery copula and its Spearman's rho and Kendall's tau of this copula are given in terms of θ by and respectively.Hence, the goal of this paper is to extend the Raftery copula to the multivariate setting and study its properties.The paper is structured as follows.Section 2 establishes a multivariate extension of the Raftery copula extracted from a multivariate version of the model described in Model (2).Section 3 derives the Kendall's tau, the Spearman's rho and the density function corresponding to the proposed copula.Section 4 establishes the lower and upper tail dependence of the proposed survival copula.Section 5 is devoted to the estimation of the dependence parameter of this copula and presents a simulation study showing its performance.The proposed approach has been applied successfully to fit a multivariate distribution to a real data set about black cherry trees.

Multivariate Raftery Copula with One Parameter of Dependence
The aim of this section is to propose a multivariate extension of the Raftery copula presented in Equation (4).To do so, we start by briefly discussing the multivariate Raftery exponential distribution.In fact, Raftery has presented a multivariate exponential model in [19] that extends the bivariate distribution given in Model (1).As outlined in this paper, the resulting model is exchangeable and the number of its parameters decreases exponentially in terms of the dimension of the random vector.Here, we propose a nonexchangeable multivariate extension of the bivariate model (Model (1)) with fewer parameters.This can be carried out by using the concept of the comonotonic shocks method introduced in [9].Specifically, let Z 1 , . . ., Z d , Z be independent exponential random variables with parameters λ 1 . . . ,λ d , 1, respectively.Let J be a Bernoulli random variable with parameter θ ∈ (0, 1).Assume further that J is independent of Z 1 , . . ., Z d , Z.A multivariate exponential random vector (X 1 , . . ., X d ) of the Raftery type is constructed as follows: The above construction provides a class of multivariate distributions with given marginals that are exponentially distributed with parameters λ 1 , . . ., λ d .Note that the value of θ ∈ [0, 1] can be viewed as a dependence parameter of this set of distributions.In addition, this family of distributions describes only the positive dependence and contains the Fréchet upper bound, obtained when θ tends to 1 − .An alternative formulation of the above model is where U 1 , . . ., U d , U are independent random variables uniformly distributed over [0,1].These random variables are independent of J. Since the marginal random variable X j is exponentially distributed with parameters λ j , j = 1, . . ., d.It is easy to check that the survival copula Ĉθ associated with (X 1 , . . ., X d ) is the distribution of the uniform random vector (V 1 , . . ., V d ) defined by Hereafter, we explicit the form of the survival copula Ĉθ .

Proof. For all
Let F θ be the distribution function of U 1−θ 1 .Conditioning on the random variable U, one obtains where By inserting ( 12) into (10), one obtains , which ends the proof.
Observe from Equation ( 9) that the survival copula Ĉθ coincides with the independence copula when θ = 0, and it reaches the Fréchet upper bound when θ tends to 1 − .Furthermore, one checks that the Raftery bivariate survival copula given in Equation ( 4) is a special case of Equation ( 9), obtained when d = 2.In fact, for d = 2, one has . For illustration, let us express Formula (9) for d = 3 in terms of the order statistics u (1) , u (2) and u (3) .Indeed, for d = 3, standard calculations show from Equation ( 9) that . Note that it is easy to simulate from the survival copula Ĉθ .This follows from the fact that the proposed copula is deduced from the stochastic representation described in Equation ( 8).Hence, the following algorithm allows simulating data from the survival copula Ĉθ .
As a corollary of Proposition 1, one obtains the survival function of the multivariate exponential random vector given by the stochastic representation shown in Equation (7).
Proof.From Sklar's theorem, one observes that ).It results from Equation (13) that the multivariate survival function Hθ provided in the above corollary is directly deduced by substituting u (i) into exp (− x(d−i+1) ) in Equation (9).
Simple calculations allow checking that, for d = 2, the multivariate survival function Hθ given in Corollary 1 reduces to the bivariate survival function given in Equation (3).Hereafter, we derive the Pearson correlation coefficients of random pairs selected from a random vector following the survival function Hθ .Proposition 2. Let (X 1 , . . ., X d ) be an exponential random vector with survival function given in Corollary 1. Then for any, Proof.The components of (X 1 , . . ., X d ) are defined through Equation ( 7).Thus, one has for any, 1 Since X i and X j are exponentially distributed with parameters λ i and λ j , var(X i ) = 1/λ 2 i and var(X j ) = 1/λ 2 j .Using the fact that, respectively, Z i , Z j , Z and J are independent, which ends the proof.

Note that the function
. This is not surprising because the family of survival functions derived in Corollary 1 reaches the Fréchet upper bound when θ tends to 1 − .

Properties of Ĉθ
This section provides some properties of the proposed survival copula Ĉθ .

Density Function of the Survival Copula Ĉθ
The next result states the expression of the density c θ of the survival copula Ĉθ .This formula will be used to estimate the dependence parameter θ through the maximum likelihood method.Proposition 3. The density function of the survival copula Ĉθ is given by Proof.The density function of c θ is the derivative of Ĉθ with respect to each of its arguments.Therefore, one has .
To more easily handle the above derivatives, let us decompose Ĉθ as follows: where , and Notice that the partial derivative of g(u (1) , . . ., u (d−1) ) with respect to u (d) vanishes.Hence, standard calculations lead to which ends the proof.

Spearman's Rho
Spearman's rho is an important measure of dependence.It measures the strength of association among the components of a random pair (X, Y).It is well-known that this dependence measure is independent of the marginal distributions of X and Y, and it can be written with regards to the copula of (X, Y).There exists several ways to extend the Spearman's rho to a multivariate case.For instance, Schmid and Schmidt have studied in [21] three multivariate extensions of the population version of Spearman's rho.This section provides an expression of Spearman's rho related to the proposed survival copula Ĉθ .To this end, we study the version of the multivariate Spearman's rho defined by where (V 1 , . . ., V d ) is a uniform random vector distributed as the survival copula Ĉθ .
It can be easily seen that for d = 2, the general formula of Spearman's rho given in Equation ( 15) reduces to Equation (5).In fact, for d = 2, one has Similarly, for d = 3, and after some elementary calculations, one obtains (2 − θ) 3  .

Kendall's Tau of Ĉθ
The multivariate Kendall's tau is introduced in [14 ,17].For the proposed survival copula, this measure is defined by where the uniform random vector (V 1 , . . ., V d ) is distributed as the survival copula Ĉθ .
Proposition 5.The Kendall's tau of the survival copula Ĉθ is given by Proof.Since the uniform random vector (U 1−θ 1 U J , . . ., U 1−θ d U J ) is distributed as the survival copula Ĉθ , then The fact that the survival copula Ĉθ is exchangeable implies that where I[.] stands for the indicator function.From Equation ( 9), one observes that the expectation, E Ĉθ (U 1−θ 1 U J , . . ., , is reduced to the next expression: where and for k = 2, . . ., d, By virtue of Equations ( 19)-( 21), one observes that The quantities I k , k = 1, 2, . . ., d, involved in the above expression of τ d can be calculated as follows: and for k = 2, . . ., d, .
Hereafter, one shows that for d = 2, the general formula of Kendall's tau described in Equation ( 18) reduces to Equation ( 6).In fact, for d = 2, one see from Equation ( 18) that .
Remark that both ρ d and τ d derived in Propositions 4 and 5, respectively, are nonincreasing functions in terms of d.This behavior is illustrated in Figures 1 and 2. In fact, Figure 1 presents the curves of ρ 2 , ρ 3 and ρ 4 in terms of θ ∈ [0, 1].Since the calculations show that τ 2 = τ 3 , Figure 2 exposes the curves of τ 2 , τ 4 and τ 5 in terms of θ.

Lower and Upper Tail Dependence
There are many ways to define the lower and upper tail dependence in the multivariate setting.In this section, we adopt the definition of these parameters provided in [13].Specifically, let (U 1 , . . ., U d ) be a uniform random vector with copula C. According to [13], the multivariate lower and upper tail dependence associated with C are defined by and where C denotes the survival function of (U 1 , . . ., U d ).Hereafter, we derive the expressions of λ U and λ L of the proposed survival copula.
Proposition 6.The lower and upper tail dependence of the survival copula Ĉθ are expressed by Proof.From Equation ( 9), one has Ĉθ (u, . . ., u) Clearly, u d−1+θ 1−θ tends to 0 as u tends to 0 + because d−1+θ 1−θ > 0. Therefore, the above sum can be simplified as follows: From Equations ( 26) and ( 27), one deduces that To establish λ U , first note that It is well-known that for the Bivariate Raftery copula, λ U = 0 (see Example 5.21 in [16]).This means that Combining Equations ( 28) and ( 29), one obtains Notice that for d = 2, the lower tail dependence provided in Proposition 6 reduces to 2θ/(1 + θ), which is the expression of λ L related to the bivariate Raftery copula (see Example 5.21 in [16]).It is similar for the upper tail dependence.

Dependence Parameter Estimation
In this section, we discuss the estimation of the dependence parameter θ using the maximum likelihood procedure; moreover, we will examine the finite-sample accuracy of the estimates for several sample sizes.To this end, let u i = (u i1 , . . ., u id ), i = 1, . . ., n be a sample that has been established earlier from the survival copula Ĉθ .The log-likelihood function is given by where u (id) = max{u i1 , . . ., u id }, i = 1, . . ., n.The maximum likelihood estimator θ of θ is achieved by maximizing the above log-likelihood function in terms of θ.More specifically, This maximization cannot be explicitly solved.Because, there is no closed-form solution of the next equation, To solve the problem shown in Equation (30), one adopts numerical maximization, which provides efficient results, as shown by the following simulation study established for d = 3.

Simulation Study
The tables below present the outcomes of the estimator θ, the bias and the mean squared error (MSE) of θ.The following scenarios present, when investigated, simulations that demonstrate that θ provides a good estimator for the dependence parameter θ.Being a comprehensible result, the effectiveness of our estimator θ increases as n increases, and the bias and MSE of θ decrease.Hence, the estimator becomes narrower as the sample size grows.In addition to the fact that we observed many more simulations, which are not presented here, where the estimator θ performed very well when n was getting larger.

Real Data
An actual data set is utilized to assess the effectiveness of this extension.The data set trees may be found in the datasets R package, as well as in [12].The information was gathered from a sample size of 31 black cherry trees from the forest in order to calculate the volume of a tree based on its height and diameter.The data set includes 31 observations of the variables.The variables are given as Diam, which represents the diameter in inches, Height, which represents the height in feet, and Volume, which represents the volume in cubic feet.Our goal is to fit a multivariate distribution describing the data by using the proposed copula.This can be conducted in two steps.In the first step, we select the marginal distributions.Then, we look for the copula in the second step.These two steps are achieved through goodness-of-fit procedures.
First step.Using the bootstrap technique based on the Kolmogorov-Smirnov (KS) Test, Table 4 demonstrates that the variables Diam and Volume are distributed as a gamma distribution and the Height follows the Weibull distribution.The maximum likelihood estimates (MLEs) of the model parameters are given in Tables 5-7.Second step.Now, we evaluate the goodness-of-fit test (GOF) of the proposed copula Ĉθ based on Cramér-von Mises statistics using the bootstrap algorithm proposed by [10].The estimator of the dependence parameter of the proposed copula is obtained by θ = 0.6102115.The p-value of this GOF test is 0.813, which is much higher than 0.05.This confirms that our proposed copula Ĉθ describes the dependency among the components of the data set reasonably well.The next Table 8 summarizes this information.Data Availability Statement: This The data set may be found in the R package called: datasets, or in [12].

Figure 1 .
Figure 1.The curves of ρ 2 , ρ 3 and ρ 4 in terms of θ are indicated with colors red, blue and green, respectively.

Figure 2 .
Figure 2. The curves of τ 2 , τ 4 and τ 5 in terms of θ are indicated with colors blue, green and purple, respectively.

Table 1 .
Estimation for θ corresponding to weak dependence.

Table 2 .
Estimation for θ corresponding to moderate dependence.

Table 3 .
Estimation for θ corresponding to strong dependence.

Table 4 .
Test statistics and p-value tests.

Table 5 .
MLEs of the model parameters for Dimension.

Table 6 .
MLEs of the model parameters for Height.

Table 7 .
MLEs of the model parameters for Volume.