Abstract
This paper introduces a multivariate extension of the Raftery copula. The proposed copula is exchangeable and expressed in terms of order statistics. Several properties of this copula are established. In particular, the multivariate Kendall’s tau and Spearman’s rho, as well as the density function, of the suggested copula are derived. The lower and upper tail dependence of the proposed copula are also established. The estimator of the dependence parameter of this new copula is examined using the maximum likelihood procedure. A simulation study shows satisfactory performance of the estimator. Finally, the proposed copula is applied to a real data set on black cherry trees.
MSC:
62H05
1. Introduction
The bivariate exponential distribution has received considerable attention for many decades [1]. It has been applied in a variety of fields, such as reliability theory, queuing theory and physics.
A prominent member of this family is the Marshall–Olkin bivariate exponential (MOBE) distribution introduced in [1]. The MOBE model has an appealing physical interpretation in terms of fatal shocks, and it contains both continuous and singular components. Furthermore, the MOBE yields an interesting copula, which has been used in several practical applications, such as finance [2]. The literature on bivariate exponential distributions contains many further models and classes developed in recent decades (see [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]).
In [19], Raftery introduced another class of bivariate exponential distributions that also has an interesting physical interpretation. Unlike the Marshall–Olkin model, Fréchet’s upper bound belongs to the class of Raftery’s bivariate exponential distributions, which allows a broad range of correlations to be modeled. Moreover, this distribution is continuous without any singular part, which makes the estimation of the dependence parameter more tractable [20].
As outlined in [19], there are several versions of Raftery bivariate and multivariate exponential distributions. An important version of these kinds of distributions is defined as follows. Consider , , and identical and independent exponential random variables with parameter . Assume J is a Bernoulli random variable with parameter assumed independent of , and . The random pair defined by
has a Raftery bivariate exponential distribution with parameters and . As shown in [19], the marginal random variables X and Y are exponentially distributed with parameter . This is a simple and efficient model in the context of its ability to model the full range of positive correlations using only one dependence parameter, namely . In fact, it is easily seen that the model includes the bivariate Fréchet distribution when tends to 1 and reaches the independence case when tends to 0. Furthermore, it could be seen that the random vector is exchangeable, so the model can only be used to describe an exponential random pair with the same marginal distributions.
To avoid this limitation, and following [7], one can adopt the comonotonic shocks method introduced in [9] to adapt Model (1) to a nonexchangeable random pair . The idea behind this method is to replace the common shock with the pair of shocks , such that and are perfectly positively dependent and exponentially distributed with parameters and , respectively. In other words, the distribution of the random vector is the upper Fréchet bound expressed by . This means that there exists an exponential random variable Z with parameter 1 such that and . To define the proposed alternative model, let and Z be exponential random variables with parameters and 1, respectively. Let J be a Bernoulli random variable with parameter and assume that the random variables and J are independent. Hence, the suggested bivariate exponential random pair is defined by
It can be shown that and are exponentially distributed with different parameters and , respectively. In contrast to Model (1), Representation (2) provides a bivariate exponential random vector that is not necessarily exchangeable. As with Model (1), the family of random pairs generated by Representation (2) contains the Fréchet upper bound, so it models the full range of positive correlation, namely . Moreover, Model (1) is a special case of Model (2) obtained when . Furthermore, the joint survival function of is given, for all , by
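Note that both of the random vectors generated by Models (1) and (2) have the same survival copula expressed, for all $(u,v)\in[0,1]^2$, by
$$\hat{C}_\theta(u,v)=\min(u,v)+\frac{1-\theta}{1+\theta}\,(uv)^{1/(1-\theta)}\left\{1-[\max(u,v)]^{-(1+\theta)/(1-\theta)}\right\}.\qquad(4)$$
The latter is called the Raftery copula; its Spearman’s rho and Kendall’s tau are given in terms of $\theta$ by
$$\rho=\frac{\theta(4-3\theta)}{(2-\theta)^2}\qquad(5)$$
and
$$\tau=\frac{2\theta}{3-\theta},\qquad(6)$$
respectively. Hence, the goal of this paper is to extend the Raftery copula to the multivariate setting and study its properties.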
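The bivariate quantities above can be evaluated numerically. The following is a minimal sketch, assuming the parametrization of the Raftery copula given in Nelsen [16] with dependence parameter `theta` in [0, 1); the function names are illustrative and not taken from the paper.

```python
def raftery_copula(u: float, v: float, theta: float) -> float:
    """Bivariate Raftery copula at (u, v), for 0 <= theta < 1 (form assumed from Nelsen [16])."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    lo, hi = min(u, v), max(u, v)
    a = 1.0 / (1.0 - theta)
    k = (1.0 - theta) / (1.0 + theta)
    # Algebraically equal to min(u,v) + k*(u*v)^a * (1 - max(u,v)^(-(1+theta)*a)),
    # rearranged so that no large negative power multiplies a tiny number.
    return lo + k * lo ** a * (hi ** a - hi ** (-theta * a))


def spearman_rho(theta: float) -> float:
    """Closed-form Spearman's rho of the bivariate Raftery copula."""
    return theta * (4.0 - 3.0 * theta) / (2.0 - theta) ** 2


def kendall_tau(theta: float) -> float:
    """Closed-form Kendall's tau of the bivariate Raftery copula."""
    return 2.0 * theta / (3.0 - theta)
```

Setting `theta = 0` recovers the independence copula `u * v`, and both coefficients increase from 0 toward 1 as `theta` approaches 1, in line with the discussion of the Fréchet upper bound.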
The paper is structured as follows. Section 2 establishes a multivariate extension of the Raftery copula extracted from a multivariate version of Model (2). Section 3 derives the Kendall’s tau, the Spearman’s rho and the density function corresponding to the proposed copula. Section 4 establishes the lower and upper tail dependence of the proposed survival copula. Section 5 is devoted to the estimation of the dependence parameter of this copula, presents a simulation study of its performance, and applies the proposed approach to fit a multivariate distribution to a real data set on black cherry trees.
2. Multivariate Raftery Copula with One Parameter of Dependence
The aim of this section is to propose a multivariate extension of the Raftery copula presented in Equation (4). To do so, we start by briefly discussing the multivariate Raftery exponential distribution. In fact, Raftery presented a multivariate exponential model in [19] that extends the bivariate distribution given in Model (1). As outlined in that paper, the resulting model is exchangeable and the number of its parameters grows exponentially with the dimension of the random vector. Here, we propose a nonexchangeable multivariate extension of the bivariate model (Model (1)) with fewer parameters. This can be carried out by using the comonotonic shocks method introduced in [9]. Specifically, let be independent exponential random variables with parameters , respectively. Let J be a Bernoulli random variable with parameter . Assume further that J is independent of . A multivariate exponential random vector of the Raftery type is constructed as follows:
The above construction provides a class of multivariate distributions with given marginals that are exponentially distributed with parameters . Note that the value of can be viewed as a dependence parameter of this set of distributions. In addition, this family of distributions describes only the positive dependence and contains the Fréchet upper bound, obtained when tends to . An alternative formulation of the above model is
where are independent random variables uniformly distributed over [0,1]. These random variables are independent of J. Since the marginal random variable is exponentially distributed with parameter , , it is easy to check that the survival copula associated with is the distribution of the uniform random vector defined by
We now make the form of the survival copula explicit.
Proposition 1.
The survival copula of the random vector is given, for all , by
where denote the order statistics of .
Proof.
For all , one has
Let be the distribution function of . Conditioning on the random variable U, one obtains
where and . Note that, for all , , one has , if and , if . Hence, Equation (11) becomes,
Observe from Equation (9) that the survival copula coincides with the independence copula when , and it reaches the Fréchet upper bound when tends to . Furthermore, one checks that the Raftery bivariate survival copula given in Equation (4) is a special case of Equation (9), obtained when . In fact, for , one has
For illustration, let us express Formula (9) for in terms of the order statistics , and . Indeed, for , standard calculations show from Equation (9) that
Note that it is easy to simulate from the survival copula . This follows from the fact that the proposed copula is deduced from the stochastic representation described in Equation (8). Hence, the following algorithm allows simulating data from the survival copula .
Algorithm of simulation:
1. Generate independent values from the uniform distribution on [0,1];
2. Generate j from the Bernoulli distribution with parameter ;
3. Set ;
4. The desired vector is .
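The steps above can be sketched in code. Since the explicit formulas are not reproduced in this text, the sketch below assumes that each component equals an independent uniform raised to the power 1 − θ, multiplied by one shared uniform whenever the Bernoulli indicator equals 1. This is what the exponential representation described earlier yields after applying the marginal survival functions. The function name and arguments are hypothetical.

```python
import random


def simulate_raftery_copula(n, d, theta, seed=None):
    """Draw n vectors of dimension d from the assumed representation
    U_i = V_i**(1 - theta) * V**J, with V_1, ..., V_d, V iid uniform and J ~ Bernoulli(theta)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        v = [rng.random() for _ in range(d)]     # step 1: component-specific uniforms
        v_common = rng.random()                  # step 1: the shared uniform
        j = 1 if rng.random() < theta else 0     # step 2: Bernoulli(theta) indicator
        shock = v_common if j == 1 else 1.0      # V**J
        sample.append([vi ** (1.0 - theta) * shock for vi in v])  # steps 3 and 4
    return sample
```

With `theta = 0` the components are independent uniforms, and with `theta = 1` every component equals the shared uniform, which corresponds to the Fréchet upper bound.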
As a corollary of Proposition 1, one obtains the survival function of the multivariate exponential random vector given by the stochastic representation shown in Equation (7).
Corollary 1.
The survival function of the random vector is given, for all , by
where denote the order statistics of .
Proof.
From Sklar’s theorem, one observes that
Simple calculations allow checking that, for , the multivariate survival function given in Corollary 1 reduces to the bivariate survival function given in Equation (3). Hereafter, we derive the Pearson correlation coefficients of random pairs selected from a random vector following the survival function .
Proposition 2.
Let be an exponential random vector with survival function given in Corollary 1. Then, for any ,
Proof.
The components of are defined through Equation (7). Thus, for any , one has
Since and are exponentially distributed with parameters and , and . Using the fact that, respectively, and J are independent,
which ends the proof. □
Note that the function increases from to . This implies that the range of is exactly . This is not surprising because the family of survival functions derived in Corollary 1 reaches the Fréchet upper bound when tends to .
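The correlation computation can be written out step by step. The derivation below is a sketch under an assumed form of Equation (7), namely $X_i=(1-\theta)E_i+JZ/\lambda_i$ with $E_i\sim\mathrm{Exp}(\lambda_i)$, $Z\sim\mathrm{Exp}(1)$ and $J\sim\mathrm{Bernoulli}(\theta)$ mutually independent. The symbols are reconstructed from the verbal description and may differ from the authors' notation.

```latex
\begin{align*}
\operatorname{Cov}(X_i,X_k)
  &= \frac{\operatorname{Var}(JZ)}{\lambda_i\lambda_k}
   = \frac{E(J^2)\,E(Z^2)-\{E(J)\,E(Z)\}^2}{\lambda_i\lambda_k}
   = \frac{2\theta-\theta^2}{\lambda_i\lambda_k}, \qquad i\neq k,\\[4pt]
\operatorname{Corr}(X_i,X_k)
  &= \frac{\operatorname{Cov}(X_i,X_k)}{\sqrt{\operatorname{Var}(X_i)\operatorname{Var}(X_k)}}
   = \frac{(2\theta-\theta^2)/(\lambda_i\lambda_k)}{1/(\lambda_i\lambda_k)}
   = \theta(2-\theta).
\end{align*}
```

Under this assumption the correlation does not depend on the marginal rates, and $\theta\mapsto\theta(2-\theta)$ increases from 0 to 1 on $[0,1]$, which agrees with the remark on the range of the correlation.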
3. Properties of
This section provides some properties of the proposed survival copula .
3.1. Density Function of the Survival Copula
The next result states the expression of the density of the survival copula . This formula will be used to estimate the dependence parameter through the maximum likelihood method.
Proposition 3.
The density function of the survival copula is given by
Proof.
The density function of is the derivative of with respect to each of its arguments. Therefore, one has
To handle the above derivatives more easily, let us decompose as follows:
where
and
Notice that the partial derivative of with respect to vanishes. Hence, standard calculations lead to
which ends the proof. □
3.2. Spearman’s Rho
Spearman’s rho is an important measure of dependence. It measures the strength of association between the components of a random pair . It is well known that this dependence measure does not depend on the marginal distributions of X and Y, and it can be expressed in terms of the copula of . There exist several ways to extend Spearman’s rho to the multivariate case. For instance, Schmid and Schmidt studied in [21] three multivariate extensions of the population version of Spearman’s rho. This section provides an expression of Spearman’s rho for the proposed survival copula . To this end, we study the version of the multivariate Spearman’s rho defined by
where is a uniform random vector distributed as the survival copula .
Proposition 4.
The expression of the above Spearman’s rho for the survival copula is given by
Proof.
Since the random vector follows the survival copula , then from Equation (14),
Moreover,
It can be easily seen that for , the general formula of Spearman’s rho given in Equation (15) reduces to Equation (5). In fact, for , one has
Similarly, for , and after some elementary calculations, one obtains
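The reduction to the bivariate case can be checked numerically. The sketch below rests on two assumptions that the surviving text does not confirm: that the multivariate Spearman's rho is the Schmid–Schmidt version h(d){2^d E(U_1⋯U_d) − 1} with h(d) = (d+1)/(2^d − d − 1), and that the components follow the representation U_i = V_i^(1−θ) V^J. Under these assumptions the expectation factorizes, which gives a closed form.

```python
def spearman_rho_d(theta: float, d: int) -> float:
    """Multivariate Spearman's rho under the assumed representation
    U_i = V_i**(1 - theta) * V**J and the assumed Schmid-Schmidt form."""
    # E[V_i**(1 - theta)] = 1 / (2 - theta); E[V**(d*J)] = (1 - theta) + theta / (d + 1)
    mean_product = ((1.0 - theta) + theta / (d + 1)) / (2.0 - theta) ** d
    h = (d + 1) / (2 ** d - (d + 1))
    return h * (2 ** d * mean_product - 1.0)


def spearman_rho_bivariate(theta: float) -> float:
    """Bivariate closed form theta*(4 - 3*theta)/(2 - theta)**2, used as a cross-check."""
    return theta * (4.0 - 3.0 * theta) / (2.0 - theta) ** 2
```

For `d = 2` the two functions agree for every `theta`, and for a fixed `theta` in (0, 1) the value for `d = 3` is smaller than for `d = 2`, which matches the monotonicity in d noted later in this section.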
3.3. Kendall’s Tau of
The multivariate Kendall’s tau is introduced in [14,17]. For the proposed survival copula, this measure is defined by
where the uniform random vector is distributed as the survival copula .
Proposition 5.
The Kendall’s tau of the survival copula is given by
Proof.
Since the uniform random vector is distributed as the survival copula , then
The fact that the survival copula is exchangeable implies that
where I stands for the indicator function. From Equation (9), one observes that the expectation, , is reduced to the next expression:
where
and for ,
The quantities , , involved in the above expression of can be calculated as follows:
and for ,
We now show that, for , the general formula of Kendall’s tau described in Equation (18) reduces to Equation (6). In fact, for , one sees from Equation (18) that
Note that both and derived in Propositions 4 and 5, respectively, are nonincreasing functions of d. This behavior is illustrated in Figure 1 and Figure 2. In fact, Figure 1 presents the curves of , and in terms of . Since the calculations show that , Figure 2 shows the curves of , and in terms of .
Figure 1.
The curves of , and in terms of are indicated with colors red, blue and green, respectively.
Figure 2.
The curves of , and in terms of are indicated with colors blue, green and purple, respectively.
4. Lower and Upper Tail Dependence
There are many ways to define the lower and upper tail dependence in the multivariate setting. In this section, we adopt the definition of these parameters provided in [13]. Specifically, let be a uniform random vector with copula C. According to [13], the multivariate lower and upper tail dependence associated with C are defined by
and
where denotes the survival function of . Hereafter, we derive the expressions of and of the proposed survival copula.
Proposition 6.
The lower and upper tail dependence of the survival copula are expressed by
Proof.
From Equation (9), one has
Clearly, tends to 0 as u tends to because . Therefore, the above sum can be simplified as follows:
To establish , first note that
It is well known that for the bivariate Raftery copula, (see Example 5.21 in [16]). This means that
□
Notice that for , the lower tail dependence provided in Proposition 6 reduces to , which is the expression of related to the bivariate Raftery copula (see Example 5.21 in [16]). It is similar for the upper tail dependence.
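The lower-tail limit can be illustrated numerically. The sketch below assumes that the lower tail dependence is the limit of C(u, …, u)/u as u tends to 0, and it uses a diagonal section derived here from the assumed representation U_i = V_i^(1−θ) V^J rather than copied from Proposition 6. The function names are illustrative.

```python
def diagonal(u: float, theta: float, d: int) -> float:
    """P(U_1 <= u, ..., U_d <= u) under the assumed representation, for 0 <= theta < 1."""
    b = d / (1.0 - theta)        # max_i V_i**(1 - theta) has distribution function m**b
    w = theta * b / (b - 1.0)    # weight of the linear term; equals d*theta/(d - 1 + theta)
    return u ** b + w * (u - u ** b)


def lower_tail(theta: float, d: int) -> float:
    """Limit of diagonal(u)/u as u -> 0 under the same assumptions."""
    return d * theta / (d - 1.0 + theta)
```

For `d = 2` the limit equals `2 * theta / (1 + theta)`, the value quoted for the bivariate Raftery copula, and the ratio `diagonal(u, theta, d) / u` approaches `lower_tail(theta, d)` as `u` shrinks.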
5. Parameter Estimation and Simulation Study
5.1. Dependence Parameter Estimation
In this section, we estimate the dependence parameter by the maximum likelihood procedure and examine the finite-sample accuracy of the estimates for several sample sizes. To this end, let , be a sample drawn from the survival copula . The log-likelihood function is given by
where , . The maximum likelihood estimator of is achieved by maximizing the above log-likelihood function in terms of . More specifically,
This maximization cannot be solved explicitly because there is no closed-form solution of the following equation,
To solve the problem shown in Equation (30), one adopts numerical maximization, which provides efficient results, as shown by the following simulation study established for .
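To illustrate the numerical maximization, the following sketch treats only the bivariate case. The density is obtained by differentiating the bivariate Raftery copula as given in Nelsen [16], not taken from Proposition 3, and a plain grid search stands in for whatever optimizer the authors used. All function names are hypothetical.

```python
import math
import random


def log_density(u: float, v: float, theta: float) -> float:
    """Log-density of the bivariate Raftery copula for 0 < theta < 1 and u, v in (0, 1]:
    c(u, v) = lo^(theta*a) * (hi^(theta*a) + theta * hi^(-a)) / (1 - theta^2), a = 1/(1 - theta)."""
    lo, hi = min(u, v), max(u, v)
    a = 1.0 / (1.0 - theta)
    t1 = theta * a * math.log(hi)              # log of hi^(theta*a)
    t2 = math.log(theta) - a * math.log(hi)    # log of theta * hi^(-a)
    m = max(t1, t2)                            # log-sum-exp to avoid overflow
    log_sum = m + math.log(math.exp(t1 - m) + math.exp(t2 - m))
    return -math.log(1.0 - theta * theta) + theta * a * math.log(lo) + log_sum


def log_likelihood(sample, theta: float) -> float:
    return sum(log_density(u, v, theta) for u, v in sample)


def fit_theta(sample, grid_size: int = 999) -> float:
    """Grid-search maximizer of the log-likelihood over theta in (0, 1)."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return max(grid, key=lambda t: log_likelihood(sample, t))


def simulate_pairs(n: int, theta: float, seed=None):
    """Pairs from the assumed representation U_i = V_i**(1 - theta) * V**J, values in (0, 1]."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        shock = (1.0 - rng.random()) if rng.random() < theta else 1.0
        pairs.append(tuple((1.0 - rng.random()) ** (1.0 - theta) * shock for _ in range(2)))
    return pairs
```

A typical use is `fit_theta(simulate_pairs(500, 0.5, seed=7))`. A grid search is crude but avoids any assumption about the shape of the likelihood; a one-dimensional optimizer could replace it once the likelihood is known to be well behaved.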
5.2. Simulation Study
The tables below report the estimator , together with the bias and the mean squared error (MSE) of . The simulations indicate that is a good estimator of the dependence parameter . As expected, the accuracy of the estimator improves as n increases: both the bias and the MSE of decrease, so the sampling distribution of the estimator becomes more concentrated as the sample size grows. Many further simulations, not reported here, confirmed that the estimator performs very well as n gets larger.
This is illustrated by the behavior of the estimator in three scenarios: weak dependence (, i.e., ) in Table 1; moderate dependence (, i.e., ) in Table 2; and strong dependence (, i.e., ) in Table 3.
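The summary quantities reported in such tables can be computed from the replicated estimates as follows. This is a minimal sketch; the function name and its arguments are illustrative, and the numbers in the usage line are made up for demonstration rather than taken from the tables.

```python
def bias_and_mse(estimates, true_value):
    """Return (mean estimate, bias, mean squared error) over Monte Carlo replications."""
    n_rep = len(estimates)
    mean_estimate = sum(estimates) / n_rep
    bias = mean_estimate - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    return mean_estimate, bias, mse


# Illustrative call with made-up replications around a true value of 0.5.
mean_estimate, bias, mse = bias_and_mse([0.4, 0.6, 0.5], 0.5)
```

In a full study, `estimates` would hold one fitted value of the dependence parameter per simulated sample of size n.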
Table 1.
Estimation for corresponding to weak dependence.
Table 2.
Estimation for corresponding to moderate dependence.
Table 3.
Estimation for corresponding to strong dependence.
5.3. Real Data
A real data set is used to assess the proposed extension. The data set trees is available in the datasets R package, as well as in [12]. It contains 31 observations on black cherry trees, collected in order to estimate the volume of a tree from its height and diameter. The variables are Diam, the diameter in inches; Height, the height in feet; and Volume, the volume in cubic feet. Our goal is to fit a multivariate distribution to the data by using the proposed copula. This is carried out in two steps: in the first step, we select the marginal distributions; in the second step, we fit the copula. Both steps rely on goodness-of-fit procedures.
First step. Using the bootstrap technique based on the Kolmogorov–Smirnov (KS) test, Table 4 indicates that a gamma distribution is adequate for the variables Diam and Volume and a Weibull distribution for Height. The maximum likelihood estimates (MLEs) of the model parameters are given in Table 5, Table 6 and Table 7.
Table 4.
Test statistics and p-values.
Table 5.
MLEs of the model parameters for Diam.
Table 6.
MLEs of the model parameters for Height.
Table 7.
MLEs of the model parameters for Volume.
Second step. We now apply a goodness-of-fit (GOF) test to the proposed copula based on the Cramér–von Mises statistic, using the bootstrap algorithm proposed in [10]. The estimator of the dependence parameter of the proposed copula is obtained by . The p-value of this GOF test is 0.813, which is well above 0.05, so the proposed copula is not rejected and describes the dependence among the components of the data set reasonably well. Table 8 summarizes this information.
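The test statistic in this step can be sketched as follows. The code assumes the rank-based Cramér–von Mises statistic reviewed in [10], that is, the sum over observations of the squared difference between the empirical copula and the fitted copula evaluated at the pseudo-observations. The parametric bootstrap that produces the p-value is not shown, and the helper names are hypothetical.

```python
def pseudo_observations(data):
    """Rescaled ranks R_ij / (n + 1), computed column by column (ties are not handled)."""
    n, d = len(data), len(data[0])
    pseudo = [[0.0] * d for _ in range(n)]
    for j in range(d):
        order = sorted(range(n), key=lambda i: data[i][j])
        for rank, i in enumerate(order, start=1):
            pseudo[i][j] = rank / (n + 1)
    return pseudo


def empirical_copula(pseudo, point):
    """Proportion of pseudo-observations that are componentwise <= point."""
    hits = sum(all(x <= p for x, p in zip(row, point)) for row in pseudo)
    return hits / len(pseudo)


def cramer_von_mises(pseudo, copula):
    """S_n = sum_i (C_n(u_i) - C(u_i))^2, where `copula` maps a point to the fitted copula value."""
    return sum((empirical_copula(pseudo, row) - copula(row)) ** 2 for row in pseudo)
```

Here `copula` would be the fitted trivariate copula evaluated at the estimated dependence parameter. Any callable works, for example `lambda p: min(p)` for the Fréchet upper bound.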
Table 8.
GOF test for .
Author Contributions
Conceptualization, T.S., M.M. and A.S.; Methodology, T.S., M.M. and A.S.; Software, T.S. and A.S.; Validation, T.S., M.M. and A.S.; Formal analysis, T.S.; Investigation, M.M.; Data curation, T.S., M.M. and A.S.; Writing—original draft, T.S.; Writing—review & editing, T.S., M.M. and A.S.; Visualization, A.S.; Supervision, M.M. and A.S.; Funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.
Funding
The work of the second author is funded by the Natural Sciences and Engineering Research Council of Canada No. 06536-2018.
Data Availability Statement
The data set may be found in the R package datasets, or in [12].
Acknowledgments
Mhamed Mesfioui acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada No. 06536-2018.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
References
- Marshall, A.; Olkin, I. A multivariate exponential distribution. J. Am. Stat. Assoc. 1967, 62, 30–44.
- Bouye, E.; Durrleman, V.; Nikeghbali, A.; Riboulet, G.; Roncalli, T. Copulas for Finance; Financial Econometrics Research Centre: Paris, France, 2000.
- Arnold, B.C.; Strauss, D. Bivariate distributions with exponential conditionals. J. Am. Stat. Assoc. 1988, 83, 522–527.
- Basu, A.P. Multivariate exponential distributions and their applications in reliability. Handb. Stat. 1988, 7, 467–477.
- Basu, A.P. Bivariate exponential distributions. In The Exponential Distribution: Theory, Methods, and Applications; Routledge: London, UK, 1995; pp. 327–331.
- Block, H.W.; Basu, A.P. A Continuous Bivariate Exponential Extension. J. Am. Stat. Assoc. 1974, 69, 1031–1037.
- Bukovšek, D.K.; Košir, T.; Mojškerc, B.; Omladič, M. Extreme generators of shock induced copulas. Appl. Math. Comput. 2022, 429, 127214.
- David, H.A. Order Statistics, 2nd ed.; Wiley: New York, NY, USA, 1981.
- Genest, C.; Mesfioui, M.; Schulz, J. A new bivariate Poisson common shock model covering all possible degrees of dependence. Stat. Probabil. Lett. 2018, 140, 202–209.
- Genest, C.; Rémillard, B.; Beaudoin, D. Goodness-of-fit tests for copulas: A review and a power study. Insur. Math. Econ. 2009, 44, 199–213.
- Gumbel, E.J. Bivariate exponential distributions. J. Am. Stat. Assoc. 1960, 55, 698–707.
- Hand, D.J. A Handbook of Small Data Sets; Chapman & Hall/CRC: Boca Raton, FL, USA, 1994.
- Joe, H.; Li, H.; Nikoloulopoulos, A.K. Tail dependence functions and vine copulas. J. Multivariate Anal. 2010, 101, 252–270.
- Joe, H. Multivariate concordance. J. Multivariate Anal. 1990, 35, 12–30.
- Kundu, D.; Gupta, R.D. Absolute continuous bivariate generalized exponential distribution. Adv. Stat. Anal. 2011, 95, 169–185.
- Nelsen, R.B. An Introduction to Copulas; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006.
- Nelsen, R.B. Concordance and copulas: A survey. Distrib. Given Marginals Stat. Model. 2002, 37, 169–177.
- Paulson, A.S. A characterization of the exponential distribution and a bivariate exponential distribution. Sankhyā Indian J. Stat. Ser. A 1973, 35, 69–78.
- Raftery, A.E. A continuous multivariate exponential distribution. Commun. Stat. Theory Methods 1984, 13, 947–965.
- Regoli, G. A class of bivariate exponential distributions. J. Multivariate Anal. 2009, 100, 1261–1269.
- Schmid, F.; Schmidt, R. Multivariate extensions of Spearman’s rho and related statistics. Stat. Probab. Lett. 2007, 77, 407–416.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).