

# A Generator of Bivariate Distributions: Properties, Estimation, and Applications

1. Department of Statistics and Operations Research, University of Murcia, CEIR Campus Mare Nostrum, IMIB-Arrixaca, 30100 Murcia, Spain
2. Department of Mathematics and Statistics, Indian Institute of Technology, Kanpur 208016, India
\* Author to whom correspondence should be addressed.
Mathematics 2020, 8(10), 1776; https://doi.org/10.3390/math8101776
Received: 3 September 2020 / Revised: 6 October 2020 / Accepted: 8 October 2020 / Published: 14 October 2020
(This article belongs to the Section Probability and Statistics Theory)

## Abstract

In 2020, El-Morshedy et al. introduced a bivariate extension of the Burr type X generator (BBX-G) of distributions, and Muhammed presented a bivariate generalized inverted Kumaraswamy (BGIK) distribution. In this paper, we propose a more flexible generator of bivariate distributions based on the maximization process from an arbitrary three-dimensional baseline distribution vector, which is of interest for maintenance and stress models, and expands the BBX-G and BGIK distributions, among others. The proposed generator allows one to build new bivariate distributions by combining non-identically distributed baseline components. The bivariate distributions belonging to the proposed family have a singular part, due to a latent component, which makes them suitable for modeling two-dimensional data sets with ties. Several distributional and stochastic properties are studied for such bivariate models, as well as for their marginals, conditional distributions, and order statistics. Furthermore, we analyze their copula representation and some related association measures. An EM algorithm is proposed to compute the maximum likelihood estimates of the unknown parameters, which is illustrated by using two particular distributions of this bivariate family to model two real data sets.

## 1. Introduction

Gumbel [1], Freund [2], and Marshall and Olkin [3] developed bivariate exponential distributions in their pioneering papers. Since then, an extensive amount of work has been done on these models and their generalizations, which have played a crucial role in the construction of multivariate distributions and in modeling across a wide variety of applications, such as physics, economics, biology, health, engineering, and computer science. Several continuous bivariate distributions can be found in Balakrishnan and Lai [4], and some generalizations and multivariate extensions have been studied by Franco and Vivo [5], Kundu and Gupta [6], Franco et al. [7], Gupta et al. [8], and Kundu et al. [9], among others, and recently by Muhammed [10], Franco et al. [11], and El-Morshedy et al. [12]; see also the references cited therein.
Kundu and Gupta [13] introduced a bivariate generalized exponential (BGE) distribution by using the trivariate reduction technique with generalized exponential (GE) random variables, which is based on the maximization process between components with a latent random variable, suitable for some stress and maintenance models. This procedure has also been applied in the literature to generate other bivariate distributions, for example, the bivariate generalized linear failure rate (BGLFR) given by Sarhan et al. [14], the bivariate log-exponentiated Kumaraswamy (BlogEK) introduced by Elsherpieny et al. [15], the bivariate exponentiated modified Weibull extension (BEMWE) given by El-Gohary et al. [16], the bivariate inverse Weibull (BIW) studied by Muhammed [17] and Kundu and Gupta [18], the bivariate Dagum (BD) provided by Muhammed [19], the bivariate generalized Rayleigh (BGR) depicted by Sarhan [20], the bivariate Gumbel-G (BGu-G) presented by Eliwa and El-Morshedy [21], the bivariate generalized inverted Kumaraswamy (BGIK) given by Muhammed [10], and the bivariate Burr type X-G (BBX-G) proposed by El-Morshedy et al. [12]. Some associated inferential issues have been discussed in these articles, and all of them are based on considering the same kind of baseline components. In each of these bivariate models, the baseline components belong to the proportional reversed hazard rate (PRH) family with a certain underlying distribution (Gupta et al. [22] and Di Crescenzo [23]). It is worth mentioning that Kundu and Gupta [24] extended the BGE model by using components within the PRH family, called the bivariate proportional reversed hazard rate (BPRH) family, and a multivariate extension of the BPRH model was studied by Kundu et al. [9].
The main aim of this paper is to provide a more flexible generator of bivariate distributions based on the maximization process from an arbitrary three-dimensional baseline continuous distribution vector, i.e., not necessarily identical continuous distributions. Hence, this proposed generator allows researchers and practitioners to generate new bivariate distributions even by combining non-identically distributed baseline components, which may be interpreted as a stress model or as a maintenance model. We refer to the bivariate models from this generator as the generalized bivariate distribution (GBD) family, which contains as special cases the aforementioned bivariate distributions. Note that a two-dimensional random variable $(X_1, X_2)$ belonging to the GBD family has dependent components due to a latent factor, and its joint cumulative distribution function (cdf) is not absolutely continuous, i.e., the joint cdf is a mixture of an absolutely continuous part and a singular part, due to the positive probability of the event $X_1 = X_2$, whereas the line $x_1 = x_2$ has two-dimensional Lebesgue measure zero. In general, the maximum likelihood estimates (MLEs) of the unknown parameters of a GBD model cannot be obtained in closed form, and we propose using an EM algorithm to compute the MLEs of such parameters.
The rest of the paper is organized as follows. The construction of the GBD family is given in Section 2, and we obtain its decomposition in absolutely continuous and singular parts and its joint probability density function (pdf). In Section 3, several special bivariate models are presented. The cdf and pdf of the marginals and conditional distributions are derived in Section 4, as well as for its order statistics. Some dependence and two-dimensional ageing properties for the GBD family, and stochastic properties of their marginals and order statistics are studied in Section 5, as well as its copula representation and some related association measures. The EM algorithm is proposed in Section 6, which is applied in Section 7, for illustrative purposes, to find the MLEs of particular models of the GBD family in the analysis of two real data sets. Finally, the multivariate extension is discussed in Section 8, as well as the concluding remarks. Some of the proofs are relegated to Appendix A for a fluent presentation of the results, and some technical details of the applications can be found in Appendix B.

## 2. The GBD Family

In this section, we define the generalized bivariate distribution family as a generator system from any three-dimensional baseline continuous distribution, and then we provide its joint cdf, decomposition, and joint pdf.
Let $U_1$, $U_2$, and $U_3$ be mutually independent random variables with arbitrary continuous distribution functions $F_{U_1}$, $F_{U_2}$, and $F_{U_3}$, respectively. Let $X_1 = \max(U_1, U_3)$ and $X_2 = \max(U_2, U_3)$. Then, the random vector $(X_1, X_2)$ is said to follow a GBD model with baseline distribution vector $(F_{U_1}, F_{U_2}, F_{U_3})$.
Theorem 1.
Let $(X_1, X_2)$ be a GBD model with baseline distribution vector $(F_{U_1}, F_{U_2}, F_{U_3})$; then its joint cdf is given by
$F(x_1, x_2) = F_{U_1}(x_1)\, F_{U_2}(x_2)\, F_{U_3}(z),$
where $z = \min(x_1, x_2)$, for all $x_1, x_2 \in \mathbb{R}$.
Proof.
It is immediate since
$F(x_1, x_2) = P(X_1 \le x_1, X_2 \le x_2) = P(\max(U_1, U_3) \le x_1, \max(U_2, U_3) \le x_2) = P(U_1 \le x_1, U_2 \le x_2, U_3 \le \min(x_1, x_2)) = F_{U_1}(x_1)\, F_{U_2}(x_2)\, F_{U_3}(\min(x_1, x_2)).$ □
For instance, a stress model may lead to the GBD family, as in Kundu and Gupta [13]. Suppose a two-component system where each component is subject to an individual independent stress, say $U_1$ and $U_2$, respectively. The system has an overall stress $U_3$ which is transmitted equally to both components, independently of their individual stresses. Then, the observed stress for each component is the maximum of the individual and overall stresses, i.e., $X_1 = \max(U_1, U_3)$ and $X_2 = \max(U_2, U_3)$, and $(X_1, X_2)$ is a GBD model.
Analogously, a GBD model is also plausible for a maintenance model. Suppose a system has two components, each of which is maintained independently, and there is also an overall maintenance. Due to component maintenance, the lifetime of each individual component is increased by a random time, say $U_1$ and $U_2$, respectively, and, because of the overall maintenance, the lifetime of each component is increased by another random time $U_3$. Then, the increased lifetime of each component is the maximum of the individual and overall maintenances, $X_1 = \max(U_1, U_3)$ and $X_2 = \max(U_2, U_3)$, respectively.
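As a quick illustration (not part of the original derivation), the maximization construction can be simulated directly and checked against the product form of Theorem 1; the exponential baselines and rate values below are illustrative assumptions:

```python
import math
import random

random.seed(0)

# Illustrative assumption: exponential baselines with rates 1, 2, 3
RATES = (1.0, 2.0, 3.0)

def draw_pair():
    """One draw of (X1, X2) from the maximization construction."""
    u1, u2, u3 = (random.expovariate(r) for r in RATES)
    return max(u1, u3), max(u2, u3)

def F_exp(u, lam):
    return 1.0 - math.exp(-lam * u) if u > 0 else 0.0

# Monte Carlo estimate of F(x1, x2) vs the product form of Theorem 1
x1, x2, n = 0.8, 0.5, 200_000
mc = sum(1 for _ in range(n)
         if (lambda p: p[0] <= x1 and p[1] <= x2)(draw_pair())) / n
exact = F_exp(x1, 1.0) * F_exp(x2, 2.0) * F_exp(min(x1, x2), 3.0)
print(round(mc, 3), round(exact, 3))
```

The two printed values agree up to Monte Carlo error, as the theorem predicts.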
As mentioned before, a bivariate model belonging to the GBD family does not have an absolutely continuous cdf. We now give the decomposition of a GBD model as a mixture of a bivariate absolutely continuous cdf and a singular cdf; the proof is provided in Appendix A.
Theorem 2.
Let $(X_1, X_2)$ be a GBD model with baseline distribution vector $(F_{U_1}, F_{U_2}, F_{U_3})$. Then,
$F(x_1, x_2) = \alpha F_s(x_1, x_2) + (1 - \alpha) F_{ac}(x_1, x_2)$
where
$F_s(x_1, x_2) = \frac{1}{\alpha} \int_{-\infty}^{z} F_{U_1}(u) F_{U_2}(u)\, dF_{U_3}(u)$
and
$F_{ac}(x_1, x_2) = \frac{1}{1 - \alpha} \left( F_{U_1}(x_1) F_{U_2}(x_2) F_{U_3}(z) - \int_{-\infty}^{z} F_{U_1}(u) F_{U_2}(u)\, dF_{U_3}(u) \right)$
with $z = \min(x_1, x_2)$, are the singular and absolutely continuous parts, respectively, and
$\alpha = \int_{-\infty}^{\infty} F_{U_1}(u) F_{U_2}(u)\, dF_{U_3}(u).$
In addition, due to the singular part $F_s$ in (2), the GBD family does not have a pdf with respect to the two-dimensional Lebesgue measure, even when the distribution functions $F_{U_1}$, $F_{U_2}$, and $F_{U_3}$ are absolutely continuous. However, it is possible to construct a joint pdf for $(X_1, X_2)$ as a mixture of a pdf with respect to the two-dimensional Lebesgue measure and a pdf with respect to the one-dimensional Lebesgue measure (the proof is provided in Appendix A).
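The mixing weight α has a direct probabilistic reading: it is the probability of the singular event $X_1 = X_2$, which occurs exactly when $U_3 \ge \max(U_1, U_2)$. A minimal Monte Carlo check, under the illustrative assumption of exponential baselines with rates 1, 2, 3 (for which the integral gives $\alpha = 1 - 3/4 - 3/5 + 1/2 = 0.15$):

```python
import random

random.seed(1)

# Illustrative exponential baselines with rates 1, 2, 3;
# alpha = ∫ F_U1 F_U2 dF_U3 = 1 - 3/4 - 3/5 + 1/2 = 0.15 for these rates
n, ties = 200_000, 0
for _ in range(n):
    u1 = random.expovariate(1.0)
    u2 = random.expovariate(2.0)
    u3 = random.expovariate(3.0)
    if u3 >= max(u1, u2):  # the latent factor wins: X1 = X2 = u3
        ties += 1

alpha_hat = ties / n
print(round(alpha_hat, 3))
```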
Theorem 3.
If $(X_1, X_2)$ is a GBD model with joint cdf given by (1), then the joint pdf with respect to μ, the measure associated with F, is
$f(x_1, x_2) = \begin{cases} f_1(x_1, x_2), & \text{if } x_1 < x_2 \\ f_2(x_1, x_2), & \text{if } x_1 > x_2 \\ f_0(x), & \text{if } x_1 = x_2 = x, \end{cases}$
where
$f_i(x_1, x_2) = f_{U_j}(x_j) \left( f_{U_i}(x_i) F_{U_3}(x_i) + F_{U_i}(x_i) f_{U_3}(x_i) \right), \quad \text{with } i \ne j \in \{1, 2\},$
and
$f_0(x) = f_{U_3}(x) F_{U_1}(x) F_{U_2}(x),$
when the pdf $f_{U_i}$ of $U_i$ exists, $i = 1, 2, 3$.
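As a sanity check (an illustration, not part of the paper's derivation), the three pieces should carry total mass one: the two absolutely continuous regions integrated over the plane plus the singular part integrated along the diagonal. A numerical integration sketch with assumed exponential baselines:

```python
import math

# Exponential baselines (illustrative choice): F_Ui(u) = 1 - exp(-lam_i * u)
lam = [1.0, 2.0, 3.0]
F = [lambda u, l=l: 1.0 - math.exp(-l * u) for l in lam]
f = [lambda u, l=l: l * math.exp(-l * u) for l in lam]

def f_region(i, x):
    # bracketed factor of f_i: f_Ui(x) F_U3(x) + F_Ui(x) f_U3(x)
    return f[i](x) * F[2](x) + F[i](x) * f[2](x)

# Integrate the three pieces of Theorem 3 on a truncated grid (midpoint rule)
h, T = 1e-3, 25.0
mass1 = mass2 = mass0 = 0.0
for k in range(int(T / h)):
    x = (k + 0.5) * h
    mass1 += f_region(0, x) * (1.0 - F[1](x)) * h  # region x1 < x2, x2 integrated out
    mass2 += f_region(1, x) * (1.0 - F[0](x)) * h  # region x1 > x2, x1 integrated out
    mass0 += f[2](x) * F[0](x) * F[1](x) * h       # singular line x1 = x2

total = mass1 + mass2 + mass0
print(round(total, 4))  # should be close to 1
```

Note that `mass0` recovers the weight α of Theorem 2, so the absolutely continuous part carries mass $1 - \alpha$.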

## 3. Special Cases

In this section, we derive new bivariate models from Theorem 1 by taking particular baseline distribution vectors $(F_{U_1}, F_{U_2}, F_{U_3})$.
Note that, if the baseline components $U_i$ belong to the same distribution family, say $F_U$, then the proposed generator provides novel extended bivariate versions of that distribution $F_U$. Furthermore, under certain restrictions on the underlying parameters of each $U_i$, bivariate distributions given in the literature are recovered. From now on, it is assumed that all parameters of each $F_{U_i}$ are positive unless otherwise mentioned.
Extended bivariate generalized exponential model. A random variable U follows a GE distribution, $U \sim GE(\theta, \lambda)$ (see Gupta and Kundu [25]), if its cdf is given by
$F_{GE}(u; \theta, \lambda) = \left( 1 - e^{-\lambda u} \right)^{\theta}, \quad \text{for } u > 0.$
If $U_i \sim GE(\theta_i, \lambda_i)$, $i = 1, 2, 3$, then the GBD model with the GE baseline distribution vector is an extended BGE model with parameter vectors $\theta = (\theta_1, \theta_2, \theta_3)$ and $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, denoted as $(X_1, X_2) \sim EBGE(\theta, \lambda)$, and its joint cdf is
$F_{EBGE}(x_1, x_2) = F_{GE}(x_1; \theta_1, \lambda_1) F_{GE}(x_2; \theta_2, \lambda_2) F_{GE}(z; \theta_3, \lambda_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
As a particular case, if $\lambda = \lambda_i$, $i = 1, 2, 3$, then $(X_1, X_2) \sim BGE(\theta, \lambda)$ as given by Kundu and Gupta [13].
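The EBGE joint cdf is a direct product of three GE cdfs, so it is straightforward to evaluate; the parameter values below are illustrative, and with a common scale the function reduces to the BGE cdf of Kundu and Gupta:

```python
import math

def F_GE(u, theta, lam):
    # Generalized exponential cdf: (1 - exp(-lam*u))**theta, for u > 0
    return (1.0 - math.exp(-lam * u)) ** theta if u > 0 else 0.0

def F_EBGE(x1, x2, theta, lam):
    # Joint cdf of the extended BGE model (Theorem 1 with GE baselines)
    z = min(x1, x2)
    return (F_GE(x1, theta[0], lam[0]) * F_GE(x2, theta[1], lam[1])
            * F_GE(z, theta[2], lam[2]))

# Common scale lam = 1: the BGE special case, evaluated at (1, 2)
print(round(F_EBGE(1.0, 2.0, (2.0, 1.5, 1.0), (1.0, 1.0, 1.0)), 4))
```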
Extended bivariate proportional reversed hazard rate model. If $U_i \sim PRH(\theta_i)$ with base distribution $F_{B_i}$, $i = 1, 2, 3$, i.e., its cdf can be expressed as $F_{U_i} = F_{B_i}^{\theta_i}$ (see Gupta et al. [22] and Di Crescenzo [23]), then the GBD model with PRH baseline distribution vector provides an extended BPRH model, $(X_1, X_2) \sim EBPRH(\theta, \lambda)$, with $\theta = (\theta_1, \theta_2, \theta_3)$ the parameter vector of the PRH components and $\lambda = (\lambda_1, \lambda_2, \lambda_3)$ the parameter vector of the underlying distributions $F_{B_i}$. From (1), its joint cdf is given by
$F_{EBPRH}(x_1, x_2) = F_{B_1}^{\theta_1}(x_1; \lambda_1) F_{B_2}^{\theta_2}(x_2; \lambda_2) F_{B_3}^{\theta_3}(z; \lambda_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
In particular, if the PRH components have the same base distribution, $F_B = F_{B_i}$, $i = 1, 2, 3$, then $(X_1, X_2) \sim BPRH(\theta, \lambda)$ with baseline distribution $F_B(\cdot; \lambda)$, introduced by Kundu and Gupta [24].
Extended bivariate generalized linear failure rate model. It is said that a random variable U follows a GLFR distribution, $U \sim GLFR(\theta, \lambda, \gamma)$ (see Sarhan and Kundu [26]), if its cdf is given by
$F_{GLFR}(u; \theta, \lambda, \gamma) = \left( 1 - \exp\left( -\lambda u - \frac{\gamma}{2} u^2 \right) \right)^{\theta}, \quad \text{for } u > 0.$
If $U_i \sim GLFR(\theta_i, \lambda_i, \gamma_i)$, $i = 1, 2, 3$, then the GBD model with GLFR baseline distribution vector is an extended BGLFR model, $(X_1, X_2) \sim EBGLFR(\theta, \lambda, \gamma)$, with parameters $\theta = (\theta_1, \theta_2, \theta_3)$, $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, and $\gamma = (\gamma_1, \gamma_2, \gamma_3)$, having joint cdf
$F_{EBGLFR}(x_1, x_2) = F_{GLFR}(x_1; \theta_1, \lambda_1, \gamma_1) F_{GLFR}(x_2; \theta_2, \lambda_2, \gamma_2) F_{GLFR}(z; \theta_3, \lambda_3, \gamma_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
When $\lambda_i = \lambda$ and $\gamma_i = \gamma$, $i = 1, 2, 3$, it is obtained that $(X_1, X_2) \sim BGLFR(\theta, \lambda, \gamma)$ as given by Sarhan et al. [14].
Extended bivariate log-exponentiated Kumaraswamy model. Let U be a random variable with a logEK distribution, $U \sim logEK(\theta, \lambda, \gamma)$ (see Lemonte et al. [27]); then its cdf is given by
$F_{logEK}(u; \theta, \lambda, \gamma) = \left( 1 - \left( 1 - \left( 1 - e^{-u} \right)^{\lambda} \right)^{\gamma} \right)^{\theta}, \quad \text{for } u > 0.$
If $U_i \sim logEK(\theta_i, \lambda_i, \gamma_i)$, $i = 1, 2, 3$, then the GBD model with logEK baseline distribution vector is an extended BlogEK model, $(X_1, X_2) \sim EBlogEK(\theta, \lambda, \gamma)$, with parameters $\theta = (\theta_1, \theta_2, \theta_3)$, $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, and $\gamma = (\gamma_1, \gamma_2, \gamma_3)$, and its joint cdf is given by
$F_{EBlogEK}(x_1, x_2) = F_{logEK}(x_1; \theta_1, \lambda_1, \gamma_1) F_{logEK}(x_2; \theta_2, \lambda_2, \gamma_2) F_{logEK}(z; \theta_3, \lambda_3, \gamma_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
Clearly, it can be seen that $(X_1, X_2) \sim BlogEK(\theta, \lambda, \gamma)$ as given by Elsherpieny et al. [15], when $\lambda_i = \lambda$ and $\gamma_i = \gamma$, $i = 1, 2, 3$.
Extended bivariate exponentiated modified Weibull extension model. A random variable U follows an EMWE distribution, $U \sim EMWE(\theta, \alpha, \beta, \lambda)$ (see Sarhan and Apaloo [28]), if its cdf can be expressed as
$F_{EMWE}(u; \theta, \alpha, \beta, \lambda) = \left( 1 - \exp\left( \alpha \lambda \left( 1 - e^{(u/\alpha)^{\beta}} \right) \right) \right)^{\theta}, \quad \text{for } u > 0.$
If $U_i \sim EMWE(\theta_i, \alpha_i, \beta_i, \lambda_i)$, $i = 1, 2, 3$, then the GBD model with EMWE baseline distribution vector is an extended BEMWE model, $(X_1, X_2) \sim EBEMWE(\theta, \alpha, \beta, \lambda)$, with parameter vectors $\theta = (\theta_1, \theta_2, \theta_3)$, $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, $\beta = (\beta_1, \beta_2, \beta_3)$, and $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, and its joint cdf is given by
$F_{EBEMWE}(x_1, x_2) = F_{EMWE}(x_1; \theta_1, \alpha_1, \beta_1, \lambda_1) F_{EMWE}(x_2; \theta_2, \alpha_2, \beta_2, \lambda_2) F_{EMWE}(z; \theta_3, \alpha_3, \beta_3, \lambda_3),$
for $x_1 > 0$ and $x_2 > 0$, where $z = \min(x_1, x_2)$.
Note that, if $\alpha_i = \alpha$, $\beta_i = \beta$, and $\lambda_i = \lambda$, $i = 1, 2, 3$, then $(X_1, X_2) \sim BEMWE(\theta, \alpha, \beta, \lambda)$ as given by El-Gohary et al. [16].
Extended bivariate inverse Weibull model. The cdf of the IW distribution (e.g., see Keller et al. [29]) is defined by
$F_{IW}(u; \theta, \lambda) = e^{-\lambda u^{-\theta}}, \quad \text{for } u > 0.$
If $U_i \sim IW(\theta_i, \lambda_i)$, $i = 1, 2, 3$, then the GBD model with IW baseline distribution vector is an extended BIW model with parameter vectors $\theta = (\theta_1, \theta_2, \theta_3)$ and $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, denoted as $(X_1, X_2) \sim EBIW(\theta, \lambda)$, and its joint cdf can be written as
$F_{EBIW}(x_1, x_2) = e^{-\lambda_1 x_1^{-\theta_1} - \lambda_2 x_2^{-\theta_2} - \lambda_3 z^{-\theta_3}}, \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
In particular, $(X_1, X_2) \sim BIW(\theta, \lambda)$ as studied by Muhammed [17] and Kundu and Gupta [18], when $\theta_i = \theta$ for $i = 1, 2, 3$.
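For the EBIW model, the product of the three IW cdfs collapses into a single exponential, as the closed form above shows; a small check with illustrative parameter values:

```python
import math

def F_IW(u, theta, lam):
    # Inverse Weibull cdf: exp(-lam * u**(-theta)), for u > 0
    return math.exp(-lam * u ** (-theta)) if u > 0 else 0.0

def F_EBIW(x1, x2, theta, lam):
    # Closed form: a single exponential of the three summed terms
    z = min(x1, x2)
    return math.exp(-lam[0] * x1 ** (-theta[0])
                    - lam[1] * x2 ** (-theta[1])
                    - lam[2] * z ** (-theta[2]))

theta, lam = (2.0, 1.0, 1.5), (0.5, 1.0, 2.0)  # illustrative values
x1, x2 = 1.4, 0.9
z = min(x1, x2)
prod = (F_IW(x1, theta[0], lam[0]) * F_IW(x2, theta[1], lam[1])
        * F_IW(z, theta[2], lam[2]))
print(round(F_EBIW(x1, x2, theta, lam), 6), round(prod, 6))
```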
Extended bivariate Dagum model. It is said that a random variable U follows a Dagum distribution [30], $U \sim D(\theta, \lambda, \gamma)$, if its cdf is given by
$F_D(u; \theta, \lambda, \gamma) = \left( 1 + \lambda u^{-\gamma} \right)^{-\theta}, \quad \text{for } u > 0.$
If $U_i \sim D(\theta_i, \lambda_i, \gamma_i)$, $i = 1, 2, 3$, then the GBD model with Dagum baseline distribution vector is an extended BD model with parameter vectors $\theta = (\theta_1, \theta_2, \theta_3)$, $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, and $\gamma = (\gamma_1, \gamma_2, \gamma_3)$, denoted as $(X_1, X_2) \sim EBD(\theta, \lambda, \gamma)$, having joint cdf
$F_{EBD}(x_1, x_2) = F_D(x_1; \theta_1, \lambda_1, \gamma_1) F_D(x_2; \theta_2, \lambda_2, \gamma_2) F_D(z; \theta_3, \lambda_3, \gamma_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
Note that, when $\lambda_i = \lambda$ and $\gamma_i = \gamma$ for $i = 1, 2, 3$, it simplifies to the model $(X_1, X_2) \sim BD(\theta, \lambda, \gamma)$ defined by Muhammed [19].
Extended bivariate generalized Rayleigh model. The cdf of the GR distribution, also called the Burr type X model [31], is
$F_{GR}(u; \theta, \lambda) = \left( 1 - e^{-(\lambda u)^2} \right)^{\theta}, \quad \text{for } u > 0.$
If $U_i \sim GR(\theta_i, \lambda_i)$, $i = 1, 2, 3$, then the GBD model with a GR baseline distribution vector is an extended BGR model with parameter vectors $\theta = (\theta_1, \theta_2, \theta_3)$ and $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, $(X_1, X_2) \sim EBGR(\theta, \lambda)$, with joint cdf
$F_{EBGR}(x_1, x_2) = F_{GR}(x_1; \theta_1, \lambda_1) F_{GR}(x_2; \theta_2, \lambda_2) F_{GR}(z; \theta_3, \lambda_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
Hence, if $\lambda_i = \lambda$, $i = 1, 2, 3$, it is obtained that $(X_1, X_2) \sim BGR(\theta, \lambda)$ as given by Sarhan [20].
Extended bivariate Gumbel-G model. Alzaatreh et al. [32] proposed a transformed-transformer method for generating families of continuous distributions. From this method, a random variable U is said to follow a Gumbel-G model, $U \sim Gu\text{-}G(\theta, \alpha, \lambda)$, if its cdf can be expressed as
$F_{Gu\text{-}G}(u; G, \theta, \alpha, \lambda) = \exp\left( -\theta \left( \frac{1 - G(u; \lambda)}{G(u; \lambda)} \right)^{\alpha} \right), \quad \text{for } u > 0,$
where G is the transformer distribution with parameter vector $\lambda$. If $U_i \sim Gu\text{-}G(\theta_i, \alpha_i, \lambda_i)$, $i = 1, 2, 3$, then the GBD model with Gu-G baseline distribution vector is an extended BGu-G model, $(X_1, X_2) \sim EBGu\text{-}G(\theta, \alpha, \lambda_G)$, with parameters $\theta = (\theta_1, \theta_2, \theta_3)$, $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, and $\lambda_G = (\lambda_1, \lambda_2, \lambda_3)$, where $\lambda_G$ encompasses all parameter vectors of G in each baseline component. Thus, its joint cdf is given by
$F_{EBGu\text{-}G}(x_1, x_2) = F_{Gu\text{-}G}(x_1; G, \theta_1, \alpha_1, \lambda_1) F_{Gu\text{-}G}(x_2; G, \theta_2, \alpha_2, \lambda_2) F_{Gu\text{-}G}(z; G, \theta_3, \alpha_3, \lambda_3),$
for $x_1 > 0$, $x_2 > 0$, where $z = \min(x_1, x_2)$.
In particular, when $\alpha_i = \alpha$ and $\lambda_i = \lambda$ for $i = 1, 2, 3$, $(X_1, X_2) \sim BGu\text{-}G(\theta, \alpha, \lambda)$ as given by Eliwa and El-Morshedy [21].
Extended bivariate generalized inverted Kumaraswamy model. A random variable U is said to follow a GIK distribution, defined by Iqbal et al. [33], if its cdf is given by
$F_{GIK}(u; \theta, \alpha, \gamma) = \left( 1 - (1 + u^{\gamma})^{-\alpha} \right)^{\theta}, \quad \text{for } u > 0.$
If $U_i \sim GIK(\theta_i, \alpha_i, \gamma_i)$, $i = 1, 2, 3$, then the GBD model with GIK baseline distribution vector is an extended BGIK model, $(X_1, X_2) \sim EBGIK(\theta, \alpha, \gamma)$, with parameters $\theta = (\theta_1, \theta_2, \theta_3)$, $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, and $\gamma = (\gamma_1, \gamma_2, \gamma_3)$, and its joint cdf can be written as
$F_{EBGIK}(x_1, x_2) = F_{GIK}(x_1; \theta_1, \alpha_1, \gamma_1) F_{GIK}(x_2; \theta_2, \alpha_2, \gamma_2) F_{GIK}(z; \theta_3, \alpha_3, \gamma_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
It is straightforward to see that $(X_1, X_2) \sim BGIK(\theta, \alpha, \gamma)$, as analyzed by Muhammed [10], when $\alpha = \alpha_i$ and $\gamma = \gamma_i$ for $i = 1, 2, 3$.
Extended bivariate Burr type X-G model. From the transformed-transformer method of Alzaatreh et al. [32], a random variable U is said to follow a Burr X-G model, $U \sim BX\text{-}G(\theta, \lambda)$, if its cdf can be expressed as
$F_{BX\text{-}G}(u; G, \theta, \lambda) = \left( 1 - \exp\left( -\left( \frac{G(u; \lambda)}{1 - G(u; \lambda)} \right)^{2} \right) \right)^{\theta}, \quad \text{for } u > 0,$
where $\lambda$ is the parameter vector of the transformer distribution G.
If $U_i \sim BX\text{-}G(\theta_i, \lambda_i)$, $i = 1, 2, 3$, then the GBD model with BX-G baseline distribution vector is an extended BBX-G model, $(X_1, X_2) \sim EBBX\text{-}G(\theta, \lambda_G)$, with parameters $\theta = (\theta_1, \theta_2, \theta_3)$ and $\lambda_G = (\lambda_1, \lambda_2, \lambda_3)$, where $\lambda_G$ encompasses all parameter vectors of G in each baseline component. Then, its joint cdf can be expressed as
$F_{EBBX\text{-}G}(x_1, x_2) = F_{BX\text{-}G}(x_1; \theta_1, \lambda_1) F_{BX\text{-}G}(x_2; \theta_2, \lambda_2) F_{BX\text{-}G}(z; \theta_3, \lambda_3), \quad \text{for } x_1 > 0, x_2 > 0,$
where $z = \min(x_1, x_2)$.
In particular, if $\lambda = \lambda_i$ for $i = 1, 2, 3$, then $(X_1, X_2) \sim BBX\text{-}G(\theta, \lambda)$ as introduced by El-Morshedy et al. [12].
GBD models from different baseline components. In addition, a GBD model can be derived from baseline components $U_i$ belonging to different distribution families, which allows one to generate new bivariate distributions.
For illustrative purposes, Figure 1a–d display 3D surfaces of different joint pdfs given by Theorem 3, along with their contour plots. Here, $U_1$ and $U_2$ are taken identically distributed $GE(\theta, \lambda)$ with different shape and scale parameter values, and $U_3$ has a Weibull distribution $W(\lambda_3, 6)$ with scale parameter $\lambda_3$ and shape parameter $\alpha = 6$.
Figure 1 shows that some of these GBD models are multi-modal bivariate models. It indicates the variety of shapes available within the GBD family, depending on the baseline distribution components and the parameter values.
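Sampling from such a mixed-baseline model only requires drawing the three independent components and maximizing; the sketch below mixes GE and Weibull baselines in the spirit of Figure 1, with illustrative parameter values, and confirms that the shared latent factor puts positive mass on the diagonal:

```python
import math
import random

random.seed(2)

# Illustrative mixed baselines: U1, U2 ~ GE, U3 ~ Weibull with shape 6
def r_ge(theta, lam):
    # inverse-transform sampling: F^{-1}(p) = -log(1 - p**(1/theta)) / lam
    p = random.random()
    return -math.log(1.0 - p ** (1.0 / theta)) / lam

def r_weibull(scale, shape):
    # F(u) = 1 - exp(-(u/scale)**shape), inverted
    return scale * (-math.log(1.0 - random.random())) ** (1.0 / shape)

sample = []
for _ in range(10_000):
    w = r_weibull(1.0, 6.0)  # shared latent component U3
    sample.append((max(r_ge(2.0, 1.0), w), max(r_ge(2.0, 1.5), w)))

tie_frac = sum(1 for a, b in sample if a == b) / len(sample)
print(round(tie_frac, 2))  # positive: mass on the diagonal x1 = x2
```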

## 4. Distributional Properties

Here, we derive the marginal and conditional distributions of the GBD family, and the order statistics. Furthermore, some properties for particular baseline distribution vectors are provided.

#### 4.1. Marginal and Conditional Distributions

From Theorem 1, it is easy to obtain the marginal cdfs of the components $X_i$, which can be written as
$F_{X_i}(x_i) = F_{U_i}(x_i) F_{U_3}(x_i), \quad \text{with } i = 1, 2,$
and, when the pdf $f_{U_i}$ of $U_i$ exists, $i = 1, 2, 3$, the corresponding pdfs are given by
$f_{X_i}(x_i) = f_{U_i}(x_i) F_{U_3}(x_i) + F_{U_i}(x_i) f_{U_3}(x_i), \quad \text{with } i = 1, 2.$
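Since (6) is just the derivative of the product in (5), a quick numerical check is easy to set up; the exponential baselines below are an illustrative assumption:

```python
import math

lam_i, lam_3 = 1.0, 3.0  # illustrative exponential baselines

def F(u, lam):
    return 1.0 - math.exp(-lam * u)

def f(u, lam):
    return lam * math.exp(-lam * u)

def F_Xi(x):  # marginal cdf (5): F_Ui * F_U3
    return F(x, lam_i) * F(x, lam_3)

def f_Xi(x):  # claimed marginal pdf (6): f_Ui F_U3 + F_Ui f_U3
    return f(x, lam_i) * F(x, lam_3) + F(x, lam_i) * f(x, lam_3)

# the pdf should be the derivative of the cdf: central-difference check
x, h = 0.7, 1e-5
num = (F_Xi(x + h) - F_Xi(x - h)) / (2 * h)
print(round(num, 6), round(f_Xi(x), 6))
```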
For instance, suppose now that the $U_i$ have PRH distributions, in order to provide some preservation results for the PRH property on the marginals and its closure under exponentiation of the underlying distributions.
Proposition 1.
If $(X_1, X_2)$ has a GBD model formed by $U_i \sim PRH(\theta_i)$ with base distribution $F_{B_i}$ ($i = 1, 2, 3$), then $X_i \sim PRH(\theta_i + \theta_3)$ with base distribution $F_{B_i}^* = F_{B_i}^{\theta_i/(\theta_i + \theta_3)} F_{B_3}^{\theta_3/(\theta_i + \theta_3)}$. Moreover, when the base distribution is common, $F_B = F_{B_i}$, the $X_i$ also have the same base distribution $F_B$.
Proof.
It immediately follows from (5) and the EBPRH model, since $F_{U_i} = F_{B_i}^{\theta_i}$. □
Corollary 1.
If $U_i \sim PRH(\theta_i)$ with base distribution $F_{B_i}$, where $F_{B_i} \sim PRH(\lambda_i)$ with base distribution $\tilde{F}_{B_i}$ ($i = 1, 2, 3$), then $X_i \sim PRH(\theta_i \lambda_i + \theta_3 \lambda_3)$ with base distribution $F_{B_i}^* = \tilde{F}_{B_i}^{\theta_i \lambda_i/(\theta_i \lambda_i + \theta_3 \lambda_3)} \tilde{F}_{B_3}^{\theta_3 \lambda_3/(\theta_i \lambda_i + \theta_3 \lambda_3)}$. Moreover, if $\tilde{F}_B = \tilde{F}_{B_i}$ ($i = 1, 2, 3$), then the $X_i$ also have the same base distribution $\tilde{F}_B$.
In addition, Figure 2 displays the plots of the marginal pdfs of the GBD models depicted in Figure 1a–d.
Note that Figure 2a–d show some bimodal shapes for the marginal pdfs given by (6) of the GBD models represented in Figure 1a–d, whose joint pdfs also exhibit multi-modal shapes. In this setting, Proposition 1 might be used to generate bimodal distributions from the marginals of the GBD family by mixing different baseline distribution components, as in Figure 1.
Furthermore, we provide some results about the conditional distributions of a GBD model whose proof can be found in Appendix A.
Theorem 4.
If $(X_1, X_2)$ has a GBD model with baseline distribution vector $(F_{U_1}, F_{U_2}, F_{U_3})$, then
1.
The conditional distribution of $X_i$ given $X_j \le x_j$ ($i \ne j$), say $F_{i|X_j \le x_j}$, is an absolutely continuous cdf given by
$F_{i|X_j \le x_j}(x_i) = \begin{cases} \dfrac{F_{U_i}(x_i) F_{U_3}(x_i)}{F_{U_3}(x_j)}, & \text{if } x_i < x_j \\ F_{U_i}(x_i), & \text{if } x_i \ge x_j. \end{cases}$
2.
The conditional pdf of $X_i$ given $X_j = x_j$ ($i \ne j$), say $f_{i|X_j = x_j}$, is a convex combination of a degenerate distribution at $x_j$ and an absolutely continuous part, given by
$f_{i|X_j = x_j}(x_i) = \alpha_j I_{x_j}(x_i) + (1 - \alpha_j) f_{i|x_j, ac}(x_i),$
where $I_{x_j}$ is the indicator function of the point $x_j$, and $f_{i|x_j, ac}$ is the absolutely continuous part
$f_{i|X_j = x_j, ac}(x_i) = \frac{1}{1 - \alpha_j} \begin{cases} \dfrac{f_{X_i}(x_i) f_{U_j}(x_j)}{f_{X_j}(x_j)}, & \text{if } x_i < x_j \\ f_{U_i}(x_i), & \text{if } x_i > x_j \\ 0, & \text{if } x_i = x_j, \end{cases}$
and the mixing weight $\alpha_j$, constant with respect to $x_i$, is
$\alpha_j = \frac{F_{U_1}(x_j) F_{U_2}(x_j) f_{U_3}(x_j)}{f_{X_j}(x_j)}.$

#### 4.2. Minimum and Maximum Order Statistics

Now, we provide the cdfs of the maximum and minimum order statistics of a GBD model, which may be interpreted as the lifetimes of parallel and series systems based on the components of $(X_1, X_2)$.
Theorem 5.
Let $T_1 = \min(X_1, X_2)$ and $T_2 = \max(X_1, X_2)$ for a GBD model $(X_1, X_2)$ with baseline distribution vector $(F_{U_1}, F_{U_2}, F_{U_3})$. Then, their cdfs are given by
$F_{T_1}(x) = F_{U_3}(x) F_{U_{1:2}}(x) \quad \text{and} \quad F_{T_2}(x) = F_{U_{3:3}}(x),$
where $U_{1:2} = \min(U_1, U_2)$ and $U_{3:3} = \max(U_1, U_2, U_3)$.
Proof.
It is trivial from (1) and (5) by taking into account that $F_{T_2}(x) = F(x, x)$ and $F_{T_1}(x) = F_{X_1}(x) + F_{X_2}(x) - F_{T_2}(x)$. □
The pdfs $f_{T_1}$ and $f_{T_2}$ of the minimum and maximum statistics can be readily obtained by differentiating (7).
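The identities behind Theorem 5 can be verified empirically: $T_2 \le x$ requires all three baseline components to be below x, while $F_{T_1}$ follows by inclusion-exclusion. A Monte Carlo sketch under the illustrative assumption of exponential baselines with rates 1, 2, 3:

```python
import math
import random

random.seed(3)

rates = (1.0, 2.0, 3.0)  # illustrative exponential baselines

def F(u, lam):
    return 1.0 - math.exp(-lam * u)

x, n = 0.6, 200_000
c1 = c2 = 0
for _ in range(n):
    u = [random.expovariate(r) for r in rates]
    x1, x2 = max(u[0], u[2]), max(u[1], u[2])
    c1 += min(x1, x2) <= x   # series system T1
    c2 += max(x1, x2) <= x   # parallel system T2

# Theorem 5: F_T2(x) = F_U1 F_U2 F_U3 (all components below x);
# F_T1(x) = F_X1(x) + F_X2(x) - F_T2(x)
FT2 = F(x, 1.0) * F(x, 2.0) * F(x, 3.0)
FT1 = F(x, 1.0) * F(x, 3.0) + F(x, 2.0) * F(x, 3.0) - FT2
print(round(c1 / n, 3), round(FT1, 3), round(c2 / n, 3), round(FT2, 3))
```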
Furthermore, the PRH property is preserved by the maximum order statistic of a GBD model, which is immediately derived from Theorem 5.
Corollary 2.
If $U_i \sim PRH(\theta_i)$ with base distribution $F_{B_i}$ ($i = 1, 2, 3$), then $T_2 \sim PRH(\theta)$ with base $F_B^{(2)} = F_{B_1}^{\theta_1/\theta} F_{B_2}^{\theta_2/\theta} F_{B_3}^{\theta_3/\theta}$ and $\theta = \theta_1 + \theta_2 + \theta_3$. Moreover, when $F_B = F_{B_i}$ ($i = 1, 2, 3$), $T_2$ also has the same base distribution $F_B$.

## 5. Dependence and Stochastic Properties

In this section, we study various dependence and stochastic properties on the GBD family, its marginals and order statistics, and its copula representation. Notions of dependence and ageing for bivariate distributions can be found in Lai and Xie [34] and Balakrishnan and Lai [4]; see also Shaked and Shantikumar [35] for univariate and multivariate stochastic orders.

#### 5.1. GBD Model

Proposition 2.
If $(X_1, X_2)$ follows a GBD model, then $(X_1, X_2)$ is positive quadrant dependent (PQD).
Proof.
From (1) and (5), it is readily obtained that $F(x_1, x_2) \ge F_{X_1}(x_1) F_{X_2}(x_2)$, which means that any random vector $(X_1, X_2)$ having a GBD model is PQD. □
An immediate consequence of the PQD property is that $Cov(X_1, X_2) > 0$. Other important bivariate dependence properties are the following, whose proofs are provided in Appendix A.
Proposition 3.
Let $(X_1, X_2)$ be a random vector having a GBD model. Then:
1.
$(X_1, X_2)$ is left tail decreasing (LTD).
2.
$(X_1, X_2)$ is left corner set decreasing (LCSD).
3.
Its joint cdf F is totally positive of order 2 ($TP_2$).
Proof.
Note that F being $TP_2$ is equivalent to $(X_1, X_2)$ being LCSD, which in turn implies LTD (e.g., see Balakrishnan and Lai [4]). Thereby, we only have to prove (3). From the definition of the $TP_2$ property, it is equivalent to check that the following inequality holds:
$\frac{F(x) F(x')}{F(x \vee x') F(x \wedge x')} \le 1,$
for all $x$ and $x'$, where $x \vee x' = (\max(x_1, x_1'), \max(x_2, x_2'))$ and $x \wedge x' = (\min(x_1, x_1'), \min(x_2, x_2'))$. Hence, from (1), the inequality (8) can be expressed as
$\frac{F_{U_3}(u) F_{U_3}(v)}{F_{U_3}(w) F_{U_3}(y)} \le 1,$
where $u = x_1 \wedge x_2$, $v = x_1' \wedge x_2'$, $w = (x_1 \vee x_1') \wedge (x_2 \vee x_2')$, and $y = u \wedge v$. Moreover, one can observe that $y \le u \vee v = \max(u, v) \le w$.
Therefore, when $u \le v$, i.e., $y = u \le v \le w$, the inequality (8) simplifies to
$\frac{F_{U_3}(v)}{F_{U_3}(w)} \le 1,$
which is trivial, since $v \le w$ and $F_{U_3}$ is a cdf. An analogous development follows for $u > v$, which completes the proof. □
Let us now see some results related to the reversed hazard gradient of a random vector from the GBD family, defined as an extension of the univariate case (see Domma [36]),
$r(x) = (r_1(x), r_2(x)) = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2} \right) \ln F(x_1, x_2),$
where each $r_i(x)$ represents the reversed hazard function of $(X_i \mid X_j \le x_j)$, $i \ne j = 1, 2$, assuming that F is differentiable. In addition, $(X_1, X_2)$ is said to have a bivariate decreasing (increasing) reversed hazard gradient, BDRHG (BIRHG), if all components $r_i$ are decreasing (increasing) functions in the corresponding variables.
Proposition 4.
If $(X_1, X_2)$ has a GBD model with baseline distribution vector $(F_{U_1}, F_{U_2}, F_{U_3})$, then its reversed hazard gradient $r(x)$ is given by
$r_i(x) = \begin{cases} r_{U_i}(x_i) + r_{U_3}(x_i), & \text{if } x_i < x_j \\ r_{U_i}(x_i), & \text{if } x_i \ge x_j, \end{cases}$
for $i \ne j = 1, 2$, when the reversed hazard function $r_{U_i} = f_{U_i}/F_{U_i}$ of $U_i$ exists, $i = 1, 2, 3$.
Proof.
The proof is straightforward from the definition of the reversed hazard rate function corresponding to the conditional cdf $F_{i|X_j \le x_j}$ given in part (1) of Theorem 4. □
Theorem 6.
Let $(X_1, X_2)$ be a random vector having a GBD model. If the $U_i$ have decreasing reversed hazard functions (DRH), then $(X_1, X_2) \in BDRHG$.
Proof.
It is straightforward from Proposition 4. □
Note that Theorem 6 provides the closure of the DRH property under the formation of a GBD model. Thus, the bivariate extension of a DRH distribution $F_U$ generated by the GBD family is BDRHG.
Nevertheless, this does not hold for the increasing reversed hazard (IRH) property, since both $r_i(x)$ given in Proposition 4 have a negative jump discontinuity at $x_i = x_j$ for $i \ne j = 1, 2$. Therefore, if $U_i \in IRH$, then $(X_1, X_2)$ cannot be BIRHG.
Finally, we present some interesting stochastic ordering results between bivariate random vectors of GBD type.
Theorem 7.
Let $X = (X_1, X_2)$ and $Y = (Y_1, Y_2)$ have GBD models with baseline distribution vectors $(F_{U_1}, F_{U_2}, F_{U_3})$ and $(F_{V_1}, F_{V_2}, F_{V_3})$, respectively. If $U_i \le_{st} V_i$ ($i = 1, 2, 3$), then $X \le_{lo} Y$.
Proof.
The result immediately follows from the stochastic ordering between components and (1), since $U_i \le_{st} V_i$ is equivalent to $F_{U_i}(x) \ge F_{V_i}(x)$, and the lower orthant ordering is defined by the inequality $F_X(x) \ge F_Y(x)$ for all $x = (x_1, x_2)$. □
Corollary 3.
Let $X ∼ E B P R H ( θ , λ )$ and $Y ∼ E B P R H ( θ * , λ * )$ with base distributions $F B i$ and $F B i *$ ($i = 1 , 2 , 3$), respectively. If $θ i ≤ θ i *$ and $F B i ≤ s t F B i *$ ($i = 1 , 2 , 3$), then $X ≤ l o Y$.
Proof.
It is obvious that $F_{B_i}^{\theta_i}(x_i) \ge F_{B_i}^{\theta_i^*}(x_i) \ge F_{B_i^*}^{\theta_i^*}(x_i)$, i.e., $U_i \le_{st} V_i$, and then the proof readily follows from Theorem 7. □
Remark 1.
From Corollary 3, if both EBPRH models are based on a common base distribution vector, $F_{B_i} = F_{B_i^*}$ ($i = 1, 2, 3$), then $\theta_i \le \theta_i^*$ alone suffices for the lower orthant ordering to hold.

#### 5.2. Marginals and Order Statistics

Now, we study some stochastic properties of the marginals and the minimum and maximum order statistics of the GBD model.
Firstly, from (5) and (6), the reversed hazard function of the marginal $X i$s can be expressed as
$r_{X_i}(x) = \frac{f_{X_i}(x)}{F_{X_i}(x)} = r_{U_i}(x) + r_{U_3}(x), \quad i = 1, 2.$
Therefore, the DRH (IRH) property is inherited by the marginals.
Theorem 8.
If $( X 1 , X 2 )$ has a GBD model formed by $U i ∈ D R H$ ($i = 1 , 2 , 3$), then $X i ∈ D R H$ ($i = 1 , 2$).
Remark 2.
Note that the IRH distributions have upper bounded support [37]. Thus, if any $U_i$ is not upper bounded, its reversed hazard function is eventually decreasing, and then the marginal cannot be IRH. Therefore, it is necessary that $U_i \in IRH$ ($i = 1, 2, 3$) with a common upper bound in order for $X_i \in IRH$ ($i = 1, 2$).
Example 1.
Suppose $U i$s have extreme value distributions of type 3 with a common support, $U i ∼ E V 3 ( β , λ i , k i )$, whose cdf is defined by
$F_{U_i}(u) = \exp\{-\lambda_i (\beta - u)^{k_i}\}, \quad \text{for } u \le \beta$
and $F U i ( u ) = 1$ otherwise. Its reversed hazard function is given by
$r_{U_i}(u) = \lambda_i k_i (\beta - u)^{k_i - 1}, \quad \text{for } u \in (-\infty, \beta],$
which is increasing (decreasing) in its support for $k i ≤ ( ≥ ) 1$. Thus, if $k i ≤ ( ≥ ) 1$ ($i = 1 , 2 , 3$), then $U i ∈ I R H ( D R H )$, and, consequently, $X i ∈ I R H ( D R H )$ ($i = 1 , 2$).
Example 2.
If $(X_1, X_2)$ has an EBGE model, then its marginals are DRH, since $r_{X_i}$ given by (9) is the sum of two decreasing functions, because each $U_i \sim GE(\theta_i, \lambda_i)$ is a $PRH(\theta_i)$ with exponential baseline distribution
$r_{U_i}(u) = \theta_i\, r_{Exp(\lambda_i)}(u) = \frac{\theta_i \lambda_i}{e^{\lambda_i u} - 1},$
which is evidently a decreasing function. Here, $E x p ( λ )$ denotes an exponential random variable with mean $1 / λ$.
Remark 3.
From (9), when the $U i$s have a common distribution $F U$, then the marginals $X i ∼ P R H ( 2 )$ with base distribution $F U$. Therefore, $r X i ( x ) = 2 r U ( x )$ has the same monotonicity. In particular, if $F U ∈ D R H ( I R H )$, then $X i ∈ D R H ( I R H )$.
Remark 4.
From (9), if $U i ∼ P R H ( θ i )$ with the same base distribution $F B$, then $X i ∼ P R H ( θ i + θ 3 )$ with base $F B$, i.e., $r X i ( x ) = ( θ i + θ 3 ) r B ( x )$. Thus, Remark 3 also holds by using $F B$ instead of $F U$.
Secondly, the mean inactivity time (MIT), also called mean waiting time [37], of a random variable X is defined as
$m_X(x) = E(x - X \mid X \le x) = \int_{-\infty}^{x} \frac{F_X(y)}{F_X(x)}\, dy.$
Thus, from (5), the MIT of the marginal $X i$s of a GBD model can be derived by
$m_{X_i}(x) = \frac{1}{F_{U_i}(x) F_{U_3}(x)} \int_{-\infty}^{x} F_{U_i}(y) F_{U_3}(y)\, dy, \quad i = 1, 2.$
Here, we shall focus on two particular cases of GBD models, having baseline components with monotone MIT, which is preserved by the marginals.
Example 3.
Suppose $U i ∼ E x p ( λ )$, then its MIT can be expressed as
$m_{U_i}(u) = \frac{u}{1 - e^{-\lambda u}} - \frac{1}{\lambda},$
which is an increasing MIT function (IMIT), i.e., $U i ∈ I M I T$. From (10), we obtain the MIT function of the marginals $X i$s for the bivariate exponential version of GBD type,
$m_{X_i}(x) = \frac{2\lambda x - 3 + 4e^{-\lambda x} - e^{-2\lambda x}}{2\lambda (1 - e^{-\lambda x})^2}.$
Then, upon differentiation, $m X i ′ ( x )$ has the same sign as the expression $1 − e − 2 λ x − 2 λ x e − λ x$, which is positive, and therefore $X i ∈ I M I T$ ($i = 1 , 2$).
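As a quick numerical check (not part of the paper), the closed-form MIT of Example 3 can be compared against direct numerical integration of (10) for common exponential baselines; the values of $\lambda$ and $x$ below are arbitrary illustrations.

```python
import math

def marginal_cdf(y, lam):
    # F_{X_i}(y) = (1 - e^{-lam*y})^2 for the GBD model with
    # common exponential baselines U_i ~ Exp(lam)
    return (1.0 - math.exp(-lam * y)) ** 2

def mit_numeric(x, lam, steps=100000):
    # m_{X_i}(x) = (1/F_{X_i}(x)) * Integral_0^x F_{X_i}(y) dy,
    # computed with a simple midpoint rule
    h = x / steps
    total = sum(marginal_cdf((k + 0.5) * h, lam) for k in range(steps)) * h
    return total / marginal_cdf(x, lam)

def mit_closed(x, lam):
    # closed form of the marginal MIT from Example 3
    u = math.exp(-lam * x)
    return (2 * lam * x - 3 + 4 * u - u * u) / (2 * lam * (1 - u) ** 2)

lam, x = 0.5, 3.0
print(mit_numeric(x, lam), mit_closed(x, lam))   # should agree closely
```

The monotonicity claimed in Example 3 ($X_i \in IMIT$) can also be seen by evaluating `mit_closed` on an increasing grid.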
Example 4.
Suppose $U i ∼ E V 3 ( β , λ i , k = 2 )$, then its MIT can be expressed as
$m_{U_i}(u) = \frac{1}{e^{-\lambda_i (\beta - u)^2}} \int_{-\infty}^{u} e^{-\lambda_i (\beta - y)^2}\, dy = \frac{\Phi(u; \mu, \sigma_i)}{\phi(u; \mu, \sigma_i)}, \quad \text{for } u \le \beta$
where $\Phi(u; \mu, \sigma_i)$ and $\phi(u; \mu, \sigma_i)$ are the cdf and pdf of a normal model with $\mu = \beta$ and $\sigma_i = 1/\sqrt{2\lambda_i}$, respectively. Moreover, taking into account that a random variable and its standardized version have PRH functions, and that the standard normal distribution has the DRH property [38], we obtain that $U_i \in IMIT$.
Upon considering the cdf of $U i$s and (5), the marginal $X i ∼ E V 3 ( β , λ i + λ 3 , 2 )$ ($i = 1 , 2$). Thus, their MIT can be written as
$m_{X_i}(x) = \frac{\Phi(x; \mu, \tilde{\sigma}_i)}{\phi(x; \mu, \tilde{\sigma}_i)}, \quad \text{for } x \le \beta$
where $\tilde{\sigma}_i = 1/\sqrt{2(\lambda_i + \lambda_3)}$ for $i = 1, 2$, and, consequently, $X_i \in IMIT$.
On the other hand, the following stochastic orderings among the three baseline components of two GBD models are preserved by their corresponding marginals. The proof immediately follows from the definitions of the stochastic orderings.
Theorem 9.
Let $( X 1 , X 2 )$ and $( Y 1 , Y 2 )$ have GBD models with base distribution vectors $( F U 1 , F U 2 , F U 3 )$ and $( F V 1 , F V 2 , F V 3 )$, respectively.
1.
If $U i ≤ s t V i$ ($i = 1 , 2 , 3$), then $X i ≤ s t Y i$ ($i = 1 , 2$).
2.
If $U i ≤ r h V i$ ($i = 1 , 2 , 3$), then $X i ≤ r h Y i$ ($i = 1 , 2$).
Finally, we discuss some stochastic properties of the minimum and maximum order statistics of the GBD family. In this setting, from (7), the reversed hazard function of the maximum statistic $T 2$ of $( X 1 , X 2 )$ of GBD type is determined by the sum of the reversed hazard rates of the baseline distribution vector:
$r_{T_2}(x) = r_{U_1}(x) + r_{U_2}(x) + r_{U_3}(x)$
when the pdf $f_{U_i}$ of $U_i$ exists, $i = 1, 2, 3$. Hence, the following result is immediate.
Theorem 10.
If $U i ∈ D R H ( I R H )$ ($i = 1 , 2 , 3$), then $T 2 ∈ D R H ( I R H )$.
Example 5.
Suppose $U i ∼ E V 3 ( β , λ i , k i )$ ($i = 1 , 2 , 3$). Then, the reversed hazard function of $T 2$ is given by
$r_{T_2}(x) = \sum_{i=1}^{3} \lambda_i k_i (\beta - x)^{k_i - 1},$
and, therefore, if every $k i ≤ ( ≥ ) 1$, $i = 1 , 2 , 3$, then $r T 2$ is increasing (decreasing) in x, i.e., $T 2 ∈ I R H ( D R H )$.
Example 6.
If $U i ∼ G E ( θ i , λ i )$, then the maximum statistic of the EBGE model is $D R H$, $T 2 ∈ D R H$, since (11) is the sum of three decreasing functions.
Remark 5.
When the $U_i$s have a common distribution $F_U$, the GBD model has a maximum statistic whose cdf is $F_U^3$, and (11) can be written as $r_{T_2}(x) = 3 r_U(x)$. In particular, if $F_U \in DRH\ (IRH)$, then $T_2 \in DRH\ (IRH)$.
Remark 6.
From Corollary 2, if $U i ∼ P R H ( θ i )$ with the same base distribution $F B$, $T 2 ∼ P R H ( θ )$ with base $F B$ and $θ = θ 1 + θ 2 + θ 3$, i.e., $r T 2 ( x ) = θ r B ( x )$. Thus, $T 2 ∈ D R H ( I R H )$ if and only if $F B ∈ D R H ( I R H )$.
Furthermore, the MIT of the maximum statistic of a GBD model $( X 1 , X 2 )$ can be derived by
$m_{T_2}(x) = \frac{1}{F_{U_1}(x) F_{U_2}(x) F_{U_3}(x)} \int_{-\infty}^{x} F_{U_1}(y) F_{U_2}(y) F_{U_3}(y)\, dy,$
for each specific baseline distribution vector $(F_{U_1}, F_{U_2}, F_{U_3})$, when the integral exists. For instance, we consider a particular case similar to the one used in Example 4.
Example 7.
Suppose $( X 1 , X 2 )$ has a GBD model with $U i ∼ P R H ( θ i )$ and base distributions $F B i ∼ E V 3 ( β , λ i , k = 2 )$ for $i = 1 , 2 , 3$, then each component $U i ∼ E V 3 ( β , θ i λ i , 2 )$, and consequently, $U i ∈ I M I T$ for $i = 1 , 2 , 3$. Moreover, from Corollary 2, the maximum statistic $T 2 ∼ E V 3 ( β , θ * , 2 )$ with $θ * = θ 1 λ 1 + θ 2 λ 2 + θ 3 λ 3$. Thus, $T 2 ∈ I M I T$ which is obtained along the same line as Example 4, since
$m_{T_2}(x) = \frac{\Phi(x; \beta, (2\theta^*)^{-1/2})}{\phi(x; \beta, (2\theta^*)^{-1/2})}, \quad \text{for } x \le \beta.$
Regarding the minimum statistic $T_1$ of $(X_1, X_2)$ of GBD type, some preservation results are also obtained based on its reversed hazard rate $r_{T_1}$; the proofs are given in Appendix A. From (7), $r_{T_1}$ can be written as
$r_{T_1}(x) = r_{U_{1:2}}(x) + r_{U_3}(x).$
Theorem 11.
If $U i ∈ D R H$ ($i = 1 , 2 , 3$) and $U 1 : 2 ≤ r h U i$ ($i = 1 , 2$), then $T 1 ∈ D R H$.
Corollary 4.
If $U i ∈ D R H$ ($i = 1 , 2 , 3$) and $U 1 = s t U 2$, then $T 1 ∈ D R H$.
Example 8.
Suppose $U i ∼ G E ( θ , λ )$ for $i = 1 , 2$ and $U 3 ∼ G E ( θ 3 , λ 3 )$, then $U i ∈ D R H$, and, consequently, $T 1 ∈ D R H$ from Corollary 4.
Remark 7.
Note that, when the $U_i$s have a common distribution $F_U$, (12) can be expressed as $r_{T_1}(x) = r_U(x)\left(3 - \frac{2}{2 - F_U(x)}\right)$, and, from Corollary 4, it is immediate that, if $F_U \in DRH$, then $T_1 \in DRH$.
Theorem 12.
Let $( X 1 , X 2 )$ be a GBD model. Then, $T 1 ≤ r h T 2$.
Proof.
From (11) and (12), the statement is equivalent to $r U 1 : 2 ( x ) ≤ r U 2 : 2 ( x )$, which readily follows from Theorem 1.B.56 of Shaked and Shanthikumar [35], since the baseline components $U i$s are independent. □

#### 5.3. Copula and Related Association Measures

Let us see now the copula representation of the GBD family and some related dependence measures of interest in the analysis of two-dimensional data.
It is well known that the dependence between the random variables $X_1$ and $X_2$ is completely described by the joint cdf $F(x_1, x_2)$, and it is often represented by a copula, which describes the dependence structure separately from the marginal behaviour. In this setting, from Sklar's theorem (e.g., see [39]), if the marginal cdfs $F_{X_i}$ are absolutely continuous, then the joint cdf has a unique copula representation given by
$F(x_1, x_2) = C\left(F_{X_1}(x_1), F_{X_2}(x_2)\right),$
and, reciprocally, if $F_{X_i}^{-1}$ is the inverse function of $F_{X_i}$ ($i = 1, 2$), then there exists a unique copula C on $[0, 1]^2$ such that
$C(u_1, u_2) = F\left(F_{X_1}^{-1}(u_1), F_{X_2}^{-1}(u_2)\right).$
Now, we can derive the copula representation for the joint cdf of the GBD family as a function of its base distribution vector $( F U 1 , F U 2 , F U 3 )$. In order to do this, by using (5), the joint cdf (1) can be expressed as
$F(x_1, x_2) = F_{X_1}(x_1) F_{X_2}(x_2) \frac{F_{U_3}(\min(x_1, x_2))}{F_{U_3}(x_1) F_{U_3}(x_2)}$
and taking $u i = F X i ( x i )$, the associated copula for an arbitrary base distribution vector $( F U 1 , F U 2 , F U 3 )$ can be written as
$C(u_1, u_2) = \frac{u_1 u_2 \min(A_1(u_1), A_2(u_2))}{A_1(u_1) A_2(u_2)},$
where
$A_i(u_i) = F_{U_3}\left((F_{U_i} \times F_{U_3})^{-1}(u_i)\right), \quad i = 1, 2,$
which allows us to give an additional result.
Theorem 13.
Let $X = ( X 1 , X 2 )$ and $Y = ( Y 1 , Y 2 )$ be two GBD models with baseline distribution vectors $( F U 1 , F U 2 , F U 3 )$ and $( F V 1 , F V 2 , F V 3 )$, respectively. If $X$ and $Y$ have the same associated copula and $U i ≤ s t V i$, then $X ≤ s t Y$.
Proof.
It is immediate by using Theorem 6.B.14 of Shaked and Shanthikumar [35] and (5), since $U i ≤ s t V i$ implies $X i ≤ s t Y i$. □
Corollary 5.
Let $X = ( X 1 , X 2 )$ and $Y = ( Y 1 , Y 2 )$ be two GBD models with common baseline distributions, $F U$ and $F V$, respectively. If $U ≤ s t V$, then $X ≤ s t Y$.
Note that (13) provides a general formula to establish the specific copula upon considering two particular continuous and increasing bijective functions $A 1$ and $A 2$ from $[ 0 , 1 ]$ onto $[ 0 , 1 ]$. Fang and Li [40] analyzed some stochastic orderings for an equivalent copula representation to (13) with interesting applications in network security and insurance. In the last section, we shall use the bivariate copula representation (13) to discuss the multivariate extension of the GBD family.
Furthermore, (13) may be considered a generalization of the Marshall–Olkin copula, as displayed in the following results whose proofs are omitted.
Corollary 6.
If $( X 1 , X 2 )$ has a GBD model with a common base distribution $F U$, then the copula representation of its joint cdf is
$C(u_1, u_2) = \min\left(u_1 u_2^{1/2},\, u_1^{1/2} u_2\right).$
Corollary 7.
If $( X 1 , X 2 )$ has a GBD model with PRHs baseline distribution vector of the same base $F B$, i.e., $( X 1 , X 2 ) ∼ B P R H ( θ 1 , θ 2 , θ 3 )$, then the copula representation of its joint cdf is
$C(u_1, u_2) = \min\left(u_1 u_2^{\theta_2/(\theta_2 + \theta_3)},\, u_1^{\theta_1/(\theta_1 + \theta_3)} u_2\right).$
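As a quick numerical sanity check (not part of the paper), the following Python sketch evaluates the copula of Corollary 7 and verifies that it collapses to the copula of Corollary 6 when $\theta_1 = \theta_2 = \theta_3$, and that it respects the Fréchet upper bound $\min(u_1, u_2)$.

```python
import math

def c_bprh(u1, u2, t1, t2, t3):
    # copula of Corollary 7 (generalized Marshall-Olkin form)
    return min(u1 * u2 ** (t2 / (t2 + t3)), u1 ** (t1 / (t1 + t3)) * u2)

def c_common(u1, u2):
    # copula of Corollary 6 (common baseline distribution F_U)
    return min(u1 * math.sqrt(u2), math.sqrt(u1) * u2)

for u1, u2 in [(0.2, 0.7), (0.5, 0.5), (0.9, 0.1)]:
    # equal theta_i reduces Corollary 7 to Corollary 6
    assert abs(c_bprh(u1, u2, 1.0, 1.0, 1.0) - c_common(u1, u2)) < 1e-12
    # any copula is bounded above by min(u1, u2)
    assert c_bprh(u1, u2, 2.0, 1.0, 3.0) <= min(u1, u2) + 1e-12
```

The parameter values are arbitrary illustrations.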
Some association measures for a bivariate random vector $(X_1, X_2)$ of GBD type can be derived from the dependence structure described by the general expression (13) for each particular pair of continuous and increasing bijective functions $A_1$ and $A_2$ determined by the specific baseline distribution vector. For instance, for the special GBD models given in Corollaries 6 and 7, the dependence measures, namely, Kendall's tau, Spearman's rho, Blomqvist's beta, and the tail dependence coefficients (see Nelsen [39], among others), can be calculated as follows.
Kendall’s tau. The Kendall’s $τ$ is defined as the probability of concordance minus the probability of discordance between two pairs of independent and identically distributed random vectors, $( X 1 , X 2 )$ and $( Y 1 , Y 2 )$, as follows:
$\tau = P\left[(X_1 - Y_1)(X_2 - Y_2) > 0\right] - P\left[(X_1 - Y_1)(X_2 - Y_2) < 0\right],$
and it can be calculated through its copula representation $C ( u 1 , u 2 )$ by
$\tau = 4E\left[C(U_1, U_2)\right] - 1 = 1 - 4\iint_{[0,1]^2} \frac{\partial C(u_1, u_2)}{\partial u_1} \frac{\partial C(u_1, u_2)}{\partial u_2}\, du_1\, du_2$
with $U i$s uniform $[ 0 , 1 ]$ random variables whose joint cdf is C.
For example, if $( X 1 , X 2 )$ has a GBD model with a common baseline $F U$, upon substituting from the copula of Corollary 6 in (14), it is easy to check that Kendall’s $τ = 1 / 3$.
Analogously, from the copula given in Corollary 7 of the GBD model for $P R H ( θ i )$ components with a common base $F B$, the Kendall’s $τ$ coefficient (14) can be written as
$\tau = \frac{\theta_3}{\theta_1 + \theta_2 + \theta_3}.$
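This closed form can be checked by simulation. The sketch below (an illustration, not from the paper) assumes a uniform base distribution $F_B$ on (0,1), so that $U_i \sim PRH(\theta_i)$ has cdf $u^{\theta_i}$ and is sampled by inverse transform; the sample Kendall coefficient is then compared with $\theta_3/(\theta_1 + \theta_2 + \theta_3)$.

```python
import random

def sample_bprh(n, t1, t2, t3, rng):
    # BPRH construction with uniform base F_B: U_i = V^{1/theta_i}
    # (inverse transform for the cdf u^{theta_i}), X_i = max(U_i, U_3)
    out = []
    for _ in range(n):
        u1 = rng.random() ** (1.0 / t1)
        u2 = rng.random() ** (1.0 / t2)
        u3 = rng.random() ** (1.0 / t3)
        out.append((max(u1, u3), max(u2, u3)))
    return out

def kendall_tau(data):
    # sample tau: concordant minus discordant pairs over n(n-1)/2
    n, s = len(data), 0
    for i in range(n):
        for j in range(i + 1, n):
            d = (data[i][0] - data[j][0]) * (data[i][1] - data[j][1])
            s += (d > 0) - (d < 0)
    return 2.0 * s / (n * (n - 1))

rng = random.Random(7)
t1, t2, t3 = 1.0, 2.0, 1.5
tau_hat = kendall_tau(sample_bprh(1500, t1, t2, t3, rng))
print(tau_hat, t3 / (t1 + t2 + t3))   # estimate vs closed form 1/3
```

The parameter values and sample size are arbitrary choices for illustration.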
Spearman’s rho. The Spearman’s $\rho$ coefficient measures the dependence through three pairs of independent and identically distributed random vectors, $(X_1, X_2)$, $(Y_1, Y_2)$, and $(Z_1, Z_2)$. It is defined as
$\rho = 3\left[P\left((X_1 - Y_1)(X_2 - Z_2) > 0\right) - P\left((X_1 - Y_1)(X_2 - Z_2) < 0\right)\right],$
which can be computed by its copula representation $C ( u 1 , u 2 )$ by
$\rho = 12 E\left[U_1 U_2\right] - 3.$
Thus, if there is a common base distribution as in Corollary 6, the Spearman’s $ρ$ coefficient between $X 1$ and $X 2$ is $ρ = 3 / 7$.
In the case of $U i ∼ P R H ( θ i )$ with a common base distribution $F B$, from (15) and Corollary 7, this association measure is
$\rho = \frac{3\theta_3}{2\theta_1 + 2\theta_2 + 3\theta_3}$
which coincides with the one obtained by Kundu et al. [9] for this specific GBD model, $(X_1, X_2) \sim BPRH(\theta_1, \theta_2, \theta_3)$. As remarked by Kundu et al. [9] for the BPRH model, both coefficients, $\tau$ and $\rho$, vary between 0 and 1 as $\theta_3$ varies from 0 to $\infty$.
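The value of $\rho$ for the copula of Corollary 7 can also be recovered numerically, using the standard equivalent expression $\rho = 12\iint_{[0,1]^2} C(u_1, u_2)\, du_1\, du_2 - 3$. A midpoint-rule sketch (illustrative only; the grid size and parameter values are arbitrary choices):

```python
def c_bprh(u1, u2, t1, t2, t3):
    # copula of Corollary 7
    return min(u1 * u2 ** (t2 / (t2 + t3)), u1 ** (t1 / (t1 + t3)) * u2)

def spearman_rho(t1, t2, t3, n=400):
    # rho = 12 * int_[0,1]^2 C(u1, u2) du1 du2 - 3, midpoint rule
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u1 = (i + 0.5) * h
        for j in range(n):
            total += c_bprh(u1, (j + 0.5) * h, t1, t2, t3)
    return 12.0 * total * h * h - 3.0

t1, t2, t3 = 2.0, 1.0, 3.0
rho_num = spearman_rho(t1, t2, t3)
rho_closed = 3 * t3 / (2 * t1 + 2 * t2 + 3 * t3)
print(rho_num, rho_closed)   # both close to 0.6
```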
Blomqvist’s Beta. The Blomqvist’s $β$ coefficient, also called the medial correlation coefficient, is defined as the probability of concordance minus the probability of discordance between $( X 1 , X 2 )$ and its median point, say $( m 1 , m 2 )$, taking the following form:
$\beta = P\left[(X_1 - m_1)(X_2 - m_2) > 0\right] - P\left[(X_1 - m_1)(X_2 - m_2) < 0\right] = 4F(m_1, m_2) - 1,$
and from the copula of its joint cdf F, it can be expressed as
$β = 4 C ( 1 / 2 , 1 / 2 ) − 1 .$
In the case of Corollary 6, it is immediate that the medial correlation coefficient between $X_1$ and $X_2$ is $\beta = \sqrt{2} - 1$ when $(X_1, X_2)$ follows a GBD model with a common baseline distribution.
In the other case, from Corollary 7, the Blomqvist’s $β$ coefficient (16) is also readily obtainable between the marginals of a BPRH model:
$\beta = \begin{cases} 2^{\theta_3/(\theta_2 + \theta_3)} - 1, & \text{if } \theta_1 \le \theta_2 \\ 2^{\theta_3/(\theta_1 + \theta_3)} - 1, & \text{if } \theta_1 > \theta_2, \end{cases}$
which takes values between 0 and 1 as $\theta_3$ varies from 0 to $\infty$.
Tail Dependence. The tail dependence measures the association of extreme events in both directions: the upper (lower) tail dependence coefficient $\lambda_U$ ($\lambda_L$) provides an asymptotic association measure in the upper (lower) quadrant tail of a bivariate random vector, given (when the limit exists) by
$\lambda_U\, (\lambda_L) = \lim_{u \to 1^-\, (0^+)} P\left(X_2 > (\le)\, F_{X_2}^{-1}(u) \mid X_1 > (\le)\, F_{X_1}^{-1}(u)\right).$
Similar to the above association coefficients, the tail dependence indexes can be calculated from the copula representation $C ( u 1 , u 2 )$ of the joint cdf of $( X 1 , X 2 )$, as follows:
$\lambda_U = 2 - \lim_{u \to 1^-} \frac{1 - C(u, u)}{1 - u} \quad \text{and} \quad \lambda_L = \lim_{u \to 0^+} \frac{C(u, u)}{u}.$
In particular, if $( X 1 , X 2 )$ follows a GBD model with a common baseline distribution, upon substituting from the copula of Corollary 6 in (17), it is easy to check that $λ L = 0$ and $λ U = 1 / 2$.
In the case of $U i ∼ P R H ( θ i )$ with the same base, from (17) and Corollary 7, it is clear that the tail dependence indexes of the BPRH model are $λ L = 0$ and
$\lambda_U = \begin{cases} \theta_3/(\theta_2 + \theta_3), & \text{if } \theta_1 \le \theta_2 \\ \theta_3/(\theta_1 + \theta_3), & \text{if } \theta_1 > \theta_2, \end{cases}$
which takes values between 0 and 1 as $\theta_3$ varies from 0 to $\infty$.
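The piecewise expressions for $\beta$ and $\lambda_U$ of the BPRH model can be checked directly from the copula of Corollary 7, using $\beta = 4C(1/2, 1/2) - 1$ and the upper-tail limit in (17). A short Python sketch (the parameter values are arbitrary illustrations of the $\theta_1 \le \theta_2$ branch):

```python
def c_bprh(u1, u2, t1, t2, t3):
    # copula of Corollary 7
    return min(u1 * u2 ** (t2 / (t2 + t3)), u1 ** (t1 / (t1 + t3)) * u2)

def blomqvist_beta(t1, t2, t3):
    # beta = 4 * C(1/2, 1/2) - 1
    return 4.0 * c_bprh(0.5, 0.5, t1, t2, t3) - 1.0

def upper_tail(t1, t2, t3, eps=1e-6):
    # lambda_U = 2 - lim_{u -> 1^-} (1 - C(u, u)) / (1 - u),
    # approximated at u = 1 - eps
    u = 1.0 - eps
    return 2.0 - (1.0 - c_bprh(u, u, t1, t2, t3)) / eps

t1, t2, t3 = 1.0, 2.0, 1.5    # theta_1 <= theta_2 branch
print(blomqvist_beta(t1, t2, t3), 2 ** (t3 / (t2 + t3)) - 1)
print(upper_tail(t1, t2, t3), t3 / (t2 + t3))
```

In each printed pair, the numerical value matches the corresponding closed form.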

## 6. Maximum Likelihood Estimation

In this section, we address the problem of computing the maximum likelihood estimations (MLEs) of the unknown parameters based on a random sample. The problem can be formulated as follows. Suppose ${ ( x 1 i , x 2 i ) ; i = 1 , … , n }$ is a random sample of size n from a GBD model, where it is assumed that, for $j = 1 , 2 , 3$, $U j$ has the pdf $f U j ( u ; θ j )$ and $θ j$ is of dimension $p j$. The objective is to estimate the unknown parameter vector $θ = ( θ 1 , θ 2 , θ 3 )$. We use the following partition of the sample:
$I_1 = \{i : x_{1i} < x_{2i}\}, \quad I_2 = \{i : x_{1i} > x_{2i}\}, \quad I_0 = \{i : x_{1i} = x_{2i} = x_i\}.$
Based on the above observations, the log-likelihood function becomes
$\ell(\theta) = \sum_{i \in I_0} \ln f_0(x_i; \theta) + \sum_{i \in I_1} \ln f_1(x_{1i}, x_{2i}; \theta) + \sum_{i \in I_2} \ln f_2(x_{1i}, x_{2i}; \theta),$
where $f 0 ( x i ; θ )$, $f 1 ( x 1 i , x 2 i ; θ )$, $f 2 ( x 1 i , x 2 i ; θ )$ have been defined in Theorem 3.
Here, it is difficult to compute the MLEs of the unknown parameter vector $\theta$ by solving an optimization problem of dimension $p_1 + p_2 + p_3$. To avoid that, we suggest using the EM algorithm, whose basic idea is to consider a random sample of size n from $(U_1, U_2, U_3)$ instead of the random sample of size n from $(X_1, X_2)$. From the observed sample $\{(x_{1i}, x_{2i})\}$, the sample $\{(u_{1i}, u_{2i}, u_{3i}); i = 1, \ldots, n\}$ has missing values, as shown in Table 1. It is immediate that the MLEs of $\theta_1$, $\theta_2$ and $\theta_3$ can be obtained by solving the following three optimization problems of dimensions $p_1$, $p_2$ and $p_3$, respectively,
$\ell_j(\theta_j) = \sum_{i=1}^{n} \ln f_{U_j}(u_{ji}; \theta_j); \quad j = 1, 2, 3,$
which are computationally more tractable.
From Table 1, if $i \in I_0$, then $u_{3i}$ is known, and $u_{1i}$ and $u_{2i}$ are unknown. Similarly, if $i \in I_1$ ($i \in I_2$), then $u_{2i}$ ($u_{1i}$) and $\max\{u_{1i}, u_{3i}\}$ ($\max\{u_{2i}, u_{3i}\}$) are known. Hence, in the E-step of the EM algorithm, the 'pseudo' log-likelihood function is formed by replacing each missing $u_{ji}$ by its expected value, $u_{ji}^m(\theta)$, for $i = 1, \ldots, n$ and $j = 1, 2, 3$:
• If $i ∈ I 0$, then
$u_{ji}^m(\theta) = E(U_j \mid U_j < x_i) = \frac{1}{F_{U_j}(x_i)} \int_{-\infty}^{x_i} u f_{U_j}(u)\, du, \quad j = 1, 2.$
• If $i ∈ I 1$ and $j , k ∈ { 1 , 3 } , j ≠ k$, then
$u_{ji}^m(\theta) = E(U_j \mid \max\{U_1, U_3\} = x_{1i}) = x_{1i} P(U_j > U_k) + P(U_j < U_k) \frac{1}{F_{U_j}(x_{1i})} \int_{-\infty}^{x_{1i}} u f_{U_j}(u)\, du = x_{1i} \int_{-\infty}^{\infty} f_{U_j}(u) F_{U_k}(u)\, du + \frac{1}{F_{U_j}(x_{1i})} \int_{-\infty}^{\infty} f_{U_k}(u) F_{U_j}(u)\, du \int_{-\infty}^{x_{1i}} u f_{U_j}(u)\, du.$
• If $i ∈ I 2$ and $j , k ∈ { 2 , 3 } , j ≠ k$, then
$u_{ji}^m(\theta) = E(U_j \mid \max\{U_2, U_3\} = x_{2i}) = x_{2i} P(U_j > U_k) + P(U_j < U_k) \frac{1}{F_{U_j}(x_{2i})} \int_{-\infty}^{x_{2i}} u f_{U_j}(u)\, du = x_{2i} \int_{-\infty}^{\infty} f_{U_j}(u) F_{U_k}(u)\, du + \frac{1}{F_{U_j}(x_{2i})} \int_{-\infty}^{\infty} f_{U_k}(u) F_{U_j}(u)\, du \int_{-\infty}^{x_{2i}} u f_{U_j}(u)\, du.$
Therefore, we propose the following EM algorithm to compute the MLEs of $\theta$. Suppose that, at the k-th iteration, the value of $\theta$ is $\theta^{(k)} = (\theta_1^{(k)}, \theta_2^{(k)}, \theta_3^{(k)})$; then, the following steps can be used to compute $\theta^{(k+1)}$:
E-step
• At the k-th step for $i ∈ I 0$, obtain the missing $u 1 i$ and $u 2 i$ as $u 1 i m ( θ ( k ) )$ and $u 2 i m ( θ ( k ) )$, respectively. For $i ∈ I 1$ obtain the missing $u 1 i$ and $u 3 i$ as $u 1 i m ( θ ( k ) )$ and $u 3 i m ( θ ( k ) )$, respectively. Similarly, for $i ∈ I 2$, obtain the missing $u 2 i$ and $u 3 i$ as $u 2 i m ( θ ( k ) )$ and $u 3 i m ( θ ( k ) )$, respectively.
• Form the 'pseudo' log-likelihood function as $\ell_s^{(k)}(\theta) = \ell_{1s}^{(k)}(\theta_1) + \ell_{2s}^{(k)}(\theta_2) + \ell_{3s}^{(k)}(\theta_3)$, where
$\ell_{1s}^{(k)}(\theta_1) = \sum_{i \in I_0} \ln f_{U_1}(u_{1i}^m(\theta^{(k)}); \theta_1) + \sum_{i \in I_1} \ln f_{U_1}(u_{1i}^m(\theta^{(k)}); \theta_1) + \sum_{i \in I_2} \ln f_{U_1}(u_{1i}; \theta_1)$
$\ell_{2s}^{(k)}(\theta_2) = \sum_{i \in I_0} \ln f_{U_2}(u_{2i}^m(\theta^{(k)}); \theta_2) + \sum_{i \in I_1} \ln f_{U_2}(u_{2i}; \theta_2) + \sum_{i \in I_2} \ln f_{U_2}(u_{2i}^m(\theta^{(k)}); \theta_2)$
$\ell_{3s}^{(k)}(\theta_3) = \sum_{i \in I_0} \ln f_{U_3}(u_{3i}; \theta_3) + \sum_{i \in I_1} \ln f_{U_3}(u_{3i}^m(\theta^{(k)}); \theta_3) + \sum_{i \in I_2} \ln f_{U_3}(u_{3i}^m(\theta^{(k)}); \theta_3).$
M-step
• $θ ( k + 1 ) = ( θ 1 ( k + 1 ) , θ 2 ( k + 1 ) , θ 3 ( k + 1 ) )$ can be obtained by maximizing $ℓ 1 s ( k ) ( θ 1 )$, $ℓ 2 s ( k ) ( θ 2 )$ and $ℓ 3 s ( k ) ( θ 3 )$ with respect to $θ 1$, $θ 2$ and $θ 3$, respectively.
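To make the steps concrete, the following Python sketch (not from the paper) specializes the EM algorithm to Model I of the next section, with exponential baselines $U_j \sim Exp(\lambda_j)$; the conditional means, the competing-risk probabilities $P(U_j > U_k) = \lambda_k/(\lambda_j + \lambda_k)$, and the exponential M-step $\lambda_j = n/\sum_i u_{ji}$ are closed forms specific to this assumption.

```python
import math, random

def cond_mean(lam, x):
    # E(U | U < x) for U ~ Exp(lam)
    return 1.0 / lam - x * math.exp(-lam * x) / (1.0 - math.exp(-lam * x))

def em_model1(data, lam, tol=1e-6, max_iter=500):
    # data: list of (x1, x2); lam: initial (lam1, lam2, lam3)
    n = len(data)
    l1, l2, l3 = lam
    for _ in range(max_iter):
        s1 = s2 = s3 = 0.0
        for x1, x2 in data:                  # E-step: pseudo-complete sums
            if x1 == x2:                     # I0: u3 = x1; u1, u2 missing
                s1 += cond_mean(l1, x1)
                s2 += cond_mean(l2, x1)
                s3 += x1
            elif x1 < x2:                    # I1: u2 = x2; max(u1,u3) = x1
                p1 = l3 / (l1 + l3)          # P(U1 > U3)
                s1 += x1 * p1 + (1 - p1) * cond_mean(l1, x1)
                s2 += x2
                s3 += x1 * (1 - p1) + p1 * cond_mean(l3, x1)
            else:                            # I2: u1 = x1; max(u2,u3) = x2
                p2 = l3 / (l2 + l3)          # P(U2 > U3)
                s1 += x1
                s2 += x2 * p2 + (1 - p2) * cond_mean(l2, x2)
                s3 += x2 * (1 - p2) + p2 * cond_mean(l3, x2)
        new = (n / s1, n / s2, n / s3)       # M-step: exponential MLEs
        if max(abs(a - b) for a, b in zip(new, (l1, l2, l3))) < tol:
            return new
        l1, l2, l3 = new
    return l1, l2, l3

# simulate from Model I and recover the rates (values are illustrative)
rng = random.Random(1)
true_rates = (0.03, 0.05, 0.04)
data = []
for _ in range(1000):
    u = [rng.expovariate(l) for l in true_rates]
    data.append((max(u[0], u[2]), max(u[1], u[2])))
print(em_model1(data, (1.0, 1.0, 1.0)))
```

With simulated data, the recovered rates are close to the generating values; the stopping rule (absolute change below a tolerance) mimics the convergence checks reported in the examples of the next section.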
Mainly for illustrative purposes, two particular GBD models will be applied in the next section to show the usefulness of the above EM algorithm. Firstly, we shall consider a GBD model with baseline components having the same distribution type and different underlying parameters. Secondly, we shall use a GBD model with baseline components from different distribution families. The technical details of both of them can be found in Appendix B.

## 7. Data Analysis

In this section, we present the analysis of two-dimensional data sets in order to show how the proposed EM algorithm can be applied to fit particular GBD models. For that, we shall suppose the following two models described in Appendix B: Model I is the GBD model with the exponential baseline distributions and different underlying parameters, $U j ∼ E x p ( λ j )$ ($j = 1 , 2 , 3$). Model II is the GBD model with baseline components from Weibull and generalized exponential distributions, $U 1 ∼ W ( λ 1 , α 1 )$, $U 2 ∼ W ( λ 2 , α 2 )$ and $U 3 ∼ G E ( α 3 , λ 3 )$.

#### 7.1. Soccer Data

We have analyzed a UEFA Champions League data set [41] from the 2004–2005 and 2005–2006 seasons. The data set contains the matches in which at least one goal was scored by a direct kick (penalty kick, foul kick, or any other direct kick) by either team and at least one goal was scored by the home team. In the bivariate data $(X_1, X_2)$, $X_1$ represents the time in minutes of the first kick goal scored by any team, and $X_2$ represents the time in minutes of the first goal scored by the home team. Clearly, all possibilities occur in the data set, namely $X_1 < X_2$, $X_1 > X_2$, and $X_1 = X_2$.
Meintanis [41] analyzed this data set using the Marshall–Olkin bivariate exponential model. The marginals of the Marshall–Olkin bivariate exponential distribution are exponential and thus have constant hazard functions. A preliminary data analysis indicated that the empirical hazard functions of both marginals are increasing and their reversed hazard functions are decreasing. Hence, it may not be proper to use the Marshall–Olkin bivariate exponential model to analyze these data.
Example 9.
In order to use Model I, we have taken the initial guess $\lambda_1^{(0)} = \lambda_2^{(0)} = \lambda_3^{(0)} = 1$. The algorithm stops after eight iterations, and the final estimates and the associated 95% confidence intervals are $\hat{\lambda}_1 = 0.03126\ (\pm 0.01121)$, $\hat{\lambda}_2 = 0.04630\ (\pm 0.01563)$ and $\hat{\lambda}_3 = 0.04269\ (\pm 0.01875)$, with $-257.8871$ being the pseudo log-likelihood value. To check whether it has converged to the maximum, the performance of the EM algorithm may be compared with the experimental results obtained by a quasi-Newton method for constrained nonlinear optimization, which are summarized in Appendix C together with the corresponding results for the subsequent examples.
One natural question is whether Model I fits the bivariate data. We have computed the Kolmogorov–Smirnov (KS) distances, with the corresponding p-values, between the empirical and fitted cdfs for the marginals and the maximum order statistic. The results are reported in Table 2, and, from them, we cannot reject the null hypothesis that these data come from the GBD model with exponential baseline distributions.
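For reference, the KS distance itself is simple to compute: for Model I, (5) gives the fitted marginal cdf $F_{X_i}(x) = (1 - e^{-\lambda_i x})(1 - e^{-\lambda_3 x})$, and the one-sample KS statistic is the largest gap between this cdf and the empirical one. A stdlib Python sketch; the sample values below are hypothetical, and the parameter values are borrowed from Example 9 purely for illustration.

```python
import math

def ks_distance(sample, cdf):
    # one-sample Kolmogorov-Smirnov statistic:
    # D = sup_x |F_n(x) - F(x)|, attained at the sorted sample points
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return d

def marginal_cdf(lam_i, lam3):
    # fitted marginal of Model I: F_{X_i}(x) = F_{U_i}(x) * F_{U_3}(x)
    return lambda x: (1 - math.exp(-lam_i * x)) * (1 - math.exp(-lam3 * x))

# illustrative only: rates from Example 9 and a hypothetical sample
cdf1 = marginal_cdf(0.03126, 0.04269)
sample = [10.0, 18.5, 26.0, 33.0, 41.5, 53.0, 66.0, 80.0]
print(ks_distance(sample, cdf1))
```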
Example 10.
Let us consider now Model II. We have started the EM algorithm with the initial guesses $\alpha_1^{(0)} = \alpha_2^{(0)} = \alpha_3^{(0)} = 1$, $\lambda_1^{(0)} = 0.03$, $\lambda_2^{(0)} = 0.05$ and $\lambda_3^{(0)} = 0.04$. The algorithm converges in nineteen iterations, and the final estimates and the associated 95% confidence intervals are $\hat{\alpha}_1 = 1.2987\ (\pm 0.3124)$, $\hat{\lambda}_1 = 0.0097\ (\pm 0.0005)$, $\hat{\alpha}_2 = 0.8047\ (\pm 0.2823)$, $\hat{\lambda}_2 = 0.0093\ (\pm 0.0021)$, $\hat{\alpha}_3 = 1.0037\ (\pm 0.2879)$, $\hat{\lambda}_3 = 0.0369\ (\pm 0.008)$, with $-201.1141$ being the pseudo log-likelihood value.
The KS distances with the corresponding p-values for the marginals and the maximum statistic are reported in Table 2. Thus, based on the p-values, we can say that the GBD model with two baseline Weibull distributions and the third GE one fits the data reasonably well.
Summarizing, both GBD models provide a good fit to the given data set, and the EM algorithm works quite effectively in both cases. To determine which of Models I and II (Examples 9 and 10) provides the better fit, we compute the Akaike information criterion (AIC) and Bayesian information criterion (BIC) values, which are also presented in Table 2. Based on the AIC and BIC values, it is clear that Model I provides a better fit than Model II to the UEFA Champions League data set.

#### 7.2. Diabetic Retinopathy Data

Let us consider now the diabetic retinopathy data set [42], available in the R package “SurvCor” [43]. Such data were investigated by the National Eye Institute to assess the effect of laser photocoagulation in delaying the onset of severe visual loss such as blindness in 197 patients with diabetic retinopathy. For each patient, one eye was randomly selected for laser photocoagulation and the other was given no treatment, being used as the control. The times to blindness in both eyes were recorded in months and the censoring was caused by death, dropout, or the end of the study.
For illustrative purposes, we have considered those patients for which complete data are available. Here, $X 1$ denotes the time to the blindness of the untreated or control eye and $X 2$ denotes the time to blindness of the treated eye. Out of 197 patients, we have complete information of $X 1$ and $X 2$ for 38 patients.
Example 11.
As in Example 9, we have used Model I to analyze the data set, with the same initial guess $\lambda_1^{(0)} = \lambda_2^{(0)} = \lambda_3^{(0)} = 1$. The proposed EM algorithm stops after 14 iterations, and the estimates of the unknown parameters and the corresponding 95% confidence intervals are $\hat{\lambda}_1 = 0.0653\ (\pm 0.0175)$, $\hat{\lambda}_2 = 0.0737\ (\pm 0.0210)$ and $\hat{\lambda}_3 = 0.1345\ (\pm 0.3879)$, with $-172.2314$ being the associated pseudo log-likelihood value.
The KS distances with the corresponding p-values between the empirical and fitted cdfs for the marginals and the maximum statistic are presented in Table 3.
Example 12.
As in Example 10, we have analyzed the data set by using Model II, starting the EM algorithm with the initial guesses $\alpha_1^{(0)} = \alpha_2^{(0)} = \alpha_3^{(0)} = 1$, $\lambda_1^{(0)} = 0.06$, $\lambda_2^{(0)} = 0.07$ and $\lambda_3^{(0)} = 0.13$. The algorithm stops after 27 iterations, and the final estimates and the corresponding 95% confidence intervals are $\hat{\alpha}_1 = 1.0937\ (\pm 0.2563)$, $\hat{\lambda}_1 = 0.0447\ (\pm 0.0146)$, $\hat{\alpha}_2 = 0.5851\ (\pm 0.1345)$, $\hat{\lambda}_2 = 0.2369\ (\pm 0.0763)$, $\hat{\alpha}_3 = 0.8995\ (\pm 0.2787)$, $\hat{\lambda}_3 = 0.1898\ (\pm 0.0478)$, with $-125.4519$ being the associated pseudo log-likelihood value.
The KS distances with the corresponding p-values for the marginals and the maximum order statistic are presented in Table 3.
From Table 3, we can also say that the estimated GBD models fit the diabetic retinopathy data reasonably well in both cases. Moreover, we present the AIC and BIC values of the two models in Table 3. Based on the AIC and BIC values, it is clear that Model I provides a better fit than Model II for the diabetic retinopathy data.

## 8. Discussion and Conclusions

In this paper, we have presented the generalized bivariate distribution family by a generator system based on the maximization process from any three-dimensional baseline continuous distribution vector with independent components, providing bivariate models with dependence structure.
For the proposed GBD family, several distributional and stochastic properties have been established: the preservation of the PRH property for the marginals and the maximum order statistic, the positive dependence between the marginals of the GBD models, several stochastic ordering results, and the preservation of the monotonicity of the reversed hazard function and of the mean inactivity time. Furthermore, the copula representation of the GBD model has been discussed, providing a general formula, and some related dependence measures have been calculated for specific copulas of particular bivariate distributions of the GBD family. In addition, new bivariate distributions can be generated by combining independent baseline components from different distribution families, and several bivariate distributions given in the literature are derived as particular cases of the GBD family.
Note that, even in the simple case, the MLEs cannot be obtained in explicit form, and computing them requires solving a multidimensional nonlinear optimization problem. We have proposed using an EM algorithm to compute the MLEs of the unknown parameters, and the proposed EM algorithm performs quite satisfactorily in the two data analyses by using two different models of the GBD family. The experimental results summarized in Table A1 disclose the efficiency of the EM algorithm with respect to a conventional numerical iterative procedure of the Newton type. In more detail, Table A1 presents the experimental results obtained by the Broyden–Fletcher–Goldfarb–Shanno algorithm for maximizing the log-likelihood function, available in the R package "maxLik" [44].
It is worth mentioning that the bivariate copula representation (13) allows us to discuss its multivariate extension. Let $U_i$, $i = 1, \ldots, q + 1$, be $q + 1$ mutually independent random variables with arbitrary continuous distribution functions, and denote by $F_{U_i}$ the cdf of each $U_i$. Similarly to (1), the joint cdf of the q-dimensional random vector $(X_1, \ldots, X_q)$ with $X_i = \max(U_i, U_{q+1})$ is given by
$F(x_1, \ldots, x_q) = F_{U_{q+1}}(\min(x_1, \ldots, x_q)) \prod_{i=1}^{q} F_{U_i}(x_i)$
which can be considered as a generator of q-dimensional distribution models, called generalized multivariate distribution (GMD) family with baseline distribution vector $( F U 1 , . . . , F U q + 1 )$. Hence, the q-dimensional copula representation of this GMD family can be expressed as
$C(u_1, \ldots, u_q) = \prod_{i=1}^{q} u_i\, \frac{\min_{i=1,\ldots,q} A_i(u_i)}{\prod_{i=1}^{q} A_i(u_i)},$
where
$A_i(u_i) = F_{U_{q+1}}\left((F_{U_i} \times F_{U_{q+1}})^{-1}(u_i)\right), \quad \text{for } i = 1, \ldots, q.$
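Simulating from the GMD family is immediate from its construction: draw $q + 1$ independent components and take $X_i = \max(U_i, U_{q+1})$. The sketch below (illustrative only; the exponential baselines and rate values are assumptions) also checks the displayed joint cdf empirically at one point.

```python
import math, random

def sample_gmd(n, rates, rng):
    # rates: (lam_1, ..., lam_q, lam_{q+1}); X_i = max(U_i, U_{q+1})
    q = len(rates) - 1
    out = []
    for _ in range(n):
        u = [rng.expovariate(l) for l in rates]
        out.append(tuple(max(u[i], u[q]) for i in range(q)))
    return out

def gmd_cdf(x, rates):
    # F(x_1,...,x_q) = F_{U_{q+1}}(min x_i) * prod_i F_{U_i}(x_i)
    F = lambda l, t: 1.0 - math.exp(-l * t)
    q = len(rates) - 1
    val = F(rates[q], min(x))
    for i in range(q):
        val *= F(rates[i], x[i])
    return val

rng = random.Random(3)
rates = (0.5, 0.8, 1.0, 0.6)    # q = 3 components plus the latent U_4
pts = sample_gmd(20000, rates, rng)
x = (2.0, 1.5, 1.0)
emp = sum(all(p[i] <= x[i] for i in range(3)) for p in pts) / len(pts)
print(emp, gmd_cdf(x, rates))   # empirical vs theoretical joint cdf
```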
From these q-dimensional joint cdf and copula, many distributional and stochastic properties established for the GBD family are extensible to the GMD family. Furthermore, by using this generator of multivariate distributions, the special bivariate models given in Section 3 can be easily extended to the multivariate case, which contain multivariate versions of bivariate distributions given in the literature.

## Author Contributions

Conceptualization, M.F., J.-M.V., and D.K.; methodology, M.F. and J.-M.V.; software, M.F. and D.K.; validation, M.F. and D.K.; formal analysis, M.F., J.-M.V., and D.K.; investigation, M.F., J.-M.V., and D.K.; writing—original draft preparation, M.F. and J.-M.V.; writing—review and editing, M.F., J.-M.V., and D.K.; supervision, M.F.; project administration, M.F. and J.-M.V. All authors have read and agreed to the published version of the manuscript.

## Funding

This research was partially supported by the Spanish Ministry of Economy, Industry and Competitiveness and the European Regional Development Fund Program through grant TIN2017-85949-C2-1-R.

## Acknowledgments

The authors would like to thank the editors and the anonymous reviewers for their comments and suggestions.

## Conflicts of Interest

The authors declare no conflict of interest.

## Appendix A

Proof of Theorem 2.
First, taking into account the event $A = U 3 > max ( U 1 , U 2 )$, the joint cdf can be expressed as
$F ( x 1 , x 2 ) = P ( U 1 ≤ x 1 , U 2 ≤ x 2 , U 3 ≤ min ( x 1 , x 2 ) | A ) P ( A ) + P ( U 1 ≤ x 1 , U 2 ≤ x 2 , U 3 ≤ min ( x 1 , x 2 ) | A ′ ) P ( A ′ )$
where $A ′$ is the complementary event of A. For $z = min ( x 1 , x 2 )$, note that
$P(U_1 \le x_1, U_2 \le x_2, U_3 \le z \mid A) = P(U_1 \le x_1, U_2 \le x_2, U_3 \le z \mid U_1 < U_3, U_2 < U_3) = \frac{1}{P(A)} P(U_1 \le U_3, U_2 \le U_3, U_3 \le z) = \frac{1}{P(A)} \int_{-\infty}^{z} F_{U_1}(u) F_{U_2}(u) \, dF_{U_3}(u).$
Hence, it is immediate that $F s ( x 1 , x 2 )$ given by (3) is a singular cdf as its mixed second partial derivatives are zero when $x 1 ≠ x 2$.
Thus, $α = P ( A )$ may be established as follows:
$\alpha = P(U_3 > \max(U_1, U_2)) = \int_{-\infty}^{\infty} P(U_1 < u, U_2 < u) \, dF_{U_3}(u) = \int_{-\infty}^{\infty} F_{U_1}(u) F_{U_2}(u) \, dF_{U_3}(u),$
and, consequently, the bivariate cdf $F ( x 1 , x 2 )$ can be rewritten as (2), where the absolutely continuous part $F a c ( x 1 , x 2 )$ can be obtained by subtraction:
$F_{ac}(x_1, x_2) = P(U_1 \le x_1, U_2 \le x_2, U_3 \le \min(x_1, x_2) \mid A') = \frac{1}{1-\alpha} \big( F(x_1, x_2) - \alpha F_s(x_1, x_2) \big) = \frac{1}{1-\alpha} \Big( F_{U_1}(x_1) F_{U_2}(x_2) F_{U_3}(z) - \int_{-\infty}^{z} F_{U_1}(u) F_{U_2}(u) \, dF_{U_3}(u) \Big),$
which completes the proof of the theorem. □
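To make the role of $α$ concrete, the following sketch evaluates the integral $∫ F U 1 F U 2 d F U 3$ in closed form for exponential baseline components (an assumption made purely for illustration; `alpha_closed` and `alpha_mc` are our own helper names) and checks it against a Monte Carlo estimate of $P ( U 3 > max ( U 1 , U 2 ) )$:

```python
import random

def alpha_closed(l1, l2, l3):
    """Closed form of int F_{U1} F_{U2} dF_{U3} for Exp(l1), Exp(l2), Exp(l3) baselines."""
    return 1.0 - l3 / (l1 + l3) - l3 / (l2 + l3) + l3 / (l1 + l2 + l3)

def alpha_mc(l1, l2, l3, n=400_000, seed=3):
    """Monte Carlo estimate of the singularity weight P(U3 > max(U1, U2))."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        u1 = rng.expovariate(l1)
        u2 = rng.expovariate(l2)
        u3 = rng.expovariate(l3)
        hits += u3 > max(u1, u2)
    return hits / n

print(abs(alpha_closed(1.0, 2.0, 1.5) - alpha_mc(1.0, 2.0, 1.5)) < 0.01)
```

The closed form follows by expanding $(1 - e^{-\lambda_1 u})(1 - e^{-\lambda_2 u})$ inside the integral and integrating each exponential term against $\lambda_3 e^{-\lambda_3 u}$.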
Proof of Theorem 3.
Let $μ$, $μ s$ and $μ a c$ be the measures associated with F, $F s$ and $F a c$, respectively. Obviously, $μ a c$ is an absolutely continuous measure with respect to the two-dimensional Lebesgue measure since
$μ a c ( − ∞ , x 1 ] × ( − ∞ , x 2 ] = F a c ( x 1 , x 2 ) = ∫ − ∞ x 1 ∫ − ∞ x 2 f a c ( u , v ) d u d v$
where the pdf associated with $F a c$ in (4), $f a c ( u , v ) = ∂ 2 ∂ u ∂ v F a c ( u , v )$, can be written as
$f_{ac}(x_1, x_2) = \begin{cases} \dfrac{1}{1-\alpha} f_1(x_1, x_2), & \text{if } x_1 < x_2, \\ \dfrac{1}{1-\alpha} f_2(x_1, x_2), & \text{if } x_1 > x_2, \\ 0, & \text{if } x_1 = x_2 = x. \end{cases}$
On the other hand, $μ s$ is given by
$\mu_s\big( (-\infty, x_1] \times (-\infty, x_2] \big) = F_s(x_1, x_2) = F_s(z, z) = \frac{1}{\alpha} \int_{-\infty}^{z} F_{U_1}(u) F_{U_2}(u) \, dF_{U_3}(u)$
where $z = min ( x 1 , x 2 )$, and so it can be expressed as an absolutely continuous measure $μ s *$ with respect to the one-dimensional Lebesgue measure on the projection onto the line $R$ of the intersection between $( − ∞ , x 1 ] × ( − ∞ , x 2 ]$ and the line $x 1 = x 2$:
$μ s ( − ∞ , x 1 ] × ( − ∞ , x 2 ] = μ s * ( − ∞ , z ] = ∫ − ∞ z f s * ( u ) d u ,$
where $f_s^*(u) = \frac{1}{\alpha} F_{U_1}(u) F_{U_2}(u) f_{U_3}(u)$, which can also be written as $f_s^*(u) = \frac{1}{\alpha} f_0(u)$.
Furthermore, it is trivial that the line $x 1 = x 2$ is a null set under the two-dimensional Lebesgue measure, and hence with respect to $μ a c$. In addition, its complement ${ ( x 1 , x 2 ) ∈ R 2 | x 1 ≠ x 2 }$ is a null set with respect to $μ s$, since its projection onto the line $R$ is the empty set,
$μ s { ( x 1 , x 2 ) ∈ R 2 | x 1 ≠ x 2 } = μ s * Ø = 0 ,$
and, consequently, the measures $μ s$ and $μ a c$ are mutually singular. Therefore, the measure associated with F
$μ ( − ∞ , x 1 ] × ( − ∞ , x 2 ] = F ( x 1 , x 2 ) = α μ s ( − ∞ , x 1 ] × ( − ∞ , x 2 ] + ( 1 − α ) μ a c ( − ∞ , x 1 ] × ( − ∞ , x 2 ]$
allows us to have the pdf of a GBD model with respect to $μ$, given by
$f ( x 1 , x 2 ) = α f s * ( x 1 ) I ( x 1 = x 2 ) ( x 1 , x 2 ) + ( 1 − α ) f a c ( x 1 , x 2 )$
where $I ( x 1 = x 2 )$ is the indicator function of $x 1 = x 2$. Hence, it is easy to check that
$∫ − ∞ x 1 ∫ − ∞ x 2 f ( u , v ) d μ = F ( x 1 , x 2 )$
for all $( x 1 , x 2 ) ∈ R 2$. □
Proof of Theorem 4.
From (1) and (5), part (1) of Theorem 4 is straightforward.
In order to prove part (2), from the joint pdf of a GBD model given in Theorem 3 and its marginal pdf (6), the conditional pdf $f i | X j = x j$ can be expressed as
$f_{i \mid X_j = x_j}(x_i) = \begin{cases} \dfrac{f_{X_i}(x_i) \, f_{U_j}(x_j)}{f_{X_j}(x_j)}, & \text{if } x_i < x_j, \\ f_{U_i}(x_i), & \text{if } x_i > x_j, \\ \dfrac{F_{U_1}(x_j) F_{U_2}(x_j) f_{U_3}(x_j)}{f_{X_j}(x_j)}, & \text{if } x_i = x_j, \end{cases}$
By using the notation $α j = f i | X j = x j ( x j )$, this conditional pdf can be readily rewritten as in the statement of Theorem 4. □
Proof of Theorem 11.
The reversed hazard function (12) of the minimum statistic can be rewritten as
$r T 1 ( x ) = r U 1 ( x ) g 2 ( x ) + r U 2 ( x ) g 1 ( x ) + r U 3 ( x ) ,$
where each $g i$ is a positive function ($i = 1 , 2$) defined by
$g i ( x ) = 1 − F U i ( x ) F U 1 : 2 ( x ) .$
Here, observe that $U 1 : 2 ≤ r h U i$ implies that each $g i ( x )$ is decreasing, and therefore $r T 1$ is a sum of three decreasing functions, which completes the proof. □
Proof of Corollary 4.
The proof readily follows along the same lines as Theorem 11, taking into account that (12) can be simplified by using
$r U 1 : 2 ( x ) = 2 r U i ( x ) g i ( x )$
where $g_i(x) = 1 - \frac{1}{2 - F_{U_i}(x)}$ decreases in x. □

## Appendix B

For the practical implementation of the EM algorithm in the data analysis applications, we give the technical details for two particular GBD models: first, one whose baseline components all belong to the same distribution family (Model I), and then one with different baseline distributions (Model II).
Model I.
Suppose $U 1 ∼ E x p ( λ 1 )$, $U 2 ∼ E x p ( λ 2 )$ and $U 3 ∼ E x p ( λ 3 )$. To compute the MLEs of the unknown parameter vector $θ = ( λ 1 , λ 2 , λ 3 )$, one needs to solve a three-dimensional optimization problem.
For implementation of the EM algorithm, we need the following expected values:
• If $i ∈ I 0$, then
$u 1 i m ( θ ) = E ( U 1 | U 1 < x i ) = H ( x i ; λ 1 ) u 2 i m ( θ ) = E ( U 2 | U 2 < x i ) = H ( x i ; λ 2 ) ,$
where
$H(x; \lambda) = \frac{1}{\lambda} - \frac{x e^{-\lambda x}}{1 - e^{-\lambda x}}.$
• If $i ∈ I 1$, then
$u_{1i}^{m}(\theta) = E(U_1 \mid \max\{U_1, U_3\} = x_{1i}) = \frac{\lambda_3}{\lambda_1 + \lambda_3} x_{1i} + \frac{\lambda_1}{\lambda_1 + \lambda_3} H(x_{1i}; \lambda_1),$
$u_{3i}^{m}(\theta) = E(U_3 \mid \max\{U_1, U_3\} = x_{1i}) = \frac{\lambda_1}{\lambda_1 + \lambda_3} x_{1i} + \frac{\lambda_3}{\lambda_1 + \lambda_3} H(x_{1i}; \lambda_3).$
• If $i ∈ I 2$, then
$u_{2i}^{m}(\theta) = E(U_2 \mid \max\{U_2, U_3\} = x_{2i}) = \frac{\lambda_3}{\lambda_2 + \lambda_3} x_{2i} + \frac{\lambda_2}{\lambda_2 + \lambda_3} H(x_{2i}; \lambda_2),$
$u_{3i}^{m}(\theta) = E(U_3 \mid \max\{U_2, U_3\} = x_{2i}) = \frac{\lambda_2}{\lambda_2 + \lambda_3} x_{2i} + \frac{\lambda_3}{\lambda_2 + \lambda_3} H(x_{2i}; \lambda_3).$
Hence, the ‘pseudo’ log-likelihood function in this case becomes
$ℓ s ( k ) ( λ 1 , λ 2 , λ 3 ) = ℓ 1 s ( k ) ( λ 1 ) + ℓ 2 s ( k ) ( λ 2 ) + ℓ 3 s ( k ) ( λ 3 ) ,$
where
$\ell_{1s}^{(k)}(\lambda_1) = n \ln \lambda_1 - \lambda_1 \Big( \sum_{i \in I_0 \cup I_1} u_{1i}^{m(k)} + \sum_{i \in I_2} x_{1i} \Big),$
$\ell_{2s}^{(k)}(\lambda_2) = n \ln \lambda_2 - \lambda_2 \Big( \sum_{i \in I_0 \cup I_2} u_{2i}^{m(k)} + \sum_{i \in I_1} x_{2i} \Big),$
$\ell_{3s}^{(k)}(\lambda_3) = n \ln \lambda_3 - \lambda_3 \Big( \sum_{i \in I_1 \cup I_2} u_{3i}^{m(k)} + \sum_{i \in I_0} x_i \Big),$
and the $u j i m ( k )$s are obtained from $u j i m ( θ )$, $j = 1 , 2 , 3$, by replacing $θ = ( λ 1 , λ 2 , λ 3 )$ with $θ ( k ) = ( λ 1 ( k ) , λ 2 ( k ) , λ 3 ( k ) )$. Therefore,
$\lambda_1^{(k+1)} = \frac{n}{\sum_{i \in I_0 \cup I_1} u_{1i}^{m(k)} + \sum_{i \in I_2} x_{1i}}, \qquad \lambda_2^{(k+1)} = \frac{n}{\sum_{i \in I_0 \cup I_2} u_{2i}^{m(k)} + \sum_{i \in I_1} x_{2i}}, \qquad \lambda_3^{(k+1)} = \frac{n}{\sum_{i \in I_1 \cup I_2} u_{3i}^{m(k)} + \sum_{i \in I_0} x_i}.$
Note that, in this case, the maximization can be performed analytically at each M-Step.
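Putting the E-step expectations and the closed-form M-step together, the whole Model I iteration can be sketched as follows. This is an illustrative Python implementation under our own naming (`H`, `em_model1`), with a synthetic-data check; it is not the authors' code, and the starting values and iteration count are arbitrary choices.

```python
import math
import random

def H(x, lam):
    """E(U | U < x) for U ~ Exp(lam): 1/lam - x*exp(-lam*x)/(1 - exp(-lam*x))."""
    return 1.0 / lam - x * math.exp(-lam * x) / (1.0 - math.exp(-lam * x))

def em_model1(pairs, start=(1.0, 1.0, 1.0), iters=100):
    """EM iteration for the GBD model with Exp(lam1), Exp(lam2), Exp(lam3) baselines."""
    l1, l2, l3 = start
    n = len(pairs)
    for _ in range(iters):
        s1 = s2 = s3 = 0.0
        for x1, x2 in pairs:
            if x1 == x2:      # i in I0: tie, so U3 = x is observed; U1, U2 missing
                s1 += H(x1, l1)
                s2 += H(x1, l2)
                s3 += x1
            elif x1 < x2:     # i in I1: U2 = x2 observed, max{U1, U3} = x1
                s1 += l3 / (l1 + l3) * x1 + l1 / (l1 + l3) * H(x1, l1)
                s3 += l1 / (l1 + l3) * x1 + l3 / (l1 + l3) * H(x1, l3)
                s2 += x2
            else:             # i in I2: U1 = x1 observed, max{U2, U3} = x2
                s1 += x1
                s2 += l3 / (l2 + l3) * x2 + l2 / (l2 + l3) * H(x2, l2)
                s3 += l2 / (l2 + l3) * x2 + l3 / (l2 + l3) * H(x2, l3)
        l1, l2, l3 = n / s1, n / s2, n / s3   # closed-form M-step
    return l1, l2, l3

# Synthetic check: X1 = max(U1, U3), X2 = max(U2, U3) with true rates (0.5, 1.0, 0.8).
rng = random.Random(7)
true = (0.5, 1.0, 0.8)
data = []
for _ in range(3000):
    u1, u2, u3 = (rng.expovariate(r) for r in true)
    data.append((max(u1, u3), max(u2, u3)))
est = em_model1(data)   # should lie close to the true rates
```

Note that the ties needed for $I 0$ arise naturally in the simulation, since both coordinates equal $U 3$ exactly whenever $U 3$ is the largest component.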
Model II.
Suppose $U 1 ∼ W ( λ 1 , α 1 )$, $U 2 ∼ W ( λ 2 , α 2 )$ and $U 3 ∼ G E ( α 3 , λ 3 )$. The pdf of a Weibull distribution $W ( λ , α )$ with scale parameter $λ > 0$ and shape parameter $α > 0$ can be written as
$f W ( u ; λ , α ) = α λ u α − 1 e − λ u α , for u > 0 ,$
and zero otherwise. Similarly, the $G E ( α , λ )$ model defined in Section 3 has the pdf
$f G E ( u ; α , λ ) = α λ e − λ u ( 1 − e − λ u ) α − 1 ; for u > 0 ,$
and zero otherwise. Hence, one needs to solve a six-dimensional optimization problem to compute the MLEs of the unknown parameter vector $θ = ( θ 1 , θ 2 , θ 3 )$ where each $θ i$ represents the parameter vector of $U i$.
We need the following expected values for implementation of the EM algorithm:
• If $i ∈ I 0$, then
$u j i m ( θ ) = E ( U j | U j < x i ) = H W ( x i ; α j , λ j ) , j = 1 , 2 ,$
where
$H_W(x; \alpha, \lambda) = \frac{1}{1 - e^{-\lambda x^{\alpha}}} \int_0^{\lambda x^{\alpha}} \left( \frac{u}{\lambda} \right)^{1/\alpha} e^{-u} \, du.$
• If $i ∈ I 1$, then
$u 1 i m ( θ ) = E ( U 1 | max { U 1 , U 3 } = x 1 i ) = p 13 x 1 i + ( 1 − p 13 ) H W ( x 1 i ; α 1 , λ 1 ) u 3 i m ( θ ) = E ( U 3 | max { U 1 , U 3 } = x 1 i ) = ( 1 − p 13 ) x 1 i + p 13 H G ( x 1 i ; α 3 , λ 3 ) ,$
where $p 13 = P ( U 1 > U 3 ) = K ( α 1 , λ 1 )$ and
$K(\alpha, \lambda) = \int_0^{\infty} \alpha \lambda x^{\alpha - 1} e^{-\lambda x^{\alpha}} \big(1 - e^{-\lambda_3 x}\big)^{\alpha_3} \, dx, \qquad H_G(x; \alpha, \lambda) = x - \frac{1}{\lambda \big(1 - e^{-\lambda x}\big)^{\alpha}} \int_0^{1 - e^{-\lambda x}} \frac{t^{\alpha}}{1 - t} \, dt.$
• If $i ∈ I 2$, then
$u 2 i m ( θ ) = E ( U 2 | max { U 2 , U 3 } = x 2 i ) = p 23 x 2 i + ( 1 − p 23 ) H W ( x 2 i ; α 2 , λ 2 ) u 3 i m ( θ ) = E ( U 3 | max { U 2 , U 3 } = x 2 i ) = ( 1 − p 23 ) x 2 i + p 23 H G ( x 2 i ; α 3 , λ 3 ) ,$
where $p 23 = P ( U 2 > U 3 ) = K ( α 2 , λ 2 ) .$
In this case, the terms of the ‘pseudo’ log-likelihood function $ℓ s ( k ) ( θ )$ can be written as
$\ell_{1s}^{(k)}(\alpha_1, \lambda_1) = n \ln \alpha_1 + n \ln \lambda_1 + (\alpha_1 - 1) \Big( \sum_{i \in I_0 \cup I_1} \ln u_{1i}^{m(k)} + \sum_{i \in I_2} \ln x_{1i} \Big) - \lambda_1 \Big( \sum_{i \in I_0 \cup I_1} \big(u_{1i}^{m(k)}\big)^{\alpha_1} + \sum_{i \in I_2} x_{1i}^{\alpha_1} \Big),$ (A1)
$\ell_{2s}^{(k)}(\alpha_2, \lambda_2) = n \ln \alpha_2 + n \ln \lambda_2 + (\alpha_2 - 1) \Big( \sum_{i \in I_0 \cup I_2} \ln u_{2i}^{m(k)} + \sum_{i \in I_1} \ln x_{2i} \Big) - \lambda_2 \Big( \sum_{i \in I_0 \cup I_2} \big(u_{2i}^{m(k)}\big)^{\alpha_2} + \sum_{i \in I_1} x_{2i}^{\alpha_2} \Big),$ (A2)
$\ell_{3s}^{(k)}(\alpha_3, \lambda_3) = n \ln \alpha_3 + n \ln \lambda_3 + (\alpha_3 - 1) \Big( \sum_{i \in I_1 \cup I_2} \ln \big(1 - e^{-\lambda_3 u_{3i}^{m(k)}}\big) + \sum_{i \in I_0} \ln \big(1 - e^{-\lambda_3 x_i}\big) \Big) - \lambda_3 \Big( \sum_{i \in I_0} x_i + \sum_{i \in I_1 \cup I_2} u_{3i}^{m(k)} \Big).$ (A3)
Therefore, $u 1 i m ( k )$, $u 2 i m ( k )$, $u 3 i m ( k )$ can be obtained from $u 1 i m ( θ )$, $u 2 i m ( θ )$ and $u 3 i m ( θ )$ by replacing $θ = ( α 1 , λ 1 , α 2 , λ 2 , α 3 , λ 3 )$ with $θ ( k ) = ( α 1 ( k ) , λ 1 ( k ) , α 2 ( k ) , λ 2 ( k ) , α 3 ( k ) , λ 3 ( k ) )$. Thus, $θ 1 ( k + 1 ) = ( α 1 ( k + 1 ) , λ 1 ( k + 1 ) )$, $θ 2 ( k + 1 ) = ( α 2 ( k + 1 ) , λ 2 ( k + 1 ) )$ and $θ 3 ( k + 1 ) = ( α 3 ( k + 1 ) , λ 3 ( k + 1 ) )$ can be obtained by maximizing (A1)–(A3), respectively. Hence, we obtain them as follows:
$\lambda_1^{(k+1)} = \frac{n}{\sum_{i \in I_0 \cup I_1} \big(u_{1i}^{m(k)}\big)^{\alpha_1^{(k+1)}} + \sum_{i \in I_2} x_{1i}^{\alpha_1^{(k+1)}}}, \qquad \lambda_2^{(k+1)} = \frac{n}{\sum_{i \in I_0 \cup I_2} \big(u_{2i}^{m(k)}\big)^{\alpha_2^{(k+1)}} + \sum_{i \in I_1} x_{2i}^{\alpha_2^{(k+1)}}},$
$\alpha_3^{(k+1)} = \frac{-n}{\sum_{i \in I_1 \cup I_2} \ln \big(1 - e^{-\lambda_3^{(k+1)} u_{3i}^{m(k)}}\big) + \sum_{i \in I_0} \ln \big(1 - e^{-\lambda_3^{(k+1)} x_i}\big)},$
$\alpha_1^{(k+1)} = \arg\max p_1(\alpha_1), \qquad \alpha_2^{(k+1)} = \arg\max p_2(\alpha_2), \qquad \lambda_3^{(k+1)} = \arg\max p_3(\lambda_3),$
where
$p_1(\alpha_1) = n \ln \alpha_1 - n \ln \Big( \sum_{i \in I_0 \cup I_1} \big(u_{1i}^{m(k)}\big)^{\alpha_1} + \sum_{i \in I_2} x_{1i}^{\alpha_1} \Big) + (\alpha_1 - 1) \Big( \sum_{i \in I_0 \cup I_1} \ln u_{1i}^{m(k)} + \sum_{i \in I_2} \ln x_{1i} \Big),$
$p_2(\alpha_2) = n \ln \alpha_2 - n \ln \Big( \sum_{i \in I_0 \cup I_2} \big(u_{2i}^{m(k)}\big)^{\alpha_2} + \sum_{i \in I_1} x_{2i}^{\alpha_2} \Big) + (\alpha_2 - 1) \Big( \sum_{i \in I_0 \cup I_2} \ln u_{2i}^{m(k)} + \sum_{i \in I_1} \ln x_{2i} \Big),$
$p_3(\lambda_3) = n \ln \lambda_3 - n \ln \Big( - \sum_{i \in I_1 \cup I_2} \ln \big(1 - e^{-\lambda_3 u_{3i}^{m(k)}}\big) - \sum_{i \in I_0} \ln \big(1 - e^{-\lambda_3 x_i}\big) \Big) - \lambda_3 \Big( \sum_{i \in I_0} x_i + \sum_{i \in I_1 \cup I_2} u_{3i}^{m(k)} \Big) - \Big( \sum_{i \in I_1 \cup I_2} \ln \big(1 - e^{-\lambda_3 u_{3i}^{m(k)}}\big) + \sum_{i \in I_0} \ln \big(1 - e^{-\lambda_3 x_i}\big) \Big).$
Note that, in this case, one needs to solve three one-dimensional optimization problems numerically at each M-Step.
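As a sanity check on the Weibull pseudo-value above, the integral form of $H_W$ can be evaluated by a midpoint rule and compared with the truncated-mean integral $\int_0^x u f_W(u)\,du / F_W(x)$, which it equals after the substitution $v = \lambda u^{\alpha}$. This is a sketch under our own helper names (`H_W_num`, `trunc_weibull_mean`) and arbitrary parameter values, not code from the paper:

```python
import math

def H_W_num(x, alpha, lam, n=100_000):
    """Midpoint-rule evaluation of H_W(x; alpha, lam) as written in Appendix B."""
    top = lam * x ** alpha
    h = top / n
    s = sum(((i + 0.5) * h / lam) ** (1.0 / alpha) * math.exp(-(i + 0.5) * h)
            for i in range(n)) * h
    return s / (1.0 - math.exp(-top))

def trunc_weibull_mean(x, alpha, lam, n=100_000):
    """Direct truncated mean: int_0^x u f_W(u) du / F_W(x) for W(lam, alpha)."""
    h = x / n
    s = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        s += u * alpha * lam * u ** (alpha - 1) * math.exp(-lam * u ** alpha)
    return s * h / (1.0 - math.exp(-lam * x ** alpha))

# The two expressions agree up to quadrature error.
diff = abs(H_W_num(2.0, 1.5, 0.7) - trunc_weibull_mean(2.0, 1.5, 0.7))
print(diff < 1e-3)
```

The same change of variables shows why only one-dimensional numerical work remains at each M-step: the scale parameters have closed-form updates given the shapes.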

## Appendix C

Table A1. Summary of fitted GBD models for the two real data. EM rows are the parameters estimated with the EM algorithm for maximizing the pseudo log-likelihood function, along with the log-likelihood, AIC and BIC values, and BFGS rows correspond to the results obtained by applying the Broyden–Fletcher–Goldfarb–Shanno algorithm for maximizing the log-likelihood function.
| GBD Model | Method | $α_1$ | $λ_1$ | $α_2$ | $λ_2$ | $α_3$ | $λ_3$ | $ℓ(θ)$ | AIC | BIC |
|---|---|---|---|---|---|---|---|---|---|---|
| **Soccer data** | | | | | | | | | | |
| Model I | EM | | 0.03126 | | 0.04630 | | 0.04269 | −299.4331 | 604.8663 | 609.6990 |
| Model I | BFGS | | 0.03116 | | 0.04636 | | 0.04283 | −299.4328 | 604.8656 | 609.6984 |
| Model II | EM | 1.2987 | 0.0097 | 0.8047 | 0.0093 | 1.0037 | 0.0369 | −348.2715 | 708.5430 | 718.2085 |
| Model II | BFGS | 1.3808 | 0.00698 | 0.5652 | 0.25469 | 1.53813 | 0.05219 | −295.3057 | 602.6114 | 612.2770 |
| **Diabetic retinopathy data** | | | | | | | | | | |
| Model I | EM | | 0.0653 | | 0.0737 | | 0.1345 | −289.9878 | 585.9757 | 590.8884 |
| Model I | BFGS | | 0.06290 | | 0.07181 | | 0.14282 | −289.9144 | 585.8288 | 590.7415 |
| Model II | EM | 1.0937 | 0.0447 | 0.5851 | 0.2369 | 0.8995 | 0.1898 | −290.0758 | 592.1515 | 601.9770 |
| Model II | BFGS | 1.1477 | 0.03920 | 0.7917 | 0.13923 | 0.41913 | 0.08272 | −285.5795 | 583.1590 | 592.9846 |

## References

1. Gumbel, E.J. Bivariate exponential distributions. J. Am. Stat. Assoc. 1960, 55, 698–707. [Google Scholar] [CrossRef]
2. Freund, J.E. A bivariate extension of the exponential distribution. J. Am. Stat. Assoc. 1961, 56, 971–977. [Google Scholar] [CrossRef]
3. Marshall, A.W.; Olkin, I. A multivariate exponential distribution. J. Am. Stat. Assoc. 1967, 62, 30–44. [Google Scholar] [CrossRef]
4. Balakrishnan, N.; Lai, C.D. Continuous Bivariate Distributions, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar] [CrossRef]
5. Franco, M.; Vivo, J.M. A multivariate extension of Sarhan and Balakrishnan’s bivariate distribution and its ageing and dependence properties. J. Multivar. Anal. 2010, 101, 491–499. [Google Scholar] [CrossRef]
6. Kundu, D.; Gupta, R.D. Modified Sarhan–Balakrishnan singular bivariate distribution. J. Stat. Plan. Inference 2010, 40, 526–538. [Google Scholar] [CrossRef]
7. Franco, M.; Kundu, D.; Vivo, J.M. Multivariate extension of the modified Sarhan-Balakrishnan bivariate distribution. J. Stat. Plan. Inference 2011, 141, 3400–3412. [Google Scholar] [CrossRef]
8. Gupta, R.C.; Kirmani, S.N.U.A.; Balakrishnan, N. On a class of generalized Marshall–Olkin bivariate distributions and some reliability characteristics. Probab. Engrg. Inform. Sci. 2013, 27, 261–275. [Google Scholar] [CrossRef]
9. Kundu, D.; Franco, M.; Vivo, J.M. Multivariate distributions with proportional reversed hazard marginals. Comput. Stat. Data Anal. 2014, 77, 98–112. [Google Scholar] [CrossRef]
10. Muhammed, H.Z. On a bivariate generalized inverted Kumaraswamy distribution. Phys. A 2020, 553, 124281. [Google Scholar] [CrossRef]
11. Franco, M.; Vivo, J.M.; Kundu, D. A generalized Freund bivariate model for a two-component load sharing system. Reliab. Eng. Syst. Saf. 2020, 203, 107096. [Google Scholar] [CrossRef]
12. El-Morshedy, M.; Ali-Alhussain, Z.; Atta, D.; Almetwally, E.M.; Eliwa, M.S. Bivariate Burr X generator of distributions: Properties and estimation methods with applications to complete and type-II censored samples. Mathematics 2020, 8, 264. [Google Scholar] [CrossRef]
13. Kundu, D.; Gupta, R.D. Bivariate generalized exponential distribution. J. Multivar. Anal. 2009, 100, 581–593. [Google Scholar] [CrossRef]
14. Sarhan, A.M.; Hamilton, D.C.; Smith, B.; Kundu, D. The bivariate generalized linear failure rate distribution and its multivariate extension. Comput. Stat. Data Anal. 2011, 55, 644–654. [Google Scholar] [CrossRef]
15. Elsherpieny, E.A.; Ibrahim, S.A.; Bedar, R.E. A New Bivariate Distribution with Log-Exponentiated Kumaraswamy Marginals. Chil. J. Stat. 2014, 5, 55–69. Available online: http://www.soche.cl/chjs/volumes/05/02/Elsherpieny_etal(2014).pdf (accessed on 31 August 2020).
16. El-Gohary, A.; El-Bassiouny, A.H.; El-Morshedy, M. Bivariate exponentiated modified Weibull extension distribution. J. Stat. Appl. Probab. 2016, 5, 67–78. [Google Scholar] [CrossRef]
17. Muhammed, H.Z. Bivariate inverse Weibull distribution. J. Stat. Comput. Simul. 2016, 86, 2335–2345. [Google Scholar] [CrossRef]
18. Kundu, D.; Gupta, A.K. On bivariate inverse Weibull distribution. Braz. J. Probab. Stat. 2017, 31, 275–302. [Google Scholar] [CrossRef]
19. Muhammed, H.Z. Bivariate Dagum Distribution. Int. J. Reliab. Appl. 2017, 18, 65–82. Available online: https://www.koreascience.or.kr/article/JAKO201715565837044.pdf (accessed on 31 August 2020).
20. Sarhan, A.M. The bivariate generalized Rayleigh distribution. J. Math. Sci. Model. 2019, 2, 99–111. [Google Scholar] [CrossRef]
21. Eliwa, M.S.; El-Morshedy, M. Bivariate Gumbel-G family of distributions: Statistical properties, bayesian and non-bayesian estimation with application. Ann. Data Sci. 2019, 6, 39–60. [Google Scholar] [CrossRef]
22. Gupta, R.C.; Gupta, P.L.; Gupta, R.D. Modeling failure time data by Lehman alternatives. Commun. Stat. Theory Methods 1998, 24, 887–904. [Google Scholar] [CrossRef]
23. Di Crescenzo, A. Some results on the proportional reversed hazards model. Stat. Probab. Lett. 2000, 50, 313–321. [Google Scholar] [CrossRef]
24. Kundu, D.; Gupta, R.D. A class of bivariate models with proportional reversed hazard marginals. Sankhya B 2010, 72, 236–253. [Google Scholar] [CrossRef]
25. Gupta, R.D.; Kundu, D. Generalized exponential distribution. Aust. N. Z. J. Stat. 1999, 41, 173–188. [Google Scholar] [CrossRef]
26. Sarhan, A.; Kundu, D. Generalized linear failure rate distribution. Commun. Stat. Theory Methods 2009, 38, 642–660. [Google Scholar] [CrossRef]
27. Lemonte, A.J.; Cordeiro, G.M.; Barreto-Souza, W. The exponentiated Kumaraswamy distribution and its log-transform. Braz. J. Probab. Stat. 2013, 27, 31–53. [Google Scholar] [CrossRef]
28. Sarhan, A.M.; Apaloo, J. Exponentiated modified Weibull extension distribution. Reliab. Eng. Syst. Saf. 2013, 112, 137–144. [Google Scholar] [CrossRef]
29. Keller, A.Z.; Giblin, M.T.; Farnworth, N.R. Reliability analysis of commercial vehicle engines. Reliab. Eng. 1985, 10, 15–25. [Google Scholar] [CrossRef]
30. Dagum, C. A new model of personal income distribution: Specification and estimation. Econ. Appl. 1977, 30, 413–437. [Google Scholar]
31. Burr, I.W. Cumulative frequency functions. Ann. Math. Stat. 1942, 13, 215–232. [Google Scholar] [CrossRef]
32. Alzaatreh, A.; Lee, C.; Famoye, F. A new method for generating families of continuous distributions. Metron 2013, 71, 63–79. [Google Scholar] [CrossRef]
33. Iqbal, Z.; Tahir, M.M.; Riaz, N.; Ali, S.A.; Ahmad, M. Generalized inverted Kumaraswamy distribution: Properties and application. Open J. Stat. 2017, 7, 645–662. [Google Scholar] [CrossRef]
34. Lai, C.D.; Xie, M. Stochastic Ageing and Dependence for Reliability; Springer: New York, NY, USA, 2006. [Google Scholar] [CrossRef]
35. Shaked, M.; Shanthikumar, J.G. Stochastic Orders; Springer: New York, NY, USA, 2007. [Google Scholar] [CrossRef]
36. Domma, F. Bivariate reversed hazard rate, notions, and measures of dependence and their relationships. Commun. Stat. Theory Methods 2011, 40, 989–999. [Google Scholar] [CrossRef]
37. Finkelstein, M.S. On the reversed hazard rate. Reliab. Eng. Syst. Saf. 2002, 78, 71–75. [Google Scholar] [CrossRef]
38. Gupta, R.C.; Balakrishnan, N. Log-concavity and monotonicity of hazard and reversed hazard functions of univariate and multivariate skew-normal distributions. Metrika 2012, 75, 181–191. [Google Scholar] [CrossRef]
39. Nelsen, R.B. An Introduction to Copulas, 2nd ed.; Springer: New York, NY, USA, 2006. [Google Scholar] [CrossRef]
40. Fang, R.; Li, X. A note on bivariate dual generalized Marshall–Olkin distributions with applications. Probab. Engrg. Inform. Sci. 2013, 27, 367–374. [Google Scholar] [CrossRef]
41. Meintanis, S.G. Test of fit for Marshall–Olkin distribution with applications. J. Stat. Plan. Inference 2007, 137, 3954–3963. [Google Scholar] [CrossRef]
42. Huster, W.J.; Brookmeyer, R.; Self, S.G. Modelling paired survival data with covariates. Biometrics 1989, 45, 145–156. [Google Scholar] [CrossRef]
43. Ploner, M.; Kaider, A.; Heinze, G. SurvCorr: Correlation of Bivariate Survival Times. R Package Version 1.0. 2015. Available online: https://CRAN.R-project.org/package=SurvCorr (accessed on 31 August 2020).
44. Henningsen, A.; Toomet, O. maxLik: A package for maximum likelihood estimation in R. Comput. Stat. 2011, 26, 443–458. [Google Scholar] [CrossRef]
Figure 1. Surface and contour plots of the joint pdf of GBD models $( X 1 , X 2 )$ with different components $( U 1 , U 2 , U 3 )$.
Figure 2. Plots of the marginal pdfs of the GBD models $( X 1 , X 2 )$ with different components $( U 1 , U 2 , U 3 )$.
Table 1. Relation between $( x 1 i , x 2 i )$ and $( u 1 i , u 2 i , u 3 i )$.
| $I_k$ | Ordering of $U_j$ | $X_1$ | $X_2$ | Missing |
|---|---|---|---|---|
| $I_0$ | $u_{1i} < u_{2i} < u_{3i}$ | $u_{3i}$ | $u_{3i}$ | $u_{1i}$, $u_{2i}$ |
| $I_0$ | $u_{2i} < u_{1i} < u_{3i}$ | $u_{3i}$ | $u_{3i}$ | $u_{1i}$, $u_{2i}$ |
| $I_1$ | $u_{1i} < u_{3i} < u_{2i}$ | $u_{3i}$ | $u_{2i}$ | $u_{1i}$ |
| $I_1$ | $u_{3i} < u_{1i} < u_{2i}$ | $u_{1i}$ | $u_{2i}$ | $u_{3i}$ |
| $I_2$ | $u_{2i} < u_{3i} < u_{1i}$ | $u_{1i}$ | $u_{3i}$ | $u_{2i}$ |
| $I_2$ | $u_{3i} < u_{2i} < u_{1i}$ | $u_{1i}$ | $u_{2i}$ | $u_{3i}$ |
Table 2. Goodness-of-fit results for UEFA Champion’s League data.
| GBD Model | KS (p-value): $X_1$ | KS (p-value): $X_2$ | KS (p-value): $\max\{X_1, X_2\}$ | AIC | BIC |
|---|---|---|---|---|---|
| Model I | 0.1491 (0.3830) | 0.1099 (0.7622) | 0.1530 (0.3517) | 604.8663 | 609.6990 |
| Model II | 0.0976 (0.8719) | 0.0839 (0.9565) | 0.1139 (0.7228) | 708.5430 | 718.2085 |
Table 3. Goodness-of-fit results for diabetic retinopathy data.
| GBD Model | KS (p-value): $X_1$ | KS (p-value): $X_2$ | KS (p-value): $\max\{X_1, X_2\}$ | AIC | BIC |
|---|---|---|---|---|---|
| Model I | 0.1033 (0.8244) | 0.1848 (0.1598) | 0.1229 (0.6310) | 585.9757 | 590.8884 |
| Model II | 0.0920 (0.8960) | 0.0952 (0.8706) | 0.1152 (0.6778) | 592.1515 | 601.9770 |
 Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.