Article

Bivariate Proportional Hazard Models: Structure and Inference

by Barry C. Arnold 1, Guillermo Martínez-Flórez 2 and Héctor W. Gómez 3,*

1 Statistics Department, University of California Riverside, Riverside, CA 92521, USA
2 Departamento de Matemáticas y Estadística, Facultad de Ciencias, Universidad de Córdoba, Córdoba 2300, Colombia
3 Departamento de Matemática, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
* Author to whom correspondence should be addressed.
Symmetry 2022, 14(10), 2073; https://doi.org/10.3390/sym14102073
Submission received: 29 August 2022 / Revised: 23 September 2022 / Accepted: 29 September 2022 / Published: 6 October 2022
(This article belongs to the Special Issue Mathematical Models and Methods in Various Sciences)

Abstract

We focus on a variety of bivariate models with proportional hazard components. Models with proportional hazard marginals are described together with a selection of models with proportional hazard conditional distributions. The bivariate distributions with marginal proportional hazards distributions are shown to be closely related to certain known bivariate exponential models. Two distinct kinds of conditional specification are investigated. Discussion is provided of cases with hazard function components that are (1) completely unknown, (2) known to belong to given parametric families and (3) completely known. Since the models are designed for use with survival data, it is inevitable that the marginal and conditional distributions will be asymmetric. However, logarithmic transformations in some cases will result in symmetric component distributions.

1. Introduction

Survival models involving families of densities with proportional hazard functions have proved to be useful for analyzing many lifetime data sets. Not infrequently bivariate survival data (involving related lifetimes) need to be analyzed. In this paper, we review several methods for generating suitable bivariate models for such situations. The key observation in the development is that proportional hazard models can be viewed as ones obtained via monotone transformations applied to exponential models. In the latter sections of the paper, related statistical inference issues are discussed. Since the models are designed for use with survival data, it is inevitable that the marginal and conditional distributions will be asymmetric. However, logarithmic transformations in some cases will result in symmetric component distributions.

2. Bivariate Distributions with Proportional Hazard Marginals

Let $F_1$ and $F_2$ be two absolutely continuous distribution functions with support $(0, \infty)$ and with corresponding densities $f_1$ and $f_2$ and hazard functions $h_1$ and $h_2$ (where $h_i = f_i / (1 - F_i)$, $i = 1, 2$).
We will say that $(X_1, X_2)$ has a bivariate marginal proportional hazard distribution associated with $F_1$ and $F_2$ and with parameters $\alpha_1, \alpha_2 > 0$ if for $i = 1, 2$
$$f_{X_i}(x_i; F_i, \alpha_i) = \alpha_i [1 - F_i(x_i)]^{\alpha_i - 1} f_i(x_i)\, I(x_i > 0), \tag{1}$$
and we will write $(X_1, X_2) \sim PHM(F_1, \alpha_1; F_2, \alpha_2)$ and also $X_i \sim PH(F_i, \alpha_i)$, $i = 1, 2$ (here $PHM$ is an acronym for proportional hazard marginals). Note that the name is appropriate since for $i = 1, 2$
$$h_{X_i}(x_i; F_i, \alpha_i) = \alpha_i h_i(x_i). \tag{2}$$
Observe that if $(X_1, X_2) \sim PHM(F_1, \alpha_1; F_2, \alpha_2)$ then the $X_i$'s admit a representation of the form
$$X_i = F_i^{-1}(1 - e^{-Y_i}), \quad i = 1, 2, \tag{3}$$
where for each $i$, $Y_i \sim \exp(\alpha_i)$ (i.e., $f_{Y_i}(y_i) = \alpha_i e^{-\alpha_i y_i} I(y_i > 0)$).
Note that the transformations in (3) are monotone increasing.
In fact, it is not necessary to build the model with reference to two distribution functions $F_1$ and $F_2$. Instead we can begin with $Y_i \sim \exp(\alpha_i)$, $i = 1, 2$, and use two monotone increasing functions $g_i : (0, \infty) \to (0, \infty)$ to define $X_i = g_i(Y_i)$, $i = 1, 2$. If we denote the corresponding inverse functions by $h_i(x) = g_i^{-1}(x)$, then $P(X_i > x_i) = P(Y_i > h_i(x_i)) = e^{-\alpha_i h_i(x_i)}$, and it is readily verified that each $X_i$ has a proportional hazard distribution, i.e., that for $i = 1, 2$,
$$h_{X_i}(x_i; g_i, \alpha_i) = \alpha_i h_i'(x_i), \quad x_i > 0,$$
where the prime denotes differentiation.
It is however customary to use the representation (3) involving the two distribution functions $F_1$ and $F_2$, and we will adhere to this convention.
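The representation (3) also gives a direct way to simulate a proportional hazard variable. The following is a minimal sketch using only the Python standard library; the Weibull baseline, the parameter values, and the helper names `sample_ph` and `ph_survival` are illustrative assumptions, not part of the paper.

```python
import math
import random

def sample_ph(alpha, F_inv, rng):
    """Draw X = F^{-1}(1 - exp(-Y)) with Y ~ exponential(rate alpha), as in (3)."""
    y = rng.expovariate(alpha)          # expovariate takes the rate parameter
    return F_inv(1.0 - math.exp(-y))

def ph_survival(x, alpha, F):
    """Survival function implied by (1): P(X > x) = [1 - F(x)]^alpha."""
    return (1.0 - F(x)) ** alpha

# Hypothetical baseline (an assumption for illustration): F(x) = 1 - exp(-x^theta).
theta = 1.5
F = lambda x: 1.0 - math.exp(-x ** theta)
F_inv = lambda u: (-math.log(1.0 - u)) ** (1.0 / theta)

rng = random.Random(0)
alpha = 2.0
xs = [sample_ph(alpha, F_inv, rng) for _ in range(20000)]

x0 = 0.6
empirical = sum(x > x0 for x in xs) / len(xs)   # Monte Carlo estimate of P(X > x0)
theoretical = ph_survival(x0, alpha, F)          # [1 - F(x0)]^alpha
```

The empirical and theoretical survival probabilities should agree up to Monte Carlo error, which is what the proportional hazard property $P(X_i > x) = [1 - F_i(x)]^{\alpha_i}$ predicts.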
Of course, in the representation (3), $(Y_1, Y_2)$ can have any bivariate exponential distribution that we wish to utilize. Popular choices of bivariate exponential distributions involving few additional parameters include:
(i) Gumbel Type I distribution, with
$$P(Y_1 > y_1, Y_2 > y_2) = \exp[-\alpha_1 y_1 - \alpha_2 y_2 - \delta \alpha_1 \alpha_2 y_1 y_2], \quad 0 \le \delta \le 1.$$
(ii) Gumbel Type II distribution, with
$$P(Y_1 \le y_1, Y_2 \le y_2) = [1 - e^{-\alpha_1 y_1}][1 - e^{-\alpha_2 y_2}][1 + \delta e^{-\alpha_1 y_1 - \alpha_2 y_2}], \quad \delta \in [-1, 1].$$
(iii) Marshall–Olkin distribution (see Marshall and Olkin [1]), with
$$P(Y_1 > y_1, Y_2 > y_2) = \exp[-\alpha_1 y_1 - \alpha_2 y_2 - \delta \alpha_1 \alpha_2 \max(y_1, y_2)], \quad \delta \in [0, \infty).$$
Many other choices are possible; see, for example, Kotz et al. ([2], pp. 350–385).
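For a specific example, if we choose $(Y_1, Y_2)$ to have a Gumbel Type I density, i.e., the density obtained by differentiating the survival function in (i) with respect to $y_1$ and $y_2$,
$$f_{Y_1, Y_2}(y_1, y_2) = \alpha_1 \alpha_2 [(1 + \alpha_1 \delta y_1)(1 + \alpha_2 \delta y_2) - \delta] \exp\{-\alpha_1 y_1 - \alpha_2 y_2 - \alpha_1 \alpha_2 \delta y_1 y_2\},$$
and use $F_1(x_1) = x_1^{\gamma_1}$, $0 < x_1 < 1$, and $F_2(x_2) = x_2^{\gamma_2}$, $0 < x_2 < 1$, then $y_i = -\log(1 - x_i^{\gamma_i})$ and the resulting PHM density will be of the form:
$$\begin{aligned} f_{X_1, X_2}(x_1, x_2) ={}& \alpha_1 \alpha_2 \gamma_1 \gamma_2 \{[1 - \alpha_1 \delta \log(1 - x_1^{\gamma_1})][1 - \alpha_2 \delta \log(1 - x_2^{\gamma_2})] - \delta\} \\ &\times x_1^{\gamma_1 - 1} x_2^{\gamma_2 - 1} (1 - x_1^{\gamma_1})^{\alpha_1 - 1} (1 - x_2^{\gamma_2})^{\alpha_2 - 1} \\ &\times \exp\{-\alpha_1 \alpha_2 \delta \log(1 - x_1^{\gamma_1}) \log(1 - x_2^{\gamma_2})\}, \quad 0 < x_1, x_2 < 1. \end{aligned}$$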
In application of such models, it is frequently desirable to postulate that each F i is a member of some parametric family of distributions to add flexibility to the model. The dependence structure will however be completely determined by the copula of the particular bivariate exponential distribution used in the construction.
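As a sanity check on the power-function example, the joint density can be compared numerically with the mixed partial derivative of the corresponding joint survival function $[1 - x_1^{\gamma_1}]^{\alpha_1} [1 - x_2^{\gamma_2}]^{\alpha_2} \exp\{-\alpha_1 \alpha_2 \delta \log(1 - x_1^{\gamma_1}) \log(1 - x_2^{\gamma_2})\}$. The sketch below assumes the Gumbel Type I survival function in (i); the function names, the parameter values, and the evaluation point are hypothetical.

```python
import math

def survival(x1, x2, a1, a2, g1, g2, d):
    """P(X1 > x1, X2 > x2) for Gumbel Type I dependence with F_i(x) = x^g_i."""
    L1 = math.log(1.0 - x1 ** g1)
    L2 = math.log(1.0 - x2 ** g2)
    return math.exp(a1 * L1 + a2 * L2 - a1 * a2 * d * L1 * L2)

def density(x1, x2, a1, a2, g1, g2, d):
    """Closed-form joint density obtained by the change of variables y = -log(1 - x^g)."""
    L1 = math.log(1.0 - x1 ** g1)
    L2 = math.log(1.0 - x2 ** g2)
    bracket = (1.0 - a1 * d * L1) * (1.0 - a2 * d * L2) - d
    return (a1 * a2 * g1 * g2 * bracket
            * x1 ** (g1 - 1) * x2 ** (g2 - 1)
            * (1.0 - x1 ** g1) ** (a1 - 1) * (1.0 - x2 ** g2) ** (a2 - 1)
            * math.exp(-a1 * a2 * d * L1 * L2))

def mixed_difference(x1, x2, h, *p):
    """Central finite-difference approximation to d^2 S / (dx1 dx2)."""
    return (survival(x1 + h, x2 + h, *p) - survival(x1 + h, x2 - h, *p)
            - survival(x1 - h, x2 + h, *p) + survival(x1 - h, x2 - h, *p)) / (4 * h * h)

params = (1.2, 0.8, 2.0, 1.5, 0.5)   # (alpha1, alpha2, gamma1, gamma2, delta), illustrative
closed_form = density(0.4, 0.7, *params)
numerical = mixed_difference(0.4, 0.7, 1e-4, *params)
```

The two values should agree to several significant digits, since the joint density of a bivariate distribution equals the mixed partial derivative of its survival function.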
Alternatively, one could “let the data tell us which F i ’s to use in the model”. Thus, we would seek monotone marginal transformations that will make the transformed marginal sample distributions look as much like exponential distributions as possible. This semi-parametric approach will be returned to later in the paper.

3. Bivariate Distributions with Proportional Hazard Conditionals

We will consider two types of conditioning. The first kind is quite traditional in that we consider the distribution of one variable given that a second variable takes on a particular value. The second kind involves conditioning on the event that the second variable is larger than a particular value.

3.1. The First Kind

Consider two proportional hazard families of densities as in (1). Recall that we write $X_i \sim PH(F_i, \alpha_i)$ if the corresponding densities are given by (1).
For this conditional proportional hazards paradigm, we seek to identify joint distributions for $(X_1, X_2)$ with all conditional densities of the forms in (1). Thus, for each $x_2 > 0$ we wish to have
$$X_1 \mid X_2 = x_2 \sim PH(F_1, \alpha_1(x_2)),$$
and for each $x_1 > 0$,
$$X_2 \mid X_1 = x_1 \sim PH(F_2, \alpha_2(x_1)),$$
for some functions $\alpha_1(x_2)$ and $\alpha_2(x_1)$. It is not difficult to verify that this will be the case if and only if
$$X_i = F_i^{-1}(1 - e^{-Y_i}), \quad i = 1, 2,$$
where $(Y_1, Y_2)$ has a joint distribution with exponential conditionals, i.e., such that for each $y_2 > 0$
$$Y_1 \mid Y_2 = y_2 \sim \exp(\alpha_1(y_2)),$$
and for each $y_1 > 0$
$$Y_2 \mid Y_1 = y_1 \sim \exp(\alpha_2(y_1)).$$
The class of all densities with such exponential conditionals is identified in Arnold and Strauss [3] and is of the form:
$$f_{Y_1, Y_2}(y_1, y_2) = k(\delta) \alpha_1 \alpha_2 \exp[-\alpha_1 y_1 - \alpha_2 y_2 - \alpha_1 \alpha_2 \delta y_1 y_2]\, I(y_1 > 0, y_2 > 0), \tag{11}$$
where $\alpha_1 > 0$, $\alpha_2 > 0$ and $\delta \ge 0$. The normalizing constant, $k(\delta)$, in (11) can be expressed in terms of the exponential integral function. Thus
$$k(\delta) = \frac{\delta \exp\{-1/\delta\}}{\int_{1/\delta}^{\infty} \frac{\exp\{-w\}}{w}\, dw}. \tag{12}$$
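The normalizing constant can also be evaluated by elementary quadrature. Substituting $u = \alpha_1 y_1$, integrating out $y_2$, and then setting $t = e^{-u}$ gives $1/k(\delta) = \int_0^\infty e^{-u}/(1 + \delta u)\, du = \int_0^1 dt/(1 - \delta \log t)$, which is equivalent to the exponential integral expression. The sketch below uses a midpoint rule in plain Python; the function name `k_const` and the grid size are assumptions made for illustration.

```python
import math

def k_const(delta, n=200_000):
    """Approximate k(delta) = 1 / int_0^1 dt / (1 - delta * log t) by the midpoint rule."""
    if delta == 0.0:
        return 1.0                      # independence case: the density already integrates to 1
    h = 1.0 / n
    total = sum(1.0 / (1.0 - delta * math.log((i + 0.5) * h)) for i in range(n))
    return 1.0 / (total * h)

k_one = k_const(1.0)    # reciprocal of int_0^inf e^{-u} / (1 + u) du
k_half = k_const(0.5)
```

For $\delta = 1$ the integral is the Euler–Gompertz constant (approximately 0.596347), so `k_one` should be close to 1.67688. Because the integrand decreases in $\delta$, $k(\delta)$ increases with $\delta$.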
Consequently, bivariate densities with conditionals of the proportional hazard form will be given by
$$f_{X_1, X_2}(x_1, x_2) = \alpha_1 \alpha_2 k(\delta) f_1(x_1) [1 - F_1(x_1)]^{\alpha_1 - 1} f_2(x_2) [1 - F_2(x_2)]^{\alpha_2 - 1} \times \exp\{-\alpha_1 \alpha_2 \delta \log[1 - F_1(x_1)] \log[1 - F_2(x_2)]\}. \tag{13}$$
This model is discussed in Arnold and Kim [4] and we will follow their nomenclature and call it a proportional hazard conditionals model of the first kind. If $(X_1, X_2)$ has a density of the form (13), we will write
$$(X_1, X_2) \sim PHC(I)(F_1, \alpha_1; F_2, \alpha_2; \delta).$$
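Because the exponential-conditionals density makes $Y_1 \mid Y_2 = y_2$ exponential with rate $\alpha_1 (1 + \alpha_2 \delta y_2)$, and symmetrically for $Y_2 \mid Y_1 = y_1$, one simple way to simulate approximately from a PHC(I) model is a Gibbs sampler followed by the transformation $X_i = F_i^{-1}(1 - e^{-Y_i})$. This is a sketch of one possible approach, not a method described in the paper; the function names, the Weibull baseline, and the burn-in length are assumptions.

```python
import math
import random

def gibbs_phc1(n, a1, a2, delta, F1_inv, F2_inv, burn_in=500, seed=0):
    """Gibbs sampler for the exponential-conditionals density, mapped to the X scale."""
    rng = random.Random(seed)
    y1, y2 = 1.0 / a1, 1.0 / a2          # arbitrary starting values
    out = []
    for t in range(burn_in + n):
        # Full conditionals are exponential with rates that depend on the other coordinate.
        y1 = rng.expovariate(a1 * (1.0 + a2 * delta * y2))
        y2 = rng.expovariate(a2 * (1.0 + a1 * delta * y1))
        if t >= burn_in:
            out.append((F1_inv(1.0 - math.exp(-y1)), F2_inv(1.0 - math.exp(-y2))))
    return out

# Hypothetical Weibull baseline with shape 1.5 for both coordinates.
weibull_inv = lambda u: (-math.log(1.0 - u)) ** (1.0 / 1.5)
draws = gibbs_phc1(200, 1.0, 2.0, 0.5, weibull_inv, weibull_inv)
```

Successive Gibbs draws are dependent, so in practice one would thin the chain or check mixing before treating the output as a random sample.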

3.2. The Second Kind

Again consider two proportional hazard families of densities as in (1). Recall that we write $X_i \sim PH(F_i, \alpha_i)$ if the corresponding densities are given by (1). The second kind of conditional model (also introduced in Arnold and Kim [4]) involves conditioning on events of the form $\{X_1 > x_1\}$ and $\{X_2 > x_2\}$. Thus, we seek to identify joint survival functions for $(X_1, X_2)$ such that for each $x_2 > 0$,
$$X_1 \mid \{X_2 > x_2\} \sim PH(F_1, \alpha_1(x_2)), \tag{14}$$
and for each $x_1 > 0$,
$$X_2 \mid \{X_1 > x_1\} \sim PH(F_2, \alpha_2(x_1)), \tag{15}$$
for some functions $\alpha_1(x_2)$ and $\alpha_2(x_1)$.
To analyze this situation (since $F_1$ and $F_2$ are known) it is again convenient to write
$$X_i = F_i^{-1}(1 - e^{-Y_i}), \quad i = 1, 2, \tag{16}$$
where the $Y_i$'s are exponential random variables.
The conditions (14) and (15) are then equivalent to the statements
$$Y_1 \mid \{Y_2 > y_2\} \sim \exp(\alpha_1(y_2)), \tag{17}$$
and
$$Y_2 \mid \{Y_1 > y_1\} \sim \exp(\alpha_2(y_1)). \tag{18}$$
Denote the survival functions of $Y_1$ and $Y_2$ by $\psi_1(y_1) = P(Y_1 > y_1)$ and $\psi_2(y_2) = P(Y_2 > y_2)$. It then follows that
$$\psi_2(y_2) e^{-\alpha_1(y_2) y_1} = P(Y_1 > y_1, Y_2 > y_2) = \psi_1(y_1) e^{-\alpha_2(y_1) y_2}. \tag{19}$$
Taking logarithms we have:
$$\log \psi_2(y_2) - \alpha_1(y_2) y_1 = \log \psi_1(y_1) - \alpha_2(y_1) y_2. \tag{20}$$
This is a Stephanos–Levi–Civita–Suto functional equation (see Arnold et al. [5], p. 13) which is readily solved to yield the following expression for the joint survival function of $(Y_1, Y_2)$:
$$P(Y_1 > y_1, Y_2 > y_2) = \exp[-\alpha_1 y_1 - \alpha_2 y_2 - \alpha_1 \alpha_2 \delta y_1 y_2] \tag{21}$$
for $y_1 > 0$, $y_2 > 0$, where $\alpha_1 > 0$, $\alpha_2 > 0$ and $0 \le \delta \le 1$. This is recognizable as Gumbel's Type I bivariate exponential distribution (with exponential marginals). From Equation (21) we obtain the joint survival function of $(X_1, X_2)$ in the form:
$$P(X_1 > x_1, X_2 > x_2) = [1 - F_1(x_1)]^{\alpha_1} [1 - F_2(x_2)]^{\alpha_2} \times \exp\{-\alpha_1 \alpha_2 \delta \log[1 - F_1(x_1)] \log[1 - F_2(x_2)]\}. \tag{22}$$
Then, the joint cumulative distribution function is
$$\begin{aligned} F(x_1, x_2) ={}& [1 - F_1(x_1)]^{\alpha_1} [1 - F_2(x_2)]^{\alpha_2} \exp\{-\alpha_1 \alpha_2 \delta \log[1 - F_1(x_1)] \log[1 - F_2(x_2)]\} \\ &+ 1 - (1 - F_1(x_1))^{\alpha_1} + 1 - (1 - F_2(x_2))^{\alpha_2} - 1, \end{aligned}$$
and the joint density function is
$$\begin{aligned} f_{X_1, X_2}(x_1, x_2) ={}& \alpha_1 \alpha_2 f_1(x_1) [1 - F_1(x_1)]^{\alpha_1 - 1} f_2(x_2) [1 - F_2(x_2)]^{\alpha_2 - 1} \\ &\times \exp\{-\alpha_1 \alpha_2 \delta \log[1 - F_1(x_1)] \log[1 - F_2(x_2)]\} \\ &\times \{[1 - \alpha_1 \delta \log(1 - F_1(x_1))][1 - \alpha_2 \delta \log(1 - F_2(x_2))] - \delta\}. \end{aligned}$$
The vector $(Y_1, Y_2)$ with survival function (21) has exponential marginals, i.e., $Y_i \sim \exp(\alpha_i)$ for $i = 1, 2$, and thus $X_i \sim PH(F_i, \alpha_i)$ for $i = 1, 2$. Consequently, for $j, j' = 1, 2$ with $j \ne j'$, dividing the joint density by the marginal density of $X_{j'}$ shows that the conditional densities are given by
$$\begin{aligned} f_{X_j \mid X_{j'}}(x_j \mid x_{j'}) ={}& \alpha_j f_j(x_j) [1 - F_j(x_j)]^{\alpha_j - 1} \exp\{-\alpha_j \alpha_{j'} \delta \log[1 - F_j(x_j)] \log[1 - F_{j'}(x_{j'})]\} \\ &\times \{[1 - \alpha_j \delta \log(1 - F_j(x_j))][1 - \alpha_{j'} \delta \log(1 - F_{j'}(x_{j'}))] - \delta\}. \end{aligned}$$
If $(X_1, X_2)$ has a survival function of the form (22), we write
$$(X_1, X_2) \sim PHC(II)(F_1, \alpha_1; F_2, \alpha_2; \delta).$$
Note that $X_1$ and $X_2$ in (22) will be independent if and only if $\delta = 0$.
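The defining property of the second kind of model can be checked numerically from the joint survival function: dividing $P(X_1 > x_1, X_2 > x_2)$ by $P(X_2 > x_2)$ should give $[1 - F_1(x_1)]^{\alpha_1(x_2)}$ with $\alpha_1(x_2) = \alpha_1 \{1 - \alpha_2 \delta \log[1 - F_2(x_2)]\}$. The following sketch assumes Weibull baselines and hypothetical parameter values; the function names are illustrative only.

```python
import math

def surv_phc2(x1, x2, a1, a2, delta, F1, F2):
    """Joint survival function of the PHC(II) model."""
    L1 = math.log(1.0 - F1(x1))
    L2 = math.log(1.0 - F2(x2))
    return math.exp(a1 * L1 + a2 * L2 - a1 * a2 * delta * L1 * L2)

def cond_alpha1(x2, a1, a2, delta, F2):
    """Proportionality parameter of X1 given {X2 > x2}."""
    return a1 * (1.0 - a2 * delta * math.log(1.0 - F2(x2)))

# Hypothetical Weibull baselines (shapes 1.3 and 0.9) and parameters.
F1 = lambda x: 1.0 - math.exp(-x ** 1.3)
F2 = lambda x: 1.0 - math.exp(-x ** 0.9)
a1, a2, delta = 1.5, 0.7, 0.4
x1, x2 = 0.8, 1.1

# P(X1 > x1 | X2 > x2) computed two ways.
conditional = surv_phc2(x1, x2, a1, a2, delta, F1, F2) / surv_phc2(0.0, x2, a1, a2, delta, F1, F2)
ph_form = (1.0 - F1(x1)) ** cond_alpha1(x2, a1, a2, delta, F2)
```

Setting `delta = 0` makes `cond_alpha1` constant in `x2`, which corresponds to the independence case.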

4. If $F_1$ and $F_2$ Are Known

Suppose that we have available a sample of size $n$, $(X_{1,j}, X_{2,j})$, $j = 1, 2, \ldots, n$, from one of the bivariate proportional hazard models discussed in this paper. Since $F_1$ and $F_2$ are known, it is appropriate to transform the data to obtain
$$Y_{i,j} = -\log(1 - F_i(X_{i,j})), \quad i = 1, 2, \quad j = 1, 2, \ldots, n,$$
and thus to have a sample $(Y_{1,j}, Y_{2,j})$, $j = 1, 2, \ldots, n$, from the corresponding well-known bivariate exponential distribution. See the following references for appropriate estimation strategies for these bivariate exponential data sets:
  • Gumbel [6],
  • Besag [7],
  • Arnold and Strauss ([3,8]),
  • Castillo and Hadi [9],
  • Arnold et al. [5].
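When the baseline distributions are known, the transformation to the exponential scale is a one-liner. The sketch below assumes Weibull baselines, for which $-\log(1 - F(x)) = x^{\theta}$; the data values and the function name are hypothetical.

```python
import math

def to_exponential_scale(sample, F1, F2):
    """Map each pair (x1, x2) to (y1, y2) with y_i = -log(1 - F_i(x_i))."""
    return [(-math.log1p(-F1(x1)), -math.log1p(-F2(x2))) for x1, x2 in sample]

theta, tau = 1.5, 2.0                                  # assumed known shape parameters
F1 = lambda x: 1.0 - math.exp(-x ** theta)
F2 = lambda x: 1.0 - math.exp(-x ** tau)

sample = [(0.7, 1.2), (1.4, 0.9), (0.3, 0.5)]          # made-up observations
ys = to_exponential_scale(sample, F1, F2)
```

The transformed pairs can then be passed to any estimation routine for the corresponding bivariate exponential model.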

5. If $F_1$ and $F_2$ Are Known to Belong to Some Given Parametric Families

We will illustrate this with a particular example. Other examples may be treated in analogous fashion. Suppose that in the PHC(II) model, (23), we replace $F_1(x_1)$ by $F_1(x_1; \theta)$ and $F_2(x_2)$ by $F_2(x_2; \tau)$, where the parameters $\theta$ and $\tau$ are unknown. In this case, the model becomes more complicated, but we can still envision success in estimating all the parameters in the model. As a specific example, consider the following distributions of the Weibull form:
$$F_1(x_1; \theta) = 1 - e^{-x_1^{\theta}}, \quad x_1 > 0, \tag{26}$$
and
$$F_2(x_2; \tau) = 1 - e^{-x_2^{\tau}}, \quad x_2 > 0. \tag{27}$$
The corresponding log-likelihood function is of the form
$$\begin{aligned} \ell(\boldsymbol{\theta}; X_1, X_2) ={}& n \log(\alpha_1 \alpha_2 \theta \tau) + (\theta - 1) \sum_{i=1}^{n} \log(x_{1i}) + (\tau - 1) \sum_{i=1}^{n} \log(x_{2i}) \\ &- \alpha_1 \sum_{i=1}^{n} x_{1i}^{\theta} - \alpha_1 \alpha_2 \delta \sum_{i=1}^{n} x_{1i}^{\theta} x_{2i}^{\tau} - \alpha_2 \sum_{i=1}^{n} x_{2i}^{\tau} \\ &+ \sum_{i=1}^{n} \log\{(1 + \alpha_1 \delta x_{1i}^{\theta})(1 + \alpha_2 \delta x_{2i}^{\tau}) - \delta\}. \end{aligned} \tag{28}$$
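The score function $U(\boldsymbol{\theta}) = (U(\theta), U(\tau), U(\delta), U(\alpha_1), U(\alpha_2))$ has elements which are derivatives of the log-likelihood function with respect to the parameters and thus are given by
$$U(\alpha_1) = \frac{\partial \ell(\boldsymbol{\theta})}{\partial \alpha_1} = \frac{n}{\alpha_1} - \sum_{i=1}^{n} x_{1i}^{\theta} - \alpha_2 \delta \sum_{i=1}^{n} x_{1i}^{\theta} x_{2i}^{\tau} + \delta \sum_{i=1}^{n} \frac{(1 + \alpha_2 \delta x_{2i}^{\tau}) x_{1i}^{\theta}}{(1 + \alpha_1 \delta x_{1i}^{\theta})(1 + \alpha_2 \delta x_{2i}^{\tau}) - \delta},$$
$$U(\alpha_2) = \frac{\partial \ell(\boldsymbol{\theta})}{\partial \alpha_2} = \frac{n}{\alpha_2} - \sum_{i=1}^{n} x_{2i}^{\tau} - \alpha_1 \delta \sum_{i=1}^{n} x_{1i}^{\theta} x_{2i}^{\tau} + \delta \sum_{i=1}^{n} \frac{(1 + \alpha_1 \delta x_{1i}^{\theta}) x_{2i}^{\tau}}{(1 + \alpha_1 \delta x_{1i}^{\theta})(1 + \alpha_2 \delta x_{2i}^{\tau}) - \delta},$$
$$U(\delta) = \frac{\partial \ell(\boldsymbol{\theta})}{\partial \delta} = -\alpha_1 \alpha_2 \sum_{i=1}^{n} x_{1i}^{\theta} x_{2i}^{\tau} + \sum_{i=1}^{n} \frac{\alpha_1 x_{1i}^{\theta} + \alpha_2 x_{2i}^{\tau} - 1 + 2 \alpha_1 \alpha_2 \delta x_{1i}^{\theta} x_{2i}^{\tau}}{(1 + \alpha_1 \delta x_{1i}^{\theta})(1 + \alpha_2 \delta x_{2i}^{\tau}) - \delta},$$
$$\begin{aligned} U(\theta) = \frac{\partial \ell(\boldsymbol{\theta})}{\partial \theta} ={}& \frac{n}{\theta} + \sum_{i=1}^{n} \log(x_{1i}) - \alpha_1 \sum_{i=1}^{n} x_{1i}^{\theta} \log(x_{1i}) - \alpha_1 \alpha_2 \delta \sum_{i=1}^{n} x_{1i}^{\theta} x_{2i}^{\tau} \log(x_{1i}) \\ &+ \alpha_1 \delta \sum_{i=1}^{n} \frac{(1 + \alpha_2 \delta x_{2i}^{\tau}) x_{1i}^{\theta} \log(x_{1i})}{(1 + \alpha_1 \delta x_{1i}^{\theta})(1 + \alpha_2 \delta x_{2i}^{\tau}) - \delta}, \end{aligned}$$
$$\begin{aligned} U(\tau) = \frac{\partial \ell(\boldsymbol{\theta})}{\partial \tau} ={}& \frac{n}{\tau} + \sum_{i=1}^{n} \log(x_{2i}) - \alpha_2 \sum_{i=1}^{n} x_{2i}^{\tau} \log(x_{2i}) - \alpha_1 \alpha_2 \delta \sum_{i=1}^{n} x_{1i}^{\theta} x_{2i}^{\tau} \log(x_{2i}) \\ &+ \alpha_2 \delta \sum_{i=1}^{n} \frac{(1 + \alpha_1 \delta x_{1i}^{\theta}) x_{2i}^{\tau} \log(x_{2i})}{(1 + \alpha_1 \delta x_{1i}^{\theta})(1 + \alpha_2 \delta x_{2i}^{\tau}) - \delta}. \end{aligned}$$
By equating the scores to zero we obtain the score equations, i.e., $U(\boldsymbol{\theta}) = 0$. These equations are typically solved by means of Newton–Raphson or quasi-Newton numerical methods to obtain the maximum likelihood estimators $\hat{\boldsymbol{\theta}} = (\hat{\theta}, \hat{\tau}, \hat{\delta}, \hat{\alpha}_1, \hat{\alpha}_2)$ of the parameter vector $\boldsymbol{\theta} = (\theta, \tau, \delta, \alpha_1, \alpha_2)$. The observed information matrix of $\boldsymbol{\theta}$ is given by $K(\boldsymbol{\theta}) = -\frac{d}{d\boldsymbol{\theta}} U(\boldsymbol{\theta}) = (K_{\theta_j \theta_{j'}})$, i.e., with elements of the form of minus the second derivative of the log-likelihood function with respect to the parameters. The Fisher information matrix of the vector $\boldsymbol{\theta}$ is given by $I(\boldsymbol{\theta}) = E[K(\boldsymbol{\theta})]$ and should be calculated numerically.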
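The log-likelihood for the PHC(II) model with Weibull components and one of its score components can be coded directly and cross-checked by finite differences. This is a minimal sketch in plain Python; the data values, the parameter ordering $(\theta, \tau, \delta, \alpha_1, \alpha_2)$ in `params`, and the function names are assumptions made for illustration.

```python
import math

def loglik(params, data):
    """Log-likelihood of the PHC(II) model with Weibull baselines."""
    theta, tau, delta, a1, a2 = params
    total = len(data) * math.log(a1 * a2 * theta * tau)
    for x1, x2 in data:
        u, v = x1 ** theta, x2 ** tau
        total += (theta - 1) * math.log(x1) + (tau - 1) * math.log(x2)
        total -= a1 * u + a2 * v + a1 * a2 * delta * u * v
        total += math.log((1 + a1 * delta * u) * (1 + a2 * delta * v) - delta)
    return total

def score_alpha1(params, data):
    """Analytic derivative of the log-likelihood with respect to alpha_1."""
    theta, tau, delta, a1, a2 = params
    s = len(data) / a1
    for x1, x2 in data:
        u, v = x1 ** theta, x2 ** tau
        bracket = (1 + a1 * delta * u) * (1 + a2 * delta * v) - delta
        s += -u - a2 * delta * u * v + delta * (1 + a2 * delta * v) * u / bracket
    return s

data = [(0.7, 1.2), (1.4, 0.9), (0.3, 0.5), (2.1, 1.8)]   # made-up observations
params = (1.2, 0.8, 0.3, 1.0, 1.5)

# Central finite difference in the alpha_1 coordinate (index 3).
h = 1e-6
up = params[:3] + (params[3] + h,) + params[4:]
down = params[:3] + (params[3] - h,) + params[4:]
numeric_score = (loglik(up, data) - loglik(down, data)) / (2 * h)
analytic_score = score_alpha1(params, data)
```

In practice the full score vector would be passed to a Newton–Raphson or quasi-Newton routine, as the text describes; the finite-difference comparison here is only a check that the analytic expression matches the log-likelihood.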
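When we use the base distributions (26) and (27) in the $PHC(I)$ model, a technique known as pseudo-likelihood estimation (see Arnold and Strauss [10]) will provide estimates of all five parameters in the model. Besag [7] defined the pseudo-likelihood estimator of $\boldsymbol{\theta}$ as the value $\boldsymbol{\theta}_0$ of $\boldsymbol{\theta}$ that maximizes the pseudo-likelihood function, which in the present bivariate situation is based on the conditional PH densities and is given by
$$L_P(\beta; X_1, X_2) = \prod_{i=1}^{n} f_{X_1 \mid X_2}(x_{1i} \mid x_{2i})\, f_{X_2 \mid X_1}(x_{2i} \mid x_{1i}).$$
Thus for the example with distributions:
$$F_1(x_1; \gamma_1) = x_1^{\gamma_1}, \quad 0 < x_1 < 1,$$
and
$$F_2(x_2; \gamma_2) = x_2^{\gamma_2}, \quad 0 < x_2 < 1,$$
the conditional distributions implied by (13) are $X_1 \mid X_2 = x_2 \sim PH(F_1, \alpha_1 [1 - \alpha_2 \delta \log(1 - x_2^{\gamma_2})])$ and $X_2 \mid X_1 = x_1 \sim PH(F_2, \alpha_2 [1 - \alpha_1 \delta \log(1 - x_1^{\gamma_1})])$, and we have the following log-pseudo-likelihood:
$$\begin{aligned} \ell_P(\beta; X_1, X_2) ={}& n \log(\alpha_1 \alpha_2 \gamma_1 \gamma_2) + (\gamma_1 - 1) \sum_{i=1}^{n} \log(x_{1i}) + (\gamma_2 - 1) \sum_{i=1}^{n} \log(x_{2i}) \\ &+ \sum_{i=1}^{n} \log\{[1 - \alpha_1 \delta \log(1 - x_{1i}^{\gamma_1})][1 - \alpha_2 \delta \log(1 - x_{2i}^{\gamma_2})]\} \\ &+ (\alpha_1 - 1) \sum_{i=1}^{n} \log(1 - x_{1i}^{\gamma_1}) + (\alpha_2 - 1) \sum_{i=1}^{n} \log(1 - x_{2i}^{\gamma_2}) \\ &- 2 \alpha_1 \alpha_2 \delta \sum_{i=1}^{n} \log(1 - x_{1i}^{\gamma_1}) \log(1 - x_{2i}^{\gamma_2}). \end{aligned}$$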
Parallel to the definition of the score function, the pseudo-score function is defined to be the vector whose coordinates are partial derivatives of the log-pseudo-likelihood function with respect to each of the parameters in the model. It is denoted by $U_p(\beta) = (U_p(\gamma_1), U_p(\gamma_2), U_p(\delta), U_p(\alpha_1), U_p(\alpha_2))$.
The estimating equations are constructed by setting the elements of the pseudo-score vector equal to zero. Solutions of these equations correspond to the pseudo-likelihood estimates of the parameters of the model. Typically, these solutions are obtained numerically using iterative methods such as Newton–Raphson or quasi-Newton.
The pseudo-likelihood estimator $\hat{\beta}$ of $\beta$ obtained in the above fashion can be verified to be consistent and asymptotically normally distributed with covariance matrix given by
$$\Sigma_p = J^{-1}(\beta) K(\beta) J^{-1}(\beta)$$
(see Arnold and Strauss [10]), where, for each pair of components $\beta_l$, $\beta_m$ of $\beta$,
$$K_{lm}(\beta) = E\left[\frac{\partial \ell_p(\beta)}{\partial \beta_l} \frac{\partial \ell_p(\beta)}{\partial \beta_m}\right], \quad J_{lm}(\beta) = E\left[-\frac{\partial^2 \ell_p(\beta)}{\partial \beta_l \partial \beta_m}\right].$$
As a consistent estimate of the asymptotic variance-covariance matrix of the pseudo-likelihood estimator, we will use the sandwich estimator proposed by Cheng and Riu [11]. This estimator is developed as follows.
Let $U_{pi}(\beta) = \partial \ell_{pi}(\beta) / \partial \beta$ be the vector of pseudo-scores for the $i$-th observation, where $\ell_{pi}(\beta)$ is the contribution of that observation to $\ell_p(\beta)$. Then define
$$\hat{J}_n(\beta) = -\frac{1}{n} \sum_{i=1}^{n} \left. \frac{\partial U_{pi}(\beta)}{\partial \beta} \right|_{\tilde{\beta}},$$
which is minus the average over all the observations of the matrices of second derivatives of $\ell_p(\beta)$, evaluated at the pseudo-likelihood estimator $\tilde{\beta}$. In addition, define
$$\hat{K}_n(\beta) = \frac{1}{n} \sum_{i=1}^{n} \left. U_{pi}(\beta) U_{pi}(\beta)^{\top} \right|_{\tilde{\beta}}.$$
Using this, we construct a consistent sandwich estimator of the asymptotic variance-covariance matrix in the form
$$\hat{\Sigma}(\tilde{\beta}) = \frac{1}{n} \hat{J}_n^{-1}(\tilde{\beta}) \hat{K}_n(\tilde{\beta}) \hat{J}_n^{-1}(\tilde{\beta}).$$
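To make the sandwich construction concrete, here is a deliberately simplified one-parameter illustration. It uses an exponential working model with log-likelihood contribution $\log \lambda - \lambda y_i$, which is a stand-in assumed for this sketch and not the five-parameter model of this section. $\hat{J}_n$ is taken as minus the average derivative of the per-observation scores; the sandwich itself does not depend on that sign convention.

```python
def sandwich_variance_exponential(ys):
    """Scalar sandwich estimate (1/n) * J^{-1} K J^{-1} for an exponential rate."""
    n = len(ys)
    lam = n / sum(ys)                                   # maximizer of sum(log(lam) - lam * y)
    scores = [1.0 / lam - y for y in ys]                # per-observation scores at the estimate
    j_hat = sum(1.0 / lam ** 2 for _ in ys) / n         # minus the average score derivative
    k_hat = sum(s * s for s in scores) / n              # average squared score
    return lam, k_hat / (n * j_hat ** 2)

ys = [0.5, 1.5, 1.0, 2.0]                               # made-up data
lam_hat, var_hat = sandwich_variance_exponential(ys)
```

For a vector parameter the same three steps apply with `j_hat` and `k_hat` as matrices and the divisions replaced by matrix inverses.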
A detailed discussion and analysis of such a model, but with power Lindley base distributions, utilizing pseudo-likelihood estimation, may be found in Martínez-Flórez et al. [12].

6. If $F_1$ and $F_2$ Are Unknown

All but one of the bivariate proportional hazard models described in this paper have marginals of the proportional hazard form. The exception is the PHC(I) model which, for unknown $F_1$ and $F_2$, we will discuss in Section 7. For the other models, we know that if $F_1$ and $F_2$ were known, we could transform the data to obtain a sample from a well-known bivariate exponential model. Consequently, if we consider an estimate $\tilde{F}_{1,n}(x)$ of $F_1$ based on $X_{1,1}, X_{1,2}, \ldots, X_{1,n}$ and an estimate $\tilde{F}_{2,n}(x)$ of $F_2$ based on $X_{2,1}, X_{2,2}, \ldots, X_{2,n}$, we can transform the data, mirroring the transformation used in Section 4, using
$$Z_{1,j} = -\log(1 - \tilde{F}_{1,n}(X_{1,j}))$$
and
$$Z_{2,j} = -\log(1 - \tilde{F}_{2,n}(X_{2,j})),$$
and then we will have approximately a sample from a bivariate distribution with standard exponential marginals and we can then estimate the parameters in this exponential model. Note that for identifiability in models with unknown $F_1$ and $F_2$ we have to fix $\alpha_1$ and $\alpha_2$ to be equal to 1.
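One way to carry out this marginal transformation is with a rescaled empirical distribution function, $\tilde{F}_n(x_{(r)}) = r/(n+1)$ for the observation of rank $r$, so that $1 - \tilde{F}_n$ never equals zero. The rescaling and the function name below are assumptions for this sketch; the paper does not specify which estimate of $F_i$ is used.

```python
import math

def empirical_exponential_scores(values):
    """Return z_j = -log(1 - rank_j / (n + 1)) for each observation (no ties assumed)."""
    n = len(values)
    order = sorted(range(n), key=lambda j: values[j])   # indices from smallest to largest
    z = [0.0] * n
    for rank, j in enumerate(order, start=1):
        z[j] = -math.log(1.0 - rank / (n + 1.0))
    return z

z = empirical_exponential_scores([3.0, 1.0, 2.0])       # ranks 3, 1, 2
```

Applying this function to each coordinate separately gives pairs with approximately standard exponential marginals, after which only the dependence parameter remains to be estimated.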
For the PHC(II) model with Weibull component distributions, given in (26) and (27), a small simulation study of the performance of the maximum likelihood parameter estimates has been implemented for a variety of sample sizes and for several parametric configurations. With minimal loss of generality we set $\alpha_1 = \alpha_2 = 1$ throughout the simulation study. Three values of the dependence parameter $\delta$ were used, namely 0.15, 0.30 and 0.45, together with four sample sizes $n = 30, 50, 70, 90$. Table 1 presents results for three representative choices of values for $\theta$ and $\tau$. As measures of performance, the relative bias (RB) and the square root of the mean squared error ($\sqrt{MSE}$) are given.
The results in Table 1 confirm that both the relative bias and the root mean-squared error of the estimates decrease as sample size increases.

7. If $F_1$ and $F_2$ Are Unknown in the PHC(I) Model

Our model is of the $PHC(I)(F_1, 1; F_2, 1; \delta)$ form, i.e., the density (13) with $\alpha_1 = \alpha_2 = 1$,
$$f_{X_1, X_2}(x_1, x_2) = k(\delta) f_1(x_1) f_2(x_2) \exp\{-\delta \log[1 - F_1(x_1)] \log[1 - F_2(x_2)]\},$$
where $\delta$, $F_1$, and $F_2$ are unknown. Although it would be easy to estimate $\delta$ via pseudo-likelihood if $F_1$ and $F_2$ were known, it is not apparent how to estimate $F_1$ and $F_2$ assuming that $\delta$ is known. So it is not clear how to implement an iterative strategy for estimating $F_1$, $F_2$ and $\delta$ simultaneously. Perhaps our only choice is to assume that the $F_i$'s belong to some parametric families of distributions, once more utilizing pseudo-likelihood to avoid dealing with $k(\delta)$.

8. Application

The data analyzed in this example consist of the maximum water levels registered at two stations on the Fox River in Wisconsin during the period 1918–1950. Measurements were made at an upstream location (Berlin, $X_1$) and a downstream location (Wrightstown, $X_2$). This data set was previously analyzed by Gumbel and Mustafi [13] using a bivariate extreme model.
In our analysis of this data set we will fit four models, namely:
  • The Arnold and Strauss [3] bivariate exponential conditionals distribution, denoted by BEC.
  • Gumbel’s [6] first bivariate exponential distribution, denoted by BG(I).
  • The proportional hazard conditionals Weibull extension of the BEC distribution, denoted by PHC(I)-W.
  • The proportional hazard conditionals Weibull extension of the BG(I) distribution, denoted by PHC(II)-W.
In both of the Weibull proportional hazard conditionals extensions mentioned above, i.e., PHC(I)-W and PHC(II)-W, as described in Section 3, we use the following choices for the component distributions $F_1$ and $F_2$:
$$F_1(x_1; \theta) = 1 - e^{-x_1^{\theta}}, \quad x_1 > 0, \quad \text{and} \quad F_2(x_2; \tau) = 1 - e^{-x_2^{\tau}}, \quad x_2 > 0.$$
Using the Arnold and Strauss [3] density given in Equation (11), the density of the PHC(I)-W is given by
$$f_{X_1, X_2}(x_1, x_2) = \alpha_1 \alpha_2 \theta \tau k(\delta) x_1^{\theta - 1} x_2^{\tau - 1} \exp\{-\alpha_1 x_1^{\theta} - \alpha_2 x_2^{\tau} - \alpha_1 \alpha_2 \delta x_1^{\theta} x_2^{\tau}\}.$$
The corresponding log-pseudo-likelihood function for a sample of size $n$ takes the form
$$\begin{aligned} \ell_P(\boldsymbol{\theta}; X_1, X_2) ={}& n \log(\alpha_1 \alpha_2 \theta \tau) + (\theta - 1) \sum_{i=1}^{n} \log(x_{1i}) + (\tau - 1) \sum_{i=1}^{n} \log(x_{2i}) \\ &+ \sum_{i=1}^{n} \log\{(1 + \alpha_1 \delta x_{1i}^{\theta})(1 + \alpha_2 \delta x_{2i}^{\tau})\} \\ &- \alpha_1 \sum_{i=1}^{n} (1 + \alpha_2 \delta x_{2i}^{\tau}) x_{1i}^{\theta} - \alpha_2 \sum_{i=1}^{n} (1 + \alpha_1 \delta x_{1i}^{\theta}) x_{2i}^{\tau}. \end{aligned}$$
The log-pseudo-likelihood for the BEC model is obtained from the expression for the PHC(I)-W by setting θ = 1 and τ = 1 .
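The PHC(I)-W log-pseudo-likelihood avoids $k(\delta)$ because it only uses the two conditional densities, each a Weibull-type density whose rate depends on the other coordinate. The sketch below codes it in plain Python and checks that, at $\delta = 0$, it reduces to the sum of the two marginal log-densities; the data, the parameter ordering, and the function names are illustrative assumptions.

```python
import math

def log_pseudo_lik(params, data):
    """Log-pseudo-likelihood of the PHC(I) model with Weibull baselines."""
    theta, tau, delta, a1, a2 = params
    total = len(data) * math.log(a1 * a2 * theta * tau)
    for x1, x2 in data:
        u, v = x1 ** theta, x2 ** tau
        total += (theta - 1) * math.log(x1) + (tau - 1) * math.log(x2)
        total += math.log((1 + a1 * delta * u) * (1 + a2 * delta * v))
        total -= a1 * (1 + a2 * delta * v) * u + a2 * (1 + a1 * delta * u) * v
    return total

def independent_loglik(theta, tau, a1, a2, data):
    """Sum of the two marginal proportional hazard Weibull log-densities."""
    total = 0.0
    for x1, x2 in data:
        total += math.log(a1 * theta) + (theta - 1) * math.log(x1) - a1 * x1 ** theta
        total += math.log(a2 * tau) + (tau - 1) * math.log(x2) - a2 * x2 ** tau
    return total

data = [(0.7, 1.2), (1.4, 0.9), (0.3, 0.5)]             # made-up observations
pl_indep = log_pseudo_lik((1.4, 0.9, 0.0, 0.6, 1.1), data)
ll_indep = independent_loglik(1.4, 0.9, 0.6, 1.1, data)
bec_value = log_pseudo_lik((1.0, 1.0, 0.5, 0.6, 1.1), data)   # BEC special case: theta = tau = 1
```

Passing `theta = tau = 1` gives the BEC log-pseudo-likelihood, so the same function serves both fitted models.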
The log-likelihood function for a sample of size n from the PHC(II)-W is of the form given in Equation (28), with simple change of notation. In this case the corresponding log-likelihood function for the BG(I) model is again obtained by setting θ = 1 and τ = 1 .
Using the Fox river data, maximizing the log-likelihood for the models BG(I) and PHC(II)-W and the log-pseudo-likelihood for the models BEC and PHC(I)-W, we obtain the estimates of the parameters of the four models given in Table 2 (with standard errors in parentheses).
To compare model fits, we use the AIC (Akaike [14]) criterion, namely $AIC = -2 \hat{\ell}(\cdot) + 2p$. We also consider the BIC (Schwarz [15]) criterion, namely $BIC = -2 \hat{\ell}(\cdot) + \log(n)\, p$, where $\hat{\ell}(\cdot)$ is the maximized log-likelihood (or log-pseudo-likelihood), $p$ is the number of parameters for the model being considered, and $n$ is the sample size. The best model is the one with the smallest AIC or BIC.
According to the values of the AIC and BIC criteria for the Fox river data, the best model is the PHC(I)-W followed by the PHC(II)-W model.
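The two criteria are simple functions of the maximized log-likelihood. The sketch below recovers $-2\hat{\ell}$ from the AIC values reported in Table 2 and recomputes the BIC values; the sample size $n = 33$ is an assumption inferred from the 1918–1950 observation period, and the parameter counts (3 and 5) are read off the models' definitions.

```python
import math

def bic(neg2loglik, p, n):
    """BIC = -2 * loglik + log(n) * p."""
    return neg2loglik + math.log(n) * p

n = 33  # assumed: one annual maximum per year for 1918-1950

# (reported AIC, number of parameters) for each fitted model, from Table 2.
aic_table = {
    "BG(I)": (400.6863, 3),
    "PHC(II)-W": (364.4660, 5),
    "BEC": (415.2929, 3),
    "PHC(I)-W": (353.4773, 5),
}

# Invert AIC = -2 * loglik + 2p to recover -2 * loglik, then rebuild BIC.
neg2ll = {m: a - 2 * p for m, (a, p) in aic_table.items()}
bic_values = {m: bic(neg2ll[m], p, n) for m, (a, p) in aic_table.items()}
best_by_aic = min(aic_table, key=lambda m: aic_table[m][0])
```

Under these assumptions the recomputed BIC values agree with those reported in Table 2 to four decimal places, and the smallest AIC belongs to the PHC(I)-W model.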
Since the BEC and BG(I) models are special cases of the PHC(I)-W and PHC(II)-W models, respectively, obtained by setting $\theta = \tau = 1$, we may test the hypotheses
$$H_0: (\theta, \tau) = (1, 1) \quad \text{versus} \quad H_1: (\theta, \tau) \ne (1, 1)$$
for comparing the PHC(II)-W and PHC(I)-W models with the BG(I) and BEC models, respectively.
Using the likelihood ratio statistic,
$$\Lambda = \frac{L_{f}(\hat{\beta})}{L_{f_W}(\hat{\beta})},$$
where $L_f$ and $L_{f_W}$ denote the maximized likelihoods under the null model and under its Weibull extension, respectively, we obtain, asymptotically under $H_0$,
$$-2 \log(\Lambda) \sim \chi_2^2.$$
The corresponding values of $-2 \log(\Lambda)$ in each case are provided in Table 3 (note that in the BEC versus PHC(I)-W comparison the log-pseudo-likelihoods have been utilized instead of log-likelihoods). Both values are greater than $\chi^2_{2, 99\%} = 9.210$, indicating that the PHC(II)-W and PHC(I)-W models are significantly better at the 1% level. Thus, the PHC(II)-W and PHC(I)-W models appear to be good alternatives for fitting this data set. The choice between the PHC(I)-W and PHC(II)-W is not so clear-cut, but perhaps the PHC(I)-W might be considered to be marginally better.
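The likelihood ratio comparison can be reproduced from the reported AIC values, since $-2 \log(\Lambda)$ is the difference between the $-2\hat{\ell}$ values of the null and extended models. For two degrees of freedom the chi-square survival function is $e^{-x/2}$, so the critical value has the closed form $-2 \log(1 - q)$. The function names below are hypothetical, and the parameter counts (3 and 5) are read off the models' definitions.

```python
import math

def chi2_2df_quantile(q):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - q)

def lr_statistic(aic_null, p_null, aic_alt, p_alt):
    """-2 log(Lambda) recovered from AIC = -2 * loglik + 2p for nested models."""
    return (aic_null - 2 * p_null) - (aic_alt - 2 * p_alt)

critical = chi2_2df_quantile(0.99)
lr_gumbel = lr_statistic(400.6863, 3, 364.4660, 5)   # BG(I) versus PHC(II)-W
lr_bec = lr_statistic(415.2929, 3, 353.4773, 5)      # BEC versus PHC(I)-W
reject_both = lr_gumbel > critical and lr_bec > critical
```

These calculations return the two statistics listed in Table 3 and the critical value 9.210 quoted in the text.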
The graphs in Figure 1a,b and Figure 2a,b show the contours of the densities BG(I) and BEC and of the fitted models for PHC(II)-W and PHC(I)-W, respectively.
Under the assumption that the forms of the $F_i$'s are unknown, we use the transformations of Section 6,
$$Z_{1,j} = -\log(1 - \tilde{F}_{1,n}(X_{1,j})) \quad \text{and} \quad Z_{2,j} = -\log(1 - \tilde{F}_{2,n}(X_{2,j})),$$
to arrive at a BG(I) model with joint survival function
$$S(z_1, z_2) = \exp(-z_1 - z_2 - \delta z_1 z_2).$$
Then, using the expression for the maximum likelihood estimate of $\delta$ provided by Kotz et al. ([2], p. 352), we obtain $\hat{\delta} = 0.2986$, a much smaller value than the estimated value of the parameter $\delta$ obtained assuming a known form for the $F_i$'s. Perhaps this indicates that the Weibull choices for the $F_i$'s are not optimal.

9. Discussion

The bivariate models discussed in this paper utilize quite different approaches to their construction and thus can be expected to exhibit significantly different distributional properties, especially with regard to dependence. Future research on such models should put some focus on the problem of selecting the appropriate one of these models for a particular data set. Of course, the old stand-by of fitting via maximum likelihood and comparing models via AIC and BIC is always available.

Author Contributions

Conceptualization, B.C.A., G.M.-F. and H.W.G.; Formal analysis, B.C.A., G.M.-F. and H.W.G.; Investigation, B.C.A., G.M.-F. and H.W.G.; Methodology, B.C.A. and G.M.-F.; Software, G.M.-F.; Supervision, B.C.A. and H.W.G.; Validation, B.C.A., G.M.-F. and H.W.G. Writing—original draft preparation, B.C.A., G.M.-F. and H.W.G.; Funding acquisition, H.W.G. All of the authors contributed significantly to this research article. All authors have read and agreed to the published version of the manuscript.

Funding

The research of H.W. Gómez was supported by SEMILLERO UA-2022 project, Chile. The research of G. Martínez-Flórez was supported by Universidad de Córdoba, Montería, Colombia.

Data Availability Statement

The data can be found in Gumbel and Mustafi (1967).

Acknowledgments

The authors thank the Editor and three anonymous referees for their constructive comments and suggestions, which have greatly helped them to improve the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Marshall, A.W.; Olkin, I. A Multivariate Exponential Distribution. J. Am. Statist. Assoc. 1967, 62, 30–44.
  2. Kotz, S.; Balakrishnan, N.; Johnson, N.L. Continuous Multivariate Distributions; John Wiley and Sons: Hoboken, NJ, USA, 2000.
  3. Arnold, B.C.; Strauss, D.J. Bivariate distributions with exponential conditionals. J. Am. Statist. Assoc. 1988, 83, 522–527.
  4. Arnold, B.C.; Kim, Y.H. Conditional proportional hazards models. In Lifetime Data: Models in Reliability and Survival Analysis; Jewell, N.P., Kimber, A.C., Lee, M.L.T., Whitmore, G.A., Eds.; Springer: Boston, MA, USA, 1996; pp. 21–28.
  5. Arnold, B.C.; Castillo, E.; Sarabia, J.M. Conditional Specification of Statistical Models; Springer Series in Statistics; Springer: Berlin/Heidelberg, Germany, 1999.
  6. Gumbel, E.J. Bivariate Exponential Distributions. J. Am. Statist. Assoc. 1960, 55, 698–707.
  7. Besag, J. Statistical Analysis of Non-Lattice Data. J. R. Stat. Soc. Ser. D 1975, 24, 179–195.
  8. Arnold, B.C.; Strauss, D.J. Bivariate distributions with conditionals in prescribed exponential families. J. R. Stat. Soc. Ser. B 1991, 53, 365–375.
  9. Castillo, E.; Hadi, A.S. Modeling Lifetime Data with Application to Fatigue Models. J. Am. Statist. Assoc. 1995, 90, 1041–1054.
  10. Arnold, B.C.; Strauss, D.J. Pseudolikelihood estimation: Some examples. Sankhya Ser. B 1991, 53, 233–243.
  11. Cheng, C.; Riu, J. On estimating linear relationships when both variables are subject to heteroscedastic measurement errors. Technometrics 2006, 48, 511–519.
  12. Martínez-Flórez, G.; Arnold, B.C.; Gómez, H.W. A bivariate power Lindley survival distribution. 2022; Unpublished Work.
  13. Gumbel, E.; Mustafi, C.K. Some Analytical Properties of Bivariate Extremal Distributions. J. Am. Statist. Assoc. 1967, 62, 569–588.
  14. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
  15. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
Figure 1. Contours for (a) BG(I) model and (b) BEC model.
Figure 2. Contours for (a) PHC(II)-W model and (b) PHC(I)-W model.
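Table 1. RB and $\sqrt{MSE}$ for the PHC(II)-Weibull model.

| Parameters $(\theta, \tau, \delta, \alpha_1, \alpha_2)$ | $n$ | $\hat{\theta}$ RB | $\hat{\theta}$ $\sqrt{MSE}$ | $\hat{\tau}$ RB | $\hat{\tau}$ $\sqrt{MSE}$ | $\hat{\delta}$ RB | $\hat{\delta}$ $\sqrt{MSE}$ | $\hat{\alpha}_1$ RB | $\hat{\alpha}_1$ $\sqrt{MSE}$ | $\hat{\alpha}_2$ RB | $\hat{\alpha}_2$ $\sqrt{MSE}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (1.25, 0.75, 0.15, 1, 1) | 30 | 0.1964 | 0.2630 | 0.4915 | 0.4029 | 1.6753 | 0.3424 | 0.2011 | 0.2591 | 0.5361 | 0.5931 |
| | 50 | 0.1911 | 0.2583 | 0.4864 | 0.3853 | 1.4262 | 0.2828 | 0.1983 | 0.2342 | 0.5339 | 0.5705 |
| | 70 | 0.1842 | 0.2590 | 0.4819 | 0.3782 | 1.3480 | 0.2549 | 0.1974 | 0.2264 | 0.5330 | 0.5633 |
| | 90 | 0.1717 | 0.2597 | 0.4751 | 0.3686 | 1.3117 | 0.2376 | 0.1933 | 0.2165 | 0.5307 | 0.5541 |
| (1.25, 0.75, 0.30, 1, 1) | 30 | 0.1948 | 0.2675 | 0.4966 | 0.4078 | 1.6798 | 0.3393 | 0.1957 | 0.2542 | 0.5358 | 0.596 |
| | 50 | 0.1934 | 0.2620 | 0.4830 | 0.3823 | 1.4519 | 0.2841 | 0.1957 | 0.2310 | 0.5356 | 0.5731 |
| | 70 | 0.1887 | 0.2611 | 0.4827 | 0.3771 | 1.3260 | 0.2514 | 0.1935 | 0.2238 | 0.5332 | 0.5653 |
| | 90 | 0.1756 | 0.2579 | 0.4762 | 0.3688 | 1.2861 | 0.2351 | 0.1916 | 0.2178 | 0.5424 | 0.5629 |
| (1.25, 0.75, 0.45, 1, 1) | 30 | 0.1894 | 0.2610 | 0.5316 | 0.4336 | 0.4177 | 0.3002 | 0.1918 | 0.2463 | 0.6127 | 0.6808 |
| | 50 | 0.1827 | 0.2512 | 0.5149 | 0.4090 | 0.4022 | 0.2688 | 0.1904 | 0.2297 | 0.6018 | 0.6428 |
| | 70 | 0.1799 | 0.2483 | 0.4966 | 0.3884 | 0.4009 | 0.2604 | 0.1900 | 0.2210 | 0.5904 | 0.6193 |
| | 90 | 0.1679 | 0.2512 | 0.4905 | 0.3799 | 0.3958 | 0.2376 | 0.1779 | 0.2126 | 0.5878 | 0.6136 |
| (1.75, 0.5, 0.15, 1, 1) | 30 | 0.4228 | 0.7457 | 1.2543 | 0.6503 | 1.6778 | 0.3378 | 0.4287 | 0.4473 | 1.4193 | 1.4930 |
| | 50 | 0.4189 | 0.7427 | 1.2332 | 0.6305 | 1.5481 | 0.2941 | 0.4270 | 0.4353 | 1.4191 | 14.435 |
| | 70 | 0.4193 | 0.7422 | 1.2311 | 0.6262 | 1.3103 | 0.2455 | 0.4245 | 0.4347 | 1.4048 | 1.4378 |
| | 90 | 0.4093 | 0.7326 | 1.2276 | 0.6215 | 1.2531 | 0.2294 | 0.4227 | 0.4346 | 1.3939 | 1.4363 |
| (1.75, 0.5, 0.30, 1, 1) | 30 | 0.4218 | 0.7455 | 1.2862 | 0.6652 | 0.8103 | 0.3439 | 0.4204 | 0.4352 | 1.4968 | 1.5718 |
| | 50 | 0.4207 | 0.7428 | 1.2554 | 0.6408 | 0.7520 | 0.2984 | 0.4152 | 0.4288 | 1.4907 | 1.5178 |
| | 70 | 0.4171 | 0.7395 | 1.2544 | 0.6373 | 0.6981 | 0.2686 | 0.4154 | 0.4281 | 1.4804 | 1.5167 |
| | 90 | 0.4098 | 0.7334 | 1.2421 | 0.6300 | 0.6730 | 0.2539 | 0.4143 | 0.4250 | 1.4733 | 1.5102 |
| (1.75, 0.5, 0.45, 1, 1) | 30 | 0.4179 | 0.7376 | 1.2780 | 0.6602 | 0.4190 | 0.2985 | 0.4149 | 0.4270 | 1.5069 | 1.5796 |
| | 50 | 0.4169 | 0.7361 | 1.2606 | 0.6456 | 0.4115 | 0.2725 | 0.4148 | 0.4250 | 1.4762 | 1.5240 |
| | 70 | 0.4126 | 0.7326 | 1.2354 | 0.6268 | 0.4056 | 0.2597 | 0.4122 | 0.4239 | 1.4655 | 1.4988 |
| | 90 | 0.4034 | 0.7225 | 1.2303 | 0.6238 | 0.3916 | 0.2455 | 0.4062 | 0.4227 | 1.4550 | 1.4831 |
| (0.75, 1.5, 0.15, 1, 1) | 30 | 0.3903 | 0.3270 | 0.2506 | 0.3899 | 1.3177 | 0.3091 | 0.3590 | 0.4395 | 0.2778 | 0.3212 |
| | 50 | 0.3661 | 0.2980 | 0.2434 | 0.3887 | 1.1349 | 0.2508 | 0.3479 | 0.4006 | 0.2776 | 0.3044 |
| | 70 | 0.3516 | 0.2811 | 0.2493 | 0.3877 | 1.1461 | 0.2343 | 0.3471 | 0.3816 | 0.2761 | 0.2962 |
| | 90 | 0.3452 | 0.2721 | 0.2305 | 0.3871 | 1.0392 | 0.2106 | 0.3449 | 0.3755 | 0.2748 | 0.2929 |
| (0.75, 1.5, 0.30, 1, 1) | 30 | 0.3877 | 0.3280 | 0.2439 | 0.3831 | 0.6461 | 0.3136 | 0.3794 | 0.4639 | 0.2677 | 0.3001 |
| | 50 | 0.3699 | 0.3006 | 0.2424 | 0.3817 | 0.6403 | 0.2784 | 0.3539 | 0.4050 | 0.2660 | 0.2871 |
| | 70 | 0.3568 | 0.2861 | 0.2365 | 0.3785 | 0.6328 | 0.2611 | 0.3527 | 0.3896 | 0.2600 | 0.2884 |
| | 90 | 0.3532 | 0.2784 | 0.2273 | 0.3767 | 0.6066 | 0.2373 | 0.3513 | 0.3819 | 0.2492 | 0.2823 |
| (0.75, 1.5, 0.45, 1, 1) | 30 | 0.3933 | 0.3324 | 0.2529 | 0.3914 | 0.3934 | 0.2906 | 0.3774 | 0.4611 | 0.2664 | 0.3023 |
| | 50 | 0.3685 | 0.2996 | 0.2439 | 0.3900 | 0.3893 | 0.2662 | 0.3653 | 0.4186 | 0.2639 | 0.2872 |
| | 70 | 0.3557 | 0.2834 | 0.2495 | 0.3883 | 0.3649 | 0.2523 | 0.3595 | 0.3955 | 0.2594 | 0.2856 |
| | 90 | 0.3482 | 0.2747 | 0.2286 | 0.3861 | 0.3486 | 0.2379 | 0.3551 | 0.3833 | 0.2526 | 0.2829 |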
Table 2. Estimates (standard errors) for the fitted models.

| Estimates | BG(I) | PHC(II)-W | BEC | PHC(I)-W |
|---|---|---|---|---|
| $\hat{\alpha}_1$ | 0.2198 (0.0185) | 0.0775 (0.0389) | 0.0940 (0.0081) | 0.0277 (0.0077) |
| $\hat{\alpha}_2$ | 0.0656 (0.0056) | 0.0055 (0.0029) | 0.0283 (0.0089) | 0.0046 (0.0008) |
| $\hat{\delta}$ | 0.8344 (0.3041) | 0.9053 (0.3341) | 3.8624 (0.3254) | 0.8753 (0.0006) |
| $\hat{\theta}$ | | 1.6396 (0.2979) | | 2.0337 (0.0007) |
| $\hat{\tau}$ | | 1.8895 (0.1995) | | 1.8384 (0.0006) |
| AIC | 400.6863 | 364.4660 | 415.2929 | 353.4773 |
| BIC | 405.1758 | 371.9485 | 419.7824 | 360.9598 |
Table 3. Comparison of likelihood ratio statistics.

| | PHC(II)-W vs. BG(I) | PHC(I)-W vs. BEC |
|---|---|---|
| $-2 \log(\Lambda)$ | 40.2203 | 65.8156 |