Article

Two-Step Estimation Procedure for Parametric Copula-Based Regression Models for Semi-Competing Risks Data

1
Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming 650500, China
2
Centre for Mathematical Sciences, School of Engineering, Computing and Mathematics, University of Plymouth, Plymouth PL4 8AA, UK
*
Author to whom correspondence should be addressed.
Entropy 2025, 27(5), 521; https://doi.org/10.3390/e27050521
Submission received: 16 March 2025 / Revised: 2 May 2025 / Accepted: 9 May 2025 / Published: 13 May 2025

Abstract

Non-terminal and terminal events in semi-competing risks data are typically associated and may be influenced by covariates. We employ regression modeling for semi-competing risks data under a copula-based framework to evaluate the effects of covariates on the two events and on the association between them. Because of the complexity of the copula structure, we propose a method that integrates a two-step estimation algorithm with the Bound Optimization by Quadratic Approximation (BOBYQA) method. This approach mitigates the influence of initial values and is more robust than the alternatives considered. Simulations validate the performance of the proposed method. We further apply the method to real data from the Amsterdam Cohort Study (ACS), where it yields some improvements over existing approaches.

1. Introduction

In follow-up biomedical studies, subjects may experience two types of events: a non-terminal event (e.g., disease relapse) or a terminal event (death). The occurrence of the terminal event precludes the occurrence of the non-terminal event, but not vice versa. This setting is referred to as semi-competing risks (Fine et al. [1]).
There are two conventional statistical methods for analyzing semi-competing risks data: the random effects model and the copula-based model. As in the study by Xu et al. [2], the random effects model usually adopts a shared non-negative frailty random variable to generate the joint distribution. In contrast, Shih and Louis [3] showed that the copula-based model constructs the joint distribution by connecting two marginal distributions, and the copula parameter explicitly measures the strength of dependence. For the copula-based model, the marginal distributions of the times to events conditional on covariates can be specified independently, and the covariate effects on the marginal events can be assessed directly. Moreover, Ramadan et al. [4] highlighted that the copula-based model can capture asymmetric and nonlinear connections between variables and simplify the computation of certain probability measures. Using the copula function, Fayomi et al. [5] proposed the BFGMLG family to model bivariate continuous data with heavy-tailed and skewed distributions; they applied this to environmental, medical, and computer science data. Haj Ahmad et al. [6] proposed a bivariate modified extended exponential (BMExE) distribution and adopted this model to study the relationship between processor and memory reliability in data science. Ramadan et al. [4] introduced the BAPB-XII distribution and applied it to engineering data. In this article, we consider the copula model for semi-competing risks data.
Fine et al. [1] proposed estimators of the association parameter from the concordance-estimating function and used a novel plug-in estimator for the unspecified marginal distribution of a non-terminal event. Jiang et al. [7] proposed another copula-based estimator with better properties based on the principle of pseudo-self-consistency. Peng and Fine [8] considered a class of time-varying coefficient semiparametric regression models for marginals and copulas, and they proposed an estimator based on the moment method. Hsieh and Huang [9] analyzed semi-competing risks data based on the conditional likelihood approach. They adopted semiparametric transformation models as the marginal regressions, which include the popular proportional hazards and proportional odds models as special cases.
For the copula-based model, Shih and Louis [3] proposed a two-stage procedure for estimating the parameters of the marginal distributions and the copula function. In the first stage, the parameters of the marginal distributions are estimated independently for each event, disregarding the dependence between events; in the second stage, the copula parameter is estimated by maximizing the log-likelihood with the marginal estimates plugged in. However, this approach may not be suitable for copula-based models for semi-competing risks data, as the terminal event may dependently censor the non-terminal event. Chen [10] considered a nonparametric maximum likelihood estimation (MLE) approach, which estimates the parameters of the marginals and the copula simultaneously by maximizing the log-likelihood function. Arachchige et al. [11] provided a two-stage pseudo-maximum likelihood estimation (PMLE) approach. More specifically, in the first stage, they obtained a consistent estimate of the marginal of the terminal event; in the second stage, they estimated the marginal of the non-terminal event and the copula parameters simultaneously by maximizing the pseudo-likelihood function with the plug-in estimate for the terminal event. Sun et al. [12] proposed a two-step procedure: in the first step, the parameters are estimated separately to provide initial values; in the second step, all parameters of the marginal distributions and the copula function are estimated simultaneously from the joint distribution. This procedure can mitigate the influence of initial values and is adopted in our article.
In addition, owing to the complex structure of a copula-based objective function, derivative information may be unavailable, unreliable, or computationally expensive to obtain. In such cases, derivative-free optimization (DFO) methods serve as a viable alternative. This article adopts the Bound Optimization by Quadratic Approximation (BOBYQA) algorithm proposed by Powell [13]. As summarized by Ragonneau [14], Powell pioneered his first DFO method using conjugate directions in 1964. From the 1990s onward, he developed five model-based DFO algorithms, each tailored to a distinct problem class: COBYLA for nonlinearly constrained problems, UOBYQA and NEWUOA for unconstrained problems, BOBYQA for bound-constrained problems, and LINCOA for linearly constrained problems. However, a key limitation of BOBYQA is its limited scalability to high-dimensional problems.
In this article, we consider the parametric copula-based regression model for semi-competing risks data. Fully parametric approaches are appealing because they are generally identifiable, can produce smooth survival functions, and give a direct interpretation [3,15]. Within the realm of parametric copula-based regression models, traditional survival distributions, particularly the exponential, Weibull, and Gompertz distributions, remain significant both theoretically and practically [16]. The primary challenge in parametric copula-based modeling lies in the optimization process, which is complicated by the intricate likelihood structure of semi-competing risks data. This complexity makes the estimation particularly sensitive to the choice of initial values. We implement a two-step estimation approach to address this issue and reduce the dependence on initial value selection. First, we employ the method developed by Shih and Louis [3] to obtain preliminary parameter estimates. These estimates are then used as initial values for the second-step maximum likelihood estimation based on the joint likelihood of the two marginal events, thereby improving the stability and reliability of the final parameter estimates. We employ the BOBYQA optimization algorithm and demonstrate the competitive performance of our proposed method in comparison with existing methods through simulations and real data on HIV/AIDS from the Amsterdam Cohort Study (ACS).
The remainder of this article is organized as follows. In Section 2, we provide a brief description of the motivating data. In Section 3, we introduce the fully parametric copula-based regression models, which combine various copula functions and survival functions, and explain the main procedure of the BOBYQA optimization method. We apply our method to the real data on HIV/AIDS from the Amsterdam Cohort Study (ACS) in Section 4 and compare it with other optimization methods through simulations in Section 5. Section 6 concludes the article with a discussion.

2. Data

The Amsterdam Cohort Study (ACS) on HIV/AIDS among homosexual men is a prospective cohort study that commenced in 1984 to investigate HIV/AIDS. This dataset consists of the time (in years) from HIV infection to AIDS, virus phenotype switching from non-syncytium-inducing to syncytium-inducing (SI switch), and mortality data for 329 men who have sex with men (MSM). Data are available for the period up to the introduction of combination anti-retroviral therapy (1996). The censoring rates for the SI switch event and mortality are approximately 65.7% and 45.9%, respectively. After removing the five partially observed cases, the final dataset comprises 324 samples, and a summary of the variables is listed in Table 1.
This dataset is included in the mstate library and has been analyzed by Geskus et al. [17,18] and Sorrell et al. [19]. However, these analyses consider only the association between events. In this article, we are also interested in the effects of CCR5 and age on the times to the SI switch and death events. Given the limited sample size, a parametric model may be more suitable, and we specify parametric marginal distributions for the SI switch and death events.

3. Methods

3.1. The Copula-Based Model Specifications

In this section, we introduce the model for semi-competing risks events based on the conditional copula function proposed by Patton [20]. We denote by $T_1$ the time to the non-terminal event and by $T_2$ the time to the terminal event. The times $(T_1, T_2)$ are subject to right censoring at time $C_t$. The time to the first event is $X = \min(T_1, T_2, C_t)$, with the censoring indicator $d_1 = 1$ if $T_1 < \min(T_2, C_t)$ and $d_1 = 0$ otherwise. The time to the second event is $Y = \min(T_2, C_t)$, with the censoring indicator $d_2 = 1$ if $T_2 < C_t$ and $d_2 = 0$ otherwise.
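The mapping from latent times to observed data defined above can be illustrated with a minimal Python sketch. The function name `observe` and the example times are hypothetical and serve only to show how the two indicators behave; Python is used here for illustration even though the article's computations rely on R.

```python
def observe(t1, t2, c):
    """Map latent times (t1, t2) and a censoring time c to (X, Y, d1, d2)."""
    x = min(t1, t2, c)                 # time to the first event
    y = min(t2, c)                     # time to the terminal event or censoring
    d1 = 1 if t1 < min(t2, c) else 0   # non-terminal event observed?
    d2 = 1 if t2 < c else 0            # terminal event observed?
    return x, y, d1, d2


# Non-terminal event at 2.0, death at 5.0, censoring at 8.0: both observed.
print(observe(2.0, 5.0, 8.0))  # (2.0, 5.0, 1, 1)
# Death at 3.0 occurs before the non-terminal event at 4.0, which is never seen.
print(observe(4.0, 3.0, 8.0))  # (3.0, 3.0, 0, 1)
```

The second call shows the asymmetry of semi-competing risks: death removes the chance to observe the non-terminal event, whereas the non-terminal event leaves death observable.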
We assume the parameters of the copula and the marginal survival distributions depend on the covariate vector $Z = (Z_1, \ldots, Z_p)$, where $p$ is the dimension of the covariates. The marginal survival function is denoted as $S_{T_1|Z}(t_1 \mid z) = P(T_1 > t_1 \mid z)$ for the non-terminal event and $S_{T_2|Z}(t_2 \mid z) = P(T_2 > t_2 \mid z)$ for the terminal event. For the joint survival distribution, we employ a bivariate copula function $C_\theta$ with a univariate parameter $\theta$:
$$S(t_1, t_2 \mid z) = P(T_1 > t_1, T_2 > t_2 \mid z) = C_\theta\big(S_{T_1|Z}(t_1 \mid z),\, S_{T_2|Z}(t_2 \mid z) \mid z\big). \tag{1}$$
Subsequently, the density function for the copula is as follows:
$$c_\theta\big(S_{T_1|Z}(t_1 \mid z),\, S_{T_2|Z}(t_2 \mid z) \mid z\big) = \frac{\partial^2 C_\theta\big(S_{T_1|Z}(t_1 \mid z),\, S_{T_2|Z}(t_2 \mid z) \mid z\big)}{\partial S_{T_1|Z}(t_1 \mid z)\, \partial S_{T_2|Z}(t_2 \mid z)} \tag{2}$$
and the density corresponding to the survival function can be written as
$$f(t_1, t_2 \mid z) = c_\theta\big(S_{T_1|Z}(t_1 \mid z),\, S_{T_2|Z}(t_2 \mid z) \mid z\big)\, f_{T_1|Z}(t_1 \mid z)\, f_{T_2|Z}(t_2 \mid z), \tag{3}$$
where $f_{T_1|Z}(t_1 \mid z)$ and $f_{T_2|Z}(t_2 \mid z)$ are the marginal densities of $T_1$ and $T_2$ given $Z = z$, respectively.
In this article, we mainly focus on four commonly used one-parameter copulas: Clayton, Frank, Gumbel, and normal. To incorporate the effects of the covariates $Z = (Z_1, \ldots, Z_p)$ while maintaining parameter constraints, we employ distinct link functions corresponding to each copula's parameter space. The four copula families and their respective parameter link functions are formally specified as follows:
(i)
Clayton copula:
$$C_\theta(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}, \quad \theta \in (0, +\infty), \quad \theta = \exp(b_0 + b_1 Z_1 + \cdots + b_p Z_p). \tag{4}$$
(ii)
Frank copula:
$$C_\theta(u, v) = -\frac{1}{\theta} \log\left\{1 - \frac{(1 - e^{-\theta u})(1 - e^{-\theta v})}{1 - e^{-\theta}}\right\}, \quad \theta \in \mathbb{R} \setminus \{0\}, \quad \theta = b_0 + b_1 Z_1 + \cdots + b_p Z_p. \tag{5}$$
(iii)
Gumbel copula:
$$C_\theta(u, v) = \exp\left[-\left\{(-\log u)^{\theta} + (-\log v)^{\theta}\right\}^{1/\theta}\right], \quad \theta \in [1, +\infty), \quad \theta = \exp(b_0 + b_1 Z_1 + \cdots + b_p Z_p) + 1. \tag{6}$$
(iv)
Normal copula:
$$C_\theta(u, v) = \int_{-\infty}^{\Phi^{-1}(u)} \int_{-\infty}^{\Phi^{-1}(v)} \frac{1}{2\pi\sqrt{1 - \theta^2}} \exp\left(-\frac{s^2 - 2\theta s t + t^2}{2(1 - \theta^2)}\right) ds\, dt, \quad \theta \in [-1, 1], \quad \theta = \frac{\exp\{2(b_0 + b_1 Z_1 + \cdots + b_p Z_p)\} - 1}{\exp\{2(b_0 + b_1 Z_1 + \cdots + b_p Z_p)\} + 1}, \tag{7}$$
where $\Phi^{-1}(\cdot)$ is the inverse of the cumulative distribution function of the standard univariate Gaussian distribution.
The first three copula functions (i)–(iii) belong to the Archimedean copula family, and the normal copula (iv) belongs to the elliptical copula family. For more details, we refer the reader to Nelsen [21].
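As a rough illustration of the three Archimedean copulas and the four link functions listed above, the following Python sketch evaluates them with the standard library only. All function names are hypothetical, the coefficient list `b` is assumed to hold the intercept first, and the normal copula itself is omitted because it requires a bivariate normal distribution function; only its link is shown.

```python
import math


def clayton(u, v, theta):
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def frank(u, v, theta):
    num = (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))
    return -math.log(1.0 - num / (1.0 - math.exp(-theta))) / theta


def gumbel(u, v, theta):
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def linear_predictor(b, z):
    """b[0] + b[1]*z[0] + ... ; b holds the intercept first (an assumed layout)."""
    return b[0] + sum(bk * zk for bk, zk in zip(b[1:], z))


def theta_clayton(b, z):
    return math.exp(linear_predictor(b, z))        # maps to (0, inf)


def theta_frank(b, z):
    return linear_predictor(b, z)                  # identity link


def theta_gumbel(b, z):
    return math.exp(linear_predictor(b, z)) + 1.0  # maps to (1, inf)


def theta_normal(b, z):
    # (exp(2*eta) - 1) / (exp(2*eta) + 1) is the hyperbolic tangent of eta
    return math.tanh(linear_predictor(b, z))       # maps to (-1, 1)


print(clayton(0.3, 0.7, theta_clayton([0.2, 0.5], [1.0])))
```

A quick sanity check is the boundary property of any copula, $C_\theta(u, 1) = u$, which the Clayton and Frank implementations satisfy up to rounding; the Gumbel copula with $\theta = 1$ reduces to the independence copula $uv$.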

3.2. The Parametric Model for Event to Time

For the marginal distributions of the non-terminal and terminal events, we consider three popular parametric distributions: exponential, Weibull, and Gompertz. The covariate effects on each marginal distribution are specified through a log-linear regression model, i.e., a linear predictor with an exponential link.
For the exponential marginal distribution, the survival function is given as $S_{T_i|Z}(t_i) = \exp(-\lambda_i t_i)$, $i = 1, 2$, for the non-terminal and terminal events, respectively. The relationship between the parameter $\lambda_i$ and the covariates $Z = (Z_1, \ldots, Z_p)$ is characterized by the following function:
$$\lambda_i = \exp(\beta_{i,0} + \beta_{i,1} Z_1 + \cdots + \beta_{i,p} Z_p), \tag{8}$$
where $\beta_{i,0}, \ldots, \beta_{i,p}$ are the regression coefficients for the non-terminal ($i = 1$) and terminal ($i = 2$) events. We apply the exponential link function to guarantee that $\lambda_i$ is positive.
For the Weibull distribution, the survival function can be written as $S_{T_i|Z}(t_i) = \exp(-\lambda_i t_i^{\alpha_i})$, where $\lambda_i$ and $\alpha_i$ are the scale and shape parameters, respectively, for $i = 1, 2$. We assume the shape parameters $\alpha_1$ and $\alpha_2$ are constant, and the scale parameters $\lambda_1$ and $\lambda_2$ are determined by $Z = (Z_1, \ldots, Z_p)$ as
$$\lambda_i = \exp(\beta_{i,0} + \beta_{i,1} Z_1 + \cdots + \beta_{i,p} Z_p), \tag{9}$$
for $i = 1, 2$, where $\beta_{i,0}, \ldots, \beta_{i,p}$ are regression coefficients.
For the Gompertz marginal distribution, the survival function can be written as $S_{T_i|Z}(t_i) = \exp\left\{-\frac{\lambda_i}{\alpha_i}\left(\exp(\alpha_i t_i) - 1\right)\right\}$ for $i = 1, 2$, where $\lambda_i$ is the rate parameter and $\alpha_i$ is the shape parameter. As for the Weibull distribution, we let the shape parameter $\alpha_i$ be constant and the rate parameter $\lambda_i$ depend on $Z = (Z_1, \ldots, Z_p)$ in the following way:
$$\lambda_i = \exp(\beta_{i,0} + \beta_{i,1} Z_1 + \cdots + \beta_{i,p} Z_p), \tag{10}$$
for $i = 1, 2$, where $\beta_{i,0}, \ldots, \beta_{i,p}$ are regression coefficients.
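The three marginal survival functions and their shared log-linear link can be sketched in a few lines of Python. The function names and the example coefficients are hypothetical; the coefficient list is assumed to store the intercept first.

```python
import math


def rate(beta, z):
    """lambda = exp(beta_0 + beta_1*z_1 + ... + beta_p*z_p); always positive."""
    return math.exp(beta[0] + sum(b * x for b, x in zip(beta[1:], z)))


def surv_exponential(t, lam):
    return math.exp(-lam * t)


def surv_weibull(t, lam, alpha):
    return math.exp(-lam * t ** alpha)


def surv_gompertz(t, lam, alpha):
    return math.exp(-(lam / alpha) * (math.exp(alpha * t) - 1.0))


# Hypothetical coefficients: intercept, one continuous and one binary covariate.
beta = [-0.5, 0.3, -0.9]
lam_ref = rate(beta, [0.4, 0.0])
lam_alt = rate(beta, [0.4, 1.0])
print(surv_weibull(0.5, lam_ref, 1.2), surv_gompertz(0.5, lam_ref, 0.8))
# The ratio of rates for a one-unit change in a covariate is exp(coefficient).
print(lam_alt / lam_ref, math.exp(beta[2]))
```

In all three families the hazard is proportional to $\lambda_i$ (it equals $\lambda_i$, $\lambda_i \alpha_i t^{\alpha_i - 1}$, and $\lambda_i e^{\alpha_i t}$, respectively), so $\exp(\beta_{i,k})$ can be read as the hazard ratio for a one-unit increase in $Z_k$.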

3.3. Joint Likelihood Function

For the semi-competing bivariate right-censored data, the likelihood based on the copula function $C_\theta(S_{T_1|Z}, S_{T_2|Z})$ is expressed as follows, where we write $u_i = S_{T_1|Z}(t_{i,1} \mid z_i)$ and $v_i = S_{T_2|Z}(t_{i,2} \mid z_i)$ for brevity:
$$
\begin{aligned}
L(\Phi) &= \prod_{i=1}^{n} \left[ c_\theta(u_i, v_i \mid z_i)\, f_{T_1|Z}(t_{i,1} \mid z_i)\, f_{T_2|Z}(t_{i,2} \mid z_i) \right]^{d_{i,1} d_{i,2}} \\
&\quad \times \left[ \frac{\partial C_\theta(u_i, v_i \mid z_i)}{\partial u_i}\, f_{T_1|Z}(t_{i,1} \mid z_i) \right]^{d_{i,1}(1 - d_{i,2})} \\
&\quad \times \left[ \frac{\partial C_\theta(u_i, v_i \mid z_i)}{\partial v_i}\, f_{T_2|Z}(t_{i,2} \mid z_i) \right]^{(1 - d_{i,1}) d_{i,2}} \\
&\quad \times \left[ C_\theta(u_i, v_i \mid z_i) \right]^{(1 - d_{i,1})(1 - d_{i,2})} \\
&= \prod_{i=1}^{n} \left[ c_\theta(u_i, v_i \mid z_i)\, f_{T_1|Z}(t_{i,1} \mid z_i)\, f_{T_2|Z}(t_{i,2} \mid z_i) \right]^{d_{i,1} d_{i,2}} \\
&\quad \times \left[ h_{T_1|T_2}(z_i)\, f_{T_1|Z}(t_{i,1} \mid z_i) \right]^{d_{i,1}(1 - d_{i,2})} \\
&\quad \times \left[ h_{T_2|T_1}(z_i)\, f_{T_2|Z}(t_{i,2} \mid z_i) \right]^{(1 - d_{i,1}) d_{i,2}} \\
&\quad \times \left[ C_\theta(u_i, v_i \mid z_i) \right]^{(1 - d_{i,1})(1 - d_{i,2})},
\end{aligned}
\tag{11}
$$
where $n$ is the sample size, and $c_\theta(u_i, v_i \mid z_i)$ is the density function for the copula defined in Equation (2). We assume that the censoring times are independent of the survival times and are not informative.
Here,
$$h_{T_1|T_2}(z_i) = \frac{\partial C_\theta\big(S_{T_1|Z}(t_{i,1} \mid z_i),\, S_{T_2|Z}(t_{i,2} \mid z_i) \mid z_i\big)}{\partial S_{T_1|Z}(t_{i,1} \mid z_i)}$$
and
$$h_{T_2|T_1}(z_i) = \frac{\partial C_\theta\big(S_{T_1|Z}(t_{i,1} \mid z_i),\, S_{T_2|Z}(t_{i,2} \mid z_i) \mid z_i\big)}{\partial S_{T_2|Z}(t_{i,2} \mid z_i)}$$
are called h-functions in the literature [22]. The censoring indicator is $d_{i,1} = 1$ if $t_{i,1} < \min(t_{i,2}, C_t)$ and $d_{i,1} = 0$ otherwise; similarly, $d_{i,2} = 1$ if $t_{i,2} < C_t$ and $d_{i,2} = 0$ otherwise. The exact form of the likelihood for the four types of copula functions previously mentioned is given by Wei et al. [23].
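To make the four likelihood contributions concrete, the sketch below evaluates the log-likelihood for one specific and deliberately simple case: a Clayton copula with exponential marginals and no covariates. This choice, the function names, and the toy data are assumptions made for illustration; the closed-form h-function and copula density used here are the standard partial derivatives of the Clayton copula, not expressions taken from the article.

```python
import math


def clayton_cdf(u, v, th):
    return (u ** -th + v ** -th - 1.0) ** (-1.0 / th)


def clayton_h1(u, v, th):
    """h-function: partial derivative of C with respect to its first argument."""
    return u ** (-th - 1.0) * (u ** -th + v ** -th - 1.0) ** (-1.0 / th - 1.0)


def clayton_density(u, v, th):
    a = u ** -th + v ** -th - 1.0
    return (1.0 + th) * (u * v) ** (-th - 1.0) * a ** (-1.0 / th - 2.0)


def loglik_one(t1, t2, d1, d2, lam1, lam2, th):
    """Log-likelihood contribution of one subject with observed times t1, t2."""
    u, v = math.exp(-lam1 * t1), math.exp(-lam2 * t2)   # marginal survival
    f1, f2 = lam1 * u, lam2 * v                         # marginal densities
    if d1 == 1 and d2 == 1:                             # both events observed
        return math.log(clayton_density(u, v, th) * f1 * f2)
    if d1 == 1 and d2 == 0:                             # only non-terminal observed
        return math.log(clayton_h1(u, v, th) * f1)
    if d1 == 0 and d2 == 1:                             # only terminal observed
        return math.log(clayton_h1(v, u, th) * f2)      # Clayton is symmetric
    return math.log(clayton_cdf(u, v, th))              # both censored


def loglik(data, lam1, lam2, th):
    return sum(loglik_one(*row, lam1, lam2, th) for row in data)


# Toy rows of (t1, t2, d1, d2); the values are made up for illustration.
toy = [(0.4, 1.1, 1, 1), (0.9, 0.9, 0, 1), (0.7, 1.5, 1, 0), (1.2, 1.2, 0, 0)]
print(loglik(toy, lam1=0.8, lam2=0.5, th=1.5))
```

Covariates would enter by replacing `lam1`, `lam2`, and `th` with subject-specific values computed from the link functions of Sections 3.1 and 3.2.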

3.4. Estimation

From Equation (11), we can see that the likelihood function has a relatively complex structure, so estimating the unknown parameters is often computationally challenging. We define $\Phi = (\alpha = (\alpha_1, \alpha_2), \beta = (\beta_1, \beta_2), b)$ as the parameter vector, where $\alpha_j$ ($j = 1, 2$) is the constant parameter of the marginal distribution, $\beta_j$ ($j = 1, 2$) are the regression coefficients, and $b$ denotes the regression coefficients in the linear predictor of the link function for the copula association parameter $\theta$ introduced in Section 3.1. Because the terminal event can censor the non-terminal event, the two-stage method, which treats this censoring as non-informative, may give biased estimates of the non-terminal event parameters. We therefore adopt the two-step estimation procedure of Sun et al. [12,24], shown in Algorithm 1.
Algorithm 1 Two-step estimation procedure.
Step 1: Estimate the parameters of the marginal distributions and the copula function separately.
(1)
First obtain the initial values of $(\alpha_j, \beta_j)$ through the grid search method. Then estimate these parameters based on
$$\left(\hat{\alpha}_{j,n}^{(1)}, \hat{\beta}_{j,n}^{(1)}\right) = \arg\max_{\alpha_{j,n},\, \beta_{j,n}} L_{j,n}\left(\alpha_{j,n}, \beta_{j,n}\right), \tag{12}$$
where $L_{j,n}$ denotes the likelihood for the marginal survival distribution, $j = 1, 2$.
(2)
Obtain the initial values of $b$ through the grid search method. Then estimate it by
$$\hat{b}_n^{(1)} = \arg\max_{b_n} L_n\left(\hat{\alpha}_n^{(1)} = \left(\hat{\alpha}_{1,n}^{(1)}, \hat{\alpha}_{2,n}^{(1)}\right),\ \hat{\beta}_n^{(1)} = \left(\hat{\beta}_{1,n}^{(1)}, \hat{\beta}_{2,n}^{(1)}\right),\ b_n\right), \tag{13}$$
where $\hat{\beta}_{j,n}^{(1)}$ and $\hat{\alpha}_{j,n}^{(1)}$ are the values estimated in procedure (1) above, $j = 1, 2$, and $L_n$ is the joint likelihood based on the copula function.
Step 2: Simultaneously maximize the joint likelihood
$$\hat{\Phi}_n = \arg\max_{\Phi} L_n\left(\alpha_n = \left(\alpha_{1,n}, \alpha_{2,n}\right),\ \beta_n = \left(\beta_{1,n}, \beta_{2,n}\right),\ b_n\right), \tag{14}$$
with the initial values $\hat{\beta}_{j,n}^{(1)}$, $\hat{\alpha}_{j,n}^{(1)}$, $j = 1, 2$, and $\hat{b}_n^{(1)}$ obtained from Step 1.
In Algorithm 1, the grid search method establishes initial values for the parameters $\alpha_j$ and $\beta_j$ through the following systematic procedure. First, define search ranges for each parameter: $[\alpha_{j,lower}, \alpha_{j,upper}]$ and $[\beta_{j,lower}, \beta_{j,upper}]$. Second, discretize each interval into $n$ equally spaced points: $\alpha_{j,grid} = \{\alpha_{j,1}, \alpha_{j,2}, \ldots, \alpha_{j,n}\}$ and $\beta_{j,grid} = \{\beta_{j,1}, \beta_{j,2}, \ldots, \beta_{j,n}\}$. Third, generate all possible parameter combinations from the Cartesian product of $\alpha_{j,grid}$ and $\beta_{j,grid}$, resulting in $n^{p+1}$ distinct pairs $(\alpha_{j,k}, \beta_{j,k})$, where $p$ denotes the dimension of $\beta_j$. Finally, evaluate the marginal survival distribution likelihood $L_{j,k}(\alpha_{j,k}, \beta_{j,k})$, $j = 1, 2$ and $k = 1, 2, \ldots, n^{p+1}$, and identify the optimal parameters $(\alpha_{j,ini}, \beta_{j,ini}) = \arg\max_{(\alpha_{j,k}, \beta_{j,k})} \{L_{j,k}\}$. An analogous procedure determines the initial values for $b$. The parameter estimates derived from Step 1, formally expressed as $\hat{\Phi}_n^{(1)} = \{\hat{\alpha}_{1,n}^{(1)}, \hat{\alpha}_{2,n}^{(1)}, (\hat{\beta}_{1,n}^{(1)})^{\top}, (\hat{\beta}_{2,n}^{(1)})^{\top}, \hat{b}_n^{(1)}\}$, serve as the initial parameter vector for the joint optimization in Step 2.
The accuracy and stability of parameter estimation depend critically on appropriate initialization. Direct likelihood maximization in single-step procedures frequently fails to converge when arbitrary initial values are used, particularly in complex models. In contrast, the two-stage procedure can provide effective initial values. Given the complexity of the copula itself, the estimation of the parameters of the copula and the marginal distributions, as outlined in the maximization Equations (12)–(14), is a challenging task. Therefore, we adopt the BOBYQA optimization algorithm instead of the widely used limited-memory Broyden–Fletcher–Goldfarb–Shanno bounded (L-BFGS-B) method.
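The grid search described above can be sketched as follows for a single Weibull margin. This is a minimal illustration under stated assumptions: the helper names, the search ranges, and the toy data are hypothetical, and the right-censored Weibull log-likelihood is used as the marginal objective.

```python
import itertools
import math


def linspace(lo, hi, n):
    """n equally spaced points from lo to hi."""
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]


def weibull_loglik(alpha, beta, data):
    """Right-censored Weibull log-likelihood; data rows are (t, d, z)."""
    total = 0.0
    for t, d, z in data:
        lam = math.exp(beta[0] + sum(b * x for b, x in zip(beta[1:], z)))
        if d:  # observed event: add the log hazard, log(lam * alpha * t^(alpha-1))
            total += math.log(lam * alpha) + (alpha - 1.0) * math.log(t)
        total -= lam * t ** alpha  # cumulative hazard for every subject
    return total


def grid_search(loglik, ranges, n, data):
    """Evaluate loglik on the Cartesian product of the grids and keep the best point."""
    grids = [linspace(lo, hi, n) for lo, hi in ranges]
    return max(itertools.product(*grids),
               key=lambda p: loglik(p[0], p[1:], data))


# Toy rows of (time, event indicator, covariates); values are made up.
data = [(0.5, 1, (0.0,)), (1.2, 1, (1.0,)), (2.0, 0, (0.0,)),
        (0.8, 1, (1.0,)), (1.5, 0, (1.0,))]
# Hypothetical search ranges for alpha, beta_0, and beta_1.
ranges = [(0.5, 2.0), (-2.0, 2.0), (-2.0, 2.0)]
print(grid_search(weibull_loglik, ranges, 5, data))
```

With $n = 5$ grid points and three parameters, the search evaluates $5^3 = 125$ combinations; the exponential growth in the number of parameters is why the grid is used only to find starting values and not as the estimator itself.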
The BOBYQA algorithm is a model-based, derivative-free trust-region method proposed by Michael J.D. Powell [13]. It is an iterative algorithm for finding a minimum of a function $L(\Phi)$, $\Phi \in \mathbb{R}^p$, where $p$ is the dimension of $\Phi$, subject to box bounds on the variables; a likelihood is therefore maximized by minimizing its negative logarithm. Each iteration consists of building or updating the interpolation set, constructing or updating the interpolation-based quadratic approximation, and solving the trust-region subproblem.
At iteration $k$, it uses interpolation to construct or update a quadratic model $Q_k(\Phi)$, $\Phi \in \mathbb{R}^p$, that approximates the objective function $L(\Phi)$ without using its derivatives. For a selected set of points $\phi_k = \{\phi_{k,1}, \phi_{k,2}, \ldots, \phi_{k,m}\}$, $\phi_{k,i} \in \mathbb{R}^p$, $i = 1, 2, \ldots, m$, where $m$ is a fixed integer satisfying $p + 2 \le m \le (p + 1)(p + 2)/2$, a local approximation $Q_k(\phi)$ of the objective $L(\phi)$ is constructed as follows:
$$L(\phi_{k,i} + s) \approx Q_k(\phi_{k,i} + s) = c_k + g_k^{\top} s + \frac{1}{2} s^{\top} H_k s, \quad s \in \mathbb{R}^p, \tag{15}$$
and $Q_k(\phi)$ satisfies the interpolation conditions
$$Q_k(\phi_{k,i}) = L(\phi_{k,i}), \quad \text{for } \phi_{k,i} \in \phi_k,\ i = 1, 2, \ldots, m. \tag{16}$$
Because the number $m$ of interpolation points is usually smaller than $(p + 1)(p + 2)/2$, the number needed to determine a quadratic model by interpolation alone (Powell [13,25] recommends $m = 2p + 1$), the method takes up the remaining degrees of freedom by solving
$$Q_k = \arg\min_{Q} \left\| \nabla^2 Q - \nabla^2 Q_{k-1} \right\|_F^2, \tag{17}$$
subject to the interpolation conditions (16), where $\| \cdot \|_F^2$ denotes the square of the Frobenius norm, i.e., $\|G\|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} |G_{ij}|^2$ for a matrix $G$ (here $m$ and $n$ denote the numbers of rows and columns of $G$) [26]. Based on the quadratic approximation $Q_k(\phi)$, the algorithm then solves the trust-region subproblem, and the set of $m$ interpolation points is updated during the iterations. The BOBYQA algorithm can be applied to problems with a large number $p$ of parameters, as mentioned by Ragonneau [27]. Compared with the L-BFGS-B method, a strength of the BOBYQA algorithm is that it can handle objective functions whose Hessian matrix is singular, as also seen in Section 4. For more information, see [13]. The algorithm is available through the R “optimx” library [28], a replacement and extension of the “optim” function in the “stats” library [29].
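The interplay of interpolation, quadratic model, trust region, and box bounds can be illustrated in one dimension. The sketch below is not Powell's BOBYQA: it is a simplified, hypothetical single step for $p = 1$, where $m = 3$ points determine the quadratic exactly, so no minimum-Frobenius-norm update is needed. The function names and the test function are assumptions.

```python
def quadratic_through(points):
    """Return (c, g, h) with Q(x0 + s) = c + g*s + 0.5*h*s**2 interpolating
    three (x, f) pairs; the model is centred at the first point x0."""
    (x0, f0), (x1, f1), (x2, f2) = points
    s1, s2 = x1 - x0, x2 - x0
    # Each ratio equals g + 0.5*h*s for the corresponding displacement s.
    r1, r2 = (f1 - f0) / s1, (f2 - f0) / s2
    h = 2.0 * (r2 - r1) / (s2 - s1)
    g = r1 - 0.5 * h * s1
    return f0, g, h


def trust_region_step(x0, g, h, delta, lo, hi):
    """Minimise the model over the trust region [x0-delta, x0+delta]
    intersected with the box [lo, hi]; return the new point."""
    a = max(lo, x0 - delta) - x0
    b = min(hi, x0 + delta) - x0
    candidates = [a, b]
    if h > 0:                      # interior stationary point, if feasible
        s = -g / h
        if a <= s <= b:
            candidates.append(s)
    best = min(candidates, key=lambda s: g * s + 0.5 * h * s * s)
    return x0 + best


def objective(x):
    return (x - 2.0) ** 2 + 1.0    # hypothetical test function


pts = [(x, objective(x)) for x in (0.0, 0.5, 1.0)]
c, g, h = quadratic_through(pts)
print(trust_region_step(0.0, g, h, delta=1.0, lo=-10.0, hi=10.0))  # radius binds
print(trust_region_step(0.0, g, h, delta=5.0, lo=-10.0, hi=1.5))   # bound binds
```

In the actual algorithm the trust-region radius and the interpolation set are adapted from one iteration to the next, and for $p > 1$ the model Hessian is only partly determined by the interpolation conditions, which is where the update in Equation (17) comes in.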
As mentioned above, the two-stage procedure refers to an algorithm that optimizes parameters separately, and the two-step procedure denotes an algorithm that optimizes parameters jointly. To evaluate the performance of our proposed TStep-BOBYQA, we compared it with three other methods: TStage-BFGS, TStep-BFGS, and TStage-BOBYQA. TStage-BFGS refers to the optimization of parameters through a two-stage procedure that combines the L-BFGS-B optimization algorithm with initial values obtained via grid search. TStep-BFGS refers to the optimization of parameters using a two-step procedure that integrates the L-BFGS-B optimization algorithm with initial values derived from TStage-BFGS. TStage-BOBYQA refers to the optimization of parameters through a two-stage procedure that combines the BOBYQA optimization algorithm with initial values obtained through grid search. In contrast, TStep-BOBYQA utilizes a two-step procedure coupled with the BOBYQA optimization algorithm, employing initial values sourced from TStage-BOBYQA. These methods are summarized in Table 2.

4. Application

In this section, we apply the methods described in Section 3 to the ACS data presented in Section 2. This dataset includes the time to SI switch, time to death from AIDS, the continuous covariate age, and the categorical covariate CCR5. As mentioned in Section 3.2, we used three parametric distributions (exponential, Weibull, and Gompertz) to model the marginal distributions of the survival times. We employed four copula functions (Clayton, Frank, Gumbel, and normal) to assess the association between the two survival endpoints. Finally, we selected the best-fitting model based on the Akaike information criterion (AIC). We also assume that the hazard rates and the association between the two events all depend on covariates. We estimate the hazard ratios of the non-terminal and terminal events as well as the regression coefficients of the covariates for the copula function parameters. To eliminate the influence of units, we standardized the times to events and age using min-max normalization.
The estimated hazard ratios for each covariate and the regression coefficients of the covariates for the association parameter are reported in Table 3, Table 4, Table 5 and Table 6. The hazard ratios and regression coefficients for the association parameter are similar between the L-BFGS-B methods (TStage-BFGS; TStep-BFGS) and the BOBYQA methods (TStage-BOBYQA; TStep-BOBYQA), with some exceptions. For the Clayton–Gompertz model, parameter estimates cannot be obtained using TStep-BFGS (see Table 3). For the Frank–Gompertz model, the variance estimates of the regression coefficients of the association cannot be obtained using TStage-BOBYQA (see Table 4). Additionally, for the Weibull distribution, although the estimated parameters are comparable, the confidence intervals for parameters estimated by the L-BFGS-B methods (TStage-BFGS; TStep-BFGS) are wider than those obtained through the BOBYQA methods (TStage-BOBYQA; TStep-BOBYQA), as illustrated in Table 3, Table 4, Table 5 and Table 6. The methods mentioned above are briefly outlined in Table 2.
From Table 3, Table 4, Table 5 and Table 6, we can see that the two-step methods (TStep-BFGS and TStep-BOBYQA) provide lower AIC values than the two-stage methods (TStage-BFGS and TStage-BOBYQA), with the exception of the models that combine a copula with Weibull marginals. For those models, the parameters estimated by the two-stage procedure are identical to those estimated by the two-step procedure, resulting in equivalent AIC values.
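The two generic ingredients of this analysis, min-max normalization and model comparison by AIC, can be sketched as follows. The function names and the numbers in the usage lines are hypothetical and are not taken from the ACS results.

```python
def min_max(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2*log-likelihood (lower is better)."""
    return 2.0 * n_params - 2.0 * loglik


ages = [23.0, 31.0, 47.0, 55.0]             # made-up ages
print(min_max(ages))
# Two hypothetical fitted models with their maximized log-likelihoods.
candidates = {"model A": aic(-412.3, 9), "model B": aic(-409.8, 11)}
print(min(candidates, key=candidates.get))  # name of the model with the lowest AIC
```

Because AIC penalizes each additional parameter by 2, a richer model is preferred only if it raises the maximized log-likelihood by more than one unit per extra parameter.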

4.1. Hazard Ratios and Associations

SI switch following HIV infection. Across the four copula functions combined with the three marginal distributions, no association was found between age and SI switch. Similarly, no association was observed between CCR5 and SI switch, except in two models: the Frank–exponential model (HR: 0.705, 95% CI: 0.425 to 0.986, obtained through TStep-BFGS and TStep-BOBYQA; see Table 4) and the normal–Gompertz model (HR: 0.698, 95% CI: 0.402 to 0.995, using TStep-BFGS and TStep-BOBYQA; see Table 6). In these two models, the “WM” group has a lower risk of SI switch.
Death following HIV infection. In all models, age is not found to be associated with death, as the HR confidence intervals contain 1. The HR for CCR5 is below 1 and its confidence intervals exclude 1, which means that CCR5 is associated with death: the “WM” group has a lower risk of death than the “WW” group. These conclusions are consistent across the four copula models.
Association between SI and death. In most models, the association between SI switch and death is stronger for patients with the CCR5 “WM” genotype than for those with the CCR5 “WW” genotype; the exceptions are the Frank–Weibull, Frank–exponential, and normal–exponential models. Age does not appear to influence the association between the SI switch and death events, as 0 lies within the CI of the age coefficient.

4.2. Results for the Preferred Model

For the data from the Amsterdam Cohort Study on HIV infection and AIDS, we employed the AIC to select the optimal model. Our comparative analysis reveals that the two-step estimation method consistently demonstrates superior performance, yielding lower AIC values compared to the two-stage optimization approach in most situations. Among the copula models, the normal copula exhibits the most favorable AIC outcomes, as per Table 3, Table 4, Table 5 and Table 6. Moreover, the Gompertz distribution presents the lowest AIC values within each copula model. These systematic comparisons conclusively identify the normal–Gompertz copula model as the optimal choice for the ACS data.
As demonstrated by the normal–Gompertz survival model in Table 6, individuals with the CCR5 “WM” genotype exhibit significantly reduced risks of both the syncytium-inducing (SI) switch and mortality compared to the “WW” genotype group. Specifically, the hazard ratios are 0.698 (95% CI: 0.402–0.995) for SI switch and 0.406 (95% CI: 0.235–0.578) for death. This result aligns with the genetic mechanism proposed by Eugen-Olsen et al. [30].
Age showed no statistically significant association with clinical outcomes in the normal–Gompertz survival model. The hazard ratio (HR) for the SI switch was 2.065 (95% CI: 0.164–3.967), and the mortality HR was 3.434 (95% CI: 0.654–6.215); both intervals contain 1, indicating insufficient evidence of an association.

5. Simulation Study

In this section, we investigate the performance of our proposed methods through simulations based on different copula functions. First, we evaluate the estimation accuracy and efficiency of the proposed two-step procedure and compare its performance with that of the two-stage method. Second, we compare the parameter estimation precision achieved by the BOBYQA optimizer (TStage-BOBYQA; TStep-BOBYQA) with that of the popular L-BFGS-B optimizer (TStage-BFGS; TStep-BFGS). The comprehensive simulation results are presented in Table 7, Table 8, Table 9 and Table 10.

5.1. Design

The data were generated from various copula models with parametric survival distributions, following the procedure outlined by Wei et al. [23]. We generated 500 datasets, each containing 3000 individuals, mimicking the standardized ACS data. The covariate vector is $Z = (Z_1, Z_2)$, with $Z_1$ drawn from Uniform(0, 1) and $Z_2$ from Bernoulli(0.3). The parameters of the two survival distributions were generated using the exponential regression model described in Section 3.2. The times to the non-terminal and terminal events were simulated from the exponential, Weibull, and Gompertz survival distributions. The censoring time $C_t$ was generated independently from a uniform distribution. The true regression coefficients are specified in Table 7, Table 8, Table 9 and Table 10.
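One common way to generate such data is conditional inversion of the copula. The sketch below does this for a Clayton copula with exponential marginals; it is an illustrative assumption and not necessarily the generator used by Wei et al. [23]. The coefficient values, the censoring bound `c_max`, and all function names are hypothetical.

```python
import math
import random


def clayton_conditional_v(u, w, th):
    """Solve dC/du (u, v) = w for v, i.e. draw v given u under a Clayton copula."""
    return (u ** -th * (w ** (-th / (1.0 + th)) - 1.0) + 1.0) ** (-1.0 / th)


def eta(coef, z):
    return coef[0] + coef[1] * z[0] + coef[2] * z[1]


def simulate(n, beta1, beta2, b, c_max, rng):
    """Return rows (X, Y, d1, d2, z) for n subjects."""
    rows = []
    for _ in range(n):
        z = (rng.random(), 1.0 if rng.random() < 0.3 else 0.0)  # Z1, Z2
        lam1, lam2 = math.exp(eta(beta1, z)), math.exp(eta(beta2, z))
        th = math.exp(eta(b, z))                  # Clayton link, theta > 0
        u = 1.0 - rng.random()                    # uniform on (0, 1]
        w = 1.0 - rng.random()
        v = clayton_conditional_v(u, w, th)
        t1 = -math.log(u) / lam1                  # invert S1(t) = exp(-lam1 * t)
        t2 = -math.log(v) / lam2
        c = rng.uniform(0.0, c_max)               # independent censoring
        x, y = min(t1, t2, c), min(t2, c)
        rows.append((x, y, int(t1 < min(t2, c)), int(t2 < c), z))
    return rows


rng = random.Random(2025)                         # fixed seed for reproducibility
sample = simulate(3000, (0.0, 0.5, -0.5), (-0.5, 0.3, -0.8), (0.2, 0.0, 0.5), 5.0, rng)
print(sum(r[2] for r in sample) / len(sample))    # share with observed non-terminal event
print(sum(r[3] for r in sample) / len(sample))    # share with observed terminal event
```

Since the copula here links the survival probabilities, the pair $(u, v)$ is interpreted as $(S_{T_1|Z}(T_1), S_{T_2|Z}(T_2))$ and each time is recovered by inverting its marginal survival function.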

5.2. Performance Measures

To estimate the regression coefficients for the marginal and association parameters, we maximized the likelihood described in Section 3.3 by using the L-BFGS-B optimization algorithm (implemented in the optim function in R) and the BOBYQA algorithm (implemented in the optimx function in R). The performance of our proposed methods was measured using the mean squared error (MSE) and the coverage probability (CP).
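The two performance measures can be computed over simulation replicates as in the sketch below. The use of a Wald-type 95% interval (estimate ± 1.96 standard errors) is an assumption made for illustration, since the text does not state how the intervals were constructed; the function names and numbers are hypothetical.

```python
def mse(estimates, truth):
    """Mean squared error of replicate estimates around the true value."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)


def coverage(estimates, std_errors, truth, z=1.96):
    """Share of replicates whose Wald interval contains the true value."""
    hits = sum(1 for e, se in zip(estimates, std_errors)
               if e - z * se <= truth <= e + z * se)
    return hits / len(estimates)


# Made-up estimates and standard errors from four replicates, true value 0.5.
est = [0.48, 0.55, 0.41, 0.52]
se = [0.04, 0.04, 0.04, 0.04]
print(mse(est, 0.5), coverage(est, se, 0.5))
```

A coverage probability close to the nominal 95% indicates that the standard errors are well calibrated, while the MSE combines the squared bias and the variance of the estimator.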

5.3. Results

The results are presented in Table 7, Table 8, Table 9 and Table 10. As expected, in the two-stage optimization procedure, the estimation of parameters for the non-terminal regression model is inferior to that of the terminal regression model in terms of MSE and CP, as the non-terminal time to event is censored by the terminal time to event. Furthermore, we can see that the estimation of the parameters for the copula function is suboptimal due to inaccuracies in estimating the parameters for the non-terminal time-to-event regression model. This highlights the limitations of the two-stage optimization method when applied to semi-competing risks data.
When comparing the two-stage and two-step optimization procedures, the coverage probability achieved by the two-step method (TStep-BFGS and TStep-BOBYQA) is closer to the nominal level than that of the two-stage method (TStage-BFGS and TStage-BOBYQA) in most scenarios. Regarding the MSE criterion, the two-step method shows a slight reduction for the exponential distribution with all copula functions when compared with the two-stage method (Table 7). Overall, the two-step method outperformed the two-stage method in these simulations.
When comparing the optimization methods, TStage-BFGS yields results similar to those of TStage-BOBYQA, with some exceptions. Specifically, for the Clayton–Gompertz and Frank–Gompertz models, TStage-BOBYQA shows a reduction in MSE compared with TStage-BFGS (see Table 7 and Table 8). In the Frank–Weibull model, TStage-BFGS attains a coverage probability of 0.0% for every coefficient except $\alpha_{NT}$ (see Table 8). Furthermore, TStep-BFGS produces results comparable to those of TStep-BOBYQA. Note, however, that the parameter bounds must be specified properly for TStep-BFGS; otherwise, parameter estimates cannot be obtained, which limits its practicality. The BOBYQA optimization method is therefore more reliable than the L-BFGS-B optimization method. Consequently, our proposed TStep-BOBYQA method outperforms the other optimization methods in these simulations.

6. Discussion

In this article, we present copula-based regression models for semi-competing events with right censoring. We mainly consider four well-known bivariate copula functions (i.e., Clayton, Frank, Gumbel, and normal) with three parametric distributions (i.e., exponential, Weibull, and Gompertz) as the marginal distributions. To evaluate the effects of covariates on the times to events, we incorporated these covariates into an interpretable regression model. Because of the complexity of copula-based likelihood functions and the censored data, estimating the parameters of the copula functions and marginal distributions is challenging. The initial values of the parameters can strongly affect the accuracy of parameter estimation. We employed a two-step procedure to alleviate this influence and adopted the BOBYQA optimization solver.
In our analysis of real HIV/AIDS data from the ACS, we employed copula-based models to capture the association between the two events. The results were consistent across models, with the exception of the effect of CCR5 on the SI switch event. Specifically, CCR5 shows no relation to the SI switch event in most models, except for two cases: the Frank–exponential model (HR: 0.705, 95% CI: 0.425 to 0.986, using TStep-BFGS and TStep-BOBYQA; see Table 4) and the normal–Gompertz model (HR: 0.698, 95% CI: 0.402 to 0.995, using TStep-BFGS and TStep-BOBYQA; see Table 6), in which CCR5 “WM” is associated with a lower risk of SI switch. Additionally, we conducted several simulations to evaluate the performance of the different models based on the MSE and CP criteria. In terms of MSE, the two-step method performs slightly better than the two-stage method with both the L-BFGS-B and BOBYQA optimizers. In terms of CP, the two-step method consistently outperforms the two-stage method. The BOBYQA optimizer yields nearly the same accuracy as the L-BFGS-B optimizer in certain simulations, including the Gumbel–exponential and Gumbel–Gompertz models in Table 9, as well as the normal–exponential model in Table 10. However, in terms of CP, the BOBYQA optimizer outperforms the L-BFGS-B optimizer in some other simulations. Specifically, the TStage-BFGS method yields a CP further from 95% for the coefficients of the terminal regression models in the Clayton–Weibull and Clayton–Gompertz (Table 7) and Frank–Weibull (Table 8) models. Furthermore, in some simulations, careful specification of the bounds is essential for the L-BFGS-B optimizer; otherwise, parameter estimates may not be obtainable.
As is well known, one appealing feature of the copula model is that the marginal distributions do not depend on the dependency structure, so the association parameter and the margins can be modeled and estimated separately. However, different distributional choices within the family may induce significantly different dependence structures, as studied by Shih and Louis [3] and Wang and Wells [31]. Methods for selecting an appropriate copula model are critical to avoid incorrect association modeling, which can lead to misleading or incorrect conclusions. In addition to the AIC and BIC criteria, Zhu et al. [32] applied the diagnostic method proposed by Chen and Bandeen-Roche [33] to choose the appropriate copula model based on the observed bivariate failure times $(T_1, T_2)$. Further work could explore asymmetric and heavy-tailed distributions for parametric copulas [34]. Therefore, further research is warranted to develop a method for copula model selection.
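For reference, the information criteria mentioned above can be written as AIC = 2k − 2 log L and BIC = k log n − 2 log L, where k is the number of parameters, n the sample size, and log L the maximized log-likelihood. The sketch below implements these standard definitions and then picks the smallest AIC among the TStep-BOBYQA fits with Weibull margins reported in Tables 3 to 6. The function and variable names are illustrative, and a lowest AIC is only one input to model selection.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k log n - 2 log L."""
    return k * math.log(n) - 2 * loglik

# AIC values reported for TStep-BOBYQA with Weibull margins (Tables 3 to 6)
reported_aic = {
    "Clayton-Weibull": 250.411,
    "Frank-Weibull": 247.748,
    "Gumbel-Weibull": 247.238,
    "normal-Weibull": 244.226,
}

# The candidate with the smallest AIC among those compared
best = min(reported_aic, key=reported_aic.get)
print(best, reported_aic[best])
```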
In this article, we mainly consider the BOBYQA derivative-free optimization algorithm. Other derivative-free algorithms, such as those surveyed by Rios and Sahinidis [35], remain to be investigated and may yield better results. In practical applications, both the copula model and the optimizer must be chosen carefully for semi-competing risks data. At present, there is no established criterion for selecting an appropriate optimization algorithm for copula models across different semi-competing risks datasets. Optimization methodologies for semi-competing risks models remain a significant and understudied research area that warrants further systematic investigation.
Moreover, there may be issues related to the misspecification of the parametric marginal survival function. To enhance flexibility, semi-parametric or nonparametric models can be considered. Sun and Ding [36] proposed a class of copula-based semiparametric transformation models for bivariate data subject to general interval censoring. For the marginal model, they used the semi-parametric transformation model and approximated the infinite-dimensional nuisance parameters using sieves with Bernstein polynomials. The major challenges in analyzing semi-competing risks data are parameter identifiability, the stability of parameter estimation, and efficiency. Fine et al. [37] used a strategy that iterates between estimating the parameters of the terminal event and estimating the parameters of the copula and the non-terminal event. Finally, the real data analysis includes only two covariates for the AIDS data: age and CCR5. Other unmeasured covariates may influence the time to event, and a frailty term could be added in future work, as mentioned by Wei et al. [23].

Author Contributions

Conceptualization, Q.Z. and Y.W.; methodology, Q.Z. and Y.W.; software, Q.Z. and Y.W.; validation, Y.W., M.W., and B.D.; formal analysis, Y.W., and M.W.; writing—original draft preparation, Q.Z.; writing—review and editing, Y.W., M.W., and B.D.; project administration, Q.Z. and Y.W.; supervision, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research is part of the Yunnan–Plymouth joint Doctoral Training Programme in Statistical Science, funded by the China Scholarship Council (CSC). Q.Z. and B.D. were funded by this program (CSC-503854, CSC-557660) to visit the University of Plymouth.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Fine, J.; Jiang, H.; Chappell, R. On semi-competing risks data. Biometrika 2001, 88, 907–919.
  2. Xu, J.; Kalbfleisch, J.D.; Tai, B. Statistical analysis of illness–death processes and semicompeting risks data. Biometrics 2010, 66, 716–725.
  3. Shih, J.; Louis, T. Inferences on the association parameter in copula models for bivariate survival data. Biometrics 1995, 51, 1384–1399.
  4. Ramadan, D.A.; Hasaballah, M.M.; Abd-Elwaha, N.K.; Alshangiti, A.M.; Kamel, M.I.; Balogun, O.S.; El-Awady, M.M. Bayesian and Non-Bayesian Inference to Bivariate Alpha Power Burr-XII Distribution with Engineering Application. Axioms 2024, 13, 796.
  5. Fayomi, A.; Almetwally, E.M.; Qura, M.E. A novel bivariate Lomax-G family of distributions: Properties, inference, and applications to environmental, medical, and computer science data. AIMS Math. 2023, 8, 17539–17584.
  6. Haj Ahmad, H.; Almetwally, E.M.; Ramadan, D.A. Investigating the relationship between processor and memory reliability in data science: A bivariate model approach. Mathematics 2023, 11, 2142.
  7. Jiang, H.; Fine, J.; Kosorok, M.; Chappell, R. Pseudo self-consistent estimation of a copula model with informative censoring. Scand. J. Stat. 2005, 32, 1–20.
  8. Peng, L.; Fine, J. Regression modeling of semicompeting risks data. Biometrics 2007, 63, 96–108.
  9. Hsieh, J.; Huang, Y. Regression analysis based on conditional likelihood approach under semi-competing risks data. Lifetime Data Anal. 2012, 18, 302–320.
  10. Chen, Y. Maximum likelihood analysis of semicompeting risks data with semiparametric regression models. Lifetime Data Anal. 2012, 18, 36–57.
  11. Arachchige, S.; Chen, X.; Zhou, Q. Two-stage pseudo maximum likelihood estimation of semiparametric copula-based regression models for semi-competing risks data. Lifetime Data Anal. 2025, 31, 52–75.
  12. Sun, T.; Li, Y.; Xiao, Z.; Ding, Y.; Wang, X. Semiparametric copula method for semi-competing risks data subject to interval censoring and left truncation: Application to disability in elderly. Stat. Methods Med. Res. 2023, 32, 656–670.
  13. Powell, M. The BOBYQA Algorithm for Bound Constrained Optimization Without Derivatives; Cambridge NA Report NA2009/06; University of Cambridge: Cambridge, UK, 2009; Volume 26, pp. 26–46.
  14. Ragonneau, T. Model-Based Derivative-Free Optimization Methods and Software. Ph.D. Thesis, Hong Kong Polytechnic University, Hong Kong, China, 2023.
  15. Deresa, N.; Ingrid, V.; Katrien, A. Copula-based inference for bivariate survival data with left truncation and dependent censoring. Insur. Math. Econ. 2022, 107, 1–21.
  16. Czado, C.; Keilegom, I. Dependent censoring based on parametric copulas. Biometrika 2023, 110, 721–738.
  17. Geskus, R. On the inclusion of prevalent cases in HIV/AIDS natural history studies through a marker-based estimate of time since seroconversion. Stat. Med. 2000, 19, 1753–1769.
  18. Geskus, R.; Miedema, F.; Goudsmit, J.; Reiss, P.; Schuitemaker, H.; Coutinho, R. Prediction of residual time to AIDS and death based on markers and cofactors. JAIDS 2003, 32, 514–521.
  19. Sorrell, L.; Wei, Y.; Wojtyś, M.; Rowe, P. Estimating the correlation between semi-competing risk survival endpoints. Biom. J. 2022, 64, 131–145.
  20. Patton, A. Modelling asymmetric exchange rate dependence. Int. Econ. Rev. 2006, 47, 527–556.
  21. Nelsen, R. An Introduction to Copulas, 2nd ed.; Springer Science & Business Media: New York, NY, USA, 2006.
  22. Schepsmeier, U.; Jakob, S. Derivatives and Fisher information of bivariate copulas. Stat. Pap. 2014, 55, 525–542.
  23. Wei, Y.; Wojtyś, M.; Sorrell, L.; Rowe, P. Bivariate copula regression models for semi-competing risks. Stat. Methods Med. Res. 2023, 32, 1902–1918.
  24. Sun, T.; Ding, Y. CopulaCenR: Copula based Regression Models for Bivariate Censored Data in R. R J. 2020, 12, 266–282.
  25. Powell, M. Large-Scale Nonlinear Optimization, 1st ed.; Springer Science & Business Media: New York, NY, USA, 2006; pp. 255–297.
  26. Xie, P.; Yuan, Y. Derivative-Free Optimization with Transformed Objective Functions and the Algorithm Based on the Least Frobenius Norm Updating Quadratic Model. J. Oper. Res. Soc. China 2024, 1–37.
  27. Ragonneau, T.; Zhang, Z. PDFO: A cross-platform package for Powell’s derivative-free optimization solvers. Math. Prog. Comp. 2024, 16, 535–559.
  28. Nash, J.C.; Varadhan, R. Unifying optimization algorithms to aid software system users: Optimx for R. J. Stat. Softw. 2011, 43, 1–14.
  29. R Core Team. R: A Language and Environment for Statistical Computing; Foundation for Statistical Computing: Vienna, Austria, 2013; pp. 1613–1619.
  30. Eugen-Olsen, J.; Iversen, A.K.; Garred, P.; Koppelhus, U.; Benfield, T.L.; Sorensen, A.M.; Katzenstein, T.; Dickmeiss, E.; Gerstoft, J.; Skinhøj, P.; et al. Heterozygosity for a deletion in the CKR-5 gene leads to prolonged AIDS-free survival and slower CD4 T-cell decline in a cohort of HIV-seropositive individuals. Aids 1997, 11, 305–310.
  31. Wang, W.; Wells, M. Model selection and semiparametric inference for bivariate failure-time data. J. Am. Stat. Assoc. 2000, 95, 62–72.
  32. Zhu, H.; Lan, Y.; Ning, J.; Shen, Y. Semiparametric copula-based regression modeling of semi-competing risks data. Commun. Stat.-Theory Methods 2021, 51, 7830–7845.
  33. Chen, M.; Karen, B. A diagnostic for association in bivariate survival models. Lifetime Data Anal. 2005, 11, 245–264.
  34. Quintero, F.O.L.; Contreras-Reyes, J.E.; Wiff, R. Incorporating uncertainty into a length-based estimator of natural mortality in fish populations. Fish. Bull. 2017, 115, 355–364.
  35. Rios, L.; Sahinidis, N. Derivative-free optimization: A review of algorithms and comparison of software implementations. J. Glob. Optim. 2013, 56, 1247–1293.
  36. Sun, T.; Ding, Y. Copula-based semiparametric regression method for bivariate data under general interval censoring. Biostatistics 2021, 22, 315–330.
  37. Fine, J.; Yan, J.; Kosorok, M. Temporal process regression. Biometrika 2004, 91, 683–703.
Table 1. Summary of variables in the Amsterdam Cohort Study dataset (N = 324).
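| Variable | Category | Value |
|---|---|---|
| SI status | Switch | 113 (34.9%) |
| SI status | No switch | 211 (65.1%) |
| Mortality | Death | 178 (54.9%) |
| Mortality | No death | 146 (45.1%) |
| CCR5 genotype | WW (Wild Wild) | 259 (79.9%) |
| CCR5 genotype | WM (Wild Mutant) | 65 (20.1%) |
| Age (years) | Median (range) | 34.2 (19–58) |
| Age (years) | Mean (variance) | 34.7 (53.4) |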
WW: Wild type allele on both chromosomes; WM: Mutant allele on one chromosome. Data are presented as n (%) unless otherwise specified.
Table 2. Summary of proposed and baseline optimization methods.
| Method | Initialization | Algorithm | Procedure |
|---|---|---|---|
| TStage-BFGS | Grid search | L-BFGS-B | Two-Stage |
| TStep-BFGS | TStage-BFGS result | L-BFGS-B | Two-Step |
| TStage-BOBYQA | Grid search | BOBYQA | Two-Stage |
| TStep-BOBYQA | TStage-BOBYQA result | BOBYQA | Two-Step |
TStage: Two-Stage. TStep: Two-Step. L-BFGS-B: Limited-memory Broyden–Fletcher–Goldfarb–Shanno bounded. BOBYQA: Bound Optimization by Quadratic Approximation.
Table 3. The parameter estimation results of the Clayton copula model with exponential, Weibull, and Gompertz marginal distributions.
| Marginal distribution | Method | AIC | Covariate | HR for SI switch (95% CI) | HR for death (95% CI) | Regression coefficient on copula parameter (95% CI) |
|---|---|---|---|---|---|---|
| Exponential | TStage-BFGS | 326.599 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | −0.202 (−2.083, 1.680) |
| Exponential | TStage-BFGS | 326.599 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 1.283 (0.452, 2.114) |
| Exponential | TStep-BFGS | 319.340 | Age | 1.635 (0.194, 3.075) | 1.775 (0.415, 3.135) | 0.082 (−1.639, 1.803) |
| Exponential | TStep-BFGS | 319.340 | CCR5 type: WM | 0.738 (0.440, 1.035) | 0.521 (0.310, 0.732) | 0.804 (0.039, 1.569) |
| Exponential | TStage-BOBYQA | 326.599 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | −0.202 (−1.200, 0.796) |
| Exponential | TStage-BOBYQA | 326.599 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 1.283 (0.978, 1.588) |
| Exponential | TStep-BOBYQA | 319.340 | Age | 1.635 (0.194, 3.075) | 1.775 (0.415, 3.135) | 0.081 (−1.640, 1.802) |
| Exponential | TStep-BOBYQA | 319.340 | CCR5 type: WM | 0.738 (0.440, 1.035) | 0.521 (0.310, 0.732) | 0.804 (0.039, 1.569) |
| Weibull | TStage-BFGS | 256.229 | Age | 2.718 (−0.270, 5.706) | 3.719 (0.659, 6.778) | 0.364 (−1.656, 2.384) |
| Weibull | TStage-BFGS | 256.229 | CCR5 type: WM | 1.000 (0.514, 1.486) | 0.404 (0.226, 0.582) | 1.397 (0.447, 2.347) |
| Weibull | TStep-BFGS | 256.229 | Age | 2.718 (−0.374, 5.810) | 3.719 (0.669, 6.768) | 0.364 (−1.935, 2.663) |
| Weibull | TStep-BFGS | 256.229 | CCR5 type: WM | 1.000 (0.462, 1.538) | 0.404 (0.231, 0.577) | 1.397 (0.265, 2.528) |
| Weibull | TStage-BOBYQA | 250.411 | Age | 2.718 (0.024, 5.413) | 3.719 (0.659, 6.778) | 0.235 (−0.558, 1.028) |
| Weibull | TStage-BOBYQA | 250.411 | CCR5 type: WM | 1.000 (0.561, 1.439) | 0.404 (0.226, 0.582) | 1.329 (0.975, 1.683) |
| Weibull | TStep-BOBYQA | 250.411 | Age | 2.718 (0.041, 5.395) | 3.719 (0.723, 6.715) | 0.235 (−1.791, 2.261) |
| Weibull | TStep-BOBYQA | 250.411 | CCR5 type: WM | 1.000 (0.545, 1.455) | 0.404 (0.231, 0.577) | 1.329 (0.353, 2.305) |
| Gompertz | TStage-BFGS | 257.348 | Age | 1.708 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.484 (−1.384, 2.351) |
| Gompertz | TStage-BFGS | 257.348 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 1.229 (0.363, 2.095) |
| Gompertz | TStep-BFGS | | Age | 39,610.734 (39,610.734, 39,610.734) | 3,269,017.372 (3,269,017.372, 3,269,017.372) | 15.000 (15.000, 15.000) |
| Gompertz | TStep-BFGS | | CCR5 type: WM | 492.465 (492.465, 492.465) | 3,269,017.372 (3,269,017.372, 3,269,017.372) | 15.000 (15.000, 15.000) |
| Gompertz | TStage-BOBYQA | 257.348 | Age | 1.708 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.484 (−0.406, 1.373) |
| Gompertz | TStage-BOBYQA | 257.348 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 1.229 (0.853, 1.606) |
| Gompertz | TStep-BOBYQA | 252.517 | Age | 2.172 (0.092, 4.251) | 3.393 (0.632, 6.155) | 0.662 (−1.178, 2.502) |
| Gompertz | TStep-BOBYQA | 252.517 | CCR5 type: WM | 0.707 (0.402, 1.013) | 0.400 (0.228, 0.573) | 1.025 (0.161, 1.888) |
Table 4. The parameter estimation results of the Frank copula model with exponential, Weibull, and Gompertz marginal distributions.
| Marginal distribution | Method | AIC | Covariate | HR for SI switch (95% CI) | HR for death (95% CI) | Regression coefficient on copula parameter (95% CI) |
|---|---|---|---|---|---|---|
| Exponential | TStage-BFGS | 325.685 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | −2.017 (−9.359, 5.324) |
| Exponential | TStage-BFGS | 325.685 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 5.742 (0.220, 11.264) |
| Exponential | TStep-BFGS | 314.003 | Age | 1.422 (0.174, 2.671) | 1.611 (0.357, 2.864) | −1.074 (−10.917, 8.769) |
| Exponential | TStep-BFGS | 314.003 | CCR5 type: WM | 0.705 (0.425, 0.986) | 0.523 (0.316, 0.729) | 4.079 (−1.632, 9.787) |
| Exponential | TStage-BOBYQA | 325.685 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | −2.017 (−2.060, −1.974) |
| Exponential | TStage-BOBYQA | 325.685 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 5.742 (5.742, 5.742) |
| Exponential | TStep-BOBYQA | 314.003 | Age | 1.423 (0.174, 2.672) | 1.612 (0.357, 2.866) | −1.085 (−10.927, 8.757) |
| Exponential | TStep-BOBYQA | 314.003 | CCR5 type: WM | 0.706 (0.425, 0.986) | 0.523 (0.316, 0.729) | 4.086 (−1.630, 9.802) |
| Weibull | TStage-BFGS | 253.969 | Age | 2.718 (−0.270, 5.706) | 3.719 (0.659, 6.778) | 0.795 (−4.983, 6.573) |
| Weibull | TStage-BFGS | 253.969 | CCR5 type: WM | 1.000 (0.514, 1.486) | 0.404 (0.226, 0.582) | 3.538 (−0.294, 7.371) |
| Weibull | TStep-BFGS | 253.969 | Age | 2.718 (−0.433, 5.870) | 3.719 (0.529, 6.908) | 0.795 (−5.999, 7.589) |
| Weibull | TStep-BFGS | 253.969 | CCR5 type: WM | 1.000 (0.488, 1.512) | 0.404 (0.230, 0.578) | 3.538 (−0.746, 7.822) |
| Weibull | TStage-BOBYQA | 247.748 | Age | 2.718 (0.024, 5.413) | 3.719 (0.659, 6.778) | 0.873 (0.806, 0.939) |
| Weibull | TStage-BOBYQA | 247.748 | CCR5 type: WM | 1.000 (0.561, 1.439) | 0.404 (0.226, 0.582) | 3.433 (3.433, 3.433) |
| Weibull | TStep-BOBYQA | 247.748 | Age | 2.718 (0.052, 5.385) | 3.719 (0.581, 6.856) | 0.873 (−5.731, 7.476) |
| Weibull | TStep-BOBYQA | 247.748 | CCR5 type: WM | 1.000 (0.564, 1.436) | 0.404 (0.230, 0.578) | 3.433 (−0.564, 7.431) |
| Gompertz | TStage-BFGS | 254.023 | Age | 1.709 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.979 (−5.248, 7.205) |
| Gompertz | TStage-BFGS | 254.023 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 3.011 (−0.783, 6.805) |
| Gompertz | TStep-BFGS | 249.938 | Age | 1.943 (0.121, 3.76) | 2.924 (0.474, 5.374) | 2.629 (−4.556, 9.814) |
| Gompertz | TStep-BFGS | 249.938 | CCR5 type: WM | 0.722 (0.422, 1.023) | 0.405 (0.234, 0.576) | 2.679 (−1.160, 6.519) |
| Gompertz | TStage-BOBYQA | 254.024 | Age | 1.708 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.978 (0.851, 1.105) |
| Gompertz | TStage-BOBYQA | 254.024 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 3.011 (3.011, 3.011) |
| Gompertz | TStep-BOBYQA | 249.938 | Age | 1.943 (0.122, 3.765) | 2.922 (0.474, 5.370) | 2.653 (−4.532, 9.839) |
| Gompertz | TStep-BOBYQA | 249.938 | CCR5 type: WM | 0.722 (0.421, 1.023) | 0.405 (0.234, 0.576) | 2.688 (−1.155, 6.531) |
Table 5. The parameter estimation results of the Gumbel copula model with exponential, Weibull, and Gompertz marginal distributions.
| Marginal distribution | Method | AIC | Covariate | HR for SI switch (95% CI) | HR for death (95% CI) | Regression coefficient on copula parameter (95% CI) |
|---|---|---|---|---|---|---|
| Exponential | TStage-BFGS | 330.925 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | 0.194 (−2.026, 2.414) |
| Exponential | TStage-BFGS | 330.925 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 1.180 (0.389, 1.971) |
| Exponential | TStep-BFGS | 323.343 | Age | 1.780 (0.263, 3.297) | 1.882 (0.404, 3.359) | 0.600 (−1.301, 2.501) |
| Exponential | TStep-BFGS | 323.343 | CCR5 type: WM | 0.757 (0.452, 1.063) | 0.504 (0.296, 0.713) | 0.847 (0.134, 1.560) |
| Exponential | TStage-BOBYQA | 330.925 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | 0.194 (−0.405, 0.794) |
| Exponential | TStage-BOBYQA | 330.925 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 1.180 (0.313, 2.047) |
| Exponential | TStep-BOBYQA | 323.343 | Age | 1.780 (0.263, 3.297) | 1.882 (0.404, 3.359) | 0.600 (−1.301, 2.502) |
| Exponential | TStep-BOBYQA | 323.343 | CCR5 type: WM | 0.757 (0.452, 1.063) | 0.504 (0.296, 0.713) | 0.847 (0.134, 1.560) |
| Weibull | TStage-BFGS | 251.986 | Age | 2.718 (−0.270, 5.706) | 3.719 (0.659, 6.778) | −0.112 (−2.204, 1.980) |
| Weibull | TStage-BFGS | 251.986 | CCR5 type: WM | 1.000 (0.514, 1.486) | 0.404 (0.226, 0.582) | 1.026 (0.195, 1.856) |
| Weibull | TStep-BFGS | 251.986 | Age | 2.718 (−0.102, 5.538) | 3.719 (0.701, 6.737) | −0.112 (−2.275, 2.051) |
| Weibull | TStep-BFGS | 251.986 | CCR5 type: WM | 1.000 (0.525, 1.475) | 0.404 (0.231, 0.576) | 1.026 (0.147, 1.905) |
| Weibull | TStage-BOBYQA | 247.238 | Age | 2.718 (0.024, 5.413) | 3.719 (0.659, 6.778) | −0.150 (−0.511, 0.212) |
| Weibull | TStage-BOBYQA | 247.238 | CCR5 type: WM | 1.000 (0.561, 1.439) | 0.404 (0.226, 0.582) | 0.846 (0.479, 1.213) |
| Weibull | TStep-BOBYQA | 247.238 | Age | 2.718 (0.177, 5.259) | 3.719 (0.718, 6.720) | −0.150 (−2.169, 1.870) |
| Weibull | TStep-BOBYQA | 247.238 | CCR5 type: WM | 1.000 (0.571, 1.429) | 0.404 (0.230, 0.578) | 0.846 (0.015, 1.676) |
| Gompertz | TStage-BFGS | 254.501 | Age | 1.709 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.258 (−1.569, 2.086) |
| Gompertz | TStage-BFGS | 254.501 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 0.804 (0.065, 1.542) |
| Gompertz | TStep-BFGS | 252.165 | Age | 2.075 (0.187, 3.964) | 3.461 (0.627, 6.294) | 0.490 (−1.294, 2.274) |
| Gompertz | TStep-BFGS | 252.165 | CCR5 type: WM | 0.715 (0.411, 1.019) | 0.408 (0.235, 0.581) | 0.670 (−0.059, 1.399) |
| Gompertz | TStage-BOBYQA | 254.502 | Age | 1.708 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.258 (−0.167, 0.684) |
| Gompertz | TStage-BOBYQA | 254.502 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 0.804 (0.438, 1.169) |
| Gompertz | TStep-BOBYQA | 252.165 | Age | 2.074 (0.186, 3.962) | 3.458 (0.626, 6.290) | 0.487 (−1.297, 2.272) |
| Gompertz | TStep-BOBYQA | 252.165 | CCR5 type: WM | 0.715 (0.411, 1.019) | 0.408 (0.235, 0.581) | 0.670 (−0.059, 1.399) |
Table 6. The parameter estimation results of the normal copula model with exponential, Weibull, and Gompertz marginal distributions.
| Marginal distribution | Method | AIC | Covariate | HR for SI switch (95% CI) | HR for death (95% CI) | Regression coefficient on copula parameter (95% CI) |
|---|---|---|---|---|---|---|
| Exponential | TStage-BFGS | 323.288 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | 0.143 (−0.859, 1.145) |
| Exponential | TStage-BFGS | 323.288 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 0.644 (0.222, 1.066) |
| Exponential | TStep-BFGS | 313.698 | Age | 1.668 (0.222, 3.114) | 1.822 (0.409, 3.235) | 0.165 (−0.882, 1.213) |
| Exponential | TStep-BFGS | 313.698 | CCR5 type: WM | 0.735 (0.439, 1.030) | 0.506 (0.299, 0.714) | 0.426 (0.000, 0.852) |
| Exponential | TStage-BOBYQA | 323.288 | Age | 1.529 (−0.024, 3.082) | 2.040 (0.418, 3.662) | 0.143 (−0.859, 1.145) |
| Exponential | TStage-BOBYQA | 323.288 | CCR5 type: WM | 0.826 (0.446, 1.205) | 0.496 (0.279, 0.714) | 0.644 (0.222, 1.066) |
| Exponential | TStep-BOBYQA | 313.698 | Age | 1.668 (0.222, 3.114) | 1.822 (0.409, 3.235) | 0.165 (−0.882, 1.213) |
| Exponential | TStep-BOBYQA | 313.698 | CCR5 type: WM | 0.735 (0.439, 1.030) | 0.506 (0.299, 0.714) | 0.426 (0.000, 0.852) |
| Weibull | TStage-BFGS | 249.868 | Age | 2.718 (−0.270, 5.706) | 3.719 (0.659, 6.778) | 0.042 (−0.852, 0.936) |
| Weibull | TStage-BFGS | 249.868 | CCR5 type: WM | 1.000 (0.514, 1.486) | 0.404 (0.226, 0.582) | 0.517 (0.086, 0.947) |
| Weibull | TStep-BFGS | 249.868 | Age | 2.718 (−0.234, 5.671) | 3.719 (0.666, 6.772) | 0.042 (−0.930, 1.014) |
| Weibull | TStep-BFGS | 249.868 | CCR5 type: WM | 1.000 (0.507, 1.493) | 0.404 (0.231, 0.577) | 0.517 (0.056, 0.977) |
| Weibull | TStage-BOBYQA | 244.226 | Age | 2.718 (0.024, 5.413) | 3.719 (0.659, 6.778) | 0.003 (−0.895, 0.901) |
| Weibull | TStage-BOBYQA | 244.226 | CCR5 type: WM | 1.000 (0.561, 1.439) | 0.404 (0.226, 0.582) | 0.447 (0.024, 0.870) |
| Weibull | TStep-BOBYQA | 244.226 | Age | 2.718 (0.112, 5.324) | 3.719 (0.700, 6.737) | 0.003 (−0.919, 0.926) |
| Weibull | TStep-BOBYQA | 244.226 | CCR5 type: WM | 1.000 (0.562, 1.438) | 0.404 (0.228, 0.580) | 0.447 (0.003, 0.891) |
| Gompertz | TStage-BFGS | 249.546 | Age | 1.709 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.245 (−0.627, 1.117) |
| Gompertz | TStage-BFGS | 249.546 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 0.481 (0.067, 0.896) |
| Gompertz | TStep-BFGS | 245.306 | Age | 2.065 (0.164, 3.967) | 3.434 (0.654, 6.215) | 0.283 (−0.612, 1.177) |
| Gompertz | TStep-BFGS | 245.306 | CCR5 type: WM | 0.698 (0.402, 0.995) | 0.406 (0.235, 0.578) | 0.353 (−0.061, 0.767) |
| Gompertz | TStage-BOBYQA | 249.546 | Age | 1.708 (−0.051, 3.468) | 3.708 (0.645, 6.771) | 0.245 (−0.627, 1.117) |
| Gompertz | TStage-BOBYQA | 249.546 | CCR5 type: WM | 0.781 (0.420, 1.143) | 0.400 (0.224, 0.576) | 0.481 (0.067, 0.896) |
| Gompertz | TStep-BOBYQA | 245.306 | Age | 2.065 (0.164, 3.967) | 3.434 (0.654, 6.215) | 0.282 (−0.612, 1.177) |
| Gompertz | TStep-BOBYQA | 245.306 | CCR5 type: WM | 0.698 (0.402, 0.995) | 0.406 (0.235, 0.578) | 0.353 (−0.061, 0.766) |
The results are for real data on HIV/AIDS from the Amsterdam Cohort Study. The binary covariate CCR5 has two categories: “WM” and “WW” (reference). The SI switch refers to the virus phenotype switching from non-syncytium-inducing to syncytium-inducing.
Table 7. Simulation results for the Clayton copula.
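| Parameter | True value | TStage-BFGS MSE | TStage-BFGS CP | TStep-BFGS MSE | TStep-BFGS CP | TStage-BOBYQA MSE | TStage-BOBYQA CP | TStep-BOBYQA MSE | TStep-BOBYQA CP |
|---|---|---|---|---|---|---|---|---|---|
| Exponential distribution | | | | | | | | | |
| $\hat{\beta}_{NT,0}$ | 1.360 | 0.046 | 0.0 | 0.002 | 94.4 | 0.046 | 0.0 | 0.002 | 94.4 |
| $\hat{\beta}_{NT,age}$ | 0.577 | 0.011 | 81.6 | 0.004 | 93.6 | 0.011 | 81.6 | 0.004 | 93.6 |
| $\hat{\beta}_{NT,ccr5}$ | 0.100 | 0.002 | 93.4 | 0.002 | 94.0 | 0.002 | 93.4 | 0.002 | 94.0 |
| $\hat{\beta}_{T,0}$ | 0.980 | 0.002 | 94.2 | 0.002 | 94.8 | 0.002 | 94.2 | 0.002 | 94.8 |
| $\hat{\beta}_{T,age}$ | 0.300 | 0.005 | 95.0 | 0.004 | 94.2 | 0.005 | 95.0 | 0.004 | 94.2 |
| $\hat{\beta}_{T,ccr5}$ | 0.150 | 0.001 | 95.8 | 0.001 | 95.8 | 0.001 | 95.8 | 0.001 | 95.8 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.070 | 10.4 | 0.006 | 93.4 | 0.071 | 0.4 | 0.006 | 93.4 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.054 | 57.0 | 0.015 | 94.4 | 0.054 | 33.6 | 0.016 | 94.4 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.011 | 89.8 | 0.008 | 95.2 | 0.011 | 52.2 | 0.008 | 95.2 |
| Time (minutes) | | 171.6 | | 242.1 | | 208.0 | | 315.7 | |
| Weibull distribution | | | | | | | | | |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.002 | 92.8 | 0.001 | 95.6 | 0.002 | 92.8 | 0.001 | 95.8 |
| $\hat{\beta}_{NT,0}$ | −1.660 | 0.004 | 87.0 | 0.002 | 94.8 | 0.004 | 87.0 | 0.002 | 94.6 |
| $\hat{\beta}_{NT,age}$ | −0.077 | 0.004 | 96.0 | 0.003 | 96.2 | 0.004 | 96.0 | 0.003 | 96.0 |
| $\hat{\beta}_{NT,ccr5}$ | 0.210 | 0.001 | 97.2 | 0.001 | 97.0 | 0.001 | 97.2 | 0.001 | 97.0 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.020 | 5.8 | 0.002 | 94.0 | 0.002 | 95.4 | 0.002 | 93.4 |
| $\hat{\beta}_{T,0}$ | −3.260 | 0.068 | 0.0 | 0.005 | 94.0 | 0.005 | 95.4 | 0.005 | 93.0 |
| $\hat{\beta}_{T,age}$ | −0.370 | 0.018 | 60.8 | 0.004 | 94.4 | 0.005 | 94.2 | 0.004 | 94.4 |
| $\hat{\beta}_{T,ccr5}$ | 0.050 | 0.003 | 89.4 | 0.001 | 97.4 | 0.001 | 97.4 | 0.001 | 97.4 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.013 | 68.6 | 0.006 | 95.2 | 0.006 | 55.6 | 0.006 | 95.2 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.054 | 63.8 | 0.019 | 95.6 | 0.021 | 54.2 | 0.020 | 95.2 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.015 | 87.4 | 0.011 | 94.0 | 0.011 | 46.2 | 0.011 | 94.0 |
| Time (minutes) | | 850.2 | | 976.4 | | 876.3 | | 1055.1 | |
| Gompertz distribution | | | | | | | | | |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.004 | 91.6 | 0.002 | 95.8 | 0.004 | 91.6 | 0.002 | 95.8 |
| $\hat{\beta}_{NT,0}$ | −1.660 | 0.005 | 90.8 | 0.003 | 95.4 | 0.005 | 90.8 | 0.003 | 95.2 |
| $\hat{\beta}_{NT,age}$ | −0.077 | 0.004 | 96.0 | 0.004 | 96.0 | 0.004 | 96.0 | 0.004 | 95.8 |
| $\hat{\beta}_{NT,ccr5}$ | 0.210 | 0.002 | 95.6 | 0.001 | 95.8 | 0.002 | 95.6 | 0.001 | 96.0 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.023 | 0.7 | 0.002 | 94.8 | 0.003 | 94.8 | 0.002 | 94.0 |
| $\hat{\beta}_{T,0}$ | −3.260 | 0.068 | 0.0 | 0.006 | 95.2 | 0.006 | 94.6 | 0.005 | 94.8 |
| $\hat{\beta}_{T,age}$ | −0.370 | 0.013 | 75.6 | 0.004 | 96.4 | 0.005 | 94.4 | 0.004 | 94.4 |
| $\hat{\beta}_{T,ccr5}$ | 0.050 | 0.003 | 88.2 | 0.002 | 94.4 | 0.002 | 94.2 | 0.002 | 94.6 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.009 | 83.6 | 0.006 | 96.6 | 0.006 | 57.2 | 0.006 | 96.6 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.044 | 74.8 | 0.022 | 94.4 | 0.023 | 51.0 | 0.022 | 94.4 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.014 | 90.4 | 0.012 | 94.6 | 0.012 | 45.4 | 0.012 | 94.8 |
| Time (minutes) | | 664.2 | | 785.2 | | 702.4 | | 859.4 | |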
Data were generated using the Clayton copula with exponential, Weibull, and Gompertz marginal distributions, respectively. We generated 500 datasets, with 3000 individuals in each. MSE: mean squared error. CP: empirical coverage probability for 95% confidence interval. TStage-BFGS: two-stage procedure with L-BFGS-B optimization algorithm. TStep-BFGS: two-step procedure with L-BFGS-B optimization algorithm. TStage-BOBYQA: two-stage procedure with BOBYQA optimization algorithm. TStep-BOBYQA: two-step procedure with BOBYQA optimization algorithm.
Table 8. Simulation results for the Frank copula.
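| Parameter | True value | TStage-BFGS MSE | TStage-BFGS CP | TStep-BFGS MSE | TStep-BFGS CP | TStage-BOBYQA MSE | TStage-BOBYQA CP | TStep-BOBYQA MSE | TStep-BOBYQA CP |
|---|---|---|---|---|---|---|---|---|---|
| Exponential distribution | | | | | | | | | |
| $\hat{\beta}_{NT,0}$ | 1.360 | 0.003 | 71.0 | 0.008 | 97.0 | 0.003 | 71.0 | 0.008 | 95.0 |
| $\hat{\beta}_{NT,age}$ | 0.577 | 0.007 | 96.0 | 0.007 | 97.0 | 0.007 | 96.0 | 0.007 | 95.0 |
| $\hat{\beta}_{NT,ccr5}$ | 0.100 | 0.003 | 71.6 | 0.007 | 94.6 | 0.003 | 71.6 | 0.007 | 94.4 |
| $\hat{\beta}_{T,0}$ | 0.980 | 0.002 | 96.8 | 0.002 | 96.8 | 0.002 | 96.8 | 0.002 | 96.8 |
| $\hat{\beta}_{T,age}$ | 0.300 | 0.004 | 94.6 | 0.004 | 94.4 | 0.004 | 94.6 | 0.004 | 94.4 |
| $\hat{\beta}_{T,ccr5}$ | 0.150 | 0.001 | 97.0 | 0.001 | 97.0 | 0.001 | 97.0 | 0.001 | 97.0 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.105 | 91.8 | 0.098 | 93.4 | 0.105 | 10.6 | 0.097 | 93.6 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.257 | 95.0 | 0.210 | 93.8 | 0.256 | 14.4 | 0.210 | 93.8 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.229 | 94.4 | 0.077 | 94.6 | 0.083 | 21.8 | 0.077 | 94.6 |
| Time (minutes) | | 425.1 | | 570.5 | | 398.7 | | 500.7 | |
| Weibull distribution | | | | | | | | | |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.002 | 95.4 | 0.002 | 95.4 | 0.002 | 95.0 | 0.002 | 95.8 |
| $\hat{\beta}_{NT,0}$ | −1.660 | 0.004 | 0.0 | 0.003 | 95.2 | 0.003 | 92.0 | 0.003 | 95.2 |
| $\hat{\beta}_{NT,age}$ | −0.077 | 0.005 | 0.0 | 0.005 | 95.0 | 0.006 | 93.2 | 0.005 | 94.4 |
| $\hat{\beta}_{NT,ccr5}$ | 0.210 | 0.002 | 0.0 | 0.002 | 96.4 | 0.003 | 89.4 | 0.002 | 94.6 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.003 | 0.0 | 0.003 | 92.8 | 0.002 | 95.6 | 0.002 | 95.6 |
| $\hat{\beta}_{T,0}$ | −3.260 | 0.005 | 0.0 | 0.005 | 93.6 | 0.005 | 94.6 | 0.005 | 95.0 |
| $\hat{\beta}_{T,age}$ | −0.370 | 0.005 | 0.0 | 0.005 | 93.8 | 0.005 | 95.4 | 0.005 | 95.0 |
| $\hat{\beta}_{T,ccr5}$ | 0.050 | 0.002 | 0.0 | 0.002 | 96.2 | 0.002 | 96.0 | 0.002 | 96.4 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.066 | 0.0 | 0.067 | 95.4 | 0.083 | 11.6 | 0.067 | 91.6 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.151 | 0.0 | 0.155 | 96.0 | 0.196 | 18.2 | 0.156 | 93.0 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.061 | 0.0 | 0.063 | 96.0 | 0.066 | 17.4 | 0.063 | 93.2 |
| Time (minutes) 1106.7 1292.4 | | | | | | | | | |
| Gompertz distribution | | | | | | | | | |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.000 | 93.8 | 0.004 | 94.2 | 0.003 | 93.8 | 0.004 | 94.2 |
| $\hat{\beta}_{NT,0}$ | −1.660 | 0.008 | 92.8 | 0.004 | 95.6 | 0.005 | 92.8 | 0.004 | 95.2 |
| $\hat{\beta}_{NT,age}$ | −0.077 | 0.039 | 93.8 | 0.005 | 97.0 | 0.006 | 93.8 | 0.005 | 96.8 |
| $\hat{\beta}_{NT,ccr5}$ | 0.210 | 0.004 | 91.2 | 0.002 | 96.2 | 0.002 | 91.2 | 0.002 | 96.0 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.000 | 95.4 | 0.003 | 95.4 | 0.003 | 95.4 | 0.003 | 95.4 |
| $\hat{\beta}_{T,0}$ | −3.260 | 0.000 | 95.0 | 0.007 | 95.4 | 0.007 | 95.0 | 0.007 | 95.4 |
| $\hat{\beta}_{T,age}$ | −0.370 | 0.013 | 94.6 | 0.005 | 95.0 | 0.005 | 94.6 | 0.005 | 95.0 |
| $\hat{\beta}_{T,ccr5}$ | 0.050 | 0.004 | 94.6 | 0.002 | 94.2 | 0.002 | 94.6 | 0.002 | 94.2 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.235 | 95.0 | 0.072 | 95.2 | 0.071 | 12.6 | 0.072 | 95.6 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.280 | 94.8 | 0.198 | 94.8 | 0.191 | 13.4 | 0.196 | 94.8 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.239 | 93.4 | 0.074 | 93.0 | 0.072 | 17.6 | 0.074 | 92.8 |
| Time (minutes) | | 919.6 | | 1090.3 | | 919.6 | | 1090.3 | |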
Data were generated using the Frank copula with exponential, Weibull, and Gompertz marginal distributions, respectively. We generated 500 datasets, with 3000 individuals in each. MSE: mean squared error. CP: empirical coverage probability for 95% confidence interval. TStage-BFGS: two-stage procedure with L-BFGS-B optimization algorithm. TStep-BFGS: two-step procedure with L-BFGS-B optimization algorithm. TStage-BOBYQA: two-stage procedure with BOBYQA optimization algorithm. TStep-BOBYQA: two-step procedure with BOBYQA optimization algorithm.
Table 9. Simulation results for the Gumbel copula.
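| Parameter | True value | TStage-BFGS MSE | TStage-BFGS CP | TStep-BFGS MSE | TStep-BFGS CP | TStage-BOBYQA MSE | TStage-BOBYQA CP | TStep-BOBYQA MSE | TStep-BOBYQA CP |
|---|---|---|---|---|---|---|---|---|---|
| Exponential distribution | | | | | | | | | |
| $\hat{\beta}_{NT,0}$ | 1.360 | 0.016 | 24.8 | 0.002 | 95.6 | 0.016 | 24.8 | 0.002 | 95.8 |
| $\hat{\beta}_{NT,age}$ | 0.577 | 0.012 | 76.2 | 0.004 | 95.0 | 0.012 | 76.2 | 0.004 | 95.2 |
| $\hat{\beta}_{NT,ccr5}$ | 0.100 | 0.010 | 46.6 | 0.002 | 94.4 | 0.010 | 46.6 | 0.001 | 94.6 |
| $\hat{\beta}_{T,0}$ | 0.980 | 0.002 | 95.2 | 0.002 | 94.4 | 0.002 | 95.4 | 0.002 | 94.6 |
| $\hat{\beta}_{T,age}$ | 0.300 | 0.005 | 94.8 | 0.004 | 94.0 | 0.005 | 95.0 | 0.004 | 94.2 |
| $\hat{\beta}_{T,ccr5}$ | 0.150 | 0.001 | 95.2 | 0.001 | 95.4 | 0.001 | 95.4 | 0.001 | 95.6 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.009 | 68.0 | 0.003 | 93.0 | 0.009 | 68.4 | 0.003 | 93.2 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.016 | 78.6 | 0.007 | 95.0 | 0.016 | 72.6 | 0.007 | 95.2 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.010 | 81.0 | 0.004 | 95.0 | 0.010 | 57.6 | 0.004 | 95.2 |
| Time (minutes) | | 560.4 | | 650.3 | | 590.1 | | 748.0 | |
| Weibull distribution | | | | | | | | | |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.001 | 96.2 | 0.001 | 95.4 | 0.001 | 95.0 | 0.001 | 95.0 |
| $\hat{\beta}_{NT,0}$ | −1.660 | 0.003 | 94.0 | 0.003 | 93.8 | 0.003 | 95.4 | 0.002 | 95.0 |
| $\hat{\beta}_{NT,age}$ | −0.077 | 0.004 | 94.2 | 0.004 | 95.2 | 0.004 | 94.8 | 0.004 | 95.8 |
| $\hat{\beta}_{NT,ccr5}$ | 0.210 | 0.002 | 95.4 | 0.002 | 96.4 | 0.002 | 94.0 | 0.001 | 94.4 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.002 | 94.2 | 0.002 | 95.8 | 0.002 | 94.2 | 0.002 | 94.8 |
| $\hat{\beta}_{T,0}$ | −3.260 | 0.005 | 93.8 | 0.005 | 94.4 | 0.005 | 96.0 | 0.005 | 95.8 |
| $\hat{\beta}_{T,age}$ | −0.370 | 0.004 | 96.0 | 0.004 | 93.8 | 0.004 | 95.2 | 0.004 | 94.2 |
| $\hat{\beta}_{T,ccr5}$ | 0.050 | 0.002 | 96.8 | 0.002 | 96.8 | 0.002 | 95.6 | 0.001 | 96.2 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.003 | 92.4 | 0.003 | 94.2 | 0.003 | 89.0 | 0.003 | 94.8 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.010 | 94.0 | 0.010 | 93.8 | 0.010 | 74.6 | 0.010 | 94.0 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.005 | 94.2 | 0.005 | 94.2 | 0.005 | 76.4 | 0.005 | 95.4 |
| Time (minutes) | | 1228.1 | | 1332.6 | | 1248.5 | | 1462.7 | |
| Gompertz distribution | | | | | | | | | |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.003 | 94.4 | 0.003 | 95.6 | 0.003 | 94.8 | 0.003 | 95.0 |
| $\hat{\beta}_{NT,0}$ | −1.660 | 0.004 | 96.2 | 0.003 | 95.6 | 0.004 | 94.8 | 0.003 | 95.0 |
| $\hat{\beta}_{NT,age}$ | −0.077 | 0.004 | 96.0 | 0.004 | 95.6 | 0.005 | 95.2 | 0.004 | 95.4 |
| $\hat{\beta}_{NT,ccr5}$ | 0.210 | 0.002 | 92.8 | 0.001 | 95.4 | 0.002 | 93.6 | 0.001 | 93.8 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.003 | 94.6 | 0.002 | 94.2 | 0.003 | 94.6 | 0.002 | 94.8 |
| $\hat{\beta}_{T,0}$ | −3.260 | 0.007 | 94.0 | 0.005 | 94.4 | 0.006 | 94.6 | 0.005 | 94.8 |
| $\hat{\beta}_{T,age}$ | −0.370 | 0.005 | 94.4 | 0.005 | 96.2 | 0.006 | 94.0 | 0.005 | 94.2 |
| $\hat{\beta}_{T,ccr5}$ | 0.050 | 0.002 | 95.4 | 0.001 | 95.2 | 0.002 | 95.2 | 0.001 | 95.4 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.003 | 94.0 | 0.003 | 94.8 | 0.003 | 94.0 | 0.003 | 94.2 |
| $\hat{\beta}_{C,age}$ | −0.037 | 0.009 | 94.8 | 0.010 | 95.0 | 0.010 | 95.0 | 0.010 | 95.2 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.005 | 95.0 | 0.005 | 95.0 | 0.005 | 95.2 | 0.005 | 95.4 |
| Time (minutes) | | 1044.5 | | 1147.1 | | 1072.8 | | 1276.7 | |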
Data were generated using the Gumbel copula with exponential, Weibull, and Gompertz models, respectively. We generated 500 datasets, with 3000 individuals in each. MSE: mean squared error. CP: empirical coverage probability of the 95% confidence interval. TStage-BFGS: two-stage procedure with the L-BFGS-B optimization algorithm. TStep-BFGS: two-step procedure with the L-BFGS-B optimization algorithm. TStage-BOBYQA: two-stage procedure with the BOBYQA optimization algorithm. TStep-BOBYQA: two-step procedure with the BOBYQA optimization algorithm.
Table 10. Simulation results for the Normal copula.
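Exponential distribution

| Parameter | True value | TStage-BFGS MSE | TStage-BFGS CP (%) | TStep-BFGS MSE | TStep-BFGS CP (%) | TStage-BOBYQA MSE | TStage-BOBYQA CP (%) | TStep-BOBYQA MSE | TStep-BOBYQA CP (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\hat{\beta}_{NT,0}$ | 1.360 | 0.048 | 0.0 | 0.002 | 94.2 | 0.050 | 0.06 | 0.002 | 94.2 |
| $\hat{\beta}_{NT,age}$ | 0.577 | 0.010 | 84.6 | 0.004 | 94.6 | 0.011 | 84.6 | 0.004 | 94.6 |
| $\hat{\beta}_{NT,ccr5}$ | 0.100 | 0.030 | 0.6 | 0.002 | 94.6 | 0.030 | 0.6 | 0.002 | 94.6 |
| $\hat{\beta}_{T,0}$ | 0.980 | 0.002 | 97.0 | 0.001 | 95.4 | 0.002 | 95.0 | 0.001 | 95.4 |
| $\hat{\beta}_{T,age}$ | 0.300 | 0.005 | 94.6 | 0.004 | 94.6 | 0.005 | 94.6 | 0.004 | 94.6 |
| $\hat{\beta}_{T,ccr5}$ | 0.150 | 0.001 | 96.0 | 0.001 | 96.0 | 0.001 | 96.0 | 0.001 | 96.0 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.013 | 29.8 | 0.002 | 92.8 | 0.013 | 29.8 | 0.002 | 92.8 |
| $\hat{\beta}_{C,age}$ | −1.037 | 0.010 | 78.8 | 0.005 | 92.8 | 0.010 | 78.8 | 0.005 | 92.8 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.006 | 71.6 | 0.002 | 94.4 | 0.006 | 71.6 | 0.002 | 94.4 |

Time (minutes): TStage-BFGS 1637.3; TStep-BFGS 1922.3; TStage-BOBYQA 1704.4; TStep-BOBYQA 2109.0

Weibull distribution

| Parameter | True value | TStage-BFGS MSE | TStage-BFGS CP (%) | TStep-BFGS MSE | TStep-BFGS CP (%) | TStage-BOBYQA MSE | TStage-BOBYQA CP (%) | TStep-BOBYQA MSE | TStep-BOBYQA CP (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.003 | 85.8 | 0.003 | 96.6 | 0.002 | 88.2 | 0.002 | 94.4 |
| $\hat{\beta}_{NT,0}$ | 1.360 | 0.012 | 58.4 | 0.002 | 95.8 | 0.012 | 57.6 | 0.002 | 94.6 |
| $\hat{\beta}_{NT,age}$ | 0.577 | 0.007 | 90.2 | 0.003 | 97.2 | 0.008 | 86.2 | 0.004 | 94.6 |
| $\hat{\beta}_{NT,ccr5}$ | 0.100 | 0.018 | 13.2 | 0.001 | 96.2 | 0.019 | 12.4 | 0.002 | 95.2 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.002 | 93.0 | 0.002 | 93.8 | 0.002 | 94.6 | 0.002 | 93.6 |
| $\hat{\beta}_{T,0}$ | 0.980 | 0.005 | 95.4 | 0.004 | 94.8 | 0.005 | 94.0 | 0.004 | 94.2 |
| $\hat{\beta}_{T,age}$ | 0.300 | 0.005 | 94.8 | 0.004 | 94.4 | 0.005 | 95.4 | 0.004 | 94.4 |
| $\hat{\beta}_{T,ccr5}$ | 0.150 | 0.002 | 96.6 | 0.001 | 96.0 | 0.001 | 97.0 | 0.001 | 96.6 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.001 | 93.4 | 0.001 | 94.0 | 0.002 | 90.4 | 0.002 | 92.4 |
| $\hat{\beta}_{C,age}$ | −1.037 | 0.004 | 95.8 | 0.004 | 95.8 | 0.005 | 92.6 | 0.005 | 93.0 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.002 | 85.8 | 0.001 | 96.0 | 0.002 | 86.2 | 0.002 | 94.2 |

Time (minutes): TStage-BFGS 1656.0; TStep-BFGS 1898.2; TStage-BOBYQA 1705.1; TStep-BOBYQA 2134.2

Gompertz distribution

| Parameter | True value | TStage-BFGS MSE | TStage-BFGS CP (%) | TStep-BFGS MSE | TStep-BFGS CP (%) | TStage-BOBYQA MSE | TStage-BOBYQA CP (%) | TStep-BOBYQA MSE | TStep-BOBYQA CP (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\hat{\alpha}_{NT}$ | 2.600 | 0.007 | 94.6 | 0.003 | 94.2 | 0.007 | 84.6 | 0.003 | 94.2 |
| $\hat{\beta}_{NT,0}$ | 1.360 | 0.019 | 55.6 | 0.004 | 95.8 | 0.020 | 55.6 | 0.004 | 95.8 |
| $\hat{\beta}_{NT,age}$ | 0.577 | 0.009 | 84.2 | 0.004 | 95.0 | 0.009 | 84.2 | 0.004 | 95.0 |
| $\hat{\beta}_{NT,ccr5}$ | 0.100 | 0.021 | 0.9 | 0.002 | 94.6 | 0.021 | 0.9 | 0.002 | 94.6 |
| $\hat{\alpha}_{T}$ | 2.990 | 0.003 | 95.4 | 0.003 | 95.2 | 0.003 | 94.4 | 0.003 | 95.4 |
| $\hat{\beta}_{T,0}$ | 0.980 | 0.007 | 93.8 | 0.006 | 94.6 | 0.007 | 93.8 | 0.006 | 94.6 |
| $\hat{\beta}_{T,age}$ | 0.300 | 0.005 | 95.6 | 0.004 | 94.6 | 0.005 | 95.6 | 0.004 | 94.6 |
| $\hat{\beta}_{T,ccr5}$ | 0.150 | 0.002 | 97.0 | 0.002 | 97.2 | 0.002 | 97.0 | 0.002 | 97.2 |
| $\hat{\beta}_{C,0}$ | 1.260 | 0.002 | 91.4 | 0.002 | 93.6 | 0.002 | 91.4 | 0.002 | 93.6 |
| $\hat{\beta}_{C,age}$ | −1.037 | 0.005 | 93.2 | 0.005 | 93.2 | 0.005 | 93.2 | 0.005 | 93.2 |
| $\hat{\beta}_{C,ccr5}$ | −1.200 | 0.003 | 95.4 | 0.002 | 93.2 | 0.002 | 95.4 | 0.002 | 93.2 |

Time (minutes): TStage-BFGS 1920.6; TStep-BFGS 2520.9; TStage-BOBYQA 1920.6; TStep-BOBYQA 2520.9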
Data were generated using the normal copula with exponential, Weibull, and Gompertz models, respectively. We generated 500 datasets, with 3000 individuals in each. MSE: mean squared error. CP: empirical coverage probability of the 95% confidence interval. TStage-BFGS: two-stage procedure with the L-BFGS-B optimization algorithm. TStep-BFGS: two-step procedure with the L-BFGS-B optimization algorithm. TStage-BOBYQA: two-stage procedure with the BOBYQA optimization algorithm. TStep-BOBYQA: two-step procedure with the BOBYQA optimization algorithm.
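The two summary measures reported in these tables, MSE and CP, can be computed from per-replicate output along the following lines. This is a minimal sketch and not the authors' code: the function name `mse_and_coverage`, the use of Wald-type intervals (estimate ± 1.96 × standard error), and the synthetic inputs are all assumptions made for illustration, since the tables do not state how the confidence intervals were constructed.

```python
import random


def mse_and_coverage(estimates, std_errors, true_value, z=1.96):
    """Return (MSE, CP in percent) for one parameter across replicates.

    Assumption: intervals are Wald-type, estimate +/- z * std_error.
    This is an illustrative choice, not taken from the source.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("estimates and std_errors must be non-empty and equal in length")
    n = len(estimates)
    # Mean squared error of the point estimates around the true value.
    mse = sum((est - true_value) ** 2 for est in estimates) / n
    # Count the replicates whose interval contains the true value.
    covered = sum(
        1
        for est, se in zip(estimates, std_errors)
        if est - z * se <= true_value <= est + z * se
    )
    return mse, 100.0 * covered / n


if __name__ == "__main__":
    # Synthetic stand-in for 500 simulation replicates (hypothetical values).
    rng = random.Random(0)
    true_value = 1.36   # borrowed from the tables purely as an example
    spread = 0.05       # hypothetical sampling standard deviation
    estimates = [rng.gauss(true_value, spread) for _ in range(500)]
    std_errors = [spread] * 500
    mse, cp = mse_and_coverage(estimates, std_errors, true_value)
    print(f"MSE = {mse:.3f}, CP = {cp:.1f}%")
```

Applied to each parameter and each estimation method in turn, a routine of this kind would yield one MSE and CP pair per table cell.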

MDPI and ACS Style

Zhang, Q.; Duan, B.; Wojtyś, M.; Wei, Y. Two-Step Estimation Procedure for Parametric Copula-Based Regression Models for Semi-Competing Risks Data. Entropy 2025, 27, 521. https://doi.org/10.3390/e27050521
