Likelihood Inference for Copula Models Based on Left-Truncated and Competing Risks Data from Field Studies

Abstract: Survival and reliability analyses deal with incomplete failure time data, such as censored and truncated data. Recently, the classical left-truncation scheme was generalized to analyze "field data", defined as samples collected within a fixed period. However, existing competing risks models for left-truncated field data are not flexible enough. We propose copula-based competing risks models for latent failure times, permitting a flexible parametric form. We formulate maximum likelihood estimation methods under the Weibull, lognormal, and gamma distributions for the latent failure times. We conduct simulations to check the performance of the proposed methods. Finally, we give a real data example. We provide the R code to reproduce the simulation and data analysis results.


Introduction
Survival analysis deals with incomplete failure time data, such as censored and truncated data [1][2][3][4][5]. Left-truncation arises if early failures occur before the data collection period, so that these failed units never enter the sample. Right-censoring arises if late failures occur after the data collection period, so that their exact failure times are unascertainable. Competing risks arise if the failure event of interest becomes unobservable due to the occurrence of other failure events.
Naïve statistical analyses of truncated, censored, and competing risks data lead to biased inference for the population of interest. Left-truncation, right-censoring, and competing risks are a classical topic in survival analysis that is widely recognized in both biostatistics [3] and reliability engineering [1][2][3][4][5].
Hong et al. [1] generalized the classical truncation scheme to a more complex scheme found in "field data", defined as samples collected within a fixed period (see Section 2). A well-known example is the failure time dataset for electric power transformers. Hong et al. [1] separated the samples into two parts: a truncated part (e.g., transformers installed before 1980) and an untruncated part (e.g., transformers installed after 1980), and then combined them into a single likelihood function. This type of truncated field data arises in many other practical applications, where there is a birth process (the installation process) of individuals and a fixed data collection period for failures [6][7][8][9].
For left-truncated field reliability data, a variety of distributions have been considered for statistical inference. Inference methods for the lognormal and Weibull distributions were developed in [1,[10][11][12]. The gamma and generalized gamma distributions were considered in [13,14]. Comparisons of numerical methods for computing the maximum likelihood estimator were made in [12,15]. Bayesian approaches for the lognormal, Weibull, and gamma distributions were considered in [2,16]. The Lehmann family of distributions was considered by [17], and the log-location-scale family of distributions was considered by [18]. A spline model was proposed by [7]. The most up-to-date review paper on this research topic is [2].
Competing risks can occur together with left-truncation. Competing risks means that there are latent failure times to events, and only one of the failure times is observable. As competing risks are often dependent, copula-based dependent risks models, which formulate the joint distribution of the latent failure times, have been extensively utilized [19][20][21][22][23][24][25]. However, for the analysis of left-truncated field data, copula-based dependent risks models have not yet been proposed.
Existing competing risks analyses for left-truncated field data are limited to the following models: the independent Weibull model with a common shape parameter [26], the Marshall-Olkin bivariate Rayleigh model [27], and the Marshall-Olkin bivariate Weibull model [4]. While these three models provide an important starting point, they are rather restrictive for modeling the latent failure times of competing risks. There is therefore much room for proposing more flexible models.
In this paper, we will develop copula-based models for dependent competing risks under left-truncation, permitting more flexible models for dependence structure and failure time distributions than the existing three models of [4,26,27]. Due to our main interest in applications to reliability analyses, we adopt the standard parametric distributions, including the Weibull, lognormal, and gamma distributions. These failure time distributions are commonly used in reliability studies, and also found in recent engineering applications [28][29][30]. While the proposed copula models permit a variety of copulas, we nonetheless focus on the Clayton copula for modeling dependence among failure times to facilitate computation.
The remainder of the paper is structured as follows. Section 2 provides the background and key concepts, such as field failure data and competing risks. Section 3 introduces copula models and develops likelihood functions for the Weibull, lognormal, and gamma marginal models. Section 4 reports simulation studies that check the performance of the proposed methods. In Section 5, we analyze a dataset for illustration. Section 6 concludes the paper with future directions.

Left-Truncation and Competing Risks
We review the structure of left-truncation and competing risks for data collected from field studies.
Suppose that there are two mutually exclusive events for failure, named Event 1 and Event 2. Let (T 1 , T 2 ) be a pair of failure times for Event 1 and Event 2, respectively. Under competing risks, the failure time is defined by T 12 ≡ min(T 1 , T 2 ), where only one of (T 1 , T 2 ) is observable. For instance, if T 1 < T 2 , the observable failure time is T 12 = T 1 , and the exact value of T 2 is unknown. The pair (T 1 , T 2 ) is called "latent failure times" since only one of T 1 and T 2 is observable. Figure 1 explains left-truncation in the data collection scheme, where the individuals were born randomly over time. However, we cannot obtain samples of all the individuals since the data collection starts only after time s (the starting time). If individuals are born and then fail before time s, they are discarded and not observable; see "Not observed" in Figure 1. This leads to biased sampling.
To define left-truncation mathematically, we let τ be the time from birth to time s (Figure 1). For individuals to be included in our samples, their failure time T12 must be greater than τ (Figure 1). Those who were born before s are defined as "truncated samples", which are all conditional on T12 ≥ τ. The individuals born after s are always included in our samples. Therefore, our samples are divided into truncated ones (born before s) and untruncated ones (born after s). For untruncated samples, the sample is observed without any condition, and τ is undefined.

Figure 1. Left-truncated and competing risks data from a field study. Event 1 and Event 2 are subject to competing risks. Solid curves show observable times, and dashed curves show unobservable times. One can observe Event 1, Event 2, or censoring, whichever occurs first, between the starting time (s) and the ending time (e). One cannot observe anything if Event 1 or Event 2 occurs before time s.
To signify the difference between truncated and untruncated samples, we define the truncation indicator ν, where ν = 0 if the sample is truncated (born before s) and ν = 1 if the sample is untruncated (born after s). There may also be a random censoring time C that censors the values of T1 and T2, e.g., induced by the duration of the data collection (Figure 1). If neither Event 1 nor Event 2 occurs within the data collection period, one cannot know T1 and T2 exactly; one only knows that T1 and T2 are greater than the observed censoring time C, defined as the time from birth to e (the ending time).
In this sense, what we actually observe is T = min(T 1 , T 2 , C). The sample is censored if both T 1 > C and T 2 > C hold; otherwise, the sample is uncensored. We assume that two failures do not occur simultaneously, i.e., T 1 = T 2 never occurs.
Under competing risks, there are three possible cases. In the first case, Event 1 is observed, i.e., T = T1 under {T1 < T2, T1 < C}. In the second case, Event 2 is observed, i.e., T = T2 under {T2 < T1, T2 < C}. In the last case, neither Event 1 nor Event 2 is observed, i.e., T = C under {C < T1, C < T2}. The first and last cases are shown in Figure 1.
In a dataset, there are n samples denoted as {(ti, τi, νi); i = 1, 2, . . . , n}. In addition, as we know the event status of each individual, the samples are divided into three index sets: l1 = {i : Event 1 is observed}, l2 = {i : Event 2 is observed}, and l0 = {i : the sample is censored}. Since each sample belongs to exactly one of the three sets, it holds that l1 ∪ l2 ∪ l0 = {1, 2, . . . , n}.

Proposed Methods
In this section, we propose a copula-based model and a likelihood-based inference method.

Copula Model for Competing Risks
We use a copula model for bivariate competing risks as originally proposed by [19]. Let (T1, T2) be a pair of failure times whose marginal survival functions are Si(t) = P(Ti > t), i ∈ {1, 2}. We assume that the failure times are continuous and hence have the density functions fi(t) = −dSi(t)/dt, i ∈ {1, 2}. To model the dependence of T1 and T2, we assume a survival copula model [21]

P(T1 > t1, T2 > t2) = Cθ(S1(t1), S2(t2)),

where Cθ is a parametric copula [31] with parameter θ.
For computational simplicity, we particularly adopt the Clayton copula [31]

Cθ(u, v) = (u^(−θ) + v^(−θ) − 1)^(−1/θ), θ > 0,

where θ specifies the dependence between T1 and T2 and is transformed to Kendall's tau through θ/(θ + 2). Hence, T1 and T2 are positively dependent, with the strength of dependence increasing in θ. As θ → 0, the Clayton copula reduces to C(u, v) = uv, the independent risks model of [26]. The Clayton copula is one of the most frequently used copulas in competing risks models [22,24] and other models [33,[40][41][42] due to its remarkable mathematical tractability. Other copulas might be considered. However, we do not suggest fitting many copulas, since the true one may not be identifiable from the data. We use the Clayton copula and then focus on the selection of θ.
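As a concrete illustration, the Clayton copula and its Kendall's tau transformation take only a few lines of code. The following is a Python sketch (the paper's own code is in R); it checks numerically that the copula approaches the independence copula uv as θ → 0.

```python
import numpy as np

def clayton(u, v, theta):
    """Clayton copula C_theta(u, v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta), theta > 0."""
    return (u**(-theta) + v**(-theta) - 1.0)**(-1.0 / theta)

def kendall_tau(theta):
    """Kendall's tau implied by the Clayton parameter: theta / (theta + 2)."""
    return theta / (theta + 2.0)

# theta = 2 corresponds to Kendall's tau = 0.5
print(kendall_tau(2.0))  # 0.5
# near theta = 0, the Clayton copula is close to independence: C(u, v) ~ u * v
print(np.isclose(clayton(0.7, 0.4, 1e-8), 0.7 * 0.4))  # True
```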

Likelihood Function
We extend the likelihood function for the independent risks models [26] to the copula-based models. Let {(ti, τi, νi); i = 1, 2, . . . , n} be the observed data as previously defined. Let (T1i, T2i) be a pair of latent failure times for Event 1 and Event 2, respectively. In addition, let Ci be independent and uninformative censoring times, let Ti = min(T1i, T2i, Ci), and let ti be its observed value. For a sample without truncation (νi = 1), the likelihood contribution is divided into three cases (Event 1 observed, Event 2 observed, or censored). For a sample with truncation (νi = 0), the same three cases arise, each conditioned on {Ti ≥ τi, Ci ≥ τi}; since censoring is uninformative, the condition "Ci ≥ τi" can be dropped. Combining all six cases and ignoring the constant factors for censoring, the likelihood function becomes

L = ∏_{i∈l1} f^(1)(ti) × ∏_{i∈l2} f^(2)(ti) × ∏_{i∈l0} Cθ(S1(ti), S2(ti)) × ∏_{i: νi=0} {Cθ(S1(τi), S2(τi))}^(−1),

where f^(j)(t) = −∂Cθ(S1(t1), S2(t2))/∂tj evaluated at t1 = t2 = t is the sub-density for Event j, j = 1, 2. To estimate the parameters in the model, the maximum likelihood estimator (MLE) is adopted under parametric models for S1(·) and S2(·). We consider parametric models commonly used for failure time distributions in reliability, including the Weibull, lognormal, and gamma distributions. Recent engineering applications of these distributions are found in [28][29][30]. The Weibull distribution is detailed in Section 3.3, and the lognormal and gamma distributions in Appendices A and B.
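The building blocks of this likelihood can be sketched numerically. Assuming the Clayton copula and Weibull margins (both introduced elsewhere in the paper), the Event-1 sub-density is f1(t) times the partial derivative ∂Cθ/∂u evaluated at (S1(t), S2(t)); the check below confirms that it collapses to f1(t)S2(t), the independent-risks contribution, as θ → 0. This is an illustrative Python sketch, not the paper's R code.

```python
import numpy as np

def clayton_du(u, v, theta):
    """Partial derivative dC/du of the Clayton copula."""
    return u**(-theta - 1.0) * (u**(-theta) + v**(-theta) - 1.0)**(-1.0/theta - 1.0)

def weib_S(t, lam, alpha):
    """Weibull survival S(t) = exp(-lam * t^alpha)."""
    return np.exp(-lam * t**alpha)

def weib_f(t, lam, alpha):
    """Weibull density f(t) = lam * alpha * t^(alpha - 1) * exp(-lam * t^alpha)."""
    return lam * alpha * t**(alpha - 1.0) * np.exp(-lam * t**alpha)

def subdensity1(t, lam1, a1, lam2, a2, theta):
    """Sub-density of observing Event 1 at t: f1(t) * dC/du at (S1(t), S2(t))."""
    return weib_f(t, lam1, a1) * clayton_du(weib_S(t, lam1, a1), weib_S(t, lam2, a2), theta)

# sanity check: as theta -> 0, the sub-density reduces to f1(t) * S2(t),
# the independent-risks contribution
t = 1.5
print(np.isclose(subdensity1(t, 0.5, 1.2, 0.3, 0.8, 1e-8),
                 weib_f(t, 0.5, 1.2) * weib_S(t, 0.3, 0.8)))  # True
```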

Weibull Model
We set S1(t) = exp(−λ1 t^α1) and S2(t) = exp(−λ2 t^α2). Under the independent risks model C(v, w) = vw, the likelihood function is

L_Weib(α1, α2, λ1, λ2) = ∏_{i∈l1} λ1 α1 ti^(α1−1) × ∏_{i∈l2} λ2 α2 ti^(α2−1) × ∏_{i=1}^{n} exp(−λ1 ti^α1 − λ2 ti^α2) × ∏_{i: νi=0} exp(λ1 τi^α1 + λ2 τi^α2).

If we set the common shape parameter α = α1 = α2, the above likelihood is equivalent to the one derived in Kundu et al. [26]. We do not impose this assumption, and allow α1 ≠ α2 for generality.
Under the Clayton copula with parameter θ, the likelihood function L_Weib(α1, α2, λ1, λ2, θ) is obtained by substituting the Weibull marginals S1 and S2 into the copula-based likelihood of Section 3.2. While the likelihood is undefined at θ = 0, letting θ → +0 reduces it to the likelihood under the independent risks model.
There are a few different ways to estimate (α1, α2, λ1, λ2, θ) based on L_Weib(α1, α2, λ1, λ2, θ). The first is the MLE under an "assumed" value θ* of θ, defined as

(α̂1, α̂2, λ̂1, λ̂2) = argmax over (α1, α2, λ1, λ2) of L_Weib(α1, α2, λ1, λ2, θ*),

where the MLE under the independence copula corresponds to θ* = 0. The second is the full MLE with θ unknown, defined as

(α̂1, α̂2, λ̂1, λ̂2, θ̂) = argmax over (α1, α2, λ1, λ2, θ) of L_Weib(α1, α2, λ1, λ2, θ).

In either case, one can maximize log L by applying an iterative algorithm. In R (https://www.r-project.org/), there are a variety of optimization functions. We suggest the "optim(.)" function (https://stat.ethz.ch/R-manual/R-patched/library/stats/html/optim.html, accessed on 19 June 2022). This function has the option "hessian=TRUE" to yield the Hessian matrix of −log L. The standard errors (SEs) of the estimates can be computed from the diagonal components of the inverse of the Hessian matrix. Details can be found in the R code given in the Supplementary Materials.
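The same recipe (minimize −log L with a general-purpose optimizer, then invert the Hessian of −log L for SEs) carries over to other environments. The Python sketch below uses a deliberately simple stand-in model, a right-censored exponential likelihood rather than the Weibull competing risks likelihood, chosen because its MLE has a closed form to check against; scipy's BFGS supplies an approximate inverse Hessian analogous to R's "hessian=TRUE" output.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# toy data: exponential failure times, right-censored at c = 1.0
t_full = rng.exponential(scale=2.0, size=500)
c = 1.0
t = np.minimum(t_full, c)
delta = (t_full <= c).astype(float)  # 1 = failure observed, 0 = censored

def neg_loglik(par):
    # censored exponential log-likelihood; log-parametrization keeps the rate positive
    lam = np.exp(par[0])
    return -(delta.sum() * np.log(lam) - lam * t.sum())

fit = minimize(neg_loglik, x0=[0.0], method="BFGS")
lam_hat = np.exp(fit.x[0])
se_log = np.sqrt(fit.hess_inv[0, 0])  # SE of log(lambda) from the (approximate) inverse Hessian

# closed-form check: the censored-exponential MLE is (#events) / (total time at risk)
print(np.isclose(lam_hat, delta.sum() / t.sum(), rtol=1e-3))  # True
```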
We generally recommend the full MLE if one wishes to estimate θ. However, the full MLE (α̂1, α̂2, λ̂1, λ̂2, θ̂) can have a large variance, since the data carry little information about θ. In addition, the Hessian matrix may not be invertible, making the SEs unavailable. We conjecture that the regularity conditions for the MLE hold when θ is known (as shown in [25]), but may fail to hold for the full MLE.
Our numerical studies (Sections 4 and 5) will consider both known and unknown settings of θ.

Simulation Studies
We conducted simulation studies to check the performance of the proposed methods of Section 3. To this end, we compared the performance of fitting (i) the independence copula, (ii) the Clayton copula with the known parameter, (iii) the Clayton copula with the parameter estimated by the MLE, and (iv) the Clayton copula with the parameter estimated by the PMLE.

Simulation Settings
We generated latent failure times (T1i, T2i) from the Weibull model of Section 3.3. The dependence structure of the two failure times was modeled by either the independence copula or the Clayton copula with θ = 2 (Kendall's tau = 0.5).
Mimicking the sampling scheme of Figure 1, we generated birth times relative to the data collection period, where s is the starting time and e is the ending time. For truncated samples, we kept samples satisfying τi ≤ ti = min(T1i, T2i). This led to samples {(ti, τi, νi); i = 1, 2, . . . , n} with νi = 0 for the n/2 truncated samples and νi = 1 for the n/2 untruncated samples.
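One way to implement such a data-generating scheme is sketched below in Python; the parameter values and the truncation-time distribution are assumptions made only for this sketch, not the paper's exact settings. It samples Clayton-dependent uniforms by conditional inversion, transforms them to Weibull latent failure times, and discards early failures to mimic left-truncation.

```python
import numpy as np

rng = np.random.default_rng(0)

def clayton_pair(n, theta, rng):
    """Sample (U1, U2) from the Clayton copula by conditional inversion."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = ((w**(-theta / (1.0 + theta)) - 1.0) * u**(-theta) + 1.0)**(-1.0 / theta)
    return u, v

def weibull_inverse_survival(u, lam, alpha):
    """Invert S(t) = exp(-lam * t^alpha): t = (-log(u) / lam)^(1/alpha)."""
    return (-np.log(u) / lam)**(1.0 / alpha)

# hypothetical parameter values, chosen only for this illustration
theta, lam1, a1, lam2, a2 = 2.0, 0.5, 1.5, 0.5, 1.5
n = 100000
u1, u2 = clayton_pair(n, theta, rng)
T1 = weibull_inverse_survival(u1, lam1, a1)
T2 = weibull_inverse_survival(u2, lam2, a2)

# left-truncation: an individual born before s enters the sample only if min(T1, T2) >= tau
tau = rng.uniform(0.0, 2.0, size=n)  # illustrative truncation-time distribution
keep = np.minimum(T1, T2) >= tau
print(keep.mean())  # inclusion probability; early failures are discarded
```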
Based on the samples, the MLE (α̂1, α̂2, λ̂1, λ̂2) was computed by fitting the independence copula or the Clayton copula with θ known. In addition, the MLE and PMLE (α̂1, α̂2, λ̂1, λ̂2, θ̂) were computed by fitting the Clayton copula with θ unknown (the PMLE was computed under an assumed value of θ* = 2/9, corresponding to Kendall's tau = 0.1). Based on 1000 repetitions, we calculated the mean, standard deviation (SD), and mean squared error (MSE) of the parameter estimates, and the average of the SEs. For instance, the MSE for α1 is E[(α̂1 − α1)^2], where the expectation is taken as the average over the 1000 repetitions.
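These summary statistics follow a standard Monte Carlo recipe, sketched here in Python with a toy estimator (the sample mean of exponential data, an assumption purely for illustration) standing in for the Weibull MLE.

```python
import numpy as np

rng = np.random.default_rng(42)
true_value = 2.0
reps, n = 1000, 100

# stand-in estimator: the sample mean of exponential data (in place of the Weibull MLE)
est = np.array([rng.exponential(scale=true_value, size=n).mean() for _ in range(reps)])

mean_est = est.mean()                     # Monte Carlo mean of the estimates
sd_est = est.std(ddof=1)                  # SD across repetitions
mse_est = np.mean((est - true_value)**2)  # MSE = average of (estimate - true)^2
print(mean_est, sd_est, mse_est)
```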
Several parameter settings were considered to model different shapes of the hazard function.

Table 1 shows the means of the parameter estimates. The means are close to the true parameters when the copula is correctly specified. For instance, when the true copula is the Clayton copula with θ = 2, accurate estimates are obtained by fitting the same copula. If the copula is wrongly specified, the estimates are biased (e.g., when fitting the Clayton copula with θ = 2 under the true independence copula). This implies the importance of selecting the correct value of θ for estimating the parameters. If θ is treated as unknown and estimated from the data, the biases of the estimates can often be large; these biases occur for both the MLE and the PMLE. This unavoidable bias stems from the well-known difficulty of estimating θ under competing risks.

Table 2 shows the MSEs of the estimates. The smallest MSEs are attained when the fitted copula equals the true one. This agrees with the pattern of the biases (Table 1), again implying the importance of selecting the correct value of θ. The MSEs remain large when θ is estimated by the MLE or PMLE; again, this reflects the difficulty of estimating θ under competing risks.

Table 3 shows how the MSEs decrease as the sample size increases (only the results for estimating α1 are shown, as the other parameters exhibit similar patterns). When the copula parameter is correctly specified, the MSEs vanish toward zero as the sample size grows, implying the consistency of the parameter estimates. When the copula parameter is wrongly chosen, the MSEs decrease very slowly or do not vanish. When the copula parameter is estimated, the MSEs do not always decrease with the sample size. This indicates the inconsistency of the parameter estimates when the copula parameter is wrongly specified or estimated.
Table 4 compares the SDs of the parameter estimates with the averages of the SEs. The SDs and SEs are very close to each other, indicating that the SEs capture the true variation of the estimates. Hence, the SEs calculated by the proposed method (via the Hessian matrix from the "optim(.)" R function) are reliable. However, when θ is treated as unknown and estimated from the data, there are some simulation runs in which the SEs cannot be computed because the "optim(.)" function fails to produce an invertible Hessian. Hence, we do not present the results for this case. We suspect that the regularity conditions (the Hessian of −log L being positive definite) may not hold in this case.

Data Analysis
This section illustrates the proposed method with a data example. Table 5 shows the artificial dataset created by Kundu et al. [26], who mimicked the setting of electric power transformers that were randomly installed from 1960 to 2008 [1]. The dataset consists of n = 100 samples (electric power transformers) observed from the starting year (s = 1980) to the ending year (e = 2008). However, some transformers installed before 1980 are subject to left-truncation. Thirty transformers are truncated (νi = 0; installed before 1980) and the other 70 transformers are untruncated (νi = 1; installed after 1980). Observed events are of three types, recorded in the "Event" column (=1 for Event 1; =2 for Event 2; =0 for censoring) of Table 5. The numbers of events are 14 for Event 1, 33 for Event 2, and 53 for censoring. The data appear to be simulated from the independent Weibull model with a common shape parameter, as considered in [26], though the details were not explained. We therefore fit Weibull failure time models.

Table 5. Artificial data from Kundu et al. [26], which consist of n = 100 power transformers observed from the starting year (s = 1980) to the ending year (e = 2008). Bi is the installation year. Ei is either the failure year or the censoring year (2008). Thirty transformers are truncated (νi = 0) and the other 70 transformers are untruncated (νi = 1). Observed events are shown in the "Event" column (=0 for censoring; =1 for Event 1; =2 for Event 2).
i    Bi     Ei     νi   Event   τi (Truncation Time)   ti (Observed Time)
1    1961   1996   0    2       19                     35
2    1964   1985   0    1       16                     21
3    1962   2007   0    2       18                     45
4    1962   1986   0    2       18                     24
5    1961   1992   0    2       19                     31
:    :      :      :    :       :                      :
30   1963   1994   0    1       17                     31
31   1987   2008   1    0       Undefined              21
32   1980   2008   1    0       Undefined              28
33   1988   2008   1    0       Undefined              20
34   1985   2008   1    0       Undefined              23
:    :      :      :    :       :                      :
100  1989   2008   1    0       Undefined              19

For the i-th transformer, let (T1i, T2i) be a pair of latent failure times for Event 1 and Event 2, respectively, for i = 1, 2, . . . , 100. Following [26], we scaled the failure times (in days) by dividing them by 100 to avoid many decimal places in the parameter estimates. We postulate the Clayton copula model with the Weibull marginals of Section 3.3. We estimated the parameters α1, λ1, α2, and λ2 under the assumed values θ = 0, 0.5, 2, or 8. We also estimated all the parameters together, without fixing θ, by the MLE or the PMLE. Table 6 compares the estimates of (α1, λ1, α2, λ2) under the different values of θ. Under the assumed value θ = 0, the estimates are very close to those obtained by Kundu et al. [26]. This is reasonable since the proposed model with θ = 0 and the model of Kundu et al. [26] are both independent Weibull models. Note that the model of Kundu et al. [26] imposed the common shape parameter α1 = α2, while our proposed model did not; the estimates α̂1 = 2.82 and α̂2 = 2.79 suggest that a common shape parameter is plausible for these data. For the assumed values θ ∈ {0.5, 2, 8}, the estimates deviate from the results of Kundu et al. [26]. These results should not be adopted, since their log-likelihood values are lower than the one under θ = 0. The PMLE under the assumed values θ* ∈ {0.5, 2, 8} improves the log-likelihood value, yet none of them reaches the log-likelihood under θ = 0.
Indeed, the log-likelihood value under θ = 0 almost reaches the largest log-likelihood value, attained at θ̂ = 0.004 by the MLE. In conclusion, our sensitivity analysis over fixed and estimated values of θ confirms the adequacy of the independent risks model for this dataset.

Table 6. The estimates (SEs in parentheses) for the Clayton copula model with Weibull failure times based on the dataset from Kundu et al. [26]. The Clayton copula with fixed θ or estimated θ is fitted.

Conclusions and Future Work
In this paper, we propose a copula-based model for dependent competing risks in the presence of left-truncation. The copula-based competing risks model permits more flexible failure time distributions and dependence structures than the existing competing risks models [4,26,27]. Parametric likelihood-based inference methods are then formulated, motivated by practically important applications in field reliability analyses.
Below, we list several problems to be considered or resolved in the future. First, different failure time models can be examined besides the Weibull, lognormal, and gamma models considered in this paper. The candidate distributions include the extreme-value distributions [43], the generalized Pareto distribution [44], and the generalized Bilal distribution [45]. The Fréchet (type II) and Gumbel (type I) distributions are of particular interest among the extreme-value distributions [43]. In addition, the model may include a covariate (e.g., an acceleration factor) associated with failure time, as in the Weibull regression [1]. Beyond this usual failure time regression model, a conditional copula regression model is another possibility [46]. Cure fractions and latent variables may also be incorporated into the failure time model [47].
Second, the prediction analysis and the test design may be explored in addition to the estimation method considered in this paper. The prediction of remaining lifetime was considered under left-truncated and right-censored field data without competing risks [1,17]. This risk prediction at a censored time point may be informative for engineers to make decisions on future maintenance plans. However, it is not trivial to extend this prediction method under competing risks. This risk prediction scheme is equivalent to dynamic prediction [48][49][50][51][52][53]. Moreover, the accelerated life test plans may be considered including the case of one-shot devices with competing risks [54,55].
Third, a goodness-of-fit test and/or graphical model-diagnostic method for the proposed model may be developed. A possible strategy is to apply a distance measure between the empirical sub-distribution function and the model-based one. This strategy was adopted for right-censored competing risks data analyses [23] without left-truncation. In order to handle left-truncation, further methodological and numerical works are necessary.
Fourth, Bayesian methods can be considered as a competitor to the MLE methods of this paper. The choice of the marginal prior distributions could follow [2,16], while the bounded uniform prior could be recommended for the copula prior [56]. It is of interest to see if the difficulty of estimating the copula parameter is resolved by Bayesian methods.
Finally, the limited availability of open datasets could be resolved by encouraging engineers to collect and share field failure data. Due to the difficulty of obtaining real datasets from the literature, we applied our methods to the artificial dataset created by Kundu et al. [26], who considered the setting of electric power transformers. Datasets may also be sought from other application areas of survival analysis, such as medical research.

Appendix A. Likelihood for the Lognormal Model
We set the marginal survival functions

S1(t) = 1 − Φ((log t − µ1)/σ1), S2(t) = 1 − Φ((log t − µ2)/σ2),

where Φ is the cdf of the standard normal distribution. Under the independent risks model C(v, w) = vw, the likelihood function L_lognorm(µ1, µ2, σ1, σ2) is obtained by substituting these marginals into the likelihood of Section 3.2. Under the Clayton copula, the likelihood function is L_lognorm(µ1, µ2, σ1, σ2, θ).
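For reference, the lognormal margin is a one-liner via the standard normal cdf; this Python sketch (using scipy, as an illustration rather than the paper's R code) checks that the survival probability at the median exp(µ) equals 1/2.

```python
import numpy as np
from scipy.stats import norm

def S_lognorm(t, mu, sigma):
    """Lognormal survival: S(t) = 1 - Phi((log t - mu) / sigma)."""
    return norm.sf((np.log(t) - mu) / sigma)

# at t = exp(mu), the standardized argument is 0, so S(t) = 1/2
print(np.isclose(S_lognorm(np.exp(1.0), 1.0, 0.5), 0.5))  # True
```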