Abstract
In this paper, we discuss regression analysis of bivariate interval-censored failure time data that often occur in biomedical and epidemiological studies. To solve this problem, we propose a kind of general and flexible copula-based semiparametric partly linear additive hazards models that can allow for both time-dependent covariates and possible nonlinear effects. For inference, a sieve maximum likelihood estimation approach based on Bernstein polynomials is proposed to estimate the baseline hazard functions and nonlinear covariate effects. The resulting estimators of regression parameters are shown to be consistent, asymptotically efficient and normal. A simulation study is conducted to assess the finite-sample performance of this method and the results show that it is effective in practice. Moreover, an illustration is provided.
Keywords:
Archimedean copula model; Bernstein polynomials; bivariate interval-censored data; partly linear model MSC:
62N02; 62H05; 62G05; 62G20
1. Introduction
In this article, we discuss a regression analysis using the marginal additive hazards model on bivariate interval-censored data. Interval-censored data refer to failure times that are observed to only belong to an interval rather than being known with absolute certainty. These types of data frequently occur in various areas, including biomedical and epidemiological investigations [1]. It is easy to see that the most commonly studied right-censored data can be seen as a special case of interval-censored data, and it is important to note that the analysis of interval-censored data is typically far more difficult than right-censored data research. Bivariate interval-censored data occur when there exist two correlated failure times of interest and the observed times of both failure events suffer interval censoring. Clinical trials or medical studies on several events from the same individual such as eye disease studies often yield bivariate interval-censored data. The purpose of this article is to propose a flexible regression model which can process bivariate interval-censored data when the main interest is on the risk differential or excess risk.
There are several methods available for regression analysis of univariate interval-censored data arising from the additive hazards model. Refs. [2,3], for example, investigated the problem for case I interval-censored or current state data, a special form of interval-censored data in which the observed time contains zero or infinite. The former discussed an estimating equation approach and the latter considered an efficient estimation approach. More recently, Ref. [4] studied the same issue as Ref. [2] but with some inequality constraints, and Ref. [5] proposed a sieve maximum likelihood approach. Moreover, Ref. [6] developed several inverse probability weight-based and reweighting-based estimation procedures for the situation with missing covariates, while Ref. [7] presented an efficient approach for general situations.
A large number of approaches have been established for modeling bivariate interval-censored survival data and three types of methods are generally used. One is a marginal method that relies on the working independence assumption [8,9]. Another commonly used approach is the frailty-model-based method, which employs the frailty or latent variable to build the correlation among the associated failure times [10,11]. Ref. [12] proposed a frailty model method for multivariate interval-censored data with informative censoring. The third type of method is the copula-based approach, which gives a different, specific way to model two dependent failure times. One advantage of the approach is that it directly connects the two marginal distributions through a copula function to construct the joint distribution and uses the copula parameter to determine the correlation. This distinguishing characteristic makes it possible to represent the margins independent of the copula function. This advantage is appealing for the points of both modeling and interpretation views. Among others, Ref. [13] discussed this approach based on the marginal transformation model with the two-parameter Archimedean copula. Ref. [14] proposed a copula link-based additive model for right-censored event time data. Moreover, Ref. [15] proposed a copula-based model to deal with bivariate survival data with various censoring mechanisms. In the following, we discuss the regression analysis of bivariate cases by employing the method in [13]. More specifically, we propose a kind of semiparametric partly linear additive hazards model.
Partly linear models are becoming more and more common since they combine the flexibility of nonparametric modeling with the simplicity and ease of interpretation of parametric modeling. It is presumptive that the marginally conditional hazard function has nonlinear relationships with some covariates but linear relationships with others [16,17,18]. In practice, nonlinear covariate effects are typical. For instance, in some medication research, the influence of the dosage of a particular medication may reach a peak at a certain dosage level and be maintained at the peak level, or it may diminish after the dosage level. Although there is a fair amount of literature in this field, to the best of our knowledge, there does not seem to exist a study considering this for bivariate interval-censored survival data.
The presented model involves two nonparametric functions, one identifies the baseline cumulative hazard function and another describes the nonlinear effects of a continuous covariate. In the following, the two-parameter Archimedean copula model is employed for the dependence and more comments on this is given below. Moreover, a sieve maximum likelihood estimation approach is developed to approximate two involved nonparametric nuisance functions by Bernstein polynomials. The proposed method has several desirable features: (a) it allows both time-dependent covariates and the covariates that may have nonlinear effects; (b) the two-parameter Archimedean copula model can flexibly handle dependence structures on both upper and lower tails and the strength of the dependence can be quantified via Kendall’s ; (c) the sieve maximum likelihood estimation approach by Bernstein polynomials can be easily implemented with the use of some existing software; (d) as is seen in the simulation study, the computation is both stable and efficient. Note that the method given by [13] only allows for linear covariate effects.
More specifically, in Section 2, after recommending some notation, assumptions and models that would be used throughout the paper, the resulting likelihood function is presented. In Section 3, we first describe the proposed sieve maximum likelihood estimation approach and then present the asymptotic properties of the proposed estimators. Section 4 presents a simulation study for the assessment of the finite sample performance of the proposed estimation approach, and the results indicate that it works well as expected. In Section 5, an illustration is provided by using a set of data arising from the Age-Related Eye Disease Study (AREDS), and Section 6 gives the conclusion with some discussion and concluding remarks.
2. Assumptions and Likelihood Function
Consider a study consisting of n independent subjects. Define as the failure time of interest associated with the ith subject of the jth failure event. Suppose that for , the two observation times are given by such that . In addition, suppose that for the ith subject, the p-dimensional covariate vectors are possibly time-dependent and denoted by and and the single continuous covariates and are related to and , respectively. More details on them are given below. Here, we assume that and or and could be the same, entirely different, or they also could have some common components. Then, the observed data are as follows:
Note that when is right-censored and when is left-censored. Moreover, it is assumed that given the covariates, the interval censoring is independent of the failure times [1].
Given some covariates, define as the marginal survival function of , and
is the joint survival function of and . For the covariate effects, suppose that given and , the marginal hazard function of is defined as follows:
where denotes an unknown baseline hazard function, is an unknown regression coefficient vector, and is an unknown, smooth nonlinear regression function. That is, follows a partially linear additive hazards model, in which represents the covariate that may have nonlinear effects on . Correspondingly, has the cumulative hazard
where , , and we have that
It is worth noting that for the simplicity of the expressions and calculations, we assume that both linear and nonlinear covariates effects are the same for the two associated failure times of interest [19,20]. It is simple to extend the following method to the situation where the covariate effects are different.
It follows from Sklar’s theorem [21] that if the marginal survival functions are continuous, there exists a unique copula function on such that , and , and it gives
Here, the parameter generally denotes the correlation or dependence between and . As mentioned above, a significant advantage of the copula representation above is that it separates the correlation from the two marginal distributions [22]. There exist many copula functions and among others, one type of the most commonly used copula functions for bivariate data is perhaps the Archimedean copula family. By following [13], we focus on the flexible two-parameter Archimedean copula model given by
where and is the generalized inverse of , which is defined as . The detailed derivation and more comments can be found in Chapter 5 of [23].
As mentioned before, the two parameters and in the copula function above are the association parameters, representing the correlation in both the upper and lower tails. In particular, when the copula model above is equal to the Clayton copula [24], while if , the copula model becomes the Gumbel copula [25]. In other words, the two-parameter copula model is more flexible and has the Clayton or Gumbel copula as special cases. It is well-known that another commonly used measure for the correlation is Kendall’s , and it has an explicit connection with , as
Under the assumptions above, the observed likelihood function has the form
where . Let , all unknown parameters. The next section proposes a sieve approach for the estimation of .
3. Sieve Maximum Likelihood Estimation
It is well-known that directly maximizing the likelihood function can provide an estimate of . On the other hand, it is inappropriate for this situation, as it involves infinite-dimensional functions and . For this, according to [26] and others, we suggest using the sieve method to approximate these functions based on Bernstein’s polynomial first, and then maximize the likelihood function.
Specifically, define , the parameter space, and
in which ⊗ denotes the Kronecker product, and K is a nonnegative constant. Additionally, denotes the subset of all bounded and continuous, nondecreasing, nonnegative functions within , . Similarly, denotes the collection of all bounded and continuous functions within , the support of . In practice, is generally valued as the large and minimum values of all observation times. In the following, define the sieve parameter space
In the above,
and
with
the Bernstein basis polynomial with the degrees and for some fixed , and and being some positive constants. To estimate , define as the value of by maximizing the log-likelihood function
Note that one of the main advantages of Bernstein polynomials is that they can easily implement the nonnegativity and monotonicity properties of by the reparameterization , [26]. In addition, of all approximation polynomials, they have the optimal shape-preserving properties [27]. By the way, they are easy to use because they do not need interior knots. In the above, for simplicity, the same basis polynomial is used for and . Moreover, note that this approach can be relatively easily implemented as discussed below, although it may seem to be complicated. In practice, one need to choose . We suggest employing the Akaike information criterion (AIC) defined as
In the above, the form in the second bracket is the number of unknown parameters in the model [28], where p denotes the dimension of ; and represent the degree of the Bernstein polynomials (i.e., ); and the last number denotes the dimension of correlation parameters and .
For the maximization of the log-likelihood function over the sieve space or the determination of , we suggest first determining the initial estimate of and then applying the Newton–Raphson approach to maximize . In the numerical study section, one can use the R function for the maximization. For the determination of the initial estimates, the following procedure can be applied.
- Step 1: obtain by maximizing the log marginal likelihood function under the observation data on the ’s;
- Step 2: obtain by maximizing the log marginal likelihood function under the observation data on the ’s;
- Step 3: obtain the initial estimates of and by maximizing the joint sieve log likelihood function .
Now, we establish the asymptotic properties of . Define and , and their distance has the form
In the above, with denoting the joint distribution function of the and . Let the true value of be denoted .
Theorem 1 (Consistency).
Suppose that Conditions 1–4 given in Appendix A hold. Then, we have 0 almost surely as .
Theorem 2 (Convergence rate).
Suppose that Conditions 1–5 given in the Appendix A hold. Then,
where and such that , , and q and r are defined in Condition 4 of Appendix A.
Theorem 3 (Asymptotic normality).
Suppose that Conditions 1–5 given in Appendix A hold. Then, we have
where , ,
for an arbitrary vector w, and are the score statistics defined in Appendix A.
We sketch the proofs of the theorems above in Appendix A. Note that based on Theorem 2, one can get the optimal rate of convergence with and . Specifically, it becomes with or and increases while q and r increases. Theorem 3 demonstrates that the suggested estimator of the regression parameter is nonetheless asymptotically normal and effective even when the total convergence rate is less than . It is clear that to give the inference of the suggested estimators, we need to estimate the variance or covariance matrix of , and . However, it can be seen from Appendix A that obtaining their consistent estimators would be difficult. As a result, we propose using the simple nonparametric bootstrap approach [29], which is well-known for offering a direct and simple tool for estimating covariances when no explicit formula is given. It looks to work well, according to the numerical analysis below.
4. A Simulation Study
In this section, we present some simulation studies conducted to evaluate the finite sample performance of the sieve maximum likelihood estimation approach suggested in the preceding sections. Three scenarios for covariates were taken into account in the study. The first one was to generate the single covariates ’s to follow the Bernoulli distribution with the success probability 0.5 and ’s to follow the uniform distribution over . In scenario two, we considered the same ’s as above but generated a two-dimensional vector of the covariate with Bernoulli and . For scenario three, we first generated covariates ’s and ’s as in scenario one and then replaced by . That is, we had time-dependent covariates.
To generate the true bivariate failure times under model (1), one needed to generate and from the uniform distribution over independently for the first step and solve from the equation for a given copula function . Then, the two dependent survival times and were obtained based on and , respectively, with or , . In order to generate the censoring intervals or the observed data, we assumed that each subject was assessed at discrete time points, and the length of two contiguous observation times followed the standard exponential distribution. Then, for every subject, was valued as the last assessment time before , and was equal to the first assessment time behind . The length of the study was determined to yield about a right-censoring rate. On the Bernstein polynomial approximation, by following [13], we set and and also took a Kendall’s equal to for the weak dependency and for the strong dependency. The results given below are based on 1000 replications and 100 bootstrap samples for the variance estimation.
Table 1 and Table 2 present the results based on given by the proposed estimation procedure on the estimation of the regression parameter and the dependence parameter with the true or and or , respectively. The table consists of the bias of estimates (Bias) determined by the difference value between the mean of the estimates and the real value, the sample standard error (SSE) of the estimates based on 1000 replications, the mean of the estimated standard errors (ESE) (for one replication, we obtained the estimated standard errors based on 100 bootstrap samples and computed the average of the 1000 estimated standard errors), as well as the empirical coverage probabilities (CP). The tables show that the proposed estimator was seemingly unbiased, and the variance estimation procedure seemed to perform well. Moreover, all empirical coverage probabilities were close to the nominal level when the sample sizes were increasing, indicating that the normal approximation to the distribution of the proposed estimator seemed reasonable. Moreover, as expected, the results got better in general with the increasing sample size.
Table 1.
Estimation results on and with .
Table 2.
Estimation results on and with .
The estimation results obtained under scenario two for the covariates are given in Table 3; the sample size was or 400 with the true or and . Table 4 contains the estimation results obtained with the time-dependent covariates based on or as well as or 400, and the other values being the same as in Table 1. They again indicated that the proposed method seemed to perform well for the estimation of the regression and association parameters. Furthermore, the results generally became better when the sample size increased.
Table 3.
Estimation results on and with and two covariates.
Table 4.
Estimation results on and with the time-dependent covariates.
To assess the performance of the proposed method on the estimation of the nonlinear function g, we repeated the study given in Table 1 with four different g functions, , , , and , and . Figure 1 and Figure 2 show the average of the estimated g for each of the four cases with and or , respectively. The solid red lines represent the genuine functions, while the dashed blue lines represent the estimations. They show that the proposed method based on Bernstein polynomials seemed to perform reasonably well for the different Kendall’s considered, including a weak or strong dependency. We also took into account various configurations and got comparable outcomes.
Figure 1.
Estimated g with and ; the solid red lines represent the real functions and the dashed blue lines show the estimated functions. (a) ; (b) ; (c) ; (d) .
Figure 2.
Estimated g with and ; the solid red lines represent the real functions and the dashed blue lines show the estimated functions. (a) ; (b) ; (c) ; (d) .
5. An Illustration
In this section, we illustrate the proposed procedure by using the data from the Age-related Eye Disease Study (AREDS) [30], a clinical experiment tracking the development of a bilateral eye disease, age-related macular degeneration (AMD), provided the CopulaCenR package in version 4.2.0 of R software [31]. Each participant in the research was monitored every six months (during the first six years) or once a year (after year 6) for about 12 years. At each appointment, each participant’s eyes were given a severity score on a range of 1 to 12 (a higher number signifying a more serious condition).Moreover, the time-to-late AMD, which is the interval between the baseline visit and the first appointment at which the severity score reached nine or above, was computed for each eye of these subjects. Either interval censoring or right censoring was applied to the observations at both periods.
The data set consisted of subjects and in the analysis below, we focused on the effects of two covariates, SevScaleBL for the baseline AMD severity score (a value between one and eight with a higher value indicating more severe AMD) and rs2284665 for a genetic variant (zero, one and two for GG, GT and TT) of the AMD progression. In order to use the proposed method, we set rs2284665 to be X and SevScaleBL to be Z since the latter could be viewed as continuous. For the identifiability of the model, both X and Z were standardized and thus had the support . Moreover, various degrees of Bernstein polynomials were considered, such as .
Table 5 gives the analysis results obtained by the suggested estimation approach. They include the estimated effect of the covariate rs2284665 and Kendall’s along with the estimated standard errors and the p-value for testing the effect to be zero. One can see that the results were consistent with respect to the degree of the Bernstein polynomials and suggested that the minor allele (TT) had a significantly “harmful” effect on AMD progression. Moreover, the estimated Kendall’s was , suggesting a moderate dependence of the AMD progression between the two eyes. Figure 3 gives the estimated effect of the SevScaleBL, which indeed seemed to be nonlinear. To be more specific, the increased risk of AMD patients was associated with higher severity scores. The findings reached here were in line with those of other researchers who examined this subject [15]. It is worth pointing out that the conclusion of [15] was obtained under the proportional odds model and they could not visualize nonlinear covariate effects as in Figure 3.
Table 5.
Analysis results for the AREDS data.
Figure 3.
Estimated nonlinear effect of SevScaleBL for the illustration.
6. Concluding Remarks
In the preceding sections, the regression analysis of bivariate interval-censored survival data was considered under a family of copula-based, semiparametric, partly linear additive hazards models. As discussed above, one significant advantage of the models was that they only needed to handle the two marginal distributions via a copula function with the copula parameter determining the dependence. For inference, a sieve maximum likelihood estimation procedure with the use of Bernstein polynomials were provided, and it was shown that the resulting estimators of the regression parameters were consistent and asymptotically efficient. Furthermore, the simulation studies suggested that the recommended approach worked effectively in practical situations.
It is worth emphasizing that the main reason for using Bernstein polynomials to approximate the infinite-dimensional cumulative hazard function and nonlinear covariate effects was its simplicity. Other smoothing functions, such as spline functions [20], could also be used, and estimation procedures similar to the one described above could be developed. In order to implement the approach, we needed to choose the degrees and . As discussed above, a common method is to consider different values and compare the resulting estimators. Certainly, developing a data-driven approach for their selections would also be helpful.
In this paper, we focused on the additive hazards model. Other models, for example, the proportional hazards model, are sometimes more popular. It is well-known that the latter is more suitable for the situation where one is interested in the hazard ratio, whereas the former fits better if the excess risk or the risk differential is what is most important. However, there seems to be little literature on how to choose the better model or develop some model-checking methods, which is attractive research for the future.
Author Contributions
Conceptualization, T.H. and S.Z; methodology, X.Z. and T.H.; software, X.Z.; validation, X.Z. and T.H.; resources, T.H. and S.Z.; writing—original draft preparation, X.Z; writing—review and editing, T.H. and J.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Beijing Natural Science Foundation Z210003, Technology Developing Plan of Jilin Province (No. 20200201258JC) and National Nature Science Foundation of China (grant nos. 12171328, 12071176 and 11971064).
Data Availability Statement
Research data are available in the R package CopulaCenR.
Conflicts of Interest
The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| AIC | Akaike information criterion |
| AREDS | Age-related Eye Disease Study |
| AMD | Age-related macular degeneration |
| a.s. | almost surely |
Appendix A. Proof of Asymptotic Properties
This appendix begins by describing the necessary regular conditions, which are similar to those generally used in the interval-censored data literature [32,33], and then sketch the proof of the asymptotic results given in Theorems 1–3.
Condition 1. There exists a positive number such that .
Condition 2. (i) There exist such that . (ii) The covariate ’s are bounded; in other words, there exists that makes , where . The distribution of the ’s is not focused on any proper affine subspace of . (iii) For some positive constant K, given the set of all observation times , suppose the conditional density of is twice continuously differentiable and has a bound within on a.s.
Condition 3. After substituting with , the likelihood function can be rewritten as . Define
with . There exist where there exist various sets of so that when in which for every sets of values, one can conclude .
Condition 4. There exist satisfying , and is strictly increasing and continuously differentiable until a q-order within , . In addition, is continuously differentiable until an r-order within . Additionally, denotes an interior point in .
Condition 5. There exists , for every , , in which is the log-likelihood function acquired before, and ⪯ denotes that “the left-hand side is smaller than the right, until a constant time”.
It is worth noting that Conditions 1–5, except Condition 3, are usually employed in interval-censored failure time research [26,34,35]. Condition 3 helps to ensure the parameters’ identifiability and ensure the effective Fisher information matrix is always positive [26,35].
Proof of Theorem 1.
Define as the log-likelihood for only one observation , and . Let denote a class of functions, and let and P denote the empirical and true probability measures, respectively. Then, following a similar calculation as in Lemma 1 in [19], the bracketing number of is (until a constant time) in a bound of . Therefore, following Theorem 2.4.1 in [36], we can get is a Glivenko–Cantelli class. Thus,
Let , for any , define , , and . Hence, we can conclude that
If , we can get
Then, we can prove based on the proof by contradiction under Condition 3. Combining (A1) and (A2), we have
in which , so that , where . This gives . Based on Condition 1 and together with the strong law of large numbers, we have a.s. Hence, , which proves that a.s. The proof of Theorem 1 is complete. □
Proof of Theorem 2.
We verify Conditions C1–C3 in [37] to derive the rate of convergence. Define as the Euclidean norm of a vector u, is the supremum norm of a function h, and . Moreover, let P denote a probability measure. After that, for the convenience of understanding the proof as follows, we define , where is the sets of , and is the sets of g. q and r are as defined in Condition 4. It is noteworthy that is completely the same as except for the notation. Similarly, is the corresponding sieve space containing and . First, as a result of Condition 5, we can verify that Condition C1 directly stands. That is, for any . Second, we verify Condition C2 in [37]. Based on Conditions 1–4, one can easily find that for every ,
in which , and this implies that for any , . Thus, Condition C2 from [37] holds when the sign in their paper is equal to one. In the end, one needs to verify Condition C3 of [37]. Define the class of functions and let denote the -bracketing number related to norm of . Then, we have by following similar arguments as in Lemma A3 of [13], where , and is the dimensionality of b. By following the fact that the covering number is always smaller than the bracketing number, we have . Therefore, Condition C3 in [37] is satisfied under , and in their sign. Hence, of Theorem 1 in [37] on page 584 may be equal to . Because the part behind the minus is close to zero with , one can set a a little bigger than so as to get with a large n. Let replace but keep the same notation with , then the new constant .
Notice that from Theorem 1.6.2 in [38], there are Bernstein polynomials that make , . Similarly, there also exists a function satisfying . Then, the sieve approximate error in [37] is . Therefore, applying the Taylor expansion to surrounding , then plugging in , the Kullback–Leilber pseudodistance of and follows
The first equality holds due to the first derivative of at being equal to zero. As for the penultimate inequality, it holds because all the derivatives and second-order derivatives of the log-likelihood are bounded. Furthermore, since , and , we can get the last inequality, so that . Hence, by Theorem 1 in [37], the convergence rate of is
The proof of Theorem 2 is complete. □
Proof of Theorem 3.
Let us sketch the proof of Theorem 3 in five steps as follows.
Step 1. We first calculate the derivatives regarding , such that , and so on; now, we omit in the following formula for convenience in Step 1.
To obtain the score functions of . Let denote an arbitrary parametric submodel of , in which satisfies the Fréchet derivative . Similarly, we can also define a submodel of g noted by . Moreover, note
and
where and . The score function along is
with , and . Analogously, we have the derivatives with respect to g as
with , and .
The second-order derivatives of have the form
Similarly, we can derive , and as, respectively, the derivatives of , and with respect to b.
and are, respectively, the derivatives of and with respect to , .
, and are, respectively, the derivatives of and with respect to g.
Step 2. Consider the classes of functions and . We need to show these three function classes are Donsker for any . We determine the bracketing number of in order to demonstrate that it is Donsker. In accordance with [37], we have for . This results in a finite-valued bracketing integral according to Theorem 2.8.4 of [36]. Hence, the class is Donsker. Similar justifications support that and are also Donsker.
Step 3. Following similar arguments as in Lemma 2 of [19] and the properties of the score statistic, there exist and satisfying
Let denote the estimators of the sieve log-likelihood and is the projection of onto , . We get
Following the discussion about the proof for Theorem of [34], we can derive that part (I) is equal to . In addition, (II) is also equal to based on (A3). We can acquire (III) as due to being Donsker. As for the fourth term (IV), on account of Theorem 2 and employing the first-order linear expansion of around , one can get (IV) is as well. Summating the four terms, we have . Likewise, we have the property of . Hence, we have
Furthermore, based on some arguments in the proof of Theorem 3.2 in [13], there exists a neighborhood of as , where . Then, applying the Taylor expansion for yields
where . Likewise, it is also easy to get the property of and . Note that the derivatives of the score statistics are bounded. After applying Taylor series expansions about to (A6), and combining the Equations (A7), we have
Taking the first equality in (A8) and subtracting the second and third equalities, we have
Step 5. Define and ; then, we have
where . Next, we need to verify Q is nonsingular. If Q is a nonsingular matrix, then we can conclude from . Moreover, one is enough to show if , then . Thus, we have
where and is the likelihood function. Under our Condition 3, (A10) is equal to zero only if . As a consequence, we have verified Q is nonsingular.
This implies that
Since , we obtain . Thus, , with and being the efficient score function of . Now, we complete the proof of Theorem 3. □
References
- Sun, J. The Statistical Analysis of Interval-Censored Failure Time Data; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
- Lin, D.Y.; Oakes, D.; Ying, Z. Additive hazards regression with current status data. Biometrika 1998, 85, 289–298. [Google Scholar] [CrossRef]
- Martinussen, T.; Scheike, T.H. Efficient estimation in additive hazards regression with current status data. Biometrika 2002, 89, 649–658. [Google Scholar] [CrossRef]
- Feng, Y.; Sun, J.; Sun, L. Estimation of the additive hazards model with linear inequality restrictions based on current status data. Commun. Stat.-Theory Methods 2022, 51, 68–81. [Google Scholar] [CrossRef]
- Wang, P.; Zhou, Y.; Sun, J. A new method for regression analysis of interval-censored data with the additive hazards model. J. Korean Stat. Soc. 2020, 49, 1131–1147. [Google Scholar] [CrossRef]
- Li, H.; Zhang, H.; Zhu, L.; Li, N.; Sun, J. Estimation of the additive hazards model with interval-censored data and missing covariates. Can. J. Stat. 2020, 48, 499–517. [Google Scholar] [CrossRef]
- Wang, T.B.; Yopadhyay, D.; Sinha, S. Efficient estimation of the additive risks model for interval-censored data. arXiv 2022, arXiv:2203.09726. [Google Scholar]
- Tong, X.; Chen, M.H.; Sun, J. Regression analysis of multivariate interval-censored failure time data with application to tumorigenicity experiments. Biom. J. 2008, 50, 364–374. [Google Scholar] [CrossRef] [PubMed]
- Yin, G.; Cai, J. Additive hazards model with multivariate failure time data. Biometrika 2004, 91, 801–818. [Google Scholar] [CrossRef]
- Liu, P.; Song, S.; Zhou, Y. Semiparametric additive frailty hazard model for clustered failure time data. Can. J. Stat. 2022, 50, 549–571. [Google Scholar] [CrossRef]
- Zeng, D.; Cai, J. Additive transformation models for clustered failure time data. Lifetime Data Anal. 2010, 16, 333–352. [Google Scholar] [CrossRef]
- Yu, M.; Du, M. Regression analysis of multivariate interval-censored failure time data under transformation model with informative censoring. Mathematics 2022, 10, 3257. [Google Scholar] [CrossRef]
- Sun, T.; Ding, Y. Copula-based semiparametric regression method for bivariate data under general interval censoring. Biostatistics 2021, 22, 315–330. [Google Scholar] [CrossRef]
- Marra, G.; Radice, R. Copula link-based additive models for right-censored event time data. J. Am. Stat. Assoc. 2020, 115, 886–895. [Google Scholar] [CrossRef]
- Petti, D.; Eletti, A.; Marra, G.; Radice, R. Copula link-based additive models for bivariate time-to-event outcomes with general censoring scheme. Comput. Stat. Data Anal. 2022, 175, 107550. [Google Scholar] [CrossRef]
- Cheng, G.; Zhou, L.; Huang, J.Z. Efficient semiparametric estimation in generalized partially linear additive models for longitudinal/clustered data. Bernoulli 2014, 20, 141–163. [Google Scholar] [CrossRef]
- Lu, X.; Song, P.X.K. Efficient estimation of the partly linear additive hazards model with current status data. Scand. J. Stat. 2015, 42, 306–328. [Google Scholar] [CrossRef]
- Wang, X.; Song, Y.; Zhang, S. An efficient estimation for the parameter in additive partially linear models with missing covariates. J. Korean Stat. Soc. 2020, 49, 779–801. [Google Scholar] [CrossRef]
- Lee, C.Y.; Wong, K.Y.; Lam, K.F.; Xu, J. Analysis of clustered interval-censored data using a class of semiparametric partly linear frailty transformation models. Biometrics 2022, 78, 165–178. [Google Scholar] [CrossRef] [PubMed]
- Chen, W.; Ren, F. Partially linear additive hazards regression for clustered and right censored data. Bulletin of Informatics and Cybernetics 2022, 54, 1–14. [Google Scholar] [CrossRef]
- Sklar, M. Fonctions de repartition an dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 1959, 8, 229–231. [Google Scholar]
- Nelson, R.B. An Introduction to Copulas; Springer Science and Business Media: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
- Joe, H. Multivariate Models and Dependence Concepts; CRC Press: Boca Raton, FL, USA, 1997. [Google Scholar]
- Clayton, D.G. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 1978, 65, 141–151. [Google Scholar] [CrossRef]
- Gumble, E.J. Bivariate exponential distributions. J. Am. Statitical Assoc. 1960, 55, 698–707. [Google Scholar] [CrossRef]
- Zhou, Q.; Hu, T.; Sun, J. A sieve semiparametric maximum likelihood approach for regression analysis of bivariate interval-censored failure time data. J. Am. Stat. Assoc. 2017, 112, 664–672. [Google Scholar] [CrossRef]
- Carnicer, J.M.; Pen˜a, J.M. Shape preserving representations and optimality of the bernstein basis. Adv. Comput. Math. 1993, 1, 173–196. [Google Scholar] [CrossRef]
- Burnham, K.P.; Anderson, D.R.; Burnham, K.P.; Anderson, D.R. Practical Use of the Information-Theoretic Approach; Springer: Berlin/Heidelberg, Germany, 1993; pp. 75–117. [Google Scholar]
- Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Stat. 1979, 7, 1–26. [Google Scholar] [CrossRef]
- Age-Related Eye Disease Study Research Group. The age-related eye disease study (AREDS): Design implications AREDS report no. 1. Control. Clin. Trials 1999, 20, 573. [Google Scholar] [CrossRef]
- Sun, T.; Ding, Y. CopulaCenR: Copula based regression models for bivariate censored data in R. R J. 2020, 12, 266. [Google Scholar] [CrossRef]
- Huang, J. Efficient estimation for the proportional hazards model with interval censoring. Ann. Stat. 1996, 24, 540–568. [Google Scholar] [CrossRef]
- Zhang, Y.; Hua, L.; Huang, J. A spline-based semiparametric maximum likelihood estimation method for the cox model with interval-censored data. Scand. J. Stat. 2010, 37, 338–354. [Google Scholar] [CrossRef]
- Huang, J.; Rossini, A. Sieve estimation for the proportional-odds failure-time regression model with interval censoring. J. Am. Stat. Assoc. 1997, 92, 960–967. [Google Scholar] [CrossRef]
- Wen, C.C.; Chen, Y.H. A frailty model approach for regression analysis of bivariate interval-censored survival data. Stat. Sin. 2013, 23, 383–408. [Google Scholar] [CrossRef]
- Van der Vaart, A.; Wellner, J. Weak Convergence and Empirical Processes; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
- Shen, X.; Wong, W.H. Convergence rate of sieve estimates. Ann. Stat. 1994, 22, 580–615. [Google Scholar] [CrossRef]
- Lorentz, G.G. Bernstein Polynomials; American Mathematical Society: Providence, RI, USA, 2013. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).