Generalized Accelerated Failure Time Models for Recurrent Events

: For analyzing recurrent event data, we consider a generalization of the classical accelerated failure time model. In the proposed approach, the general function is no longer assumed to be a singleton but allowed to be time-varying. This is in the same spirit as in quantile regression and the counting process techniques can be utilized. Theoretical properties such as consistency and asymptotic normality are obtained. The illustration of the methodology using simulation studies and then the application to the bladder cancer data is also given.


Introduction
Recurrence event data refer to the situation in which events of interest occur repeatedly over time, and these are wildly used in science and technology, especially in medical research, e.g., epileptic seizures, tumor recurrences, or asthma attack.When the investigators studied the recurrent events, they are often interested in the estimation of covariates on recurrent event times, which can help them perform further predictions.The establishment of models can be approached in many ways; meanwhile, the concept of intensity functions and the counting process are also useful.The nonparametric method used to generalize the intensity estimator for the censored failure time data, independently started by [1,2], also denoted the Nelson-Aalen estimator, is one of the methods widely used by investigators.Refs.[3,4], among others, also studied multiplicative models for the rate and mean functions, an approach which can be used with more general models, including regression.Ref. [5] proposed a generalization of the accelerated failure time model, the so-called accelerated recurrence time model, which resembles quantile regression as well as allows for the evolution of covariate effects.
In this paper, we also extend the quantile regression approach and utilize the counting process techniques.For a time-to-event response T, Ref. [6] proposed that a form may assume Q T (τ|Z) = exp X T β 0 for τ ∈ (0, 1), where Q T (τ|Z) = inf{t : Pr(T ≤ t|Z) ≥ τ}, so Q T (τ|Z) can also be seen as the τ-th quantile of T given that the covariates Z and X = (1, Z T ) T .Here, T, τ and t denote the time to event, the quantile level, and a nonnegative real number, respectively.Since its introduction, quantile regression has been widely used and researched, mainly in survival analysis.Ref. [7] extended the LAD estimation method to more general quantiles, and can also improve efficiency when the error terms are identically distributed.Ref. [5] developed a new quantile regression approach to the counting process, and Ref. [8] adopted a constant general function in quantile regression which it applied to recurrence event data.In the application and algorithm estimation of quantile regression, efforts have also been made by many people.
Ref. [9] proposed a locally weighted censored quantile regression approach that can solve covariate-dependent censoring.Ref. [10] proposed a semi-parametric approach using empirical likelihood to a random effects quantile regression model.
Overall, this work considers a situation wherein the general function of the accelerated failure time model is affected by time, sharing the same spirit of the parametric estimation method, while we will use the nonparametric method to estimate the effect.Then, we developed a new counting process model extended from the quantile regression models of this general function.This method provides some ideas for the development of more diverse estimation methods for the counting process.

Accelerated Recurrent Events Time Model
shows that the observation is limited to followup time L and ∧ is the minimization operator, meanwhile, the at-risk process Y(•) and Y(t) = I(L ≥ t).We also have µ Z = E{R(t)|Z}, and Z is a p-vector as the covariates of interest.The original accelerated recurrence time models considered covariates' effects as time scale changes, which also share the same spirit as quantile regression, and the inverse function is in a general setting, and G(u) = u denotes the time to expected frequency u.However, using a constant to estimate the expected frequency has some limitations.For example, the effect of the intervention of interest factors affecting the occurrence of an event may change over time.In this case, the model which combines a constant and variable coefficient may be more effective.As such, we proposed a new accelerated recurrence time model, and the right side of the inequality in the inverse function is G(u) = u • m(t) instead of G(u) = u, where m(t) is the function of time t.Regardless of the form of inverse function, it can be written in the following form: As can be seen from the above formula, logτ Z (u) should be linear.Compared to the original accelerated recurrence model and the improved model, when the recurrent event T follows a homogeneous Poisson distribution, as shown in Figure 1, the estimated logτ Z (u) by both models is approximately linear when the recurrent event is a non-homogeneous Poisson process, which has an intensity function λ 0 (t) related to time t.In Figure 2, we compare three different situations, namely when λ 0 (t) = b × t and b = 1, 5, 10.As b increases, the estimation of the original accelerated recurrence model tends to be more and more nonlinear, while the proposed improved model still shows a strong linear relationship.In that way, the proposed recurrent event time model can be used in a more general situation.
In this paper, we consider the situation that G(u) = ut, then model 2 can be rewritten as if and only if β = β 0 (u), and the left-hand side should be monotone, then the above equation can be transformed into where a + = a ∨ 0 and ∨ denotes the maximization operator.Then, we can come up with the following theorem: | Z}, and assume (C1) E X 2 I{L > τ Z (u)} is non-singular; (C2) μZ (e X T β 0 (u) )F Z (e X T β 0 (u) ) > U for all u ∈ (0, U]. Then, where  The proof of this result, as well as the proofs for Theorems 2 and 3 below, are given in Appendix A. Powell extended these results for censored quantile regression when the censoring time is observed [7].We will then combine the counting process model and apply this method to the recurrence events data.

The Recurrent Events Model
For the counting process model, we begin with a review of Gill and Andersen's study [11], which extended the Cox proportional hazards model [12] to a multivariate counting process.The intimate connection of the study on the multivariate counting process and the use of martingale methods can allow us to derive more properties of the statistical estimation and testing procedures.As each component N i (•) of a multivariate counting process has a cumulative hazard function Λ T (•), there exist local martingales M i defined by M i (t) = N i (t) − Λ T (t).Since the cumulative hazard function Λ T (•) is the integral of the hazard function, then according to [11], the counting process formation of the Cox model can be given by where λ 0 (s) denotes the baseline hazard function, and this model can accommodate recurrent events data well.Peng and Huang conducted a re-examination work [5], which combined the quantile regression method for survival data with the above counting process model.
which has a nice monotonicity property.In model 6, the estimated cumulative hazard function ΛT (x) = −log(1 − x).Huang also showed that a singleton model, where an estimated cumulative hazard function is a constant u, can also be applied in the counting process model [13].When the singleton model is combined with quantile regression, this extended the following model to a more general situation of recurrent events [8], and the estimation formula is given by where g(s) ≡ 1.In fact, in model 7, the estimated cumulative hazard function G(u) = u, which is a constant.According to Theorem 1 and the above models, we can propose a new counting process model that takes the form where in our model, we can also show that Therefore, we also have an alternative representation of the model that The proof of this recurrence events setting is also shown in the Appendix A. In the simulation part, we will show that our estimation method performs better when the recurrent event time follows the non-homogeneous Poisson distribution.

The Proposed Estimation Procedure
From Theorem 1, we can define the objective function Theorem 1 leads to the estimation of It can be seen that the objective function is not convex.An algorithm was developed to find the local minimizer that is asymptotically equivalent to the global minimizer [13].Since we proposed the counting process-based model (model 7), then we can propose an equation to estimate β 0 (u): where As the results proposed in [11], although M h is a local square integrable martingale, it has the same asymptotic property as a global martingale, and if we let S(β, u) = E{S n (β, u)}, then follows the martingale property of M(•) given that S(β 0 , u) = 0.
Equation ( 8) boils down to the estimation of censored quantile regression in [5].A common method used to predict β 0 (u) is the grid-based estimation procedure by denoting β 0 (u) as a right-continuous piecewise-constant function that jumps on a grid.More specifically, and for our recurrence setting, the U in the grid can be greater than 1.The size of S L(n) , denoted by S L(n) , is the maximum value of u j − u j−1 where j = 1, • • • , L(n).It is also noteworthy that exp X T β(0) = 0. Thus, based on model 9, we can estimate β(u j ) by sequentially solving the estimating equation Since Equation ( 10) is not continuous, the exact root may not exist.Fygenson and Ritov proposed a generalization solution of β(u j ) for monotone estimating equations [14].To find the generalization solution of Equation (10), we need to perform some simple algebraic manipulations, and then the solution-finding problem is equivalent to locating the minimizer of the following L 1 -type convex function: where R * is a very large number and j = 1, • • • , L(n).One can also solve the l j (h) by using statistical software, such as the rq() function in R package quantreg or the l1fit() function in S-PLUS.

Asymptotic Properties
In this section, we establish the uniform consistency and weak convergence of the proposed estimator β(u).Firstly, some regularity conditions should be stated.
By using simple algebra, we also define B(b) = dµ(b)/db T = E X 2 e X T b g Z (e X T b ) , f (e X T b |Z) .Suppose the assumptions in Theorem 1 hold and β(u) is strongly consistent for β 0 (u).Assume the following conditions: (C1') Z and N(L) are bounded; (C2') (a) μZ (•) is bounded and continuous at t = τ Z (u), as well as uniform in Z, and (b) where eigmin(•) denotes the minimum eigenvalue of a matrix.
Condition (C1') implies the boundedness of covariates and the number of observed events, and (C2') gives the smoothness of β 0 (u), condition (C3') sets additional mild assumptions such as the positive definite.Noted that (C4') is a key condition that ensures the consistency of the proposed estimator.Then, we have the following theorems.The proofs are also in the Appendix A.

Simulation Studies
In this part, we conducted the Monte Carlo simulations to test our model.Combining the methods of Huang and Peng [13], recurrent events were generated from both homogeneous Poisson and non-homogeneous Poisson processes.We also generated two covariates, X 1 and X 2 , following the distribution Bernoulli(0.5)and Uni f (−0.5, 0.5) separately.The recurrent event time was generated by: where in one case, T * (j) , j = 1, 2, ... was a recurrent event time from the standard homogeneous Poisson process, in other words, the gap times of T * (j) − T * (j−1) are independent and identically exponetial(1) variables; in other cases, T * (j) , j = 1, 2, ... is a recurrent event time from the non-homogeneous Poisson process with the intensity function λ 0 (t) = t.Furthermore, the frailty γ followed the Gamma distribution for the homogeneous Poisson process, we considered two situations, that is the variance of γ was chosen to be 0 and 0.5, for the non-homogeneous Poisson process, and we only consider the variance of γ to be 0.Under our simulation setup, we have For censoring time L, we generated it from Uni f (0, 12).For each selection of the variance of γ, we generated 500 datasets of sample size n = 100.Since we adopted the grid-based method to estimate the β, an equally spaced grid on u ∈ (0, 3] with step size 0.02 was conducted in our simulation. Figures 3 and 4 are the simulation results for the homogeneous Poisson process from the set-up with a Gamma frailty of variance 0 and 0.5.In the first row, we plot the empirical bias of the proposed estimator β(u) (solid lines) and the empirical bias of the Sun's estimator (dashed lines) [8].Sun considered the double censored situation while we only consider the right-censored event time to suit a more general situation.In the second row of the plots, we depicted the empirical mean squared error (MSE) versus the expected frequency u.The third row presents the coverage probabilities of 95% confidence intervals obtained from the proposed estimator (solid lines) and Sun's estimator (dashed lines).We can see that both methods have a slight bias which converges to 0, and the empirical MSE also tends to be stable as u increases.In the homogeneous Poisson process, when the variance of gamma frailty equals 0, the empirical MSE shows less fluctuation compared to the variance that is equal to 0.5.
Figure 5 depicts the same parameters as Figures 3 and 4, while the difference is that the results are from the non-homogeneous Poisson process.We can see that, in the first and second rows of Figure 5, both methods performed quite similarly in terms of the bias and SD.However, from the third row, in terms of the coverage probability, the proposed estimator (solid lines) performed slightly better than the Sun's estimator (dashed lines), than for the simulation of the non-homogeneous Poisson process.

Application to the Bladder Tumor Studies
We applied our proposed estimated method to a well-known bladder tumor study [15].This dataset was conducted to analyze the effect of two treatments, pyridoxine and thiotepa, based on the recurrence of bladder tumors.A total of 118 subjects were recorded, 48 were treated with placebo, 32 were pyridoxine, and 38 with thiotepa.The covariates also contained the initial number of tumors, the size of the largest initial tumor and others.The maximum observed number of recurrences is 9.
We selected four covariates and the intercept: the two treatment methods and the initial tumor size and number.In Figure 6, we displayed our estimation result of the proposed method (solid lines) and Sun's method (gray line) [8], surrounded by point-wise Wald 95% confidence intervals (dashed lines).The grid-based estimation method estimated the regression coefficients over [0, 1.6].The intercept coefficient estimates represent the log time to the expected frequency of the bladder tumor recurrence, consisting of patients who had no pyridoxine and thiotepa treatment.The intercept term indicates that as time increases, the bladder tumor recurrence also increases, which is in line with expectations.The negative non-intercept coefficient estimates pyridoxine and thiotepa show that these two treatments can inhibit tumor recurrence.In contrast, the initial tumor size and number of covariates are positive, suggesting a negative effect on tumor recurrence.From Figure 6, we can see that our method and Sun's method do not have much difference, and the overall trend is the same.

Discussion
In this article, we introduced the accelerated recurrence time model for recurrence events, and then considered a new situation when the expected frequency of the accelerated recurrence time model was affected by the time-varying and combined this model with the counting process model.This new counting process model with a non-singleton general function is similar to Cox's regression model but can be transformed into quantile regression.This method can also estimate more general situations, such as the non-homogeneous Poisson process.
We can generalize our estimation procedure from double-censored to right-censored situations.In the recurrence events setting, the most popular choice of G(•), which may be G(u) = u, as well as G(u) = −ln(1 − u), we introduce our new estimation method with G(u) = ut.The assumption is that a constant hazard is rarely tenable in practical problems.Therefore, in the parametric estimation process, a more general distribution of the hazard function is the Weibull distribution, which is g(t) = λγt γ−1 for 0 < t < ∞, so we have G(u) = ut γ .When γ = 0, the Weibull distribution function becomes a constant, but resembles the accelerated recurrence time model.When γ = 1, the model became our proposed method.Thus, after combining the parametric and nonparametric methods, we can make the estimation method more diverse and adapt to different types of datasets.

Recurrent Event Setting
Suppose model 6 holds, then according to the following definition: E N(e X T β 0 (u) )|Z = µ Z (e X T β 0 (u) ∧ L), and the inequality e X T β 0 (u) ≤ L can also be written as G(u) ≤ µ Z (L) following the definition of τ Z (u).Then In model 6, G(u) = µ Z X T β 0 (u) , so it follows from the above equation that for u ∈ (0, U], E u 0 Y(e X T β 0 (s) )g(s)ds | Z = µ Z (L ∧ e X T β 0 (u) ).
Therefore, the model 5 is satisfied.Suppose that model 5 holds, then taking the derivative with respect to u on both sides of equation in model 5, we have μZ (e X T β(u) )F Z (e X T β 0 (u) )e X T β(u) X T dβ = F Z (e X T β 0 (u) )g(u)du, that is, dµ Z (e X T β(u) ) = g(u)du, then µ Z (e X T β(u) ) = G(u) and model 6 are satisfied.

Figure 1 .
Figure 1.Estimation results of homogeneous Poisson process.

Figure 2 .
Figure 2. Estimation results of non-homogeneous Poisson process.

Figure 3 .
Figure 3. Simulation results with sample size n = 100 and the set-up with Gamma frailty of variance 0.

Figure 4 .
Figure 4. Simulation results with sample size n = 100 and the set-up with Gamma frailty of variance 0.5.

Figure 5 .
Figure 5. Simulation results of non-homogeneous Poisson process with sample size n = 100 and the set-up with Gamma frailty of variance 0.