An Accelerated Failure Time Cure Model with Shifted Gamma Frailty and Its Application to Epidemiological Research

Survival analysis is a set of methods for statistical inference concerning the time until the occurrence of an event. One of the main objectives of survival analysis is to evaluate the effects of different covariates on event time. Although the proportional hazards model is widely used in survival analysis, it assumes that the ratio of the hazard functions is constant over time. This assumption is likely to be violated in practice, leading to erroneous inferences and inappropriate conclusions. The accelerated failure time model is an alternative to the proportional hazards model that does not require such a strong assumption. Moreover, it is sometimes plausible to consider the existence of cured patients or long-term survivors. The survival regression models in such contexts are referred to as cure models. In this study, we consider the accelerated failure time cure model with frailty for uncured patients. Frailty is a latent random variable representing patients’ characteristics that cannot be described by observed covariates. This enables us to flexibly account for individual heterogeneities. Our proposed model assumes a shifted gamma distribution for frailty to represent uncured patients’ heterogeneities. We construct an estimation algorithm for the proposed model, and evaluate its performance via numerical simulations. Furthermore, as an application of the proposed model, we use a real dataset, Specific Health Checkups, concerning the onset of hypertension. Results from a model comparison suggest that the proposed model is superior to existing alternatives.


Introduction
Survival analysis statistically infers the time until the occurrence of an event of interest, and its main subject of interest is the inference of the survival function. Particularly, investigating how the characteristics of the individuals (covariates) affect survival time has been an epidemiologically important challenge. To tackle this, the proportional hazard model is most commonly used as the regression model for survival time [1]. The advantage of the proportional hazard model is that it allows intuitive interpretation of the results if the covariates are the characteristics obtained at the beginning of the observation of the individuals and are independent of time. In contrast, because the model requires the assumption that the ratio of the hazard functions is independent of time (proportional hazard assumption), its disadvantage is that it may give inappropriate analysis results if the data do not meet the assumption. Furthermore, it has been pointed out that this proportional hazard assumption is strong and often unrealistic [2].
In addition, some of the events of interest may not occur in all individuals (for example, lifestyle-related diseases and the onset of diabetes in AIDS patients). Thus, the proportional hazard model, which assumes that an event always occurs under a sufficiently long observation period, is not suitable for the problem setting mentioned above, and the cure model has been proposed as a model that can be applied to such situations [3,4]. In particular, the mixture cure model divides individuals into latent groups with or without the occurrence of an event (uncured individuals in the case of a curable disease), expressing the survival function as their mixture distribution [5].
To solve these problems, the present study considers the accelerated failure time mixture cure model, which is a flexible model that directly expresses the relationship between the survival function and the covariates and does not necessarily require an assumption regarding the hazard ratios. In this study, we propose the mixture cure model based on the accelerated failure time (AFT) model with frailty, a latent variable that expresses the heterogeneity of the individuals [6], which enables the model to be more flexible. One of our ideas is to assume that the frailty for survival time of the uncured individuals follows a shifted gamma distribution, allowing clearer characterization of uncured individuals. In addition, we analyze the dataset on hypertension using the proposed model and compare its results with those of the existing models. One of the competitive models is a mixture cure model that assumes the proportional hazard model in the survival function for uncured patients. In this case, however, the proportional hazard assumption mentioned above is assumed in the survival function for uncured patients, leading to a model with a strong assumption. On the contrary, the assumption of the AFT model in the survival function for uncured patients allows us to consider a more flexible mixture cure model that does not require an assumption on hazard ratios [7,8].
Our major contribution of the present study is that introducing two aspects of heterogeneity and relaxation of model assumptions in parametric settings. Heterogeneity in this context allows existence of (i) uncured patients (experience an event in the long run) and cured patients in population and (ii) individual difference among uncured patients. Although similar models have been considered in the previous studies (for example, see [9]), our approach is novel in the sense that a shifted gamma frailty is found to improve model fitting and a generalized gamma distribution is considered for the uncured component for flexibility.
The overall organization of this paper is structured as follows: First, Section 2 explains survival time and censoring and defines the functions, such as the survival function, used in survival analysis. It also describes the proportional hazard and AFT models and introduces the survival time regression model considering the cure rate. Section 3 defines the AFT frailty model and the parameter estimation method for frailty. Section 4 describes a numerical example of the proposed model, in which we perform a simulation to evaluate the behavior of estimated values and summarize the results of data analysis on the onset of hypertension for discussion. Finally, Section 5 summarizes this study and discusses future challenges.

Literature Review
We develop a survival model that relaxes some assumptions in the proportional hazard model. Prior to model formulation, we briefly review some related previous studies without showing statistical models. When survival (time-to-even) outcome is of interest, statistical models are often presume that all patients will experience an event if they are observed sufficiently long time. However, this implicit assumption is justified depending on characteristics of the event or context of the study. The cure model considers a fraction of cured patients in a population, and is generally classified into non-mixture and mixture cure models [4]. Since the non-mixture cure model is not suitable for our application, the mixture cure model are considered in our study. While the proportional hazard model is most popular to model a censored survival outcome, it requires a strict "proportionality" assumption on the outcome distribution. Hence, some approach to overcome the restriction the accelerated failure time model becomes popular as an alternative [2,10,11]. To take into account heterogeneity of uncured patients, the concept of frailty is also introduced [6,12].
Frailty is a kind of random effect for each patient; we can interpret it as variability of outcome that does not come from covariates' effect.
The above reviews are three different generalizations of the proportional hazard model in survival analysis. They have developed independently or integrated partially until recent years [9,[13][14][15][16]. In this study, a new model that had all of these elements is proposed with one additional twist: a shifted gamma frailty. We tabulate elements in our proposed model to compare related literature in Table 1. Table 1. Selected related literature and elements of our proposed method. Elements of our proposed method and relationship among existing studies. The symbol means "Considered". PH and AFT means proportional hazard and accelerated failure time models, respectively.

Problem Formulation
This section provides an overview of the problem setting and the regression model for survival time. Let T be a continuous non-negative random variable representing the time of occurrence of an event of interest, and let C be a non-negative random variable that represents the censoring time. Let O = min(T, C) and ∆ = I(T ≤ C) be the actual observed survival time (right censoring), and Z be a p-dimensional covariate vector. Based on these, let {O, ∆, Z} be the observed variables. Let the sample size be n, with the observed values of the i-individual following the probability distribution of (O, ∆, Z ) independently of each other, and their realization is expressed as (t i , δ i , z i ) . Therefore, the sample is Let S(t|z) = P(T > t|Z = z), t > 0 be the conditional survival function of T given Z = z. In addition, is called the conditional hazard function of T. Furthermore, because of the expression h(t|z) = −d log S(t|z)/dt, the relationship between the survival function and the hazard function is as follows: The proportional hazard model, proposed by [1], assumes that the conditional hazard function is where β ∈ R p is the regression coefficient for the covariates, and h 0 (t) is the baseline hazard function. The proportional hazard model possesses the property that the ratio of hazard functions between different individuals does not depend on time (proportional hazard assumption). Therefore, for different Z 1 , Z 2 , the following is expressed: The most well-known regression model for survival analysis is the proportional hazard model, which is often used in actual medical research. The partial likelihood method for the semi-parametric model and the maximum likelihood method assuming a specific distribution for the baseline hazard function can be considered for the estimation of the regression coefficient β [17]. The advantage of the proportional hazard model is that the effects of covariates on the hazard function can be easily interpreted. On the other hand, because this model has a strong assumption that the proportional hazard assumption (1) always holds, its disadvantage is that it cannot draw appropriate conclusions due to the biased estimation when the true probability distribution violates this assumption.

Mixture Cure Models
It is reasonable to consider that events, such as cancer recurrence or the onset of lifestyle-related diseases, do not occur (or are cured) in some cases. The mixture cure model [3], one of the cure models, takes this into consideration. Let D be a random variable that is 1 when it belongs to the event occurrence group and is 0 otherwise. The conditional survival function of T can then be expressed as The right-hand side of Equation (2) contains the probability of belonging to the uncured group P(D = 1|Z = z), in which the survival function is P(T > t|D = 1, Z = z). Let p(Z), be the model for the former and S u (t|Z) be the model for the latter. The mixture cure model can then be expressed as where lim t→∞ S u (t|Z) = 0, 0 ≤ p(Z) ≤ 1 a.e. is assumed. Under the assumption described above, lim t→∞ S(t|Z) = 1 − p(Z) ≥ 0, in which 1 − p(Z) can be interpreted as the cured probability. In general, for the uncured probability p(Z), the logistic regression model with the parameter γ ∈ R p+1 is considered to be whereZ = (1, Z ) . In addition, the proportional hazard and AFT models can be applied to the survival function for uncured patients S u (t|Z) [5,7]. The observed data likelihood function is maximized in the estimation of model (3). Let f u (t|Z) be the probability density function corresponding to S u (t|Z). For the observed data {(t i , δ i , z i ) ; i = 1, . . . , n}, the observation likelihood function L of model (3) is expressed as In addition, the maximization of Equation (5) can be performed with the EM algorithm (expectation maximization algorithm) [18]. The non-mixture cure model, an approach to the cure model [4], originated from the biological mathematical model of cancer cells. Therefore, its application to the interpretation of data in epidemiological studies, which is the main objective of the present study, is difficult. In fact, the mixture cure model is often applied to clinical trials and observational studies [7,19].

Accelerated Failure Time Models
The AFT model is a regression model that assumes for T [20], with the assumption that ξ is a random variable that represents the error and is independent of Z. Here, let S ε (t|Z) be the conditional survival function where ε = e ξ ; then, In addition, using the conditional probability density function f ε (t) of ε, the conditional probability density function f (t|Z) of T is expressed as The advantage of the AFT model is that it models the relationship between T and Z, and it is equivalent to a linear regression model without censoring. In addition, a model without proportional hazard assumption can also be expressed depending on the distribution that assumes for ξ or ε (it becomes the parametric proportional hazard model in the case of exponential and Weibull distributions). As with the proportional hazard model, there are semiparametric and parametric methods for inferring the AFT model (e.g., [21]). In the present study, we consider the parametric method and make statistical inferences assuming a specific distribution, such as Weibull and logarithmic normal distributions, for ε.
In this chapter, we describe the survival time regression model that incorporates frailty and proposes a model using frailty. Frailty in survival analysis represents heterogeneity that cannot be expressed by covariates. Since Ref. [6] first proposed the frailty model, many have studied the regression models incorporating frailty and their parameter estimation methods (e.g., [13,22]). Furthermore, previous studies, including [14,23], reported cases in which the cure rate was considered. In this chapter, we propose the mixture cure model using the accelerated failure time frailty model that assumes a shifted gamma distribution of frailty in the survival function for uncured patients, and we describe its estimation method.

Frailty Models
Ref. [6] proposed a model incorporating a latent variable, known as frailty, for situations that assume heterogeneity between individuals that cannot be expressed by covariates. Let Y be a non-negative random variable that represents frailty and assume that it is independent of (T, Z). The frailty model for the proportional hazard model is thus the conditional hazard function h(t|Z) with Model (8) is a normal proportional hazard model when Y = 1. In this frailty model (8), the value of the hazard function increases when Y > 1, while it decreases when Y < 1. Therefore, it can express the changes in the magnitude of the event occurrence rate for the same Z depending on the magnitude of Y.
Additionally, taking frailty into consideration based on Equations (6) and (7), the following can be considered for the AFT model: whereS(t) is the survival function. Model (9) is obtained by multiplying the hazard function by a variable representing the frailty from the relationship between the survival function and the hazard function. Based on these, when the covariates and frailty values are given, the survival function is expressed as There are two methods of estimation for the AFT frailty model (8): a method assuming thatS(t) follows a parametric probability distribution [24] and a semiparametric method that does not assume a specific distribution [15].
Since the variable Y is a latent variable, we marginalize the conditional survival function S(t|Z, Y) with respect to Y. When we denote the distribution function of Y by F Y , the conditional survival function S(t|Z) of T in the frailty model (8) can be calculated by where H 0 (t) = t 0 h 0 (s)ds. Similarly, the accelerated failure time frailty model (9) can be calculated as whereH(t) is a cumulative hazard function corresponding toS, andS(t) = exp[−H(t)]. An example of frailty Y is the gamma distribution Gamma(ζ, τ) with the scale parameter ζ and the shape parameter τ. In this frailty model, we assume a one-parameter gamma distribution with ζ = θ, τ = 1/θ for identifiability. The probability density function of Y is then expressed as and E[Y] = 1 regardless of the value of θ [25]. Assuming this gamma distribution is the distribution of Y, model (11) is expressed as and model (12) obtains They are referred to as the gamma frailty model and the AFT gamma frailty model, respectively.

A Novel Accelerated Failure Time Frailty Mixture Cure Model
Our study uses a shifted gamma distribution for the random variable Y. In this section, we address our proposed model, the reason why the above distribution is assumed, and an estimation algorithm.

Proposed Model
The mixture cure model, which is proposed in this section, is based on the survival function S(t|Z) = p(Z)S u (t|Z) + (1 − p(Z)) in Equation (3). More specifically, we assume the AFT frailty model (12) in the survival function for the uncured group S u (t|Z) and the logistic regression model (4) in the uncured probability p(Z). In addition, this study considers the case in whichS(t) is the survival function of a generalized gamma distribution, and we use a reparametrized representation of the generalized gamma distribution given in the original study by [26]. Therefore, using three parameters µ ∈ R, σ > 0, q ∈ R, the probability density function is expressed as for x > 0 [27]. The function (16) becomes the probability density function of a Weibull distribution when q = 1, a gamma distribution when q = σ, and a logarithmic normal distribution when q = 0. The generalized gamma distribution is the flexible random model that contains the probability distribution described above, which is often used in survival analysis. Hereafter, the survival function and cumulative hazard function of a generalized gamma distribution are expressed as S gg (t; µ, σ, q), H gg (t; µ, σ, q), respectively. Furthermore, we assume a shifted gamma distribution for the frailty Y [28]. If Y ∼ Gamma(ζ, τ), then a random variable Y that follows a shifted gamma distribution with three parameters, η, ζ, and τ, is defined as In this study, the parameter η of a shifted gamma distribution is fixed at η = 1, and is assumed for the frailty. When the distribution function of Y is expressed as F sg (y), the marginalized survival functionŠ(t|Z) is as follows: The uncured individuals can be characterized more clearly by assuming this shifted gamma distribution. The reason for this is that, assuming a gamma distribution Gamma(θ, 1/θ) for Y, it is highly probable that the frailty value becomes 1 or less [29]. That is, holds for almost all θ ( Figure 1). Therefore, the normal gamma frailty model considers the case in which events are less likely to occur, compared to the case in which frailty is not assumed. In fact, in the analysis of the real dataset discussed below, the estimated value of θ in this model was significantly close to 0, suggesting that the normal gamma frailty model [16] is inappropriate. When the shifted gamma distribution (17) is assumed for frailty,Š holds for any t ∈ [0, ∞). Therefore, the proposed model can characterize the survival function of the uncured group more clearly compared to the case in which frailty is not assumed. Based on these, the proposed model of the survival rate function is expressed more specifically as

Estimation Method and Its Algorithm
In this section, we construct the parameter estimation method for the proposed model (19) based on the EM algorithm. Recall that β ∈ R p , γ ∈ R p+1 , and let κ = (µ, σ, q) and θ be the parameters of the generalized gamma distribution and shifted gamma distribution, respectively.
Let D i (i = 1, . . . , n) be a random variable that is 1 when the i-th individual belongs to the event occurrence group and is 0, otherwise (i = 1, . . . , n). Let the observation time of the The likelihood function L(β, κ, θ, γ) of the proposed model (19) is expressed as Note that indication of parameters on the right-hand side of the equation above is suppressed for simplicity of expression. Here, because the log-likelihood function can be written as follows: wherez i = (1, Z i ) . Since the log-likelihood function (20) is divided into the terms related to {β, κ, θ} and the terms related to γ, their maximization can be performed separately. In the algorithm given below, let the updated value of the k-th parameter be φ (k) = (β (k) , κ (k) , θ (k) , γ (k) ) and the observed samples be D To simplify the following notation, E[·|D; φ (k) ] is written asĒ [·]. If the conditional expected value of the function (20) is expressed as¯ (β, κ, θ, γ), the following is then expressed: In addition,Ē[D i ] can be calculated as follows: where and p (k) (z) is a function in which γ in p(z; γ) is replaced with γ (k) .
In the M step, the value of the parameter that maximizes Equation (21) is obtained based on the calculation result of the E step. As mentioned above, since Equation (21) is divided into the term related to {β, κ, θ} and the term related to γ, the maximization of the two parameter sets can be performed individually. Therefore, the function¯ (β, κ, θ, γ) is decomposed into and the updated value of (k + 1)-th parameter can be obtained as follows: The above calculation is organized as an algorithm. Let the sample as the realization D be {(t i , δ i , z i ) ; i = 1, . . . , n}.
The estimators estimated by the EM algorithm are the value that maximizes the observation likelihood and are the maximum likelihood estimated values. Therefore, the estimators estimated by this algorithm have consistency and asymptotic normality under appropriate regularity conditions [30]. The log-likelihood function of the observation likelihood in the proposed model (19) is expressed as with φ = (β , κ , θ, γ ) Then, the information matrix of φ ∈ R 2p+4 is expressed as and, from the asymptotic normality of the maximum likelihood estimator, the following holds: is the maximum likelihood estimator, and φ 0 is the true value of φ.

Numerical Examples
In this section, the behavior of estimated values of the parameters of the proposed model is confirmed by simulation. In addition, the dataset on the onset of hypertension will be analyzed using the proposed model and compared with that analyzed using existing models for discussion. The statistical software R was used for all analyses.
independently of each other, where Bernoulli(r) is a Bernoulli distribution with the parameter r ∈ (0, 1) and Unif(a, b) is a uniform distribution on the interval (a, b). Let n be the sample size and Z 1 , . . . , Z n be independent copies of Z. Next, forZ i = (1, Z i ) and γ = (γ 1 , γ 2 , γ 3 ) , let be the uncured probability of the i-th individual and generate D i ∼ Bernoulli(p i ) independently of each other. Assuming that the frailty Y i of the i-th individual follows the shifted gamma distribution (17) with the parameter θ independently of Z i , the event occurrence time T i for β = (β 1 , β 2 , β 3 ) is generated based on In addition, it is assumed that the censoring time C i (i = 1, . . . , n) follows the uniform distribution Unif(0, 5) independently of any variables. Based on the variables generated above, O i = min(T i , C i ) and ∆ i = I(T i ≤ C i ) are defined, and the observed sample {O i , ∆ i , Z i (i = 1, . . . , n)} is obtained to estimate the parameters of the proposed model. In this simulation, the following settings are used for the true values of the parameters: The difference between settings (i) and (ii) is the value of parameter θ of the shifted gamma distribution. We will examine the effects and tendencies of the estimation results due to the difference in the parameter on frailty.

Results
The mean parameter estimate values for settings (i) and (ii) are shown in Tables 2 and 3, respectively.  For almost all of the estimated values in settings (i) and (ii), the mean approaches the true value, and the standard deviation decreases as the sample size increases. In addition, Figures 2 and 3 show the histograms of estimated values of the main subjects of interest in setting (ii) (n = 1000). They show that the estimated values of the regression coefficients β 1 and γ 1 are symmetrically distributed around the true value. A similar result was obtained for σ, a component other than β and γ. Although most of the estimated values of θ are close to their true values, some were far from their true values. In addition, the results suggest that the magnitude of θ affects the parameter q of the generalized gamma distribution and that the estimation of θ affects the estimation of q. The initial values of all estimations were fixed to the same value in this simulation, but the estimated values were converged to those closer to the true value when the initial values were modified. Therefore, the use of multiple initial values for parameter estimation of the proposed model would be a measure for obtaining an appropriate estimate.

Real Data Example
In this section, we analyze the real dataset on the onset of hypertension using survival time regression models and compare the results.

Dataset and Previous Study
The subject of analysis is the dataset on the Specific Health Checkups and Specific Health Guidance, the program started in April 2008 under the National Health Insurance System to maintain lifestyle-related diseases aged between 40 and 74 years. We analyzed the data from Specific Health Checkups conducted in Habikino City, Osaka Prefecture, Japan. They consist of 8325 residents who participated in Specific Health Checkups in 2008 and were followed until the end of March 2013. Of these, 4993 were females and 3332 were males. In this program, lifestyle information for the participans was collected using a self-report questionnaires about the lifestyle and body measurement values, such as waist circumference, age and body weight, and triglyceride level.
We analyzed the dataset of size 3326 (2202 females and 1124 males) obtained by the same exclusion criteria of [31]. They studied the relationship between the onset of hypertension and lifestyle habits, collected via a standard questionnaire, using the proportional hazard model. The present study used the 14 variables (such as age, waist circumference, eating speed, and amount of drinking) used in this previous study as covariates and considered a maximum of 18 linear models as parameters as it also included categorical variables (refer to the paper for details on the variables).
In the present analysis, the onset of hypertension was regarded as an event of interest, and the regression model was applied for each gender. First, the survival functions for males and females were estimated using the Kaplan-Meier estimator (Figure 4). It showed the five-year survival rates of 0.33 and 0.42 for males and females, respectively, indicating that they were not low. In addition, we performed a hypothesis testing of the proportional hazard assumption based on the Schoenfeld residuals for this dataset [32]. The result indicated that the null hypothesis was rejected for the variable "eating snacks after dinner at least 3 times a week" in the datasets of both males and females at a significance level of 5% (p-values for the datasets of males and females were 0.0155 and 0.0203, respectively). Therefore, it was suggested that the survival function for the onset of hypertension does not have the proportional hazard assumption. This result indicates that assuming a model without the proportional hazard assumption is appropriate, considering the probability that an event does not occur.  The following five types of models are applied in this section: 1.
Mixture cure model with the proportional hazard model; 4.
Mixture cure model with the AFT model; 5.
Proposed model.
The survival distributions assumed for the model are an exponential distribution and a Weibull distribution for the proportional hazard model. A logarithmic normal distribution and a generalized gamma distribution were used for the AFT model. It should be noted that, based on the definitions of these two models, the model used when an exponential or Weibull distribution is assumed for the baseline hazard function of the proportional hazard model is the same as that used when an exponential or Weibull distribution is assumed as the error variable of the AFT model. A logistic regression model is assumed for all uncured probabilities in the mixture cure model. The distribution mentioned above is used for the survival distribution assumed for the survival function for uncured patients.
In some cases during the analysis using the mixture cure model, the absolute value of the estimated value of the regression coefficient γ in the logistic regression model was extremely large. Therefore, we adopted the method of handling the problem of complete separation by [33], which maximizes the objective function * (γ) = logis (γ) + 1 2 log |Î(γ)| (26) where a penalty function is added to the log-likelihood function logis (γ) of the logistic regression model [34]. For the sample covariates z 1 , . . . , z n , is an estimated value of the information matrix of logis (γ). This method was applied to¯ 2 for the proposed model (19). In addition, the Akaike information criterion (AIC) was used as a criterion for comparing the goodness of fit of the model [35].

Results
The regression model used in the analysis and AIC values are shown in Tables 4 and 5. As shown in Tables 4 and 5, the lowest AIC is observed in the proposed model when the variable selection is performed for variables in the logistic regression model in both males and females. The estimated values, 95% confidence intervals, and p-values for these estimation results are summarized in Tables 6-9. Table 6. Estimation result of regression coefficient when variable selection is performed by applying the proposed model in males. CI is the confidence interval. The asterisk (*) and dagger ( †) indicate that p-value is less than 0.10 and 0.05, respectively. As shown in Tables 4 and 5, the proposed model with the variable selection of the regression coefficient in the uncured probability had the lowest AIC values. However, other models had lower AIC values than the proposed model when variable selection was not performed. This is likely due to the instability resulting from the 18-parameter logistic regression model in the mixture cure model. For males, seven variables were selected for the uncured probability after variable selection: "age", "eating speed: normal", "eating speed: fast", "eating snacks after dinner at least 3 times a week", "increase in body weight by 10 kg or more compared to that at the age of 20 years", "amount of drinking: occasionally", and "amount of drinking: less than 1-2 go daily". Of these, "age", "eating snacks after dinner at least 3 times a week", "increase in body weight by 10 kg or more compared to that at the age of 20 years", and "amount of drinking: less than 1-2 go daily" were judged to have a significant regression coefficient at the 5% significance level. Among the variables included in the survival function for uncured patients, "eating snacks after dinner at least 3 times a week" was judged to have a significant regression coefficient at the 5% significance level. For females, three variables were selected for the logistic regression model of the uncured probability after variable selection: "age", "eating speed: fast", and "increase in body weight by 10 kg or more compared to that at the age of 20 years". Of these, only "age" was judged to be significant at the 5% significance level. There were no variables in the survival function for uncured patients judged to be significant at the 5% significance level. This result suggests that "age" is the only variable that affects the onset of hypertension in females. Figure 5 shows the probability density function of the estimated shifted gamma distribution. Estimation was also performed when the accelerated failure time gamma frailty model was assumed for the mixture cure model, showing that the estimated values of θ were 9.42 × 10 −6 and 6.93 × 10 −5 for males and females, respectively, which were significantly close to 0. The AIC values were 7006.649 and 11,603.85 for males and females, respectively. These were higher than those observed without the assumption of frailty, suggesting that this model was not appropriate. The number of variables selected for the logistic regression model differed between males and females, and some variables were judged to be significant only in males, suggesting that the covariates that affect the onset of hypertension differ between males and females. In the distribution of frailty, females had larger estimated values of θ and variance than males. This indicates that females have greater differences in the onset of hypertension among individuals.

Discussion
In this study, we proposed a model that assumes the AFT frailty model in the survival function for uncured patients in the mixture cure model. The characteristic of the proposed method is that the variable representing frailty is assumed to follow a shifted gamma distribution, which characterizes the uncured individuals more clearly. In addition, we constructed a parameter estimation method for the proposed model using the EM algorithm and confirmed that the estimation could be performed appropriately using a simulation. Furthermore, we analyzed the dataset on the onset of hypertension in epidemiological research. Compared with existing models, the proposed model showed the lowest AIC values for both males and females.
Ref. [31] reported the results of the analysis using a semiparametric proportional hazard model. Since a parametric model was used in the analysis of the present study, the results of these two studies cannot be compared directly using AIC. Therefore, comparison criteria between semiparametric and parametric models need to be considered in the future. Previously, comparison criteria between semiparametric and parametric models in the proportional hazard model were studied by [36,37], who used approaches based on the focused information criterion (FIC).
The model proposed in this study uses the parametric AFT model, but the method using the semiparametric AFT model can be considered as an alternative. The semiparametric AFT frailty model was previously studied by [38]. Because this model can estimate regression parameters without assuming a specific distribution in the error term, a more flexible model can be considered. On the other hand, its estimation method is more complicated than that of the parametric model. One future challenge is to extend the model proposed in this study to the semiparametric model and construct an estimation algorithm.
In addition, while this study constructed the estimation algorithm assuming only right censoring, there are other types of censoring, one of which is interval censoring. Interval censoring refers to the cases in which only the occurrence of an event between two observation periods is recorded [39]; it is a more generalized type of censoring and includes right censoring. The construction of an estimation algorithm for the datasets that contain interval censoring is also needed in the future. While models considered in the study are linear static models, dynamic prediction for survival data is an important issue [40]. In addition, recent development for big data such as machine learning techniques would be incorporated [41].