Article

Generalized Gamma Frailty and Symmetric Normal Random Effects Model for Repeated Time-to-Event Data

1 School of Statistics and Mathematics, Interdisciplinary Research Institute of Data Science, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
2 Department of Mathematics and Statistics, McMaster University, Hamilton, ON L8S4K1, Canada
3 Department of Mathematical Sciences, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
* Author to whom correspondence should be addressed.
Symmetry 2025, 17(10), 1760; https://doi.org/10.3390/sym17101760
Submission received: 14 August 2025 / Revised: 25 September 2025 / Accepted: 2 October 2025 / Published: 17 October 2025

Abstract

Clustered time-to-event data are common in survival analysis, and finding a suitable model that accounts for dispersion as well as censoring is an important issue. In this article, we present a flexible model for repeated, overdispersed time-to-event data with right-censoring. The model incorporates generalized gamma and normal random effects in a Weibull distribution to accommodate overdispersion and data hierarchies, respectively. The normal random effect is symmetric, meaning that its probability density function is symmetric around its mean. Although the random effects are symmetrically distributed, the resulting frailty model is asymmetric in its survival function because the random effects enter the model multiplicatively via the hazard function, and the exponentiation of a symmetric normal variable leads to a lognormal distribution, which is right-skewed. Because the likelihood function and its derivatives involve intractable integrals, a Monte Carlo approach is used to approximate them. The maximum likelihood estimates of the model parameters are then determined numerically. An extensive simulation study is conducted to evaluate the performance of the proposed model and the method of inference developed here. Finally, the usefulness of the model is demonstrated by analyzing a data set on recurrent asthma attacks in children and a recurrent bladder data set known in the survival analysis literature.

1. Introduction

Clustered time-to-event data are commonly encountered in contemporary medical and statistical studies. It is important to accommodate data hierarchies that can result from repeated measurements of survival outcomes on the same subject [1]. In addition, overdispersion [2] as well as censored observations are most likely to be found in the data. Therefore, it is crucial to build a suitable model to capture the aforementioned features present in the data.
Models that simultaneously deal with overdispersion and data hierarchy in survival analysis are not common. Molenberghs et al. [3] proposed a general framework to model non-Gaussian outcomes, accommodating overdispersion and data hierarchies through two separate sets of random effects. The case of time-to-event data is one specific application under their broad class of models. Molenberghs et al. [4] extended this framework and presented a model for repeated, overdispersed time-to-event data that are also subject to censoring. These authors [4] combined a conjugate random effect for overdispersion with a generalized linear mixed model (GLMM [5,6]), in which normal random effects are embedded within the linear predictor to account for clustering. Two estimation methods, a partial marginalization approach to full maximum likelihood and a pairwise-likelihood version of pseudo-likelihood, were presented and compared, based on a limited simulation study as well as two real data analyses. Molenberghs et al. [4] chose the gamma distribution for the frailty random effect to capture overdispersion. This convenient choice was motivated by the concept of conjugacy. However, convenience does not assure a good fit to the observed data. To overcome this limitation, we propose here a new model that is flexible enough to provide an adequate fit to repeated and overdispersed time-to-event data, while also accommodating right-censoring. The frailty random effect in this paper is assumed to follow a generalized gamma distribution, previously studied by Balakrishnan and Peng [7]. The generalized gamma distribution has one more parameter than the gamma distribution, and is therefore more flexible and less parametric [7]. Moreover, it includes the well-known gamma, lognormal, and Weibull models as special cases.
The rest of this paper is organized as follows. Key ingredients for our model formulation and the new combined model are the subjects of Section 2. Section 3 discusses an estimation method for the proposed model. In Section 4, an extensive simulation study is conducted to investigate the performance of the proposed estimation method and to compare the performance of the proposed model with the existing one [4]. The illustration of the proposed model with two motivating data sets is made in Section 5. Finally, some concluding remarks are made in Section 6.

2. Generalized Gamma Frailty and Normal Random Effects Model

2.1. Generalized Gamma Distribution

The generalized gamma (GG) distribution has been applied in many areas of statistical applications, due mainly to its flexible form. A random variable Y > 0 is said to follow the generalized gamma distribution if its density function is as follows:
$$
f(y; q, \sigma, \lambda) =
\begin{cases}
\dfrac{|q|\,(q^{-2})^{q^{-2}}\,(\lambda y)^{q^{-2}(q/\sigma)}\,\exp\left[-q^{-2}(\lambda y)^{q/\sigma}\right]}{\Gamma(q^{-2})\,\sigma y}, & q \neq 0,\\[2ex]
\left(\sqrt{2\pi}\,\sigma y\right)^{-1}\exp\left\{-\left[\log(\lambda y)\right]^{2}/(2\sigma^{2})\right\}, & q = 0,
\end{cases}
\tag{1}
$$
where $-\infty < q < \infty$ and $\sigma > 0$ are shape parameters and $\lambda > 0$ is a scale parameter. It was first developed by Stacy [8] and a review of the generalized gamma distribution is available in [9]. This distribution was studied by Prentice [10], wherein a parameterization was proposed that allowed the maximum likelihood estimates to be computed using standard algorithms. The application of the generalized gamma distribution in an accelerated failure time model was addressed by Lawless [11]. Yamaguchi [12], Peng et al. [13], and Balakrishnan and Pal [14] investigated the use of the distribution in the context of a cure rate model. Shin et al. [15] proposed a statistical model for speech signals based on the generalized gamma distribution. Cox et al. [16] provided a taxonomy of hazard functions for the generalized gamma distribution, while Balakrishnan and Peng [7] proposed a frailty model using the generalized gamma distribution as the frailty distribution.
The generalized gamma distribution is a broad family of distributions that includes the exponential, gamma, and Weibull distributions as subfamilies, and the lognormal as a limiting distribution. For example, the distribution reduces to the gamma distribution when $q/\sigma = 1$; it is the lognormal distribution when $q = 0$ (the limit of $f(y; q, \sigma, \lambda)$ as $q \to 0$); and it is the Weibull distribution when $q = 1$. Because the generalized gamma distribution offers a considerable amount of flexibility, it can capture features of the data that may be missed when any of its special cases is used instead. The mean of the generalized gamma distribution in (1) exists only when $q > -1/\sigma$, and is given by
$$
\frac{\Gamma(q^{-2} + \sigma/q)}{\Gamma(q^{-2})}\,\frac{(q^{2})^{\sigma/q}}{\lambda}. \tag{2}
$$
If the mean is set to one, then
$$
\lambda = \frac{\Gamma(q^{-2} + \sigma/q)}{\Gamma(q^{-2})}\,(q^{2})^{\sigma/q}. \tag{3}
$$
Substituting (3) into (1), we also obtain the variance as
$$
\frac{\Gamma(q^{-2} + 2\sigma/q)\,\Gamma(q^{-2})}{\Gamma^{2}(q^{-2} + \sigma/q)} - 1. \tag{4}
$$
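The density, the unit-mean scale, and the variance above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and are not taken from the authors' R code.

```python
import math


def gg_unit_mean_scale(q, sigma):
    """Scale lambda that sets the GG mean to one (needs q != 0 and q > -1/sigma)."""
    k = q ** -2
    return math.exp(math.lgamma(k + sigma / q) - math.lgamma(k)) * (q * q) ** (sigma / q)


def gg_pdf(y, q, sigma, lam):
    """Generalized gamma density at y > 0 in the (q, sigma, lambda) parameterization."""
    if q == 0:  # lognormal limit
        z = math.log(lam * y)
        return math.exp(-z * z / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma * y)
    k = q ** -2
    u = (lam * y) ** (q / sigma)
    # Work on the log scale to avoid overflow in the gamma function.
    log_f = (math.log(abs(q)) + k * math.log(k) + k * math.log(u)
             - k * u - math.lgamma(k) - math.log(sigma * y))
    return math.exp(log_f)


def gg_unit_mean_variance(q, sigma):
    """Variance when the mean is one; assumes k + 2*sigma/q > 0 (second moment exists)."""
    k = q ** -2
    return math.exp(math.lgamma(k + 2.0 * sigma / q) + math.lgamma(k)
                    - 2.0 * math.lgamma(k + sigma / q)) - 1.0
```

In the gamma special case $q = \sigma$, the variance expression simplifies to $q^{2}$, so a unit-mean gamma frailty with variance 0.5 would correspond to $q = \sigma = \sqrt{0.5}$ under this parameterization.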

2.2. Weibull-Generalized Gamma–Normal Model

A random variable Y follows an exponential family of distributions if its density is of the form
$$
f(y) = f(y \mid \eta, \phi) = \exp\left\{\phi^{-1}\left[y\eta - \psi(\eta)\right] + c(y, \phi)\right\}, \tag{5}
$$
where $\eta$ is termed the natural parameter, $\phi$ is called the dispersion parameter, and $\psi(\cdot)$ and $c(\cdot)$ are some known functions. In the regression context, the model needs to incorporate measured covariates to explain variability among outcome values, and this is referred to as a generalized linear model. Let $Y_1, \ldots, Y_N$ be a set of independent outcomes, and let $x_1, \ldots, x_N$ represent the corresponding $p$-dimensional vectors of covariate values. Then, the $Y_i$'s are assumed to have densities $f(y_i \mid \eta_i, \phi)$ that belong to the exponential family in (5), while the means $\mu_i$ are modeled as functions of the covariates. It is assumed that $\mu_i = h(x_i'\xi)$, with $\xi$ a vector of $p$ fixed unknown regression coefficients, and $h^{-1}(\cdot)$ is called the link function. In most applications, the natural link function is used, i.e., $h(\cdot) = \psi'(\cdot)$ with $\mu_i = h(x_i'\xi) = \psi'(x_i'\xi) = \psi'(\eta_i)$. This is equivalent to assuming $\eta_i = x_i'\xi$. As overdispersion is often present in time-to-event data, we can use a two-stage approach to construct an overdispersed model. Consider a distribution for the outcome, given a random effect, $f(y_i \mid \theta_i)$, and assume a distribution for the random effect, $f(\theta_i)$. The marginal distribution is then produced as
$$
f(y_i) = \int f(y_i \mid \theta_i)\, f(\theta_i)\, d\theta_i.
$$
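To make the two-stage construction concrete, the sketch below integrates a frailty out of a Weibull survival function in two ways: by plain Monte Carlo averaging, and through the closed form that is available when the frailty is gamma with mean one. It works with the survival function rather than the density for brevity; the function names and parameter values are illustrative assumptions, and covariates are omitted.

```python
import math
import random


def weibull_conditional_survival(y, theta, lam, rho):
    """S(y | theta) = exp(-lam * theta * y**rho) for a Weibull outcome with frailty theta."""
    return math.exp(-lam * theta * y ** rho)


def weibull_gamma_marginal_survival(y, lam, rho, alpha):
    """Closed form after integrating out theta ~ Gamma(shape=alpha, rate=alpha), mean one."""
    return (1.0 + lam * y ** rho / alpha) ** (-alpha)


def monte_carlo_marginal_survival(y, lam, rho, alpha, draws=20000, seed=1):
    """Average the conditional survival over simulated gamma frailties."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        theta = rng.gammavariate(alpha, 1.0 / alpha)  # arguments are (shape, scale)
        total += weibull_conditional_survival(y, theta, lam, rho)
    return total / draws
```

The closed form is what conjugacy buys in the gamma case; for a frailty distribution without such a closed form, only the Monte Carlo route (or another numerical integration scheme) remains.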
If $f(y_i \mid \theta_i)$ follows a Weibull distribution, and $f(\theta_i)$ follows a gamma distribution, then we obtain a Weibull–Gamma model. The choice of the gamma distribution for the random effect is motivated by the concept of conjugacy, so that the marginal distribution has a closed-form expression. Although the gamma distribution is convenient to use, this convenience does not assure a good fit. The generalized gamma distribution has one more parameter than the gamma distribution, and so can be more flexible, enabling better modelling of the data. This can be generalized to clustered data in survival analysis, where $Y_{ij}$ denotes the $j$th outcome value in cluster $i$, for $i = 1, 2, 3, \ldots, N$ and $j = 1, 2, \ldots, n_i$. On the other hand, it is possible to include normal random effects in the linear predictor of the generalized linear model, giving rise to the family known as the generalized linear mixed model. Conditional on $q$-dimensional random effects $b_i \sim N(0, D)$, the outcomes $Y_{ij}$ are independent, with exponential-family densities of the form
$$
f_i(y_{ij} \mid b_i, \xi, \phi) = \exp\left\{\phi^{-1}\left[y_{ij}\lambda_{ij} - \psi(\lambda_{ij})\right] + c(y_{ij}, \phi)\right\},
$$
with
$$
\eta\left[\psi'(\lambda_{ij})\right] = \eta(\mu_{ij}) = \eta\left[E(Y_{ij} \mid b_i, \xi)\right] = x_{ij}'\xi + z_{ij}'b_i,
$$
where $x_{ij}$ and $z_{ij}$ are $p$-dimensional and $q$-dimensional vectors of known covariate values, $\eta(\cdot)$ is a known link function, $\xi$ is a $p$-dimensional vector of unknown fixed regression coefficients, and $\phi$ is a scale (overdispersion) parameter. $f(b_i \mid D)$ is the density of the $N(0, D)$ distribution for the random effects $b_i$. Now, the combination of overdispersion and normal random effects leads to the following combined model:
$$
f_i(y_{ij} \mid b_i, \xi, \theta_{ij}, \phi) = \exp\left\{\phi^{-1}\left[y_{ij}\lambda_{ij} - \psi(\lambda_{ij})\right] + c(y_{ij}, \phi)\right\}.
$$
The conditional mean is $E(Y_{ij} \mid b_i, \xi, \theta_{ij}) = \mu_{ij}^{c} = \theta_{ij} k_{ij}$, where $\theta_{ij} \sim G_{ij}(\nu_{ij}, \sigma_{ij}^{2})$ and $k_{ij} = g(x_{ij}'\xi + z_{ij}'b_i)$. The relationship between the mean and the natural parameter is now $\lambda_{ij} = h(\mu_{ij}^{c}) = h(\theta_{ij} k_{ij})$. The random effects $\theta_{ij}$ capture overdispersion, and $k_{ij}$ can be considered as the generalized linear mixed model component that captures the between-cluster effect. In the present work, we consider the general Weibull model for repeated measures, with both generalized gamma and normal random effects, of the following hierarchical form:
$$
f(y_i \mid \theta_i, b_i) = \prod_{j=1}^{n_i} \lambda \rho\, \theta_{ij}\, y_{ij}^{\rho - 1} \exp\left(x_{ij}'\xi + z_{ij}'b_i\right) \exp\left(-\lambda\, y_{ij}^{\rho}\, \theta_{ij}\, e^{x_{ij}'\xi + z_{ij}'b_i}\right),
$$
$$
f(b_i) = \frac{1}{(2\pi)^{q/2}\, |D|^{1/2}}\; e^{-\frac{1}{2} b_i' D^{-1} b_i},
$$
$$
f(\theta_i) = \prod_{j=1}^{n_i} \frac{|q|\,(q^{-2})^{q^{-2}}\,(p\,\theta_{ij})^{q^{-2}(q/\sigma)}\,\exp\left[-q^{-2}(p\,\theta_{ij})^{q/\sigma}\right]}{\Gamma(q^{-2})\,\sigma\,\theta_{ij}}.
$$
To make the parameters in the model identifiable, the mean of the generalized gamma distribution is usually set to one. As the mean of the generalized gamma distribution is $\dfrac{\Gamma(q^{-2} + \sigma/q)}{\Gamma(q^{-2})}\,\dfrac{(q^{2})^{\sigma/q}}{p}$, setting the mean to one gives $p = \dfrac{\Gamma(q^{-2} + \sigma/q)}{\Gamma(q^{-2})}\,(q^{2})^{\sigma/q}$.
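The hierarchical form can be read as a recipe for generating data. The sketch below draws one cluster under simplifying assumptions that are not from the source: a scalar random intercept with variance `d`, a single binary covariate `treatment`, and $q > 0$. The frailty sampler uses the fact that $W = q^{-2}(p\,\theta)^{q/\sigma}$ follows a gamma distribution with shape $q^{-2}$ and unit scale; all names are hypothetical.

```python
import math
import random


def draw_gg_unit_mean(rng, q, sigma):
    """One generalized gamma frailty with mean one (sketch assumes q > 0)."""
    k = q ** -2
    p = math.exp(math.lgamma(k + sigma / q) - math.lgamma(k)) * (q * q) ** (sigma / q)
    w = rng.gammavariate(k, 1.0)            # W ~ Gamma(shape k, scale 1)
    return (q * q * w) ** (sigma / q) / p   # invert W = q^{-2} (p * theta)^{q/sigma}


def simulate_cluster(rng, n_i, treatment, xi0, xi1, d, lam, rho, q, sigma):
    """Event times for one subject: shared b_i, a fresh frailty per observation."""
    b_i = rng.gauss(0.0, math.sqrt(d))
    times = []
    for _ in range(n_i):
        theta = draw_gg_unit_mean(rng, q, sigma)
        rate = lam * theta * math.exp(xi0 + xi1 * treatment + b_i)
        u = 1.0 - rng.random()              # uniform on (0, 1]
        # Invert the conditional survival exp(-rate * y**rho) = u.
        times.append((-math.log(u) / rate) ** (1.0 / rho))
    return times
```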

3. Likelihood Function and Estimation Method

In a frailty model, one of the most common methods for estimating the unknown parameters is the maximum likelihood method, as used, for example, by Balakrishnan and Peng [7]. In this paper, we use the marginal likelihood approach as a direct way of finding the maximum likelihood estimates of the unknown parameters in the combined generalized gamma and normal random effect model. One difficulty in the parameter estimation, however, is that the likelihood function involves integrals that render it intractable. We therefore approximate the integrals in the likelihood function directly and then maximize the approximation to obtain the maximum likelihood estimates of the parameters of interest. The likelihood contribution of subject i (cluster i) is
$$
f_i(y_i \mid \vartheta, D, \vartheta_i, \Sigma_i) = \int \prod_{j=1}^{n_i} f_{ij}(y_{ij} \mid \vartheta, b_i, \theta_i)\, f(b_i \mid D)\, f(\theta_i \mid \vartheta_i, \Sigma_i)\, db_i\, d\theta_i,
$$
while the likelihood function is given by
$$
L(\vartheta, D, \vartheta_i, \Sigma_i) = \prod_{i=1}^{N} \int \prod_{j=1}^{n_i} f_{ij}(y_{ij} \mid \vartheta, b_i, \theta_i)\, f(b_i \mid D)\, f(\theta_i \mid \vartheta_i, \Sigma_i)\, db_i\, d\theta_i.
$$
A numerical method, with the use of Monte Carlo simulation, is implemented to approximate the integral $I_i$ in the likelihood function, where
$$
I_i = \int \prod_{j=1}^{n_i} f_{ij}(y_{ij} \mid \vartheta, b_i, \theta_i)\, f(b_i \mid D)\, f(\theta_i \mid \vartheta_i, \Sigma_i)\, db_i\, d\theta_i,
$$
and it is approximated as
$$
I_i = E_{\theta_i} E_{b_i} \prod_{j=1}^{n_i} f_{ij}(y_{ij} \mid \vartheta, b_i, \theta_i) \approx \frac{1}{N_1 \times N_2} \sum_{k_1=1}^{N_1} \sum_{k_2=1}^{N_2} \prod_{j=1}^{n_i} f_{ij}\left(y_{ij} \mid \vartheta, b^{(k_1)}, \theta^{(k_2)}\right),
$$
where $b^{(k_1)}$, $k_1 = 1, 2, \ldots, N_1$, and $\theta^{(k_2)}$, $k_2 = 1, 2, \ldots, N_2$, are realizations of normal random variables and generalized gamma random variables, respectively. Here, we take $N_1 = N_2 = 100$, which is sufficient to provide the necessary accuracy while saving computational time. Now, to accommodate right-censoring in the data, for each $j$, we integrate the conditional distribution over the time interval $[C_{ij}, +\infty)$:
$$
S(C_{ij} \mid \theta_{ij}, b_i) = \int_{C_{ij}}^{\infty} f_{ij}(y_{ij} \mid b_i, \theta_i)\, dy_{ij}.
$$
For the case of the Weibull–GG–Normal model, we have
$$
S(C_{ij} \mid \theta_{ij}, b_i) = \int_{C_{ij}}^{\infty} \lambda \rho\, \theta_{ij}\, y_{ij}^{\rho - 1} \exp\left(x_{ij}'\xi + z_{ij}'b_i\right) \exp\left(-\lambda\, y_{ij}^{\rho}\, \theta_{ij}\, e^{x_{ij}'\xi + z_{ij}'b_i}\right) dy_{ij} = \exp\left(-\lambda\, e^{x_{ij}'\xi + z_{ij}'b_i}\, \theta_{ij}\, C_{ij}^{\rho}\right).
$$
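The closed form for the Weibull–GG–Normal survival function follows from a change of variables. The short derivation below is a sketch; the shorthand $a_{ij}$ is introduced here only for illustration and is not part of the original notation.

$$
a_{ij} = \lambda\, \theta_{ij}\, e^{x_{ij}'\xi + z_{ij}'b_i}, \qquad u = a_{ij}\, y_{ij}^{\rho}, \qquad du = a_{ij}\, \rho\, y_{ij}^{\rho - 1}\, dy_{ij},
$$
$$
S(C_{ij} \mid \theta_{ij}, b_i) = \int_{C_{ij}}^{\infty} a_{ij}\, \rho\, y_{ij}^{\rho - 1} \exp\left(-a_{ij}\, y_{ij}^{\rho}\right) dy_{ij} = \int_{a_{ij} C_{ij}^{\rho}}^{\infty} e^{-u}\, du = \exp\left(-a_{ij}\, C_{ij}^{\rho}\right).
$$

A right-censored observation therefore contributes $\exp(-a_{ij} C_{ij}^{\rho})$ to the conditional likelihood, in place of the density term.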
The maximum likelihood estimates of the parameters are then obtained numerically through the quasi-Newton–Raphson algorithm. The standard errors are also obtained numerically from this approximation.
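The Monte Carlo approximation of $I_i$ with right-censoring can be sketched as follows. This is not the authors' R implementation: it assumes a scalar random intercept with variance `d`, a single binary covariate `treatment`, $q > 0$, and a fixed seed so that the same draws are reused whenever the function is re-evaluated during optimization. All function and argument names are hypothetical.

```python
import math
import random


def draw_gg_unit_mean(rng, q, sigma):
    """One generalized gamma frailty with mean one (sketch assumes q > 0)."""
    k = q ** -2
    p = math.exp(math.lgamma(k + sigma / q) - math.lgamma(k)) * (q * q) ** (sigma / q)
    return (q * q * rng.gammavariate(k, 1.0)) ** (sigma / q) / p


def log_term(y, status, theta, lin_pred, lam, rho):
    """Log density for an event (status 1), log survival if right-censored (status 0)."""
    cum_hazard = lam * theta * math.exp(lin_pred) * y ** rho
    if status == 1:
        return (math.log(lam * rho * theta) + (rho - 1.0) * math.log(y)
                + lin_pred - cum_hazard)
    return -cum_hazard


def mc_cluster_likelihood(times, status, treatment, xi0, xi1, d, lam, rho, q, sigma,
                          n1=100, n2=100, seed=0):
    """Average the conditional likelihood over n1 normal and n2 frailty draws."""
    rng = random.Random(seed)
    b_draws = [rng.gauss(0.0, math.sqrt(d)) for _ in range(n1)]
    # Each frailty draw is a vector with one component per observation in the cluster.
    theta_draws = [[draw_gg_unit_mean(rng, q, sigma) for _ in times] for _ in range(n2)]
    total = 0.0
    for b in b_draws:
        lin_pred = xi0 + xi1 * treatment + b
        for thetas in theta_draws:
            total += math.exp(sum(log_term(y, s, th, lin_pred, lam, rho)
                                  for y, s, th in zip(times, status, thetas)))
    return total / (n1 * n2)
```

Summing the logarithm of this quantity over clusters gives an approximate log-likelihood that a numerical optimizer can then maximize.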

4. Simulation Study and Model Discrimination

An extensive simulation study has been conducted in order to evaluate the performance of the proposed model and the associated estimation method. In the simulation study, four different numbers of clusters have been considered to examine the impact of sample size on the proposed model: $n = 30$, $n = 100$, $n = 200$, and $n = 400$. Each cluster can be treated as one subject, and the number of observations for one subject was generated randomly from a normal distribution $N(\mu = 8, \sigma^{2} = 4)$. Furthermore, 10%, 25%, and 50% censoring rates have been introduced through a Bernoulli distribution (with $\pi = 0.9$, 0.75, and 0.5). True values of the treatment effects used in the simulation study are chosen to be $\epsilon_0 = 2$ and $\epsilon_1 = 0.1$. The parameter in the normal distribution to model the cluster effect has been set as $d = 0.5$. Gamma, Weibull, and lognormal distributions with mean 1 and variance 0.5 have been used to model overdispersion in the simulated data sets. The gamma model is in fact the model studied in [4], so we can compare the performance of the GG model with the existing work in [4]. We generated 500 random samples for each scenario, and then fitted the three special distributions along with the generalized gamma distribution of the proposed model to these simulated data sets.
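The design elements described above (a random number of observations per subject and Bernoulli censoring indicators) can be sketched as follows. The source does not state how the normal draw is turned into a count or how censoring times are produced, so the rounding, the floor of one observation, and the restriction to indicators only are assumptions made here for illustration.

```python
import random


def draw_cluster_size(rng, mean=8.0, variance=4.0):
    """Observations per subject from N(8, 4); rounding and the floor of 1 are assumptions."""
    return max(1, round(rng.gauss(mean, variance ** 0.5)))


def draw_event_indicators(rng, n_obs, pi_event):
    """status = 1 (event observed) with probability pi_event, otherwise 0 (censored)."""
    return [1 if rng.random() < pi_event else 0 for _ in range(n_obs)]
```

With `pi_event = 0.75`, about a quarter of the indicators are zero, which matches the 25% censoring scenario.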
Table 1 presents the bias and mean square errors (MSEs) (×100) of the estimated treatment effect, normal random effect, and frailty variance when the data are simulated with a 10% censoring rate. For the estimated treatment effect, all models perform well, even when the fitted frailty distribution differs from the true one. It is interesting to note that the MSEs are the smallest when the gamma frailty model is fitted, which may depend on the parameter values that have been chosen. These values may fall in a range in which the three models are close, so that even if the true distribution is lognormal or Weibull, the gamma model provides an adequate fit and is robust. Also, in most cases, the MSEs are the second smallest when the generalized gamma frailty is fitted. When the number of clusters increases from 30 to 400, the MSEs tend to become smaller. For the estimated normal random effect, models with Weibull and generalized gamma frailty have smaller bias than models with gamma and lognormal frailty. The bias and MSEs are the smallest when the generalized gamma model is fitted. Again, as the number of clusters increases, the bias and MSEs tend to become smaller. There are noticeable differences among the models considered in estimating the variance of the frailty distribution. When the fitted model matches the true model, the bias and MSEs are quite small. However, the lognormal frailty model tends to have very large bias and MSE when estimating the variance, except when the true frailty distribution is lognormal. This shows that the Weibull–lognormal–normal model is not robust.
The results for the scenario in which the true frailty distribution is gamma and the fitted model is gamma are extracted and presented in Table 2 for comparison with the results from [4]. The results presented in Table 2 are bias and mean standard errors (mean s.e.). We can see that our estimation method provides better estimates than the pseudo-likelihood method of [4].
Model discrimination is motivated by the fact that the generalized gamma distribution encompasses some commonly used distributions as special cases. By suitably choosing the scale and shape parameters from this broad family of distributions, we can fit an appropriate model to a data set. The model discrimination study allows us to investigate how often the true model gets selected and the others get rejected, considering the generality and flexibility of the generalized gamma distribution. The model discrimination is carried out here using Akaike's information criterion (AIC) to discriminate among the candidate models. AIC is given by $-2\hat{l} + 2p$, where $\hat{l}$ is the maximized log-likelihood value and $p$ is the number of model parameters to be estimated. The model with the minimum value of AIC is taken as the model that best describes the data. The number of times that the correct model gets chosen and the number of times that incorrect models get selected have been computed.
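The selection rule can be written in a few lines. In this sketch, the function names and the log-likelihood values in the usage note are made up for illustration and are not results from the paper.

```python
def aic(max_loglik, n_params):
    """Akaike's information criterion: -2 * maximized log-likelihood + 2 * parameter count."""
    return -2.0 * max_loglik + 2.0 * n_params


def select_by_aic(fits):
    """fits maps a model name to (maximized log-likelihood, number of parameters)."""
    scores = {name: aic(loglik, p) for name, (loglik, p) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

For instance, with hypothetical fits `{"gamma": (-520.3, 5), "GG": (-519.9, 6)}`, the gamma model is selected: the generalized gamma has a slightly higher log-likelihood, but not by enough to offset the penalty for its extra parameter.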
Table 3 presents the selection rates based on AIC for the cases with a 10% censoring rate. From Table 3, it can be seen that the selection rate of the correct model ranges from 31.2% to 40.0% if the true distribution is gamma; from 54.4% to 67.8% if the true distribution is lognormal; and from 49.6% to 62.2% if the true distribution is Weibull. It can also be seen that as the sample size increases, the selection rates for the correct model also increase, as expected. We observe that the generalized gamma always has the highest selection rate, which is also close to the correct selection rates. For example, when the true model is gamma and the sample size is 100, the selection rate of the generalized gamma is 47%. This 47% comprises the 31.2% of the time in which the correct gamma model is selected and the remaining 15.8% of the time in which some other member of the generalized gamma family is selected.
If one is interested in testing the validity of one model within the generalized gamma family, a likelihood ratio test can be performed. However, that will only test the validity of one model, and would not address the problem of selecting the best model.
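A likelihood ratio test of a single submodel against the generalized gamma model could look like the sketch below. It assumes that the submodel imposes one restriction (for example, $q = \sigma$ for the gamma case) and that the usual chi-square approximation with one degree of freedom applies; both are assumptions made here rather than statements from the source, and the function name is hypothetical.

```python
import math


def likelihood_ratio_test_1df(loglik_full, loglik_restricted):
    """Return the LR statistic and its p-value under a chi-square(1) reference."""
    stat = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```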

5. Illustrative Real-Life Data Analyses

In this section, we will illustrate the performance of the proposed model with two real data sets from the survival analysis literature.

5.1. Asthma Study

Asthma is claimed to be the most common chronic condition in children globally, and is occurring more and more frequently in very young children (between 6 and 24 months). It is associated with accelerated loss of lung function [17]. It is therefore very important to prevent asthma in young children. The data of interest are the recurrent asthma attacks in children, studied by Duchateau and Janssen [1], and are available in the R package parfm 2.7.8 implemented in R 4.2.1. In this study, a clinical trial is conducted with young children at higher risk of developing asthma to examine the effectiveness of a new application of an existing anti-allergic drug in preventing it. There are 232 such children in the study and their follow-up extends to 600 days; they are randomized to either the treatment group (new application of the drug) or the placebo group (standard application of the drug). A child can develop more than one asthma attack, so intermittent events are ordered in time and clustered within one child. A calendar time format is chosen to represent the data, where the at-risk period for a particular event is the time from the end of the previous event (asthma attack) to the start of the next event (start of the next asthma attack). During the course of the study, the start and end dates of different at-risk periods are recorded for each patient, and these at-risk periods are separated either by an asthmatic event or by a period in which the patient is not under observation. The goal of the study is then to investigate whether the new application of the anti-allergic drug has an effect on reducing the frequency and length of the asthma attacks in children. As an example, the data for the first two patients are presented in Table 4. The first column contains the patient identification number, and the second column contains the drug information (placebo (drug = 0) or drug (drug = 1)).
The third and fourth columns give the start and the end (in days) of the risk period, and the fifth column gives the censoring status, taking value one (status = 1) if the patient experiences an asthma attack at the end time of the risk period and zero (status = 0) otherwise.
In this section, various frailty distributions under the proposed generalized gamma frailty and normal random effects framework are fitted for the time-to-event data introduced above. Our aim is to determine the impact of different frailty distributions on estimating the treatment effects, and to find a model that provides the best fit to these data.
We first consider an exponential model by setting $\rho = 1$ in the model of the form in (18). The predictor has the form $\kappa_{ij} = \xi_0 + \xi_1 T_i + b_i$, where $T_i$ takes on the value 1 if the patient is in the treatment group and 0 if in the control group, and $b_i \sim N(0, d)$. We fit the data with the generalized gamma frailty and normal random effect model and its three special cases: gamma, lognormal, and Weibull frailty and normal random effect models. The obtained results are reported in Table 5. Wald tests have been carried out to assess the estimated treatment effects and the corresponding results are presented in Table 6. It is seen that the estimated treatment effects $\xi_1$ are reasonably close to each other. The generalized gamma, which contains the lognormal, has the smallest AIC. This suggests that using the generalized gamma can avoid the use of a wrong model, as the generalized gamma could direct us effectively to the correct model. From Table 6, the p-values appear to be around 0.14, which suggests that the treatment effect is identifiable in all of the four fitted models.
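For reference, a two-sided Wald test of a single coefficient can be computed as in the sketch below. It assumes the estimate divided by its standard error is approximately standard normal; the function name is hypothetical, and the numbers in the test are not taken from Table 6.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value
```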
Next, we take censoring into consideration when fitting all four models to the asthma data. The corresponding results are presented in Table 7. In both Table 5 and Table 7, it is important to note that the standard errors of the two parameters, ξ 0 and λ , are quite large. This could be the result of overdispersion being present in the data. In order to address this issue, it would make sense to set the scale parameter λ to be 1. We then re-fit all four models, and the corresponding results are shown in Table 8 and Table 9, for data without and with censoring, respectively.
From Table 8 and Table 9, it is evident that the standard errors are now lower and reasonable. Wald tests have again been performed to test the significance of the estimated treatment effects from all of the models based on the results in Table 8 and Table 9. Wald tests suggest that the treatment effects are still identifiable in models both with and without censoring. From Table 10, we observe that estimated treatment effects and the standard deviation of the normal random effects are similar in the four models.

5.2. Bladder Study

The model proposed here is further illustrated by the bladder data set, which is available from the survival package in R [18,19]. The study was conducted by the Veterans Administration Cooperative Urological Research Group. This study involved patients with superficial bladder tumors at the start of the trial. Following the transurethral removal of tumors, the patients were randomly treated by placebo, thiotepa, or pyridoxine. Most patients experienced repeated tumor recurrences and new tumors were removed at each follow-up. The recurrence times of tumors were recorded. The subset of 85 patients who were assigned to either thiotepa or placebo has been used for analysis here. The exponential model (taking $\rho = 1$) with generalized gamma frailty and its special cases are fitted to the data. The corresponding results are presented in Table 11. It is observed that the model with generalized gamma frailty has the smallest AIC, which suggests once again the usefulness of the generalized gamma frailty in picking the best model in a flexible and efficient way.

6. Discussion and Conclusions

Building on the work of Molenberghs et al. [4], in this article, we have extended the combined model of gamma frailty and normal random effects for repeated survival data by using a generalized gamma frailty distribution. Particular attention has been given to Weibull models for overdispersed, clustered time-to-event outcomes, with generalized gamma and normal random effects. Censoring has also been accommodated in the data. Model selection and goodness-of-fit are critical issues in frailty models for time-to-event data. Due to the latent nature of frailties in modelling, it is often challenging to determine an appropriate frailty distribution for a given data set. Balakrishnan and Peng [7] proposed the generalized gamma distribution in a frailty model context, thereby significantly improving the fit of the frailty model to the observed data. The generalized gamma distribution is a flexible frailty distribution with two parameters besides the scale parameter, and it includes gamma, lognormal, and Weibull as special cases. The integration of the generalized gamma distribution into the combined model can therefore provide more flexibility for modelling clustered time-to-event data.
The combined Weibull–gamma–normal model proposed by Molenberghs et al. [4] enjoys the so-called strong conjugacy property, and closed-form expressions for the marginal distribution can be derived conveniently and then used in the maximum likelihood estimation. The computational challenge faced in considering the generalized gamma distribution as the frailty distribution is handled by using a Monte Carlo approximation for the integrals involved in the likelihood function. The performance of the proposed model has been evaluated through an extensive simulation study. The estimates of the treatment effect and normal random effect coefficients are seen to be accurate. As the sample size increases, the bias and MSEs of the estimates decrease. In addition, we observe that a lower censoring rate leads to more accurate estimation of the parameters in the proposed model. Model discrimination is also carried out between the Weibull-generalized gamma–normal model and its three special cases: the Weibull–gamma–normal, Weibull–lognormal–normal, and Weibull–Weibull–normal models. Data sets are simulated from the three special-case models, and the four candidate models are then fitted to each data set. The model selection has been done based on AIC. From our analysis, we observe that the selection rates of the correct model increase as the sample size increases. The selection rate of the correct submodel (e.g., gamma, Weibull, or lognormal) is rather low, ranging from 31.2% to 67.8%, because the selection rate is a function of the original parameter values and the sample size. If the original parameter values are chosen such that two models are very close, then the probability of correct selection will naturally be lower. In such cases, the model fit, even though it is based on the wrong model, will be similar.
Finally, the data set on recurrent asthma attacks in children and the recurrent bladder data set are used to illustrate the proposed Weibull-generalized gamma–normal random effect model. The proposed model is flexible enough to provide the best fit for both the recurrent asthma data set and the recurrent bladder data set.
In our model formulation, the generalized gamma and normal random effects capture overdispersion and between-subject association. Although the proposed model is flexible and widely applicable for univariate time-to-event outcomes, it is possible to generalize to bivariate or multilevel repeated time-to-event settings. Furthermore, regarding censoring, we have confined the analysis to right-censored outcomes. However, the methodology can be extended with no trouble to left-censoring and interval censoring. All computations have been implemented in R. Readers who are interested in the proposed model can request a copy of all data sets and software codes from the authors.

Author Contributions

Conceptualization, N.B.; methodology, K.L., Y.Q.W., X.Z., and N.B.; software, K.L.; validation, Y.Q.W.; formal analysis, K.L. and Y.Q.W.; data curation, K.L. and Y.Q.W.; writing—original draft preparation, K.L. and Y.Q.W.; writing—review and editing, X.Z. and N.B.; supervision, N.B.; project administration, Y.Q.W.; funding acquisition, K.L., X.Z., and N.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China—Young Scientists Fund grant number 12201410, Natural Sciences and Engineering Research Council of Canada—Individual Discovery Grant and Natural Science Foundation of Jiangsu Province, China–The Excellent Young Scholar Programme grant number BK20220098.

Data Availability Statement

The data that support the findings of this study are available in R parfm and survival packages.

Acknowledgments

The authors would like to thank the anonymous reviewers for their insightful suggestions, which greatly improved the quality of this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviation

The following abbreviation is used in this manuscript:
GG: generalized gamma

References

  1. Duchateau, L.; Janssen, P. The Frailty Model; Springer: New York, NY, USA, 2007.
  2. Hinde, J.; Demetrio, C. Overdispersion: Models and estimation. Comput. Stat. Data Anal. 1998, 27, 151–170.
  3. Molenberghs, G.; Verbeke, G.; Demetrio, C. A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat. Sci. 2010, 25, 325–347.
  4. Molenberghs, G.; Verbeke, G.; Efendi, A.; Braekers, R.; Demetrio, C. A combined gamma frailty and normal random-effects model for repeated, overdispersed time-to-event data. Stat. Methods Med. Res. 2015, 24, 434–452.
  5. Engel, B.; Keen, A. A simple approach for the analysis of generalized linear mixed models. Stat. Neerl. 1994, 48, 1–22.
  6. Breslow, N.; Clayton, D. Approximate inference in generalized linear mixed models. J. Am. Stat. Assoc. 1993, 88, 9–25.
  7. Balakrishnan, N.; Peng, Y. Generalized gamma frailty model. Stat. Med. 2006, 25, 2797–2816.
  8. Stacy, E. A generalization of the gamma distribution. Ann. Math. Stat. 1962, 33, 1187–1192.
  9. Johnson, N.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 1995; Volume 2.
  10. Prentice, R. A log gamma model and its maximum likelihood estimation. Biometrika 1974, 61, 539–544.
  11. Lawless, J. Statistical Models and Methods for Lifetime Data, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2003.
  12. Yamaguchi, K. Accelerated failure-time regression models with a regression model of surviving fraction: An application to the analysis of ‘permanent employment’ in Japan. J. Am. Stat. Assoc. 1992, 87, 284–292.
  13. Peng, Y.; Dear, K.; Denham, J. A generalized F mixture model for cure rate estimation. Stat. Med. 1998, 17, 813–830.
  14. Balakrishnan, N.; Pal, S. An EM algorithm for the estimation of parameters of a flexible cure rate model with generalized gamma lifetime and model discrimination using likelihood- and information-based methods. Comput. Stat. 2015, 30, 151–189.
  15. Shin, J.; Chang, J.; Kim, N. Statistical modeling of speech signals based on generalized gamma distribution. IEEE Signal Process. Lett. 2005, 12, 258–261.
  16. Cox, C.; Chu, H.; Schneider, M.; Muñoz, A. Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution. Stat. Med. 2007, 26, 4352–4374.
  17. Asher, I.; Pearce, N. Global burden of asthma among children. Int. J. Tuberc. Lung Dis. 2014, 18, 1269–1278.
  18. Andrews, D.F.; Herzberg, A.M. Data: A Collection of Problems from Many Fields for the Student and Research Worker; Springer: New York, NY, USA, 2012.
  19. Wei, L.J.; Lin, D.Y.; Weissfeld, L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J. Am. Stat. Assoc. 1989, 84, 1065–1073.
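Table 1. MSEs and bias (×100) of estimated treatment effect ξ₁, normal random effect d, and frailty variance ν with 10% censoring rate. Column pairs give the bias and MSE under each fitted model.

| n | True model | Parameter | Gamma Bias | Gamma MSE | Lognormal Bias | Lognormal MSE | Weibull Bias | Weibull MSE | GG Bias | GG MSE |
|---|---|---|---|---|---|---|---|---|---|---|
| 30 | Gamma | ξ₁ = 0.1 | 2.78 | 1.55 | 0.27 | 7.54 | 0.12 | 7.19 | 2.26 | 2.95 |
| 30 | Gamma | ν = 0.5 | −0.61 | 0.10 | 66.57 | 68.57 | −7.97 | 2.08 | 6.63 | 5.52 |
| 30 | Gamma | d = 0.5 | 0.49 | 1.33 | −5.51 | 3.30 | −6.09 | 3.06 | −2.00 | 2.03 |
| 30 | Lognormal | ξ₁ = 0.1 | 1.81 | 2.15 | −0.17 | 6.69 | −0.57 | 7.41 | 1.85 | 4.48 |
| 30 | Lognormal | ν = 0.5 | −2.29 | 0.13 | −0.60 | 5.38 | −28.62 | 8.94 | −11.78 | 4.04 |
| 30 | Lognormal | d = 0.5 | −4.67 | 1.82 | −4.08 | 2.17 | −5.64 | 3.41 | −2.21 | 2.46 |
| 30 | Weibull | ξ₁ = 0.1 | 4.47 | 1.54 | −1.79 | 8.49 | −0.99 | 7.29 | 1.15 | 2.59 |
| 30 | Weibull | ν = 0.5 | 0.67 | 0.14 | 128.56 | 241.43 | 4.11 | 2.93 | 16.29 | 19.13 |
| 30 | Weibull | d = 0.5 | 4.35 | 1.44 | −3.41 | 4.25 | −5.16 | 3.26 | −1.09 | 2.27 |
| 100 | Gamma | ξ₁ = 0.1 | 0.47 | 0.19 | −0.27 | 2.15 | −0.32 | 1.70 | 0.20 | 1.25 |
| 100 | Gamma | ν = 0.5 | −0.65 | 0.20 | 65.68 | 50.46 | −7.50 | 0.97 | 9.87 | 9.16 |
| 100 | Gamma | d = 0.5 | 3.13 | 0.49 | −0.17 | 0.71 | −0.42 | 0.54 | −0.18 | 0.42 |
| 100 | Lognormal | ξ₁ = 0.1 | 1.78 | 0.20 | 0.62 | 1.47 | 0.79 | 1.83 | 0.68 | 1.47 |
| 100 | Lognormal | ν = 0.5 | −5.71 | 1.18 | 1.58 | 1.25 | −27.59 | 7.83 | −8.47 | 8.31 |
| 100 | Lognormal | d = 0.5 | 2.80 | 0.53 | −0.76 | 0.40 | −0.66 | 0.49 | −0.86 | 0.44 |
| 100 | Weibull | ξ₁ = 0.1 | 0.44 | 0.22 | 0.35 | 2.59 | 0.86 | 2.02 | 0.66 | 1.49 |
| 100 | Weibull | ν = 0.5 | 0.71 | 0.09 | 123.45 | 171.49 | 3.21 | 0.80 | 14.46 | 15.26 |
| 100 | Weibull | d = 0.5 | 3.86 | 0.81 | 3.22 | 0.92 | 1.39 | 0.53 | 1.20 | 0.46 |
| 200 | Gamma | ξ₁ = 0.1 | 0.58 | 0.23 | 0.50 | 1.20 | 0.27 | 0.99 | 0.11 | 0.56 |
| 200 | Gamma | ν = 0.5 | −0.03 | 0.08 | 64.03 | 44.18 | −7.37 | 0.80 | 6.40 | 5.74 |
| 200 | Gamma | d = 0.5 | 2.57 | 0.33 | 1.00 | 0.33 | 0.88 | 0.24 | 0.98 | 0.20 |
| 200 | Lognormal | ξ₁ = 0.1 | 1.86 | 0.12 | 0.25 | 0.75 | −0.29 | 0.98 | 0.07 | 0.85 |
| 200 | Lognormal | ν = 0.5 | −5.65 | 1.09 | 0.21 | 0.65 | −27.90 | 7.89 | −7.84 | 3.79 |
| 200 | Lognormal | d = 0.5 | 3.12 | 0.52 | 0.04 | 0.18 | 0.18 | 0.25 | −0.02 | 0.23 |
| 200 | Weibull | ξ₁ = 0.1 | 0.18 | 0.13 | 0.69 | 1.51 | 1.09 | 0.90 | 0.93 | 0.69 |
| 200 | Weibull | ν = 0.5 | 0.70 | 0.03 | 123.48 | 165.46 | 3.05 | 0.56 | 8.50 | 7.64 |
| 200 | Weibull | d = 0.5 | 3.52 | 0.85 | 3.50 | 0.71 | 1.16 | 0.28 | 1.23 | 0.27 |
| 400 | Gamma | ξ₁ = 0.1 | 0.86 | 0.06 | 0.13 | 0.63 | 0.13 | 0.52 | 0.50 | 0.23 |
| 400 | Gamma | ν = 0.5 | 0.12 | 0.06 | 62.72 | 41.13 | −7.61 | 0.74 | 4.92 | 4.47 |
| 400 | Gamma | d = 0.5 | 2.50 | 0.25 | 1.57 | 0.19 | 0.86 | 0.13 | 1.27 | 0.12 |
| 400 | Lognormal | ξ₁ = 0.1 | 2.16 | 0.11 | 0.52 | 0.32 | 0.21 | 0.46 | 0.51 | 0.36 |
| 400 | Lognormal | ν = 0.5 | −4.75 | 0.87 | 0.55 | 0.31 | −27.60 | 7.67 | −4.92 | 3.02 |
| 400 | Lognormal | d = 0.5 | 2.52 | 0.31 | 0.69 | 0.08 | 0.56 | 0.11 | 0.66 | 0.10 |
| 400 | Weibull | ξ₁ = 0.1 | 0.24 | 0.09 | 0.39 | 0.93 | 0.47 | 0.52 | 0.41 | 0.36 |
| 400 | Weibull | ν = 0.5 | 0.73 | 0.04 | 124.71 | 161.58 | 3.78 | 0.45 | 6.53 | 4.13 |
| 400 | Weibull | d = 0.5 | 4.09 | 1.13 | 5.67 | 0.74 | 1.99 | 0.24 | 1.70 | 0.17 |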
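As a reading aid for Table 1, the sketch below shows how bias and MSE are conventionally computed from the estimates obtained across simulation replicates. It is an illustrative Python helper rather than the authors' R code; the function name is hypothetical, and applying the ×100 scaling to both quantities is an assumption about how the caption should be read.

```python
from statistics import fmean


def bias_and_mse(estimates, true_value, scale=100.0):
    """Return (scaled bias, scaled MSE) of replicate estimates of one parameter.

    bias = mean(estimate - truth), MSE = mean((estimate - truth)**2).
    `scale` multiplies both outputs (assumption: the table's x100 applies to both).
    """
    errors = [est - true_value for est in estimates]
    bias = fmean(errors)
    mse = fmean([e * e for e in errors])
    return scale * bias, scale * mse


# Made-up replicate estimates of a parameter whose true value is 0.1.
example_bias, example_mse = bias_and_mse([0.1, 0.2, 0.3], true_value=0.1)
```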
Table 2. Comparison of results from existing work and method proposed in this work. The fitted model is Gamma in all cases.

| n | True model | Parameter | Pseudo-likelihood [4]: Bias | Pseudo-likelihood [4]: Mean s.e. | MC-based likelihood: Bias | MC-based likelihood: Mean s.e. |
|---|---|---|---|---|---|---|
| 100 | Gamma | ξ₁ = 0.1 | −0.1076 | 0.2224 | 0.0047 | 0.138 |
| 100 | Gamma | d = 0.5 | 1.3588 | 0.1256 | 0.0313 | 0.0134 |
| 200 | Gamma | ξ₁ = 0.1 | −0.0985 | 0.1499 | 0.0058 | 0.0991 |
| 200 | Gamma | d = 0.5 | 1.1035 | 0.0861 | 0.0257 | 0.0075 |
Table 3. Selection rates (selection rates of the correct model in bold) based on AIC with 10% censoring rate. Columns give the fitted model.

| n | True model | Gamma | Lognormal | Weibull | GG* | GG |
|---|---|---|---|---|---|---|
| 30 | Gamma | 0.572 | 0.086 | 0.074 | 0.268 | 0.840 |
| 30 | Lognormal | 0.234 | 0.318 | 0.200 | 0.248 | 0.566 |
| 30 | Weibull | 0.484 | 0.070 | 0.144 | 0.302 | 0.446 |
| 100 | Gamma | 0.312 | 0.202 | 0.328 | 0.158 | 0.470 |
| 100 | Lognormal | 0.078 | 0.544 | 0.306 | 0.072 | 0.616 |
| 100 | Weibull | 0.262 | 0.092 | 0.496 | 0.150 | 0.646 |
| 200 | Gamma | 0.400 | 0.138 | 0.276 | 0.186 | 0.586 |
| 200 | Lognormal | 0.048 | 0.614 | 0.284 | 0.054 | 0.668 |
| 200 | Weibull | 0.290 | 0.040 | 0.514 | 0.156 | 0.670 |
| 400 | Gamma | 0.348 | 0.102 | 0.314 | 0.236 | 0.584 |
| 400 | Lognormal | 0.062 | 0.678 | 0.188 | 0.072 | 0.750 |
| 400 | Weibull | 0.192 | 0.020 | 0.622 | 0.166 | 0.788 |
GG*: generalized gamma excluding special cases.
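The selection rates in Table 3 can be read as the proportion of simulation replicates in which each candidate model attains the smallest AIC. The Python sketch below illustrates that tally under this assumption; the function name and the AIC values in the example are made up for illustration and are not taken from the study. In the reported table, the GG column appears to equal the rate of the true special case plus the GG* rate (for example, 0.572 + 0.268 = 0.840 in the first row).

```python
from collections import Counter


def selection_rates(aic_by_replicate):
    """Fraction of replicates in which each model has the smallest AIC.

    aic_by_replicate: list of dicts mapping model name -> AIC for one replicate.
    Ties are resolved in favour of the model listed first in the dict.
    """
    winners = Counter(min(aics, key=aics.get) for aics in aic_by_replicate)
    n_rep = len(aic_by_replicate)
    return {model: winners[model] / n_rep for model in aic_by_replicate[0]}


# Four hypothetical replicates with made-up AIC values.
replicates = [
    {"Gamma": 101.2, "Lognormal": 103.0, "Weibull": 102.5, "GG*": 102.0},
    {"Gamma": 98.7, "Lognormal": 97.9, "Weibull": 99.1, "GG*": 99.5},
    {"Gamma": 110.4, "Lognormal": 111.0, "Weibull": 112.3, "GG*": 110.9},
    {"Gamma": 105.0, "Lognormal": 106.1, "Weibull": 104.2, "GG*": 104.8},
]
rates = selection_rates(replicates)
```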
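Table 4. Recurrent attacks data for the first two patients.

| Patient ID | Drug | Begin | End | Status |
|---|---|---|---|---|
| 1 | 0 | 0 | 15 | 1 |
| 1 | 0 | 22 | 90 | 1 |
| 1 | 0 | 96 | 325 | 1 |
| 1 | 0 | 329 | 332 | 1 |
| 1 | 0 | 338 | 369 | 1 |
| 1 | 0 | 370 | 412 | 1 |
| 1 | 0 | 418 | 422 | 1 |
| 1 | 0 | 426 | 474 | 1 |
| 1 | 0 | 477 | 526 | 1 |
| 1 | 0 | 530 | 600 | 0 |
| 2 | 1 | 0 | 180 | 1 |
| 2 | 1 | 189 | 267 | 1 |
| 2 | 1 | 273 | 581 | 1 |
| 2 | 1 | 582 | 600 | 0 |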
Table 5. Asthma study without censoring: estimates (standard errors) of parameters from the four fitted frailty models with normal random effects.

| Parameter | Gamma | Lognormal | Weibull | GG |
|---|---|---|---|---|
| Intercept ξ₀ | −3.995 (0.501) | −4.009 (15.681) | −4.011 (8.922) | −4.009 (15.681) |
| Treatment effect ξ₁ | −0.082 (0.084) | −0.084 (0.084) | −0.092 (0.085) | −0.084 (0.084) |
| Scale parameter λ | 0.814 (0.404) | 0.878 (13.771) | 0.804 (7.170) | 0.878 (13.771) |
| Shape parameter α* | 6.995 (0.001) | - | - | - |
| Shape parameter q | - | - | - | 0 |
| Shape parameter σ | - | 0.462 (0.057) | 3.693 (0.608) | 0.462 (0.057) |
| SD random effect d | 0.472 (0.041) | 0.460 (0.041) | 0.486 (0.042) | 0.460 (0.041) |
| logL | −9314.007 | −9310.748 | −9317.422 | −9310.748 |
| AIC | 18,638.014 | 18,631.496 | 18,644.843 | 18,631.496 |
* : α = q 2 for Gamma.
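The AIC row of Table 5 can be checked against the logL row with the standard definition AIC = 2k − 2 logL. The Python sketch below does this check; the parameter count k = 5 is an assumption inferred from the reported values (it reproduces them to within rounding), not a figure stated in the table.

```python
def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * log_lik


# Log-likelihoods as reported in Table 5 (asthma study without censoring).
log_liks = {
    "Gamma": -9314.007,
    "Lognormal": -9310.748,
    "Weibull": -9317.422,
    "GG": -9310.748,
}
N_PARAMS = 5  # assumption: five free parameters in each fitted model

aics = {model: aic(ll, N_PARAMS) for model, ll in log_liks.items()}
# Smallest AIC; Lognormal and GG tie here, and min() keeps the first one listed.
best_model = min(aics, key=aics.get)
```

The computed Weibull value differs from the reported one in the third decimal place, which is consistent with the log-likelihoods having been rounded before tabulation.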
Table 6. Asthma study without censoring: Wald test results for the assessment of treatment effect.

| Model | Z-Value | p-Value |
|---|---|---|
| Exponential–Gamma–Normal | −1.0550 | 0.1457 |
| Exponential–Lognormal–Normal | −1.0024 | 0.1581 |
| Exponential–Weibull–Normal | −1.0873 | 0.1385 |
| Exponential–GG–Normal | −1.0547 | 0.1458 |
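The p-values in Table 6 appear to be one-sided (lower-tail) standard normal probabilities of the reported Z-values. The Python sketch below checks that reading using only the standard library; the one-sided interpretation is an inference from the numbers, not something stated in the table.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Z-values as reported in Table 6.
z_values = {
    "Exponential-Gamma-Normal": -1.0550,
    "Exponential-Lognormal-Normal": -1.0024,
    "Exponential-Weibull-Normal": -1.0873,
    "Exponential-GG-Normal": -1.0547,
}

# Lower-tail probability P(Z <= z), i.e. a one-sided test against a negative effect.
one_sided_p = {model: normal_cdf(z) for model, z in z_values.items()}
```

Under this reading, the computed probabilities agree with the tabulated p-values to about three decimal places; a two-sided test would double them.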
Table 7. Asthma study with censoring: estimates (standard errors) of parameters from the four fitted frailty models with normal random effects.

| Parameter | Gamma | Lognormal | Weibull | GG |
|---|---|---|---|---|
| Intercept ξ₀ | −4.021 (0.070) | −3.988 (13.552) | −4.033 (23.971) | −3.988 (13.552) |
| Treatment effect ξ₁ | −0.112 (0.099) | −0.108 (0.099) | −0.127 (0.101) | −0.108 (0.099) |
| Scale parameter λ | 0.787 (0.0001) | 0.822 (11.140) | 0.780 (18.705) | 0.822 (11.140) |
| Shape parameter α* | 3.836 (0.001) | - | - | - |
| Shape parameter q | - | - | - | 0 |
| Shape parameter σ | - | 0.630 (0.058) | 2.310 (0.256) | 0.630 (0.058) |
| SD random effect d | 0.567 (0.001) | 0.560 (0.050) | 0.561 (0.051) | 0.5601 (0.050) |
| logL | −8326.454 | −8319.916 | −8328.188 | −8319.916 |
| AIC | 16,662.908 | 16,649.832 | 16,666.376 | 16,649.832 |
* : α = q 2 for Gamma.
Table 8. Asthma study without censoring: estimates (standard errors) of parameters from the four fitted frailty models with normal random effects; refitted with scale parameter λ = 1.

| Parameter | Gamma | Lognormal | Weibull | GG |
|---|---|---|---|---|
| Intercept ξ₀ | −4.207 (0.058) | −4.244 (0.069) | −4.158 (0.077) | −4.197 (0.065) |
| Treatment effect ξ₁ | −0.092 (0.082) | −0.081 (0.084) | −0.097 (0.084) | −0.088 (0.086) |
| Shape parameter α* | 7.839 (0.0002) | - | - | - |
| Shape parameter q | - | - | - | −0.510 (0.001) |
| Shape parameter σ | - | 3.715 (0.609) | 0.440 (0.059) | 0.461 (0.009) |
| SD random effect d | 0.473 (0.0004) | 0.482 (0.040) | 0.461 (0.040) | 0.473 (0.043) |
| logL | −9313.029 | −9315.605 | −9314.543 | −9312.541 |
| AIC | 18,634.058 | 18,639.210 | 18,637.086 | 18,635.082 |
* : α = q 2 for Gamma.
Table 9. Asthma study with censoring: estimates (standard errors) of parameters from the four fitted frailty models with normal random effects; refitted with scale parameter λ = 1.

| Parameter | Gamma | Lognormal | Weibull | GG |
|---|---|---|---|---|
| Intercept ξ₀ | −4.258 (0.089) | −4.184 (0.090) | −4.282 (0.082) | −4.184 (0.090) |
| Treatment effect ξ₁ | −0.112 (0.113) | −0.108 (0.099) | −0.127 (0.101) | −0.108 (0.099) |
| Shape parameter α* | 3.5634 (0.0016) | - | - | - |
| Shape parameter q | - | - | - | 0 |
| Shape parameter σ | - | 0.630 (0.058) | 2.310 (0.256) | 0.630 (0.058) |
| SD random effect d | 0.562 (0.007) | 0.560 (0.050) | 0.561 (0.051) | 0.560 (0.050) |
| logL | −8324.160 | −8319.916 | −8328.188 | −8319.916 |
| AIC | 16,656.320 | 16,647.832 | 16,664.376 | 16,647.832 |
* : α = q 2 for Gamma.
Table 10. Asthma study: Wald test results for the assessment of treatment effect; refitted with scale parameter λ = 1.

| Model | Z-Value | p-Value |
|---|---|---|
| Exponential–Gamma–Normal without censoring | −1.1262 | 0.1300 |
| Exponential–Lognormal–Normal without censoring | −1.1567 | 0.1237 |
| Exponential–Weibull–Normal without censoring | −0.9667 | 0.1669 |
| Exponential–GG–Normal without censoring | −1.0315 | 0.1512 |
| Exponential–Gamma–Normal with censoring | −0.9885 | 0.1615 |
| Exponential–Lognormal–Normal with censoring | −1.0983 | 0.1360 |
| Exponential–Weibull–Normal with censoring | −1.2637 | 0.1032 |
| Exponential–GG–Normal with censoring | −1.1079 | 0.1340 |
Table 11. Bladder study: estimates (standard errors) of parameters from the four fitted frailty models with normal random effects with censoring.

| Parameter | Gamma | Lognormal | Weibull | GG |
|---|---|---|---|---|
| Treatment effect ξ₁ | −0.349 (0.125) | −0.316 (0.323) | −0.533 (0.337) | −0.533 (0.337) |
| Scale parameter λ | 0.187 (0.012) | 0.051 (0.028) | 0.072 (0.037) | 0.072 (0.037) |
| Shape parameter α* | 0.528 (0.002) | - | - | - |
| Shape parameter q | - | - | - | - |
| Shape parameter σ | - | 0.132 (1.010) | 4.307 (8.419) | 4.307 (8.419) |
| SD random effect d | 0.240 (0.013) | 1.069 (0.200) | 1.057 (0.206) | 1.057 (0.206) |
| logL | −449.5437 | −441.8936 | −441.6807 | −441.6807 |
| AIC | | 891.7872 | 891.3614 | 891.3614 |
* : α = q 2 for Gamma.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, K.; Wang, Y.Q.; Zhu, X.; Balakrishnan, N. Generalized Gamma Frailty and Symmetric Normal Random Effects Model for Repeated Time-to-Event Data. Symmetry 2025, 17, 1760. https://doi.org/10.3390/sym17101760
