Article

Bayesian Estimation for α-Mixture Survival Models

1
Department of Statistics and Actuarial Science, Northern Illinois University, DeKalb, IL 60115, USA
2
Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA
3
Department of Mathematics and Statistics, University of North Florida, Jacksonville, FL 32224, USA
*
Author to whom correspondence should be addressed.
Mathematics 2026, 14(10), 1772; https://doi.org/10.3390/math14101772
Submission received: 24 February 2026 / Revised: 16 April 2026 / Accepted: 24 April 2026 / Published: 21 May 2026

Abstract

Heterogeneity in survival data poses substantial challenges for identifying appropriate mixture structures. The α-mixture family provides a flexible class of survival models that generalizes standard mixture formulations through a continuous weighting parameter, allowing it to balance failure rates and distributional shapes. Despite its theoretical appeal, Bayesian inference for α-mixture survival models has received limited attention. In this paper, we develop a Bayesian framework for α-mixture survival models, with particular emphasis on estimation and structural identification. Posterior inference is conducted using Markov chain Monte Carlo methods, and simulation studies demonstrate accurate recovery of model parameters across a range of heterogeneous survival settings. The posterior distribution of the mixing parameter α offers a principled mechanism for model selection by identifying the mixture structure most consistent with the observed data. Applications to real-world datasets illustrate the interpretability and practical utility of the proposed approach in survival analysis.

1. Introduction

Survival analysis is a statistical approach used to analyze time-to-event data, which measure the time until a specific event occurs, such as disease progression, treatment response, or death [1]. This approach is widely used in biomedical research, where predicting survival probabilities and time-to-event outcomes is crucial for understanding patient prognosis and evaluating treatments [2].
Traditional survival analysis methods often use the Kaplan–Meier estimator to estimate survival functions and the Cox proportional hazards model to assess the relationship between predictor variables and event risk over time [3]. However, these models have limitations when dealing with more complex data, such as censored observations, time-varying covariate effects, and population heterogeneity.
Bayesian survival analysis extends traditional methods by combining prior information with observed data in survival models. This approach is especially useful when sample sizes are small or when there is substantial uncertainty in the data [4]. Bayesian frameworks use prior distributions to combine existing knowledge or expert opinion with observed data, resulting in more flexible and robust models [5]. Additionally, Bayesian methods naturally quantify the uncertainty in parameter estimates, which is crucial for precise predictions in fields such as personalized medicine [6].
Traditional survival models often assume that populations are homogeneous, but in reality, different groups may exhibit distinct survival patterns. Mixture models address this issue by allowing researchers to model diverse populations with varying survival outcomes [7]. These models include finite mixture models, latent class models, and nonparametric mixture models, each serving a specific role in handling data complexity.
Bayesian mixture-based survival modeling has also received increasing attention beyond classical parametric formulations. For instance, Kottas [8] developed a nonparametric Bayesian survival model using mixtures of Weibull distributions, while Iorio et al. [9] proposed a Bayesian nonparametric approach for survival regression under nonproportional hazards. Other related work includes Dirichlet process mixture models for survival outcome data [10], full Bayesian inference for hazard mixture models [11], and Bayesian nonparametric Erlang mixture modeling for survival analysis [12]. Although these studies demonstrate the flexibility of Bayesian mixture methods for modeling heterogeneous survival data, they do not address structural selection among arithmetic, geometric, harmonic, and intermediate mixture forms through a single mixing parameter. This gap motivates the present Bayesian treatment of the α-mixture survival model.
Finite mixture models divide a population into a finite number of subgroups, making it easier to identify distinct survival patterns within each group [7]. Latent class models extend this idea by grouping individuals into unobserved categories based on shared characteristics, thereby revealing hidden structures within the data [13]. Nonparametric mixture models, on the other hand, offer greater flexibility by avoiding strict parametric assumptions and enabling a more data-driven approach to identifying underlying survival distributions [14]. Together, these models improve the precision of survival analysis and provide a deeper understanding of the factors that influence survival outcomes.
Among the different types of mixture models, arithmetic, geometric, and harmonic mixtures stand out [15,16]. An arithmetic mixture model takes a weighted average of the individual component distributions and is the most common approach. In contrast, geometric and harmonic mixtures use alternative forms of aggregation, allowing them to capture distinctive data features such as skewness or heavy tails [17,18].
In a recent study, Asadi et al. [19] introduced the α-mixture model, which generalizes conventional mixture models in survival analysis using a single mixing power parameter α. The framework includes both survival mixtures and failure-rate mixtures as special cases, allowing for a more comprehensive approach to modeling heterogeneous survival data. The main idea is that different values of α correspond to different types of mixtures. The α-mixture model therefore creates a continuous bridge among the classical mixture types. Detailed properties of the α-mixture, including the role of specific values of α, can be found in [19].
While most existing studies on α-mixture models have concentrated on their stochastic properties, such as comparisons, aging characteristics, and bending behavior (e.g., Shojaee et al. [20,21], Barmalzan et al. [22], Shojaee and Momeni [23]), the development of statistical inference methods within a Bayesian framework remains relatively underexplored.
A critical component of survival mixture modeling is model selection, which determines the model that best fits the data. This task becomes especially important when comparing different types of mixture models. Traditional criteria such as AIC, BIC, DIC, WAIC, and Bayes factors evaluate model adequacy based on likelihood comparisons and assume fixed model structures. In contrast, the α-mixture model extends model selection to the structural level by introducing a parameter that generalizes classical mixture formulations. This parameter allows the model to capture heterogeneity more flexibly and improve overall fit. Through its posterior distribution, the α-mixture model identifies the mixture form most consistent with the observed data. This unified formulation removes the need to fit multiple separate models and allows model selection and parameter estimation to occur simultaneously within a coherent Bayesian setting. For model assessment, we use the logarithm of the pseudo-marginal likelihood (LPML), a leave-one-out cross-validated criterion based on the conditional predictive ordinate [24].
The remainder of this paper is organized as follows. Section 2 presents the mathematical formulation of the α-mixture model and the Bayesian estimation procedure. Section 3 illustrates the approach through three examples involving different mixture combinations. Section 4 presents the results of the simulation studies. Section 5 analyzes three datasets with α-mixture models. Section 6 concludes the study and discusses the results, implications, and directions for future research.

2. Model

Suppose that $S_j(\cdot)$ represents the survival function for component $j$, where $j = 1, \ldots, m$. Let $t_i$ denote the survival time for subject $i$, and let $p_j$ be the mixing proportion for component $j$. Let $\alpha$ denote the mixing power, and let $\Omega$ denote the set of all parameters involved in the model. Following [19], the survival function of the α-mixture model is defined as
$$
S_\alpha(t_i \mid \Omega) =
\begin{cases}
\left\{ \sum_{j=1}^{m} p_j S_j^{\alpha}(t_i \mid \theta_j) \right\}^{1/\alpha}, & \alpha \neq 0, \\
\prod_{j=1}^{m} S_j^{p_j}(t_i \mid \theta_j), & \alpha = 0,
\end{cases}
\tag{1}
$$
where $\theta_j$ represents the parameters of component $j$ of the mixture. Note that $\lim_{\alpha \to 0} S_\alpha(t_i \mid \Omega) = \prod_{j=1}^{m} S_j^{p_j}(t_i \mid \theta_j)$, as shown in [19]. Let $h_j(\cdot)$ denote the failure rate for component $j$. The corresponding probability density function (pdf) for subject $i$ is given by
$$
f_\alpha(t_i \mid \Omega) =
\begin{cases}
\left\{ \sum_{j=1}^{m} p_j S_j^{\alpha}(t_i \mid \theta_j) \right\}^{1/\alpha - 1} \sum_{j=1}^{m} p_j S_j^{\alpha}(t_i \mid \theta_j)\, h_j(t_i \mid \theta_j), & \alpha \neq 0, \\
\prod_{j=1}^{m} S_j^{p_j}(t_i \mid \theta_j) \sum_{j=1}^{m} p_j h_j(t_i \mid \theta_j), & \alpha = 0.
\end{cases}
\tag{2}
$$
To avoid label-switching and identifiability issues, we follow [25] and consider non-scalable survival functions $S_j(\cdot)$. Further, the constraint $\sum_{j=1}^{m} p_j = 1$, together with an ordering constraint on $p_1, \ldots, p_m$, avoids the identifiability issue between $\alpha$ and $p_j$, $j = 1, \ldots, m$.
For right-censored data, let $T_i$ denote the event time and $C_i$ the censoring time for subject $i = 1, \ldots, n$. Then, the observed survival time $t_i$ and the censoring indicator $\delta_i$ can be expressed, respectively, as
$$
t_i = \min(T_i, C_i), \qquad \delta_i = I(T_i \le C_i),
$$
where $I(\cdot)$ denotes the indicator function. Let $\mathbf{t} = (t_1, \ldots, t_n)$ denote the vector of observed survival times, and let $\boldsymbol{\delta} = (\delta_1, \ldots, \delta_n)$ denote the vector of censoring indicators. Then, the observed-data likelihood is given by
$$
f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega)
= \prod_{i=1}^{n} \left\{ f_\alpha(t_i \mid \Omega) \right\}^{\delta_i} \left\{ S_\alpha(t_i \mid \Omega) \right\}^{1 - \delta_i}
= \prod_{i=1}^{n} S_\alpha(t_i \mid \Omega) \left\{ h_\alpha(t_i \mid \Omega) \right\}^{\delta_i},
$$
where $h_\alpha(t_i \mid \Omega) = f_\alpha(t_i \mid \Omega) / S_\alpha(t_i \mid \Omega)$ denotes the failure rate of the α-mixture. Thus, if $\delta_i = 1$, the contribution is the density $f_\alpha(t_i \mid \Omega)$, whereas if $\delta_i = 0$, the contribution is the survival function $S_\alpha(t_i \mid \Omega)$.
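The survival function, failure rate, and right-censored likelihood above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the components are passed in as plain callables, and the two Weibull components at the end are placeholder values chosen only to exercise the code.

```python
import math

def alpha_mix_survival(t, alpha, p, surv_fns):
    """S_alpha(t): weighted power mean of the component survival functions."""
    s = [S(t) for S in surv_fns]
    if alpha == 0:  # geometric limit
        return math.exp(sum(pj * math.log(sj) for pj, sj in zip(p, s)))
    return sum(pj * sj ** alpha for pj, sj in zip(p, s)) ** (1.0 / alpha)

def alpha_mix_hazard(t, alpha, p, surv_fns, haz_fns):
    """h_alpha(t) = f_alpha(t) / S_alpha(t), a weighted average of the h_j."""
    s = [S(t) for S in surv_fns]
    h = [H(t) for H in haz_fns]
    w = [pj * sj ** alpha for pj, sj in zip(p, s)]  # alpha = 0 gives w_j = p_j
    return sum(wj * hj for wj, hj in zip(w, h)) / sum(w)

def log_likelihood(times, deltas, alpha, p, surv_fns, haz_fns):
    """Right-censored log-likelihood: sum of log S_alpha + delta_i * log h_alpha."""
    total = 0.0
    for t, d in zip(times, deltas):
        total += math.log(alpha_mix_survival(t, alpha, p, surv_fns))
        if d == 1:
            total += math.log(alpha_mix_hazard(t, alpha, p, surv_fns, haz_fns))
    return total

# Placeholder components: two Weibulls with shape 2 and scales 1 and 0.5.
surv_fns = [lambda t: math.exp(-(t / 1.0) ** 2), lambda t: math.exp(-(t / 0.5) ** 2)]
haz_fns = [lambda t: 2 * t / 1.0 ** 2, lambda t: 2 * t / 0.5 ** 2]
ll = log_likelihood([0.3, 0.8, 0.5], [1, 0, 1], 0.5, [0.5, 0.5], surv_fns, haz_fns)
```

Writing the likelihood as $S_\alpha \, h_\alpha^{\delta_i}$ keeps the censored and uncensored cases in one expression. For long follow-up times the raw powers can underflow, so a log-scale version would be preferable in practice.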
Equation (2) presents the pdf of the α-mixture model, and different values of $\alpha$ lead to different types of mixtures. For example, in the absence of censoring, it follows from (2) that:
(a) when $\alpha = 1$, (2) gives the pdf of the arithmetic mixture,
$$
f_\alpha(t_i \mid \Omega) = \sum_{j=1}^{m} p_j S_j(t_i \mid \theta_j)\, h_j(t_i \mid \theta_j), \quad i = 1, \ldots, n;
$$
(b) when $\alpha = 0$, (2) gives the pdf of the geometric mixture,
$$
f_\alpha(t_i \mid \Omega) = \prod_{j=1}^{m} S_j^{p_j}(t_i \mid \theta_j) \sum_{j=1}^{m} p_j h_j(t_i \mid \theta_j), \quad i = 1, \ldots, n;
$$
(c) and, when $\alpha = -1$, (2) gives the pdf of the harmonic mixture,
$$
f_\alpha(t_i \mid \Omega) = \left\{ \sum_{j=1}^{m} p_j S_j^{-1}(t_i \mid \theta_j) \right\}^{-2} \sum_{j=1}^{m} p_j S_j^{-1}(t_i \mid \theta_j)\, h_j(t_i \mid \theta_j), \quad i = 1, \ldots, n.
$$
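The three special cases can be checked numerically at a single time point, since the mixture survival function in (1) is a weighted power mean of the component survival probabilities. The sketch below uses made-up probabilities (0.8 and 0.3) and a hypothetical helper name.

```python
import math

def power_mean(s, p, alpha):
    """Weighted power mean of survival probabilities s with weights p."""
    if alpha == 0:
        return math.prod(sj ** pj for sj, pj in zip(s, p))
    return sum(pj * sj ** alpha for sj, pj in zip(s, p)) ** (1.0 / alpha)

s = [0.8, 0.3]   # hypothetical component survival probabilities at one time point
p = [0.5, 0.5]

arithmetic = power_mean(s, p, 1)    # 0.5 * 0.8 + 0.5 * 0.3
geometric = power_mean(s, p, 0)     # sqrt(0.8 * 0.3)
harmonic = power_mean(s, p, -1)     # 1 / (0.5 / 0.8 + 0.5 / 0.3)
```

By the power-mean inequality the three values are ordered harmonic ≤ geometric ≤ arithmetic, so for fixed components and weights a larger $\alpha$ gives a larger mixture survival probability at every time point.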
To ensure a proper joint posterior distribution, we assign independent proper prior distributions to all parameters in the model. Because the mixing power $\alpha$ takes real values, we assign it a Normal prior distribution. The mixing proportions $\mathbf{p} = (p_1, \ldots, p_m)$ represent the membership probabilities of the mixing components, so we use a Dirichlet prior distribution for them. For the parameters $\theta_j$ of component $j$, we assign prior distributions suited to its survival function. For example, we assign Normal prior distributions for Lognormal components and Gamma prior distributions for Weibull components.
By Bayes' rule, the joint posterior distribution of all parameters given the survival times is proportional to the product of the likelihood and the prior distributions. For Monte Carlo-based inference, we collect Markov chain Monte Carlo samples through Gibbs sampling using the full conditional distributions. For example, suppose that we assign the following prior distributions for the α-mixture with $m$ components and a single positive parameter in each component:
$$
\alpha \sim \text{Normal}(\mu_\alpha, \sigma_\alpha^2), \qquad \mathbf{p} \sim \text{Dirichlet}(c_1, \ldots, c_m), \qquad \theta_j \sim \text{Gamma}(A_j, B_j), \quad j = 1, \ldots, m,
$$
for the hyperparameters $\mu_\alpha$, $\sigma_\alpha^2$, $c_j$, and $A_j$, $B_j$, $j = 1, \ldots, m$. Then the full conditional distributions are proportional to
$$
\begin{aligned}
p(\alpha \mid \cdot) &\propto \prod_{i=1}^{n} f_\alpha(t_i \mid \Omega) \exp\left\{ -\frac{1}{2\sigma_\alpha^2} (\alpha - \mu_\alpha)^2 \right\}, \\
p(\mathbf{p} \mid \cdot) &\propto \prod_{i=1}^{n} f_\alpha(t_i \mid \Omega) \prod_{j=1}^{m} p_j^{c_j - 1}, \\
p(\theta_j \mid \cdot) &\propto \prod_{i=1}^{n} f_\alpha(t_i \mid \Omega)\, \theta_j^{A_j - 1} \exp\left( -\frac{\theta_j}{B_j} \right).
\end{aligned}
$$
Note that estimation in the α-mixture model covers not only the parameters of each mixture component and the membership probabilities but also the mixing power $\alpha$. The Bayes estimator of the mixing power, that is, the posterior mean of $\alpha$, indicates the appropriate type of mixture through (1) and (2).
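Each full conditional above is the joint posterior viewed as a function of one parameter block, so a single unnormalized log-posterior function is enough to drive every update. The following sketch assumes the Gamma prior is parameterized by shape $A_j$ and scale $B_j$ (an assumption about the notation); the function name, the `hyper` dictionary keys, and the `log_lik` callable are all hypothetical.

```python
import math

def log_posterior_kernel(log_lik, alpha, p, thetas, hyper):
    """Unnormalized log joint posterior: log-likelihood plus log prior kernels."""
    if any(pj <= 0.0 for pj in p) or any(th <= 0.0 for th in thetas):
        return -math.inf                      # outside the prior support
    value = log_lik(alpha, p, thetas)
    # Normal(mu_alpha, var_alpha) kernel for the mixing power
    value += -0.5 * (alpha - hyper["mu_alpha"]) ** 2 / hyper["var_alpha"]
    # Dirichlet(c_1, ..., c_m) kernel for the mixing proportions
    value += sum((cj - 1.0) * math.log(pj) for cj, pj in zip(hyper["c"], p))
    # Gamma(A_j, B_j) kernels (shape, scale assumed) for the component parameters
    value += sum((A - 1.0) * math.log(th) - th / B
                 for A, B, th in zip(hyper["A"], hyper["B"], thetas))
    return value

hyper = {"mu_alpha": 0.0, "var_alpha": 10.0, "c": (1.0, 1.0),
         "A": (1.0, 1.0), "B": (1.0, 1.0)}
flat = lambda alpha, p, thetas: 0.0           # stand-in likelihood for illustration
value = log_posterior_kernel(flat, 2.0, (0.5, 0.5), (1.0, 1.0), hyper)
```

In a Metropolis–Hastings update of one block, the difference of this function at the proposed and current values gives the log target ratio, because all terms that do not involve the updated block cancel.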
We update $\alpha$, the mixture weights $\mathbf{p}$, and $\theta_j$ or $\mu_j$ using a Metropolis–Hastings-within-Gibbs algorithm. At iteration $k$, proposal values are generated as follows:
$$
\alpha^{(k)} \sim N\left( \alpha^{(k-1)}, 1 \right), \qquad \mathbf{p}^{(k)} \sim \text{Dirichlet}(\mathbf{c}), \qquad \theta_j^{(k)} \sim \text{Gamma}(a_j, b_j),
$$
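where $\mathbf{c}$ is a vector of concentration parameters, and the Gamma proposal distribution is parameterized by
$$
a_j = \frac{\left\{ \theta_j^{(k-1)} \right\}^2}{10^{-2}}, \qquad b_j = \frac{10^{-2}}{\theta_j^{(k-1)}},
$$
so that, with shape $a_j$ and scale $b_j$, the proposal has mean $\theta_j^{(k-1)}$ and variance $10^{-2}$.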
For mixtures of Lognormal components, the location parameters $\mu_j$ are updated using a Normal proposal,
$$
\mu_j^{(k)} \sim N\left( \mu_j^{(k-1)}, \sigma_\mu^2 \right),
$$
rather than the Gamma proposal used for $\theta_j$.
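A single Metropolis–Hastings update with a Gamma proposal centered at the current value might look as follows. This is a sketch under stated assumptions: the proposal is taken to have mean $\theta_j^{(k-1)}$ and variance $10^{-2}$, the acceptance ratio includes the Hastings correction because the Gamma proposal is not symmetric (the text does not spell out the ratio), and the toy target at the end is invented purely to exercise the step.

```python
import math
import random

def gamma_proposal_params(theta_prev, var=1e-2):
    """Shape and scale of a Gamma with mean theta_prev and variance var."""
    return theta_prev ** 2 / var, var / theta_prev

def log_gamma_pdf(x, shape, scale):
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def mh_gamma_step(theta_prev, log_target, rng, var=1e-2):
    """One Metropolis-Hastings update of a positive parameter."""
    a, b = gamma_proposal_params(theta_prev, var)
    prop = rng.gammavariate(a, b)            # random.gammavariate takes (shape, scale)
    a_rev, b_rev = gamma_proposal_params(prop, var)
    log_ratio = (log_target(prop) - log_target(theta_prev)
                 + log_gamma_pdf(theta_prev, a_rev, b_rev)   # q(prev | prop)
                 - log_gamma_pdf(prop, a, b))                # q(prop | prev)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return prop, True
    return theta_prev, False

# Toy target: Gamma(shape 400, scale 0.005) kernel, mean 2 and standard deviation 0.1.
def log_target(x):
    return 399.0 * math.log(x) - 200.0 * x

rng = random.Random(2026)
theta, draws, accepted = 2.0, [], 0
for _ in range(5000):
    theta, ok = mh_gamma_step(theta, log_target, rng)
    draws.append(theta)
    accepted += ok
```

The Normal proposals for $\alpha$ and $\mu_j$ are symmetric random walks, so the two proposal-density terms cancel and only the target ratio remains in their acceptance probabilities.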
We present three examples involving different distributions in the next section.

3. Examples

3.1. Example 1: Weibull–Weibull Mixture Model

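In this first example, we consider a mixture model with two Weibull distributions with the shape parameters fixed at 2. The survival function for subject $i$ is
$$
S_j(t_i \mid \theta_j) = \exp\left\{ -\left( \frac{t_i}{\theta_j} \right)^2 \right\}, \quad i = 1, \ldots, n, \quad j = 1, 2,
$$
and the corresponding failure rate function is
$$
h_j(t_i \mid \theta_j) = \frac{2 t_i}{\theta_j^2},
$$
where $\theta_j$ is the scale parameter for component $j$. We consider three sample sizes, $n = 100$, $1000$, and $10{,}000$.
With mixing proportions $\mathbf{p} = (p_1, p_2)$, we define the parameter set
$$
\Omega = \{ \theta_1, \theta_2, \mathbf{p}, \alpha \},
$$
which contains all model parameters. Let $\mathbf{t} = \{ t_1, \ldots, t_n \}$ denote the survival times for all subjects. Let $\delta_i = 1$ if the event time is observed and $\delta_i = 0$ if the observation is right-censored. Then, the likelihood function of the α-mixture model is
$$
f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) = \prod_{i=1}^{n}
\begin{cases}
\left[ \left\{ p_1 S_1^{\alpha}(t_i \mid \theta_1) + p_2 S_2^{\alpha}(t_i \mid \theta_2) \right\}^{1/\alpha} \right]^{1 - \delta_i} \left[ \left\{ p_1 S_1^{\alpha}(t_i \mid \theta_1) + p_2 S_2^{\alpha}(t_i \mid \theta_2) \right\}^{1/\alpha - 1} \left\{ p_1 S_1^{\alpha}(t_i \mid \theta_1) \dfrac{2 t_i}{\theta_1^2} + p_2 S_2^{\alpha}(t_i \mid \theta_2) \dfrac{2 t_i}{\theta_2^2} \right\} \right]^{\delta_i}, & \alpha \neq 0, \\
\left\{ S_1^{p_1}(t_i \mid \theta_1)\, S_2^{p_2}(t_i \mid \theta_2) \right\}^{1 - \delta_i} \left[ \exp\left\{ -t_i^2 \left( \dfrac{p_1}{\theta_1^2} + \dfrac{p_2}{\theta_2^2} \right) \right\} \left( \dfrac{2 t_i p_1}{\theta_1^2} + \dfrac{2 t_i p_2}{\theta_2^2} \right) \right]^{\delta_i}, & \alpha = 0,
\end{cases}
$$
where $p_1 + p_2 = 1$.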
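One consequence of the $\alpha = 0$ branch is worth making explicit: the geometric mixture of two shape-2 Weibull components is itself a shape-2 Weibull with scale $(p_1/\theta_1^2 + p_2/\theta_2^2)^{-1/2}$. A short numerical check, with hypothetical function names and illustrative parameter values:

```python
import math

def geometric_weibull_scale(p, thetas):
    """Scale of the single shape-2 Weibull that equals the alpha = 0 mixture."""
    return sum(pj / th ** 2 for pj, th in zip(p, thetas)) ** -0.5

def geometric_mixture_survival(t, p, thetas):
    """Product of S_j(t)^{p_j} for shape-2 Weibull components."""
    return math.prod(math.exp(-(t / th) ** 2) ** pj for pj, th in zip(p, thetas))

p, thetas = (0.5, 0.5), (1.0, 0.5)
scale = geometric_weibull_scale(p, thetas)          # (0.5 + 2.0) ** -0.5
direct = geometric_mixture_survival(0.7, p, thetas)
single = math.exp(-(0.7 / scale) ** 2)              # same value from one Weibull
```

This means that at exactly $\alpha = 0$ the data inform $\theta_1$, $\theta_2$, and $\mathbf{p}$ only through that single combined scale, which is one reason the priors and constraints matter in this example.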
We assume that the failure times are independently distributed across subjects and use a Normal prior for $\alpha$, specified as $\alpha \sim N(\mu_\alpha, \sigma_\alpha^2)$, where $\mu_\alpha = 0$ and $\sigma_\alpha^2 = 10$. A Beta prior, $B(c_1, c_2)$, is assigned to the mixing proportions $p_j$, with $c_1 = c_2 = 1$. For the scale parameters $\theta_j$, we assume Gamma priors, $G(A_j, B_j)$, with $A_j = B_j = 1$.
Applying Bayes' rule, we derive the posterior distributions for the parameters, as follows:
$$
\begin{aligned}
\alpha \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) \exp\left\{ -\frac{1}{2\sigma_\alpha^2} (\alpha - \mu_\alpha)^2 \right\}, \\
\mathbf{p} \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) \prod_{j=1}^{2} p_j^{c_j - 1}, \\
\theta_j \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega)\, \theta_j^{A_j - 1} \exp\left( -\frac{\theta_j}{B_j} \right), \quad j = 1, 2.
\end{aligned}
$$
Because the expression for $S_\alpha(t_i \mid \Omega)$ has a limiting form at $\alpha = 0$, we evaluate the geometric-mixture limit whenever $|\alpha| < \Delta_\alpha$. This is a numerical convention used for stability and for classification of the near-zero case, rather than a literal point mass in the posterior distribution.
In the Markov chain Monte Carlo (MCMC) sampling procedure, we consider the full conditional probability $p(\alpha = 0 \mid \cdot)$. In Equation (1), $\lim_{\alpha \to 0} S_\alpha(t_i \mid \Omega) = \prod_{j=1}^{m} S_j^{p_j}(t_i \mid \theta_j)$, so that $\int_{\alpha \in (-\Delta_\alpha, \Delta_\alpha)} p(\alpha \mid \cdot)\, d\alpha \approx p(\alpha = 0 \mid \cdot)$ as $\Delta_\alpha \to 0$. Hence, in Gibbs sampling, we generate the sample $\alpha$ from its full conditional distribution $p(\alpha \mid \cdot)$ and take it as zero if it lies within $(-\Delta_\alpha, \Delta_\alpha)$ for a small value of $\Delta_\alpha$. Here, we use $\Delta_\alpha = 0.01$.
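The near-zero rule amounts to a one-line mapping applied to each sampled value of $\alpha$. In the sketch below the function name and the list of draws are hypothetical; only the threshold 0.01 comes from the text.

```python
def classify_alpha(alpha, delta=0.01):
    """Map a sampled alpha to 0.0 when it falls in the open interval (-delta, delta)."""
    return 0.0 if abs(alpha) < delta else alpha

draws = [0.004, -0.02, 0.35, -0.009, 1.1]          # hypothetical MCMC draws of alpha
reported = [classify_alpha(a) for a in draws]
share_geometric = sum(a == 0.0 for a in reported) / len(reported)
```

The share of retained draws mapped to zero gives a simple summary of how much posterior mass sits in the neighborhood of the geometric mixture.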

3.2. Example 2: Gamma–Weibull Mixture Model

In this example, we consider a mixture of two different distributions, Gamma and Weibull, and fix the shape parameters at $1$ and $1/2$, respectively. The survival functions for subject $i$ are
$$
S_1(t_i \mid \theta_1) = \exp\left( -\frac{t_i}{\theta_1} \right), \qquad S_2(t_i \mid \theta_2) = \exp\left\{ -\left( \frac{t_i}{\theta_2} \right)^{1/2} \right\}.
$$
The corresponding failure rate functions are
$$
h_1(t_i \mid \theta_1) = \frac{1}{\theta_1}, \qquad h_2(t_i \mid \theta_2) = \frac{1}{2 (t_i \theta_2)^{1/2}}.
$$
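The failure rates above can be verified against the survival functions through the identity $h(t) = -\frac{d}{dt} \log S(t)$. The following sketch does this with a central finite difference; the function names and the evaluation point are illustrative choices.

```python
import math

def surv_exp(t, theta):
    """Gamma with shape 1 (exponential) survival function."""
    return math.exp(-t / theta)

def haz_exp(t, theta):
    return 1.0 / theta

def surv_weib_half(t, theta):
    """Weibull survival function with shape 1/2."""
    return math.exp(-math.sqrt(t / theta))

def haz_weib_half(t, theta):
    return 1.0 / (2.0 * math.sqrt(t * theta))

def numeric_hazard(surv, t, theta, eps=1e-6):
    """Central-difference approximation of -d/dt log S(t)."""
    return -(math.log(surv(t + eps, theta)) - math.log(surv(t - eps, theta))) / (2.0 * eps)

gap_exp = abs(numeric_hazard(surv_exp, 0.7, 1.3) - haz_exp(0.7, 1.3))
gap_weib = abs(numeric_hazard(surv_weib_half, 0.7, 1.3) - haz_weib_half(0.7, 1.3))
```

Note that the shape-1/2 Weibull failure rate is unbounded as $t \to 0$ and decreasing in $t$, while the exponential failure rate is constant, so the two components differ most at early times.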
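With mixing proportions $\mathbf{p} = (p_1, p_2)$, we define the parameter set
$$
\Omega = \{ \theta_1, \theta_2, \mathbf{p}, \alpha \},
$$
which contains all model parameters. Let $\mathbf{t} = \{ t_1, \ldots, t_n \}$ denote the survival times for all subjects. The likelihood function of the α-mixture model is
$$
f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) = \prod_{i=1}^{n}
\begin{cases}
\left[ \left\{ p_1 S_1^{\alpha}(t_i \mid \theta_1) + p_2 S_2^{\alpha}(t_i \mid \theta_2) \right\}^{1/\alpha - 1} \left\{ \dfrac{p_1}{\theta_1} S_1^{\alpha}(t_i \mid \theta_1) + \dfrac{p_2}{2 (t_i \theta_2)^{1/2}} S_2^{\alpha}(t_i \mid \theta_2) \right\} \right]^{\delta_i} \left[ \left\{ p_1 S_1^{\alpha}(t_i \mid \theta_1) + p_2 S_2^{\alpha}(t_i \mid \theta_2) \right\}^{1/\alpha} \right]^{1 - \delta_i}, & \alpha \neq 0, \\
\left[ S_1^{p_1}(t_i \mid \theta_1)\, S_2^{p_2}(t_i \mid \theta_2) \left\{ \dfrac{p_1}{\theta_1} + \dfrac{p_2}{2 (t_i \theta_2)^{1/2}} \right\} \right]^{\delta_i} \left\{ S_1^{p_1}(t_i \mid \theta_1)\, S_2^{p_2}(t_i \mid \theta_2) \right\}^{1 - \delta_i}, & \alpha = 0,
\end{cases}
$$
where $p_1 + p_2 = 1$.
We use the same prior specification as in Example 1 and obtain the posterior distributions for the parameters:
$$
\begin{aligned}
\alpha \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) \exp\left\{ -\frac{1}{2\sigma_\alpha^2} (\alpha - \mu_\alpha)^2 \right\}, \\
\mathbf{p} \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) \prod_{j=1}^{2} p_j^{c_j - 1}, \\
\theta_j \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega)\, \theta_j^{A_j - 1} \exp\left( -\frac{\theta_j}{B_j} \right), \quad j = 1, 2.
\end{aligned}
$$
For the sampling of $\alpha$ in the Gibbs procedure, we use the full conditional distribution $p(\alpha \mid \cdot)$. If the generated value lies in $(-0.01, 0.01)$, we evaluate the limiting geometric-mixture form at $\alpha = 0$ and record the value as zero for reporting purposes. This near-zero rule is a numerical convention, not a literal posterior point mass at zero.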

3.3. Example 3: Lognormal–Lognormal Mixture Model

In this example, we consider a mixture model with two Lognormal distributions and fix the standard deviations at $\sigma_1 = 0.5$ and $\sigma_2 = 0.1$, respectively. The survival functions for subject $i$ are
$$
S_1(t_i \mid \mu_1) = 1 - \Phi\left( \frac{\log(t_i) - \mu_1}{0.5} \right), \qquad S_2(t_i \mid \mu_2) = 1 - \Phi\left( \frac{\log(t_i) - \mu_2}{0.1} \right).
$$
The corresponding failure-rate functions are
$$
h_1(t_i \mid \mu_1) = \frac{t_i^{-1}\, \phi\left( \frac{\log(t_i) - \mu_1}{0.5} \right) / 0.5}{1 - \Phi\left( \frac{\log(t_i) - \mu_1}{0.5} \right)}, \qquad h_2(t_i \mid \mu_2) = \frac{t_i^{-1}\, \phi\left( \frac{\log(t_i) - \mu_2}{0.1} \right) / 0.1}{1 - \Phi\left( \frac{\log(t_i) - \mu_2}{0.1} \right)},
$$
where $\phi$ and $\Phi$ denote the pdf and cumulative distribution function (cdf) of the standard Normal distribution, respectively.
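The Lognormal survival and failure-rate functions can be evaluated with the standard library alone, using the complementary error function for the Normal upper tail. This is a sketch with hypothetical function names; the evaluation points are illustrative.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_sf(z):
    """Upper tail 1 - Phi(z), computed with erfc for accuracy when z is large."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def lognormal_surv(t, mu, sigma):
    return norm_sf((math.log(t) - mu) / sigma)

def lognormal_haz(t, mu, sigma):
    z = (math.log(t) - mu) / sigma
    # norm_sf underflows to 0.0 for very large z, so this ratio is only
    # safe for moderate standardized times.
    return norm_pdf(z) / (sigma * t * norm_sf(z))

median_surv = lognormal_surv(math.exp(1.0), 1.0, 0.5)   # survival at the median
h_at_2 = lognormal_haz(2.0, 1.0, 0.5)
```

With $\sigma_2 = 0.1$ the standardized argument grows quickly in $\log t$, so the second component's survival function drops to numerical zero not far beyond its median. A log-scale implementation would be the safer choice inside a likelihood.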
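With mixing proportions $\mathbf{p} = (p_1, p_2)$, we define the parameter set
$$
\Omega = \{ \mu_1, \mu_2, \mathbf{p}, \alpha \},
$$
which contains all model parameters. Let $\mathbf{t} = \{ t_1, \ldots, t_n \}$ denote the survival times for all subjects. The likelihood function of the α-mixture model is
$$
f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) = \prod_{i=1}^{n}
\begin{cases}
\left[ \left\{ p_1 S_1^{\alpha}(t_i \mid \mu_1) + p_2 S_2^{\alpha}(t_i \mid \mu_2) \right\}^{1/\alpha - 1} \left\{ p_1 S_1^{\alpha}(t_i \mid \mu_1)\, h_1(t_i \mid \mu_1) + p_2 S_2^{\alpha}(t_i \mid \mu_2)\, h_2(t_i \mid \mu_2) \right\} \right]^{\delta_i} \left\{ p_1 S_1^{\alpha}(t_i \mid \mu_1) + p_2 S_2^{\alpha}(t_i \mid \mu_2) \right\}^{(1 - \delta_i)/\alpha}, & \alpha \neq 0, \\
\left[ S_1^{p_1}(t_i \mid \mu_1)\, S_2^{p_2}(t_i \mid \mu_2) \left\{ p_1 h_1(t_i \mid \mu_1) + p_2 h_2(t_i \mid \mu_2) \right\} \right]^{\delta_i} \left\{ S_1^{p_1}(t_i \mid \mu_1)\, S_2^{p_2}(t_i \mid \mu_2) \right\}^{1 - \delta_i}, & \alpha = 0,
\end{cases}
$$
where $p_1 + p_2 = 1$.
We use the same prior specification for $\alpha$ and $p_j$, and we assign a Normal prior $N(m_j, s_j^2)$ to each $\mu_j$, $j = 1, 2$. Following an approach similar to that used in the previous examples, we derive the posterior distributions for the parameters:
$$
\begin{aligned}
\alpha \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) \exp\left\{ -\frac{1}{2\sigma_\alpha^2} (\alpha - \mu_\alpha)^2 \right\}, \\
\mathbf{p} \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) \prod_{j=1}^{2} p_j^{c_j - 1}, \\
\mu_j \mid \cdot &\propto f_\alpha^{*}(\mathbf{t}, \boldsymbol{\delta} \mid \Omega) \exp\left\{ -\frac{1}{2 s_j^2} (\mu_j - m_j)^2 \right\}, \quad j = 1, 2.
\end{aligned}
$$
As before, for the sampling of $\alpha$ in the Gibbs procedure, we use the full conditional distribution $p(\alpha \mid \cdot)$. If the generated value lies in $(-0.01, 0.01)$, we evaluate the limiting geometric-mixture form at $\alpha = 0$ and record the value as zero for reporting purposes. This near-zero rule is a numerical convention, not a literal posterior point mass at zero.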

4. Simulation Study

In this section, we conduct simulation studies to evaluate the performance of the proposed Bayesian estimation method for α-mixture models across three specific cases. Estimation accuracy is evaluated using the mean squared error (MSE). Model comparison is performed using the log pseudo-marginal likelihood (LPML), a commonly used Bayesian model selection criterion. Larger LPML values indicate a better model fit. LPML provides an alternative to other criteria, such as DIC and WAIC, and is particularly suitable for mixture models. For each case, we generate 5000 Markov chain Monte Carlo (MCMC) samples after a burn-in period of 5000 iterations and perform 100 simulation runs. The acceptance rates of the Metropolis–Hastings updates are also recorded to assess the mixing performance of the MCMC algorithm.
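LPML is typically computed from the retained MCMC draws through the conditional predictive ordinate (CPO), with $\text{CPO}_i$ estimated by the harmonic mean of the likelihood contributions of observation $i$ across draws and $\text{LPML} = \sum_i \log \text{CPO}_i$. The sketch below assumes this standard estimator; the text does not state which estimator was used, and the function name and input layout are hypothetical.

```python
import math

def lpml(loglik_draws):
    """LPML from a K x n table of per-observation log-likelihood contributions.

    loglik_draws[k][i] is the log-likelihood contribution of observation i
    evaluated at the k-th retained MCMC draw.
    """
    n_draws = len(loglik_draws)
    n_obs = len(loglik_draws[0])
    total = 0.0
    for i in range(n_obs):
        neg = [-loglik_draws[k][i] for k in range(n_draws)]
        m = max(neg)
        # log of the sum over draws of 1 / L_i, computed stably
        log_sum_inv = m + math.log(sum(math.exp(v - m) for v in neg))
        total += math.log(n_draws) - log_sum_inv     # log CPO_i
    return total

# Two draws, one observation with likelihood values 0.5 and 0.25:
# the harmonic mean is 2 / (2 + 4) = 1/3.
value = lpml([[math.log(0.5)], [math.log(0.25)]])
```

For a censored observation the contribution is $\log S_\alpha(t_i \mid \Omega)$ and for an observed event it is $\log f_\alpha(t_i \mid \Omega)$, so the same table serves both cases.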
For each simulated dataset, the sample size is set to $n = 100$, $1000$, or $10{,}000$. Right censoring is introduced by generating censoring times from a uniform distribution with the upper bound chosen to yield approximately 10% censoring.
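One way to choose the upper bound is to note that, for censoring times $C \sim \text{Uniform}(0, c)$ independent of the event time $T$, the censoring probability is $P(C < T) = \frac{1}{c} \int_0^c S(t)\, dt$, which decreases in $c$ and can be solved for a target rate by bisection. The sketch below assumes a lower bound of zero (not stated in the text) and uses a unit exponential survival function as a toy case; the function names are hypothetical.

```python
import math

def censoring_fraction(c, surv, n_grid=2000):
    """P(C < T) for C ~ Uniform(0, c): (1/c) * integral of S over [0, c] (trapezoid rule)."""
    h = c / n_grid
    area = 0.5 * (surv(0.0) + surv(c)) + sum(surv(k * h) for k in range(1, n_grid))
    return area * h / c

def calibrate_upper_bound(surv, target=0.10, lo=1e-6, hi=1e6, iters=60):
    """Bisection on the log scale for the bound c that gives the target censoring rate."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if censoring_fraction(mid, surv) > target:
            lo = mid      # too much censoring: widen the censoring window
        else:
            hi = mid
    return math.sqrt(lo * hi)

c = calibrate_upper_bound(lambda t: math.exp(-t))   # unit exponential as a toy case
```

For the α-mixture, `surv` would be the mixture survival function $S_\alpha(\cdot \mid \Omega)$ at the true parameter values of the simulation setting.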
We simulate the survival times under the following three cases:
  • WW: mixture of two Weibull distributions, $W(2, \theta_1 = 1)$ and $W(2, \theta_2 = \tfrac{1}{2})$;
  • GW: mixture of Gamma and Weibull distributions, $G(1, \theta_1 = 1)$ and $W(\tfrac{1}{2}, \theta_2 = 1)$; and
  • LL: mixture of two Lognormal distributions, $LN(\mu_1 = 1, 0.5)$ and $LN(\mu_2 = 2, 0.1)$.
We assume equal mixing proportions, specifically $p_1 = p_2 = \tfrac{1}{2}$, for all cases, and evaluate the performance of the estimator for $\alpha$ at various true values. Specifically, for WW, we consider $\alpha \in \{ -1, 0, 0.25, 0.5, 0.75, 1 \}$, while for GW and LL, we consider $\alpha \in \{ -1, 0, 1 \}$.
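Because the α-mixture is defined through its survival function, event times can be simulated by inversion: draw $u \sim \text{Uniform}(0, 1)$ and solve $S_\alpha(t \mid \Omega) = u$ for $t$. The sketch below does this by bisection for the WW case; it is one possible generator, not necessarily the one used for the reported simulations, and the function names are hypothetical.

```python
import math
import random

def ww_survival(t, alpha, p, thetas):
    """S_alpha(t) for shape-2 Weibull components, computed on the log scale."""
    log_s = [-(t / th) ** 2 for th in thetas]
    if alpha == 0:
        return math.exp(sum(pj * ls for pj, ls in zip(p, log_s)))
    w = [math.log(pj) + alpha * ls for pj, ls in zip(p, log_s)]
    m = max(w)
    return math.exp((m + math.log(sum(math.exp(x - m) for x in w))) / alpha)

def draw_time(u, alpha, p, thetas):
    """Solve S_alpha(t) = u for t by bisection (S_alpha is decreasing in t)."""
    lo, hi = 0.0, 1.0
    while ww_survival(hi, alpha, p, thetas) > u:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ww_survival(mid, alpha, p, thetas) > u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(7)
p, thetas = (0.5, 0.5), (1.0, 0.5)
# 1 - random() lies in (0, 1], which keeps u strictly positive.
sample = [draw_time(1.0 - rng.random(), -1.0, p, thetas) for _ in range(200)]
```

For $\alpha = 1$ the same draws could be obtained by first sampling a component label, but inversion works for every value of $\alpha$, including the geometric and harmonic cases where no latent label exists.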
For Bayesian inference, we specify independent prior distributions for all parameters. The mixing parameter $\alpha$ is assigned a Normal prior, $\alpha \sim N(\mu_\alpha, \sigma_\alpha^2)$. The mixing proportions follow a Beta prior, $\mathbf{p} \sim \text{Beta}(c_1, c_2)$, for two-component mixtures. For the parameters in each component of the mixture, we assign independent Gamma prior distributions to the scale parameters and Normal prior distributions to the location parameters, with hyperparameters chosen to be weakly informative.
Posterior samples are obtained using a Metropolis–Hastings-within-Gibbs sampling algorithm. At each iteration, the parameters are updated sequentially, each conditional on the current values of the others. All parameters are updated using Metropolis–Hastings steps because closed-form full conditionals are not available. For each simulation, 10,000 iterations are generated, with the first 5000 iterations discarded as burn-in, and the remaining 5000 samples are used for inference. The convergence of the MCMC chains is assessed using trace plots, which show good mixing behavior. Representative trace plots are provided in Appendix A. Posterior means based on the retained samples are used as point estimates, and the performance of the proposed method is evaluated using MSE, LPML, and acceptance rates.
Table 1 summarizes the estimation results for $\alpha$ under the three cases. In addition to $\alpha$, we also assess the recovery of mixture weights and component parameters. The estimates for $p_j$, $\theta_j$, and $\mu_j$, $j = 1, 2$, exhibit small bias and mean squared error across simulation settings, indicating accurate recovery of the underlying parameters. Furthermore, the coverage probabilities of the 95% credible intervals were examined and found to be close to the nominal level, suggesting that the proposed Bayesian procedure provides reliable uncertainty quantification.
To avoid imposing strong prior information, we adopt weakly informative priors for all model parameters. Specifically, the parameter $\alpha$ is assigned a Normal prior with large variance, allowing a wide range of plausible values. The mixing proportions are assigned Beta or Dirichlet priors, which ensure that the weights lie within the unit simplex and are commonly used in mixture modeling. The component parameters $\theta_j$ are assigned Gamma priors to enforce positivity while remaining weakly informative. To assess robustness to prior specification, we conducted sensitivity analyses using alternative hyperparameter values. The resulting estimates are qualitatively similar, indicating that the proposed method is not unduly sensitive to the choice of priors.
Figure 1 displays the estimated survival curves for the three mixture model examples considered in the simulation study.

5. Data Application

We analyze three survival datasets: the Kidney Catheter data [26], the Hospital Infection data [27], and glioma data obtained from the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (seer.cancer.gov). For each dataset, the proposed α-mixture model is fitted under multiple mixture specifications to assess the implied mixture structure through the estimated value of α. Table 2 summarizes the posterior results.
Here, WW denotes the two-component Weibull mixture described in Example 1, GW represents the Gamma–Weibull mixture from Example 2, and LL represents the Lognormal–Lognormal mixture from Example 3.
The posterior summaries for the model parameters are reported in Table 2. Across the three datasets, the estimates of α under the WW model are small and positive, indicating mixtures close to the geometric case. For the SEER–Medicare data, the credible interval for α is particularly narrow, suggesting strong evidence that the mixture structure is near-geometric. Although the estimates of α are close to zero in several cases, this does not imply that the geometric mixture necessarily provides the best fit. The value of α is estimated jointly with the mixing proportions and component parameters, and small deviations from zero can still affect the fitted survival function. As a result, the arithmetic case may provide a comparable or slightly better fit, as reflected by the LPML values reported in Table 3. This suggests that the data favor a mixture structure that is close to geometric but not exactly equal to it.
Table 4 compares LPML values for the single Weibull, single Lognormal, and single Gamma models across the three datasets. Since larger LPML values indicate a better fit, the single Lognormal model is preferred for all datasets. The differences are small for the Kidney Catheter data but substantial for the SEER-Medicare and Hospital Infection datasets, indicating that the Lognormal distribution provides the best single-model representation and is, therefore, used in the subsequent plots.
Figure 2 compares the fitted survival curves with the Kaplan–Meier estimates for the three real-data applications.
Across the three datasets, the Lognormal α-mixture model (red) generally provides the best overall fit compared with the fixed-α mixtures (blue) and the single Lognormal model (green). The best fixed-α model differs by dataset: the Weibull–Weibull arithmetic mixture is preferred for the kidney data, whereas the Lognormal–Lognormal geometric mixture performs best for the hospital and SEER data. Although the single Lognormal model is the best within the single-model class for all datasets, it is less flexible and tends to deviate more from the Kaplan–Meier curve, especially in the tail regions.

6. Discussion

In this study, we developed and extended mixture models for analyzing heterogeneous survival data within a Bayesian framework. The work began with the Bayesian α-mixture model, which estimates the mixture type through the mixing power parameter α. Different values of α correspond to distinct types of mixtures, and the simulation studies demonstrated that the Bayesian α-mixture accurately estimates α across a range of distributional settings. The method was implemented in the R package alpmixBayes and applied to multiple real datasets, illustrating its practical usefulness for identifying mixture types and guiding applied survival analysis. These results show that the Bayesian α-mixture provides a reliable and interpretable approach to modeling heterogeneity when a single mixing power is adequate.
Within the Bayesian framework, model selection aims to identify the model that best captures the underlying data structure while balancing the goodness of fit and model complexity. Common criteria include the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Deviance Information Criterion (DIC) [28], the Widely Applicable Information Criterion (WAIC) [29], and Bayes factors [30]. Although these measures provide systematic tools for comparing models, they rely on fixed model structures and require separate estimation for each candidate model, which can be computationally demanding when model uncertainty is present.
The α-mixture model offers an alternative by embedding multiple competing structures within a single formulation. Rather than fitting separate models, the parameter α allows the mixture to adjust its form according to the data, and its posterior distribution provides direct information about the appropriate structure. When α approaches specific values (e.g., 0 or 1), the mixture reduces to a classical form (geometric or arithmetic, respectively), effectively identifying the structure most consistent with the data. In this way, the α-mixture acts as a continuous bridge among competing formulations, rather than relying on discrete comparisons.
By incorporating structural learning into the estimation step, the α-mixture model provides a unified Bayesian approach to both parameter estimation and model selection. This reduces computational redundancy and allows for a more interpretable assessment of model adequacy while accounting for uncertainty in the underlying structure.
Future research may extend this work in several directions. One direction is to adapt the modeling framework to settings involving censoring or truncation, which are common in survival analysis and require careful treatment in the likelihood and computation. Another direction is to improve computational performance as the hierarchical structure grows since deeper models introduce more parameters and require efficient sampling or discretization methods to ensure stable estimation.
Additional applications to real data will help clarify when hierarchical mixtures offer clear benefits and when simpler models may be sufficient. Exploring a broader range of scientific settings can deepen our understanding of how mixture-based approaches represent heterogeneity and support survival modeling in more complex contexts. These extensions offer opportunities for further methodological development and for refining the role of mixture models in applied survival analysis.

Author Contributions

Methodology, F.L. and Z.Y.; software, F.L., D.R. and D.B.; investigation, Z.Y.; data curation, D.R. and D.B.; supervision, D.R. and D.B. All authors have read and agreed to the published version of the manuscript.

Funding

The hospital infection data analyzed in this study were used with permission from the appropriate data owner/custodian. These data were originally analyzed/described in Bilgili et al. (2016) [27] and were obtained through the project TUBITAK-107S178, Transmission Dynamics of Acinetobacter baumannii in Intensive Care Units. The present analysis received no additional external funding.

Data Availability Statement

The data are not publicly available because they are owned/controlled by a third party. Access to the data requires permission from the data owner/custodian.

Acknowledgments

We thank the Editor, Associate Editor, and the referees for thoughtful critiques that improved this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Trace Plots

Figure A1. Sample trace plots for the Weibull–Weibull model. From left to right: α, p_1, θ_1, and θ_2.
Figure A2. Sample trace plots for the Gamma–Weibull model. From left to right: α, p_1, θ_1, and θ_2.
Figure A3. Sample trace plots for the Lognormal–Lognormal model. From left to right: α, p_1, μ_1, and μ_2.

Appendix B. MCMC Algorithm

Step 1: Initialization
  • Set the initial values of the parameters θ_j, μ_j, α, and p_j.
  • Specify the hyperparameter values for the prior distributions of θ_j, μ_j, α, and p_j.
Step 2: MCMC Sampling
Let K denote the total number of iterations and B the number of burn-in iterations. For k = 1, 2, …, K,
  • Update θ_j: Generate θ_j^(k) from its full conditional distribution using the Metropolis–Hastings algorithm with a Gamma proposal distribution.
  • Update μ_j: Generate μ_j^(k) from its full conditional distribution using the Metropolis–Hastings algorithm with a Normal proposal distribution.
  • Update α: Generate α^(k) from its full conditional distribution using the Metropolis–Hastings algorithm with a Normal proposal distribution.
  • Update p_j: Generate p^(k) from its full conditional distribution using the Metropolis–Hastings algorithm with a Dirichlet proposal distribution.
  • After the burn-in period (B iterations), collect (θ^(k), μ^(k), α^(k), p^(k)) for posterior inference.
Step 3: Posterior Inference
  • Discard the burn-in samples and compute posterior means.
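The steps above can be sketched in code. The following is a minimal illustration under simplifying assumptions, not the authors' implementation: it assumes the α-mixture survival function S(t) = (p_1 S_1(t)^α + p_2 S_2(t)^α)^(1/α) from [19], with the geometric mixture as the α → 0 limit, uses two exponential components with rates θ_1 and θ_2 in place of the Weibull, Gamma, or Lognormal components, and omits the μ_j update. The priors (Exp(1) on θ_j, N(0, 2²) on α, uniform on p_1), the tuning constants, and all function names are hypothetical. With two components, the Dirichlet proposal for p reduces to a Beta proposal for p_1.

```python
import math
import random

C_TH, C_P, SD_ALPHA = 20.0, 50.0, 0.5  # hypothetical proposal tuning constants


def log_density(t, alpha, p1, th1, th2):
    """Log density of an assumed two-component alpha-mixture of exponentials.

    Assumes S(t) = (p1*S1(t)**alpha + p2*S2(t)**alpha)**(1/alpha) with
    S_j(t) = exp(-th_j*t), and the geometric mixture as the alpha -> 0 limit.
    """
    p2 = 1.0 - p1
    if abs(alpha) < 1e-8:  # geometric limit: constant hazard p1*th1 + p2*th2
        rate = p1 * th1 + p2 * th2
        return math.log(rate) - rate * t
    a = math.log(p1) - alpha * th1 * t  # log(p1 * S1**alpha)
    b = math.log(p2) - alpha * th2 * t  # log(p2 * S2**alpha)
    m = max(a, b)
    log_sum = m + math.log(math.exp(a - m) + math.exp(b - m))
    # f = S**(1-alpha) * sum_j p_j * S_j**(alpha-1) * f_j, with f_j = th_j * S_j
    a2, b2 = a + math.log(th1), b + math.log(th2)
    m2 = max(a2, b2)
    log_num = m2 + math.log(math.exp(a2 - m2) + math.exp(b2 - m2))
    return (1.0 / alpha - 1.0) * log_sum + log_num


def log_post(data, alpha, p1, th1, th2):
    """Unnormalized log posterior under the hypothetical priors named above."""
    if not (0.0 < p1 < 1.0) or th1 <= 0.0 or th2 <= 0.0:
        return -math.inf
    log_prior = -th1 - th2 - alpha ** 2 / (2.0 * 2.0 ** 2)
    return log_prior + sum(log_density(t, alpha, p1, th1, th2) for t in data)


def log_gamma_pdf(x, shape, scale):
    return ((shape - 1.0) * math.log(x) - x / scale
            - shape * math.log(scale) - math.lgamma(shape))


def log_beta_pdf(x, a, b):
    return ((a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))


def accept(rng, log_ratio):
    # 1 - U lies in (0, 1], so the logarithm is always defined
    return math.log(1.0 - rng.random()) < log_ratio


def run_sampler(data, K=2000, B=500, seed=1):
    """Metropolis-Hastings within Gibbs; returns post-burn-in draws and means."""
    rng = random.Random(seed)
    alpha, p1, th = 0.5, 0.5, [0.5, 2.0]  # Step 1: initial values
    cur = log_post(data, alpha, p1, th[0], th[1])
    draws = []
    for k in range(K):  # Step 2: one sweep over all parameter blocks
        for j in (0, 1):  # Gamma proposal centered at the current theta_j
            new = rng.gammavariate(C_TH, th[j] / C_TH)
            prop = list(th)
            prop[j] = new
            lp = log_post(data, alpha, p1, prop[0], prop[1])
            fwd = log_gamma_pdf(new, C_TH, th[j] / C_TH)
            back = log_gamma_pdf(th[j], C_TH, new / C_TH)
            if accept(rng, lp - cur + back - fwd):
                th, cur = prop, lp
        new = alpha + rng.gauss(0.0, SD_ALPHA)  # symmetric Normal random walk
        lp = log_post(data, new, p1, th[0], th[1])
        if accept(rng, lp - cur):
            alpha, cur = new, lp
        new = rng.betavariate(C_P * p1, C_P * (1.0 - p1))  # two-component Dirichlet
        if 0.0 < new < 1.0:
            lp = log_post(data, alpha, new, th[0], th[1])
            fwd = log_beta_pdf(new, C_P * p1, C_P * (1.0 - p1))
            back = log_beta_pdf(p1, C_P * new, C_P * (1.0 - new))
            if accept(rng, lp - cur + back - fwd):
                p1, cur = new, lp
        if k >= B:  # keep draws only after burn-in
            draws.append((alpha, p1, th[0], th[1]))
    means = [sum(col) / len(draws) for col in zip(*draws)]  # Step 3
    return draws, means
```

Calling `run_sampler(times)` on a list of positive survival times returns the retained draws of (α, p_1, θ_1, θ_2) and their posterior means. Because the Gamma and Beta proposals are not symmetric, the acceptance ratio includes the forward and backward proposal densities; the Normal random walk for α does not need this correction.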

References

  1. Klein, J.P.; Moeschberger, M.L. Survival Analysis: Techniques for Censored and Truncated Data, 2nd ed.; Springer: New York, NY, USA, 2003.
  2. Kaplan, E.L.; Meier, P. Nonparametric Estimation from Incomplete Observations. J. Am. Stat. Assoc. 1958, 53, 457–481.
  3. Cox, D.R. Regression Models and Life-Tables. J. R. Stat. Soc. Ser. B (Methodol.) 1972, 34, 187–202.
  4. Ibrahim, J.G.; Chen, M.H.; Sinha, D. Bayesian Survival Analysis; Springer Series in Statistics; Springer: New York, NY, USA, 2001.
  5. Kass, R.E.; Wasserman, L. The Selection of Prior Distributions by Formal Rules. J. Am. Stat. Assoc. 1996, 91, 1343–1370.
  6. Ibrahim, J.G.; Chen, M.H.; Sinha, D. Bayesian semiparametric models for survival data with a cure fraction. Biometrics 2001, 57, 383–388.
  7. McLachlan, G.J.; Peel, D. Finite Mixture Models; Wiley Series in Probability and Statistics; Wiley: New York, NY, USA, 2000.
  8. Kottas, A. Nonparametric Bayesian Survival Analysis using Mixtures of Weibull Distributions. J. Stat. Plan. Inference 2006, 136, 578–596.
  9. De Iorio, M.; Johnson, W.O.; Müller, P.; Rosner, G.L. Bayesian Nonparametric Nonproportional Hazards Survival Modeling. Biometrics 2009, 65, 762–771.
  10. Zhao, L.; Shi, J.; Hulbert-Shearon, T.E.; Li, Y. A Dirichlet Process Mixture Model for Survival Outcome Data: Assessing Nationwide Kidney Transplant Centers. Stat. Med. 2015, 34, 1404–1416.
  11. Arbel, J.; Lijoi, A.; Nipoti, B. Full Bayesian Inference with Hazard Mixture Models. Comput. Stat. Data Anal. 2016, 93, 359–372.
  12. Li, Y.; Lee, J.; Kottas, A. Bayesian Nonparametric Erlang Mixture Modeling for Survival Analysis. Comput. Stat. Data Anal. 2024, 191, 107874.
  13. Vermunt, J.; Magidson, J. Latent Class Analysis; Sage Publications: Thousand Oaks, CA, USA, 2002.
  14. Escobar, M.D.; West, M. Bayesian Density Estimation and Inference Using Mixtures. J. Am. Stat. Assoc. 1995, 90, 577–588.
  15. Everitt, B.; Hand, D. Finite Mixture Distributions; Chapman and Hall: London, UK, 1981.
  16. Titterington, D.M.; Smith, A.F.M.; Makov, U.E. Statistical Analysis of Finite Mixture Distributions; Wiley: New York, NY, USA, 1985.
  17. Aitkin, M.; Rubin, D.B. Estimation and Hypothesis Testing in Finite Mixture Models. J. R. Stat. Soc. Ser. B (Methodol.) 1985, 47, 67–75.
  18. Banfield, J.D.; Raftery, A.E. Model-Based Gaussian and Non-Gaussian Clustering. Biometrics 1993, 49, 803–821.
  19. Asadi, M.; Ebrahimi, N.; Soofi, E.S. The alpha-mixture of survival functions. J. Appl. Probab. 2019, 56, 1151–1167.
  20. Shojaee, O.; Asadi, M.; Finkelstein, M. On Some Properties of α-Mixtures. Metrika 2021, 84, 1213–1240.
  21. Shojaee, O.; Asadi, M.; Finkelstein, M. Stochastic properties of generalized finite α-mixtures. Probab. Eng. Inf. Sci. 2021, 36, 1055–1079.
  22. Barmalzan, G.; Kosari, S.; Zhang, Y. On stochastic comparisons of finite α-mixture models. Stat. Probab. Lett. 2021, 173, 109083.
  23. Shojaee, O.; Momeni, R. The α-Mixture of Cumulative Distribution Functions: Properties, Applications to Parallel System and Stochastic Comparisons. J. Indian Soc. Probab. Stat. 2023, 24, 599–621.
  24. Yin, G.; Ibrahim, J.G. Cure rate models: A unified approach. Can. J. Stat. 2005, 33, 559–570.
  25. Hanin, L.; Huang, L.S. Identifiability of cure models revisited. J. Multivar. Anal. 2014, 130, 261–274.
  26. Mohammed, Y.A.; Yatim, B.; Ismail, S. Mixture Model of the Exponential, Gamma and Weibull Distributions to Analyse Heterogeneous Survival Data. J. Sci. Res. Rep. 2015, 5, 132–139.
  27. Bilgili, D.; Ryu, D.; Ergönül, Ö.; Ebrahimi, N. Bayesian Framework for Parametric Bivariate Accelerated Lifetime Modeling and Its Application to Hospital Acquired Infections. Biometrics 2016, 72, 56–63.
  28. Spiegelhalter, D.J.; Best, N.G.; Carlin, B.P.; Van Der Linde, A. Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2002, 64, 583–639.
  29. Watanabe, S. Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory. J. Mach. Learn. Res. 2010, 11, 3571–3594.
  30. Kass, R.E.; Raftery, A.E. Bayes Factors. J. Am. Stat. Assoc. 1995, 90, 773–795.
Figure 1. Estimated survival curves for the three mixture model examples. (a) Weibull–Weibull. (b) Gamma–Weibull. (c) Lognormal–Lognormal.
Figure 2. Comparison of fitted survival curves for the (a) hospital, (b) kidney, and (c) SEER datasets. The best mixture model and the best single model for each dataset are those selected by LPML in Table 3 and Table 4. The black step curve represents the Kaplan–Meier estimate. The red dashed curve corresponds to the α-mixture model. The blue dotted curve shows the best mixture model: (a) geometric, (b) arithmetic, and (c) geometric. The green dash-dot curve represents the best single-model fit, which is Lognormal in all three panels.
Table 1. Simulation results for estimating α under different models and sample sizes. All values are rounded to three decimal places.
              n = 100                              n = 1000                             n = 10,000
Model      α      α̂     MSE         LPML     AR        α̂     MSE         LPML     AR        α̂     MSE         LPML     AR
WW     −1.00  −0.525   5.291       −5.297  0.922   −1.755   3.216      −27.440  0.812   −1.366   1.162     −256.090  0.621
WW      0.00  −0.203   9.288      −17.347  0.960   −0.488   9.611     −154.801  0.932   −0.141   2.108    −1533.682  0.421
WW      0.25  −0.197   8.837      −23.453  0.951    1.381   4.048     −219.029  0.796    0.538   0.593    −2235.457  0.307
WW      0.50   1.299   6.469      −32.741  0.919    1.346   1.938     −299.520  0.701    0.583   0.149    −2961.176  0.282
WW      0.75   1.482   5.899      −35.938  0.928    1.547   2.150     −361.844  0.722    0.833   0.403    −3559.890  0.325
WW      1.00   2.133   5.248      −44.099  0.927    1.804   2.392     −401.482  0.750    0.941   0.536    −4033.752  0.342
GW     −1.00  −0.197  10.429      −92.651  0.970   −0.246   9.733     −899.089  0.958   −0.137   8.267    −9011.233  0.946
GW      0.00  −0.410   8.967      −91.735  0.973   −0.134   7.917     −900.319  0.955   −0.615   6.820    −9001.487  0.948
GW      1.00   0.015   9.938      −91.745  0.971   −0.151   9.799     −905.248  0.955    0.772   0.180    −9451.808  0.453
LL     −1.00  −1.496   5.983     −172.634  0.932   −1.368   1.359    −1713.730  0.829   −1.052   0.039  −17,103.750  0.510
LL      0.00  −0.150   1.984     −191.292  0.830   −0.033   0.025    −1884.646  0.497   −0.044   0.004  −18,786.310  0.199
LL      1.00   1.342   1.684     −194.944  0.788    0.812   0.180    −1876.139  0.449    0.830   0.074  −18,532.240  0.185
Note: AR indicates the acceptance rate in the Metropolis–Hastings within Gibbs sampling.
Table 2. Bayesian parameter estimates (posterior mean and 95% credible intervals).
Dataset             Model  Parameter  Estimate  95% CI
Kidney Catheter     WW     α           0.0503   (0.0129, 0.0857)
Kidney Catheter     WW     p           0.0295   (0.0104, 0.0602)
Kidney Catheter     WW     θ_1         0.0538   (0.0245, 0.0895)
Kidney Catheter     WW     θ_2         2.0702   (1.7319, 2.5031)
Kidney Catheter     GW     α           1.7970   (−0.3903, 5.3778)
Kidney Catheter     GW     p           0.5027   (0.0684, 0.9538)
Kidney Catheter     GW     θ_1         0.8147   (0.1802, 1.8407)
Kidney Catheter     GW     θ_2         0.6169   (0.1092, 1.4926)
Kidney Catheter     LL     α          −0.0219   (−0.0450, 0.0023)
Kidney Catheter     LL     p           0.0389   (0.0177, 0.0606)
Kidney Catheter     LL     μ_1        −3.1729   (−3.8021, −2.6787)
Kidney Catheter     LL     μ_2         1.6621   (1.2714, 3.3189)
SEER-Medicare       WW     α           0.0324   (0.0314, 0.0335)
SEER-Medicare       WW     p           0.9402   (0.9393, 0.9406)
SEER-Medicare       WW     θ_1         3.5133   (3.4178, 3.6105)
SEER-Medicare       WW     θ_2         0.1592   (0.1575, 0.1605)
SEER-Medicare       GW     α           0.0690   (0.0621, 0.0757)
SEER-Medicare       GW     p           0.3211   (0.3041, 0.3372)
SEER-Medicare       GW     θ_1         0.2594   (0.2448, 0.2774)
SEER-Medicare       GW     θ_2         9.5528   (6.5457, 13.0185)
SEER-Medicare       LL     α           0.0668   (0.0632, 0.0708)
SEER-Medicare       LL     p           0.3019   (0.2994, 0.3059)
SEER-Medicare       LL     μ_1        −1.2597   (−1.2729, −1.2463)
SEER-Medicare       LL     μ_2         4.1466   (2.7164, 7.0445)
Hospital Infection  WW     α           0.0764   (0.0539, 0.0992)
Hospital Infection  WW     p           0.9638   (0.9546, 0.9715)
Hospital Infection  WW     θ_1         3.2313   (2.7233, 3.9719)
Hospital Infection  WW     θ_2         0.0651   (0.0535, 0.0869)
Hospital Infection  GW     α           0.9155   (0.2354, 2.2708)
Hospital Infection  GW     p           0.5667   (0.2560, 0.8978)
Hospital Infection  GW     θ_1         0.5320   (0.2412, 0.8949)
Hospital Infection  GW     θ_2         3.4156   (1.5048, 6.3631)
Hospital Infection  LL     α           0.0443   (0.0087, 0.0862)
Hospital Infection  LL     p           0.0887   (0.0534, 0.1253)
Hospital Infection  LL     μ_1        −2.3698   (−2.6634, −2.1292)
Hospital Infection  LL     μ_2         3.9104   (2.6426, 5.6082)
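As a small illustration of how summaries like those in Table 2 can be obtained from retained MCMC draws, the sketch below computes a posterior mean and an interval from a list of draws of one parameter. It assumes an equal-tailed interval with linearly interpolated quantiles; the source does not state whether the reported intervals are equal-tailed or highest posterior density, and the function name `summarize` is hypothetical.

```python
import math


def summarize(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval from scalar draws."""
    xs = sorted(draws)
    n = len(xs)

    def quantile(prob):
        # linear interpolation between neighboring order statistics
        pos = prob * (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    tail = (1.0 - level) / 2.0
    return sum(xs) / n, quantile(tail), quantile(1.0 - tail)
```

For example, `summarize(alpha_draws)` would return the posterior mean of α together with the lower and upper interval limits, where `alpha_draws` is an assumed list of post-burn-in draws.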
Table 3. LPML comparison for the α-mixture model and its special cases.
Dataset             Model   α-Mixture   Arithmetic    Geometric     Harmonic
Kidney Catheter     WW       −57.9054     −57.4385    −112.5133    −112.7585
Kidney Catheter     GW       −59.4634     −59.5014     −60.0868     −60.4301
Kidney Catheter     LL       −57.5859    −142.9597     −57.6835     −183.605
SEER-Medicare       WW     −16,964.29   −17,706.55   −32,342.81   −32,342.49
SEER-Medicare       GW     −18,636.79   −18,661.84   −19,320.58   −19,321.15
SEER-Medicare       LL     −14,596.86   −21,502.71   −14,876.03   −24,618.42
Hospital Infection  WW      −200.1449    −200.8766    −434.4769    −435.4586
Hospital Infection  GW      −169.6894    −169.9026    −188.1094    −188.8202
Hospital Infection  LL      −142.5559    −317.3886    −144.3072    −496.0775
Table 4. LPML comparison for the empirical models.
Dataset             Single Weibull  Single Lognormal  Single Gamma
Kidney Catheter          −60.01895         −59.10576     −60.22008
SEER-Medicare           −19,215.39        −15,645.40    −18,726.96
Hospital Infection       −182.9559         −157.8370     −186.5615
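Tables 3 and 4 rank models by LPML, where larger values indicate better predictive fit. The sketch below shows one common way to estimate LPML from MCMC output, assuming the usual definition as the sum of log conditional predictive ordinates (CPO), with each CPO estimated by the harmonic mean of the observation's likelihood across posterior draws. The function name `lpml` and the input layout are hypothetical, and this is not necessarily the exact estimator used to produce the tables.

```python
import math


def lpml(loglik):
    """Estimate LPML from pointwise log-likelihoods.

    loglik[m][i] is log f(t_i | parameters at posterior draw m).
    Returns (LPML, list of log CPO_i).
    """
    n_draws = len(loglik)
    n_obs = len(loglik[0])
    log_cpo = []
    for i in range(n_obs):
        neg = [-loglik[m][i] for m in range(n_draws)]
        mx = max(neg)
        # log of the average of 1/f over draws, via log-sum-exp for stability
        log_mean_inv = (mx + math.log(sum(math.exp(v - mx) for v in neg))
                        - math.log(n_draws))
        log_cpo.append(-log_mean_inv)  # CPO_i is the harmonic mean of f
    return sum(log_cpo), log_cpo
```

The log-sum-exp step matters in practice because the reciprocal likelihoods can overflow for poorly fitted observations when computed directly.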

Share and Cite

MDPI and ACS Style

Luan, F.; Ryu, D.; Yang, Z.; Bilgili, D. Bayesian Estimation for α-Mixture Survival Models. Mathematics 2026, 14, 1772. https://doi.org/10.3390/math14101772
