1. Introduction
Human beings are affected by various diseases, including non-communicable diseases; a well-known example is cancer. A particular type of cancer is blood cancer, which refers to malignant conditions affecting the blood, bone marrow, and lymphatic system. These disorders involve the uncontrolled growth of abnormal blood cells, which disrupts normal blood function. The primary types of blood cancer are leukemia, lymphoma, and multiple myeloma: leukemia affects the blood and bone marrow, lymphoma originates in the lymphatic system, and multiple myeloma involves malignant plasma cells. For more comprehensive details, see [1]. Cancer is a major societal, public health, and economic problem. Around 1.24 million blood cancer cases are recorded worldwide annually, and over 720,000 people die of blood cancer every year [2].
A patient who is diagnosed with blood cancer might have other risk factors, such as diabetes and obesity, among others. Therefore, when such a patient’s time of death is declared, there are at least two risk factors that compete to cause the loss of the patient’s life. This complexity highlights the importance of modeling patient lifetimes precisely. Reliable statistical models are essential for improving treatment design, clinical trial planning, and healthcare decision-making. Selecting the best lifetime model reduces the uncertainty of survival predictions, ultimately contributing to enhanced survival rates and a reduced economic burden of cancer care. From a statistical perspective, times of death are time-to-event data common in reliability and survival analyses. When at least two factors can cause failure (from an engineering perspective) or death (from a medical viewpoint), researchers are dealing with time-to-event data in the context of competing risks. Such data take the form of successive failure times, with indicators referring to the cause of each failure.
Various models have been proposed for such data; however, Cox’s competing risks model is often used to analyze competing risks data [3]. For more details and examples concerning the competing risks model, readers are referred to the monograph [4]. Cox’s model has the following assumptions:
The object (or subject) fails due to exactly one of the $k$ independent causes of failure.
The lifetimes of the objects (or subjects) under study, $T_1, T_2, \ldots, T_n$, are independent and identically distributed random variables (i.e., a random sample). Also, suppose that for $i = 1, \ldots, n$, each lifetime can be redefined as $T_i = \min\{X_{i1}, X_{i2}, \ldots, X_{ik}\}$, where $X_{ij}$, $j = 1, \ldots, k$, are latent failure times corresponding to the $k$ different independent competing risk factors.
Although Cox’s model was introduced over six decades ago, it has garnered significant attention from various researchers. For example, ref. [5] investigated comparative life tests using joint-censoring samples from an exponential competing risks model, whereas ref. [6] considered progressively Type-II-censored competing risks data with lifetimes assumed to follow a linear-exponential distribution. Moreover, ref. [7] investigated estimation for the competing risks model under adaptive progressively Type-II-censored Rayleigh lifetime data. Additionally, ref. [8] examined the statistical inference of competing risks samples under a joint Type-II-censoring plan, assuming that the underlying lifetime distribution is Weibull. Ref. [9] conducted statistical inference in competing risks models with Akshaya sub-distributions based on a Type-II-censoring scheme. A competing risks model was adopted in [10], assuming a generalized inverted exponential distribution with independent failure modes and partially observed under a Type-II progressive-censoring scheme. In [11], inference for the competing risks model based on the Weibull distribution was examined under an improved adaptive progressive Type-II-censoring scheme.
Furthermore, ref. [12] developed statistical inference for a competing risks model with latent failure times following the Kumaraswamy-G family of distributions under a unified progressive hybrid-censoring scheme. Addressing a masking problem, ref. [13] introduced an expectation–maximization (EM) algorithm for parameter estimation for the Birnbaum–Saunders distribution under competing risks with censored data, while ref. [14] applied a similar algorithm for the inverse Weibull distribution under competing risks. Additionally, ref. [15] proposed a novel competing risks model termed the additive generalized linear-exponential (AGLE) competing risks model. More recently, ref. [16] concentrated on statistical inference for independent competing risks data modeled by the inverse Lomax distribution using a Type-II generalized hybrid-censored dataset, while ref. [17] evaluated the reliability function in competing risks models by employing adaptive progressively Type-II-censored data from small electrical appliances and X-ray-exposed male mice, using the XGamma distribution as the parent model.
In the literature, some studies consider the gamma distribution as an underlying lifetime distribution in a competing risks model. Ref. [18] provided both classical and Bayesian estimation of the parameters of a competing risks model defined on the basis of the minimum of exponential and gamma failure times, where failures were due to aging or were accidental; accordingly, they considered a gamma distribution with increasing failure rate. They considered the case where the causes were identified but which cause led to failure is unknown. Ref. [19] studied the competing risks model on the basis of the minimum of exponential and gamma failure times, considering the case where failure times are classified by the two causes. Recently, ref. [20] used data on cardiovascular disease patients to compare parametric competing risks regression models, attempting to study the effect of covariates in the competing risks model. The causes of risk are assumed to follow certain lifetime distributions, such as the exponential, Weibull, gamma, and generalized gamma. They used the ML method to estimate the parameters without a simulation study to assess the performance of the estimates. The GG distribution proposed by Stacy [21] is widely used not only for classical lifetimes but also extends to many advanced models, because its multiple shape parameters accommodate a wide range of skewness, tail behaviors, and hazard shapes. Models such as frailty models [22], autoregressive moving average (ARMA) models [23], reliability and accelerated life testing [24], stress–strength models [25], and regression models [26] are included within the stochastic process framework, as in [27]. Across diverse applications, Stacy’s GG distribution provides a good fit to real-life data and often outperforms or competes strongly with traditional models (gamma, Weibull, lognormal, Gaussian, generalized Lindley, etc.) [28,29,30,31,32,33,34,35,36,37]. Theoretically, the GG model extends the exponential, Weibull, and gamma families through two shape parameters, allowing for monotonic and non-monotonic hazard forms. Methodologically, this flexibility permits nested model comparison and provides a unified framework to assess whether simpler models are sufficient. In the context of competing risks, this adaptability ensures a more accurate representation of cause-specific hazards in heterogeneous lifetime data such as blood cancer survival times.
Traditionally, researchers consider the exponential, Rayleigh, or Weibull distribution, among others, when modeling competing risks. This study focuses on Stacy’s generalized gamma (GG) competing risks model for two main reasons. First, it generalizes the aforementioned models, as well as the gamma, additive gamma, and additive Weibull models. Two competing risks with different failure rates, as in Stacy’s GG, cover various practical cases and may fit the available data well. Second, although the generalized gamma distribution has been widely used in reliability, frailty, and regression settings, its explicit formulation as a competing risks model with two GG cause-specific lifetimes, together with a detailed comparison of several frequentist estimators, particularly in the context of blood cancer survival data, has, to the best of our knowledge, not been systematically investigated. In this study, blood cancer data are utilized as an illustrative application to study the applicability of the proposed model and to discover how precisely it fits patients’ lifetimes. The choice of blood cancers is particularly pertinent, not only because of their high prevalence and increasing burden but also because of the complexity with which competing risks affect survival outcomes. We apply the competing risks model in this situation, aiming to provide insights that are transferable to other cancer types, supporting broader efforts in oncology.
Let $T_j$ denote the time-to-event datum caused by risk $j$, such that $j = 1, 2, \ldots, k$. If $T_j$ follows Stacy’s GG distribution [21] with scale parameter $\beta_j > 0$ and shape parameters $\alpha_j > 0$ and $\gamma_j > 0$, then the probability density function (PDF) and survival function (SF) are given, for $t > 0$, by
$$f_j(t) = \frac{\gamma_j}{\beta_j \, \Gamma(\alpha_j)} \left(\frac{t}{\beta_j}\right)^{\alpha_j \gamma_j - 1} \exp\!\left[-\left(\frac{t}{\beta_j}\right)^{\gamma_j}\right]$$
and
$$S_j(t) = \frac{\Gamma\!\left(\alpha_j, (t/\beta_j)^{\gamma_j}\right)}{\Gamma(\alpha_j)},$$
where $\Gamma(\cdot)$ is the gamma function, while $\Gamma(\cdot, \cdot)$ is the upper incomplete gamma function. Note that this study considers only two competing risk factors (i.e., $k = 2$) without loss of generality. A re-parameterized and extended logarithm of the distribution of the GG random variable was developed by [38]. Recently, a new partially linear regression based on a reparametrized generalized gamma distribution with two easily interpreted systematic components was established by [36], while ref. [37] presented the flexible GG distribution with a PDF that can be expressed as an infinite linear combination of generalized gamma densities, and ref. [39] rehabilitated maximum likelihood (ML) estimation for the parameters of the GG distribution via a modified MLE to deal with computational difficulties, including, but not limited to, non-convergence or convergence to the wrong root of the normal equations.
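As a numerical check on the PDF and SF above, the following Python sketch evaluates them directly and compares against `scipy.stats.gengamma`, whose parameterization (`a` = $\alpha_j$, `c` = $\gamma_j$, `scale` = $\beta_j$) coincides with Stacy’s form; the parameter values are illustrative, and the use of Python here is an assumption for demonstration (the paper’s computations were carried out in R).

```python
# Sketch: Stacy's GG density and survival function, assuming the
# parameterization f(t) = gamma/(beta*Gamma(alpha)) (t/beta)^(alpha*gamma-1) exp[-(t/beta)^gamma].
import numpy as np
from scipy.special import gamma as gamma_fn, gammaincc
from scipy.stats import gengamma

def gg_pdf(t, alpha, gam, beta):
    z = t / beta
    return gam / (beta * gamma_fn(alpha)) * z**(alpha * gam - 1) * np.exp(-z**gam)

def gg_sf(t, alpha, gam, beta):
    # Regularized upper incomplete gamma: Gamma(alpha, (t/beta)^gamma) / Gamma(alpha)
    return gammaincc(alpha, (t / beta)**gam)

alpha, gam, beta = 2.0, 1.5, 1.0          # illustrative parameter values
t = np.linspace(0.1, 3.0, 50)
assert np.allclose(gg_pdf(t, alpha, gam, beta),
                   gengamma.pdf(t, a=alpha, c=gam, scale=beta))
assert np.allclose(gg_sf(t, alpha, gam, beta),
                   gengamma.sf(t, a=alpha, c=gam, scale=beta))
```

The agreement with the library implementation confirms that the reconstructed PDF and SF are internally consistent.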
Under the assumption that there are two unknown causes of failure (i.e., $k = 2$), this study aims to explore the mathematical properties of the Stacy’s GG competing risks model, compare various estimation methods through Monte Carlo simulations, and validate the model’s practical applicability by analyzing real blood cancer data. Comparing estimation methods via Monte Carlo simulation is crucial for numerically evaluating their performance under diverse finite-sample and parametric settings, thereby providing a robust framework for assessing their accuracy, consistency, and efficiency. Monte Carlo simulations have become increasingly prevalent as computational technologies have advanced over the past two and a half decades; see [40,41,42,43,44,45,46,47,48] for examples of Monte Carlo simulation comparative studies, among others. In this study, seven frequentist estimation procedures are considered; namely, maximum likelihood estimators (MLEs), least squares estimators (LSEs), weighted least squares estimators (WLSEs), maximum product of spacings estimators (MPSEs), Cramér–von Mises estimators (CVMEs), Anderson–Darling estimators (ADEs), and right-tailed Anderson–Darling estimators (RADEs), assuming that these estimators exist and are unique.
The remainder of this study is organized as follows. Section 2 discusses key distributional aspects of the competing risks model, while Section 3 outlines the seven frequentist estimation procedures under consideration. Section 4 presents simulation studies that compare the performance of these estimation methods. An analysis of an actual dataset is provided in Section 5 to demonstrate the model’s practical applicability. Finally, the article concludes with a discussion and future research directions in Section 6.
4. Monte Carlo Simulation Outcomes
A comprehensive Monte Carlo simulation analysis is conducted to numerically analyze and compare the efficacy of the proposed estimators under different combinations of the sample size $n$ and different values of the parameters $(\alpha_1, \gamma_1, \alpha_2, \gamma_2, \beta_1, \beta_2)$, where the parameters $\alpha_1$, $\gamma_1$, $\alpha_2$, and $\gamma_2$ affect the shape of the distribution, while the scale parameters $\beta_1$ and $\beta_2$ control the spread of the data, stretching the distribution along the x-axis rather than changing its fundamental shape, which is governed by the shape parameters. The simulation results are categorized into two sections: the first assesses the efficiency of the estimators, while the second focuses on a goodness-of-fit analysis for each estimation method. In the simulation, several sample sizes were designated, and several combinations of values were set for the shape parameters, with the scale parameters held fixed without loss of generality. The simulation outcomes are based on 1000 simulation iterations. All numerical results were obtained using RStudio 2024.12.1 [71], an integrated development environment (IDE) for R 4.4.3 [72], a language and environment designed for statistical computing, visualization, and analysis. Owing to its versatility, R includes numerous built-in functions for optimization, in addition to a wide range of contributed packages. One such package is nloptr 2.2.1, which serves as an R interface to NLopt 2.10.0, a free and open-source library for nonlinear optimization. NLopt provides a unified interface to various freely available optimization routines, as well as original implementations of several algorithms [73]. Among these is the Bound Optimization by Quadratic Approximation (BOBYQA) algorithm [74], which demonstrated strong performance in obtaining the estimators.
Each estimation method was implemented using numerical optimization via the BOBYQA algorithm. The stopping criteria were met when the relative change in the optimized objective function from one iteration to the next fell below a prescribed tolerance, or when the maximum number of function evaluations was reached. The initial parameter values were obtained using uniform sampling within a neighborhood of the true parameters for each simulation run. For any of the seven estimation objective functions, if a failure to converge was detected (e.g., returning infinite values or reaching the maximum number of function evaluations), the optimization was repeated using a newly generated initial value until the convergence criteria were met. This adaptive re-initialization ensured that the estimation was stable across all simulation runs, resulting in negligible final non-convergence rates.
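The fit-with-restarts loop described above can be sketched as follows. This is an illustrative Python version fitting a single GG component by maximum likelihood: the paper used BOBYQA via R’s nloptr, and the derivative-free Nelder–Mead routine from SciPy stands in for it here; the sample size, true parameters, and restart bounds are assumptions for demonstration.

```python
# Sketch of adaptive re-initialization: retry the optimizer from a fresh
# random start near the true parameters until convergence is detected.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gengamma

rng = np.random.default_rng(7)
data = gengamma.rvs(a=2.0, c=1.5, scale=1.0, size=200, random_state=rng)

def neg_loglik(theta):
    alpha, gam, beta = theta
    if min(alpha, gam, beta) <= 0:
        return np.inf                      # enforce positivity of parameters
    return -gengamma.logpdf(data, a=alpha, c=gam, scale=beta).sum()

true_theta = np.array([2.0, 1.5, 1.0])
est = None
for attempt in range(50):                  # adaptive re-initialization loop
    start = true_theta * rng.uniform(0.8, 1.2, size=3)   # random start near truth
    res = minimize(neg_loglik, start, method="Nelder-Mead",
                   options={"maxiter": 2000, "xatol": 1e-8, "fatol": 1e-8})
    if res.success and np.isfinite(res.fun):
        est = res.x
        break                              # converged: stop restarting

assert est is not None and np.all(est > 0)
```

The same restart structure applies unchanged when the objective is a least-squares, spacings, or minimum-distance criterion rather than the negative log-likelihood.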
To acquire random samples from the underlying model, the following steps are implemented:
Generate two independent random samples of size $n$ for each cause of failure; i.e., $X_{11}, \ldots, X_{n1}$ represent a random sample from $\mathrm{GG}(\alpha_1, \gamma_1, \beta_1)$ (failure #1), and $X_{12}, \ldots, X_{n2}$ form a random sample from $\mathrm{GG}(\alpha_2, \gamma_2, \beta_2)$ (failure #2).
Set $T_i = \min\{X_{i1}, X_{i2}\}$ for all $i = 1, \ldots, n$ to obtain a random sample from the AGG distribution.
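The two sampling steps above can be sketched in a few lines; this is an illustrative Python version (the parameter values and sample size are assumptions, not the paper’s settings), drawing the latent GG lifetimes for each cause and keeping the minimum.

```python
# Sketch of the AGG sampling steps: draw latent GG lifetimes for each
# cause of failure and observe the minimum, plus a cause indicator.
import numpy as np
from scipy.stats import gengamma

rng = np.random.default_rng(42)
n = 1000
# Cause 1: GG(alpha1, gamma1, beta1); Cause 2: GG(alpha2, gamma2, beta2)
x1 = gengamma.rvs(a=2.0, c=1.5, scale=1.0, size=n, random_state=rng)
x2 = gengamma.rvs(a=0.8, c=2.0, scale=1.0, size=n, random_state=rng)

t = np.minimum(x1, x2)            # observed lifetimes under competing risks
cause = np.where(x1 <= x2, 1, 2)  # which latent failure time was smaller

assert t.shape == (n,)
assert np.all(t > 0) and set(np.unique(cause)) <= {1, 2}
```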
Using the specified simulation settings and procedures described above, 1000 random samples were generated, and the model parameters were estimated using the seven previously discussed methods. As some comparison metrics yielded disproportionately large values, min–max normalization was applied to scale all results between 0 and 1. It is important to note that lower values indicate better performance; thus, an estimation method with metrics closer to 0 is considered superior to its counterparts.
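The min–max normalization mentioned above is a standard rescaling; a minimal sketch (with illustrative metric values, not the paper’s results) is:

```python
# Minimal sketch of min-max normalization: rescale each comparison metric
# to [0, 1] so estimators with disproportionately large values stay comparable.
import numpy as np

def min_max(values):
    v = np.asarray(values, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

rmse_by_method = [0.42, 0.18, 0.95, 0.30]   # illustrative metric values
scaled = min_max(rmse_by_method)

assert scaled.min() == 0.0 and scaled.max() == 1.0
```

After this rescaling, a value near 0 marks the best-performing method on that metric and a value near 1 the worst, matching the reading convention used in the tables and heatmaps.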
4.1. Estimation Efficiency
The root mean square error (RMSE) is a common metric for assessing the average magnitude of errors between estimated and actual values. This evaluation is crucial for determining the performance of estimators across various statistical applications. In this study, the RMSE is used to compare estimator performance under varying parameter settings. The simulated RMSE is obtained for each estimator as
$$\mathrm{RMSE}(\hat{\theta}) = \sqrt{\frac{1}{1000} \sum_{i=1}^{1000} \left(\hat{\theta}^{(i)} - \theta\right)^2},$$
where $\hat{\theta}^{(i)}$ is the point estimator of $\theta$ on simulation run $i$.
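The simulated RMSE reduces to a one-line computation; the sketch below uses made-up estimates of a parameter whose true value is 2.0, purely for illustration.

```python
# Sketch of the simulated RMSE over N simulation runs for one parameter.
import numpy as np

def rmse(estimates, true_value):
    est = np.asarray(estimates, dtype=float)
    return np.sqrt(np.mean((est - true_value) ** 2))

# Illustrative estimates of a parameter whose true value is 2.0
estimates = [1.9, 2.1, 2.0, 2.2]
assert np.isclose(rmse(estimates, 2.0),
                  np.sqrt((0.01 + 0.01 + 0.0 + 0.04) / 4))
```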
Table 3, Table 4 and Table 5 include the estimated RMSEs for three cases of parameters selected from the simulation results. For a complete visual summary, Figure 4, Figure 5, Figure 6 and Figure 7 illustrate the estimation efficiency and reveal numerous patterns in the RMSE values of the various estimation methods across differing parameter values and sample sizes. For clarification, Figure 4 is a heatmap that displays the simulated RMSE of one of the estimators across the seven estimation methods and different sample sizes for all combinations of the shape parameters $\alpha_1$, $\gamma_1$, $\alpha_2$, and $\gamma_2$; greener cells correspond to lower RMSE values. The outcomes of this part of the simulation study are discussed as follows:
Generally, the RMSEs of the estimators converge to 0 as the sample size increases. Nonetheless, in certain instances, we observe variations; for instance, the RMSEs of the MPSEs initially exhibit small values in small samples, subsequently rise in moderate samples, and then decline once more.
When the shape parameter exceeds 1, the RMSEs of its estimators are minimal in large samples, indicating that the estimators are efficient, except in certain settings where the RMSEs of the LSEs, WLSEs, and CVMEs are lower (close to 0) in small samples than in large samples, where the values increase substantially toward 1. In contrast, the MPSEs exhibit reduced efficiency in certain situations, with RMSEs close to 1.
When the shape parameter is less than 1, the RMSEs of its estimators diminish as the sample gets larger. Nevertheless, some methods yield more efficient estimators in small samples than others; in some settings, the RMSEs of the LSEs, WLSEs, CVMEs, ADEs, and RADEs are extremely close to 0 in small samples, unlike those of the MLEs and MPSEs.
Generally, the RMSEs of the estimators decrease as the sample size grows, reflecting the efficiency of the estimators’ performance. However, in some situations, certain estimation approaches perform better in small samples than others, such as the MPSEs and MLEs in particular settings where the RMSEs converge to 1 as the sample size increases.
When the shape parameter is greater than 1, the estimators show better performance with increasing sample size, regardless of the method employed; however, in most situations where it is less than 1, we observe variations in the RMSE values, which are initially low in small samples, subsequently rise in moderate samples, and then decrease again in large samples.
The presence of fluctuations in RMSEs across sample sizes might be attributed to several causes, such as sampling variability, the complexity of the lifetime distribution, and the sensitivity of the estimation methods to the underlying distribution characteristics, sample size, and sample variability. Smaller samples, by coincidence, may sometimes produce estimators with unexpectedly low RMSE; for instance, the sample might resemble the population well enough. As the sample size increases, the estimation procedure might be more susceptible to the specific sample realization and fail to reflect the underlying distribution’s characteristics.
The medians of the RMSEs (denoted by $\widetilde{\mathrm{RMSE}}$) were used as robust summary measures to compare the overall performance of the estimation methods across parameters and sample sizes. As shown in Table 6, the $\widetilde{\mathrm{RMSE}}$ values consistently decrease as the sample size increases, indicating improved estimation accuracy with larger samples. Among all methods, the LS, CVM, AD, RAD, and WLS estimators exhibit relatively low $\widetilde{\mathrm{RMSE}}$ values across parameters, reflecting stable and reliable performance. In contrast, the MPS and particularly the ML estimators yield considerably higher $\widetilde{\mathrm{RMSE}}$ values, especially for small and moderate sample sizes, suggesting slower or less stable convergence.
Overall, estimation precision improves with sample size, leading to smaller RMSEs. However, the observed fluctuations across estimators and parameters underscore the importance of selecting estimation methods that are suited to both the parameter regime and the available sample size.
4.2. Goodness-of-Fit Analysis
This section presents simulation results based on two simulated goodness-of-fit metrics used to evaluate the estimation methodologies: the average absolute difference ($D_{\mathrm{abs}}$) between the true and estimated cumulative distribution functions (CDFs), and the maximum absolute difference ($D_{\mathrm{max}}$) between the true and estimated CDFs. These metrics are defined as
$$D_{\mathrm{abs}} = \frac{1}{1000\,n} \sum_{i=1}^{1000} \sum_{l=1}^{n} \left| F\!\left(t_l; \theta\right) - F\!\left(t_l; \hat{\theta}^{(i)}\right) \right|$$
and
$$D_{\mathrm{max}} = \frac{1}{1000} \sum_{i=1}^{1000} \max_{1 \le l \le n} \left| F\!\left(t_l; \theta\right) - F\!\left(t_l; \hat{\theta}^{(i)}\right) \right|,$$
where $\hat{\theta}^{(i)}$ denotes the estimate of the model parameter vector $\theta$ based on the $i$-th simulation run. The statistical measures $D_{\mathrm{abs}}$ and $D_{\mathrm{max}}$ are used to assess the accuracy of the estimation methods by quantifying the average and maximum absolute differences across all data points and simulation runs.
Table 3, Table 4 and Table 5 include the estimated values of the average and maximum absolute CDF differences for three cases of parameters selected from the simulation results. The simulated goodness-of-fit results are presented in Figure 8 and Figure 9. Both metrics exhibit a decreasing trend as the sample size increases, regardless of the values of the shape parameters. Moreover, their values are consistent across all estimators, indicating independence from the specific values of the shape parameters. While the RMSE values may not exhibit a perfectly monotonic convergence pattern, the comparison criteria based on the two metrics indicate that all estimators achieve comparable distributional accuracy. The observed variation in parameter estimates arises because each method optimizes a different objective function. Despite these numerical differences, the resulting fitted distributions exhibit similar overall shapes and goodness-of-fit measures, confirming that the generalized gamma model adequately represents the data. Consequently, for larger sample sizes, all seven estimation procedures demonstrate reliable performance in capturing the underlying lifetime distribution.
5. Data Analysis
This section illustrates the AGG distribution using real data from the health field. These data are used to investigate the suggested estimation approaches and the practical applicability of the model. In addition to the RMSE, goodness-of-fit criteria such as the Kolmogorov–Smirnov (KS) statistic and the corresponding p-value are used when analyzing the data. The preferred distribution is selected based on the lowest calculated criterion alongside the highest p-value. Because the parameters are estimated and the data contain ties, the corresponding p-value is calculated based on 1000 parametric bootstrap samples (see Algorithm 1).
| Algorithm 1 Calculation of the p-value |
Require: Observed data $t_1, \ldots, t_n$; hypothesized CDF $F(\cdot; \theta)$; number of bootstrap replicates $B$.
Ensure: Bootstrap-based p-values and Monte Carlo standard errors for the KS test.
1: Step 1: Estimate model parameters. Obtain the estimates $\hat{\theta}$ for each of the seven estimation methods.
2: Step 2: Compute the observed test statistic. Using $\hat{\theta}$, compute the KS statistic.
3: Step 3: Parametric bootstrap simulation.
4: for $j = 1$ to $B$ do
5:   Generate a bootstrap sample using the parameters estimated in Step 1.
6:   Re-estimate $\hat{\theta}^{(j)}$ based on bootstrap sample $j$.
7:   Compute the KS test statistic based on bootstrap sample $j$.
8: end for
9: Step 4: Calculate the bias of each estimator.
10: Step 5: Calculate the RMSE of each estimator.
11: Step 6: Compute the bootstrap p-values (with mid-p correction).
12: Step 7: Compute the Monte Carlo standard error (MCSE).
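The bootstrap loop of Algorithm 1 can be sketched as below. For brevity this illustration uses a simple exponential working model rather than the AGG model, and the mid-p form and MCSE formula shown are common choices assumed here rather than quoted from the paper; the structure (estimate, compute the observed KS statistic, re-estimate on each bootstrap sample, compare) is the same.

```python
# Sketch of the parametric bootstrap KS p-value with mid-p correction and
# Monte Carlo standard error, using an exponential working model for brevity.
import numpy as np
from scipy.stats import expon, kstest

rng = np.random.default_rng(1)
data = expon.rvs(scale=2.0, size=40, random_state=rng)

# Steps 1-2: estimate the parameter (MLE for the exponential is the sample
# mean) and compute the observed KS statistic under the fitted model.
scale_hat = data.mean()
d_obs = kstest(data, expon(scale=scale_hat).cdf).statistic

# Step 3: parametric bootstrap replicates with re-estimation on each sample.
B = 200
d_boot = np.empty(B)
for j in range(B):
    boot = expon.rvs(scale=scale_hat, size=data.size, random_state=rng)
    boot_scale = boot.mean()                 # re-estimate on bootstrap sample j
    d_boot[j] = kstest(boot, expon(scale=boot_scale).cdf).statistic

# Steps 6-7: mid-p bootstrap p-value and its Monte Carlo standard error.
p_mid = (np.sum(d_boot > d_obs) + 0.5 * np.sum(d_boot == d_obs)) / B
mcse = np.sqrt(p_mid * (1.0 - p_mid) / B)

assert 0.0 <= p_mid <= 1.0 and mcse >= 0.0
```

Re-estimating the parameters on every bootstrap sample is what makes the p-value valid when the hypothesized CDF has estimated parameters, which is why the plain KS tables cannot be used here.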
The first dataset, in Table 7, consists of the survival times in days of 43 patients with blood cancer obtained from the Ministry of Health Hospital in Saudi Arabia, studied by [75]. The data are transformed from days to years for computational convenience. The second dataset, in Table 8, consists of the survival times of 30 patients with chronic granulocytic (CG) leukemia, as cited by the National Cancer Institute [76]. Both datasets represent complete samples of uncensored survival times. No additional information was available in the original sources regarding disease subtypes, treatment protocols, or specific causes of death. Consequently, these datasets are interpreted as pure lifetime data, where the observed failures are assumed to result from either disease progression (Cause #1) or other complications (Cause #2), both of which are unknown. The summary statistics of the two datasets are given in Table 9. The histogram and box plot for each dataset are shown in Figure 10.
For the blood cancer data, the model parameters estimated by the seven estimation methods are listed in Table 10. Variation in the RMSEs is observed across parameters within the same method, with no specific approach providing uniformly optimal estimates for all parameters. The ML procedure exhibited the highest RMSE for one parameter while performing well for the remaining ones, whereas the MPS approach showed the greatest RMSEs for several parameters. The AD and RAD methods delivered close estimates as well as close RMSEs for all parameters except one, for which their RMSEs were not the smallest.
The goodness-of-fit metrics based on the observed blood cancer data are studied, and the results are listed in Table 11. The quantile–quantile (Q-Q) plots of the estimated versus empirical distribution functions are provided in Figure 11. Evidently, all the estimation methods yielded close KS test statistics, demonstrating a good distributional fit. The p-values for the AD, RAD, and ML methods are the highest. The MCSE quantifies how uncertain a bootstrap p-value is as a result of using a finite number of simulations B; it is the standard error of the proportion of “successes”, that is, the number of bootstrap statistics greater than or equal to the observed statistic. The MCSEs corresponding to AD, RAD, and ML are the smallest, indicating that their p-values are the most reliable.
Based on the CG leukemia data, the values of the model’s parameters calculated using the previously described approaches are listed in Table 12. The variation in the estimates, as well as in the corresponding RMSEs, is obvious. The MPS method produced estimates with the highest RMSEs for two of the parameters, while the MLEs of two others were the least efficient in terms of RMSE. Although no systematic pattern among the estimates singled out a specific approach providing uniformly optimal estimates for all parameters, the AD method generally provided more accurate estimates according to their RMSEs, despite the RMSEs of two parameters rising slightly above those of RAD.
The results of the goodness-of-fit tests are shown in Table 13, while the Q-Q plot is shown in Figure 12. All methods produced adequate distributional fits according to the test statistics, but AD showed outstanding performance based on the corresponding p-value and MCSE.
The evaluations revealed that some approaches provided better parameter estimation accuracy (lower RMSE), while other methods showed superior distributional approximation (better goodness-of-fit results), as demonstrated in the first application, where ML exhibited good goodness-of-fit results alongside the highest RMSE for one parameter. The conflict between the estimation efficiency of the parameters and the performance of the goodness-of-fit test is not a problem to be solved; it is expected, especially when dealing with complex distributions. The results demonstrated that unstable parameter estimates can still produce better distributional fits and adapt to limited data. Based on the analysis of the two datasets, AD achieved the best fit with more accurate estimates. The choice of optimal method depends on the purpose of the analysis, whether it is parameter inference or distributional modeling. For clinical data analysis, goodness-of-fit tests are the primary recommendation, since decisions depend on distributional accuracy, risk evaluation requires accurate distributional behavior, and predictive performance is more critical than parameter precision.