3.1. Empirical Study
We conduct a simulation study to investigate the performance of the proposed method. Simulated data are generated from the mixture cure rate model (
1) with
where
X is simulated from the uniform distribution over
,
W is simulated from the Bernoulli distribution with probability 0.5,
is simulated from the uniform distribution over
or standard normal distribution,
,
,
,
,
,
,
,
,
,
, 0.5,
, 0, 0.5. The censoring time is simulated from the uniform distribution over
. The average cure rate and the censoring rates are 0.31 and 0.43, respectively. The sample size is set to
, 500, 1000, 2000.
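The data-generating steps described above can be sketched as follows. This is only an illustration under stated assumptions: the covariate ranges, thresholds, regression coefficients, Weibull parameters and censoring bound are hypothetical placeholders (the specific values are not recoverable from this section), and the function name `simulate_mixture_cure` is introduced here purely for illustration. A logistic incidence model and a Weibull proportional hazards latency model are assumed.

```python
import numpy as np

def simulate_mixture_cure(n, rng, u_dist="uniform"):
    """Simulate right-censored data from a logistic / Weibull-PH mixture cure model.

    All numeric settings below are hypothetical placeholders, not the paper's values.
    """
    x = rng.uniform(0.0, 1.0, n)            # assumed range for X
    w = rng.binomial(1, 0.5, n)             # Bernoulli(0.5), as stated in the text
    if u_dist == "uniform":
        u = rng.uniform(-1.0, 1.0, n)       # assumed range for U
    else:
        u = rng.standard_normal(n)

    # Subgroup indicators defined by (hypothetical) thresholds on U
    tau_inc, tau_lat = 0.0, 0.0
    g_inc = (u > tau_inc).astype(float)
    g_lat = (u > tau_lat).astype(float)

    # Incidence part: logistic model for the probability of being uncured
    eta = 0.5 + 0.5 * x + 0.5 * w + 0.5 * g_inc + 0.5 * w * g_inc
    p_uncured = 1.0 / (1.0 + np.exp(-eta))
    uncured = rng.random(n) < p_uncured

    # Latency part: Weibull PH model, S(t | z) = exp(-(t / scale)^shape * exp(lin))
    shape, scale = 1.5, 1.0
    lin = 0.5 * x - 0.5 * w + 0.5 * g_lat - 0.5 * w * g_lat
    v = 1.0 - rng.random(n)                 # uniform on (0, 1], avoids log(0)
    t_event = scale * (-np.log(v) / np.exp(lin)) ** (1.0 / shape)
    t_event = np.where(uncured, t_event, np.inf)   # cured subjects never fail

    # Uniform censoring over (0, c_max); c_max is a placeholder
    c = rng.uniform(0.0, 4.0, n)
    time = np.minimum(t_event, c)
    status = (t_event <= c).astype(int)
    return {"time": time, "status": status, "x": x, "w": w, "u": u,
            "uncured": uncured}

data = simulate_mixture_cure(500, np.random.default_rng(2024))
print("censoring rate:", 1.0 - data["status"].mean())
print("cure rate:", 1.0 - data["uncured"].mean())
```

Because the coefficients are placeholders, the resulting cure and censoring rates will generally differ from the 0.31 and 0.43 reported above.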
Under each of the settings, 1000 samples were simulated and fitted with the proposed SMLE, PSMLE and PMLE methods. The baseline survival function
is assumed to be from the Weibull distribution, and the normal kernel function
is used in the methods. The bandwidths in the SMLE and PSMLE methods are determined using the method discussed in
Section 2 if not specified. All the computing work was carried out on the clusters of the Digital Research Alliance of Canada.
Table 1, Table 2, Table 3 and Table 4 present biases, standard deviations (SD), mean squared errors (MSE), and coverage probabilities (CP) obtained from the three estimation methods for simulated data with
and
U generated from the uniform distribution. The biases, SDs, MSEs and CPs are calculated according to (7) and (8).
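As a rough illustration of how such summaries are typically computed over simulation replications, the sketch below uses the standard definitions of bias, SD, MSE and coverage. It is not a transcription of Equations (7) and (8), which are not reproduced in this section, and it assumes Wald-type 95% intervals built from a per-replication standard error; the function name `summarize` and its arguments are hypothetical.

```python
import numpy as np

def summarize(estimates, std_errors, truth, z=1.959963984540054):
    """Bias, SD, MSE and coverage of one parameter over simulation replications.

    Assumes Wald-type intervals estimate +/- z * SE; the paper's intervals
    may be constructed differently.
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - truth
    sd = estimates.std(ddof=1)                      # empirical SD of the estimates
    mse = np.mean((estimates - truth) ** 2)
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    cp = np.mean((lower <= truth) & (truth <= upper))
    return {"bias": bias, "sd": sd, "mse": mse, "cp": cp}

# Toy replications for a parameter whose true value is 1.0
print(summarize([0.9, 1.1, 1.0, 1.2], [0.2, 0.2, 0.2, 0.2], truth=1.0))
```

With these definitions the MSE equals the squared bias plus the variance of the estimates (computed with divisor equal to the number of replications), which is why bias and SD together determine the MSE columns.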
The estimates of most parameters have small biases and MSEs under all considered sample sizes and estimation methods except for and under the small sample size of 200. As the sample size increases, the biases, SDs, and MSEs of all the estimates decrease. The SMLE method outperforms the PSMLE method, as it has smaller SDs and MSEs, especially for smaller sample sizes. The biases, SDs and MSEs of SMLE are slightly smaller for some parameters and larger for others than those of PMLE, indicating that the two estimation methods are comparable in terms of accuracy and precision. However, PMLE requires much longer computation time. The CPs of SMLE are closer to the 95% nominal level than those of the other two methods. The CPs of and are away from the nominal level for small sample sizes, but they move closer to the nominal level as the sample size increases.
Table 4 presents the simulation results under a large sample size (
), demonstrating the asymptotic properties of the proposed estimation method. The estimates of SMLE for the subgroup thresholds
and
show minimal bias and standard deviation, confirming that the method can accurately and precisely recover the true subgroup boundaries when sufficient data are available. Even with
, the coverage probability for
remains slightly below the nominal 95% level. This suggests that inference on the subgroup-defining thresholds, particularly when the two subgroups are not well-separated, remains the most challenging aspect of the inference. The coverage for the interaction effect is slightly conservative. This is likely due to the combined uncertainty from estimating both the threshold and the interaction coefficient.
Table 5 presents the biases, SDs, MSEs, and CPs of the estimates from the SMLE method using both the normal and logistic kernel functions for simulated data with
and
U generated from the standard normal distribution. The results indicate that the choice of kernel has a meaningful impact on the estimation of
, the treatment-by-subgroup interaction effect on the cure probability. The normal kernel yields a CP of 0.91 and an MSE of 1.05, while the logistic kernel yields a CP of 0.88 and an MSE of 0.88. Thus the normal kernel gives coverage closer to the nominal level for this key parameter, although its MSE is somewhat larger. The incidence part of the model involves estimating cure status, which may be affected by the choice of kernel function. The normal kernel’s faster tail decay provides a more localized approximation of the subgroup boundary, whereas the heavier tails of the logistic kernel may over-smooth it; this localization appears to be crucial for accurately capturing the discontinuity in the cure probability across the subgroup boundary. For all other parameters, performance is nearly identical between the two kernels. Consequently, we recommend the normal kernel as the default specification for the proposed method, based on its robust performance for the most sensitive parameter.
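The tail behaviour discussed above can be illustrated with a small sketch. It assumes that the subgroup indicator I(U > threshold) is smoothed by a kernel distribution function evaluated at (U − threshold)/h, and that both kernels are used in their standard (unit-scale) forms; the function names are hypothetical and the paper's exact kernel parameterization may differ.

```python
import math

def normal_kernel(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_kernel(z):
    """Standard logistic distribution function, written to avoid overflow."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def smoothed_indicator(u, tau, h, kernel):
    """Smooth approximation of the indicator I(u > tau) with bandwidth h."""
    return kernel((u - tau) / h)

# Weight that each kernel still assigns to the "wrong" side of the boundary
# at several standardized distances from the threshold.
for z in (1.0, 2.0, 3.0):
    print(z, 1.0 - normal_kernel(z), 1.0 - logistic_kernel(z))
```

At the same standardized distance from the threshold, the logistic kernel leaves more residual weight on the wrong side of the boundary than the normal kernel, which is the over-smoothing effect referred to above.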
The results for other values and sample sizes are similar and therefore not presented here.
Table 6 presents the biases, SDs, MSEs, and CPs of the estimates from the SMLE method with
and
, 0, 0.5, and
U generated from the standard normal distribution. It investigates a fundamental practical concern: how the method’s performance depends on the location of the true subgroup threshold
. This is crucial for applications where a treatment-sensitive subgroup may be a large majority (
low, e.g., −0.5) or a small minority (
high, e.g., 0.5) of the population. The biases, SDs, and MSEs of the estimates of
from the SMLE method increase and the CPs move away from the 95% nominal level as
goes from
to
because of fewer data in
. This demonstrates the method’s reliance on having an adequate number of subjects within the identified subgroup for reliable inference on the probability of being cured. The biases, SDs, and MSEs of
from the SMLE method do not strictly increase as
increases, which may be because
also depends on
W. The incidence model’s logistic link function may make the cure probability estimate more robust to moderate changes in subgroup size when other predictive information is present. However, the CP for
remains below the nominal level across all
values, underscoring that inference for the cure probability interaction is challenging and requires careful interpretation, regardless of subgroup size.
We also investigated the dependence of the performance of the SMLE method on the choice of the bandwidth
h.
Table 7 and
Table 8 present the biases, SDs, MSEs, and CPs of the estimates from the SMLE method with the bandwidth set to 0.5, 0.2, 0.1, 0.01, 0.001,
,
, the normal kernel function, and
U generated from the standard normal distribution. The selection of the bandwidth
h governs a critical bias-variance trade-off in the smoothed likelihood estimation. For a large bandwidth (
), we see substantial bias in
(0.19) and
, and the MSEs are relatively high. The coverage probabilities are below nominal for
and
. As the bandwidth decreases, the biases for
and
decrease, and the MSEs also decrease. However, the coverage for
remains below nominal. For very small bandwidths, the biases become very small and the SDs decrease, but the coverage probabilities for the threshold
drop dramatically to around 0.8. This indicates that the confidence intervals are too narrow, likely because the bootstrap variance estimation fails when the likelihood is too sharp. The data-driven bandwidth (
) strikes a balance. It gives low bias for
and moderate bias for
, and the coverage probabilities are much better for
and
. For
and
, the biases are moderate and coverage is around 0.91 and 0.88 respectively. For all the other parameters, the SDs and MSEs of the estimates from the SMLE method decrease as the value of
h decreases, and the results from
perform well. Therefore, the data-driven bandwidth is recommended as the default, robust specification to ensure both accuracy and valid inference.
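The trade-off can be made concrete with a small numerical sketch. It assumes the indicator I(U > threshold) is replaced by a normal distribution function with bandwidth h, and reports two quantities for the fixed bandwidths considered above: the average gap between the smoothed and the exact indicator on a grid (a proxy for smoothing bias), and the maximum slope of the smoothed indicator (a proxy for how sharp the objective becomes in the threshold). The grid, the threshold value 0 and the function names are illustrative assumptions.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mean_abs_gap(h, tau=0.0):
    """Average |smoothed indicator - exact indicator| over a grid of u values."""
    grid = [i / 100.0 for i in range(-200, 201)]    # u from -2 to 2
    gaps = [abs(normal_cdf((u - tau) / h) - (1.0 if u > tau else 0.0))
            for u in grid]
    return sum(gaps) / len(gaps)

def peak_slope(h):
    """Maximum derivative of the smoothed indicator, attained at u = tau."""
    return 1.0 / (h * math.sqrt(2.0 * math.pi))

for h in (0.5, 0.2, 0.1, 0.01, 0.001):
    print(h, mean_abs_gap(h), peak_slope(h))
```

As h shrinks, the approximation error falls while the slope at the threshold grows without bound, so the smoothed likelihood approaches the non-smooth one; this is consistent with the small biases but poor threshold coverage reported for very small bandwidths.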
3.2. Analysis of Colon Cancer Data
We analyzed a colon cancer data set with the proposed Weibull PH mixture cure model and the piecewise PH mixture cure model for subgroup analysis. We also fitted the data set with the regular Weibull PH mixture cure model for comparison. The data set comes from a clinical trial designed to evaluate the benefit of fluorouracil plus levamisole as an adjuvant therapy after resection of stage III colon carcinoma, compared with levamisole alone [29]; it is available in the survival package in R. The data set contains 614 patients, of whom 310 were treated with levamisole (Lev) alone and 304 with the combination of levamisole plus fluorouracil (Lev+5FU). The maximum follow-up time is 3329 days, and the outcome of interest is the number of days to recurrence. There are 119 and 172 recurrence events in the Lev+5FU and Lev groups, respectively, and the respective censoring rates are 60.86% and 44.52%. The Kaplan–Meier survival curves for the two treatments are presented in
Figure 1. The curves level off at survival probabilities of 59.94% and 43.29%, respectively, after 2500 days of follow-up, indicating the presence of cured subjects in the sample.
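The censoring rates quoted above follow directly from the event counts, and the plateau of a Kaplan–Meier curve is simply the value of the estimator after the last observed event. The sketch below checks the first point with the reported counts and illustrates the second on a small made-up sample; it does not load the colon data, and the helper names are hypothetical.

```python
import numpy as np

def censoring_rate(n_events, n_subjects):
    """Proportion of subjects without an observed event."""
    return 1.0 - n_events / n_subjects

def kaplan_meier(time, status):
    """Product-limit estimate of S(t) at each distinct event time."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status, dtype=int)
    event_times = np.unique(time[status == 1])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)                   # still under observation at t
        events = np.sum((time == t) & (status == 1))  # failures exactly at t
        s *= 1.0 - events / at_risk
        surv.append(s)
    return event_times, np.array(surv)

# Reported counts: 119 events among 304 (Lev+5FU), 172 events among 310 (Lev)
print(round(100 * censoring_rate(119, 304), 2))
print(round(100 * censoring_rate(172, 310), 2))

# Toy sample (not the trial data): the last value is the plateau height
times, surv = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 0])
print(times, surv)
```

A plateau well above zero that persists over a long stretch of follow-up, as in Figure 1, is the usual informal evidence for a cured fraction.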
We consider subgroups defined by
in the latency part and
in the incidence part, and fit the data set with the proposed Weibull mixture cure rate models to determine the optimal values of
and
and treatment differences in the subgroups. The results are presented in
Table 9.
Under the Weibull PH mixture cure model with age thresholds, the estimated optimal age threshold is 67 for the time to recurrence among uncured patients and 66 for the cure probability. However, the interaction between treatment and subgroup is significant only in the latency part. This implies that the treatment effect on the time to recurrence for uncured patients differs significantly between patients younger than 67 and those older than 67. For uncured patients younger than 67, the hazard ratio of the combined therapy relative to levamisole alone is 0.23, while for patients older than 67 the hazard ratio increases to 0.7 and is statistically significant. Thus, although the combined treatment improves the survival of uncured patients, patients younger than 67 benefit more than patients older than 67. More specifically, the hazard ratio of 0.23 indicates a 77% reduction in the risk of recurrence for an uncured patient younger than 67 who receives the combined treatment instead of the single treatment, and the hazard ratio of 0.7 indicates a 30% reduction in the risk of recurrence for an uncured patient older than 67 who receives the combined treatment.
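The percentage reductions quoted above follow from the hazard ratios, since a hazard ratio HR for combined versus single therapy corresponds to a relative reduction of 1 − HR in the hazard of recurrence:

```latex
\[
(1 - 0.23) \times 100\% = 77\%, \qquad (1 - 0.70) \times 100\% = 30\%.
\]
```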
The effects in the incidence part are not statistically significant, implying that neither treatment, age, nor their interaction has a significant effect on the probability of being cured.
The data set is also fitted with the regular Weibull proportional hazards mixture cure model, in which age is not dichotomized by a threshold in either the main or the interaction effects. The results are presented in
Table 9, and none of the effects in the model are statistically significant.