1. Introduction
Patient responses to pharmacological treatments vary considerably, influenced by genetics, disease characteristics, comorbid conditions, and other individual-specific factors. This variability highlights the importance of personalized treatment approaches over traditional uniform protocols: standardized treatment plans may fail to yield optimal clinical results and can even cause adverse effects in some patients [1,2]. A relevant example is flexible sigmoidoscopy, a common screening method for colorectal cancer. While beneficial for patients expected to live ten or more years, the procedure carries risks, including perforation and infection, that may outweigh its benefits for those with shorter life expectancy. Deciding on such interventions therefore requires a careful assessment of each patient’s health status and prognosis [3].
Given the complexity inherent in personalized care, devising individualized treatment decision rules has become essential. These rules give clinicians a systematic way to tailor therapies to clinical, demographic, and genetic information, thereby reducing treatment-related risks and improving therapeutic outcomes [4,5,6]. The emerging field of precision medicine embodies this concept by aiming to match treatments to the genetic and clinical profile of each patient. By delivering the appropriate medication at the optimal dose and timing, precision medicine seeks to improve outcomes while minimizing side effects and inefficiencies in healthcare delivery [7,8,9]. This individualized strategy not only increases the likelihood of favorable patient outcomes but also promotes more efficient use of healthcare resources.
Various approaches have been developed to estimate optimal treatment regimes, including regression-based methods [10,11,12], value-search methods [13,14,15], and model-based planning frameworks [16,17]. Traditional survival models such as the Cox proportional hazards (PH) model [18] have been widely used to evaluate treatment effects, but they often have limited capacity to capture complex interactions between treatments and patient features. Moreover, censored observations complicate the estimation of optimal treatment strategies, because the event of interest is not fully observed for all individuals. To address these challenges, several extensions of survival analysis techniques have been proposed. For instance, Fang et al. [19] introduce a semiparametric accelerated failure time model for optimal treatment rule estimation, employing augmented inverse probability weighted estimators to handle treatment effect heterogeneity in censored data. Wang et al. [20] extend the Cox model and propose an iterative alternating optimization algorithm to determine optimal treatment rules for censored survival outcomes. Although these extensions improve the capacity of traditional survival models to accommodate complex data structures and treatment heterogeneity, their flexibility is still limited by reliance on pre-specified model forms. In this context, machine learning has opened new opportunities for optimizing treatment regimes, offering the flexibility and predictive power to capture complex, nonlinear relationships without the restrictive assumptions of traditional parametric models [21,22]. A major limitation, however, is the reduced interpretability of many machine learning models. Their “black-box” nature makes it difficult to interpret individual covariate effects and treatment interactions, hindering clinicians’ ability to relate results to clinical knowledge and thereby reducing trust. The absence of explicit parameters also precludes formal statistical inference, such as constructing confidence intervals or conducting hypothesis tests, obscuring the reliability and significance of estimated treatment effect heterogeneity. In contrast, statistical models offer improved interpretability through explicitly estimated parameters with direct clinical meaning, enabling direct quantification of covariate effects. This transparency facilitates clinician understanding, evaluation, and practical application.
Interval censoring frequently occurs in clinical research involving periodic follow-up. A well-studied special case is current status data, where the event of interest is only known to have occurred before or after a single inspection time [23]. More generally, interval censoring refers to situations where the event is only known to occur between two time points, resulting in more complicated data patterns. This type of censoring poses computational and theoretical difficulties and calls for statistical methods that balance model complexity and estimation efficiency. Numerous studies have addressed various aspects of inference with interval-censored data [24,25,26,27]. Despite these efforts, methods specifically designed to estimate optimal treatment regimes under interval censoring remain limited. Developing robust techniques tailored to interval-censored survival data would advance personalized clinical decision-making and contribute to better patient care and precision medicine.
In this research, we introduce a novel semiparametric single-index model to investigate treatment effects on survival outcomes within the framework of interval-censored data. Our approach links the treatment variable to a linear combination of covariates through a flexible monotonic function, enabling the capture of intricate relationships between treatments and patient features. This design also maintains clinical interpretability, offering meaningful guidance for personalized therapy development. For parameter estimation, we employ a combination of nonparametric maximum likelihood and sieve estimation methods. In particular, monotone splines are utilized to approximate the cumulative baseline hazard function, while B-splines are applied to model the unknown link function. The estimation procedure is carried out via an expectation-maximization (EM) algorithm that leverages data augmentation to simplify computations. Using empirical process theory, we establish the asymptotic properties of the estimators. Additionally, simulation studies confirm that the proposed algorithm is robust, computationally efficient, and easy to implement in practice.
The rest of the paper is organized as follows. Section 2 introduces the notation, describes the proposed model, and derives the likelihood function based on the observed data. Section 3 formulates the sieve maximum likelihood estimation and presents the accompanying EM algorithm. Section 4 establishes the asymptotic properties of the proposed estimators. Section 5 and Section 6 present simulation results and apply the method to real-world data, including a case study of the ACTG320 clinical trial. Finally, Section 7 offers concluding remarks and discusses future research directions.
2. Notation, Models and Likelihood
Let
denote a vector of bounded baseline covariates, and let
be the treatment indicator, where 1 corresponds to the treatment group and 0 to the control group. We consider a semiparametric single-index model for the conditional cumulative hazard function of survival time
T, given
and
A, specified as
where
represents the unspecified cumulative baseline hazard function,
is an unknown strictly increasing link function,
denotes a
q-dimensional subset of
,
and
are vectors of unknown regression parameters. Under model (
1), the conditional cumulative distribution function of
T given
and
A can be written as
For identifiability, we impose the constraint
along with the sign restriction
, where
denotes the Euclidean norm and
is the first component of the vector
. When the link function
is linear, model (
1) simplifies to the classical Cox proportional hazards model incorporating a treatment–covariate interaction term [
18].
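For concreteness, the following is a hedged sketch of what model (1) and the induced distribution function look like under this description. The symbols below (X for the covariates, A for the treatment indicator, X-tilde for its q-dimensional sub-vector, beta and gamma for the regression parameters, Lambda for the cumulative baseline hazard, and psi for the link) follow the verbal definitions above; the exact notation is an assumption rather than a verbatim restatement of the original displays.

```latex
% Hedged reconstruction, consistent with the verbal description of model (1):
\Lambda(t \mid \mathbf{X}, A)
   \;=\; \Lambda(t)\,\exp\bigl\{\mathbf{X}^{\top}\boldsymbol{\beta}
       + A\,\psi\bigl(\widetilde{\mathbf{X}}^{\top}\boldsymbol{\gamma}\bigr)\bigr\},
\qquad
F(t \mid \mathbf{X}, A) \;=\; 1 - \exp\bigl\{-\Lambda(t \mid \mathbf{X}, A)\bigr\},
```

with the identifiability constraints read off from the text as a unit Euclidean norm for gamma and a positive first component of gamma; a linear psi then reduces the model to a Cox PH model with a treatment–covariate interaction, as stated above.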
To derive valid estimators under model (
1) that facilitate the identification of optimal treatment rules, we first introduce the notation
to represent the potential survival time an individual would experience if assigned treatment
a, with
. Following Wang et al. [
20], we impose two commonly adopted assumptions from the causal inference framework [
28]:
- (A1)
Stable Unit Treatment Value Assumption: The survival time T equals the potential survival time associated with the received treatment .
- (A2)
No Unmeasured Confounders Assumption: Conditional on the covariates , the treatment assignment A is independent of the potential outcomes .
Assumption (A1) guarantees that one individual’s treatment does not affect the potential outcomes of others. It also requires that there be no hidden variations of a given treatment, so that the potential outcome under the received treatment is well defined for every individual. Assumption (A2) is fundamental but remains unverifiable in observational studies. This assumption implies that there are no unmeasured confounders that simultaneously influence treatment decisions and survival outcomes after accounting for covariates
[
29]. If this assumption is violated, meaning that there exist unmeasured confounders that affect both treatment assignment and outcomes, the estimated treatment effects may be biased. Such bias can consequently lead to suboptimal or even misleading individualized treatment rules. This ultimately reduces the reliability and generalizability of the derived treatment regimes when applied to new patient populations. Given these assumptions, one can evaluate and contrast the potential survival outcomes associated with various treatment strategies and determine the best possible treatment regimen. In particular, the optimal treatment strategy can be formulated as
, where
denotes the indicator function. For better interpretability, this is equivalently expressed as
. The key motivation behind this rule lies in leveraging clinically meaningful contrasts in survival probabilities: when
, the model predicts that the patient’s survival probability is higher under treatment
than under
. Conversely, if
, the survival probability is higher if the patient receives treatment
. This thresholding reflects a direct comparison of estimated survival benefits, grounded in patient-specific covariates summarized by the index
. For each patient, following the treatment assignment defined by this decision rule ensures that the treatment selected is the one predicted to yield the highest survival probability based on their individual characteristics. This maximization is not merely a statistical abstraction but reflects a concrete comparison of patient-specific survival outcomes under competing treatment options. By explicitly basing treatment choice on which regimen offers the most favorable survival chance, the rule provides a clinically interpretable contrast: it indicates when one treatment clearly outperforms the other in terms of expected patient survival.
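To make the rule operational, the short R sketch below applies a decision of the form I{psi(gamma' x-tilde) < 0}, matching the description above. Here `psi_hat` and `gamma_hat` are illustrative placeholders rather than output of the actual fitting procedure, and the coding 1 = treatment, 0 = control follows the definition in Section 2.

```r
# Hedged sketch: applying the estimated single-index rule to new patients.
# `psi_hat` (fitted link) and `gamma_hat` (unit-norm index coefficients) are
# illustrative placeholders, not output from the actual fitting procedure.
assign_treatment <- function(X_tilde, gamma_hat, psi_hat) {
  index <- as.numeric(as.matrix(X_tilde) %*% gamma_hat)  # patient-specific single index
  as.integer(psi_hat(index) < 0)                         # 1 = assign treatment, 0 = control
}

# Illustrative usage with made-up estimates:
gamma_hat <- c(0.8, -0.6)                 # satisfies ||gamma|| = 1 with a positive first entry
psi_hat   <- function(u) u - 0.3          # a hypothetical estimated (linear) link
assign_treatment(data.frame(x1 = c(0.1, 0.9), x2 = c(0.9, 0.1)), gamma_hat, psi_hat)
```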
We consider interval-censored data, where the failure time is known only to fall within a specific time range and cannot be observed exactly. For each subject , the failure time is observed only to fall within the interval , with . Define the indicator variables as follows: if is left-censored, meaning and ; if is interval-censored, i.e., with and ; and if is right-censored, indicating and . Note that for each subject i, exactly one of these indicators is equal to 1, so that . In a randomized clinical trial setting with n subjects, the observed data can be summarized as
Under the two aforementioned assumptions and assuming that the observed failure times are conditionally independent given the covariates, the likelihood function for the observed data in relation to the model parameters can be written as follows:
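As a hedged sketch of the standard form this likelihood takes, writing F(· | X_i, A_i) for the conditional distribution function implied by model (1), (L_i, R_i] for the observation interval of subject i, and (delta_1i, delta_2i, delta_3i) for the left-, interval-, and right-censoring indicators defined above (notation assumed, not quoted):

```latex
L_n \;=\; \prod_{i=1}^{n}
   \bigl[F(R_i \mid \mathbf{X}_i, A_i)\bigr]^{\delta_{1i}}\,
   \bigl[F(R_i \mid \mathbf{X}_i, A_i) - F(L_i \mid \mathbf{X}_i, A_i)\bigr]^{\delta_{2i}}\,
   \bigl[1 - F(L_i \mid \mathbf{X}_i, A_i)\bigr]^{\delta_{3i}}.
```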
To handle the nuisance functions
and
, we approximate them with spline-based functions. In particular, since
is a positive, increasing function satisfying
, it is modeled using monotone splines [
30]. The approximation takes the following form:
where
represents the integrated spline basis functions. Each
is non-decreasing and takes values within the interval
, with corresponding spline coefficients
. To ensure the monotonicity of
, we enforce non-negativity constraints on the coefficients
. The total number of basis functions, denoted by
, is determined once the spline degree and the placement of interior knots are fixed. Here,
represents the number of interior knots, and
refers to the spline degree, which can be set to 1, 2, or 3 for linear, quadratic, or cubic splines, respectively. In practical applications, the interior knots may be chosen either as equally spaced points between the minimum and maximum observed times or according to selected quantiles of the observational time distribution.
On the other hand, we approximate
using
B-splines [
31], which provide a flexible and smooth approximation while ensuring the preservation of the function’s monotonicity. The approximation is given by:
where
represents the quadratic
B-spline basis functions, and
’s denote the corresponding spline coefficients. To ensure the monotonicity of
and to maintain numerical stability in the estimation process, we impose the condition
for some constant
. Similar to monotone splines,
B-splines require specifying the spline degree
and the set of interior knots. The total number of basis functions,
, is given by
where
denotes the number of interior knots. These knots may be placed either uniformly spaced between the minimum and maximum of
, or selected based on quantiles of these values for any unit vector
.
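As a concrete illustration of these two spline systems, the sketch below builds an integrated (monotone) spline basis for the cumulative baseline hazard and a quadratic B-spline basis for the link function using the `splines2` package; the package choice, knot placement, and variable names are illustrative assumptions rather than the authors' implementation.

```r
# Hedged sketch of the two spline systems using the `splines2` package.
library(splines2)
set.seed(1)

## Monotone (integrated) spline basis for the cumulative baseline hazard Lambda(t):
## cubic I-splines with interior knots at quantiles of pooled observation times.
obs_times <- runif(200, 0.1, 3)                          # placeholder inspection times
knots_t   <- quantile(obs_times, probs = c(0.2, 0.4, 0.6, 0.8))
I_basis   <- iSpline(obs_times, knots = knots_t, degree = 3, intercept = TRUE)
## Lambda(t) is then approximated by I_basis %*% eta with eta >= 0 (monotonicity).

## Quadratic B-spline basis for the unknown link function psi(u), u = gamma' x-tilde:
index_vals <- runif(200, -1, 1)                          # placeholder index values
knots_u    <- quantile(index_vals, probs = c(0.25, 0.50, 0.75))
B_basis    <- bSpline(index_vals, knots = knots_u, degree = 2, intercept = TRUE)
## psi(u) is approximated by B_basis %*% alpha with ordered coefficients (monotonicity).
```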
With these approximations, the likelihood in Equation (
2) can be expressed as:
where
Maximizing the likelihood function in (
3) is often computationally challenging due to its intractable form and the large number of parameters involved. Even for relatively simple models, such as the PH model, the optimization task remains inherently complex. In such cases, traditional methods like the Newton–Raphson algorithm can encounter significant numerical issues, including non-convergence or convergence to local extrema rather than the global optimum [
32,
33]. To address these challenges and simplify the computation of the sieve maximum likelihood, we develop an EM algorithm as a more efficient and tractable solution. This approach incorporates a three-stage data augmentation procedure, which effectively reduces the computational complexity and mitigates the numerical instability commonly associated with direct maximization. By introducing Poisson latent variables and iteratively refining the parameter estimates, the EM algorithm offers a more stable and efficient solution for estimating the parameters in the model.
3. Estimation Procedure
To improve the efficiency and stability of the estimation procedure, we propose an EM algorithm augmented with a three-stage data augmentation scheme. The EM algorithm alternates between two key steps: in the E-step, conditional expectations of latent variables are computed given the observed data and current parameter values; in the M-step, parameter estimates are updated by maximizing the expected log-likelihood obtained from the E-step. This iterative process continues until convergence, yielding stable and reliable parameter estimates. Incorporating the three-stage data augmentation further enhances computational efficiency and numerical stability by simplifying the calculations involved.
To facilitate maximization, we augment the dataset by introducing two layers of independent Poisson latent variables. Specifically, for each subject
i, two independent Poisson random variables are defined as
and
. Here, the notation
denotes the Poisson distribution with mean
. We express the terms
and
as follows:
and
if
. Using the spline-based representation of
, we can decompose the random variables
and
into the following sums
where
represent independent Poisson random variables. Each
has an expected value of
, while the expected value for each
is given by
, for
. We use the notation
to denote the probability mass function of a Poisson random variable
u with mean
. Additionally, we express
as
. Considering the latent variables as if they were fully observed, the complete data likelihood function can be expressed as
with the constraints:
,
,
if
,
and
if
, and
and
if
, where
for
.
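For readers less familiar with this construction, the identity that makes the augmentation work is the elementary Poisson fact sketched below (background only, with assumed notation, not the authors' exact latent-variable definitions): survival-type terms exp{−Λ} and distribution-type terms 1 − exp{−Λ} are, respectively, the probabilities that a Poisson variable with mean Λ equals zero or is positive, and the spline expansion of Λ splits that variable into independent components.

```latex
% Background sketch of the Poisson augmentation identity (assumed notation):
Z \sim \mathrm{Poisson}(\lambda)
  \;\Longrightarrow\;
  P(Z = 0) = e^{-\lambda},
  \qquad
  P(Z \ge 1) = 1 - e^{-\lambda};
% writing the spline expansion \lambda = \sum_{l} \lambda_l,
Z \overset{d}{=} \sum_{l} Z_l,
  \qquad
  Z_l \overset{\mathrm{ind}}{\sim} \mathrm{Poisson}(\lambda_l).
```

Because the latent counts decompose in this way, the conditional expectation of each component given the observed data is available in closed form, which is exactly what the E-step exploits.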
To estimate the parameters
,
,
, and
, we implement the EM algorithm. Specifically, in the E-step, we calculate the conditional expectation of the complete-data log-likelihood
with respect to the latent variables, given the current parameter estimates. This step yields
In the above equation,
,
,
, and
represent the
mth update of
,
,
, and
, respectively. Furthermore, the function
depends on these current iterates but does not vary with respect to the optimization variables
,
,
, and
. For the sake of brevity, we omit the explicit conditioning on the observed data and current parameter estimates in the conditional expectations. Let
,
, and
. The expressions for the above conditional expectations are given by
Furthermore, the conditional expectations
and
can be expressed as
In this process, terms independent of the unknown parameters can be omitted, allowing us to focus on the key contributions from the latent variables. This simplification yields a more tractable expression for optimization. Consequently, maximizing (
4) reduces to
The computational procedure proceeds as follows. Following the initialization approach described in Wang et al. [
20], we begin by fitting a PH model [
32] with covariates arranged as
. From this model, initial estimates
,
, and
are obtained. Specifically,
corresponds to the coefficient estimates for
, while
is derived from the interaction term coefficients associated with
. The initial values
are extracted from the spline coefficients of the estimated baseline cumulative hazard function, as detailed in Wang et al. [
32]. For the
B-spline coefficients
, we employ a least squares fitting procedure assuming a linear link function, where the intercept and slope correspond respectively to the coefficient for
A and the coefficients for
obtained from the PH model.
At iteration
, the conditional expectations
are computed using the current parameter estimates
, and
, along with the observed data. These expectations are derived from the expressions provided earlier. In the M-step, we obtain a closed-form update for each
(
) by setting the partial derivative of the function
with respect to
to zero. The resulting solution is expressed in terms of
,
, and
as follows:
Importantly, when each is initialized to a nonnegative value, all conditional expectations involved in the expression for are nonnegative. Consequently, remains nonnegative throughout the iterations. This desirable property of the proposed EM algorithm obviates the need for constrained optimization to enforce nonnegativity of , thereby simplifying the computational procedure.
Substituting
into (
5) yields the following objective function:
To update the parameter
, we first approximate the objective function using a first-order Taylor expansion around the current estimate
. This yields a surrogate function
in
, which facilitates optimization.
Subject to the constraint
and
,
is the first derivative of the link function
. The optimal solution
is obtained using established nonlinear optimization techniques, such as the
solnp() function from the “
Rsolnp” package [
34].
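As an illustration of how the unit-norm constraint can be handled with `Rsolnp::solnp()`, the sketch below minimizes a placeholder surrogate objective subject to a unit Euclidean norm and a nonnegative first component; `neg_surrogate` is hypothetical and stands in for the Taylor-expanded surrogate described above, so only the constraint handling is the point of this illustration.

```r
# Hedged sketch: unit-norm constrained update of the index coefficients via Rsolnp::solnp().
library(Rsolnp)

neg_surrogate <- function(gamma) {
  sum((gamma - c(0.6, -0.5, 0.4, 0.2))^2)        # placeholder for the negative surrogate objective
}

fit <- solnp(
  pars  = c(0.5, 0.5, 0.5, 0.5),                 # current estimate of the index coefficients
  fun   = neg_surrogate,
  eqfun = function(gamma) sum(gamma^2),          # equality constraint: ||gamma||^2 = 1
  eqB   = 1,
  LB    = c(0, rep(-1, 3)),                      # sign restriction: first component >= 0
  UB    = rep(1, 4)
)
gamma_new <- fit$pars
```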
In the subsequent step, we update the parameters
and
while fixing
at its current estimate
. Substituting this value into the objective function (
6) yields a new surrogate objective, denoted
, which depends solely on
and
.
We then maximize
subject to the ordered bound constraints
for
and
. This optimization is performed using the “
nloptr” package in
R [
35], which is particularly effective for optimization problems of this form.
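Similarly, a hedged sketch of how ordered-bound (monotonicity) constraints on the B-spline coefficients could be passed to `nloptr` is given below; the objective is again a placeholder for the surrogate in the text, the constraints are written in the g(alpha) <= 0 form that `nloptr` expects, and any additional unconstrained parameters updated in this step are omitted.

```r
# Hedged sketch: monotonicity-constrained update of B-spline coefficients with nloptr.
library(nloptr)

neg_surrogate_alpha <- function(alpha) {
  sum((alpha - seq(-1, 1, length.out = length(alpha)))^2)   # placeholder objective
}
ordered_constraint <- function(alpha) -diff(alpha)          # <= 0 iff alpha is non-decreasing
                                                            # (a strictly positive gap could be used instead)
res <- nloptr(
  x0          = rep(0, 6),                                  # current coefficient estimates
  eval_f      = neg_surrogate_alpha,
  lb          = rep(-10, 6),
  ub          = rep(10, 6),
  eval_g_ineq = ordered_constraint,
  opts        = list(algorithm = "NLOPT_LN_COBYLA", xtol_rel = 1e-6, maxeval = 2000)
)
alpha_new <- res$solution
```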
The proposed EM algorithm can be summarized as follows:
- Step 1:
Fit a standard PH model to obtain initial estimates , and . Initialize via least-squares regression. Set iteration counter .
- Step 2:
At iteration , compute the conditional expectations based on the current estimates , , , and , as well as the observed data.
- Step 3:
Update by maximizing subject to the unit-norm constraint and .
- Step 4:
Update and by maximizing using nonlinear optimization methods, subject to the ordered bounds .
- Step 5:
Calculate for .
- Step 6:
Set . Go to Step 2 and iterate until convergence.
The algorithm is considered to have converged when the maximum absolute change in the log-likelihood between consecutive iterations falls below a prespecified tolerance, such as 0.001. During initialization, we obtain the initial parameter estimates using the standard PH model as proposed in Wang et al. [
32]. To balance computational efficiency and estimation accuracy, we adopt a convergence criterion of 0.005 for the initial model fitting, which is less stringent than the commonly used 0.001 threshold. This relaxation significantly reduces computation time without compromising the quality of the initial estimates. In generating these initial values, the regression coefficients are initialized at zero, and the spline coefficients are set to one without multiple restarts. This approach provides a stable starting point, and subsequent optimization effectively refines the estimates, helping to avoid suboptimal local maxima. The simulation results presented below demonstrate the effectiveness of this initialization approach.
To facilitate readers who may be less familiar with the multi-layer data augmentation framework, we provide a concise pseudo-code representation of the proposed EM algorithm, which is presented as Algorithm 1. This pseudo-code complements the detailed mathematical derivations by clearly outlining the iterative steps involved in parameter estimation, thereby enhancing the overall clarity and accessibility of the algorithm.
Algorithm 1: Proposed EM algorithm.
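As a complement to Algorithm 1, the following R-style skeleton sketches the iteration in Steps 1–6; every helper function (`init_from_cox`, `e_step`, `update_gamma`, `update_alpha_beta`, `update_eta`, `loglik`) is a trivial stub with an illustrative name and signature, not the authors' implementation.

```r
# Hedged R-style skeleton of the EM iteration in Steps 1-6 (stub helpers only).
init_from_cox     <- function(data) list(beta = rep(0, 4), gamma = c(1, 0),
                                         alpha = rep(0, 6), eta = rep(1, 5))
e_step            <- function(par, data) list()                    # conditional expectations (Step 2)
update_gamma      <- function(par, expct, data) par$gamma          # unit-norm constrained update (Step 3)
update_alpha_beta <- function(par, expct, data) par[c("alpha", "beta")]  # ordered-bound update (Step 4)
update_eta        <- function(par, expct, data) pmax(par$eta, 0)   # closed-form nonnegative update (Step 5)
loglik            <- function(par, data) 0                         # observed-data log-likelihood

em_single_index_ic <- function(data, tol = 0.001, max_iter = 500) {
  par    <- init_from_cox(data)                 # Step 1: initial values from a PH fit
  ll_old <- loglik(par, data)
  for (m in seq_len(max_iter)) {                # Steps 2-6: iterate until convergence
    expct     <- e_step(par, data)
    par$gamma <- update_gamma(par, expct, data)
    par[c("alpha", "beta")] <- update_alpha_beta(par, expct, data)
    par$eta   <- update_eta(par, expct, data)
    ll_new    <- loglik(par, data)
    if (abs(ll_new - ll_old) < tol) break       # convergence check on the log-likelihood
    ll_old <- ll_new
  }
  par
}
```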
For spline estimation, the numbers of interior knots and are selected using the Akaike Information Criterion (AIC) or analogous model selection criteria. The link function is estimated as . Then the optimal treatment rule is given by . Finally, the average treatment effect (ATE) is estimated by .
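A hedged sketch of this AIC-based selection is shown below; `fit_single_index_ic` is a stub standing in for the estimation routine of this section, and only the grid-search logic is the point of the illustration.

```r
# Hedged sketch: AIC-based choice of the two interior-knot numbers.
# `fit_single_index_ic()` is a stub; the real version would return the maximized
# log-likelihood and the number of free parameters for a given knot configuration.
fit_single_index_ic <- function(data, k_monotone, k_bspline) {
  list(loglik = -500 - runif(1), n_par = 10 + k_monotone + k_bspline)
}

grid <- expand.grid(k_monotone = 1:8, k_bspline = 1:8)
grid$aic <- mapply(function(k1, k2) {
  fit <- fit_single_index_ic(NULL, k1, k2)
  2 * fit$n_par - 2 * fit$loglik                 # AIC = 2p - 2 * loglik
}, grid$k_monotone, grid$k_bspline)
grid[which.min(grid$aic), ]                      # selected knot configuration
```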
4. Asymptotic Behavior and Variance Estimation
In this section, we aim to establish the theoretical properties of the estimators proposed for the model parameters. To begin, let us clearly specify the estimators, the true model parameters, and the relevant function spaces. Let
denote the estimator of the parameter vector
, where
,
,
and
are the proposed estimators of
,
and
, respectively. The true parameter values are represented by
,
,
and
. Collectively, we write the true parameter vector as
. For notational convenience, let
and
denote the Euclidean norm and the sup-norm, respectively. Specifically, for a vector
, the Euclidean norm is defined by
, whereas the sup-norm is
. Let
be the maximum follow-up time considered in the study, and
be the space of bounded sequences on the interval
. We introduce the function space for the cumulative hazard function as
is monotone increasing with
and
. Moreover, we define the
-norm on
with respect to the Lebesgue measure, defined as
where
denotes the distribution function of
t. Let
denote the union of the supports of
over all
satisfying
. We then define
and
as the
-norm and
-norm over
, respectively, such that
and
where
is the distribution function of the variable
y.
Define the knot sequence
, satisfying
where the interval
is divided into
subintervals.
Similarly, define the knot sequence
, with points
, such that
where the interval
is partitioned into
subintervals.
To establish the asymptotic behavior of the proposed estimators, we impose the following regularity assumptions.
- (C1)
The true parameter vectors and , where and are compact sets. The function , possesses a strictly positive, continuously differentiable first derivative over the interval . The function is strictly increasing with respect to y and has continuous derivatives up to third order on its domain .
- (C2)
The vector of covariates is supported within a bounded subset of , and its probability density function is bounded away from zero.
- (C3)
For , the th derivative of satisfies the Lipschitz continuity condition on . Specifically, there exists a constant such that for all , . Similarly, the th derivative of satisfies the Lipschitz condition on . That is, there exists a constant such that for all , .
- (C4)
Define the maximal knot spacings as , . Additionally, the ratios of maximum to minimum knot spacings and are uniformly bounded, where .
- (C5)
If there exists a unit vector such that almost surely, then it must hold that .
- (C6)
The number of interior knots satisfies that
and
. Let
be the upper bound of the
B-spline coefficients and
the upper bound for the monotone spline coefficients. These satisfy:
Additionally, the distance between adjacent interior knots lies in the interval
for some constant
.
- (C7)
If for any in a certain support set with probability 1, then and in this support set.
Conditions (C1) and (C2) ensure that both the true parameter values and covariates lie within bounded regions, which is a standard requirement in the analysis of interval-censored data [
36,
37]. These conditions also guarantee the requisite smoothness of the functions
and
. Meanwhile, condition (C3) is introduced to facilitate the derivation of the convergence rate of the estimators. Condition (C4) is essential for establishing asymptotic normality [
38]. Condition (C5) serves as an identifiability condition for single-index models, and it is satisfied when
follows a multivariate normal distribution [
20]. Under condition (C6), one may choose the number of knots as
, and the spline coefficient bound as
for any constant
. Condition (C7) is used to prove that the matrix
is nonsingular. Notably, the aforementioned regularity conditions may appear somewhat strong or restrictive at first glance. However, such assumptions are standard in the theoretical analysis of interval-censored and semiparametric models. They are crucial for guaranteeing key theoretical properties of the proposed estimators, including consistency and asymptotic normality. Moreover, these conditions are practically reasonable and often satisfied in real applications. For instance, parameter spaces are typically bounded due to prior scientific knowledge or data domain restrictions; smoothness and monotonicity assumptions naturally align with the typical behavior of cumulative hazard functions; and the bounded support of covariates ensures stable estimation. Therefore, the imposed regularity conditions strike a balance between mathematical tractability and practical applicability.
Theorem 1. Suppose that conditions (C1)–(C6) are satisfied. Then the estimators satisfy , and in probability, where for any differentiable function f with derivative , the norm is defined as .
In the above theorem,
denotes the norm in the Sobolev space
, which incorporates both the function itself and its first-order derivatives [
39]. This norm quantifies the uniform boundedness and smoothness of functions in the space. By constraining the estimator under this norm, we effectively restrict its complexity and prevent excessive local oscillations, which promotes estimator stability and favorable convergence properties such as consistency.
To characterize the asymptotic distribution, we impose the assumption without loss of generality. For a q-dimensional vector , we denote by the vector consisting of its first components.
Theorem 2. Assume conditions (C1)–(C7) hold, and let the combined parameter vector be defined as with true value . Then, as , where denotes convergence in distribution, and is the information matrix evaluated at . Consequently, converges in distribution to a mean-zero normal random vector with covariance matrix . Furthermore, the inverse information matrix achieves the semiparametric efficiency bound. The comprehensive proofs of Theorems 1 and 2 can be found in
Appendix A. For conducting inference on the true parameter vectors
and
, it is vital to obtain consistent estimates of the covariance matrices of the estimators
and
. We recommend a nonparametric bootstrap procedure, which proceeds in three main steps. First, generate
S bootstrap samples by resampling the original dataset with replacement, where
S is a large integer (typically
or greater). Second, for each bootstrap replicate
, compute the parameter estimates
,
,
, and
using the same estimation procedure applied to the original data. Third, approximate the asymptotic covariance matrices of
and
by the sample covariance matrices of the bootstrap estimates
and
, respectively. Since the asymptotic distributions of
and
are not well characterized under interval censoring, we construct 95% pointwise confidence bands using the 2.5% and 97.5% quantiles of the corresponding bootstrap replicates
and
. We note that these confidence bands provide valid coverage only at each individual point (i.e., pointwise coverage) but do not guarantee simultaneous coverage over the entire range of the functions. Developing methods for constructing simultaneous confidence bands under interval censoring remains an important direction for future research.
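As an illustration, a minimal R sketch of this resampling scheme is given below; `fit_fun` is a placeholder for the full estimation procedure of Section 3 and is assumed to return the finite-dimensional parameter estimates as a numeric vector.

```r
# Hedged sketch: nonparametric bootstrap for standard errors and 95% percentile intervals.
bootstrap_inference <- function(data, fit_fun, S = 100) {
  n   <- nrow(data)
  est <- replicate(S, {
    boot_data <- data[sample.int(n, n, replace = TRUE), , drop = FALSE]  # resample subjects
    fit_fun(boot_data)                                                   # refit on the bootstrap sample
  })
  list(se = apply(est, 1, sd),                                    # bootstrap standard errors
       ci = t(apply(est, 1, quantile, probs = c(0.025, 0.975))))  # percentile confidence limits
}
```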
5. Simulation Studies
This section reports the results of simulation experiments designed to assess the finite-sample behavior of the proposed method. For each subject
i, four independent covariates,
, were generated from a uniform distribution over the interval
to represent the main effects. The influence of treatment on survival times depended solely on the first two covariates,
and
. The failure time
was generated according to the model in Equation (
1), with main-effect parameters
and interaction-effect parameters
. Treatment assignment
was generated independently of the covariates
and followed a Bernoulli distribution with probability 0.5. Simulations were conducted for sample sizes of 500, 800, and 1000, each repeated 500 times to evaluate performance. The baseline hazard function
was defined as a Weibull hazard with shape parameter
and scale parameter
, yielding the cumulative baseline hazard
. Two scenarios were examined, differing only by the choice of the link function: (i)
; and (ii)
. The follow-up period was fixed at
, with no observations beyond this time. To induce interval censoring, two inspection times were generated for each individual:
and
. These inspection times divided the timeline into three intervals:
,
, and
. Under these settings, approximately 12–17% of observations were left-censored and 34–41% were right-censored.
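The following R sketch illustrates one way such data could be generated, assuming the hazard structure Lambda(t | X, A) = Lambda0(t) exp{X'beta + A psi(X-tilde' gamma)} consistent with the model description in Section 2; the numerical values (true parameters, Weibull shape and scale, follow-up length, and inspection-time distributions) are illustrative placeholders, not the settings used in the paper.

```r
# Hedged sketch of the simulation design, with illustrative (not the paper's) values.
set.seed(2024)
n      <- 500
X      <- matrix(runif(n * 4, -1, 1), n, 4)          # four baseline covariates (illustrative range)
A      <- rbinom(n, 1, 0.5)                          # randomized treatment assignment
beta   <- c(0.5, -0.5, 0.5, -0.5)                    # illustrative main-effect parameters
gamma  <- c(1, -1) / sqrt(2)                         # unit-norm index coefficients on X1, X2
psi    <- function(u) u                              # scenario (i): linear link (illustrative)

shape <- 2; scale <- 1                               # illustrative Weibull baseline hazard
eta    <- exp(X %*% beta + A * psi(X[, 1:2] %*% gamma))
U      <- runif(n)
T_true <- scale * (-log(U) / eta)^(1 / shape)        # invert Lambda0(t) = (t / scale)^shape

tau <- 3                                             # fixed end of follow-up (illustrative)
U1  <- runif(n, 0, tau / 2); U2 <- U1 + runif(n, 0, tau / 2)   # two inspection times
d1  <- as.integer(T_true <= U1)                      # left-censored
d2  <- as.integer(T_true > U1 & T_true <= U2)        # interval-censored
d3  <- as.integer(T_true > U2)                       # right-censored
```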
We explored a variety of models with the number of knots ranging from 1 to 8. Through this exploration, the cumulative baseline hazard function was approximated using cubic monotone splines equipped with 5 interior knots, which were positioned at equally spaced quantiles of the observed event times. For the link function , quadratic B-splines with 3 interior knots were employed, with knots located at equally spaced quantiles of the values . The model incorporating 5 interior knots for the monotone splines and 3 interior knots for the B-splines achieved the lowest Akaike information criterion (AIC) and was thus selected. In the simulation studies, we also investigated the effect of varying the tuning parameter on the estimation performance. The empirical results demonstrate that once surpasses roughly 13, the estimates stabilize, showing very little sensitivity to further increases. For practical simplicity, we fixed as the default choice for all simulations.
Table 1 presents the finite-sample performance metrics of the proposed estimator under the linear link setting
. For each parameter, we summarize four statistics computed over 500 replications: Bias (defined as the average difference between the estimated and true parameter values), sample standard error of the 500 estimates (SSE), the average of the 500 standard error estimates (SEE), and 95% coverage probability (CP95) based on normal approximation. Standard errors were obtained through a nonparametric bootstrap procedure with 100 bootstrap samples per replication. As shown in
Table 1, the proposed approach yields nearly unbiased estimates for both main-effect parameters
and interaction parameters
. The SSE and SEE closely align, and the 95% coverage probabilities remain close to the intended nominal level, indicating the bootstrap variance estimator is reliable. Furthermore, improvements in bias and variability are observed as the sample size increases from 500 to 1000. The lower part of
Table 1 reports inference results for the ATE, denoted
. Across sample sizes, the ATE estimator displays negligible bias, well-calibrated standard errors, and coverage probabilities near 95%. Collectively, these findings validate the practical effectiveness of the proposed method for finite samples. We conducted a comparison between our proposed method and the approach of Wang et al. [
32], hereafter referred to as the “Cox method.” As shown in
Table 1, Wang’s method exhibited considerable bias in the estimation of the
parameters, while the bias for the
parameters remained relatively moderate. The poor performance of the Cox model, despite the linearity assumption, is likely due to the binary nature of the treatment variable
A (taking values 0 or 1), which limits the model’s ability to accurately estimate
, resulting in biased estimates. Furthermore, Wang’s method faced considerable challenges in variance estimation, with many variance estimates reported as NA. This issue may be partly attributed to the EM algorithm employed by Wang et al. [
32], which relies on
optim() and requires convergence at every iteration step; failure to converge at any step can cause the procedure to crash. Consequently, we only present bias and SSE for this comparison. It is worth noting that the frequency of these NA values decreased as the sample size increased, suggesting that Wang’s method may be more appropriate for large-sample scenarios. Overall, these results demonstrate the enhanced accuracy and robustness of our proposed method, particularly in small to moderate sample settings.
Figure 1 displays the estimated baseline survival function
and the linear link function
. The estimated
and
closely align with their true counterparts, demonstrating high accuracy. With the visualization of the estimated link function
, clinicians can compute the patient-specific risk score, defined as the index variable
derived from patient covariates, and then use the plotted curve to more intuitively determine the optimal treatment. Specifically, by locating the patient’s risk score on the horizontal axis of the
plot, clinicians can observe whether the function value is below zero, which indicates that treatment
is preferable, or above zero, indicating treatment
is preferable. This visual tool enhances interpretability and facilitates transparent, personalized treatment decisions based on the model’s estimates. These findings underscore the robustness and reliability of the proposed method, particularly for larger datasets.
Table 2 reports simulation results for parameter estimates obtained using the exponential link function
. Consistent with the findings in
Table 1, the proposed method exhibits low bias, accurate variance estimation, and 95% coverage probabilities near the nominal level. As anticipated, estimation accuracy improves and bias diminishes with increasing sample sizes.
Figure 2 presents the mean estimates of the link function
and the baseline survival function
. As shown in
Table 2, the Cox method continues to exhibit substantial bias in estimating the
parameters; however, the bias is notably reduced compared to the scenario where the function
is assumed linear. Interestingly, under this setup, we observed that Wang et al. [
32]’s method encountered far fewer NA values in variance estimation than in previous settings. This suggests that the Cox method may not be well-suited for the simulation scenario where
. These results further confirm accurate estimation and excellent agreement with the true functions.
To assess the performance of the derived treatment decision rules, we created an independent test dataset consisting of 1000 subjects. Individualized treatment recommendations were generated by applying the estimated coefficients from the proposed model. These recommendations were then compared against the true optimal treatment assignments.
Table 3 reports the average rates of treatment assignment errors under both the linear and exponential link functions. The model misclassified treatment assignments at very low rates—0.8% for the linear link and 0.2% for the exponential link—highlighting the high accuracy of the proposed method in identifying appropriate treatments. Collectively, these findings reinforce the method’s robustness in tailoring treatment strategies on a personalized level and its potential to enhance clinical outcomes.
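For reference, the error rate reported here is simply the proportion of disagreements between recommended and truly optimal assignments on the test set; the R sketch below illustrates the computation with made-up true and estimated quantities under the assumed rule form I{psi(gamma' x) < 0}.

```r
# Hedged sketch: treatment-assignment error rate on an independent test set (all values illustrative).
set.seed(3)
X_test     <- matrix(runif(1000 * 2, -1, 1), 1000, 2)
gamma_true <- c(1, -1) / sqrt(2);  psi_true <- function(u) u          # illustrative truth
gamma_hat  <- c(0.72, -0.69);      psi_hat  <- function(u) 1.05 * u   # illustrative estimates

d_opt <- as.integer(psi_true(X_test %*% gamma_true) < 0)   # true optimal assignments
d_rec <- as.integer(psi_hat(X_test %*% gamma_hat) < 0)     # recommended assignments
mean(d_rec != d_opt)                                       # treatment assignment error rate
```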
To comprehensively evaluate and compare the predictive performance of the proposed model against the Cox model, patients in the test dataset were stratified into three distinct subgroups according to their “risk scores”, defined as
. This stratification aims to assess how well each model performs across different levels of patient risk, allowing for a more nuanced comparison beyond aggregate measures. The three subgroups correspond to low-risk patients with
, moderate-risk patients with
, and high-risk patients with
.
Figure 3 illustrates the average survival probability over time for each subgroup under the exponential link function scenario with a sample size of
. This breakdown enables us to examine the treatment rule effectiveness and survival predictions within clinically relevant risk categories, highlighting areas where the proposed model may offer improved individualized prognostication relative to the Cox model. Notably, in subgroups (i) and (iii), which correspond to patients with
less than 0.2 and greater than 0.5 respectively, the survival probabilities predicted by both the proposed model and the Cox model closely match the true survival curves. This alignment indicates that, for low-risk and high-risk patients, both models capture the underlying survival patterns with comparable accuracy. However, a distinct difference emerges in subgroup (ii), which includes 176 patients with intermediate
between 0.2 and 0.5. As illustrated in
Figure 3, the proposed model’s predicted average survival probabilities more closely approximate the true survival function compared to those generated by the Cox model. Specifically, at time
, patients receiving treatments based on the proposed model have an average survival probability of 0.33, which is notably closer to the optimal treatment rule’s survival probability of 0.36. In contrast, patients treated according to the Cox model’s recommendations exhibit a lower average survival probability of 0.31. These findings highlight the superior ability of the proposed model to personalize treatment decisions within this moderate-risk subgroup. The improved performance likely stems from the model’s flexible structure, which better captures complex nonlinear relationships and interactions that may be overlooked by the Cox model.
We assessed the computational efficiency of our approach as follows. In terms of computation speed, the proposed method generally requires less than 2 min to calculate the point estimates based on one simulated dataset of size 1000. For variance estimation, we employed the bootstrap approach with resamples, balancing accuracy and efficiency. To mitigate the increased computation time, we implemented parallel processing utilizing 20 cores to accelerate both the bootstrap procedure and the overall algorithm. Regarding memory usage, monitoring with system tools indicated that the main R process running the computation consumes approximately 4 GB of RAM for the dataset of this size. This memory usage is moderate and considered feasible for typical modern desktop environments.
6. Application
We applied the proposed approach to data from the AIDS Clinical Trials Group (ACTG) 320 study, a randomized, double-blind, placebo-controlled trial designed to assess the efficacy of a three-drug antiretroviral regimen—indinavir (IDV) combined with open-label zidovudine (ZDV) or stavudine plus lamivudine (IDV group)—compared to a two-drug regimen of ZDV (or stavudine) plus lamivudine alone (non-IDV group). The trial enrolled HIV-1-infected individuals who had undergone at least three months of prior zidovudine therapy and had baseline CD4 counts not exceeding 200 cells/mm³ [40]. Participants were administered open-label ZDV and lamivudine and were randomized to receive either indinavir treatment or placebo every eight hours. The primary composite outcome, denoted by
T, was defined as the time until the first occurrence of an AIDS-defining event, a reduction of at least 50% in CD4 cell count from baseline, or death due to any cause. Further details on the trial’s design and procedures are available in Hosmer et al. [
41]. Our objective was to explore baseline patient features that modify treatment effects on survival and to develop individualized treatment rules that optimally allocate either the three-drug IDV-containing regimen or the two-drug non-IDV regimen to maximize expected survival benefit. A key analytic challenge stemmed from the periodic nature of clinical follow-up assessments in the trial. Since AIDS-defining events, substantial CD4 declines, or deaths were detected only at scheduled visits, typically occurring every 4 to 8 weeks, the exact event time could not be precisely determined and was only known to lie between two consecutive visits. Consequently,
T is interval-censored, meaning it is known to fall within the interval between the last event-free visit and the first visit at which the event was confirmed. This interval censoring complicates standard survival analysis and necessitates specialized semiparametric modeling frameworks to ensure valid inference and optimal treatment allocation.
After removing participants with missing covariates, the ACTG320 dataset comprised 1080 individuals, of whom 65 had interval-censored event times and 992 had right-censored observations. While most observations are right-censored because no event had occurred by the last follow-up, about 6% are interval-censored as a direct result of the study design rather than data limitations. Treating these interval-censored cases as exact event times or as right-censored observations may lead to biased survival estimates. Although interval-censored observations represent a smaller portion of the dataset, properly incorporating them improves inference accuracy and fully utilizes the available information, thus enhancing model validity compared to methods that consider only right censoring. We incorporated 10 covariates as main effects:
Sex (1 = male, 0 = female),
Race (1 = white Non-Hispanic, 0 = otherwise),
Ivdrug (1 = never used IV drugs, 0 = current or previous IV drug use),
Hemophil (1 = hemophiliac, 0 = otherwise),
Weight (weight at enrollment in kilograms),
Karnof (Karnofsky Performance Scale, where 100 indicates normal functioning without complaints or disease signs; 90 denotes minor symptoms with normal activity; 80 reflects some symptoms with activity requiring effort; and 70 means the ability to care for oneself but no normal or active work),
AveCD4 (baseline CD4 count),
AveCD8 (baseline CD8 count),
Priorzdv (months of prior ZDV use), and
Age (participant age at enrollment in years). All continuous covariates were standardized to have a mean of 0 and a standard deviation of 1. Previous studies by Jiang et al. [
42] and Geng et al. [
43] suggested potential interactions between treatment effects and the covariates
Karnof,
Weight,
AveCD4, and
Age, leading us to include these four variables as interaction terms. We evaluated the influence of treatment–covariate interactions on survival time and derived the optimal treatment strategies by fitting the specified model (
1).
We assumed that the failure time followed a semiparametric single-index model characterized by the conditional hazard function:
where
represent the ten main effects mentioned earlier, with
corresponding to
Karnof,
Weight,
AveCD4, and
Age, respectively. Following the simulation settings, we used cubic monotone splines to approximate
and quadratic
B-splines to approximate the link function, with 5 interior knots for the monotone splines and 3 for the
B-splines. The interior knots were positioned at equally spaced quantiles covering the range from the smallest to the largest observed times.
Figure 4 displays the estimated link function, revealing a clear nonlinear pattern.
Table 4 presents a comparison between our proposed approach and the traditional Cox proportional hazards model in assessing main covariate effects, the average treatment effect, and treatment–covariate interaction terms. Displayed results include parameter estimates (EST), estimated standard errors (SE), and corresponding
p-values. For our method, the standard errors were obtained through nonparametric bootstrapping using 100 resamples. The estimated interaction coefficients quantify the relative contributions of each treatment–covariate interaction to modifying treatment effects. Age (0.683) and Karnofsky score (0.297) have positive coefficients, indicating that higher values increase the single-index score
. Weight (0.153) also shows a positive, though smaller, effect, while average CD4 count has a negative coefficient (−0.649), suggesting higher CD4 levels decrease the index. Since our treatment rule assigns the IDV regimen when the estimated link function
is less than zero, patients with lower single-index values, who are often younger or have lower Karnofsky scores but favorable profiles in other covariates, are recommended for treatment. Our model identified both race and age as significant predictors of survival, whereas the Cox model only found race to be significant. A key finding from both models is the significant interaction between the baseline CD4 count (
AveCD4) and treatment, indicating that patients with lower initial CD4 levels derive greater benefit from the IDV-inclusive therapy, experiencing higher survival probabilities or extended survival times. Crucially, our method uncovered an additional significant interaction: age modifies treatment efficacy, with older patients experiencing a greater survival advantage from the IDV regimen than younger patients. Although the average treatment effect was significant in both analyses, our method yielded a far more stable estimate (SE = 0.026) compared to the Cox model (SE = 0.395). Unlike the Cox model, which relies on the proportional hazards assumption, our proposed method provides robust analysis even when this assumption is violated, offering a more reliable representation of treatment effects across diverse patient populations. In summary, our analysis demonstrates that the proposed flexible model reveals significant effects, such as the treatment-by-age interaction, that are missed by the Cox model, which imposes a linear structure on the link function. As demonstrated in
Figure 4, the link function clearly exhibits nonlinear patterns, highlighting the advantage of our approach’s flexibility and robustness to potential model misspecification. This makes it particularly well-suited for analyzing complex datasets with interval-censored outcomes, enabling more personalized and effective treatment strategies for HIV patients.
The results presented in
Table 5 compare the mean survival probabilities at specific time points across three treatment arms: Treatment A (IDV combined with open-label ZDV and lamivudine for all patients), Treatment B (ZDV and lamivudine alone for all patients), and Treatment C (a personalized treatment regimen tailored to individual patient characteristics based on the proposed model). The findings clearly indicate that Treatment C consistently achieves the highest survival probabilities, underscoring the effectiveness of personalized treatment plans driven by individual patient characteristics. Among the three arms, Treatment A demonstrates the lowest survival probabilities, suggesting that the addition of IDV to ZDV and lamivudine does not yield improved outcomes compared to the other regimens. Treatment B, while outperforming Treatment A, shows survival probabilities that are slightly lower than those of Treatment C. This indicates that the standard combination of ZDV and lamivudine alone is more effective than adding IDV but falls short of the benefits provided by a tailored approach. These findings emphasize the importance of personalized medicine in improving patient outcomes.
We further visualized the estimated link function
against age to provide a more clinically interpretable illustration of treatment effect heterogeneity.
Figure 5 exhibits a nonlinear relationship that aligns with the overall pattern observed between
and the index variable. Notably, the link function dips below zero primarily at younger ages, which, according to our treatment rule, corresponds to recommending the IDV regimen. This pattern suggests that treatment decisions are not driven by age alone but by the combined effect of age and other covariates encapsulated in the single-index
.
In particular, some younger patients with favorable profiles in other covariates may have low overall index values and thus be assigned to IDV treatment. This observation underscores the complexity and multivariate nature of the personalized treatment rule derived from our model.
7. Discussion and Concluding Remarks
In this study, we introduced a new framework for estimating optimal individualized treatment rules by modeling treatment–covariate interactions within a flexible single-index model. Our method utilizes a sieve maximum likelihood estimation technique specifically designed for interval-censored survival data, incorporating monotone splines to model the cumulative baseline hazard and
B-splines to approximate the link function. The adaptability of the resulting treatment rules arises from this nonparametric link function, which effectively captures complex, potentially nonlinear interactions between treatment assignment and patient covariates. To efficiently compute the sieve estimators, we developed an easy-to-implement EM algorithm. Building on empirical process theory, we derived the asymptotic properties of our estimators, thereby providing rigorous theoretical support for the proposed methodology. In this work, we utilized
bootstrap replications for inference, which provides reliable standard error estimation. While increasing the number of replications could further improve the accuracy of confidence interval calibration, this was not feasible in the current study due to computational time constraints. Future research will focus on enhancing computational efficiency, allowing for a larger number of bootstrap replications and more precise inference. To promote wider use of the proposed method, the R code is openly accessible at
https://github.com/ssshyyy0411/single-index-IC (accessed on 11 March 2026).
Several promising directions merit further investigation. First, extending the current approach to high-dimensional covariate settings is essential, given the increasing prevalence of biomarkers and genomic data in modern precision medicine. Direct application of the existing method without proper regularization or variable selection may lead to overfitting and reduced interpretability in such settings. Efficiently handling high-dimensional data thus requires incorporating sparsity-inducing techniques, such as LASSO [
44], SCAD [
45], or other regularization methods. These approaches enable simultaneous variable selection and treatment–covariate interaction modeling, thereby helping to maintain estimation accuracy and improve model interpretability. Developing and integrating these mechanisms represents a crucial direction for future methodological enhancements. Moreover, many clinical decisions involve not a single binary treatment, but rather multiple competing treatments or dynamic sequences of interventions. Therefore, to broaden the applicability of our method, it is essential to develop extensions that can handle multi-arm and longitudinal treatment settings. Second, incorporating advanced machine learning architectures, including deep neural networks, could substantially enhance flexibility, as such models can capture complex, nonlinear relationships between covariates and outcomes. These methods have proven effective in a range of semiparametric regression settings, including those involving right-censored survival data [
46] and interval-censored survival data [
47,
48]. Their ability to automatically learn patterns from large datasets could complement traditional statistical methods and enhance predictive accuracy. Third, accommodating time-varying treatment effects is critical for chronic disease management and longitudinal interventions, where treatment efficacy may evolve with disease progression or cumulative exposure [
49,
50]. Extending the framework to dynamic treatment regimes would broaden its clinical relevance. Fourth, an important extension involves incorporating cluster-level random effects to capture unobserved heterogeneity across centers or clusters. In multi-center studies or clustered designs, fixed or random effects are commonly employed to model center-specific or cluster-specific variability [
51]. Integrating such random effects into the proposed framework would broaden its applicability to multi-center trial data, thereby improving both inference and predictive performance. In addition, while the current work is developed within a sieve maximum likelihood framework, situating the methodology within the broader frequentist-Bayesian context would further enrich its scope. The Bayesian approach offers complementary advantages such as natural uncertainty quantification through posterior distributions, the incorporation of prior knowledge, and flexible modeling of complex hierarchical structures. Recent advances in Bayesian degradation modeling and inference [
51,
52] provide useful methodological references for such an extension. Furthermore, in the current ACTG320 analysis, subjects with missing baseline covariates were excluded for simplicity and to maintain a complete-case analysis framework. We acknowledge that this approach may introduce potential selection bias and reduce statistical efficiency. Incorporating principled missing-data methods such as multiple imputation [
53] would help to reduce bias and increase the robustness of the results. We plan to explore these approaches in future work to improve the analysis. Finally, generalizing the methodology to other semiparametric survival models, such as accelerated failure time models [54] or transformation models [55], would enable richer modeling of the underlying survival distribution and treatment–covariate dependencies. Such extensions could also accommodate cure rate structures or competing risks, thereby enhancing applicability across diverse medical and public health contexts.