Article

A Two-Step Variable Selection Strategy for Multiply Imputed Survival Data Using Penalized Cox Models

by Qian Yang 1, Bin Luo 2, Chenxi Yu 3 and Susan Halabi 3,*
1 Division of Infectious Diseases, Department of Medicine, Emory University School of Medicine, Atlanta, GA 30322, USA
2 School of Data Science and Analytics, Kennesaw State University, Marietta, GA 30060, USA
3 Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
* Author to whom correspondence should be addressed.
Bioengineering 2025, 12(11), 1278; https://doi.org/10.3390/bioengineering12111278
Submission received: 1 September 2025 / Revised: 7 November 2025 / Accepted: 18 November 2025 / Published: 20 November 2025
(This article belongs to the Section Biosignal Processing)

Abstract

Multiple imputation (MI) is widely used for handling missing data. However, applying penalized methods after MI can be challenging because variable selection may be inconsistent across imputations. We propose a two-step variable selection method for multiply imputed datasets with survival outcomes: apply LASSO or ALASSO to each MI dataset, followed by ridge regression, and combine estimates using the variables selected in any, or in d% (d = 50, 70, 90, 100), of the MI datasets. For comparison, we also fit stacked MI datasets with weighted penalized regression and a group LASSO approach that enforces consistent selection across imputations. Simulations with Cox models evaluated tuning by AIC, BIC, cross-validation at the minimum error, and the 1SE rule. Across scenarios, performance differed by both the penalization and the selection rule. More conservative choices, such as ALASSO with BIC and a 50% inclusion frequency, tended to control false positives and gave more stable calibration. The grouped approach achieved comparable selection with modestly higher estimation error. Overall, no single method consistently outperformed the others across all scenarios. Our findings suggest that practitioners should weigh trade-offs between selection stability, estimation accuracy, and calibration when applying penalized methods to multiply imputed survival data.

1. Introduction

Clinical studies are often hampered by missing data [1]. The missingness may be due to nonresponse, early dropout, or data collection errors. Missingness is typically classified into three types according to its mechanism: missing completely at random (MCAR), where missingness is unrelated to any study variable; missing at random (MAR), where the probability of missingness is related to the observed data; and missing not at random (MNAR), where the probability of missingness depends on the unobserved data [2]. Common approaches for handling missingness include regression [3], maximum likelihood estimation [4,5], Bayesian methods [6], and multiple imputation (MI) [7,8]. Among these methods, multiple imputation is recognized for producing less biased estimates and is a widely accepted approach for mitigating the impact of missing data in clinical research [9,10,11,12].
In this study, we assume missing at random (MAR), where missingness is conditionally independent of unobserved data given the observed covariates. This assumption underlies most multiple imputation methods, including those implemented here. However, it is important to acknowledge that missing not at random (MNAR) may occur in clinical datasets. For example, certain biomarkers may be more likely to be missing in patients with worse prognosis due to sample processing failures or selective testing, which may lead to a missingness pattern related to unobserved health status. While handling MNAR typically requires additional assumptions or sensitivity analyses, we focus here on MAR as a practically reasonable and widely used working assumption in oncology studies.
Penalized selection methods, such as the least absolute shrinkage and selection operator (LASSO) [13], the adaptive least absolute shrinkage and selection operator (ALASSO) [14], and the elastic net [15], have been extensively used to identify important predictors of clinical outcomes. However, when applied to multiply imputed datasets, these approaches pose a new challenge. In MI, each imputed dataset is a plausible version of the original data, and performing variable selection separately on each one may yield inconsistent sets of selected variables. This inconsistency violates the assumptions underlying Rubin’s rules (RR) [7], making it difficult to combine estimates or draw overall conclusions.
Wood et al. [16] proposed three general strategies: (1) performing variable selection within each imputed dataset and selecting variables based on their inclusion frequency, (2) stacking all imputed datasets into a single dataset and applying variable selection once, using appropriate weights, and (3) conducting stepwise model selection using the Wald statistic combined across imputations via the RR. Although the RR approach was highly recommended for preserving the type I error rate, it was computationally intensive and may not scale well to larger datasets. However, Wood et al. mainly focused on continuous outcomes and assumed MCAR under most simulation scenarios, which limits the applicability of their findings in more realistic settings.
Subsequent research has extended penalized methods such as LASSO and elastic net to MI data, mostly for continuous or binary outcomes. Approaches include MI-LASSO, a group LASSO method [17] for joint modeling across imputed data; MI-WENet, a weighted elastic net method applied to stacked MI data [18]; penalized objective functions that enforce consistent selection across imputations, with both “stacked” and “grouped” strategies [19]; and variable selection based on the magnitude of estimates across imputations [20]. Building on earlier work, Zhao and Long [21] categorized imputation-based variable selection into pooled, stacked, and resampling-enhanced strategies, and highlighted that the choice of method remains context-dependent and under-developed. Thao and Geskus [22] systematically compared LASSO-based approaches under multiple imputation and highlighted the robustness of the 1-SE penalty across settings.
Despite recent advances, variable selection with MI in survival analysis is underexplored. To our knowledge, studies applying group LASSO, inclusion frequency-based methods, or stacked penalized regression methods to proportional hazards models under MAR are lacking. Recent work on more complex survival models, such as multi-parameter regression [23], assumes fully observed data, while earlier studies, such as Vonta et al. [24], used MI in survival modeling with AIC-based selection but did not consider modern penalized regression techniques. Furthermore, the performance of penalized methods under criteria such as AIC or BIC remains understudied.
Our study addresses a critical gap by systematically evaluating penalized variable selection strategies in the context of survival analysis with MI data. Using data from a randomized phase III trial in men with metastatic castrate-resistant prostate cancer (mCRPC) (CALGB 90401) [25], we aimed to build a prognostic model of overall survival (OS) incorporating clinical and biomarker variables. The key challenge was the presence of missing baseline clinical covariates, which were assumed to be MAR. The MAR assumption was considered reasonable because missingness in these variables was likely associated with observed characteristics (e.g., age, performance status), rather than the unobserved values themselves. This aligns with the trial’s standardized study conduct, where missingness is often due to administrative reasons rather than underlying patient health status [9]. Building on methods proposed for continuous and binary outcomes, we adapt and assess the inclusion frequency and the stacked-data approaches, extending the framework of Wood et al. to penalized Cox models under realistic missing data mechanisms. Our contribution is primarily empirical and pragmatic: we evaluate and compare existing penalized approaches for variable selection under multiple imputation through extensive simulation and application to clinical trial data. We aim to determine the optimal strategy for penalized variable selection using MI data with survival outcomes, especially when the data structure and missing data mechanism are similar to those of the CALGB 90401 data. To our knowledge, this is one of the first comprehensive evaluations of such approaches in the context of survival outcomes with MI.
The remainder of this article is organized as follows. The proposed variable selection methods are introduced in Section 2. Their performance is evaluated through extensive simulation studies in Section 3. The methods are then applied to a real dataset of men with mCRPC in Section 4. Conclusions and discussion drawn from both the simulation results and the real-data application are presented in Section 5 and Section 6.

2. Methods

2.1. Penalized Models

We explored the LASSO and ALASSO penalty functions with four rules for selecting the tuning parameter λ: Akaike’s Information Criterion (AIC) [26], the Bayesian Information Criterion (BIC) [27], the minimum mean cross-validated error (CV.min), and the value within one standard error of the minimum (CV.1se). These penalized methods were chosen for their sparsity property [13], which shrinks weaker coefficients to zero to improve model interpretability while also enhancing predictive accuracy. Moreover, penalized regression methods such as LASSO are well-established tools for reducing overfitting, as they balance model fit against parameter shrinkage [28,29]. In our implementation, the tuning parameter λ was selected independently within each imputed dataset for all criteria (AIC, BIC, CV.min, and CV.1se), following standard practice in the MI context.
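For reference, the two penalties can be written in a common form. The display below is a standard textbook formulation, not notation taken from this article: the symbols $\ell(\beta)$ (Cox log partial likelihood), $p$ (number of covariates), $\tilde{\beta}_j$ (an initial coefficient estimate), and $\gamma > 0$ are assumptions introduced for illustration, and software implementations may scale the likelihood term differently.

$$\hat{\beta}_{\text{LASSO}} = \arg\max_{\beta}\left\{ \ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j| \right\}, \qquad \hat{\beta}_{\text{ALASSO}} = \arg\max_{\beta}\left\{ \ell(\beta) - \lambda \sum_{j=1}^{p} w_j |\beta_j| \right\}, \quad w_j = \frac{1}{|\tilde{\beta}_j|^{\gamma}}$$

Under this form, a larger λ removes more covariates, and the adaptive weights penalize covariates with small initial estimates more heavily.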

2.2. Model Selection Approaches with MI Data

We investigated three main model selection approaches: variable selection on each MI dataset separately, variable selection on the stacked MI data, and group-penalized selection on the column-bound wide data. We compared these strategies, each integrated with MI, with the complete case (CC) analysis.

2.2.1. Perform LASSO/ALASSO Selection on Each MI Data Separately

  • AVG1: Select variables that are selected in any of the MI datasets
  • AVG50: Select variables with an inclusion frequency >50% across MI datasets
  • AVG70: Select variables with an inclusion frequency >70% across MI datasets
  • AVG90: Select variables with an inclusion frequency >90% across MI datasets
  • AVGALL: Select variables that are consistently selected by all MI datasets
For all the strategies above, the penalized model was fitted to each of the M datasets, where M denotes the number of imputations. The final variable selection was determined by the inclusion frequency of each variable across the M models. A ridge regression proportional hazards model was then fitted to each of the M datasets using the selected variables, and the regression coefficient estimates were averaged over all M models.
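The inclusion-frequency rules above can be sketched in a few lines. This is a minimal illustration, assuming the per-imputation selections are already available as sets of variable names; the function name `select_variables` and the example data are hypothetical and not part of the authors' code.

```python
from collections import Counter

# Inclusion percentage that a variable's frequency must strictly exceed.
THRESHOLDS = {"AVG50": 50, "AVG70": 70, "AVG90": 90}

def select_variables(selections, rule):
    """Apply an inclusion-frequency rule to per-imputation selections.

    selections: list of M collections of variable names, one per imputed dataset.
    rule: one of "AVG1", "AVG50", "AVG70", "AVG90", "AVGALL".
    """
    m = len(selections)
    # Count in how many imputed datasets each variable was selected.
    counts = Counter(v for sel in selections for v in set(sel))
    if rule == "AVG1":        # selected in at least one imputed dataset
        keep = list(counts)
    elif rule == "AVGALL":    # selected in every imputed dataset
        keep = [v for v, c in counts.items() if c == m]
    else:                     # inclusion frequency strictly above the threshold
        pct = THRESHOLDS[rule]
        keep = [v for v, c in counts.items() if 100 * c > pct * m]
    return sorted(keep)

# Hypothetical selections from M = 4 imputed datasets.
example = [{"X1", "X2", "X3"}, {"X1", "X2"}, {"X1", "X4"}, {"X1", "X2"}]
print(select_variables(example, "AVG50"))
```

The comparison uses integer arithmetic so that a frequency exactly equal to the threshold (for example 7 of 10 under AVG70) is excluded, matching the strict inequality in the definitions.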

2.2.2. Perform LASSO/ALASSO Selection on the Stacked Long Data

In the stacked approaches, all M MI datasets were combined into a single long dataset, and a penalized model was fitted with a weight assigned to each observation. Here, wi denotes the weight for each observation in the stacked data.
  • STK1: Assigned a uniform weight wi = 1/M to each observation, resulting in a total weight of 1 per individual across the M imputations
  • STK2: Assigned weight wi = fi/M, where fi was defined as the ratio of the number of complete variables for individual i to the total number of covariates, as used previously in MI-WENet by Wan et al. [18]. This approach gives lower weight to individuals with more missing data.
For both stacked methods, a ridge penalized proportional hazards model was fitted to the stacked dataset using the previously selected variables to obtain the coefficient estimates.
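The two weighting schemes can be illustrated as follows. This is a sketch under the assumption that a boolean mask of originally missing values is available per individual; `stacked_weights` and the example mask are hypothetical names introduced here.

```python
def stacked_weights(missing_mask, n_imputations, scheme="STK1"):
    """Per-individual observation weights for the stacked dataset.

    missing_mask: one row per individual; True marks a covariate that was
                  missing before imputation.
    n_imputations: number of imputed datasets M.
    Each individual appears M times in the stacked data with the same weight.
    """
    weights = []
    for row in missing_mask:
        if scheme == "STK1":
            w = 1.0 / n_imputations                        # w_i = 1/M
        elif scheme == "STK2":
            f_i = sum(not miss for miss in row) / len(row)  # fraction observed
            w = f_i / n_imputations                         # w_i = f_i/M
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        weights.append(w)
    return weights

# Hypothetical mask: individual 0 is complete, individual 1 misses 2 of 4 covariates.
mask = [[False, False, False, False], [True, False, False, True]]
print(stacked_weights(mask, n_imputations=10, scheme="STK2"))
```

Under STK1 each individual contributes a total weight of 1 across the M copies, while under STK2 the total weight equals the observed fraction fi.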

2.2.3. Perform Group LASSO/ALASSO Selection on Column-Bound Wide Data

To impose consistent selection across imputations, we implemented a group penalized approach by combining the M imputed datasets into a single wide-format dataset, where each covariate had M copies (one per imputation). A group LASSO (or group ALASSO) Cox model [30] was then fit, with the same covariate across imputations treated as a group and penalized jointly. This encourages selection consistency by either retaining or excluding the covariate in all imputations simultaneously.
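One common way to write such a grouped objective is shown below as an illustrative sketch; it is not necessarily the exact objective used in [30]. The notation is introduced here for illustration: $\beta_{jm}$ is the coefficient of covariate $j$ in imputation $m$, $\ell_m$ is the Cox log partial likelihood for the $m$-th copy, $p$ is the number of covariates, and $w_j$ is an adaptive weight (with $w_j = 1$ for the non-adaptive group LASSO). Some implementations additionally scale each group penalty by the square root of the group size.

$$\max_{\beta}\ \sum_{m=1}^{M} \ell_m(\beta_{\cdot m}) \;-\; \lambda \sum_{j=1}^{p} w_j \,\lVert \beta_{j\cdot} \rVert_2, \qquad \lVert \beta_{j\cdot} \rVert_2 = \sqrt{\sum_{m=1}^{M} \beta_{jm}^{2}}$$

Because the Euclidean norm of a group is zero only when every coefficient in that group is zero, a covariate is dropped from all M copies together or retained in all of them.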
To ensure comparability across methods, we adopted the following procedure: (i) Variable selection using the group penalized Cox model; (ii) Refitting the selected variables via ridge regression within each multiply imputed dataset, followed by averaging the refitted coefficients across imputations.
For each method described above, coefficient estimates from refitted ridge penalized regression models were averaged across imputations to provide a descriptive summary of variable selection and effect size. Because penalized models do not yield well-defined standard errors and the selected model may vary across imputations, Rubin’s rules could not be applied directly here. Accordingly, the average coefficients should be interpreted descriptively rather than inferentially. In simulation analysis, a sensitivity analysis was added to evaluate the empirical coverage of confidence intervals.

2.3. Evaluation of Different Approaches

The performance of the different approaches in the simulation studies was evaluated in terms of variable selection, parameter estimates, and predictive performance. The time-dependent area under the curve (tAUC) was also used to assess the predictive accuracy of the models [31]. For each simulation dataset, an independent testing set of 2000 individuals was generated under the same settings. The tAUC was then computed using the average parameter estimates from the refitted penalized proportional hazards model across M imputations [32].
In addition, we computed the integrated Brier score (IBS), calibration slope, and calibration intercepts at the 25th, 50th, and 75th percentiles of the observed event times, to evaluate overall prediction error and model calibration.
To evaluate the performance of the approaches when applied to the CALGB 90401 data in Section 4, where external validation data were not available, we calculated the optimism-corrected tAUC using bootstrap resampling. Specifically, for each of the M imputed datasets, 200 bootstrap samples were drawn. For each bootstrap sample, we computed the tAUC on the bootstrap sample (resampled data). The optimism was defined as the difference between the tAUC on the resampled data and the tAUC on the original imputed data, and the corrected tAUC was obtained by subtracting the optimism from the original tAUC on each MI dataset. These corrected tAUC values were summarized across imputations to assess model performance.
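The arithmetic of the optimism correction can be sketched as below. This assumes one common reading of the procedure, in which the model fitted on each bootstrap sample is evaluated both on that sample and on the original imputed data, and the optimism is the mean difference; the function name and the numeric inputs are hypothetical.

```python
from statistics import mean

def optimism_corrected(apparent, on_bootstrap, on_original):
    """Optimism-corrected performance for one imputed dataset.

    apparent:     tAUC of the model on the original imputed data.
    on_bootstrap: tAUC of each bootstrap model on its own bootstrap sample.
    on_original:  tAUC of each bootstrap model on the original imputed data
                  (an assumed reading of the procedure described above).
    """
    optimism = mean(b - o for b, o in zip(on_bootstrap, on_original))
    return apparent - optimism

# Hypothetical values for a single imputed dataset with two bootstrap samples.
corrected = optimism_corrected(0.75, [0.80, 0.78], [0.74, 0.76])
print(round(corrected, 2))
```

Repeating this for each of the M imputed datasets and summarizing the corrected values (for example by their mean) gives the across-imputation summary.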

2.3.1. Variable Selection

Our goal was to achieve higher proportions of correct selections and positive discoveries, along with lower proportions of false positives and false negatives. Ideally, better performing models would identify more true covariates while excluding irrelevant (noise) covariates. We assessed each method’s ability to detect true covariates in the presence of noise, with performance metrics averaged across S simulation datasets to evaluate variable selection.
Correct selection (CS) was defined as the proportion of selecting exactly all the true covariates out of all simulation datasets.
$$CS = \frac{1}{S}\sum_{s=1}^{S} \begin{cases} 1 & \text{if exactly 9 true covariates are selected} \\ 0 & \text{otherwise} \end{cases}$$
Positive discovery (PD) was defined as the average proportion of selecting the true covariates out of all selected covariates across all simulation datasets.
$$PD = \frac{1}{S}\sum_{s=1}^{S} \frac{\#\,\text{selected true covariates}}{\#\,\text{all selected covariates}}$$
False positive (FP) was calculated as the average proportion of noise covariates selected by the model out of all noise covariates.
$$FP = \frac{1}{S}\sum_{s=1}^{S} \frac{\#\,\text{selected noise covariates}}{\#\,\text{all noise covariates}}$$
False negative (FN) was calculated as the average proportion of the true covariates not selected by the model out of all true covariates.
$$FN = \frac{1}{S}\sum_{s=1}^{S} \frac{\#\,\text{true covariates not selected}}{\#\,\text{all true covariates}}$$
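The four selection metrics can be computed as in the sketch below. Two choices here are assumptions rather than statements from the text: a replicate counts as a correct selection only when the selected set equals the true set exactly, and PD is set to 0 for a replicate in which nothing is selected. The function name and example data are hypothetical.

```python
def selection_metrics(selected_sets, true_vars, noise_vars):
    """Average CS, PD, FP and FN over S simulation replicates."""
    true_vars, noise_vars = set(true_vars), set(noise_vars)
    s = len(selected_sets)
    cs = pd = fp = fn = 0.0
    for sel in selected_sets:
        sel = set(sel)
        # Assumption: "correct" means the selected set is exactly the true set.
        cs += 1.0 if sel == true_vars else 0.0
        # Assumption: PD contributes 0 when no covariate is selected.
        pd += len(sel & true_vars) / len(sel) if sel else 0.0
        fp += len(sel & noise_vars) / len(noise_vars)
        fn += len(true_vars - sel) / len(true_vars)
    return {"CS": cs / s, "PD": pd / s, "FP": fp / s, "FN": fn / s}

# Hypothetical toy example with 2 true and 2 noise covariates, S = 2.
metrics = selection_metrics(
    [{"X1", "X2"}, {"X1", "X3"}],
    true_vars={"X1", "X2"},
    noise_vars={"X3", "X4"},
)
print(metrics)
```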

2.3.2. Parameter Estimates

The parameter estimates of the models were assessed using the following criteria, and the final values were averaged across the nine true covariates and S simulation datasets, with β̂ defined as the estimated coefficient and β as the true coefficient:
Average bias:
$$BIAS = \frac{1}{S}\sum_{s=1}^{S} (\beta - \hat{\beta})$$
Mean Squared Error (MSE):
$$MSE = \frac{1}{S}\sum_{s=1}^{S} (\beta - \hat{\beta})^{2}$$
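A minimal sketch of these two summaries follows, averaging over both the true covariates and the replicates as described in the text. The sign convention (true minus estimated) follows the bias formula as written; the function name and the numbers are hypothetical.

```python
def bias_and_mse(true_beta, estimates):
    """Average bias and MSE over true covariates and simulation replicates.

    true_beta: list of true coefficients.
    estimates: list over S replicates, each a list of estimated coefficients
               in the same order as true_beta.
    """
    # Differences use the convention beta - beta_hat.
    diffs = [b - b_hat for est in estimates for b, b_hat in zip(true_beta, est)]
    bias = sum(diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return bias, mse

# Hypothetical example with two coefficients and S = 2 replicates.
bias, mse = bias_and_mse([0.5, 1.0], [[0.25, 1.0], [0.5, 0.5]])
print(bias, mse)
```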

3. Simulation

3.1. Simulation Design

Simulation studies were conducted to compare the performance of each variable selection method, motivated by the CALGB 90401 randomized phase III trial. Each simulated dataset included 500 individuals with 20 covariates denoted as X (X1–X20), with X1–X9 designated as true covariates and X10–X20 as noise. The number of covariates was chosen to reflect the number of covariates considered for building the prognostic model of OS in prostate cancer. Covariates were generated from a multivariate normal distribution with mean 0, unit variance, and pairwise correlation of 0.1 to mimic realistic correlation structures. In addition, X2, X3 and X5 were dichotomized at 0 to create binary variables (values greater than or equal to zero coded as 1, and values less than 0 as 0).
The survival time was simulated from the proportional hazards model using all the true covariates (X1–X9) with the following hazard function:
$$h(t \mid X) = h_0(t)\exp(X^{T}\beta),$$
where β = (β1, β2, β3, β4, β5, β6, β7, β8, β9).
The baseline hazard function h0(t) was assumed to follow a Weibull distribution (κ = 2, λ = 0.001) [33,34], motivated by its widespread use in simulating Cox proportional hazards models and its consistency with the survival patterns observed in our motivating CALGB 90401 dataset. Censoring times were generated independently from a uniform distribution U(0, θ), where the censoring parameter θ was selected to achieve the target censoring proportion of 0.1 or 0.3. These levels reflect what was observed in the CALGB 90401 study (censoring proportion = 0.1) and are consistent with other oncology studies [35,36,37]. Following our prior work [38], we considered two sets of regression coefficients to represent weak and strong signals.
The coefficients for the weak signal were β1 = 0.375, β2 = 0.5, β3 = 0.75, β4 = 0.375, β5 = 0.75, β6 = 0.5, β7 = 0.25, β8 = 0.75, β9 = 0.5, whereas the coefficients for the strong signal were β1 = 2.38, β2 = 2.02, β3 = 2.19, β4 = 2.26, β5 = 2.00, β6 = 2.25, β7 = 2.11, β8 = 2.33, β9 = 2.27.
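The data-generating steps described above can be sketched with the standard library alone. Several choices here are assumptions, not details confirmed by the text: the Weibull baseline is parameterized so that the cumulative baseline hazard is λt^κ, event times are drawn by inverting the survival function, equicorrelated normals are produced with a shared random factor, and the default `theta` is a placeholder rather than a value tuned to a target censoring proportion.

```python
import math
import random

# Weak-signal coefficients for X1..X9, as listed in the text.
WEAK_BETA = [0.375, 0.5, 0.75, 0.375, 0.75, 0.5, 0.25, 0.75, 0.5]

def simulate_dataset(n, beta, n_covariates=20, rho=0.1, kappa=2.0,
                     lam=0.001, theta=100.0, binary_idx=(1, 2, 4), seed=0):
    """Return a list of (observed_time, event_indicator, covariates)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Shared factor gives unit variance and pairwise correlation rho.
        shared = rng.gauss(0, 1)
        x = [math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0, 1)
             for _ in range(n_covariates)]
        # Dichotomize X2, X3, X5 (0-based indices 1, 2, 4) at zero.
        for j in binary_idx:
            x[j] = 1.0 if x[j] >= 0 else 0.0
        # Linear predictor uses only the first len(beta) (true) covariates.
        lin = sum(b * xj for b, xj in zip(beta, x))
        # Assumed parameterization: H0(t) = lam * t**kappa, so
        # T = (-log(U) / (lam * exp(lin)))**(1/kappa) with U uniform on (0, 1].
        u = 1.0 - rng.random()
        t_event = (-math.log(u) / (lam * math.exp(lin))) ** (1.0 / kappa)
        # Placeholder theta; in practice it is tuned to the target censoring.
        t_cens = rng.uniform(0, theta)
        rows.append((min(t_event, t_cens), int(t_event <= t_cens), x))
    return rows

data = simulate_dataset(500, WEAK_BETA, seed=1)
print(sum(1 - d for _, d, _ in data) / len(data))  # observed censoring proportion
```

In a full simulation, `theta` would be adjusted (for example by a grid or bisection search) until the printed censoring proportion is close to 0.1 or 0.3.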
MAR was implemented, consistent with the strong plausibility that missingness in our study arose from observed clinical factors rather than from unmeasured baseline variables [10,39]. In randomized clinical trials, missing baseline covariates are often related to characteristics such as age, study center, or performance status, and not the unobserved values themselves. To simulate this mechanism, a logistic regression model was used for the probability of missingness for covariate Xj, where j = 2,4,12,14 based on complete covariates Xc = (X1, X3, X5, X7, X8, X9). The probability of missingness Rij for individual i and variable j for j = 2,4,12,14 was defined as:
$$\operatorname{logit}\{\Pr(R_{ij} \mid X_c)\} = \alpha_0 + 0.25\,X_{ic}^{T}.$$
The value of α0 was then selected to achieve 10% and 20% missingness, respectively. Importantly, the missing data mechanism was implemented independently of the censoring mechanism in all simulations. That is, the probability of a covariate being missing was not influenced by whether or when a participant was censored. We explored various combinations of censoring proportion (C) (at 0.1 and 0.3), proportion of missing values per covariate (pm) (at 10% and 20%), and the number of multiple imputations (M) (at 10 and 30). The maximum missingness observed in the baseline covariates in the CALGB 90401 dataset was 19.1% for opioid analgesic use, with BMI at 12.4% and most other baseline variables ≤0.5%. Thus, simulating 10–20% missingness allowed us to reflect the level of incompleteness in our dataset while also covering an upper yet realistic level commonly encountered in clinical trials [40,41]. Due to the extensive computing time, 100 simulation datasets (S) of 500 individuals were created under each of the eight scenarios, with a run time of approximately 8–12 h for each task. MI was performed using the {mice} package in R [42], which assumes that data are missing at random (MAR). This assumption is consistent with the design of the CALGB 90401 trial. The MI process used the random-forest method (meth = "rf", ntree = 10), generating M = 10 or M = 30 completed datasets according to the scenario. Before imputation, we estimated the cumulative hazard at each survival time using the Nelson–Aalen estimator and included this variable as a predictor in the imputation model, following the recommendation of White and Royston to carry survival information into the imputation step [43].
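The calibration of α0 to a target missingness proportion can be sketched as a one-dimensional root search. The sketch assumes that the linear predictor is α0 plus 0.25 times the sum of the six complete covariates, which is one reading of the model above, and it uses bisection as a convenient solver; the function names and the generated covariates are hypothetical.

```python
import math
import random

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def calibrate_alpha0(linear_terms, target, lo=-20.0, hi=20.0, tol=1e-10):
    """Find alpha0 so that the mean missingness probability equals target.

    The mean probability is increasing in alpha0, so bisection applies.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        avg = sum(expit(mid + t) for t in linear_terms) / len(linear_terms)
        if avg < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def missing_indicators(complete_covs, target, slope=0.25, seed=0):
    """Draw MAR missingness indicators for one covariate.

    Assumption: logit(p_i) = alpha0 + slope * sum of the complete covariates.
    """
    rng = random.Random(seed)
    terms = [slope * sum(row) for row in complete_covs]
    alpha0 = calibrate_alpha0(terms, target)
    return alpha0, [rng.random() < expit(alpha0 + t) for t in terms]

# Hypothetical complete covariates (X1, X3, X5, X7, X8, X9) for 500 individuals.
_rng = random.Random(1)
covs = [[_rng.gauss(0, 1) for _ in range(6)] for _ in range(500)]
alpha0, miss = missing_indicators(covs, target=0.10, seed=2)
print(alpha0, sum(miss) / len(miss))
```

The realized proportion of missing values varies around the target because the indicators are random draws; only the average probability is matched exactly.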

3.2. Sensitivity Analysis

Because penalized selection followed by refitting within each imputed dataset does not directly provide standard errors, we conducted a focused sensitivity analysis to examine CI coverage for the three configurations that are most representative of our proposed workflows: AVG50, STK2, and GRP with ALASSO and BIC tuning. For the CI sensitivity analysis, after the adaptive group LASSO selected a set of predictors, we recomputed robust standard errors using an event time-based sandwich estimator for the Cox partial likelihood. The estimator aggregates score and risk-set information across all failure times, adjusts the information matrix on the active (nonzero) coefficients to account for the adaptive penalty, inverts the adjusted matrix, and maps it back to the full parameter vector. Wald-type 95% CIs were then formed from the diagonal of this sandwich covariance. We evaluated empirical CI coverage in the weak signal, high-missingness, high-censoring scenario (MI = 10, pm = 30%, C = 0.30) using:
Mean CI Coverage (CI.Cov):
$$\text{Mean Coverage} = \frac{1}{S}\sum_{s=1}^{S} \begin{cases} 1 & \text{if the true coefficient is covered by the estimated 95\% CI} \\ 0 & \text{otherwise} \end{cases},$$
and the reported coverage was averaged over the S simulation replicates and over the M multiply imputed datasets. For the sensitivity analysis, S = 100 replicated simulation datasets were used.
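The coverage calculation itself is simple once estimates and standard errors are available. The sketch below takes those as given and does not reproduce the sandwich estimator; the function names and the numbers are hypothetical.

```python
Z_975 = 1.959963984540054  # 97.5th percentile of the standard normal

def wald_ci(estimate, se, z=Z_975):
    """Wald-type 95% confidence interval."""
    return estimate - z * se, estimate + z * se

def mean_coverage(true_value, estimates, ses):
    """Proportion of intervals that contain the true coefficient."""
    covered = []
    for est, se in zip(estimates, ses):
        lower, upper = wald_ci(est, se)
        covered.append(lower <= true_value <= upper)
    return sum(covered) / len(covered)

# Hypothetical estimates of a coefficient whose true value is 0.5.
cov = mean_coverage(0.5, [0.5, 0.9, 0.4], [0.1, 0.1, 0.1])
print(cov)
```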

3.3. Simulation Results

3.3.1. Weak Signal

Figure 1 presents the results of the variable selection under the weak signal setting, with 10 imputed datasets (M), 10% missing (pm), and 0.10 censoring (C). Under this combination of M, pm, and C, ALASSO BIC with moderate inclusion frequency (AVG50, AVG70) gave the most balanced variable selection performance, with consistently balanced sensitivity (capturing the true covariates) and specificity (limiting noise), whereas the 1SE tuned versions were noticeably more conservative and tended to miss borderline true predictors (Panel: False Negative). Similarly, the GRP approach tended to give the most stable predictors across imputations. It did not pick up much noise, but at the price of missing true covariates. The stacked methods showed stability in retaining true covariates, but at the cost of slightly higher false positive rates compared with the inclusion frequency approaches. It is worth noting that all approaches maintained a low proportion of false negatives.
Table 1 shows the summary statistics of the variable selection results from the simulation with 10 imputed datasets (M) and 0.1 censoring (C). When the percentage of missing data increased from 10% to 20%, ALASSO.CV.1se with 50% inclusion frequency remained the top-performing method across all metrics. Additional variable selection results for various combinations of the number of imputed datasets, censoring, and percentage missing are presented in Supplementary Figures S1–S4. Among all factors, censoring had the largest impact on variable selection when moving from C = 0.10 to C = 0.30. In contrast, increasing the number of imputations from 10 to 30 had minimal effect on variable selection when the percentage missing was moderate. When C = 0.3, the false negative proportion increased for most methods, reflecting the loss of information from fewer observed events, which reduced the effective information available for distinguishing between true predictors and noise. This pattern was most evident for stricter inclusion rules such as AVG90 and AVGALL, which require near consensus across imputations to retain a predictor. Stacked methods (STK1/STK2) continued to show selection stability, in that they tended to keep more true variables, but this came with modestly higher false positive rates than inclusion frequency-based methods.
Compared to other approaches, AVG1 had a high proportion of false positives, as it retained any variable selected in at least one imputed dataset, leading to the inclusion of many noise covariates. In contrast, AVGALL, with its strict selection criterion, selected the fewest noise covariates. Both stacked approaches (STK1 and STK2) produced similar variable selection results. ALASSO with BIC in the stacked methods achieved higher proportions of correct selections and positive discoveries, while maintaining relatively lower false positive and false negative proportions. Overall, ALASSO with BIC using a moderate inclusion frequency gave the best trade-off among the variable selection strategies, combining low proportions of false positives and false negatives with high proportions of correct selection and positive discovery.
Figure 2 shows the summary statistics for the parameter estimates in the simulation. Among all approaches, CC had the highest bias and MSE. Table 2 presents the summary statistics of the parameter estimates for simulations with 10 imputed datasets and 0.1 censoring. Additional results for other combinations of imputation number, censoring rate, and missing percentage are presented in Supplementary Figures S5 and S6. No single method consistently outperformed the others across all scenarios. For most approaches, both bias and MSE increased as the proportion of missing data increased. When the proportion of missing data remained the same, the impact of C = 0.3 was as adverse as in variable selection: bias and MSE increased relative to C = 0.10 across most methods. Nevertheless, BIC and min lambda-based approaches continued to yield lower bias and MSE, with AIC close behind, preserving their advantage even as censoring rose, though the gap narrowed under heavier censoring and weaker signals. Increasing the number of imputations from 10 to 30 had a negligible impact on the parameter estimates. Overall, ALASSO and LASSO approaches with 1SE maintained relatively higher bias and MSE than the other penalized methods. These trends persisted with STK1 and STK2, although the differences between penalized methods were less pronounced under the stacked approaches. The GRP approach was generally well behaved but not dominant. It kept bias in a reasonable range, yet its MSE was often slightly higher than that of its BIC-tuned AVG counterparts.
Across all scenarios, differences in tAUC were small between methods, based on Figure 3, Table 3, and Supplementary Figure S7. The main drivers were data scarcity factors: moving from 0.1 to 0.3 censoring and from 10% to 20% missingness produced small, consistent declines in tAUC, while increasing the number of imputations from 10 to 30 had negligible impact. Notably, parsimonious choices (e.g., ALASSO with 1SE and moderate inclusion thresholds) achieved similar tAUC to more complex models such as min lambda, indicating that aggressive variable retention did not translate into significantly better tAUC here. In summary, under weak signals, modest losses in information (higher censoring or missingness) reduce tAUC slightly. Beyond discrimination, we also summarized the integrated Brier score (IBS), calibration slope, and calibration intercepts at the 25th, 50th, and 75th percentiles of follow-up for all methods and tuning rules. These results are displayed in Figure 3 and Figures S8–S12, which show that IBS values were tightly clustered, especially across the inclusion threshold (AVG50–AVG90) strategies, indicating that none of the approaches introduced substantial additional prediction error. Calibration slopes were generally close to 1.0 for the averaging-based methods. The LASSO and ALASSO models tuned with the 1SE rule showed noticeably larger calibration slopes, since their stronger penalty over-shrank the coefficients, producing risk scores with low variability that then required re-inflation at the calibration stage. Taken together, the MI strategies that reduced cross-imputation selection variability (e.g., AVG50) also preserved overall accuracy and calibration, not just tAUC.

3.3.2. Strong Signal

Figure 4 displays the summary of the variable selection results for the strong signal setting (10 imputed datasets, 10% missing, and 0.1 censoring). ALASSO with 1SE and moderate inclusion frequency now achieved the best balance of correct selection, positive discovery, and low false positive proportion. Compared to the weak signal setting, all the methods tended to have lower proportions of false negatives and false positives, since the true covariates are easier to detect. ALASSO with minimal lambda and ALASSO with BIC had improved variable selection under the strong signal, with increases in correct selection, positive discovery, and false positive rates. Table 4 presents detailed statistics, and Supplementary Figures S11–S15 provide additional combinations of settings. Under the strong signal, both false negatives and false positives decreased because larger effects are easier to detect and spurious inclusions are less likely. In the C = 0.30 setting, variable selection performance decreased less than in the weak signal settings, and ALASSO with 1SE and moderate inclusion thresholds again balanced sensitivity and specificity well. ALASSO models with BIC and min lambda exhibited improved correct selection and positive discovery rates under strong signals. Stacked approaches remained competitive but retained relatively higher false positives than the inclusion frequency-based approaches. Similar to the results observed earlier in the weak signal setting, the false positive rate decreased from AVG1 to AVGALL, and ALASSO models with BIC, min, and 1SE lambdas continued to show strong performance. Although ALASSO 1SE with AVGALL and GRP with LASSO 1SE showed slightly higher false negatives under certain scenarios, the magnitude was minimal.
Figure 5 presents the parameter estimate results (10 imputed datasets, 10% missing, and 0.1 censoring). Overall, bias and MSE were higher than in the weak signal settings, an expected finding given the larger true values of the coefficients. Table 5 shows the summary statistics of the parameter estimate results, and Supplementary Figures S17 and S18 provide results for additional combinations of settings. AIC, BIC, and min lambda approaches continued to deliver the lowest bias and MSE across censoring levels, including C = 0.30. The 1SE approaches tended to be more conservative and showed slightly higher bias and MSE. Stacked approaches produced moderate estimation errors with little separation between stacking weights. For both signal strengths, stacked methods tended to emphasize stability, capturing more true variables at the cost of slightly higher false positives.
With stronger effects, overall tAUC levels were slightly higher and remained remarkably stable across penalization and selection strategies, as shown in Table 6 and Supplementary Figure S19. Even under heavier censoring (0.3) or higher missingness (20%), the reductions in tAUC were small. More complex models such as min lambda did not yield detectable gains in tAUC over more conservative 1SE or BIC selections. Stacked versus inclusion-frequency strategies showed comparable discriminative performance. In summary, in strong signal settings, tAUC is largely insensitive to the modeling variant. A similar pattern was seen for the integrated Brier Score in Figure 6. Point estimates were tightly clustered across methods and tuning choices and increases in missingness or censoring led to only modest degradations in overall prediction error. Calibration measures were somewhat more method-sensitive: LASSO/ALASSO models tuned by the 1SE rule tended to have larger calibration slopes and shifted intercepts, consistent with their slightly over-shrunk risk scores. Under strong signals, the main methods deliver comparable Brier Scores, and differences are driven primarily by how aggressively the penalty shrinks the linear predictor.
In the sensitivity analysis, the mean CI coverage is reported in Supplementary Table S1. The CIs showed the expected under-coverage, especially for the weak signal setting with high missingness. Mean CI coverage was highest for the AVG50 approach, lower for GRP, and poorest for STK2, which is consistent with STK2's more aggressive down-weighting across imputations.
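Empirical coverage is the share of simulation replicates whose interval contains the true coefficient. A minimal sketch follows; the interval endpoints and the true value 0.5 are invented for illustration, and the function name `empirical_coverage` is hypothetical.

```python
def empirical_coverage(intervals, true_value):
    """Share of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


# Four hypothetical replicates; the third interval misses the truth.
intervals = [(0.31, 0.72), (0.40, 0.81), (0.52, 0.95), (0.22, 0.61)]
coverage = empirical_coverage(intervals, true_value=0.5)
```

A value well below the nominal level (for example, below 0.95 for 95% intervals) corresponds to the under-coverage described above.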

4. Application

We analyzed data from CALGB 90401, a phase III trial of 1050 men with mCRPC comparing docetaxel plus prednisone with either bevacizumab (DP+B) or placebo (DP) [25]. The primary outcome of the study was OS, defined as the time from the date of random assignment to the date of death or last follow-up. We had previously developed and validated a prognostic model for predicting overall survival in mCRPC patients using this dataset, addressing missing data via regression imputation [44].
For the current analysis, we focused on the 853 patients who consented to plasma and serum collection, incorporating eight clinical predictors of OS from our previous model: site of metastatic disease (bone metastases only (DS2), any visceral metastases (DS3)), opioid analgesic use (PAIN), Eastern Cooperative Oncology Group performance status (ECOG), LDH > 1 upper limit of normal (LDH.High), albumin (ALB), hemoglobin (HGB), alkaline phosphatase (ALKPHOS), and prostate-specific antigen (PSA) [44]. We considered expanding the model to include 24 plasma angiokines (Ang-2, BMP-9, CD-73, Chromogranin A, HER-3, HGF, ICAM-1, IL-6, OPN, PDGF-AA, PDGF-BB, PIGF, SDF-1, TGF-b1, TGF-b2, TGFb-R3, TIMP-1, TSP-2, VCAM-1, VEGF, VEGF-D, VEGF-R1, VEGF-R2, and VEGF-R3) and 3 serum androgens (testosterone, androstenedione, and dehydroepiandrosterone), aiming to increase the prognostic accuracy of the OS model. Proportional hazards assumptions for the core clinical predictors (DS2, DS3, ECOG, LDH.High, ALB, HGB, ALKPHOS) were evaluated using Schoenfeld residuals.
Among the 853 consenting patients, opioid analgesic use had the highest missingness (19.1% missing), and 538 patients (62.6%) had complete information for all eight clinical variables (Table 7). To provide baseline survival characteristics, we include the Kaplan–Meier curve for overall survival in the CALGB 90401 data (Supplementary Figure S25) and a complete-case Cox regression summary (Supplementary Table S2). We created 10 MI datasets and assessed model performance by examining the covariates selected and the average coefficients from the refitted penalized models. The summary variable selection results and corresponding coefficient estimates across the various methods are presented in Table 8. In general, CC selected the fewest variables, while STK2 tended to select the most. GRP with ALASSO cv and 1SE did not select any variables. Among the penalized approaches, LASSO/ALASSO with minimal lambda selected more variables than 1SE and BIC, although some of the selected variables had small coefficient estimates with hazard ratios (HRs) close to one. The three serum androgens were either not selected or had HRs close to one when included, suggesting limited prognostic value. Among the 24 plasma angiokines, ICAM-1, TIMP-1, and VEGF-R3 were selected by a few approaches (such as LASSO/ALASSO with minimal lambda and 70% inclusion frequency) and showed relatively larger HRs. This suggests that these angiokines may have prognostic value for overall survival in the mCRPC population studied in CALGB 90401. All the methods produced comparable tAUC values after correcting for optimism. Model performance improved slightly as more covariates were selected (from AVG90 to AVG50), although these differences were modest in magnitude (Table 9). Across the MI-averaging and stacking strategies, IBS values (Table 10) were very similar. Calibration slopes (Table 11) for most approaches were close to one, indicating no gross miscalibration. These findings are consistent with the simulation results.
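The inclusion-frequency rule behind labels such as AVG50, AVG70, and AVG90 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, the per-imputation refits are taken as given inputs, and the selection pattern across the 10 imputations is invented.

```python
from collections import Counter


def select_by_inclusion_frequency(selections, threshold):
    """Keep variables selected in at least `threshold` (a share in (0, 1])
    of the imputed datasets. `selections` holds one set per imputation."""
    n_imputations = len(selections)
    counts = Counter(var for sel in selections for var in set(sel))
    return {var for var, c in counts.items() if c / n_imputations >= threshold}


def average_coefficients(refits, variables):
    """Average refitted coefficients (one dict per imputation) over imputations."""
    n_imputations = len(refits)
    return {v: sum(fit[v] for fit in refits) / n_imputations
            for v in sorted(variables)}


# Invented pattern: x1 selected in 10/10, x2 in 7/10, x3 in 5/10, x4 in 2/10.
selections = [
    {"x1"} | ({"x2"} if m < 7 else set())
           | ({"x3"} if m < 5 else set())
           | ({"x4"} if m < 2 else set())
    for m in range(10)
]
avg50 = select_by_inclusion_frequency(selections, 0.5)
avg70 = select_by_inclusion_frequency(selections, 0.7)
avg90 = select_by_inclusion_frequency(selections, 0.9)

# Hypothetical refits restricted to the AVG70 set, one per imputation.
refits = [{"x1": 0.50 + 0.01 * m, "x2": 0.20} for m in range(10)]
pooled = average_coefficients(refits, avg70)
```

Raising the threshold can only shrink the selected set, which is why the number of retained covariates falls from AVG50 to AVG90.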

5. Discussion

In this article, we evaluated and compared the performance of various variable selection methods in combination with MI, a widely accepted approach for handling missing data in clinical studies. MI creates several plausible complete datasets, applies standard statistical methods to each imputed dataset, and then combines the resulting estimates. This approach incorporates the variability arising from both the imputation and the estimation process, thereby reflecting the uncertainty associated with missing data. MI makes full use of the available data and reduces the risk of bias inherent in complete-case analyses. However, combining MI with variable selection introduces a substantial layer of complexity into the development of prognostic models. Variable selection is inherently unstable in the presence of missing data, and MI, by design, produces multiple versions of the dataset. Combining the outputs of multiple selection processes, each potentially yielding a different set of variables, requires careful methodological consideration. This complexity is not unique to our study but is an unavoidable issue when building prognostic models in real-world datasets, particularly in oncology, where missingness is often non-trivial.
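For reference, the standard combining step for a scalar parameter (Rubin's rules [7]) can be written as follows. The notation is ours and is an assumption of this sketch: $\hat{Q}_m$ is the estimate and $U_m$ its estimated variance in imputed dataset $m$, with $M$ imputations in total.

```latex
\bar{Q} = \frac{1}{M}\sum_{m=1}^{M}\hat{Q}_m, \qquad
\bar{U} = \frac{1}{M}\sum_{m=1}^{M} U_m, \qquad
B = \frac{1}{M-1}\sum_{m=1}^{M}\bigl(\hat{Q}_m - \bar{Q}\bigr)^2, \qquad
T = \bar{U} + \Bigl(1 + \frac{1}{M}\Bigr) B
```

Here $\bar{U}$ is the within-imputation variance, $B$ the between-imputation variance, and $T$ the total variance of $\bar{Q}$. The rules presuppose that the same parameter is estimated, with a valid variance estimate, in every imputed dataset, which is exactly what differing selected models across imputations call into question.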
Our simulations demonstrated that no single method consistently performed best under all circumstances, whether in terms of variable selection or parameter estimation, under both weak and strong signal settings. In strong signal settings, all methods tended to have lower proportions of false negatives and false positives, a pattern consistent with the expectation that a stronger signal makes true covariates easier to identify while reducing spurious selections. This improvement reflects greater stability in selection. Some methods, particularly the penalized regression approaches (LASSO/ALASSO) with minimal lambda, often selected more variables than their 1SE or BIC-tuned counterparts. This gain in sensitivity often came at the cost of including additional variables with weaker effects, some of which had hazard ratios close to one, indicating limited clinical relevance. The inclusion of predictors with small effect sizes raises the question of whether such expanded models meaningfully improve predictive performance. In fact, the optimism-corrected tAUC values across methods were remarkably similar, suggesting that model parsimony may not substantially compromise predictive accuracy in this context. The consistently higher bias and MSE observed for the 1SE approaches (for both LASSO and ALASSO) are expected. The 1SE rule selects a larger penalty than min lambda, yielding sparser models and greater shrinkage of non-zero coefficients toward zero, i.e., increased shrinkage bias. Under MI, this stronger penalty also raises the chance that weak but truly associated covariates are excluded in some imputations. After aggregation (e.g., by inclusion frequency threshold), some of these true covariates are dropped, introducing omitted-variable bias. This conservatism is also reflected in slightly higher false negatives than AIC, BIC, or min lambda in variable selection, particularly when C = 0.30. Therefore, even though 1SE yields well-balanced variable selection results with fewer false positives, the combination of heavier shrinkage and occasional exclusion of weak signals increased bias and MSE relative to AIC, BIC, and min lambda.
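The difference between the min lambda and 1SE choices can be made explicit with a small Python sketch. It follows the usual convention that the 1SE penalty is the largest lambda whose mean cross-validated error lies within one standard error of the minimum; the function name `pick_lambdas` and the cross-validation curve are hypothetical.

```python
def pick_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation curve."""
    # Index of the penalty with the smallest mean cross-validated error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # Errors at or below this cutoff are within one SE of the minimum.
    cutoff = cv_mean[i_min] + cv_se[i_min]
    # 1SE rule: the largest (most penalizing) lambda that meets the cutoff.
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= cutoff)
    return lambdas[i_min], lambda_1se


# Invented curve: error is minimized at 0.10, but 0.25 is within one SE.
lambdas = [1.0, 0.5, 0.25, 0.1, 0.05]
cv_mean = [12.0, 10.6, 10.1, 10.0, 10.2]
cv_se = [0.5, 0.5, 0.5, 0.5, 0.5]
lambda_min, lambda_1se = pick_lambdas(lambdas, cv_mean, cv_se)
```

By construction the 1SE penalty is never smaller than the min lambda penalty, which is the source of the sparser models and heavier shrinkage discussed above.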
There are a few limitations that warrant consideration. First, our simulations assumed an MAR mechanism. While we think this is a reasonable assumption for CALGB 90401, alternative mechanisms, such as MNAR, could affect the performance of these approaches. Second, our simulations were limited to specific conditions for the proportion of missingness, the level of censoring, and the number of imputations. These design choices were guided by the characteristics of the CALGB 90401 trial, which may limit the generalizability of our findings to substantially different settings. We used a Weibull distribution to simulate event times because it is commonly employed in survival simulation studies and closely approximated the empirical survival distribution observed in CALGB 90401 and other cancer studies [45]. Third, a key limitation of our approach is the lack of formal variance estimation when combining post-selection penalized models across imputations. Since penalized regression does not yield valid within-imputation standard errors, and the selected variables may differ across imputations, Rubin’s rules could not be applied. As such, the averaged coefficients presented should be interpreted as descriptive rather than inferential. While we attempted to address this limitation through empirical coverage evaluation in simulation, further methodological development is needed to provide valid inference after penalized model selection under multiple imputation. An additional limitation of our study lies in the independent selection of tuning parameters across imputations. While we used a conventional approach in which λ is tuned separately within each imputed dataset, this can exacerbate variability in the selected models across imputations and contribute to reduced selection stability. Coordinated tuning strategies, such as selecting a single λ across imputations based on pooled information or stacking techniques, may improve consistency and warrant further investigation.
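As an illustration of how Weibull event times consistent with a Cox model are typically generated, the sketch below uses the inverse-transform method described by Bender et al. [33]: with baseline hazard $h_0(t) = \lambda \nu t^{\nu - 1}$, an event time is $T = \bigl(-\log U / (\lambda \exp(x^\top \beta))\bigr)^{1/\nu}$ for $U$ uniform on (0, 1). The scale, shape, and coefficient values are hypothetical and are not the ones used in our simulations; censoring is omitted.

```python
import math
import random


def simulate_weibull_time(x, beta, scale, shape, rng):
    """Draw one event time from a Cox model with a Weibull baseline hazard."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return (-math.log(u) / (scale * math.exp(linear_predictor))) ** (1.0 / shape)


# Hypothetical parameters; the same seed gives both subjects the same uniform
# draw, so only the covariate value differs between them.
t_low = simulate_weibull_time([0.0], [1.0], scale=0.1, shape=1.5,
                              rng=random.Random(1))
t_high = simulate_weibull_time([1.0], [1.0], scale=0.1, shape=1.5,
                               rng=random.Random(1))
```

With a positive coefficient, the subject with the larger covariate value has the higher hazard and therefore the shorter simulated event time.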
Another limitation concerns the STK2 approach, which uses a simple completeness-based weighting scheme. While this method, adopted from MI-WENet by Wan et al. [18], is computationally convenient, it does not account for the relationship between missingness and observed covariates, which could bias results under more complex missing data mechanisms. Alternative model-based weighting strategies, such as inverse probability weighting based on a fitted missingness model, could be explored in the future to address the missing data mechanism more appropriately under MAR or MNAR. Due to computational constraints, optimism correction was applied only to the tAUC via bootstrap resampling. The integrated Brier score (IBS) and calibration metrics (slope and intercepts) were computed on the original imputed data without optimism correction. While this limits direct comparability across metrics, these additional measures were included to provide complementary insight into overall prediction error and calibration. We acknowledge this as a limitation and note that future work could extend optimism correction to all metrics for consistency. Beyond variable selection and estimation accuracy, we evaluated predictive performance using time-dependent AUC, integrated Brier score, and calibration indices to capture discrimination, overall prediction error, and calibration. While additional decision-oriented metrics such as risk stratification performance or decision curve analysis could offer further insight into clinical utility, their added modeling and computational demands in the multiply imputed setting were beyond our current scope. We therefore view these extensions as valuable future work aimed at assessing the clinical utility of MI-based selection and prediction strategies.
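A generic form of bootstrap optimism correction can be sketched as follows. This is one common scheme (refit on each bootstrap sample, then compare performance on that sample with performance on the original data); whether it matches every detail of our implementation is not specified here. The `fit` and `score` arguments are hypothetical placeholders, and the toy model (a sample mean scored by negative mean squared error) merely stands in for a penalized Cox model scored by tAUC.

```python
import random
from statistics import mean


def optimism_corrected(data, fit, score, n_boot=200, seed=0):
    """Return (apparent, optimism-corrected) performance; higher is better."""
    rng = random.Random(seed)
    apparent = score(fit(data), data)
    optimism = []
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]   # resample with replacement
        model = fit(boot)                         # refit on the bootstrap sample
        # Optimism: performance on the training sample minus on the original data.
        optimism.append(score(model, boot) - score(model, data))
    return apparent, apparent - mean(optimism)


def fit_mean(sample):
    """Toy 'model': the sample mean."""
    return mean(sample)


def neg_mse(model, sample):
    """Toy performance measure: negative mean squared error."""
    return -mean((y - model) ** 2 for y in sample)


data = [float(i) for i in range(20)]
apparent, corrected = optimism_corrected(data, fit_mean, neg_mse, n_boot=500)
```

The corrected value is lower than the apparent one because a model always looks better on the data used to fit it; applying the same wrapper to the IBS or calibration metrics would require refitting within each bootstrap sample, which is the computational cost noted above.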
Finally, this work is intended to provide pragmatic guidance for applied researchers, rather than new theoretical guarantees. Our findings are based on empirical evaluation of methods under realistic conditions, and do not establish formal consistency or optimality properties. While our simulation provides useful empirical insights, there remains limited formal theory supporting variable selection consistency or risk bounds in the setting of penalized Cox models with multiple imputations. Development of such theoretical foundations represents an important direction of future work.

6. Conclusions

Our findings underscore that there is no universally “best” method for variable selection in the context of MI and survival data. Method performance depends on signal strength, the degree of missingness, level of censoring, and the underlying correlation structure of the predictors. Nonetheless, certain approaches demonstrated consistently competitive performance in specific scenarios and may serve as strong candidates for use in similar prognostic modeling settings. For clinical researchers, the key message is that the choice of method should be driven not only by statistical performance metrics but also by considerations of model interpretability, parsimony, and clinical utility.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/bioengineering12111278/s1; Table S1. Mean CI coverage of the true covariates; Table S2. Complete-case Cox proportional hazards model estimates for baseline clinical and biomarker variables in CALGB 90401; Figure S1. Variable selection—correctly selected predictors for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S2. Variable selection—positive discovery/selection rate for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S3. Variable selection—false positives for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S4. Variable selection—false negatives for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S5. Parameter estimation—average bias across methods and tuning settings for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S6. Parameter estimation—average MSE across methods and tuning settings for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S7. tAUC (mean and SD) from ridge refits for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S8. Integrated Brier Score (mean and SD) for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S9. Calibration slope (mean and SD) for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S10. Calibration intercept at the 25th percentile (mean and SD) for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S11. Calibration intercept at the 50th percentile (mean and SD) for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S12. Calibration intercept at the 75th percentile (mean and SD) for weak signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S13. Variable selection—correctly selected predictors for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S14. Variable selection—positive discovery/selection rate for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S15. Variable selection—false positives for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S16. Variable selection—false negatives for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S17. Parameter estimation—average bias across methods and tuning settings for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S18. Parameter estimation—average MSE across methods and tuning settings for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S19. tAUC (mean and SD) from ridge refits for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S20. Integrated Brier Score (mean and SD) for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S21. Calibration slope (mean and SD) for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S22. Calibration intercept at the 25th percentile (mean and SD) for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S23. Calibration intercept at the 50th percentile (mean and SD) for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S24. Calibration intercept at the 75th percentile (mean and SD) for strong signal scenarios across MI levels, missingness (10%, 20%), and censoring (0.10, 0.30); Figure S25. Kaplan–Meier curve of overall survival in the CALGB 90401 data.

Author Contributions

Conceptualization, S.H.; methodology, Q.Y., B.L. and S.H.; software, Q.Y., B.L.; validation, Q.Y., B.L., C.Y. and S.H.; formal analysis, Q.Y., B.L. and S.H.; investigation, S.H.; resources, S.H.; data curation, Q.Y., B.L. and S.H.; writing—original draft preparation, Q.Y. and S.H.; writing—review and editing, Q.Y., B.L., C.Y. and S.H.; visualization, Q.Y.; supervision, S.H.; project administration, S.H.; funding acquisition, S.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the National Institutes of Health Grants R01 CA256157, R01 CA249279; the United States Army Medical Research Awards HT9425-23-1-0393 and HT9425-25-1-0623, the Food and Drug Administration (FDA) Award 1U01FD007857-01 of the U.S. Department of Health and Human Services (HHS); and the Prostate Cancer Foundation.

Institutional Review Board Statement

This research used de-identified human participant data obtained from the clinical study conducted by Nixon et al. (2025) [46]. The original study was approved by the Institutional Review Board of Duke University (Protocol Number: Pro00112912; Date of Approval: 1 May 2023).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study by the investigators of the original study reported in Nixon et al. (2025) [46].

Data Availability Statement

The datasets used in the application section are publicly available via the NCTN/NCORP Data Archive. All programs for the simulation and application were written in R version 4.2.1 and are available upon reasonable request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AIC: Akaike’s information criterion
ALASSO: Adaptive least absolute shrinkage and selection operator
BIC: Bayesian information criterion
CC: Complete cases
HR: Hazard ratio
IBS: Integrated Brier Score
LASSO: Least absolute shrinkage and selection operator
MAR: Missing at random
MCAR: Missing completely at random
mCRPC: Metastatic castration-resistant prostate cancer
MI: Multiple imputation
MNAR: Missing not at random
OS: Overall survival
tAUC: Time-dependent area under the curve

References

  1. Little, R.J.A.; Rubin, D.B. Statistical Analysis with Missing Data, 3rd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2019; ISBN 978-0-470-52679-8.
  2. Rubin, D.B. Inference and Missing Data. Biometrika 1976, 63, 581–592.
  3. Solomon, N.; Lokhnygina, Y.; Halabi, S. Comparison of Regression Imputation Methods of Baseline Covariates That Predict Survival Outcomes. J. Clin. Trans. Sci. 2021, 5, e40.
  4. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data Via the EM Algorithm. J. R. Stat. Soc. Ser. B Methodol. 1977, 39, 1–22.
  5. Beale, E.M.L.; Little, R.J.A. Missing Values in Multivariate Analysis. J. R. Stat. Soc. Ser. B Stat. Methodol. 1975, 37, 129–145.
  6. Chen, M.-H.; Ibrahim, J.G.; Lipsitz, S.R. Bayesian Methods for Missing Covariates in Cure Rate Models. Lifetime Data Anal. 2002, 8, 117–146.
  7. Rubin, D.B. Multiple Imputation for Nonresponse in Surveys; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 1987; ISBN 978-0-470-31669-6.
  8. Schafer, J.L. Analysis of Incomplete Multivariate Data; Chapman and Hall/CRC: New York, NY, USA, 1997; ISBN 978-0-367-80302-5.
  9. Little, R.J.; D’Agostino, R.; Cohen, M.L.; Dickersin, K.; Emerson, S.S.; Farrar, J.T.; Frangakis, C.; Hogan, J.W.; Molenberghs, G.; Murphy, S.A.; et al. The Prevention and Treatment of Missing Data in Clinical Trials. N. Engl. J. Med. 2012, 367, 1355–1360.
  10. Sterne, J.A.C.; White, I.R.; Carlin, J.B.; Spratt, M.; Royston, P.; Kenward, M.G.; Wood, A.M.; Carpenter, J.R. Multiple Imputation for Missing Data in Epidemiological and Clinical Research: Potential and Pitfalls. BMJ 2009, 338, b2393.
  11. Heymans, M.W.; Twisk, J.W.R. Handling Missing Data in Clinical Research. J. Clin. Epidemiol. 2022, 151, 185–188.
  12. Austin, P.C.; White, I.R.; Lee, D.S.; Van Buuren, S. Missing Data in Clinical Research: A Tutorial on Multiple Imputation. Can. J. Cardiol. 2021, 37, 1322–1331.
  13. Tibshirani, R. Regression Shrinkage and Selection Via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
  14. Zou, H. The Adaptive Lasso and Its Oracle Properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
  15. Zou, H.; Hastie, T. Regularization and Variable Selection via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320.
  16. Wood, A.M.; White, I.R.; Royston, P. How Should Variable Selection Be Performed with Multiply Imputed Data? Statist. Med. 2008, 27, 3227–3246.
  17. Chen, Q.; Wang, S. Variable Selection for Multiply-Imputed Data with Application to Dioxin Exposure Study. Statist. Med. 2013, 32, 3646–3659.
  18. Wan, Y.; Datta, S.; Conklin, D.J.; Kong, M. Variable Selection Models Based on Multiple Imputation with an Application for Predicting Median Effective Dose and Maximum Effect. J. Stat. Comput. Simul. 2015, 85, 1902–1916.
  19. Du, J.; Boss, J.; Han, P.; Beesley, L.J.; Kleinsasser, M.; Goutman, S.A.; Batterman, S.; Feldman, E.L.; Mukherjee, B. Variable Selection with Multiply-Imputed Datasets: Choosing Between Stacked and Grouped Methods. J. Comput. Graph. Stat. 2022, 31, 1063–1075.
  20. Zahid, F.M.; Faisal, S.; Heumann, C. Variable Selection Techniques after Multiple Imputation in High-Dimensional Data. Stat. Methods Appl. 2020, 29, 553–580.
  21. Zhao, Y.; Long, Q. Variable Selection in the Presence of Missing Data: Imputation-Based Methods. WIREs Comput. Stat. 2017, 9, e1402.
  22. Thao, L.T.P.; Geskus, R.A. Comparison of Model Selection Methods for Prediction in the Presence of Multiply Imputed Data. Biom. J. 2019, 61, 343–356.
  23. Jaouimaa, F.-Z.; Do Ha, I.; Burke, K. Penalized Variable Selection in Multi-Parameter Regression Survival Modeling. Stat. Methods Med. Res. 2023, 32, 2455–2471.
  24. Vonta, F.; Karagrigoriou, A. Variable Selection Strategies in Survival Models with Multiple Imputations. Lifetime Data Anal. 2007, 13, 295–315.
  25. Kelly, W.K.; Halabi, S.; Carducci, M.; George, D.; Mahoney, J.F.; Stadler, W.M.; Morris, M.; Kantoff, P.; Monk, J.P.; Kaplan, E.; et al. Randomized, Double-Blind, Placebo-Controlled Phase III Trial Comparing Docetaxel and Prednisone With or Without Bevacizumab in Men With Metastatic Castration-Resistant Prostate Cancer: CALGB 90401. J. Clin. Oncol. 2012, 30, 1534–1540.
  26. Akaike, H. A New Look at the Statistical Model Identification. IEEE Trans. Automat. Contr. 1974, 19, 716–723.
  27. Schwarz, G. Estimating the Dimension of a Model. Ann. Stat. 1978, 6, 461–464.
  28. Bainter, S.A.; McCauley, T.G.; Fahmy, M.M.; Goodman, Z.T.; Kupis, L.B.; Rao, J.S. Comparing Bayesian Variable Selection to Lasso Approaches for Applications in Psychology. Psychometrika 2023, 88, 1032–1055.
  29. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer Series in Statistics; Springer: New York, NY, USA, 2009; ISBN 978-0-387-84857-0.
  30. Breheny, P.; Huang, J. Group Descent Algorithms for Nonconvex Penalized Linear and Logistic Regression Models with Grouped Predictors. Stat. Comput. 2015, 25, 173–187.
  31. Uno, H.; Cai, T.; Tian, L.; Wei, L.J. Evaluating Prediction Rules for T-Year Survivors with Censored Regression Models. J. Am. Stat. Assoc. 2007, 102, 527–537.
  32. Halabi, S.; Li, C.; Luo, S. Developing and Validating Risk Assessment Models of Clinical Outcomes in Modern Oncology. JCO Precis. Oncol. 2019, 3, 1–12.
  33. Bender, R.; Augustin, T.; Blettner, M. Generating Survival Times to Simulate Cox Proportional Hazards Models. Stat. Med. 2005, 24, 1713–1723.
  34. Halabi, S.; Singh, B. Sample Size Determination for Comparing Several Survival Curves with Unequal Allocations. Stat. Med. 2004, 23, 1793–1815.
  35. Armstrong, A.J.; Nixon, A.B.; Carmack, A.; Yang, Q.; Eisen, T.; Stadler, W.M.; Jones, R.J.; Garcia, J.A.; Vaishampayan, U.N.; Picus, J.; et al. Angiokines Associated with Targeted Therapy Outcomes in Patients with Non-Clear Cell Renal Cell Carcinoma. Clin. Cancer Res. 2021, 27, 3317–3328.
  36. Halabi, S.; Yang, Q.; Carmack, A.; Zhang, S.; Foo, W.-C.; Eisen, T.; Stadler, W.M.; Jones, R.J.; Garcia, J.A.; Vaishampayan, U.N.; et al. Tissue Based Biomarkers in Non-Clear Cell RCC: Correlative Analysis from the ASPEN Clinical Trial. Kidney Cancer J. 2021, 19, 64–72.
  37. Brown, L.C.; Halabi, S.; Schonhoft, J.D.; Yang, Q.; Luo, J.; Nanus, D.M.; Giannakakou, P.; Szmulewitz, R.Z.; Danila, D.C.; Barnett, E.S.; et al. Circulating Tumor Cell Chromosomal Instability and Neuroendocrine Phenotype by Immunomorphology and Poor Outcomes in Men with mCRPC Treated with Abiraterone or Enzalutamide. Clin. Cancer Res. 2021, 27, 4077–4088.
  38. Pi, L.; Halabi, S. Combined Performance of Screening and Variable Selection Methods in Ultra-High Dimensional Data in Predicting Time-to-Event Outcomes. Diagn. Progn. Res. 2018, 2, 21.
  39. National Research Council (US). Panel on Handling Missing Data in Clinical Trials. In The Prevention and Treatment of Missing Data in Clinical Trials; National Academies Press (US): Washington, DC, USA, 2010; ISBN 978-0-309-15814-5.
  40. Jakobsen, J.C.; Gluud, C.; Wetterslev, J.; Winkel, P. When and How Should Multiple Imputation Be Used for Handling Missing Data in Randomised Clinical Trials—A Practical Guide with Flowcharts. BMC Med. Res. Methodol. 2017, 17, 162.
  41. Clark, T.G.; Altman, D.G. Developing a Prognostic Model in the Presence of Missing Data: An Ovarian Cancer Case Study. J. Clin. Epidemiol. 2003, 56, 28–37.
  42. Van Buuren, S.; Groothuis-Oudshoorn, K. Mice: Multivariate Imputation by Chained Equations in R. J. Stat. Soft. 2011, 45, 1–67.
  43. White, I.R.; Royston, P. Imputing Missing Covariate Values for the Cox Model. Stat. Med. 2009, 28, 1982–1998.
  44. Halabi, S.; Lin, C.-Y.; Kelly, W.K.; Fizazi, K.S.; Moul, J.W.; Kaplan, E.B.; Morris, M.J.; Small, E.J. Updated Prognostic Model for Predicting Overall Survival in First-Line Chemotherapy for Patients With Metastatic Castration-Resistant Prostate Cancer. J. Clin. Oncol. 2014, 32, 671–677.
  45. Plana, D.; Fell, G.; Alexander, B.M.; Palmer, A.C.; Sorger, P.K. Cancer Patient Survival Can Be Parametrized to Improve Trial Precision and Reveal Time-Dependent Therapeutic Effects. Nat. Commun. 2022, 13, 873.
  46. Nixon, A.B.; Liu, Y.; Yang, Q.; Luo, B.; Starr, M.D.; Brady, J.C.; Kelly, W.K.; Beltran, H.; Morris, M.J.; George, D.J.; et al. Prognostic and predictive analyses of circulating plasma biomarkers in men with metastatic castration resistant prostate cancer treated with docetaxel/prednisone with or without bevacizumab. Prostate Cancer Prostatic Dis. 2025, 28, 355–362.
Figure 1. Summary variable selection results using weak signal, 10 multiple imputation and 0.10 censoring, 10% missing.
Figure 2. Summary parameter estimates results using weak signal, 10 multiple imputation and 0.10 censoring, 10% missing.
Figure 3. Mean tAUC, Integrated Brier score and calibration (95% CI) under the weak signal setting (10 multiple imputation, 0.10 censoring, and 10% missing).
Figure 4. Summary variable selection results using strong signal, 10 multiple imputation and 0.10 censoring, 10% missing.
Figure 5. Summary parameter estimates results using strong signal, 10 multiple imputation and 0.10 censoring, 10% missing.
Figure 6. Mean tAUC, Integrated Brier score and calibration (95% CI) under the strong signal setting (10 multiple imputation, 0.10 censoring, and 10% missing).
Table 1. Summary statistics of variable selection by simulation setting for the weak signal.
                Missing = 10%, MI = 10, Censoring = 0.10  |  Missing = 20%, MI = 10, Censoring = 0.10
Method          CC     AVG50  AVG70  AVG90  STK2   GRP    |  CC     AVG50  AVG70  AVG90  STK2   GRP
Correctly Selection
LASSO.BIC       0.07   0.08   0.16   0.25   0.17   0.09   |  0.07   0.14   0.18   0.28   0.18   0.07
ALASSO.BIC      0.55   0.68   0.76   0.84   0.78   0.71   |  0.41   0.67   0.73   0.82   0.77   0.73
LASSO.CV.min    0.00   0.00   0.00   0.01   0.00   0.01   |  0.00   0.00   0.01   0.02   0.00   0.00
ALASSO.CV.min   0.24   0.41   0.46   0.62   0.29   0.78   |  0.19   0.37   0.45   0.56   0.24   0.60
LASSO.CV.1se    0.46   0.68   0.62   0.54   0.05   0.15   |  0.13   0.54   0.51   0.37   0.06   0.14
ALASSO.CV.1se   0.32   0.44   0.43   0.26   0.85   0.50   |  0.15   0.28   0.21   0.11   0.81   0.52
Positively Discovery
LASSO.BIC       0.75   0.79   0.83   0.87   0.84   0.82   |  0.76   0.81   0.85   0.89   0.86   0.83
ALASSO.BIC      0.93   0.96   0.97   0.98   0.97   0.97   |  0.93   0.96   0.97   0.98   0.97   0.98
LASSO.CV.min    0.59   0.60   0.64   0.70   0.51   0.66   |  0.62   0.61   0.66   0.73   0.50   0.64
ALASSO.CV.min   0.83   0.89   0.91   0.94   0.79   0.97   |  0.84   0.88   0.91   0.94   0.78   0.94
LASSO.CV.1se    0.95   0.98   0.98   0.98   0.76   0.92   |  0.95   0.98   0.98   0.98   0.77   0.93
ALASSO.CV.1se   1.00   1.00   1.00   1.00   0.98   1.00   |  1.00   1.00   1.00   1.00   0.98   1.00
False Positive
LASSO.BIC       0.31   0.24   0.19   0.14   0.17   0.19   |  0.29   0.21   0.16   0.11   0.15   0.17
ALASSO.BIC      0.07   0.05   0.03   0.02   0.03   0.03   |  0.07   0.04   0.03   0.02   0.03   0.02
LASSO.CV.min    0.60   0.57   0.48   0.37   0.81   0.45   |  0.52   0.56   0.46   0.33   0.82   0.49
ALASSO.CV.min   0.19   0.12   0.10   0.06   0.27   0.03   |  0.18   0.13   0.09   0.06   0.29   0.06
LASSO.CV.1se    0.05   0.02   0.02   0.02   0.28   0.08   |  0.05   0.02   0.02   0.02   0.26   0.07
ALASSO.CV.1se   0.00   0.00   0.00   0.00   0.02   0.00   |  0.00   0.00   0.00   0.00   0.02   0.00
False Negative
LASSO.BIC       0.00   0.00   0.00   0.00   0.00   0.02   |  0.02   0.00   0.00   0.00   0.00   0.02
ALASSO.BIC      0.00   0.00   0.00   0.00   0.00   0.01   |  0.03   0.00   0.00   0.01   0.00   0.01
LASSO.CV.min    0.00   0.00   0.00   0.00   0.00   0.00   |  0.01   0.00   0.00   0.00   0.00   0.00
ALASSO.CV.min   0.00   0.00   0.00   0.00   0.00   0.00   |  0.02   0.00   0.00   0.00   0.00   0.00
LASSO.CV.1se    0.04   0.02   0.03   0.04   0.00   0.07   |  0.12   0.04   0.05   0.07   0.00   0.08
ALASSO.CV.1se   0.10   0.08   0.09   0.13   0.00   0.06   |  0.20   0.12   0.15   0.20   0.01   0.06
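The AVG50, AVG70 and AVG90 columns appear to correspond to the rule described in the abstract, in which a variable is retained when it is selected in at least d% of the imputed datasets. A minimal Python sketch of that thresholding step is given here. The function name `select_by_inclusion_frequency`, the 0/1 selection matrix and the toy values are hypothetical and are not taken from the paper, and the subsequent ridge refit and pooling of estimates are omitted.

```python
import numpy as np

def select_by_inclusion_frequency(selected, d):
    """Return indices of variables selected in at least d% of imputed datasets.

    selected: (M, p) array of 0/1 flags; row m marks the variables kept by the
              penalized fit on imputed dataset m (hypothetical input).
    d:        inclusion threshold in percent, e.g. 50, 70, 90 or 100.
    """
    selected = np.asarray(selected, dtype=bool)
    n_imputations = selected.shape[0]
    counts = selected.sum(axis=0)  # number of imputations selecting each variable
    # Integer comparison avoids floating-point issues at the threshold.
    return np.flatnonzero(counts * 100 >= d * n_imputations)

# Toy example (assumed values): M = 10 imputations, p = 4 candidate variables.
flags = np.zeros((10, 4), dtype=int)
flags[:, 0] = 1    # selected in 10 of 10 imputations
flags[:9, 1] = 1   # selected in 9 of 10
flags[:6, 2] = 1   # selected in 6 of 10
flags[:2, 3] = 1   # selected in 2 of 10

for d in (50, 70, 90, 100):
    print(d, select_by_inclusion_frequency(flags, d).tolist())
```

Raising d can only shrink the retained set, which is in line with the lower false positive rates seen when moving from AVG50 to AVG90 in most rows of the table.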
Table 2. Summary statistics of parameter estimates by simulation setting for the weak signal.
                Missing = 10%, MI = 10, Censoring = 0.10  |  Missing = 20%, MI = 10, Censoring = 0.10
Method          CC     AVG50  AVG70  AVG90  STK2   GRP    |  CC     AVG50  AVG70  AVG90  STK2   GRP
Bias
LASSO.BIC       0.01   0.01   0.01   0.01   0.01   0.01   |  0.01   0.01   0.01   0.01   0.01   0.01
ALASSO.BIC      0.01   0.01   0.01   0.01   0.01   0.01   |  0.01   0.01   0.01   0.01   0.01   0.01
LASSO.CV.min    0.01   0.01   0.01   0.01   0.00   0.00   |  0.02   0.01   0.01   0.01   0.00   0.00
ALASSO.CV.min   0.01   0.01   0.01   0.01   0.00   0.00   |  0.01   0.01   0.01   0.01   0.00   0.00
LASSO.CV.1se    0.08   0.07   0.07   0.07   0.03   0.03   |  0.09   0.07   0.07   0.06   0.03   0.03
ALASSO.CV.1se   0.08   0.07   0.07   0.07   0.03   0.03   |  0.10   0.07   0.07   0.07   0.03   0.03
MSE
LASSO.BIC       0.01   0.01   0.01   0.01   0.01   0.01   |  0.02   0.01   0.01   0.01   0.01   0.01
ALASSO.BIC      0.01   0.01   0.01   0.01   0.01   0.01   |  0.02   0.01   0.01   0.01   0.01   0.01
LASSO.CV.min    0.01   0.01   0.01   0.01   0.01   0.01   |  0.02   0.01   0.01   0.01   0.01   0.01
ALASSO.CV.min   0.01   0.01   0.01   0.01   0.01   0.01   |  0.02   0.01   0.01   0.01   0.01   0.01
LASSO.CV.1se    0.09   0.08   0.08   0.09   0.03   0.03   |  0.12   0.09   0.09   0.10   0.03   0.03
ALASSO.CV.1se   0.10   0.08   0.08   0.09   0.03   0.03   |  0.13   0.10   0.10   0.11   0.03   0.03
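Bias and MSE summarize how far the estimated coefficients fall from the true values across simulation replicates. The sketch below shows the usual per-coefficient definitions in Python. How the paper aggregates across coefficients is not stated in this section, so the averaging over coefficients is an assumption, and the function name `bias_and_mse` and the toy numbers are hypothetical.

```python
import numpy as np

def bias_and_mse(estimates, truth):
    """Average bias and mean squared error of coefficient estimates.

    estimates: (R, p) array, one row of estimated coefficients per replicate.
    truth:     (p,) array of true coefficients.
    Returns (bias, mse), each averaged over the p coefficients (an assumption).
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    err = estimates - truth                   # replicate-level estimation errors
    bias_per_coef = err.mean(axis=0)          # mean error for each coefficient
    mse_per_coef = (err ** 2).mean(axis=0)    # mean squared error for each coefficient
    return bias_per_coef.mean(), mse_per_coef.mean()

# Toy example (assumed values): 2 replicates, 2 coefficients.
truth = [0.5, 0.0]
estimates = [[0.4, 0.0],
             [0.6, 0.1]]
bias, mse = bias_and_mse(estimates, truth)
print(round(bias, 4), round(mse, 4))
```

Under these definitions a method can show near-zero bias while still having a non-trivial MSE, because errors of opposite sign cancel in the bias but not in the MSE.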
Table 3. Summary of tAUC (standard deviation) from ridge regression for the simulation with 10 multiple imputations using the weak signal.
Method  LASSO.BIC      ALASSO.BIC     LASSO.CV.min   ALASSO.CV.min  LASSO.CV.1se   ALASSO.CV.1se
Missing = 10%, MI = 10, Censoring = 0.10
CC      0.855 (0.009)  0.858 (0.009)  0.854 (0.009)  0.857 (0.008)  0.854 (0.01)   0.848 (0.011)
AVG50   0.859 (0.008)  0.86 (0.008)   0.858 (0.008)  0.859 (0.008)  0.858 (0.009)  0.852 (0.01)
AVG70   0.859 (0.008)  0.86 (0.008)   0.858 (0.008)  0.859 (0.008)  0.857 (0.01)   0.852 (0.01)
AVG90   0.859 (0.008)  0.86 (0.008)   0.858 (0.008)  0.86 (0.008)   0.856 (0.01)   0.848 (0.011)
STK2    0.859 (0.008)  0.86 (0.008)   0.857 (0.008)  0.859 (0.008)  0.858 (0.008)  0.859 (0.008)
GRP     0.858 (0.008)  0.859 (0.009)  0.858 (0.008)  0.86 (0.008)   0.853 (0.011)  0.853 (0.009)
Missing = 10%, MI = 10, Censoring = 0.30
CC      0.847 (0.011)  0.849 (0.011)  0.846 (0.011)  0.848 (0.011)  0.834 (0.017)  0.817 (0.026)
AVG50   0.851 (0.01)   0.852 (0.01)   0.85 (0.01)    0.851 (0.01)   0.842 (0.016)  0.829 (0.018)
AVG70   0.851 (0.01)   0.852 (0.01)   0.85 (0.01)    0.852 (0.01)   0.841 (0.016)  0.827 (0.018)
AVG90   0.851 (0.01)   0.852 (0.01)   0.85 (0.01)    0.852 (0.01)   0.839 (0.017)  0.823 (0.018)
STK2    0.851 (0.01)   0.852 (0.01)   0.849 (0.01)   0.851 (0.01)   0.85 (0.01)    0.85 (0.011)
GRP     0.846 (0.013)  0.85 (0.012)   0.85 (0.01)    0.852 (0.01)   0.842 (0.013)  0.841 (0.013)
Missing = 20%, MI = 10, Censoring = 0.10
CC      0.852 (0.01)   0.854 (0.009)  0.85 (0.01)    0.853 (0.009)  0.844 (0.016)  0.834 (0.02)
AVG50   0.858 (0.008)  0.859 (0.008)  0.857 (0.008)  0.859 (0.008)  0.856 (0.009)  0.848 (0.012)
AVG70   0.858 (0.008)  0.859 (0.008)  0.857 (0.008)  0.859 (0.008)  0.856 (0.009)  0.845 (0.012)
AVG90   0.859 (0.008)  0.859 (0.008)  0.858 (0.008)  0.859 (0.008)  0.854 (0.01)   0.841 (0.013)
STK2    0.858 (0.008)  0.859 (0.008)  0.857 (0.008)  0.858 (0.008)  0.857 (0.008)  0.858 (0.008)
GRP     0.857 (0.008)  0.859 (0.008)  0.857 (0.008)  0.859 (0.008)  0.852 (0.011)  0.852 (0.009)
Missing = 20%, MI = 10, Censoring = 0.30
CC      0.84 (0.013)   0.842 (0.013)  0.839 (0.013)  0.842 (0.013)  0.807 (0.039)  0.794 (0.039)
AVG50   0.85 (0.01)    0.851 (0.01)   0.849 (0.01)   0.851 (0.01)   0.839 (0.016)  0.823 (0.017)
AVG70   0.851 (0.01)   0.851 (0.01)   0.849 (0.01)   0.851 (0.01)   0.836 (0.017)  0.821 (0.017)
AVG90   0.85 (0.011)   0.851 (0.01)   0.85 (0.01)    0.851 (0.011)  0.833 (0.017)  0.819 (0.016)
STK2    0.851 (0.01)   0.851 (0.01)   0.849 (0.01)   0.85 (0.01)    0.85 (0.01)    0.849 (0.011)
GRP     0.844 (0.013)  0.847 (0.014)  0.849 (0.01)   0.851 (0.01)   0.839 (0.014)  0.84 (0.014)
Table 4. Summary statistics of variable selection by simulation setting for the strong signal.
                Missing = 10%, MI = 10, Censoring = 0.10  |  Missing = 20%, MI = 10, Censoring = 0.10
Method          CC     AVG50  AVG70  AVG90  STK2   GRP    |  CC     AVG50  AVG70  AVG90  STK2   GRP
Correctly Selection
LASSO.BIC       0.00   0.00   0.00   0.14   0.00   0.02   |  0.00   0.01   0.05   0.28   0.02   0.00
ALASSO.BIC      0.75   0.76   0.92   0.98   0.80   0.92   |  0.66   0.70   0.87   0.97   0.77   0.69
LASSO.CV.min    0.00   0.00   0.02   0.28   0.00   0.08   |  0.00   0.00   0.03   0.30   0.00   0.02
ALASSO.CV.min   0.69   0.92   0.99   1.00   0.75   1.00   |  0.49   0.81   0.94   0.97   0.65   0.99
LASSO.CV.1se    0.05   0.67   0.82   0.93   0.00   0.15   |  0.00   0.62   0.81   0.92   0.00   0.09
ALASSO.CV.1se   1.00   1.00   1.00   1.00   0.95   1.00   |  1.00   1.00   0.99   0.97   0.90   1.00
Positively Discovery
LASSO.BIC       0.54   0.57   0.69   0.86   0.67   0.67   |  0.55   0.60   0.74   0.88   0.70   0.63
ALASSO.BIC      0.97   0.97   0.99   1.00   0.97   0.99   |  0.96   0.96   0.99   1.00   0.97   0.96
LASSO.CV.min    0.50   0.61   0.74   0.90   0.50   0.74   |  0.51   0.61   0.74   0.89   0.50   0.66
ALASSO.CV.min   0.95   0.99   1.00   1.00   0.95   1.00   |  0.91   0.98   0.99   1.00   0.94   1.00
LASSO.CV.1se    0.76   0.95   0.98   0.99   0.66   0.84   |  0.77   0.95   0.98   0.99   0.67   0.81
ALASSO.CV.1se   1.00   1.00   1.00   1.00   1.00   1.00   |  1.00   1.00   1.00   1.00   0.98   1.00
False Positive
LASSO.BIC       0.72   0.66   0.39   0.14   0.43   0.44   |  0.70   0.57   0.31   0.12   0.38   0.50
ALASSO.BIC      0.03   0.03   0.01   0.00   0.03   0.01   |  0.04   0.04   0.01   0.00   0.03   0.04
LASSO.CV.min    0.83   0.56   0.31   0.10   0.82   0.33   |  0.79   0.54   0.30   0.11   0.82   0.45
ALASSO.CV.min   0.06   0.01   0.00   0.00   0.06   0.00   |  0.09   0.02   0.01   0.00   0.07   0.00
LASSO.CV.1se    0.28   0.05   0.02   0.01   0.44   0.17   |  0.27   0.05   0.02   0.01   0.42   0.21
ALASSO.CV.1se   0.00   0.00   0.00   0.00   0.01   0.00   |  0.00   0.00   0.00   0.00   0.02   0.00
False Negative
LASSO.BIC       0.00   0.00   0.00   0.00   0.00   0.00   |  0.00   0.00   0.00   0.00   0.00   0.00
ALASSO.BIC      0.00   0.00   0.00   0.00   0.00   0.00   |  0.00   0.00   0.00   0.00   0.00   0.00
LASSO.CV.min    0.00   0.00   0.00   0.00   0.00   0.00   |  0.00   0.00   0.00   0.00   0.00   0.00
ALASSO.CV.min   0.00   0.00   0.00   0.00   0.00   0.00   |  0.00   0.00   0.00   0.00   0.00   0.00
LASSO.CV.1se    0.00   0.00   0.00   0.00   0.00   0.00   |  0.00   0.00   0.00   0.00   0.00   0.00
ALASSO.CV.1se   0.00   0.00   0.00   0.00   0.00   0.00   |  0.00   0.00   0.00   0.00   0.00   0.00
Table 5. Summary statistics of parameter estimates by simulation setting for the strong signal.
                Missing = 10%, MI = 10, Censoring = 0.10  |  Missing = 20%, MI = 10, Censoring = 0.10
Method          CC     AVG50  AVG70  AVG90  STK2   GRP    |  CC     AVG50  AVG70  AVG90  STK2   GRP
Bias
LASSO.BIC       −0.27  −0.35  −0.35  −0.35  −0.36  −0.35  |  −0.27  −0.39  −0.39  −0.39  −0.40  −0.39
ALASSO.BIC      −0.28  −0.35  −0.35  −0.35  −0.36  −0.35  |  −0.28  −0.39  −0.39  −0.39  −0.40  −0.39
LASSO.CV.min    −0.27  −0.36  −0.36  −0.35  −0.36  −0.36  |  −0.27  −0.40  −0.39  −0.39  −0.40  −0.39
ALASSO.CV.min   −0.28  −0.36  −0.36  −0.36  −0.36  −0.36  |  −0.28  −0.39  −0.39  −0.39  −0.40  −0.39
LASSO.CV.1se    −0.34  −0.43  −0.43  −0.43  −0.39  −0.43  |  −0.35  −0.47  −0.47  −0.47  −0.43  −0.47
ALASSO.CV.1se   −0.34  −0.43  −0.43  −0.43  −0.38  −0.43  |  −0.35  −0.47  −0.47  −0.47  −0.42  −0.47
MSE
LASSO.BIC       0.77   1.29   1.30   1.32   1.37   1.31   |  0.79   1.58   1.59   1.60   1.65   1.59
ALASSO.BIC      0.81   1.36   1.37   1.37   1.43   1.37   |  0.82   1.65   1.65   1.66   1.71   1.65
LASSO.CV.min    0.77   1.31   1.32   1.33   1.36   1.33   |  0.79   1.59   1.60   1.61   1.64   1.60
ALASSO.CV.min   0.80   1.38   1.38   1.38   1.42   1.38   |  0.82   1.66   1.66   1.67   1.70   1.66
LASSO.CV.1se    1.17   1.94   1.94   1.94   1.54   1.94   |  1.24   2.27   2.27   2.28   1.86   2.27
ALASSO.CV.1se   1.22   1.96   1.96   1.96   1.59   1.96   |  1.30   2.29   2.29   2.30   1.90   2.29
Table 6. Summary of tAUC (standard deviation) from ridge regression for the simulation with 10 multiple imputations using the strong signal.
Method  LASSO.BIC      ALASSO.BIC     LASSO.CV.min   ALASSO.CV.min  LASSO.CV.1se   ALASSO.CV.1se
Missing = 10%, MI = 10, Censoring = 0.10
CC      0.987 (0.002)  0.987 (0.002)  0.987 (0.002)  0.987 (0.002)  0.987 (0.002)  0.987 (0.002)
AVG50   0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)
AVG70   0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)
AVG90   0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.987 (0.002)  0.986 (0.002)
STK2    0.985 (0.002)  0.986 (0.002)  0.985 (0.002)  0.986 (0.002)  0.985 (0.002)  0.986 (0.002)
GRP     0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)  0.986 (0.002)
Missing = 10%, MI = 10, Censoring = 0.30
CC      0.984 (0.002)  0.984 (0.002)  0.984 (0.002)  0.984 (0.002)  0.984 (0.002)  0.984 (0.002)
AVG50   0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.984 (0.002)  0.983 (0.002)
AVG70   0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.984 (0.002)  0.983 (0.002)
AVG90   0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.984 (0.002)  0.983 (0.003)
STK2    0.982 (0.002)  0.983 (0.002)  0.982 (0.002)  0.983 (0.002)  0.983 (0.002)  0.983 (0.002)
GRP     0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.983 (0.002)  0.983 (0.002)
Missing = 20%, MI = 10, Censoring = 0.10
CC      0.986 (0.002)  0.987 (0.002)  0.986 (0.002)  0.987 (0.002)  0.986 (0.002)  0.987 (0.002)
AVG50   0.984 (0.002)  0.985 (0.002)  0.984 (0.002)  0.985 (0.002)  0.985 (0.002)  0.985 (0.002)
AVG70   0.984 (0.002)  0.985 (0.002)  0.984 (0.002)  0.985 (0.002)  0.985 (0.002)  0.985 (0.003)
AVG90   0.984 (0.002)  0.985 (0.002)  0.984 (0.002)  0.985 (0.003)  0.985 (0.002)  0.985 (0.003)
STK2    0.983 (0.003)  0.984 (0.002)  0.983 (0.003)  0.984 (0.003)  0.984 (0.003)  0.984 (0.002)
GRP     0.984 (0.002)  0.985 (0.002)  0.984 (0.002)  0.985 (0.002)  0.985 (0.002)  0.985 (0.002)
Missing = 20%, MI = 10, Censoring = 0.30
CC      0.983 (0.002)  0.984 (0.002)  0.983 (0.002)  0.984 (0.002)  0.983 (0.002)  0.984 (0.002)
AVG50   0.981 (0.002)  0.982 (0.002)  0.981 (0.002)  0.982 (0.002)  0.982 (0.002)  0.982 (0.003)
AVG70   0.981 (0.002)  0.982 (0.002)  0.981 (0.002)  0.982 (0.002)  0.983 (0.002)  0.982 (0.003)
AVG90   0.982 (0.002)  0.982 (0.002)  0.982 (0.002)  0.982 (0.002)  0.983 (0.003)  0.981 (0.004)
STK2    0.981 (0.003)  0.981 (0.003)  0.98 (0.003)   0.981 (0.003)  0.981 (0.003)  0.981 (0.003)
GRP     0.981 (0.002)  0.982 (0.002)  0.981 (0.002)  0.982 (0.002)  0.982 (0.002)  0.982 (0.003)
Table 7. Baseline characteristics of patients in the CALGB 90401 study.
Characteristic                Overall (N = 853)
Bone metastases
  No                          238 (27.9%)
  Yes                         615 (72.1%)
Visceral metastases
  No                          710 (83.2%)
  Yes                         143 (16.8%)
Opioid analgesic use
  No                          435 (51.0%)
  Yes                         255 (29.9%)
  Missing                     163 (19.1%)
Age
  Median [Min, Max]           69.0 [42.0, 93.0]
BMI
  Median [Min, Max]           28.9 [15.0, 212]
  Missing                     106 (12.4%)
Race
  Other                       101 (11.8%)
  White                       752 (88.2%)
ECOG Performance Status
  0                           479 (56.2%)
  1                           341 (40.0%)
  2                           33 (3.9%)
Comorbidity
  0                           265 (31.1%)
  1                           268 (31.4%)
  2                           161 (18.9%)
  3                           77 (9.0%)
  4                           47 (5.5%)
  5                           20 (2.3%)
  6                           6 (0.7%)
  7                           5 (0.6%)
  8                           2 (0.2%)
  9                           1 (0.1%)
  Missing                     1 (0.1%)
Gleason score
  2                           1 (0.1%)
  3                           8 (0.9%)
  4                           8 (0.9%)
  5                           28 (3.3%)
  6                           94 (11.0%)
  7                           300 (35.2%)
  8                           156 (18.3%)
  9                           226 (26.5%)
  10                          30 (3.5%)
  Missing                     2 (0.2%)
Previous radiotherapy
  No                          161 (18.9%)
  Yes                         692 (81.1%)
LDH >1 ULN
  No                          536 (62.8%)
  Yes                         314 (36.8%)
  Missing                     3 (0.4%)
ALB
  Median [Min, Max]           4.00 [1.10, 5.70]
  Missing                     4 (0.5%)
BILI
  Median [Min, Max]           0.500 [0, 3.00]
HGB
  Median [Min, Max]           12.8 [6.60, 17.7]
PLT
  Median [Min, Max]           253 [15.0, 813]
  Missing                     1 (0.1%)
WBC
  Median [Min, Max]           6.30 [2.50, 17.6]
ALKPHOS *
  Median [Min, Max]           4.76 [3.53, 7.60]
AST
  Median [Min, Max]           25.0 [5.00, 161]
PSA *
  Median [Min, Max]           4.33 [−3.00, 9.21]
* Log transformed.
Table 8. Coefficient estimates for CALGB 90401 data.
Method          DS2      DS3      PAIN     ECOG     LDH.High ALB      HGB      ALKPHOS  PSA      testo_m  Andro_m  deh_m
CC
LASSO.BIC       0.25     0.37     0.26     0.30     0.38     −0.25    −0.10    0.00     0.00     0.00     −0.01    0.00
ALASSO.BIC      0.31     0.40     0.25     0.30     0.41     −0.27    −0.11    0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.03     0.04     0.07     0.07     0.08     −0.07    −0.03    0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.31     0.40     0.25     0.30     0.41     −0.27    −0.11    0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.02     0.03     0.06     0.06     0.07     −0.06    −0.03    0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.04     0.07     0.09     0.08     0.13     −0.08    −0.02    0.00     0.00     0.00     0.00     0.00
AVG50
LASSO.BIC       0.13     0.32     0.12     0.26     0.28     −0.16    −0.10    0.15     0.07     0.00     0.00     0.00
ALASSO.BIC      0.16     0.35     0.09     0.25     0.29     −0.10    −0.09    0.15     0.06     0.00     0.00     0.00
LASSO.CV.min    0.05     0.19     0.09     0.19     0.22     −0.10    −0.07    0.13     0.05     0.00     −0.03    0.00
ALASSO.CV.min   0.11     0.29     0.08     0.24     0.29     −0.10    −0.07    0.13     0.04     0.00     0.00     0.00
LASSO.CV.1se    0.02     0.08     0.08     0.10     0.13     −0.09    −0.04    0.09     0.03     0.00     0.00     0.00
ALASSO.CV.1se   0.03     0.12     0.05     0.13     0.17     −0.08    −0.03    0.07     0.01     0.00     0.00     0.00
AVG70
LASSO.BIC       0.13     0.32     0.12     0.26     0.28     −0.16    −0.10    0.15     0.07     0.00     0.00     0.00
ALASSO.BIC      0.16     0.36     0.12     0.28     0.30     −0.17    −0.09    0.15     0.07     0.00     0.00     0.00
LASSO.CV.min    0.07     0.22     0.10     0.20     0.24     −0.11    −0.08    0.14     0.05     0.00     0.00     0.00
ALASSO.CV.min   0.11     0.29     0.08     0.24     0.29     −0.10    −0.07    0.13     0.04     0.00     0.00     0.00
LASSO.CV.1se    0.01     0.06     0.07     0.08     0.11     −0.08    −0.04    0.08     0.03     0.00     0.00     0.00
ALASSO.CV.1se   0.03     0.12     0.05     0.14     0.18     −0.08    −0.03    0.07     0.02     0.00     0.00     0.00
AVG90
LASSO.BIC       0.13     0.32     0.12     0.26     0.28     −0.16    −0.10    0.15     0.07     0.00     0.00     0.00
ALASSO.BIC      0.16     0.36     0.12     0.28     0.30     −0.17    −0.09    0.15     0.07     0.00     0.00     0.00
LASSO.CV.min    0.08     0.25     0.11     0.23     0.26     −0.14    −0.09    0.14     0.06     0.00     0.00     0.00
ALASSO.CV.min   0.13     0.31     0.11     0.27     0.30     −0.16    −0.08    0.15     0.05     0.00     0.00     0.00
LASSO.CV.1se    0.01     0.06     0.06     0.08     0.10     −0.07    −0.03    0.07     0.02     0.00     0.00     0.00
ALASSO.CV.1se   0.03     0.12     0.05     0.13     0.17     −0.08    −0.02    0.07     0.01     0.00     0.00     0.00
STK2
LASSO.BIC       0.13     0.32     0.12     0.26     0.28     −0.16    −0.10    0.15     0.07     0.00     0.00     0.00
ALASSO.BIC      0.16     0.36     0.12     0.28     0.30     −0.17    −0.09    0.15     0.07     0.00     0.00     0.00
LASSO.CV.min    0.14     0.32     0.08     0.24     0.25     −0.09    −0.08    0.10     0.06     −0.01    −0.05    0.00
ALASSO.CV.min   0.18     0.38     0.07     0.26     0.27     −0.08    −0.07    0.10     0.05     0.00     −0.05    0.00
LASSO.CV.1se    0.04     0.15     0.09     0.16     0.19     −0.10    −0.06    0.12     0.04     0.00     −0.02    0.00
ALASSO.CV.1se   0.07     0.22     0.07     0.21     0.26     −0.08    −0.05    0.11     0.03     0.00     −0.01    0.00
GRP
LASSO.BIC       0.00     0.23     0.00     0.29     0.28     0.00     −0.11    0.15     0.07     0.00     0.00     0.00
ALASSO.BIC      0.00     0.20     0.00     0.38     0.47     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.24     0.00     0.27     0.00     0.00     −0.12    0.22     0.08     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.10     0.00     0.00     −0.05    0.10     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00

Method          ANG2     BMP9     CD73     ChromograninA  HER3     HGF      ICAM1    IL6      OPN      PDGFAA   PDGFbb   PlGF
CC
LASSO.BIC       0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
AVG50
LASSO.BIC       0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00           0.00     0.00     0.28     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.08     0.00     0.00     0.00           0.00     0.00     0.17     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00           0.00     0.00     0.23     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
AVG70
LASSO.BIC       0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00           0.00     0.00     0.20     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00           0.00     0.00     0.23     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
AVG90
LASSO.BIC       0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00           0.00     0.00     0.29     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
STK2
LASSO.BIC       0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.05     −0.04    −0.03    0.02           0.02     0.05     0.17     −0.02    0.03     −0.07    0.02     0.04
ALASSO.CV.min   0.04     −0.04    −0.02    0.01           0.00     0.04     0.18     −0.01    0.02     −0.07    0.01     0.04
LASSO.CV.1se    0.06     −0.04    0.00     0.00           0.00     0.03     0.14     0.00     0.00     0.00     0.00     0.05
ALASSO.CV.1se   0.00     0.00     0.00     0.00           0.00     0.00     0.18     0.00     0.00     −0.03    0.00     0.00
GRP
LASSO.BIC       0.00     0.00     0.00     0.00           0.00     0.00     0.36     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00           0.00     0.00     0.38     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00           0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00

Method          SDF1     TGFb1    TGFb2    TGFbR3   TIMP     TSP2     VCAM1    VEGFA    VEGFD    VEGFR1   VEGFR2   VEGFR3
CC
LASSO.BIC       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
AVG50
LASSO.BIC       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00     0.16     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00     0.10     0.00     0.00     0.04     0.00     0.00     0.00     0.14
ALASSO.CV.min   0.00     0.00     0.00     0.00     0.12     0.00     0.00     0.00     0.00     0.00     0.00     0.19
LASSO.CV.1se    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
AVG70
LASSO.BIC       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00     0.10     0.00     0.00     0.05     0.00     0.00     0.00     0.17
ALASSO.CV.min   0.00     0.00     0.00     0.00     0.12     0.00     0.00     0.00     0.00     0.00     0.00     0.19
LASSO.CV.1se    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
AVG90
LASSO.BIC       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
STK2
LASSO.BIC       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.min    0.02     −0.03    −0.07    −0.05    0.18     0.04     0.02     0.08     −0.02    −0.02    0.09     0.14
ALASSO.CV.min   0.01     −0.02    −0.07    −0.04    0.19     0.03     0.00     0.08     −0.01    −0.01    0.09     0.14
LASSO.CV.1se    0.00     0.00     −0.05    0.00     0.10     0.00     0.00     0.04     0.00     0.00     0.05     0.12
ALASSO.CV.1se   0.00     0.00     −0.02    0.00     0.13     0.00     0.00     0.03     0.00     0.00     0.03     0.14
GRP
LASSO.BIC       0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.BIC      0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.30
LASSO.CV.min    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.min   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
LASSO.CV.1se    0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
ALASSO.CV.1se   0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
Table 9. Optimism-corrected tAUCs (standard deviation) for CALGB 90401 data.
Method  LASSO.BIC        ALASSO.BIC       LASSO.CV.min     ALASSO.CV.min    LASSO.CV.1se     ALASSO.CV.1se
CC      0.7418 (0.0198)  0.7413 (0.0199)  0.7441 (0.0206)  0.7411 (0.0198)  0.7476 (0.0205)  0.7389 (0.0199)
AVG50   0.7349 (0.0172)  0.7339 (0.0167)  0.7369 (0.0174)  0.7344 (0.0169)  0.7374 (0.0174)  0.7322 (0.0168)
AVG70   0.7328 (0.0166)  0.7321 (0.0165)  0.7339 (0.0166)  0.7323 (0.0162)  0.7353 (0.0169)  0.7301 (0.0162)
AVG90   0.7296 (0.0156)  0.7288 (0.0155)  0.7301 (0.0156)  0.7285 (0.0155)  0.732 (0.0159)   0.7269 (0.0151)
STK2    0.7401 (0.0161)  0.7393 (0.016)   0.7343 (0.0162)  0.7357 (0.0161)  0.7432 (0.0166)  0.7388 (0.0157)
GRP     0.7239 (0.0205)  0.7233 (0.0213)  0.7265 (0.0206)  NA               0.7271 (0.0217)  NA
Note: “NA” indicates the model selected no variables; therefore, tAUC was not available.
Table 10. Integrated Brier Score for CALGB 90401 data.
Method  LASSO.BIC      ALASSO.BIC     LASSO.CV.min   ALASSO.CV.min  LASSO.CV.1se   ALASSO.CV.1se
CC      0.125          0.125          0.126          0.125          0.126          0.126
AVG50   0.123          0.122          0.121          0.121          0.124          0.124
AVG70   0.123          0.123          0.121          0.121          0.124          0.124
AVG90   0.123          0.123          0.122          0.123          0.124          0.124
STK2    0.123          0.123          0.119          0.119          0.119          0.120
GRP     0.123          0.127          0.126          NA             0.127          NA
Note: “NA” indicates the model selected no variables; therefore, integrated Brier score was not available.
Table 11. Calibration slope for CALGB 90401 data.
Method  LASSO.BIC      ALASSO.BIC     LASSO.CV.min   ALASSO.CV.min  LASSO.CV.1se   ALASSO.CV.1se
CC      0.983          0.932          3.888          0.932          11.470         2.761
AVG50   1.051          1.048          1.187          1.145          3.195          3.153
AVG70   1.051          1.049          1.156          1.140          4.139          3.375
AVG90   1.051          1.049          1.122          1.124          3.497          3.445
STK2    1.052          1.049          1.042          1.037          1.259          1.362
GRP     1.057          1.061          1.096          NA             4.974          NA
Note: “NA” indicates the model selected no variables; therefore, calibration slope was not available.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Yang, Q.; Luo, B.; Yu, C.; Halabi, S. A Two-Step Variable Selection Strategy for Multiply Imputed Survival Data Using Penalized Cox Models. Bioengineering 2025, 12, 1278. https://doi.org/10.3390/bioengineering12111278
