Open Access Article
A Two-Step Variable Selection Strategy for Multiply Imputed Survival Data Using Penalized Cox Models
by Qian Yang 1, Bin Luo 2, Chenxi Yu 3 and Susan Halabi 3,*
Susan Halabi is the James B. Duke Distinguished Professor of Biostatistics & Bioinformatics and Co-Chief of the Division of Biostatistics at Duke University. She earned her Ph.D. from the University of Texas Health Sciences Center, Houston, in 1994. Her research focuses on the design and analysis of clinical trials, the development and evaluation of predictive models, and the design and analysis of biomarker and high-dimensional data, including variable selection and model validation. Dr. Halabi co-edited two foundational books in the field: Oncology Clinical Trials (2nd Edition, Demos, 2018) and the Textbook of Clinical Trials in Oncology (CRC Press, 2019). A past president of the Society for Clinical Trials and the recipient of several prestigious awards, she is also a Fellow of the Society for Clinical Trials, the American Statistical Association, and the American Society of Clinical Oncology.
1 Division of Infectious Diseases, Department of Medicine, Emory University School of Medicine, Atlanta, GA 30322, USA
2 School of Data Science and Analytics, Kennesaw State University, Marietta, GA 30060, USA
3 Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27708, USA
* Author to whom correspondence should be addressed.
Bioengineering 2025, 12(11), 1278; https://doi.org/10.3390/bioengineering12111278 (registering DOI)
Submission received: 1 September 2025 / Revised: 7 November 2025 / Accepted: 18 November 2025 / Published: 20 November 2025
Abstract
Multiple imputation (MI) is widely used for handling missing data. However, applying penalized methods after MI can be challenging because variable selection may be inconsistent across imputations. We propose a two-step variable selection method for multiply imputed datasets with survival outcomes: apply LASSO or adaptive LASSO (ALASSO) to each MI dataset, followed by ridge regression, and combine estimates using the variables selected in any or in d% (d = 50, 70, 90, 100) of the MI datasets. For comparison, we also fit stacked MI datasets with weighted penalized regression and a group LASSO approach that enforces consistent selection across imputations. Simulations with Cox models evaluated tuning by AIC, BIC, cross-validation at the minimum error, and the one-standard-error (1SE) rule. Across scenarios, performance differed by both the penalization method and the selection rule. More conservative choices, such as ALASSO with BIC and a 50% inclusion frequency, tended to control false positives and give more stable calibration. The grouped approach achieved comparable selection with modestly higher estimation error. Overall, no single method consistently outperformed the others across all scenarios. Our findings suggest that practitioners should weigh trade-offs between selection stability, estimation accuracy, and calibration when applying penalized methods to multiply imputed survival data.
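The inclusion-frequency rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy coefficient matrix, and the reading of "d%" as "selected in at least d% of the imputations" are assumptions made for illustration. The penalized Cox fits themselves are not shown; the sketch assumes the step-one (LASSO or ALASSO) and step-two (ridge) coefficients are already available as arrays with one row per imputed dataset, and it pools by simple averaging, which is the usual multiple-imputation point estimate.

```python
import numpy as np

def inclusion_mask(step1_coefs, d=None):
    """Return a boolean mask of variables kept by the inclusion rule.

    step1_coefs: (M, p) array of step-one coefficients, one row per
        imputed dataset; a zero entry means "not selected".
    d: threshold in percent (e.g. 50, 70, 90, 100), read here as
        "selected in at least d% of imputations" (an assumption);
        None means "selected in any imputation".
    """
    step1_coefs = np.asarray(step1_coefs, dtype=float)
    n_imputations = step1_coefs.shape[0]
    # Number of imputations in which each variable was selected.
    counts = (step1_coefs != 0).sum(axis=0)
    if d is None:
        return counts >= 1
    # Compare counts * 100 with d * M to avoid rounding a percentage.
    return counts * 100 >= d * n_imputations

def pool_estimates(step2_coefs, keep):
    """Average step-two coefficients across imputations for kept variables."""
    step2_coefs = np.asarray(step2_coefs, dtype=float)
    return step2_coefs[:, keep].mean(axis=0)

# Hypothetical coefficients: M = 5 imputations, p = 4 candidate variables.
toy_coefs = np.array([
    [0.5, 0.0, 0.20, 0.0],
    [0.4, 0.0, 0.00, 0.0],
    [0.6, 0.1, 0.30, 0.0],
    [0.5, 0.0, 0.00, 0.0],
    [0.5, 0.0, 0.25, 0.0],
])

keep_any = inclusion_mask(toy_coefs)        # selected in any imputation
keep_50 = inclusion_mask(toy_coefs, d=50)   # selected in at least 50%
keep_100 = inclusion_mask(toy_coefs, d=100) # selected in all imputations
# For brevity the same toy matrix stands in for the step-two estimates.
pooled_50 = pool_estimates(toy_coefs, keep_50)
```

In this toy example the third variable is selected in 3 of 5 imputations, so it is kept under the "any" and 50% rules but dropped at 70% and above, which shows how a stricter threshold trades sensitivity for fewer false positives.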