A Bayesian Approach for Imputation of Censored Survival Data
Abstract
1. Introduction
2. Imputing Censored Observations
2.1. Parametric Bayesian Approach
2.2. Illustration
3. Simulation Study
- Sample survival times were generated from a desired distribution (here the samples are simulated from a Weibull distribution).
- Some observations are then designated as censored as follows:
  - (a) A random value is drawn for each individual from the censoring distribution, taken here to be an exponential distribution.
  - (b) If the value generated from the censoring distribution is smaller than the actual survival value in the sample, that individual is treated as censored at the censoring value obtained from the exponential distribution.
  - (c) Otherwise, the observation is uncensored and retains its generated survival time.
- The parametric Bayesian imputation method is used to impute values for the incomplete censored observations.
- Imputed values are combined together with the uncensored failure times to create a complete data set.
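The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the fitted model: the shape, scale and censoring rate are hypothetical values, and the imputation step draws from a Weibull distribution truncated at the censoring time with the parameters held fixed, whereas the parametric Bayesian approach would draw the parameters from their posterior distribution. The function names are likewise illustrative.

```python
import math
import random


def simulate_censored_sample(n, shape, scale, censor_rate, rng):
    """Return a list of (observed_time, is_censored, true_time) triples."""
    sample = []
    for _ in range(n):
        true_time = rng.weibullvariate(scale, shape)  # survival time (scale, shape)
        censor_time = rng.expovariate(censor_rate)    # exponential censoring time
        if censor_time < true_time:
            # Censored: only the censoring time is observed
            sample.append((censor_time, True, true_time))
        else:
            # Uncensored: the generated survival time is observed
            sample.append((true_time, False, true_time))
    return sample


def impute_beyond(censor_time, shape, scale, rng):
    """Draw a Weibull time conditional on exceeding censor_time (inverse CDF)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return scale * ((censor_time / scale) ** shape - math.log(u)) ** (1.0 / shape)


# Hypothetical settings, chosen only for illustration
SHAPE, SCALE, CENSOR_RATE = 1.5, 10.0, 0.05
rng = random.Random(2022)

data = simulate_censored_sample(100, SHAPE, SCALE, CENSOR_RATE, rng)

# Complete data set: imputed values for censored cases, observed times otherwise
completed = [
    impute_beyond(obs, SHAPE, SCALE, rng) if censored else obs
    for obs, censored, _ in data
]
```

Repeating the last step with fresh draws gives multiple completed data sets from the same censored sample.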
- A sample is generated from a particular Weibull distribution and the resulting values are considered as the true failure times.
- The desired percentage of censoring is applied, where for the censored observations the true failure values are replaced with values from an exponential distribution.
- WinBUGS is used to fit the parametric Bayesian model, and simulated draws of the imputed values for the censored observations are generated.
- Using 100 sets of these simulated imputed values and combining them with the uncensored event times, 100 complete datasets are generated.
- The difference between the survivor function based on the true values from the Weibull distribution and the survivor function based on the completed (imputed) datasets is calculated over a range of different time points.
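The comparison in the final step can be illustrated as follows. This sketch assumes that the survivor function "based on the true values" is the empirical survivor function of the true failure times; the alternative reading, using the theoretical Weibull survivor function, would change only one line. The toy data and function names are hypothetical and serve only to show the calculation.

```python
def empirical_survivor(times, t):
    """Proportion of a complete (uncensored) sample surviving beyond time t."""
    return sum(1 for x in times if x > t) / len(times)


def survivor_differences(true_times, completed_sets, time_points):
    """For each completed data set, return S_true(t) - S_completed(t) at each t."""
    return [
        [
            empirical_survivor(true_times, t) - empirical_survivor(completed, t)
            for t in time_points
        ]
        for completed in completed_sets
    ]


# Hypothetical toy example: four true failure times and two completed data sets
true_times = [2.0, 4.0, 6.0, 8.0]
completed_sets = [
    [2.0, 4.0, 9.0, 8.0],  # third value imputed above its true value
    [2.0, 4.0, 6.0, 8.0],  # imputation happens to recover the true values
]
time_points = [3.0, 5.0, 7.0]

diffs = survivor_differences(true_times, completed_sets, time_points)
```

Each row of `diffs` corresponds to one completed data set, so summaries across rows at a fixed time point describe how far the imputed survivor curves stray from the truth at that time.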
- A total of 100 sets of data with the desired sample size are simulated from the specific Weibull distribution with no censoring.
- The survivor curve of these data is estimated using a Kaplan–Meier estimator.
- For each of the 100 simulated sets, the Kaplan–Meier estimate is subtracted from the true value of the Weibull distribution survivor function at specific time points.
- The maximum, minimum and quartiles of these differences are used to define a bound range and interquartile range over the time period of interest.
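A minimal sketch of this bound calculation is given below. The Weibull parameters, sample size and time points are hypothetical, the Kaplan–Meier estimator is written out directly rather than taken from a survival analysis library, and the quartiles use the default convention of Python's `statistics.quantiles`, which is only one of several in common use.

```python
import math
import random
import statistics


def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] is True for an observed failure."""
    surv = 1.0
    at_risk = len(times)
    for u in sorted(set(times)):  # distinct observed times, increasing
        if u > t:
            break
        deaths = sum(1 for x, e in zip(times, events) if x == u and e)
        leaving = sum(1 for x in times if x == u)
        if deaths:
            surv *= 1.0 - deaths / at_risk
        at_risk -= leaving  # failures and censored cases leave the risk set
    return surv


def weibull_survivor(t, shape, scale):
    """True Weibull survivor function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def difference_bounds(n_sets, n, shape, scale, time_points, rng):
    """Summarise true S(t) minus the Kaplan-Meier estimate over simulated sets."""
    diffs = {t: [] for t in time_points}
    for _ in range(n_sets):
        times = [rng.weibullvariate(scale, shape) for _ in range(n)]
        events = [True] * n  # no censoring in these reference samples
        for t in time_points:
            diffs[t].append(
                weibull_survivor(t, shape, scale) - kaplan_meier(times, events, t)
            )
    bounds = {}
    for t, d in diffs.items():
        q1, q2, q3 = statistics.quantiles(d, n=4)
        bounds[t] = {"min": min(d), "q1": q1, "median": q2, "q3": q3, "max": max(d)}
    return bounds


# Hypothetical settings, chosen only for illustration
rng = random.Random(1)
bounds = difference_bounds(100, 50, 1.5, 10.0, [2.0, 5.0, 10.0, 15.0], rng)
```

The `min` and `max` entries give the bound range at each time point and `q1` and `q3` give the interquartile range, against which differences from imputed data sets can be judged.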
4. Applications
4.1. 6-MP Data
4.2. Bronchopulmonary Dysplasia (BPD) Data
5. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Moghaddam, S.; Newell, J.; Hinde, J. A Bayesian Approach for Imputation of Censored Survival Data. Stats 2022, 5, 89-107. https://doi.org/10.3390/stats5010006