Semiparametric Survival Analysis of 30-Day Hospital Readmissions with Bayesian Additive Regression Kernel Model
Abstract
1. Introduction
2. Bayesian Additive Regression Kernel Model for Survival Outcome
2.1. A Quick Background of Bayesian Additive Regression Kernels (BARK)
- $x_i$: observed covariate vectors in X.
- $y_i$: responses.
- $\beta_j$: regression coefficient.
- $\chi_j$: kernel location parameter in X.
- $\epsilon_i$: independent additive Gaussian noise.
- $\lambda_l$: the scale parameter, which measures the contribution of the lth variable in the kernel function.
- $\lambda_l = 0$: no contribution made by the lth variable through the kernel function.
- $\lambda_l$ is large: the lth variable is important in the kernel function.
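The role of the per-variable scale parameters can be illustrated with a short sketch. It assumes a Gaussian-type kernel of the form exp(-Σ_l λ_l (x_l - χ_l)²) and a mean function built as a weighted sum of such kernels; the function names `bark_kernel` and `mean_function` are hypothetical and chosen only for this example.

```python
import math

def bark_kernel(x, chi, lam):
    """Assumed kernel: exp(-sum_l lam[l] * (x[l] - chi[l])**2)."""
    return math.exp(-sum(l * (a - b) ** 2 for l, a, b in zip(lam, x, chi)))

def mean_function(x, betas, chis, lam):
    """Weighted sum of kernels centred at the location parameters chis."""
    return sum(b * bark_kernel(x, c, lam) for b, c in zip(betas, chis))

# A zero scale parameter removes a variable from the kernel entirely:
# the second coordinate differs a lot, yet the kernel value is unchanged.
same = bark_kernel([1.0, 5.0], [1.0, 100.0], [2.0, 0.0])

# A large scale parameter makes the kernel very sensitive to that variable.
sensitive = bark_kernel([0.0], [1.0], [50.0])

print(same, sensitive)
```

With a scale of zero the second variable has no effect on the kernel value, while a large scale makes even a unit difference drive the kernel close to zero.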
2.1.1. Use of Symmetric α-Stable (SαS) Lévy Random Field as Prior Distribution on the Mean Function f(x)
- : the random signed measure.
- the probability measure on X.
- $\alpha$: the stable index.
- : the intensity parameter.
2.1.2. Automatic Feature Selection in BARK
1. Equal weights: all kernel scale parameters are equal and share a single Gamma prior.
2. Different weights: the kernel scale parameters differ, and each follows its own Gamma prior.
3. Equal weights with hard shrinkage and selection: a vector of indicators records whether each variable is selected, and each indicator follows a Bernoulli distribution. If the indicator is 1, the corresponding kernel scale parameter is nonzero; otherwise it is 0. All selected kernel scale parameters are equal, as in Equal weights.
4. Different weights with hard shrinkage and selection: the same formulation as in 3, except that each selected kernel scale parameter has its own Gamma distribution, with hyperparameters involving d, the number of nonzero kernel scale parameters.
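The four prior options above can be sketched as a single sampling routine. This is only an illustration: the Gamma shape and rate, the Bernoulli selection probability, and the name `draw_scales` are all placeholder assumptions, and any dependence of the Gamma hyperparameters on d in the last option is not modelled.

```python
import random

def draw_scales(p, scheme, rng, shape=1.0, rate=1.0, p_select=0.5):
    """Draw p kernel scale parameters under one of four prior schemes.

    shape, rate and p_select are placeholder hyperparameters (assumptions).
    """
    def gamma_draw():
        # random.gammavariate takes (shape, scale); scale = 1 / rate.
        return rng.gammavariate(shape, 1.0 / rate)

    if scheme == "equal":
        lam = gamma_draw()
        return [lam] * p
    if scheme == "different":
        return [gamma_draw() for _ in range(p)]

    # Hard shrinkage and selection: Bernoulli indicators pick the variables.
    indicators = [1 if rng.random() < p_select else 0 for _ in range(p)]
    if scheme == "equal_select":
        lam = gamma_draw()
        return [lam * g for g in indicators]
    if scheme == "different_select":
        return [gamma_draw() if g else 0.0 for g in indicators]
    raise ValueError(f"unknown scheme: {scheme}")

rng = random.Random(2022)
for scheme in ("equal", "different", "equal_select", "different_select"):
    print(scheme, [round(v, 3) for v in draw_scales(5, scheme, rng)])
```

Under the two selection schemes, unselected variables receive a scale of exactly zero and therefore drop out of the kernel.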
2.1.3. Likelihood and the Posterior Distribution
- K: the kernel matrix, with entries $K_{ij} = K(x_i, \chi_j)$.
2.2. Nonparametric Survival Analysis with BARK
2.2.1. Data Transformation Method: Converting Right-Censored Data to Binary Classification
2.2.2. Bayesian Additive Kernel-Based Survival Model for the Transformed Data (sBARK)
- $p_{ij}$: the probability of an event at time $t_j$ given no previous event for instance $i$.
- $\Phi$: the standard normal CDF used in the probit link function.
- $y_{ij}$: the predicted event status.
- $\mu_0$: the baseline, which equals 0 for a centered outcome.
- $z_{ij}$: truncated normal latent variable.
- $f$: the nonlinear function that connects the binary response with the set of predictors. The function $f$ [35] is modeled nonparametrically using the BARK architecture described in Section 2.1.
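The binary responses that this model is defined on come from expanding each right-censored observation over a grid of distinct observed times. A minimal sketch of that expansion follows, assuming the common discrete-time convention in which a subject contributes one row per grid time up to its own observed time, and the indicator is 1 only at an observed event time. The function name `expand_to_binary` and the toy data are hypothetical.

```python
def expand_to_binary(times, status):
    """Expand right-censored data into (subject, grid time, indicator) rows.

    times  : observed time per subject (event or censoring time)
    status : 1 if the event was observed, 0 if censored
    """
    grid = sorted(set(times))  # distinct observed times, in increasing order
    rows = []
    for i, (t_i, d_i) in enumerate(zip(times, status)):
        for t_j in grid:
            if t_j > t_i:
                break  # subject i is no longer at risk after its own time
            y_ij = 1 if (d_i == 1 and t_j == t_i) else 0
            rows.append((i, t_j, y_ij))
    return rows

# Toy data: subject 1 is censored at time 5, the others have events.
rows = expand_to_binary(times=[2, 5, 5, 7], status=[1, 0, 1, 1])
for row in rows:
    print(row)
```

Each row can then be treated as one binary observation, with the grid time entering the model as an additional predictor alongside the covariates.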
2.3. MCMC Algorithm to Fit sBARK
- Sample the latent variables $z_{ij}$, conditional on all other parameters, from the truncated normal distribution described in (10). The direction of truncation is determined by the value of $y_{ij}$ in (10). The truncated normal distribution has a mean based on the posterior sample of $f$ from the BARK output and a variance of 1.
- Update all parameters of the BARK architecture with the MCMC process built into the BARK algorithm. The full MCMC scheme for generating the BARK parameters is documented in detail in [26] and at https://github.com/merliseclyde/bark, accessed on 20 November 2019.
- The $z_{ij}$ are centered around a known constant $\mu_0$, which is equivalent to centering the probabilities $p_{ij}$ around $p_0 = \Phi(\mu_0)$. By default, we set $\mu_0 = 0$, which means the $p_{ij}$ are centered around 0.5.
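The latent-variable draw in the first step above can be sketched with inverse-CDF sampling. The sketch assumes the usual probit data-augmentation convention, in which the latent variable is restricted to be positive when the binary response is 1 and non-positive when it is 0; the function names are hypothetical, and `mean` stands in for the current value of the baseline plus the fitted function.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf

def _sample_below_zero(mean, rng):
    """Draw from N(mean, 1) truncated to (-inf, 0] by inverse-CDF sampling."""
    upper = _STD_NORMAL.cdf(-mean)           # P(N(mean, 1) <= 0)
    u = (1.0 - rng.random()) * upper         # uniform on (0, upper]
    u = min(max(u, 1e-300), 1.0 - 2.0**-53)  # keep the argument inside (0, 1)
    return mean + _STD_NORMAL.inv_cdf(u)

def sample_latent(y, mean, rng):
    """Draw z ~ N(mean, 1), truncated to z > 0 if y == 1 and z <= 0 otherwise.

    The sign convention is an assumption (standard probit augmentation).
    """
    if y == 1:
        # Reflect: z > 0 under mean m is the mirror image of w < 0 under -m.
        return -_sample_below_zero(-mean, rng)
    return _sample_below_zero(mean, rng)

rng = random.Random(0)
print(sample_latent(1, -0.5, rng), sample_latent(0, 0.7, rng))
```

Reflecting the positive case onto the lower tail keeps the uniform draw away from 1, where the normal CDF loses precision. The sketch is intended for moderate means; very extreme means would need a dedicated tail sampler.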
2.4. Prior and MCMC Parameter Selection for sBARK
3. Simulation Study and a Benchmark Real Data Analysis
3.1. Simulated Data
3.2. Benchmark Data Analysis with Kidney Catheter Data
4. Analysis of 30-Day Hospital Readmission Data
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Readmissions Reduction Program (HRRP). Available online: https://www.cms.gov/medicare/medicare-fee-for-service-payment/acuteinpatientpps/readmissions-reduction-program.html (accessed on 20 November 2019).
2. Hospital Quality Initiative—Outcome Measures 2016 Chartbook. Available online: https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/OutcomeMeasures.html (accessed on 20 November 2019).
3. Jencks, S.F.; Williams, M.V.; Coleman, E.A. Rehospitalizations among Patients in the Medicare Fee-for-Service Program. N. Engl. J. Med. 2009, 360, 1418–1428.
4. Weissman, J.S.; Ayanian, J.Z.; Chasan-Taber, S.; Sherwood, M.J.; Roth, C.; Epstein, A.M. Hospital Readmissions and Quality of Care. Med. Care 1999, 37, 490–501.
5. Polanczyk, C.A.; Newton, C.; Dec, G.W.; Di Salvo, T.G. Quality of care and hospital readmission in congestive heart failure: An explicit review process. J. Card. Fail. 2001, 7, 289–298.
6. Luthi, J.-C.; Lund, M.J.; Sampietro-Colom, L.; Kleinbaum, D.G.; Ballard, D.J.; McClellan, W.M. Readmissions and the quality of care in patients hospitalized with heart failure. Int. J. Qual. Health Care 2003, 15, 413–421.
7. Boccuti, C.; Casillas, G. Aiming for Fewer Hospital U-turns: The Medicare Hospital Readmission Reduction Program. Policy Brief. 2017, 1–10.
8. Kansagara, D.; Englander, H.; Salanitro, A.; Kagen, D.; Theobald, C.; Freeman, M.; Kripalani, S. Risk prediction models for hospital readmission: A systematic review. JAMA 2011, 306, 1688–1698.
9. Miller, R.G. Survival Analysis; John Wiley & Sons: Hoboken, NJ, USA, 1997.
10. Wang, P.; Li, Y.; Reddy, C.K. Machine Learning for Survival Analysis: A Survey. arXiv 2017, arXiv:1708.04649. Available online: https://arxiv.org/abs/1708.04649 (accessed on 28 February 2020).
11. Klein, J.P.; Moeschberger, M.L. Survival Analysis: Techniques for Censored and Truncated Data, 2nd ed.; Springer: New York, NY, USA, 2003; Volume 9, pp. 302–308.
12. Dätwyler, C.; Stucki, T. Parametric Survival Models. 2011. Available online: http://stat.ethz.ch/education/semesters/ss2011/seminar/contents/handout_9.pdf (accessed on 15 February 2020).
13. Kaplan, E.L.; Meier, P. Nonparametric Estimation from Incomplete Observations. J. Am. Stat. Assoc. 1958, 53, 457.
14. Andersen, P.K.; Borgan, O.; Gill, R.D.; Keiding, N. Statistical Models Based on Counting Processes; Springer: Berlin/Heidelberg, Germany, 1993.
15. Cutler, S.J.; Ederer, F. Maximum Utilization of the Life Table Method in Analyzing Survival. In Annals of Life Insurance Medicine; Springer: Berlin/Heidelberg, Germany, 1964; pp. 9–22.
16. Cox, D.R. Regression Models and Life-Tables. J. R. Stat. Soc. Ser. B 1972, 34, 187–220.
17. Ishwaran, H.; Kogalur, U.B.; Blackstone, E.H.; Lauer, M.S. Random survival forests. Ann. Appl. Stat. 2008, 2, 841–860.
18. Chipman, H.A.; George, E.I.; McCulloch, R.E. BART: Bayesian additive regression trees. Ann. Appl. Stat. 2012, 6, 266–298.
19. Sparapani, R.; Logan, B.R.; McCulloch, R.E.; Laud, P.W. Nonparametric survival analysis using Bayesian Additive Regression Trees (BART). Stat. Med. 2016, 35, 2741–2753.
20. Bonato, V.; Baladandayuthapani, V.; Broom, B.M.; Sulman, E.P.; Aldape, K.D.; Do, K.-A. Bayesian ensemble methods for survival prediction in gene expression data. Bioinformatics 2010, 27, 359–367.
21. Van Belle, V.; Pelckmans, K.; Van Huffel, S.; Suykens, J.A.K. Support vector methods for survival analysis: A comparison between ranking and regression approaches. Artif. Intell. Med. 2011, 53, 107–118.
22. Khan, F.M.; Zubek, V.B. Support Vector Regression for Censored Data (SVRc): A Novel Tool for Survival Analysis. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; Institute of Electrical and Electronics Engineers (IEEE): Piscataway, NJ, USA, 2008; pp. 863–868.
23. Shivaswamy, P.K.; Chu, W.; Jansche, M. A support vector approach to censored targets. In Proceedings of the IEEE International Conference on Data Mining, ICDM 2007, Omaha, NE, USA, 28–31 October 2007; Institute of Electrical and Electronics Engineers (IEEE): Piscataway, NJ, USA, 2008; pp. 655–660.
24. Kiaee, F.; Sheikhzadeh, H.; Eftekhari Mahabadi, S. Relevance Vector Machine for Survival Analysis. In IEEE Transactions on Neural Networks and Learning Systems; Institute of Electrical and Electronics Engineers (IEEE): Piscataway, NJ, USA, 2015; Volume 27, pp. 648–660.
25. Tipping, M. Sparse Bayesian Learning and the Relevance Vector Machine. J. Mach. Learn. Res. 2001, 1, 211–244.
26. Ouyang, Z. Bayesian Additive Regression Kernels; Duke University: Durham, NC, USA, 2008.
27. Chakraborty, S.; Ghosh, M.; Mallick, B.K. Bayesian nonlinear regression for large p small n problems. J. Multivar. Anal. 2012, 108, 28–40.
28. Aronszajn, N. Theory of reproducing kernels. Trans. Am. Math. Soc. 1950, 68, 337–404.
29. Maity, A.; Mallick, B.K. Proportional Hazards Regression Using Bayesian Kernel Machines. In Bayesian Modeling in Bioinformatics; CRC Press Taylor & Francis Group: Boca Raton, FL, USA, 2011; pp. 317–336.
30. Kalbfleisch, J.D. Non-Parametric Bayesian Analysis of Survival Time Data. J. R. Stat. Soc. Ser. B 1978, 40, 214–221.
31. Electronic Health Records. Available online: https://www.cms.gov/Medicare/E-Health/EHealthRecords/index.html (accessed on 20 November 2019).
32. Hodgkins, A.J.; Bonney, A.; Mullan, J.; Mayne, D.; Barnett, S. Survival analysis using primary care electronic health record data: A systematic review of the literature. Health Inf. Manag. J. 2017, 47, 6–16.
33. Albert, J.H.; Chib, S. Bayesian Analysis of Binary and Polychotomous Response Data. J. Am. Stat. Assoc. 1993, 88, 669.
34. Gelman, A.; Rubin, D.B. Inference from Iterative Simulation Using Multiple Sequences. Stat. Sci. 1992, 7, 457–472.
35. Moriña, D.; Navarro, A. The R Package survsim for the Simulation of Simple and Complex Survival Data. J. Stat. Softw. 2014, 59, 1–20.
36. McGilchrist, C.A.; Aisbett, C.W. Regression with Frailty in Survival Analysis. Biometrics 1991, 47, 461.
37. Therneau, T.; Grambsch, P. Modeling Survival Data: Extending the Cox Model, 1st ed.; Springer: New York, NY, USA, 2000.
38. Johnson, A.E.W.; Pollard, T.J.; Shen, L.; Lehman, L.-W.H.; Feng, M.; Ghassemi, M.; Moody, B.; Szolovits, P.; Celi, L.A.; Mark, R.G. MIMIC-III, a freely accessible critical care database. Sci. Data 2016, 3, 160035.
39. Fahrmeir, L. Encyclopedia of Biostatistics. Discrete Survival-Time Models; Wiley: Hoboken, NJ, USA, 1998; pp. 1163–1168.
40. Masyn, K.E. Discrete-Time Survival Mixture Analysis for Single and Recurrent Events Using Latent Variables. Unpublished Doctoral Dissertation, University of California, Los Angeles, CA, USA, 2003. Available online: http://www.statmodel.com/download/masyndissertation.pdf (accessed on 20 April 2020).

Cross-Validation Fold | Survival BARK | Survival BART | Cox PH-Model |
---|---|---|---|
1 | 0.7628994 | 0.7726098 | 2.9273658 |
2 | 0.7417008 | 0.9180355 | 2.9747698 |
3 | 0.9238848 | 0.9706946 | 2.9974077 |
4 | 0.7477913 | 1.0473969 | 2.9366146 |
5 | 0.8638385 | 0.8741633 | 2.9211262 |
Mean | 0.8080229 | 0.9165800 | 2.9514568 |

Cross-Validation Fold | Survival BARK | Survival BART | Cox PH-Model |
---|---|---|---|
1 | 0.6738203 | 0.9865552 | 3.6232368 |
2 | 0.8400254 | 1.1055954 | 4.1958699 |
3 | 0.7821322 | 1.2144879 | 3.8610213 |
4 | 0.3709701 | 0.4996261 | 3.6559455 |
5 | 0.9308825 | 1.2646332 | 3.9148510 |
Mean | 0.7195661 | 1.0141800 | 3.8501849 |

Cross-Validation Fold | Survival BARK | Survival BART | Cox PH-Model |
---|---|---|---|
1 | 1.7829373 | 3.8578604 | 8.1691553 |
2 | 1.6841202 | 3.8892409 | 8.0396034 |
3 | 1.6383323 | 3.8480278 | 8.1605448 |
4 | 1.8599575 | 3.6338503 | 8.2678329 |
5 | 1.7226940 | 3.7844023 | 8.3306097 |
Mean | 1.7376082 | 3.8026763 | 8.1935492 |

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chakraborty, S.; Zhao, P.; Huang, Y.; Dey, T. Semiparametric Survival Analysis of 30-Day Hospital Readmissions with Bayesian Additive Regression Kernel Model. Stats 2022, 5, 617-630. https://doi.org/10.3390/stats5030038