Variational Bayesian Inference in High-Dimensional Linear Mixed Models
Abstract
1. Introduction
2. Model
3. Skinny Gibbs Sampler for Bayesian Lasso
4. Variational Bayesian Inference
4.1. Variational Bayes
4.2. Optimizing via Coordinate Ascent Algorithm
- Step (a) Given initial values of all the variational densities, compute the evidence lower bound (ELB) at iteration 0 and initialize the iteration counter.
- Steps (b)–(k) In turn, compute each variational density given the current values of all the others, and update the corresponding variational parameters; Steps (f) and (h) update blocks of densities componentwise.
- Step (l) Based on the variational densities from Steps (b)–(k), compute the ELB at the current iteration and its relative change from the previous iteration.
- Step (m) Given a sufficiently small tolerance, stop the algorithm if the relative change in the ELB falls below it; otherwise, repeat Steps (b)–(l). (A generic sketch of this coordinate-ascent loop is given after Algorithm 1.)
Algorithm 1: Variational Bayesian estimation.
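To make the loop structure of Algorithm 1 concrete, here is a minimal, generic sketch of coordinate-ascent variational inference. The callables `update_fns` and `compute_elb` are placeholders standing in for the paper's closed-form updates (Steps (b)–(k)) and the ELB computation of Appendix B; this is a sketch of the generic scheme, not the authors' implementation.

```python
def cavi(update_fns, compute_elb, init_params, tol=1e-6, max_iter=1000):
    """Generic coordinate-ascent variational inference (CAVI) loop.

    update_fns  : ordered list of update functions, one per variational
                  density; each maps the current dict of variational
                  parameters to an updated dict (stand-ins for Steps (b)-(k)).
    compute_elb : function returning the evidence lower bound for a given
                  dict of variational parameters (stand-in for Appendix B).
    """
    params = dict(init_params)
    elb_old = compute_elb(params)        # Step (a): ELB at the initial values
    elb_new = elb_old
    for _ in range(max_iter):
        for update in update_fns:        # Steps (b)-(k): update each density in turn
            params = update(params)
        elb_new = compute_elb(params)    # Step (l): ELB after a full sweep
        rel_change = abs(elb_new - elb_old) / (abs(elb_old) + 1e-12)
        if rel_change < tol:             # Step (m): stop on small relative change
            break
        elb_old = elb_new
    return params, elb_new
```

Each update maximizes the ELB over one density while the others are held fixed, so the ELB is non-decreasing across sweeps; that monotonicity is what makes the relative-change rule in Step (m) a safe stopping criterion.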
4.3. Model Comparison
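As a point of reference for this subsection: a standard variational device for Bayesian model comparison, consistent with the ELB of Appendix B and the estimated Bayes factor of Appendix C, is to use each candidate model's ELB as a surrogate for its log marginal likelihood in the Bayes factor of Kass and Raftery (1995). The display below records that generic construction; whether the paper's estimator is exactly this difference or a refinement of it is an assumption here.

```latex
% ELB-based Bayes factor (generic VB construction; assumed, not verbatim from the paper):
% the ELB of each candidate model stands in for its log marginal likelihood.
\log \widehat{\operatorname{BF}}_{10}
  \approx \operatorname{ELB}(M_1) - \operatorname{ELB}(M_0),
\qquad
\widehat{\operatorname{BF}}_{10}
  \approx \exp\!\left\{ \operatorname{ELB}(M_1) - \operatorname{ELB}(M_0) \right\}.
```

Because the ELB only lower-bounds each log evidence, the two gaps need not cancel, so this quantity is an estimate of the Bayes factor rather than a bound on it.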
5. Simulation Studies
- Type I. Components of the covariate vector are mutually independent, i.e., $\operatorname{corr}(x_{ij}, x_{ik}) = 1$ when $j = k$ and $\operatorname{corr}(x_{ij}, x_{ik}) = 0$ when $j \neq k$.
- Type II. The covariates have an autoregressive correlation structure, i.e., $\operatorname{corr}(x_{ij}, x_{ik}) = \rho^{|j-k|}$ when $j \neq k$ and $1$ when $j = k$ (see the sketch after this list).
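As an illustration of the two designs, a minimal NumPy sketch follows. The function name `simulate_covariates` and the default `rho=0.5` are assumptions made for the example, since the simulated value of the AR parameter is not fixed above.

```python
import numpy as np

def simulate_covariates(n, p, design="I", rho=0.5, seed=0):
    """Draw an n x p covariate matrix under the two correlation designs.

    Type I : corr(x_j, x_k) = 1 if j == k else 0  (independent components)
    Type II: corr(x_j, x_k) = rho ** |j - k|      (autoregressive structure)
    """
    rng = np.random.default_rng(seed)
    if design == "I":
        # Independent standard-normal components.
        return rng.standard_normal((n, p))
    # Type II: build the AR correlation matrix and draw correlated normals.
    idx = np.arange(p)
    corr = rho ** np.abs(idx[:, None] - idx[None, :])
    return rng.multivariate_normal(np.zeros(p), corr, size=n)

# Example: X = simulate_covariates(100, 500, design="II", rho=0.5)
```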
6. An Empirical Example
7. Discussion
- Overcoming the problem of selecting a high-dimensional vector of shrinkage parameters required for the Bayesian lasso method;
- Simultaneously estimating model parameters and variance–covariance matrices and selecting fixed-effects and random-effects components with a relatively low computational cost;
- Avoiding large matrix computations and the curse of dimensionality problem;
- Providing a flexible and efficient approach to compute the Bayes factor for model comparison.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| MCMC | Markov chain Monte Carlo algorithm |
| EM | expectation–maximization algorithm |
| ELB | evidence lower bound |
| TP | average number of active covariates correctly identified as active |
| FP | average number of inactive covariates incorrectly detected as active |
| RMS | root mean square between the Bayesian estimates based on 100 replications and the true value of the unknown parameter |
| VB | variational Bayesian (the proposed method) |
| LASSO | Bayesian lasso method |
| AD | Alzheimer’s disease |
| ADNI | Alzheimer’s Disease Neuroimaging Initiative |
| MRI | magnetic resonance imaging |
| MMSE | mini-mental state examination |
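To make the TP, FP, and RMS summaries defined above concrete, here is a minimal sketch under assumed inputs; the names `selection_metrics`, `true_beta`, `selected_masks`, and `estimates` are illustrative and do not come from the paper.

```python
import numpy as np

def selection_metrics(true_beta, selected_masks, estimates):
    """Summarize variable-selection accuracy over R replications.

    true_beta      : (p,) true coefficient vector (nonzero = active covariate)
    selected_masks : (R, p) boolean array of covariates selected in each run
    estimates      : (R, p) coefficient estimates from the R replications
    """
    active = true_beta != 0
    tp = (selected_masks & active).sum(axis=1).mean()     # TP: avg. correctly selected
    fp = (selected_masks & ~active).sum(axis=1).mean()    # FP: avg. falsely selected
    rms = np.sqrt(((estimates - true_beta) ** 2).mean())  # RMS against the true value
    return tp, fp, rms
```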
Appendix A. Conditional Distributions Required in Implementing the Gibbs Sampler
Appendix B. Calculating the Evidence Lower Bound (ELB)
Appendix C. Calculating the Estimated Bayes Factor in the Second Simulation
References
- Lindstrom, M.J.; Bates, D.M. Newton–Raphson and EM algorithms for linear mixed-effects models for repeated measures data. J. Am. Stat. Assoc. 1988, 83, 1014–1022.
- Laird, N.; Lange, N.; Stram, D. Maximum likelihood computations with repeated measures: Applications of the EM algorithm. J. Am. Stat. Assoc. 1987, 82, 97–105.
- Zeger, S.L.; Karim, M.R. Generalized linear models with random effects: A Gibbs sampling approach. J. Am. Stat. Assoc. 1991, 86, 79–86.
- Gilks, W.R.; Wang, C.C.; Yvonnet, B.; Coursaget, P. Random-effects models for longitudinal data using Gibbs sampling. Biometrics 1993, 49, 441–453.
- Chen, Z.; Dunson, D.B. Random effects selection in linear mixed models. Biometrics 2003, 59, 762–769.
- Ahn, M.; Zhang, H.H.; Lu, W. Moment-based method for random effects selection in linear mixed models. Stat. Sin. 2012, 22, 1539–1562.
- Bondell, H.D.; Krishna, A.; Ghosh, S.K. Joint variable selection of fixed and random effects in linear mixed-effects models. Biometrics 2010, 66, 1069–1077.
- Ibrahim, J.G.; Zhu, H.; Garcia, R.I.; Guo, R. Fixed and random effects selection in mixed effects models. Biometrics 2011, 67, 495–503.
- Schelldorfer, J.; Bühlmann, P.; van de Geer, S. Estimation for high-dimensional linear mixed-effects models using ℓ1-penalization. Scand. J. Stat. 2011, 38, 197–214.
- Fan, Y.; Li, R. Variable selection in linear mixed effects models. Ann. Stat. 2012, 40, 2043–2068.
- Li, Y.; Wang, S.J.; Song, P.X.K.; Wang, N.; Zhou, L.; Zhu, J. Doubly regularized estimation and selection in linear mixed-effects models for high-dimensional longitudinal data. Stat. Interface 2018, 11, 721–737.
- Bradic, J.; Claeskens, G.; Gueuning, T. Fixed effects testing in high-dimensional linear mixed models. J. Am. Stat. Assoc. 2020, 115, 1835–1850.
- Li, S.; Cai, T.T.; Li, H. Inference for high-dimensional linear mixed-effects models: A quasi-likelihood approach. J. Am. Stat. Assoc. 2021, 1–12.
- Berger, J.; Bernardo, J.M. Reference priors in a variance components problem. In Bayesian Analysis in Statistics and Econometrics; Lecture Notes in Statistics; Goel, P., Ed.; Springer: New York, NY, USA, 1992; Volume 75, pp. 177–194.
- George, E.I.; McCulloch, R.E. Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 1993, 88, 881–889.
- Ishwaran, H.; Rao, J.S. Spike and slab gene selection for multigroup microarray data. J. Am. Stat. Assoc. 2005, 100, 764–780.
- Polson, N.G.; Scott, J.G. Local shrinkage rules, Lévy processes and regularized regression. J. R. Stat. Soc. 2012, 74, 287–311.
- Narisetty, N.N.; He, X. Bayesian variable selection with shrinking and diffusing priors. Ann. Stat. 2014, 42, 789–817.
- Park, T.; Casella, G. The Bayesian Lasso. J. Am. Stat. Assoc. 2008, 103, 681–686.
- Griffin, J.E.; Brown, P.J. Bayesian adaptive lassos with non-convex penalization. Aust. N. Z. J. Stat. 2011, 53, 423–442.
- Rockova, V.; George, E.I. EMVS: The EM approach to Bayesian variable selection. J. Am. Stat. Assoc. 2014, 109, 828–846.
- Latouche, P.; Mattei, P.A.; Bouveyron, C.; Chiquet, J. Combining a relaxed EM algorithm with Occam’s razor for Bayesian variable selection in high-dimensional regression. J. Multivar. Anal. 2016, 146, 177–190.
- Narisetty, N.N.; Shen, J.; He, X. Skinny Gibbs: A consistent and scalable Gibbs sampler for model selection. J. Am. Stat. Assoc. 2019, 114, 1205–1217.
- Wipf, D.P.; Rao, B.D.; Nagarajan, S. Latent variable Bayesian models for promoting sparsity. IEEE Trans. Inf. Theory 2011, 57, 6236–6255.
- Ghahramani, Z.; Beal, M.J. Variational inference for Bayesian mixtures of factor analysers. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2000; Volume 12, pp. 449–455.
- Attias, H. A variational Bayesian framework for graphical models. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2000; Volume 12, pp. 209–215.
- Wu, Y.; Tang, N.S. Variational Bayesian partially linear mean shift models for high-dimensional Alzheimer’s disease neuroimaging data. Stat. Med. 2022, in press.
- Zhang, C.H. Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 2010, 38, 894–942.
- Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2001, 96, 1348–1360.
- Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
- Rockova, V.; George, E.I. The Spike-and-Slab Lasso. J. Am. Stat. Assoc. 2018, 113, 431–444.
- Leng, C.; Tran, M.N.; Nott, D. Bayesian adaptive Lasso. Ann. Inst. Stat. Math. 2014, 66, 221–244.
- Beal, M.J. Variational Algorithms for Approximate Bayesian Inference. Ph.D. Thesis, University of London, London, UK, 2003.
- Bishop, C. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006.
- Blei, D.M.; Kucukelbir, A.; McAuliffe, J.D. Variational inference: A review for statisticians. J. Am. Stat. Assoc. 2017, 112, 859–877.
- Lee, S.Y.; Song, X.Y. Model comparison of nonlinear structural equation models with fixed covariates. Psychometrika 2003, 68, 27–47.
- Lee, S.Y.; Tang, N.S. Bayesian analysis of nonlinear structural equation models with nonignorable missing data. Psychometrika 2006, 71, 541–564.
- Tierney, L.; Kadane, J.B. Accurate approximations for posterior moments and marginal densities. J. Am. Stat. Assoc. 1986, 81, 82–86.
- Neal, R.M. Annealed importance sampling. Stat. Comput. 2001, 11, 125–139.
- Meng, X.L.; Wong, W. Simulating ratios of normalizing constants via a simple identity: A theoretical exploration. Stat. Sin. 1996, 6, 831–860.
- Gelman, A.; Meng, X.L. Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Stat. Sci. 1998, 13, 163–185.
- Skilling, J. Nested sampling for general Bayesian computation. Bayesian Anal. 2006, 1, 833–859.
- Friel, N.; Pettitt, A.N. Marginal likelihood estimation via power posteriors. J. R. Stat. Soc. 2008, 70, 589–607.
- DiCiccio, T.; Kass, R.; Raftery, A.; Wasserman, L. Computing Bayes factors by combining simulation and asymptotic approximations. J. Am. Stat. Assoc. 1997, 92, 903–915.
- Llorente, F.; Martino, L.; Delgado, D.; Lopez-Santiago, J. Marginal likelihood computation for model selection and hypothesis testing: An extensive review. arXiv 2022, arXiv:2005.08334.
- Kass, R.E.; Raftery, A.E. Bayes factors. J. Am. Stat. Assoc. 1995, 90, 773–795.
- Jack, C.; Bernstein, M.; Fox, N.; Thompson, P.; Alexander, G.; Harvey, D.; Borowski, B.; Britson, P.; Whitwell, J.; Ward, C. The Alzheimer’s disease neuroimaging initiative (ADNI): MRI methods. J. Magn. Reson. Imaging 2008, 27, 685–691.
- Zhang, Y.Q.; Tang, N.S.; Qu, A. Imputed factor regression for high-dimensional block-wise missing data. Stat. Sin. 2020, 30, 631–651.
- Brookmeyer, R.; Johnson, E.; Ziegler-Graham, K.; Arrighi, H. Forecasting the global burden of Alzheimer’s disease. Alzheimers Dement. 2007, 3, 186–191.
- Chen, Y.; Bornn, L.; De Freitas, N.; Eskelin, M.; Fang, J.; Welling, M. Herded Gibbs sampling. J. Mach. Learn. Res. 2016, 17, 263–291.
- Martino, L.; Elvira, V.; Camps-Valls, G. The recycling Gibbs sampler for efficient learning. Digit. Signal Process. 2018, 74, 1–13.
- Roberts, G.O.; Sahu, S.K. Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. J. R. Stat. Soc. 1997, 59, 291–317.
| (, ) | Type | n | Method | TP (p = 500) | FP | RMS | TP (p = 1000) | FP | RMS | TP (p = 2000) | FP | RMS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (0.5, 0.5) | I | 100 | VB | 3.91 | 0.00 | 0.11 | 3.79 | 0.00 | 0.08 | 3.84 | 0.00 | 0.06 |
| | | | LASSO | 4.44 | 0.87 | 1.90 | 3.54 | 1.03 | 1.66 | 1.39 | 0.00 | 1.91 |
| | | 200 | VB | 4.71 | 0.00 | 0.11 | 4.68 | 0.00 | 0.08 | 4.65 | 0.00 | 0.06 |
| | | | LASSO | 4.95 | 0.24 | 2.24 | 2.78 | 1.91 | 1.36 | 3.34 | 0.00 | 1.64 |
| | | 300 | VB | 4.89 | 0.00 | 0.11 | 4.81 | 0.00 | 0.08 | 4.91 | 0.00 | 0.06 |
| | | | LASSO | 4.99 | 0.01 | 2.12 | 4.91 | 0.00 | 1.41 | 4.23 | 0.00 | 1.45 |
| | II | 100 | VB | 3.79 | 0.00 | 0.11 | 3.84 | 0.00 | 0.08 | 3.76 | 0.00 | 0.06 |
| | | | LASSO | 3.48 | 0.10 | 2.19 | 3.01 | 0.00 | 1.87 | 3.00 | 0.00 | 2.01 |
| | | 200 | VB | 3.97 | 0.00 | 0.11 | 3.96 | 0.00 | 0.08 | 3.98 | 0.00 | 0.06 |
| | | | LASSO | 3.59 | 0.02 | 2.44 | 3.12 | 0.00 | 1.78 | 3.00 | 0.00 | 1.84 |
| | | 300 | VB | 3.98 | 0.00 | 0.11 | 3.96 | 0.00 | 0.08 | 3.98 | 0.00 | 0.06 |
| | | | LASSO | 3.63 | 0.03 | 2.31 | 3.20 | 0.00 | 1.79 | 3.01 | 0.00 | 1.75 |
| (0.1, 0.9) | I | 100 | VB | 3.88 | 0.00 | 0.11 | 3.79 | 0.00 | 0.08 | 3.84 | 0.00 | 0.06 |
| | | | LASSO | 4.44 | 0.87 | 1.90 | 3.54 | 1.03 | 1.66 | 1.39 | 0.00 | 1.91 |
| | | 200 | VB | 4.71 | 0.00 | 0.11 | 4.66 | 0.00 | 0.08 | 4.64 | 0.00 | 0.06 |
| | | | LASSO | 4.95 | 0.24 | 2.24 | 2.78 | 1.91 | 1.36 | 3.34 | 0.00 | 1.64 |
| | | 300 | VB | 4.89 | 0.00 | 0.11 | 4.81 | 0.00 | 0.08 | 4.91 | 0.00 | 0.06 |
| | | | LASSO | 4.99 | 0.01 | 2.12 | 4.91 | 0.00 | 1.41 | 4.23 | 0.00 | 1.45 |
| | n | p = 500 | p = 1000 | p = 2000 |
|---|---|---|---|---|
| | 100 | −194 | −102 | −86 |
| | 200 | −372 | −272 | −294 |
| | 300 | −506 | −544 | −588 |
| ( ) | 100 | −0.95 | −4.03 | −1.41 |
| | 200 | −1.54 | −3.68 | −2.54 |
| | 300 | −3.13 | −3.58 | −2.26 |
| Model | n | p | RMSE | MAP |
|---|---|---|---|---|
| Complete | 62 | 340 | 49.17 | 49.15 |
| Selected | 62 | 3 | 1.05 | 0.82 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


