Generative AI for Bayesian Computation
Abstract
1. Introduction
Connections to Previous Work
2. Generative Bayesian Computation (GBC)
Algorithm 1 Generative Bayesian Computation (GBC)
Summary Statistics
3. Bayes Quantile Neural Networks
3.1. Learning Quantiles
3.2. Synthetic Data
4. Bayes with Quantiles
Normal–Normal Bayes Learning: Wang Distortion
5. Applications
5.1. Traffic Data
5.2. Satellite Drag
6. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
| Parameter | Range |
|---|---|
| velocity [m/s] | [5500, 9500] |
| surface temperature [K] | [100, 500] |
| atmospheric temperature [K] | [200, 2000] |
| yaw [radians] | |
| pitch [radians] | |
| normal energy AC [unitless] | [0, 1] |
| tangential momentum AC [unitless] | [0, 1] |
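The input ranges above define the hypercube over which the satellite drag simulator is exercised. As a minimal sketch (not the authors' actual design code), the table can be captured as a configuration dictionary and sampled uniformly to generate a training design; the yaw and pitch bounds are omitted because the table does not list them, and all names here are illustrative:

```python
import random

# Simulator input ranges taken from the table above.
# Yaw and pitch bounds are not given in the table, so they are left out.
PARAM_RANGES = {
    "velocity_m_s": (5500.0, 9500.0),
    "surface_temperature_K": (100.0, 500.0),
    "atmospheric_temperature_K": (200.0, 2000.0),
    "normal_energy_AC": (0.0, 1.0),
    "tangential_momentum_AC": (0.0, 1.0),
}

def sample_design(n, seed=0):
    """Draw n input configurations uniformly at random from the hypercube."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}
        for _ in range(n)
    ]

design = sample_design(3)
```

A space-filling design (e.g. a Latin hypercube) would typically be preferred over i.i.d. uniform draws for surrogate training, but uniform sampling keeps the sketch dependency-free.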
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Polson, N.; Sokolov, V. Generative AI for Bayesian Computation. Entropy 2025, 27, 683. https://doi.org/10.3390/e27070683