Application of the Bayesian Method in Statistical Modeling

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Probability and Statistics".

Deadline for manuscript submissions: 15 July 2024 | Viewed by 12971

Special Issue Editor


Dr. Diana Mindrila
Guest Editor
Department of Educational Leadership, Research, and School Improvement, University of West Georgia, Carrollton, GA 30118, USA
Interests: multivariate statistics; latent variable modeling; estimation methods; latent class analysis; latent profile analysis; factor analysis; structural equation modeling; cluster analysis

Special Issue Information

Dear Colleagues,

Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a degree of belief in an event. Bayesian statistical methods use Bayes' theorem to compute and update probabilities after obtaining new data. Named after Thomas Bayes, Bayes' theorem (1763) describes the conditional probability of an event based on data as well as prior information or beliefs about the event or conditions related to the event. This approach differs from other interpretations of probability, such as the frequentist interpretation, which views probability as the limit of the relative frequency of an event after many trials. During much of the 20th century, many statisticians viewed Bayesian methods unfavorably, primarily because of practical considerations: Bayesian methods were computationally demanding, and the most widely used methods of the century relied on the frequentist interpretation. However, with the advent of powerful computers and new algorithms, such as Markov chain Monte Carlo, Bayesian methods have seen increasing use within statistics in the 21st century. This Special Issue aims to raise awareness of the availability and applicability of Bayesian analyses. It includes a collection of theoretical and applied studies using Bayesian statistics and provides information on statistical software that supports Bayesian estimation methods.

Dr. Diana Mindrila
Guest Editor
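
As a minimal sketch of the updating step described in the letter above, the posterior probability of each hypothesis is proportional to its prior probability times the likelihood of the observed data. The coin hypotheses, their names, and the numbers below are illustrative assumptions and do not come from any paper in this Special Issue.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Apply Bayes' theorem over a finite set of hypotheses.

    prior[h]      -- P(h), belief before seeing the data
    likelihood[h] -- P(data | h)
    Returns P(h | data) for every hypothesis h.
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data)
    return {h: w / evidence for h, w in unnormalized.items()}


# Hypothetical example: is a coin fair, or biased towards heads?
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

# Belief after observing a single head.
posterior = bayes_update(prior, likelihood_heads)
```

Feeding the posterior back in as the next prior repeats the update for each new observation, which is the sense in which Bayesian methods revise probabilities as data arrive.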

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Bayesian analysis
  • Bayesian estimation
  • statistics
  • probability

Published Papers (11 papers)


Research

17 pages, 1409 KiB  
Article
The Efficiency of Hazard Rate Preservation Method for Generating Discrete Rayleigh–Lindley Distribution
by Hanan Haj Ahmad
Mathematics 2024, 12(8), 1261; https://doi.org/10.3390/math12081261 - 22 Apr 2024
Viewed by 362
Abstract
In this study, we introduce two novel discrete counterparts for the Rayleigh–Lindley mixture, constructed through the application of survival and hazard rate preservation techniques. These two-parameter discrete models demonstrate exceptional adaptability across various data types, including skewed, symmetric, and monotonic datasets. Statistical analyses were conducted using maximum likelihood estimation and Bayesian approaches to assess these models. The Bayesian analysis, in particular, was implemented with the squared error and LINEX loss functions, incorporating a modified Lwin Prior distribution for parameter estimation. Through simulation studies and numerical methods, we evaluated the estimators’ performance and compared the effectiveness of the two discrete adaptations of the Rayleigh–Lindley distribution. The simulations reveal that Bayesian methods are especially effective in this setting due to their flexibility and adaptability. They provide more precise and dependable estimates for the discrete Rayleigh–Lindley model, especially when using the hazard rate preservation method. This method is a compelling alternative to the traditional survival discretization approach, showcasing its significant potential in enhancing model accuracy and applicability. Furthermore, two real data sets are analyzed to assess the performance of each analog.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

25 pages, 934 KiB  
Article
Tampered Random Variable Analysis in Step-Stress Testing: Modeling, Inference, and Applications
by Hanan Haj Ahmad, Dina A. Ramadan and Ehab M. Almetwally
Mathematics 2024, 12(8), 1248; https://doi.org/10.3390/math12081248 - 20 Apr 2024
Viewed by 308
Abstract
This study explores a new dimension of accelerated life testing by analyzing competing risk data through Tampered Random Variable (TRV) modeling, a method that has not been extensively studied. This method is applied to simple step-stress life testing (SSLT), and it considers multiple causes of failure. The lifetime of test units under changeable stress levels is modeled using Power Rayleigh distribution with distinct scale parameters and a constant shape parameter. The research introduces unique tampering coefficients for different failure causes in step-stress data modeling through TRV. Using SSLT data, we calculate maximum likelihood estimates for the parameters of our model along with the tampering coefficients and establish three types of confidence intervals under the Type-II censoring scheme. Additionally, we delve into Bayesian inference for these parameters, supported by suitable prior distributions. Our method’s validity is demonstrated through extensive simulations and real data application in the medical and electrical engineering fields. We also propose an optimal stress change time criterion and conduct a thorough sensitivity analysis.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

26 pages, 1162 KiB  
Article
Variational Bayesian Variable Selection for High-Dimensional Hidden Markov Models
by Yao Zhai, Wei Liu, Yunzhi Jin and Yanqing Zhang
Mathematics 2024, 12(7), 995; https://doi.org/10.3390/math12070995 - 27 Mar 2024
Viewed by 551
Abstract
The Hidden Markov Model (HMM) is a crucial probabilistic modeling technique for sequence data processing and statistical learning that has been extensively utilized in various engineering applications. Traditionally, the EM algorithm is employed to fit HMMs, but academics and professionals are now showing growing interest in Bayesian inference. In the Bayesian context, Markov Chain Monte Carlo (MCMC) methods are commonly used for inferring HMMs, but they can be computationally demanding for high-dimensional covariate data. As a rapid substitute, variational approximation has become a noteworthy and effective approximate inference approach, particularly in recent years, for representation learning in deep generative models. However, there has been limited exploration of variational inference for HMMs with high-dimensional covariates. In this article, we develop a mean-field Variational Bayesian method with the double-exponential shrinkage prior to fit high-dimensional HMMs whose hidden states are of discrete types. The proposed method offers the advantage of fitting the model and investigating specific factors that impact the response variable changes simultaneously. In addition, since the proposed method is based on the Variational Bayesian framework, it can avoid the large memory requirements and intensive computational cost typical of traditional Bayesian methods. In the simulation studies, we demonstrate that the proposed method can quickly and accurately estimate the posterior distributions of the parameters with good performance. We analyzed the Beijing Multi-Site Air-Quality data and predicted the PM2.5 values via the fitted HMMs.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

23 pages, 2344 KiB  
Article
Evaluating the Discrete Generalized Rayleigh Distribution: Statistical Inferences and Applications to Real Data Analysis
by Hanan Haj Ahmad, Dina A. Ramadan and Ehab M. Almetwally
Mathematics 2024, 12(2), 183; https://doi.org/10.3390/math12020183 - 5 Jan 2024
Cited by 2 | Viewed by 798
Abstract
Various discrete lifetime distributions have been observed in real data analysis. Numerous discrete models have been derived from a continuous distribution using the survival discretization method, owing to its simplicity and appealing formulation. This study focuses on the discrete analog of the newly generalized Rayleigh distribution. Both classical and Bayesian statistical inferences are performed to evaluate the efficacy of the new discrete model, particularly in terms of relative bias, mean square error, and coverage probability. Additionally, the study explores different important submodels and limiting behavior for the new discrete distribution. Various statistical functions have been examined, including moments, stress–strength, mean residual lifetime, mean past time, and order statistics. Finally, two real data examples are employed to evaluate the new discrete model. Simulations and numerical analyses play a pivotal role in facilitating statistical estimation and data modeling. The study concludes that the discrete generalized Rayleigh distribution presents a notably appealing alternative to other competing discrete distributions.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)
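
The survival discretization method mentioned in the abstract above can be sketched in a few lines: a continuous lifetime with survival function S(x) is turned into a discrete one by setting P(X = k) = S(k) - S(k + 1) for k = 0, 1, 2, ... The sketch below assumes the ordinary one-parameter Rayleigh survival function S(x) = exp(-x^2 / (2 sigma^2)) purely for illustration; it is not the generalized Rayleigh model studied in the paper, and the function names and the value of sigma are hypothetical.

```python
import math


def rayleigh_survival(x, sigma):
    """Survival function S(x) = P(T > x) of a Rayleigh(sigma) lifetime."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def discrete_rayleigh_pmf(k, sigma):
    """Survival discretization: P(X = k) = S(k) - S(k + 1), k = 0, 1, 2, ..."""
    return rayleigh_survival(k, sigma) - rayleigh_survival(k + 1, sigma)


sigma = 2.0  # illustrative scale parameter
pmf = [discrete_rayleigh_pmf(k, sigma) for k in range(50)]

# The terms telescope to S(0) - S(50), so the total is 1 up to a negligible tail.
total = sum(pmf)
```

Because the construction only differences the survival function, the discrete model keeps the same survival probabilities as the continuous one at integer points, which is part of what makes the approach simple to apply.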

17 pages, 728 KiB  
Article
Assessing the Risk of APOE-ϵ4 on Alzheimer’s Disease Using Bayesian Additive Regression Trees
by Yifan Xia and Baosheng Liang
Mathematics 2023, 11(13), 3019; https://doi.org/10.3390/math11133019 - 7 Jul 2023
Viewed by 787
Abstract
Alzheimer’s disease (AD) affects about a tenth of the population aged over 65 and nearly half of those over 85, and the number of AD patients continues to grow. Several studies have shown that the ϵ4 variant of the apolipoprotein E (APOE) gene is potentially associated with an increased risk of AD. In this study, we aimed to investigate the causal effect of APOE-ϵ4 on Alzheimer’s disease under the potential outcome framework and evaluate the individualized risk of disease onset for APOE-ϵ4 carriers. A total of 1705 Hispanic individuals from the Washington Heights-Inwood Columbia Aging Project (WHICAP) were included in this study, comprising 453 APOE-ϵ4 carriers and 1252 non-carriers. Among them, 265 subjects had developed AD (23.2%). The non-parametric Bayesian additive regression trees (BART) approach was applied to model the individualized causal effects of APOE-ϵ4 on disease onset in the presence of right-censored outcomes. The heterogeneous risk of APOE-ϵ4 on AD was examined through the individualized posterior survival probability and posterior causal effects. The results showed that, on average, patients carrying APOE-ϵ4 were 0.968 years younger at onset than those with non-carrying status, and the disease risk associated with APOE-ϵ4 carrying status was 3.9% higher than that for non-carrying status; however, it should be noted that neither result was statistically significant. The posterior causal effects of APOE-ϵ4 for individualized subjects indicate that 14.41% of carriers presented strong evidence of AD risk and approximately 38.65% presented mild evidence, while around 13.71% of non-carriers presented strong evidence of AD risk and 40.89% presented mild evidence. Furthermore, 79.26% of carriers exhibited a posterior probability of disease risk greater than 0.5. In conclusion, no significant causal effect of the APOE-ϵ4 gene on AD was observed at the population level, but strong evidence of AD risk was identified in a sub-group of APOE-ϵ4 carriers. 
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

18 pages, 1788 KiB  
Article
Bayesian Latent Class Analysis: Sample Size, Model Size, and Classification Precision
by Diana Mindrila
Mathematics 2023, 11(12), 2753; https://doi.org/10.3390/math11122753 - 17 Jun 2023
Viewed by 2899
Abstract
The current literature includes limited information on the classification precision of Bayes estimation for latent class analysis (BLCA). (1) Objectives: The present study compared BLCA with the robust maximum likelihood (MLR) procedure, which is the default procedure with the Mplus 8.0 software. (2) Method: Markov chain Monte Carlo simulations were used to estimate two-, three-, and four-class models measured by four binary observed indicators with samples of 1000, 750, 500, 250, 100, and 75 observations, respectively. With each sample, the number of replications was 500, and entropy and average latent class probabilities for most likely latent class membership were recorded. (3) Results: Bayes entropy values were more stable and ranged between 0.644 and 1. Bayes’ average latent class probabilities ranged between 0.528 and 1. MLR entropy values ranged between 0.552 and 0.958, and MLR average latent class probabilities ranged between 0.539 and 0.993. With the two-class model, BLCA outperformed MLR with all sample sizes. With the three-class model, BLCA had higher classification precision with the 75-sample size, whereas MLR performed slightly better with the 750- and 1000-sample sizes. With the 4-class model, BLCA underperformed MLR and had an increased number of unsuccessful computations, particularly with smaller samples.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)
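
The entropy values reported in the abstract above lie between 0 and 1, which is consistent with the relative entropy statistic commonly used to summarize classification precision in latent class analysis: E = 1 - [sum over i and k of -p_ik ln p_ik] / (n ln K), where p_ik is the posterior probability that observation i belongs to class k. Whether the study uses exactly this definition is an assumption here; the function name and the sample probabilities are hypothetical.

```python
import math


def relative_entropy(posteriors):
    """Relative entropy of a classification.

    posteriors -- list of rows, one per observation, each row holding the
                  posterior class-membership probabilities (summing to 1).
    Returns a value in [0, 1]; 1 means every observation is classified
    with certainty, 0 means the classes are indistinguishable.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    uncertainty = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the term p * ln(p) tends to 0 as p tends to 0
                uncertainty -= p * math.log(p)
    return 1.0 - uncertainty / (n * math.log(k))


# Hypothetical posterior probabilities for three observations, two classes.
sample = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
entropy = relative_entropy(sample)
```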

23 pages, 982 KiB  
Article
Bayesian Spatial Split-Population Survival Model with Applications to Democratic Regime Failure and Civil War Recurrence
by Minnie M. Joo, Brandon Bolte, Nguyen Huynh and Bumba Mukherjee
Mathematics 2023, 11(8), 1886; https://doi.org/10.3390/math11081886 - 16 Apr 2023
Viewed by 1156
Abstract
The underlying risk factors associated with the duration and termination of biological, sociological, economic, or political processes often exhibit spatial clustering. However, existing nonspatial survival models, including those that account for “immune” and “at-risk” subpopulations, assume that these baseline risks are spatially independent, leading to inaccurate inferences in split-population survival settings. In this paper, we develop a Bayesian spatial split-population survival model that addresses these methodological challenges by accounting for spatial autocorrelation among units in terms of their probability of becoming immune and their survival rates. Monte Carlo experiments demonstrate that, unlike nonspatial models, this spatial model provides accurate parameter estimates in the presence of spatial autocorrelation. Applying our spatial model to data from published studies on authoritarian reversals and civil war recurrence reveals that accounting for spatial autocorrelation in split-population models leads to new empirical insights, reflecting the need to theoretically and statistically account for space and non-failure inflation in applied research.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

26 pages, 1927 KiB  
Article
Bayesian and Non-Bayesian Risk Analysis and Assessment under Left-Skewed Insurance Data and a Novel Compound Reciprocal Rayleigh Extension
by Mohamed Ibrahim, Walid Emam, Yusra Tashkandy, M. Masoom Ali and Haitham M. Yousof
Mathematics 2023, 11(7), 1593; https://doi.org/10.3390/math11071593 - 25 Mar 2023
Cited by 5 | Viewed by 1055
Abstract
Continuous probability distributions can handle and express different data within the modeling process. Continuous probability distributions can be used in the disclosure and evaluation of risks through a set of well-known basic risk indicators. In this work, a new compound continuous probability extension of the reciprocal Rayleigh distribution is introduced for data modeling and risk analysis. Some of its properties are derived. The estimation of the parameters is carried out via different techniques. Bayesian estimations are computed under gamma and normal priors. The performance of all techniques is assessed through Monte Carlo simulation experiments and two real-life datasets. Two applications to real datasets are provided for comparing the new model with other competitive models and to illustrate the importance of the proposed model via the maximum likelihood technique. Numerical analyses of the expected value, variance, skewness, and kurtosis are given. Five key risk indicators are defined and analyzed under Bayesian and non-Bayesian estimation. An extensive analytical study that investigated the capacity to reveal actuarial hazards used a wide range of well-known models to examine actuarial disclosure models. Using actuarial data, actuarial hazards were evaluated and rated.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

20 pages, 1532 KiB  
Article
On Designing of Bayesian Shewhart-Type Control Charts for Maxwell Distributed Processes with Application of Boring Machine
by Fatimah Alshahrani, Ibrahim M. Almanjahie, Majid Khan, Syed M. Anwar, Zahid Rasheed and Ammara N. Cheema
Mathematics 2023, 11(5), 1126; https://doi.org/10.3390/math11051126 - 23 Feb 2023
Cited by 1 | Viewed by 1192
Abstract
The quality characteristic(s) are assumed to follow the normal distribution in many control chart constructions, although this assumption may not hold in some instances. This study proposes the Bayesian-I and Bayesian-II Shewhart-type control charts for monitoring the Maxwell scale parameter in the phase II study. The posterior and predictive distributions are used to construct the control limits for the proposed Bayesian-I and Bayesian-II Shewhart-type control charts, respectively. Various performance indicators, including average run length, quadratic loss, relative average run length, and performance comparison index, are utilized to evaluate the performance of the proposed control charts. The Bayesian-I and Bayesian-II Shewhart-type control charts are compared to their competitive CUSUMV, EWMAV and V control charts. Sensitivity analysis is also performed to study the effect of hyperparameter values on the performance behavior of the proposed control charts. Finally, real-life data is analyzed for the implementation of the proposed control charts.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

22 pages, 3198 KiB  
Article
Autocorrelation and Parameter Estimation in a Bayesian Change Point Model
by Rui Qiang and Eric Ruggieri
Mathematics 2023, 11(5), 1082; https://doi.org/10.3390/math11051082 - 21 Feb 2023
Cited by 1 | Viewed by 1692
Abstract
A piecewise function can sometimes provide the best fit to a time series. The breaks in this function are called change points, which represent the point at which the statistical properties of the model change. Often, the exact placement of the change points is unknown, so an efficient algorithm is required to combat the combinatorial explosion in the number of potential solutions to the multiple change point problem. Bayesian solutions to the multiple change point problem can provide uncertainty estimates on both the number and location of change points in a dataset, but there has not yet been a systematic study to determine how the choice of hyperparameters or the presence of autocorrelation affects the inference made by the model. Here, we propose Bayesian model averaging as a way to address the uncertainty in the choice of hyperparameters and show how this approach highlights the most probable solution to the problem. Autocorrelation is addressed through a pre-whitening technique, which is shown to eliminate spurious change points that emerge due to a red noise process. However, pre-whitening a dataset tends to make true change points harder to detect. After an extensive simulation study, the model is applied to two climate applications: the Pacific Decadal Oscillation and a global surface temperature anomalies dataset.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)

22 pages, 14586 KiB  
Article
Revisited Bayesian Sequential Indicator Simulation: Using a Log-Linear Pooling Approach
by Nasser Madani
Mathematics 2022, 10(24), 4669; https://doi.org/10.3390/math10244669 - 9 Dec 2022
Viewed by 1066
Abstract
It has been more than a decade since sequential indicator simulation was proposed to model geological features. Due to its simplicity and ease of implementation, the algorithm attracts the practitioner’s attention and is rapidly becoming available through commercial software programs for modeling mineral deposits, oil reservoirs, and groundwater resources. However, when the algorithm only uses hard conditioning data, its inadequacy to model the long-range geological features has always been a research debate in geostatistical contexts. To circumvent this difficulty, one or several pieces of soft information can be introduced into the simulation process to assist in reproducing such large-scale settings. An alternative format of Bayesian sequential indicator simulation is developed in this work that integrates a log-linear pooling approach by using the aggregation of probabilities that are reported by two sources of information, hard and soft data. The novelty of this revisited Bayesian technique is that it allows the incorporation of several influences of hard and soft data in the simulation process by assigning the weights to their probabilities. In this procedure, the conditional probability of soft data can be directly estimated from hard conditioning data and then be employed with its corresponding weight of influence to update the weighted conditional probability that is simulated from the same hard conditioning and previously simulated data in a sequential manner. To test the algorithm, a 2D synthetic case study is presented. The findings showed that the resulting maps obtained from the proposed revisited Bayesian sequential indicator simulation approach outperform other techniques in terms of reproduction of long-range geological features while keeping its consistency with other expected local and global statistical measures.
(This article belongs to the Special Issue Application of the Bayesian Method in Statistical Modeling)
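
The aggregation step named in the abstract above, log-linear pooling, combines the probabilities reported by several sources as a weighted product that is then renormalized: P(c) is proportional to the product over sources i of P_i(c)^w_i. The sketch below is a simplified form under stated assumptions: it omits any prior-probability term, and the category names, probabilities, and weights are hypothetical rather than taken from the paper.

```python
def log_linear_pool(sources, weights):
    """Pool categorical probabilities from several sources.

    sources -- list of dicts mapping category -> probability from one source
    weights -- one non-negative weight of influence per source
    Returns the normalized weighted product of the source probabilities.
    """
    categories = list(sources[0].keys())
    pooled = {}
    for c in categories:
        value = 1.0
        for probs, w in zip(sources, weights):
            value *= probs[c] ** w  # weighted contribution of this source
        pooled[c] = value
    norm = sum(pooled.values())
    return {c: v / norm for c, v in pooled.items()}


# Hypothetical probabilities of two rock categories at one grid node.
hard = {"ore": 0.5, "waste": 0.5}  # from hard conditioning data
soft = {"ore": 0.8, "waste": 0.2}  # from soft (secondary) information

pooled = log_linear_pool([hard, soft], weights=[1.0, 1.0])
```

Raising a source's weight sharpens its influence on the pooled result, while a weight of zero removes that source entirely, which is how the relative influence of hard and soft data can be tuned.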

Planned Papers

The below list represents only planned manuscripts. Some of these manuscripts have not been received by the Editorial Office yet. Papers submitted to MDPI journals are subject to peer-review.

Title: Bayesian Latent Class Analysis
Authors: Mindrila, D.L.
Affiliation: University of West Georgia

Title: Bayesian versus Maximum Likelihood Estimation in Latent Class Analysis: Sample Size, Model Size, and Classification Precision
Authors: Mindrila, D.L.
Affiliation: University of West Georgia

Title: Application of Bayesian Exploratory Factor Analysis
Authors: Mindrila, D.
Affiliation: University of West Georgia

Title: Examination of High School Graduation Rates: A Bayesian Approach
Authors: Mindrila, D.
Affiliation: University of West Georgia
