3.1. Data Generation and Model Specification
The primary objective of the simulation experiment is to evaluate the performance of Bayesian fixed effects (BFE), Bayesian random effects (BRE), classical fixed effects (FE), classical random effects (RE), classical pooled ordinary least squares (OLS), and linear mixed effects (LME) models. The best-performing models are then used for panel variable selection. In this work, we simulate panel datasets with
N = 100 and N = 200 units observed over T = 10 and T = 20 time periods, respectively. Three covariates, x1, x2, and x3, are generated for every unit-period observation, together with a unit-level random intercept for each unit. The response variable Y is structured according to a nonlinear data-generating process in which x1 enters linearly, x2 enters through a quadratic term, and x3 is excluded and therefore has no causal effect on Y. This specification is designed to test the feasibility of panel variable selection under both nonlinear covariate effects and cross-sectional dependence in realistic scenarios. It also allows us to check whether the models spuriously attribute a causal effect on the simulated Y to x3.
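As a minimal sketch of this setup, the following R code simulates panel data with the structure described above. The specific distributions, coefficient values, and error scales are illustrative assumptions, not the exact values used in the study.

```r
# Minimal sketch of the panel data-generating process.
# Distributions and coefficients below are illustrative assumptions.
set.seed(123)

simulate_panel <- function(N = 100, T = 10) {
  id   <- rep(1:N, each = T)                 # unit index
  time <- rep(1:T, times = N)                # time index
  x1 <- rnorm(N * T)                         # covariate with a linear effect
  x2 <- rnorm(N * T)                         # covariate with a quadratic effect
  x3 <- rnorm(N * T)                         # noise covariate (no causal effect)
  alpha <- rep(rnorm(N, sd = 1), each = T)   # unit-level random intercepts
  eps   <- rnorm(N * T)                      # idiosyncratic error
  y <- 2 * x1 + 2 * x2^2 + alpha + eps       # x3 deliberately excluded from y
  data.frame(id, time, x1, x2, x3, y)
}

dat_small <- simulate_panel(N = 100, T = 10)
dat_large <- simulate_panel(N = 200, T = 20)
```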
We considered both classical models and Bayesian hierarchical models. The classical panel methods comprised pooled ordinary least squares (pooled OLS), within (fixed effects), and random effects estimators implemented with the “plm” R package, while the linear mixed effects model was estimated with the “lme4” R package. The Bayesian hierarchical panel models were implemented via the “brms” R package with the “cmdstanr” backend. These Bayesian models incorporated both fixed and random effects for direct comparison with the classical FE and RE models. In addition, the Bayesian FE and RE models allow flexible modeling of nonlinear relationships through the inclusion of the relevant covariate terms. Weakly informative priors were applied to let the data govern inference, and posterior distributions were sampled using two Markov chains of 1000 iterations each, with 500 warm-up iterations.
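The sketch below illustrates how the competing estimators can be fitted in R with the packages listed above; the model formulas and prior settings are assumptions chosen for illustration rather than the study's precise specifications.

```r
library(plm)    # classical panel estimators
library(lme4)   # linear mixed effects
library(brms)   # Bayesian hierarchical models (cmdstanr backend)

pdat <- pdata.frame(dat_small, index = c("id", "time"))
f    <- y ~ x1 + I(x2^2) + x3                # assumed common specification

fit_pool <- plm(f, data = pdat, model = "pooling")   # pooled OLS
fit_fe   <- plm(f, data = pdat, model = "within")    # fixed effects
fit_re   <- plm(f, data = pdat, model = "random")    # random effects
fit_lme  <- lmer(y ~ x1 + I(x2^2) + x3 + (1 | id), data = dat_small)

# Bayesian random effects model with weakly informative priors,
# sampled with two Markov chains of 1000 iterations (500 warm-up)
fit_bre <- brm(
  y ~ x1 + I(x2^2) + x3 + (1 | id),
  data    = dat_small,
  prior   = prior(normal(0, 5), class = "b"),
  chains  = 2, iter = 1000, warmup = 500,
  backend = "cmdstanr"
)
```

A Bayesian fixed effects counterpart can be obtained analogously, for example by replacing the group-level intercept with unit dummies.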
The performance of the classical and Bayesian models was systematically evaluated along multiple dimensions. Goodness-of-fit was computed for each model. Parameter estimates were evaluated by comparing estimated means against the true values used in the simulation. The stability of the Bayesian models was assessed across repeated simulation runs, and diagnostic checks were conducted to verify that model assumptions were adequately satisfied. Residual analyses focused on normality and homoscedasticity to detect potential misspecification or violations that could affect inference. In addition, several criteria, namely prediction accuracy, residual normality, residual homoscedasticity, model fit quality, convergence, computational efficiency, and interpretability, were used to assess the practical feasibility of each modeling approach.
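A simplified illustration of the accuracy and residual checks is shown below; the helper rmse_from_resid is ad hoc, and the study's full set of criteria (fit quality, convergence, efficiency, interpretability) is broader than this sketch.

```r
# In-sample RMSE from model residuals (a rough proxy for prediction accuracy)
rmse_from_resid <- function(fit) sqrt(mean(as.numeric(residuals(fit))^2))

c(pooled = rmse_from_resid(fit_pool),
  FE     = rmse_from_resid(fit_fe),
  RE     = rmse_from_resid(fit_re),
  LME    = rmse_from_resid(fit_lme))

# Bayesian RE model: posterior-mean residuals and Bayesian R-squared
sqrt(mean(residuals(fit_bre)[, "Estimate"]^2))
bayes_R2(fit_bre)

# Residual normality check for the Bayesian RE model
shapiro.test(residuals(fit_bre)[, "Estimate"])
```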
From the simulations, variable importance rankings were measured through a permutation-based method adapted for Bayesian modeling. This approach involved permuting the values of each input predictor while preserving correlations among covariates exceeding a threshold of 0.3, thereby isolating the contribution of individual covariates to predictive performance. To ensure stability, each permutation was repeated 20 times, and results were summarized using means, standard deviations, and 90% credible intervals. Local importance metrics at the observation level further facilitated the detection of unit-specific contributions.
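The code below sketches a simplified version of this permutation procedure for the Bayesian random effects fit; it omits the correlation-preserving refinement for covariates correlated above 0.3, and the helper perm_importance is hypothetical rather than the study's implementation.

```r
# Simplified permutation importance for a fitted brms model:
# permute one covariate at a time, measure the increase in predictive RMSE,
# repeat 20 times, and summarize with mean, SD, and a 90% interval.
perm_importance <- function(fit, data, vars, n_perm = 20) {
  base_rmse <- sqrt(mean((data$y - colMeans(posterior_predict(fit)))^2))
  out <- lapply(vars, function(v) {
    deltas <- replicate(n_perm, {
      d <- data
      d[[v]] <- sample(d[[v]])                            # permute one covariate
      perm_pred <- colMeans(posterior_predict(fit, newdata = d))
      sqrt(mean((data$y - perm_pred)^2)) - base_rmse      # RMSE increase
    })
    c(mean = mean(deltas), sd = sd(deltas),
      quantile(deltas, probs = c(0.05, 0.95)))            # 90% interval
  })
  do.call(rbind, setNames(out, vars))
}

perm_importance(fit_bre, dat_small, c("x1", "x2", "x3"))
```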
3.2. Simulation Results
This subsection presents the estimation results for the two simulated panels.
Table 1 reports the estimation results from the classical panel models, including the pooled OLS, fixed effects (FE), random effects (RE), and linear mixed effects (LME) estimators, for the simulated datasets with N = 100, T = 10 and N = 200, T = 20. In the smaller sample, the covariate x1 consistently exhibits a strong positive and significant effect on Y across all classical models. The quadratic term in x2 is significant in every model, revealing the intended nonlinear relationship. In contrast, x3 shows a negative and insignificant influence on Y. Goodness-of-fit measures show that pooled OLS and classical RE perform best compared with the FE and LME models. For the second panel (simulation 2), the results remain robust and consistent. The estimated effect of x1 increases marginally to approximately 1.84 across all models, while the quadratic term in x2 remains statistically significant with coefficients near 1.95. The covariate x3 continues to exhibit a weak and insignificant effect on Y. Model fit improves slightly for the larger dataset, indicating stable performance as the sample size increases.
Table 2 shows the posterior estimates from the Bayesian fixed and random effects models for the same simulated datasets. The results are remarkably consistent with those obtained from the classical estimators. For both sample sizes, the posterior means of x1 and the quadratic term in x2 are statistically significant and closely approximate their true values, with posterior standard errors remaining small. The posterior distributions provide full uncertainty quantification, and convergence diagnostics and effective sample sizes indicate stable and reliable Bayesian inference.
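For reference, the convergence and effective sample size checks mentioned here can be read directly from a fitted brms model, as in the short sketch below (using the illustrative fit_bre object from the earlier code).

```r
summary(fit_bre)                        # posterior means, Rhat, Bulk_ESS, Tail_ESS
max(rhat(fit_bre), na.rm = TRUE)        # values close to 1 indicate convergence
min(neff_ratio(fit_bre), na.rm = TRUE)  # effective sample size ratios
```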
Model diagnostics and predictive performance for each model are provided in Table A1 (see Appendix A) and Table 3. The results indicate that the Bayesian models achieve excellent predictive accuracy, while computational efficiency favors the classical models. Overall, these findings highlight the strength of Bayesian panel modeling. We therefore carry out panel variable selection using the Bayesian fixed effects and random effects models. While the BFE model also performs well, the BRE model provides the best fit for the simulated panels.
Figure 1 and Figure 2 present the simulated variable importance rankings and the corresponding posterior mean importance scores under the Bayesian fixed effects and Bayesian random effects models. The results show that x1 and x2 consistently exhibit substantially higher importance than x3, particularly under the random effects specification. This pattern is fully consistent with the data-generating process, in which x1 and x2 enter the outcome equation while x3 has no causal effect.
Figure A1 and Figure A2 in Appendix B compare the parameter estimates for the fixed effects and random effects models across the two simulated panels. They confirm that x3 has no significant causal effect on Y in our models. Next, we estimate Bayesian PVS models to check the variable importance in the simulated data, and the results confirm that only x1 and x2 are important covariates. To further validate the choice of the Bayesian random effects model, we conducted the Hausman test as a robustness check (see Table 4). For both panels, the chi-squared statistics are small and provide weak evidence against the null hypothesis that the random effects estimator is consistent. These results support the BRE model as the optimal choice for panel variable selection. Methodologically, the simulation studies confirm that Bayesian panel models provide reliable inference under unobserved heterogeneity.
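For completeness, the Hausman comparison reported in Table 4 can be reproduced with the phtest() function from the plm package; the minimal sketch below uses the illustrative classical fits from the earlier code.

```r
# Hausman test comparing classical FE and RE estimates;
# a small statistic (large p-value) favors the random effects specification.
phtest(fit_fe, fit_re)
```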