Structural Compressed Panel VAR with Stochastic Volatility: A Robust Bayesian Model Averaging Procedure

Abstract: This paper improves on the existing literature on the shrinkage of high dimensional model and parameter spaces through Bayesian priors and Markov Chain algorithms. A hierarchical semiparametric Bayes approach is developed to overcome the limitations and misspecification problems involved in compressed regression models. Methodologically, a large multicountry structural Panel Vector Autoregression is compressed through robust model averaging to select the best subset across all possible combinations of predictors, where robust stands for the use of mixtures of proper conjugate priors. Concerning dynamic analysis, volatility changes and conditional density forecasts are addressed to ensure accurate predictive performance. An empirical application and a simulated experiment are developed to highlight and discuss the functioning of the estimating procedure and its forecasting accuracy.


Introduction
This study aims to construct and develop a methodology to improve the Bayesian compressed regression literature when dealing with (i) time-varying parameters, (ii) volatility changes, (iii) the curse of dimensionality, and (iv) variable selection problems accounting for large model and parameter spaces.
In macroeconomics and finance, existing approaches involve estimating high dimensional multicountry Vector Autoregressions (VARs) and Panel VARs (PVARs) to appropriately model and evaluate time-varying linkages among sectors and countries, where the number of parameters is much larger than the number of observations. In this context, prior specification strategies and Markov Chain Monte Carlo (MCMC) algorithms are constructed according to past information on the parameters' distributions in order to transform overparameterized models into low-dimensional parameter spaces. In this way, forecasting analysis and policy evaluations are feasible and can be performed while ensuring good accuracy and quality of the point estimates. Most studies in this literature focus on frequentist and Bayesian approaches. The former generally work with a sparse hierarchical prior distribution that discriminates between zero and non-zero factor loadings in order to identify unobserved factors and then provide a meaningful economic interpretation for them. See, among many others, Bernanke et al. (2005) (Factor-Augmented VARs); Pesaran et al. (2009) and Pesaran et al. (2004) (multicountry Dynamic Factor Models); Dees et al. (2007); Feldkircher and Huber (2016); Cuaresma et al. (2016); Dovern et al. (2016); and Huber (2016) (Global VARs); and Primiceri (2005); Koop and Korobilis (2009); Canova and Forero (2015); Banbura et al. (2010); and Koop and Korobilis (2013) (Time-Varying Parameter VARs with multivariate stochastic volatility). Conversely, Bayesian models typically use diffuse or informative priors for shrinkage. A macroeconomic application highlights the performance of the estimating process and its forecasting accuracy, using a univariate Autoregressive process with a single lagged term (AR(1)) as the benchmark approach. Compared with the previous models, their method fits the data better and achieves more accurate forecasts.
However, the Bayesian Model Averaging (BMA) used in the shrinkage of high dimensional parameter spaces assigns different weights to the projections based on the explanatory power of the predictors rather than the model size. Thus, open variable selection issues arising from overparameterization, such as model misspecification problems and overfitting, are not addressed. In addition, the different projections are generated randomly and thus do not involve the data. Even if this is computationally useful in multiple model classes, common parameters can change meaning from one model to another, so that prior distributions should change in a corresponding fashion and be weighted more according to the model size. Last but not least, when studying macroeconomic-financial linkages, issues concerning heterogeneity, interdependence, and commonality among countries and sectors should be accounted for.
My computational approach aims to overcome these limits when estimating Bayesian compressed regressions in large VAR settings with time-varying parameters and multivariate stochastic volatilities. More precisely, its implementation combines and extends the underlying logic in Pacifico (2020b), regarding variable selection problems in multiple model classes, and in Pacifico (2021), concerning high dimensional multicountry dynamic analysis. Thus, in contrast to Koop et al. (2019), the main features are: (i) the selection of the best subset of predictors through Posterior Model Probabilities (PMPs) rather than random draws, in order to weight priors according to the model size; and (ii) the estimation of a structural panel framework when jointly modelling parameter and model spaces, in order to perform accurate cross-country forecasts and address policy issues. Here, best stands for the model providing the most accurate predictive performance over all candidate models, and PMP denotes the probability of each candidate model fitting the data. Methodologically, the developed approach, named the Structural Bayesian Compressed PVAR (SBCPVAR) model, is based on semiparametric prior assumptions to entail a strong model selection in high dimensional model classes and on MCMC algorithms to construct posterior distributions.
In detail, the contributions of the proposed methodology are fourfold. First, additional data matrices are considered containing predetermined variables (e.g., lagged dependent and control variables) and observable endogenous variables including macroeconomic-financial and socioeconomic-demographic factors. Multivariate Conjugate Informative Proper Mixture (mvCIPM) priors and MCMC-based PMPs are then used in order to: (i) include all the information from the whole multidimensional framework; (ii) impose specification choices to compress high dimensional parameter and model spaces; and (iii) jointly deal with variable selection problems (model uncertainty and overfitting), endogeneity issues, and structural model uncertainty (because one or more parameters are posited as the source of model misspecification problems). The mvCIPM priors extend the conjugate informative proper priors in Pacifico (2020b) to deal with overparameterization in large time-varying PVARs.
Second, related to the previous feature, proper specification choices to drop or downweight bad compressions are addressed instead of compressing the data randomly. To do so, I build on and extend the analysis of Pacifico (2020b), who develops a Robust Open Bayesian (ROB) procedure in two stages for implementing BMA and Bayesian Model Selection (BMS) in multiple linear regression models and time-varying high dimensional multivariate data when studying cross-country dynamic economics. In this way, the best subset of model solutions is obtained by defining a criterion in accordance with both the data (explanatory power) and the model size (different interactions between covariates). Thus, the complete compressed subset regression method uses model-weighted combinations of all available subsets of predictors and resorts to a less restrictive supervised dimension reduction technique. However, even if that framework ensures better accuracy and quality of the density forecasts, it would greatly increase the computational costs involved in the procedure, representing an important limit to be dealt with.
Third, I adapt the strategy of Pacifico (2021) to transform an overparameterized structural PVAR into a compressed Seemingly Unrelated Regression model in order to account for interdependence, heterogeneity (or homogeneity), and commonality when studying macroeconomic-financial linkages. More precisely, I include some auxiliary regression parameters in the extended ROB procedure to evaluate the time-varying VAR coefficients for each country-variable pair in the presence of potential unobserved changes (volatility effects). Then, I construct a flexible factorization for the compressed regression parameters to make them estimable. Finally, a multivariate Bayesian Information Criterion (mvBIC) is used to select the optimal number of lags in high dimensional multivariate model selection, extending the standard BIC to the case of multiple response variables (see, for instance, Sofer et al. (2014)).
Fourth, MCMC algorithms are used to construct appropriate posterior distributions and then perform cross-country conditional density forecasts. Forecasting accuracy, together with the relative regrets arising in the semiparametric forecasting problem, is assessed through the multivariate Weighted Mean Squared Forecast Error of Christoffersen and Diebold (1998).
An empirical application involving more than one hundred macroeconomic-financial and socioeconomic-demographic variables is developed to highlight the performance of the proposed methodology. The empirical strategy is able to design conditional density forecasts and strategic policy measures investigating the impact of either the COVID-19 pandemic or real/financial shocks on potential output in a pool of advanced and emerging countries, with 'potential output' denoting the highest level of economic activity. Furthermore, a simulated experiment, compared to related works, is also conducted to discuss theoretical properties.
The remainder of this paper is organized as follows. Section 2 introduces the econometric framework and the estimating method. Section 3 displays the Bayesian model selection procedure by clarifying prior specification strategy and posterior distributions. Section 4 describes the data and the empirical analysis. Section 5 provides a simulated example through Monte Carlo simulations to discuss theoretical properties and forecasting accuracy compared to some existing approaches. The final section contains some concluding remarks.

Model Estimation
Consider the simplified version of the multicountry SPBVAR model developed in Pacifico (2021), given in Equation (1), where the subscripts $(i, j) = 1, 2, \ldots, N$ are country indices, $t = 1, 2, \ldots, T$ denotes time, $m = 1, 2, \ldots, M$ and $\xi = 1, 2, \ldots, \Xi$ are directly observed endogenous variables for $i$, with $\tilde{m} = 1, 2, \ldots, M$ and $\tilde{\xi} = 1, 2, \ldots, \Xi$ referring to the ones observed for $j$ and independent of $i$, $\lambda = 1, 2, \ldots, l$ stands for all available lags of every time-varying variable to be potentially included in the shrinking process, and $\varepsilon_{im,t} \sim \text{i.i.d. } N(0, \Omega_t)$ is an $NM \times 1$ vector of heteroscedastic unobservable shocks with variance-covariance matrix $\Omega_t$. Stacking for $(m, \xi, \tilde{m}, \tilde{\xi})$, all terms within the system are defined as follows. (i) $Y_{i,t}$ is an $NM \times 1$ vector of observed variables to be predicted for each $i$ for a given $m$. (ii) $A_{i,j}$ are $NM \times NM$ matrices of lagged coefficients for each pair of countries $(i, j)$ for a given $m$, and $Y_{i,t-\lambda}$ is an $NM \times 1$ vector of observed lagged variables for each $i$ for a given $m$. In this study, $Y_{i,t-\lambda}$ is decomposed into $Y^o_{i,t-\lambda}$, denoting lagged outcomes (e.g., a country's productivity) to capture persistence, and $Y^c_{i,t-\lambda}$, including lagged control variables such as general economic-financial conditions. (iii) $B_{i,j}$ are $N\Xi \times N\Xi$ matrices of lagged coefficients for each pair of countries $(i, j)$ for a given $\xi$, and $Z_{i,t-\lambda}$ is an $N\Xi \times 1$ vector including a set of additional observed lagged endogenous factors for each $i$ for a given $\xi$.

If model (1) is a VAR process, then, when performing forecasting analysis, every outcome for each country would depend on its lagged values ($Y^o_{i,t-\lambda}$) and on sudden changes in $Y^c_{i,t-\lambda}$ due to unexpected shocks (misspecified dynamics). However, when studying macroeconomic-financial linkages, other potentially endogenous related factors would affect the outcomes' distribution because of relationships that are not directly observed or measured (endogeneity issues). In this study, I evaluate them assuming the decomposition of $Z_{i,t-\lambda}$ into $Z^s_{i,t-\lambda}$ and $Z^p_{i,t-\lambda}$, referring to socioeconomic-demographic conditions and policy factors, respectively. Then, in order to perform a Bayesian compressed variable selection regression, I model the framework by combining the (non-)homogeneous parameters into a $1 \times Nk$ vector $X_t = (Y_{im,t-1}, Y_{im,t-2}, \ldots, Y_{im,t-l}, Z_{i\xi,t-1}, Z_{i\xi,t-2}, \ldots, Z_{i\xi,t-l})$ and construct an auxiliary parameter $\Theta_t$, indexed by $k$, grouping the two matrices of time-varying coefficients, where $k = [M + \Xi] \cdot l$ corresponds to the number of all matrix coefficients in each equation of model (1) for each pair of countries $(i, j)$. Thus, the parameter $\Theta_t = (A_{im,j1}, A_{im,j2}, \ldots, A_{im,jM}, B_{i\xi,j1}, B_{i\xi,j2}, \ldots, B_{i\xi,j\Xi})$ is a set of matrices of size $NM \times Nk$. Nevertheless, with these specifications, model (1) turns out to be unfeasible and unreliable because of (possibly) different dimensions between the matrices $(A_{i,j}, B_{i,j})$ and high dimensional parameter spaces, respectively.
The proposed methodology overcomes these problems by using a hierarchical factor structure and a prior shrinkage based on a multivariate ROB (mvROB) procedure.
For notational simplicity, I display the estimating procedure with no deterministic terms. The heteroscedasticity imposed in the variance-covariance matrix of the vector of innovations ($\varepsilon_{i,t}$) is intended to capture and then investigate potential unobserved shocks (impulse) among variables affecting cross-country spillover effects on the outcomes (response). It is worth noting that, when studying macroeconomic-financial linkages and other related socioeconomic-demographic effects, the model in (1) admits multiple and multivariate structural breaks and policy regime shifts. Thus, following the modelling strategy of Primiceri (2005), without loss of generality, I re-write the error terms in (1) as:

$$\varepsilon_{i,t} = \Delta_t \, \omega_{i,t}, \qquad (2)$$

where $\Delta_t$ is an $NM \times Nk$ matrix with elements either potentially different from zero or close to zero and $\omega_{i,t} \sim N(0, I_{Nk})$ is an $Nk \times 1$ vector for each set of variables $(m, \xi)$. As an illustrative example, stacking for $m$ and $k$ for simplicity, consider the structure for $\Delta_t$ in Equation (3), in a three-by-three case, where the $d_{i,j}$'s indicate the elements potentially different from zero. Equation (3) implies a variance-covariance matrix of the residuals with zeros in the positions (1, 2) and (3, 2). In the case of a triangular decomposition, the solution would be incompatible with draws of $\Omega_t$, or at least approximate if the elements (1, 2) and (3, 2) are very close to zero. However, the latter does not ensure efficient estimation in a context of overidentified systems, unless the overidentification derives from further zero restrictions. In addition, when studying time-varying linkages, time-invariant variance-covariance matrices are undesirable or too restrictive.
In this study, a more flexible Bayesian inference is used in order to overcome these model misspecification problems by taking advantage of the shrinking process involved in the procedure. More precisely, two considerations are in order: the use of a random walk process to easily model the time-varying distributions of the elements in $\Delta_t$, and a compression regression form to evaluate the time-varying coefficient vectors for each country-variable pair ($\Theta_t$) in the presence of unobserved changes ($d_{i,j} \neq 0$).
Let $\delta_t = vec(\Delta_t) = vec(d_{im,j1}, d_{im,j2}, \ldots, d_{im,j\bar{k}})$ be an $NMK \times 1$ vector containing, stacked by columns, the elements of the matrix $\Delta_t$, with $K = Nk$ and $\bar{k}$ denoting the maximum value of the time-varying VAR coefficients for each pair of countries $(i, j)$. The parameter $\delta_t$ is then modelled as a random walk process, as in Equation (4), where $\Sigma_\delta$ is a block diagonal covariance matrix of size $NM \times NM$ defined according to the time-varying vector $\delta_t$ for each pair of countries $(i, j)$, and $\delta_0$ denotes the initial conditions to be estimated. Here, the variances generated by (4) are unobserved components treated as permanent shifts. I recall that the random-walk assumption in (4) is a useful way to easily model time-varying parameters when studying high dimensional multivariate dynamic models for a finite period of time. Thus, one is able to: (i) reduce the number of parameters; (ii) allow for the evaluation of permanent shifts; (iii) investigate any type of coefficient factors via their interactions; and (iv) replace volatility changes by coefficient changes.
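To make the random-walk assumption concrete, the following minimal sketch simulates a path of the stacked volatility elements under a law of motion of the form "current value = previous value + Gaussian innovation". The function name `simulate_random_walk`, the diagonal innovation covariance, and all numerical values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_random_walk(delta0, innov_vars, T, seed=0):
    """Simulate delta_t = delta_{t-1} + e_t, with e_t ~ N(0, diag(innov_vars)).

    A diagonal innovation covariance is an assumption made for simplicity;
    the text only states that the covariance matrix is block diagonal.
    """
    rng = np.random.default_rng(seed)
    delta0 = np.asarray(delta0, dtype=float)
    sd = np.sqrt(np.asarray(innov_vars, dtype=float))
    shocks = rng.standard_normal((T, delta0.size)) * sd
    path = delta0 + np.cumsum(shocks, axis=0)   # cumulated shocks = permanent shifts
    return np.vstack([delta0, path])            # rows correspond to t = 0, 1, ..., T

# Three elements; the third has zero innovation variance and never moves.
path = simulate_random_walk(delta0=[0.0, 0.0, 0.0],
                            innov_vars=[0.01, 0.04, 0.0], T=50)
print(path.shape)
```

Because every shock is cumulated, a single innovation shifts the whole subsequent path, which is the sense in which the variances are treated as permanent shifts.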
Here, some considerations are in order. (i) As discussed in Primiceri (2005), according to the positions of some residuals, elements in $\Delta_t$ different from or very close to zero would lead the solution to be approximate (even if probably still reliable) or rejected, respectively. This limit is overcome in this study by means of the variable selection procedure involved in the empirical strategy. More precisely, the vector $\delta_t$ accounts for unobserved changes when studying the distributions of the $k$ time-varying variables and their interactions (cross-terms). Thus, if a model solution (or combination of predictors $X_t$) is unlikely to be reliable, the BMS used in the ROB procedure would automatically discard it on the basis of the data. Indeed, if a model solution is rejected, it means that no change-points (or structural breaks) and policy regime shifts matter. In macroeconomics and finance, when investigating international spillover effects given unexpected shocks, that scenario is implausible or, on the contrary, a signal of an unfounded empirical case study. (ii) In (2), the identification I use is not based on a triangular decomposition of $\Omega_t$ as in Koop et al. (2019) and Carriero et al. (2015b), which requires an additional triangular scheme on $\Delta_t$. Conversely, the identifiability of the $\delta_t$'s is guaranteed by the block diagonality of $\Sigma_\delta$. The idea is to absorb potential excessive spillover effects in the parameter $\Theta_t$ if they matter ($\delta_t \neq 0$) or to discard them otherwise ($\delta_t \approx 0$), where excessive stands for a sudden (not directly observed) and very large intensification of the spreading of spillovers among countries and/or sectors. Thus, such a specification implies that volatility changes due to the presence of unobserved components are replaced by coefficient changes and dealt with through parameter shifts.
(iii) In this study, potential volatility changes are investigated through the excess kurtosis ($\kappa_{i,t}$) of the SPBVAR prediction error evaluated over the information on the past year ($F_{-1}$). As highlighted in Koop et al. (2019) according to the GARCH literature, if the error terms are Normally distributed, the excess kurtosis will be high in times of large volatility and zero otherwise. (iv) According to the previous point, the empirical procedure would tend to be sufficiently restrictive when evaluating multiple structural breaks through permanent shifts. A possible extension could be, for example, assuming time-varying log-volatilities in (1), just as in Pacifico (2021). In that context, Autoregressive Conditional Heteroskedasticity in Mean effects are used to model time-varying conditional second moments so as to quantify unexpected variations in $Y_t$. The variance-covariance matrix of the vector of innovations ($\Omega_t$) would then be a diagonal matrix containing the time-varying log-volatilities $\delta_t = (\delta_{1t}, \delta_{2t}, \ldots, \delta_{Nt})'$. Even if this improves the performance of conditional density forecasts, considerably higher computational costs arise, requiring the use of MCMC implementations (such as the Metropolis-Hastings algorithm). (v) Potential structural changes, dynamic feedback, and interactions among countries and variables are possible and allowed to vary over time. Thus, the framework of (1) makes it possible to investigate and quantify international business cycles, policy implications, and economic dynamics by jointly dealing with endogeneity, volatility changes, and functional forms of misspecification. These features in turn allow accurate conditional density forecasts to be performed. (vi) Finally, the framework can be related to the literature on cointegrating approaches with time-varying coefficients.
More precisely, such approaches provide an efficient method of estimation to model macroeconomic-financial long-run relationships, where the coefficients are estimated nonparametrically as smooth functions evolving over time (see, among many others, Hansen (1992); Quintos and Phillips (1993); and Andrews (1991a, 1991b)). However, this study does not rely on these methods because they are of little use in multicountry dynamic analyses, due to the presence of omitted variables or parameter instability (e.g., endogeneity issues) and unobserved change-points (e.g., misspecified dynamics).
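As an illustration of consideration (iii), the sketch below computes the sample excess kurtosis of a series of prediction errors over a rolling window. The four-observation window (one year of quarterly data) and the function names are assumptions made for the example; the paper's exact estimator of the excess kurtosis is not reproduced here.

```python
import numpy as np

def excess_kurtosis(errors):
    """Sample excess kurtosis m4 / m2^2 - 3 (zero for Normal data in large samples)."""
    e = np.asarray(errors, dtype=float)
    centred = e - e.mean()
    m2 = np.mean(centred ** 2)
    m4 = np.mean(centred ** 4)
    return m4 / m2 ** 2 - 3.0   # undefined if the window has zero variance

def rolling_excess_kurtosis(errors, window=4):
    """Excess kurtosis over each block of `window` consecutive errors."""
    e = np.asarray(errors, dtype=float)
    return np.array([excess_kurtosis(e[t - window:t])
                     for t in range(window, len(e) + 1)])

kappa = rolling_excess_kurtosis([-1.0, 1.0, -1.0, 1.0, -1.0, 1.0], window=4)
print(kappa)
```

With only four observations per window the statistic is very noisy; a longer window would normally be preferred whenever the data allow it.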
With these specifications, I re-write (1) in the simultaneous-equation form of Equation (5), where $\Gamma_t$ collects the coefficients in $\Theta_t$ combined with the time-varying (unobserved) elements of $\Delta_t$, and $\gamma_t = vec(\Gamma_t)$ is an $NMK \times 1$ vector containing, stacked into a vector, the time-varying coefficient vectors for each country-variable pair. More precisely, the idea is to evaluate cross-unit lagged interdependencies and dynamic feedback while dealing with potential unobserved changes (volatility effects). Thus, in times of large volatility ($\delta_t \neq 0$), the construction of $\Gamma_t$ would absorb and then include these unexpected shocks in the estimating procedure.
The reduced form in (5) can be re-written in a compression regression form. The variable selection problem arises when there is some unknown subset of the predictors in $X_t$ with coefficients so small that it would be preferable to ignore them. Thus, the variable selection procedure can be seen as one of deciding which of the regression parameters in $\gamma_t$ are sufficiently small that the corresponding predictor in $X_t$ should be ruled out from the system. Its aim is to evaluate $2^K$ subset choices, referring to any potential model solution (or combination of predictors $X_t$) better fitting the data. Now, because the coefficient vectors in $\gamma_t$ vary in different time periods for each country-variable pair and there are more coefficients than data points, it is impossible to estimate them directly. To solve these problems, I apply a flexible factorization for $\gamma_t$ to estimate all coefficients and their possible interactions without restrictions or loss of efficiency. The curse of dimensionality is then dealt with by performing the mvROB procedure in order to select only those regression parameters in $\gamma_t$ that are sufficiently large to be included in the system.
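The size of the model space can be illustrated with a toy enumeration. The sketch below lists every subset of a small set of predictors; the predictor labels are hypothetical and the helper `all_subsets` is introduced only for this example.

```python
from itertools import combinations

def all_subsets(predictors):
    """Return every subset of the predictors, from the empty set to the full set."""
    names = list(predictors)
    return [subset
            for size in range(len(names) + 1)
            for subset in combinations(names, size)]

# Hypothetical labels for three lagged predictors.
subsets = all_subsets(["Y_lag1", "Y_lag2", "Z_lag1"])
print(len(subsets))  # 2**3 subsets
```

Since the count doubles with each additional predictor, exhaustive enumeration is only possible for very small $K$, which is why a shrinking step is needed before model comparison.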
Let $\varphi_t$ be an additional auxiliary parameter containing the compressed regression parameters of $\gamma_t$ (denoted $\gamma^c_t$). I assume the following factor structure:

$$\gamma_t = \Phi_t \, \varphi_t + u_t, \qquad (6)$$

where $\varphi_t$ denotes an $N\phi \times 1$ vector, with $N\phi \ll NMK$ and $1 \le \phi < k$, so that $\dim(\varphi_t) \ll \dim(\gamma_t)$ by construction, $\gamma^c_t$ refers to the compressed time-varying coefficient vectors obtained through the shrinking process, $\Phi_t$ is an $NMK \times N\phi$ conformable matrix with elements equal to zero ($\gamma^c_t$ small and then absence of the $k$-th covariate in the model for a given $i$) and one ($\gamma^c_t$ sufficiently large and then presence of the $k$-th covariate in the model for a given $i$), and $u_t$ is an $NMK \times 1$ vector of disturbances with zero mean and variance-covariance matrix $\Sigma_u = V \cdot \sigma_t$, with $V = \sigma^2 \cdot I_{NMK}$ as in Kadiyala and Karlsson (1997) and $\sigma_t \cong \Omega_t \otimes (X_t' X_t)$ denoting (potential) volatility changes. In this study, the auxiliary variable $\varphi_t$ is supposed to follow a random walk process:

$$\varphi_t = \varphi_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \Psi_t), \qquad (7)$$

where $\Psi_t = diag(\Psi_{mt,1}, \Psi_{mt,2}, \ldots, \Psi_{mt,k})$ is a block diagonal matrix, and $\Psi_{mt,k} = (\psi_t \times I_{NMK})$, with $\psi_t$ controlling the stringent conditions of the shrinking process of the time-varying compressed coefficient parameters ($\gamma^c_t$) in order to make them estimable. Here, $u_t$ and $\psi_t$ are correlated by construction, and $\eta_t$ and $\psi_t$ are allowed to be correlated.
According to the factorization in Equation (6), and letting $\tilde{X}_t = I_{NM} \otimes (X_t + \omega_{i,t})$ be an $NM \times NMK$ matrix containing all lagged time-varying variables and volatility elements within the system, stacked in $X_t$ and $\omega_{i,t}$ respectively, the reduced-form SPBVAR model in Equation (5) can be transformed into a Compressed Seemingly Unrelated Regression (CSUR) model:

$$Y_t = \chi^c_t \, \varphi_t + E^c_t, \qquad (8)$$

where $\chi^c_t \equiv (\tilde{X}_t \cdot \Phi_t)$ are $NM \times N\phi$ matrices that stack all coefficients and their possible interactions in the multicountry SPBVAR model in (1), and $E^c_t \equiv \tilde{X}_t \cdot u_t$ is an $NM \times 1$ vector having a particular heteroskedastic covariance matrix that needs to be accounted for.
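The algebra behind the compressed form can be checked numerically on a toy system. The sketch below assumes that the stacked coefficients equal a 0/1 selection matrix times the compressed parameters plus a disturbance, and that the regressor matrix is a Kronecker product of an identity matrix with a row of lagged variables; the volatility elements are omitted for simplicity, and all sizes and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_eq, n_reg = 2, 3                            # toy sizes: equations, regressors per equation

X = rng.standard_normal((1, n_reg))           # row vector of lagged regressors
X_tilde = np.kron(np.eye(n_eq), X)            # n_eq x (n_eq * n_reg)

# 0/1 selection matrix keeping 4 of the 6 stacked coefficients (assumed pattern).
keep = np.array([0, 2, 3, 5])
Phi = np.zeros((n_eq * n_reg, keep.size))
Phi[keep, np.arange(keep.size)] = 1.0

phi = rng.standard_normal(keep.size)          # compressed parameters
u = 0.01 * rng.standard_normal(n_eq * n_reg)  # factorization disturbance
gamma = Phi @ phi + u                         # full coefficient vector

chi_c = X_tilde @ Phi                         # compressed regressors
E_c = X_tilde @ u                             # induced error term
lhs = X_tilde @ gamma                         # fitted part in the full system
rhs = chi_c @ phi + E_c                       # same quantity in compressed form
print(np.allclose(lhs, rhs))
```

The identity holds by linearity, which is why the induced error term inherits a heteroskedastic covariance that depends on the regressors.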

Multivariate ROB Procedure
Let $M_K = (M_{i1}, M_{i2}, \ldots, M_{i\bar{k}})$ be a countable collection of all $2^K$ possible subset choices, with $\bar{k}$ denoting the maximum value of the time-varying matrix coefficients in (1). The full model class set is defined with respect to $\mathcal{M}$ and $\mathcal{K}$, which denote the multidimensional natural model and parameter spaces, respectively.
Following the framework of Pacifico (2020b), I match all (potential) candidate models to shrink both the model space ($\mathcal{M}$) and the parameter space ($\mathcal{K}$). The shrinkage jointly deals with overestimation of effect sizes (or individual combinations), dynamic interactions (or cross-term lagged interdependencies), model uncertainty and misspecification problems (implicit in the procedure), and endogeneity and volatility changes (involved in the hierarchical framework).
Let the PMPs denote the probability of each candidate model fitting the data. They are defined in Equation (9), where $\pi(\gamma_t | M_K)$ is the conditional prior distribution of $\gamma_t$ given $M_K$. However, when $N$ is high dimensional and $T$ sufficiently large, the calculation of the integral $\pi(Y_t | M_K)$ is unfeasible, and MCMC methods and implementations are needed. The mvROB procedure consists of jointly shrinking the model space $\mathcal{M}$ and the parameter space $\mathcal{K}$ to make Equation (9) estimable. Then, a lower-dimensional model class set is obtained containing only the best model solutions (or combinations of predictors) fitting the data. It is defined in Equation (10), where $M_F$ denotes the submodel solutions of the CSUR in (8), with $M_F < M_K$ and $F \ll K$, where $F = N\phi$ and $1 \le \phi < k$, and $\tau$ is a threshold chosen arbitrarily to ensure sufficient posterior consistency. In this study, I use $\tau = 5\%$ with $N$ high dimensional, which is more restrictive than the threshold used in Pacifico (2020b).
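A minimal sketch of how posterior model probabilities and the threshold could be applied is given below. It uses the textbook definition (marginal likelihood times model prior, normalised over the candidates) with a uniform model prior, and keeps the models whose probability is at least the threshold; both choices are assumptions for illustration and do not reproduce the paper's Equations (9) and (10). The log marginal likelihood values are made up.

```python
import numpy as np

def posterior_model_probs(log_marglik, log_prior=None):
    """Normalise exp(log marginal likelihood + log prior) over candidate models."""
    log_marglik = np.asarray(log_marglik, dtype=float)
    if log_prior is None:
        log_prior = np.zeros_like(log_marglik)   # uniform prior over models (assumption)
    log_post = log_marglik + log_prior
    log_post = log_post - log_post.max()         # subtract the max for numerical stability
    weights = np.exp(log_post)
    return weights / weights.sum()

def select_models(pmp, tau=0.05):
    """Indices of models whose posterior probability is at least tau."""
    return np.flatnonzero(np.asarray(pmp) >= tau)

pmp = posterior_model_probs([-100.0, -101.0, -110.0])
kept = select_models(pmp, tau=0.05)
print(pmp.round(4), kept)
```

Working with log marginal likelihoods and subtracting the maximum before exponentiating avoids underflow, which matters when many candidate models are compared.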
The final model solution used to perform forecasting and policy-making corresponds to the submodel $M_F$ with the highest log natural Bayes Factor (lBF). In this analysis, the lBF is interpreted according to the scale of evidence in Pacifico (2020b), but with more stringent conditions:
$lBF_{\phi,k} < 5$: no evidence for submodel $M_F$;
$5 \le lBF_{\phi,k} < 9.99$: moderate evidence for submodel $M_F$;
$10 \le lBF_{\phi,k} < 14.99$: strong evidence for submodel $M_F$;
$lBF_{\phi,k} \ge 15.00$: very strong evidence for submodel $M_F$.
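The evidence scale can be written as a small lookup. In the sketch below, the bands are treated as contiguous and any value below 5 is labelled as no evidence; both are simplifying assumptions, and the function name `lbf_evidence` is hypothetical.

```python
def lbf_evidence(lbf):
    """Map a log Bayes Factor to an evidence label (bands assumed contiguous)."""
    if lbf < 5:
        return "no evidence"           # assumed lower band
    if lbf < 10:
        return "moderate evidence"
    if lbf < 15:
        return "strong evidence"
    return "very strong evidence"

for value in (2.0, 7.0, 12.0, 19.75):
    print(value, lbf_evidence(value))
```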
According to the simultaneous-equation form of the multicountry SPBVAR described in (5), the compression regression form of model (10) is given in Equation (11), which contains $2^8$ potential model solutions or combinations of predictors $X_t$ for each set of variables $(m, \xi)$ and pair of countries $(i, j)$. Now, once the shrinking process involved in the mvROB procedure has been performed, suppose that the following two findings hold:
1. $\xi_2$ is not relevant and is then discarded from the system (absence of the 4-th covariate in the model), but it shows some (potential) interactions with $m_1$.
2. $m_1$ does not depend on $m_2$ and $\xi_1$.
Let $i_1 = (1, 1, 0, 0)'$ and $i_2 = (0, 0, 1, 1)'$. The $(NMK \times N\phi) = (32 \times 6)$ conformable matrix $\Phi_t$ can be displayed in the form of a table, as in Equation (12), in order to better understand its construction. In Equation (12), some considerations are in order. (i) For simplicity, I do not make $i$ and $(m, \xi)$ explicit. Thus, every matrix of coefficients has to be interpreted as expressed in 4 components containing, as a whole, 32 elements. (ii) Rows and/or columns that are equal to one another do not give rise to multicollinearity problems, since the matrix $\Phi_t$ is not estimated in the mvROB procedure but only used in the shrinkage of high dimensional parameter and model spaces. (iii) There are $NM \cdot N\phi = 24$ significant elements, corresponding to those set to 1 in Equation (12). (iv) According to the mvROB procedure, the final best subset of predictors will correspond to the one with the highest lBF.

Prior Specification Strategy and Posterior Distributions
The variable selection procedure entails estimating the parameters $(\Sigma_\delta, \delta_t, \varphi_t, \psi_t)$ as posterior means (the probability that a variable is in the model). In this context, mvCIPM priors are used to hierarchically model them and then obtain analytical results, as specified in Equations (13) to (16). Here, $N(\cdot)$ and $IG(\cdot)$ stand for the Normal and Inverse-Gamma distributions, respectively, $F_{t-1}$ refers to the information given up to time $t-1$, $\psi_0$ denotes the initial conditions to be estimated, and $\kappa$ in (14) denotes the decay factor. The latter usually varies in the range $[0.9, 1.0]$ and controls the process of discounting past data at a constant rate over a period of time.
According to the conjugate informative priors in (13) and (15), the posteriors of the $\Sigma_\delta$'s and the $\varphi_t$'s depend on the draw of $\psi_t$. Moreover, $\psi_t$ and $\Sigma_\delta$ are not independent of one another. Thus, to allow different equations in the CSUR to have different explanatory variables, I further model the hyperparameters identifying $\varphi_t$ and use an Independent Inverse Gamma (IIG) distribution for every draw of $\Sigma_\delta$. Equations (13) and (16) are then re-written accordingly. All the hyperparameters are known. More precisely, collecting them in the vector $(\omega_0, D_0, \bar{\alpha}, \bar{\rho}, \bar{\omega}, \bar{D}, \bar{\nu})$, they are treated as fixed and are either obtained from the data to tune the prior to the specific applications (such as $\omega_0, \bar{\omega}, \bar{\nu}$) or selected a priori to produce relatively loose priors (such as $D_0, \bar{\alpha}, \bar{\rho}, \bar{D}$). Finally, letting $\varphi_t$ evolve according to (7), a variant of the Gibbs sampler approach, the Kalman-filter technique, can be used through MCMC integrations. Supposing that the data run from $t = 0$ to $t = T$ in order to obtain a training sample $(t-1, 0)$ and then to estimate the features of the $\varphi_t$'s over time, the prior in (15) can be re-written in terms of $\bar{\varphi}_{t|t}$ and $\bar{R}_{t|t}$, which denote the conditional distribution of $\varphi_t$ and its variance-covariance matrix at time $t$ given the information over the sample $(t-1, 0)$. The posterior distributions for the unknowns $(\Sigma_\delta, \delta_t, \{\varphi_t\}_{t=1}^{T}, \psi_t)$ are obtained by combining the conjugate informative priors with the conditional likelihood, where $Y^T = \{Y_t\}_{t=1}^{T}$ denotes the data and the unknowns are those whose joint distribution needs to be found.
For the conditional posterior distribution of $(\varphi_1, \ldots, \varphi_T | Y^T)$, the Kalman-filter technique provides forward recursions for the posterior means ($\bar{\varphi}_{t|t+1}$) and covariance matrix ($\bar{R}_{t|t+1}$). Here, $\bar{R}_{t|t}$ and $\bar{R}_{t-1|t-1}$ refer to the variance-covariance matrices of the conditional distributions of $\bar{\varphi}_{t|t}$ at time $t$ and $\bar{\varphi}_{t-1|t-1}$ at time $t-1$, respectively, the forgetting factor is set to 0.92 and plays the same role as the decay factor, $\bar{\varphi}_{t-1|t-1} \cong 0.1$, and $w_{|\varphi|}$ denotes the Posterior Inclusion Probabilities (PIPs) obtained as the sum of the PMPs in (9). The PIPs are computed according to the model size $M_K$, and determine whether the $\varphi_t$'s require a non-zero estimate, that is, whether the corresponding $\gamma_t$'s should be included in the model. In this way, one weights more according to model size and, by setting $w_{|\varphi|}$ large for smaller $M_K$, assigns more weight to parsimonious model solutions.
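For intuition about the role of the forgetting factor, the sketch below implements one step of a generic Kalman filter for a random-walk state in which the predicted covariance is inflated by dividing by the forgetting factor. This is a standard textbook recursion used here as a stand-in: it is not the paper's own set of recursions, it omits the PIP weights, and the scalar observation equation and its variance are assumptions.

```python
import numpy as np

def kalman_step(phi_prev, R_prev, y, x, obs_var=1.0, forgetting=0.92):
    """One forgetting-factor Kalman update for y = x' phi + noise, phi a random walk."""
    phi_prev = np.asarray(phi_prev, dtype=float)
    x = np.asarray(x, dtype=float)
    R_pred = np.asarray(R_prev, dtype=float) / forgetting   # discount past information
    innovation = y - x @ phi_prev                           # one-step-ahead forecast error
    S = x @ R_pred @ x + obs_var                            # forecast error variance
    gain = R_pred @ x / S                                   # Kalman gain
    phi_new = phi_prev + gain * innovation
    R_new = R_pred - np.outer(gain, x) @ R_pred
    return phi_new, R_new

# Scalar example chosen so the numbers are easy to follow by hand.
phi, R = kalman_step(phi_prev=[0.0], R_prev=[[0.92]], y=2.0, x=[1.0])
print(phi, R)
```

A forgetting factor below one inflates the predicted covariance at every step, so recent observations receive more weight than distant ones without having to simulate the state innovation variance.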

Data Description and Results
The SPBVAR in (1) contains 24 country-specific models, including 10 advanced economies, 9 emerging economies, and 5 non-European Union countries. All advanced countries refer to Western Europe (WE) economies and all emerging countries, except for GR, refer to Central-Eastern Europe (CEE) economies. All European countries are Eurozone members, with the exception of CZ, HU, and PO, and thus interdependence, heterogeneity (or homogeneity), and commonality can be investigated in depth.
The estimation sample is expressed in quarters covering the period from December 1994 to March 2021, and all data come from the World Bank and OECD databases. Given the hierarchical structure of the model and a sufficiently large number of years describing macroeconomic-financial and socioeconomic-demographic variables, the model is able to deal with: (i) endogeneity issues; (ii) policy-relevant strategies; and (iii) functional forms of misspecification.
Given the CSUR in (8), the decomposed vectors of the lagged (observable) endogenous variables (Y i,t−l , Z i,t−l ) are: (i) Y o i,t−l denoting lagged outcomes to capture the persistence; (ii) Y c i,t−l indicating general economic conditions; (iii) Z s i,t−l indicating socioeconomic-demographic factors; and (iv) Z p i,t−l denoting macroeconomic-financial variables (including policy tools). In this study, the outcome refers to a country's productivity, measured through GDP per capita in logarithms (Table 1). The dataset contains 107 observable variables split into three groups: (i) Economic Status (hereafter, ECOST), including 36 determinants combining information on economic conditions, economic development, and labour market; (ii) Socioeconomic-demographic Statistics (hereafter, SOCDEM), addressing 28 determinants concerning information on government health expenditures and population growth; and (iii) Macroeconomic-financial Indicators (hereafter, ECOFIN), referring to 43 determinants dealing with real-financial linkages and financial markets. Without restrictions, the estimation sample amounts to 847,584 regression parameters. More precisely, each equation of the time-varying SPBVAR in (1) has K = Nk = 24 · [37 + 71] · 3 = 7776 coefficients (including lagged outcomes), with l = 3 denoting the optimal number of lags according to mvBIC, and there are 109 equations.
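The dimensionality figures above can be checked with a few lines of arithmetic. The snippet below is only a verification sketch; the variable names are illustrative and the numbers are those reported in the text.

```python
# Group sizes reported for the dataset.
group_sizes = {"ECOST": 36, "SOCDEM": 28, "ECOFIN": 43}
n_observables = sum(group_sizes.values())

n_countries = 24        # N: country-specific models
n_vars = 37 + 71        # k: bracketed term in K = Nk
n_lags = 3              # l: lag length selected by mvBIC
n_equations = 109

coeffs_per_equation = n_countries * n_vars * n_lags
total_parameters = coeffs_per_equation * n_equations

print(n_observables)        # 107
print(coeffs_per_equation)  # 7776
print(total_parameters)     # 847584
```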
By running the shrinkage procedure described in Section 2.2, I find 34 best covariates. Thus, there would be 2^34 possible model solutions. The final model solution that best fits the data consists of 20 best predictors, including lagged outcomes, with higher log Bayes Factor (lBF = 19.75) and Posterior Inclusion Probability (PIP) ≥ τ (Table 1, in bold). The PIP corresponds to the sum of the PMPs in (9) for every best model solution. These final covariates are split as follows: (i) predictor (1) for Y o i,t−3 and predictors (2, 3, . . . , 9) for Y c i,t−3 ; (ii) predictors (10, 11, . . . , 17) for Z s i,t−3 ; and (iii) predictors (18, 19, . . . , 34) for Z p i,t−3 . All their available lags, including lagged outcomes, are used as instruments in the estimating procedure in order to deal with endogeneity issues and model misspecification problems.
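How a PIP is obtained as a sum of PMPs, and how a threshold τ then selects predictors, can be sketched as follows. This is a toy illustration under stated assumptions: the three candidate models, their PMPs, and the value τ = 0.5 are hypothetical, and the function names are not taken from any library.

```python
def posterior_inclusion_probabilities(models, pmps, n_predictors):
    """Sum the posterior model probabilities of the models containing each predictor.

    models: list of sets of predictor indices; pmps: matching model probabilities.
    """
    total = sum(pmps)
    pips = [0.0] * n_predictors
    for included, pmp in zip(models, pmps):
        for j in included:
            pips[j] += pmp / total   # normalise so that the PMPs sum to one
    return pips


def select_predictors(pips, tau):
    """Keep the predictors whose PIP is at least tau."""
    return [j for j, p in enumerate(pips) if p >= tau]


# Hypothetical example with three predictors and three candidate models.
models = [{0, 1}, {0, 2}, {0}]
pmps = [0.5, 0.25, 0.25]
pips = posterior_inclusion_probabilities(models, pmps, n_predictors=3)
print(pips)                          # [1.0, 0.5, 0.25]
print(select_predictors(pips, 0.5))  # [0, 1]
```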
Here, some preliminary results are addressed. (i) Let the Conditional Posterior Sign (CPS) denote the sign certainty, taking values close to 1 or 0 when a covariate in M F has a positive or negative effect on outcomes, respectively; most of the model uncertainty and overfitting are then dealt with. Indeed, all predictors involved in the final model solution show CPSs close to either 0, such as predictors (21 and 30), or 1, such as predictors (1, 3, 5, 6, 7, 10, 11, 13, 16, 17, 25, 26, 34). (ii) When studying cross-country dynamic feedback, socioeconomic-demographic factors and general economic conditions hold a relevant position and then need to be accounted for. (iii) Macroeconomic-financial factors are the indicators that matter most in dealing with endogeneity issues and misspecified dynamics, being half of the 34 best selected covariates (see, for instance, Pacifico (2019) and Pacifico (2021)). (iv) Heterogeneity, interdependence, and co-movements are also addressed, given that the framework is a multidimensional panel.
The table is split as follows: the first column denotes the predictor number; the second and the third columns display the predictors and their labels, respectively; the fourth column describes the measurement unit; and the last two columns display the PIPs (in %) and the CPS, respectively. The last row refers to the outcomes of interest at time t. All contractions stand for: Gen. Gov., 'General Government'; Cons., 'Consumption'; Dom., 'Domestic'; pop., 'population'; and manuf., 'manufactured'. All data refer to World Bank and OECD databases.

Forecasting Results and Policy Issues
Given a final subset of 20 (potential) best predictors, cross-country spillover effects following an unexpected shock are evaluated in order to highlight the performance of the CSUR in (8) (hereafter, CSUR v ). A total of 10,000 draws for every model solution has been used to conduct posterior inference at each t. Conditional density forecasts are then obtained over a time frame of 8 quarters (2 years) in order to investigate how policy issues and their implications would affect cross-country economic dynamics. The informative conjugate priors refer to three subsamples: (i) 2006q1-2009q4, to deal with policy regime shifts concerning the global financial crisis; (ii) 2010q1-2018q4, to evaluate postcrisis fiscal consolidation periods; and (iii) 2019q4-2021q1, to absorb volatility changes due to the ongoing disease outbreak on the global economy.
Given the φ t 's estimates (in terms of posterior means) concerning the 20 selected predictors and their interactions (lagged effects), Systemic Contribution (SC) indexes are constructed to evaluate and quantify dynamic features associated with systemic events (excess spillover effects). The SCs capture the persistent long-run effects of an impulse variable (net sender) on the response variable (net receiver) and are defined as the ratio between Bilateral Net Spillover Effects and Total Net Positive Spillovers of the system (see, for instance, Pacifico (2019) for further specifications). Thus, a cross-country spillover analysis can be performed by (jointly) supposing unobserved volatility changes in the financial economy (Z p i,t−3 ) and unexpected shocks in the real economy (Y i,t−3 and Z s i,t−3 ). Conditional projections are then used to include forecasts from 2021q2 to 2023q1. Finally, to put more emphasis on volatility changes, the estimation results are compared with a similar model that assumes constant volatility (hereafter, CSUR nv ). In the latter, the variance-covariance matrix of the errors is homoscedastic (Ω) and volatility changes are therefore missing.
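The ratio defining the SC index can be made concrete with a small sketch. It assumes that a matrix of pairwise spillovers is available, with entry (i, j) measuring the spillover from unit i to unit j, and that the bilateral net effect is the difference between the two directions; both the input matrix and this netting rule are illustrative assumptions, not the construction detailed in Pacifico (2019).

```python
def systemic_contributions(spill):
    """Return SC[i][j] = bilateral net spillover from i to j over total net positive spillovers."""
    n = len(spill)
    # Bilateral net spillover: what i sends to j minus what i receives from j.
    net = [[spill[i][j] - spill[j][i] for j in range(n)] for i in range(n)]
    total_positive = sum(v for row in net for v in row if v > 0)
    if total_positive == 0:
        raise ValueError("no positive net spillovers in the system")
    return [[net[i][j] / total_positive for j in range(n)] for i in range(n)]


# Hypothetical three-unit system.
spill = [[0, 3, 1],
         [1, 0, 2],
         [0, 0, 0]]
sc = systemic_contributions(spill)
net_position = [round(sum(row), 2) for row in sc]
print(net_position)  # [0.6, 0.0, -0.6]
```

In this toy system the first unit is a net sender (positive SC), the third is a net receiver (negative SC), and the positive entries of the SC matrix sum to one by construction.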
The aim of this analysis is to highlight the importance of accounting for socioeconomic-demographic factors and policy regime shifts when investigating international spillover effects in the last decade. All countries are grouped in three macro-areas: (1) advanced economies (hereafter, AVE), namely AU, BE, FI, FR, DE, IE, IT, NL, ES, and PT; (2) emerging economies (hereafter, EME), namely CZ, EE, GR, HU, LV, LT, PO, SK, and SV; and (3) non-European Union countries (hereafter, NEU), namely CH, JP, KO, GB, and US. All series are expressed in standard deviations with respect to the same quarter of the previous year (q t /q t−1 ), and the real GDP per capita in logarithmic form (lgdp) is used to evaluate and quantify the size and the spreading of cross-country international spillover effects over time, given a 1% unexpected shock on the variables within the system.
Four main findings are in order. (i) Persistent heterogeneity and interdependence (common trends) matter among countries and sectors (Figure 1a). Moreover, most advanced and emerging countries tend to be net senders (positive SCs) and net receivers (negative SCs), respectively. These results are consistent with previous works, highlighting the need to support 'quasi-flexible' coordinated structural policy actions in order to ensure: higher homogeneous real economic convergence among countries; stronger international business cycle synchronization, mainly among emerging economies; and faster reinforcement in financial systems, mainly with the ongoing pandemic crisis (see, for instance, Pacifico (2021)). (ii) These findings are better highlighted in Figure 1b, where every endogenous factor is gathered together in the three country-specific groups. AVE would be net senders in ECOST and net receivers in ECOFIN and SOCDEM, NEU show stringent outward spillover effects, and EME are mainly net receivers except for SOCDEM.
From a policy and global perspective, this implies that, given an unexpected shock, advanced economies directly affect countries with middle and low economic status (outward spillovers) and then absorb structural fiscal adjustments for boosting output towards potential growth (inward spillovers). Conversely, emerging economies, with lower socioeconomic status, initially do not fight back against unexpected shocks (outward spillovers in SOCDEM), but react strongly to shocks in the real economy and even more in financial markets because of stronger cross-country financial linkages (inward spillovers). Finally, outward spillovers in NEU confirm their importance for international spillover effects affecting European financial shocks (see, for instance, Pacifico (2020a) and Curcio et al. (2020)). (iii) Highly consistent cross-country heterogeneity across spillovers' dynamics matters more in ECOST, followed by ECOFIN and SOCDEM (Figure 1c). Thus, when investigating cross-country international spillovers, the ability of an economy to face sudden and unpredictable events (e.g., economic shocks) needs to be assessed, mainly by evaluating conditional density forecasts in times of increasing volatility changes (misspecified dynamics due to structural breaks). (iv) Finally, the empirical analysis confirms the importance of accounting for socioeconomic-demographic factors when performing conditional density forecasts in multivariate settings (Figure 1c). Indeed, examinations of socioeconomic status are useful to reveal inequities in access to resources along with (endogeneity) issues related to the real economy and financial markets (ECOFIN). Thus, the latter stand for important drivers in evaluating the economic growth of a country, affecting the spreading and the intensity of spillover effects (see, for instance, Curcio et al. (2020); Pacifico (2019, 2020a); and Ciccarelli et al. (2018)).
It means that a high rate of economic growth entails an expansion in economic output and-in turn-higher socioeconomic status strongly affecting outcomes (misspecified dynamics due to cross-country linkages). The previous findings are deepened in Figure 2 by focusing on the three subsamples.
Concerning the results obtained through the CSUR v model (Figure 2a), from a modelling perspective, the spreading of spillover effects is larger due to triggering events, mainly in ECOST, confirming the importance of accounting for the economic status of a country. In this context, AVE show higher responses and tend to be net receivers (inward spillovers) with respect to EME, since economic conditions are highly affected by stronger inter-country linkages in their financial dimension. As in Pacifico (2021), EME tend to be net senders (outward spillovers) to catch up with the economic growth of the other advanced European countries. Finally, positive spillovers among NEU highlight their role as main drivers affecting the spreading of spillovers. According to SOCDEM, emerging economies are the only net senders within the system because of their lower socioeconomic status, and thus react ex-post when facing health crises. Finally, dealing with macroeconomic-financial factors, emerging economies show larger and negative spillover effects. From a policy perspective, these findings highlight the stringent interdependencies and economic-institutional linkages across advanced countries and larger fiscal adjustments across emerging ones. Thus, the analysis confirms the need to encourage the use of 'quasi-flexible' policy measures that consist of two steps (see, for instance, Pacifico (2021)). (i) Coordinated and focused policy interventions so as to deal with the stringent economic-institutional linkages among countries even if not European members (e.g., trade and capital transmission channels, international business, and other organisations facing international climate efforts and crisis-management operations such as NATO) 7. (ii) Flexibility to adopt stringent or more prolonged measures according to the cross-country heterogeneous economy, without overlooking the correct guidance to governments facing sudden socioeconomic-political changes.
Turning to the results under constant volatility (CSUR nv model in Figure 2b), higher cross-country commonality and homogeneity matter, mainly among ECOST and ECOFIN, since coefficient changes are affected by macroeconomic-financial linkages only. In addition, spillover effects are mostly outward, except for emerging economies in ECOST and ECOFIN due to their lower economic status. Overall, since volatility changes are not accounted for, the intensity of spillovers is lower than in the CSUR v model. Conditional density forecasts are displayed in Figure 3 to summarise and highlight the main previous findings. A total of 1000 retained replications has been used to conduct posterior inference at each t, and convergence has been reached with about 1 draw per regression parameter. I recall that the outcomes absorb the conditional forecasts computed for a time frame of 2 years (8 quarters), and the natural conjugate prior refers to the three subsamples: (i) 2006q1-2009q4, according to the Great Recession; (ii) 2010q1-2018q4, dealing with postcrisis fiscal consolidation periods; and (iii) 2019q4-2021q1, investigating further volatility changes due to the ongoing pandemic disease. In Figure 3, the yellow and red curves denote the 95% confidence bands, and the blue and purple curves denote the conditional and unconditional projections of outcomes Y T for each N country indexes and T time periods. From a modelling perspective, three main findings are addressed. (i) Advanced and non-European Union countries show similar spillovers' dynamics (top and middle plots), but with different spreading and intensity (endogeneity issues). (ii) Unlike the unconditional projections, the conditional ones lie within the confidence interval, highlighting better forecasting accuracy.
Thus, when investigating international spillovers and dynamic feedback in a challenging unified framework, different sets of variables need to be dealt with (misspecified dynamics due to cross-country linkages). (iii) Mostly outward countries' responses in AVE and NEU emphasize their role in driving the transmission of (global) unexpected shocks, and then their stringent economic recovery, even if uneven and, sometimes, overstrict (misspecified dynamics due to structural breaks).
From a policy perspective, the results call for greater caution in fine-tuning the economy via policy measures and in boosting productivity towards potential growth via accurate structural reforms. In this context, a hint of productivity moving towards potential (even if lower) growth can be observed among countries in the next years, mainly among advanced and non-European Union countries. This highlights strong heterogeneity across economies that needs to be managed carefully, affecting inter-related production and consumption activities that would aid in determining how cross-country recovery resources need to be allocated.

Simulated Experiment and Forecasting Accuracy
In this section, a simulated experiment is addressed to highlight and discuss the performance of the estimating procedure of the CSUR v model in (8) relative to four related models: (i) SPBVAR-MTV, as in Pacifico (2021), where volatility changes are not replaced by coefficient changes but integrated out; (ii) the CSUR nv model, with constant volatility; (iii) BCVAR, as in Koop et al. (2019), with stochastic volatility; and (iv) the Factor-Augmented VAR (FAVAR), as in Bernanke et al. (2005), with volatility changes.
Here, some considerations are in order. In the first (SPBVAR-MTV), (potential) structural breaks, even if treated by construction as permanent shifts, are re-evaluated at each time period. Thus, in contrast to CSUR v , excess spillover effects on outcomes would be more stressed. As regards CSUR nv , the variance-covariance matrix of the errors is homoscedastic (Ω) and volatility changes are therefore missing. In the BCVAR, non-informative priors are used to obtain analytical posteriors for both Ω and σ 2 i . In that context, Ω would be modelled through a triangular decomposition and σ 2 i would stand for volatility changes observed in the vector ω i,t ∼ N(0, I N ). Finally, the FAVAR model is obtained by selecting the optimal (best) number of factors through principal component methods and using non-informative priors to perform forecasting.
According to the model features in Section 2.3, a relatively large value for the lag length (l = 5) is chosen for all methods, with no intercept. Then, two distinct sets are constructed: a training set, obtained by simulating 120 variables as independent standard normal vectors for N = 15 and T = 100 time-series data; and a prediction set, obtained by generating 100 additional observations in the same manner. Since the thrust of this analysis is to measure the forecasting accuracy and estimating process performance of the proposed methodology, I estimate every model by keeping its own framework. More precisely, stacking for all supposed variables, * SBCPVAR model for CSUR v and CSUR nv : where s stands for 'simulated' and the 120 supposed variables are split equally between Y s i,t−λ and Z s i,t−λ (60 supposed predictors for each composed vector). Thus, the (simulated) estimation sample amounts, without restrictions, to 900,000 regression parameters, with K = Nk = 15 · [60 + 60] · 5.
* SPBVAR-MTV model: where the 120 supposed variables are split for Y s
* BCVAR model: where Θ s i,t contains matrices of coefficients concerning (simulated) lagged outcomes and elements of Ω s obtained by following a triangular decomposition, Φ i refers to the random projection matrix shrinking the parameter space, Z s i,t is a vector containing the observable (simulated) outcomes observed at time t and t − λ, and σ s i,t denotes the (simulated) standard deviations of the volatilities associated with the vector ω i,t ∼ N(0, I N ) 8, with N = K.
* FAVAR model: where Ỹ s i,t follows a VAR(l) process, Λ s i,j denotes the matrix of coefficients concerning (simulated) lagged outcomes, F s i,t−λ refers to the vector of lagged (simulated) outcomes, and ε
Forecasting accuracy is assessed by computing the multivariate Weighted Mean Squared Forecast Error (WMSFE ij,h ), with h ∈ {1, 2, 3, 4, 8} referring to the step-ahead predictive density forecasts evaluated at (t − 1, T). It is obtained as: where we ij,T+h = (e ij,T+h · w |φ| · e ij,T+h ) and we bmk,T+h = (e bmk,T+h · w |φ| · e bmk,T+h ) denote the weighted forecast errors of every simulated model and of the benchmark one at time T + h, respectively, e ij,T+h and e bmk,T+h are the (K × 1) vectors of forecast errors, and w |φ| refers to the PIPs in order to put more weight on large errors. The benchmark model used in this analysis corresponds to a VAR(1) process with constant volatility and no cross-unit lagged interdependencies or structural time variations. It has the form: where Y t = (y 1t , y 2t , . . . , y nt ) is an n × 1 vector of time-varying outcomes for each i at time t, with i = 1, 2, . . . , n denoting the country index, a 0 denotes intercepts, A 1 is an n × n matrix of lagged coefficients for each i, Y t−1 is an n × 1 vector of lagged outcomes for each i at time t − 1 (only one lag), and ε t ∼ N(0, Ω) is an n × 1 unobserved white noise vector process, serially uncorrelated (or independent) with zero mean and time-invariant covariance matrix (Ω). Because each equation has the same regressors (lagged values of y it ), the VAR(1) in (21) can simply be written as a SUR model with lagged variables and deterministic terms as common regressors, so as to be compared with every supposed model. Table 2 displays the (theoretical) WMSFE ij,h estimates computed by simulating all five supposed models with respect to the benchmark one displayed in (21). Four main findings are addressed.
(i) In the absence of volatility changes (as in CSUR nv ), lower weighted forecast errors are obtained than with high dimensional multicountry models (such as FAVAR and SPBVAR-MTV), with acceptable significance for the first two periods ahead. Thus, a compressed regression shrinking large parameter and model spaces tends to perform better. (ii) The Bayesian random compression method with stochastic volatility (such as BCVAR) shows significant estimates in the short term and results close to the CSUR nv ones, highlighting the need to impose a robust Bayesian model averaging procedure when modelling time-varying and inter-related factors. (iii) Multivariate time-varying volatilities (such as SPBVAR-MTV) perform better than FAVAR but worse than Bayesian compressed methods, because of the rather high computational costs associated with a huge estimation sample (≥35,000) 9. (iv) The lowest weighted forecast errors are displayed by the CSUR v model, owing to its hierarchical (structural) prior specification strategy and shrinking process. Thus, most of the variability (or dispersion) is adequately explained by the estimating procedure. In addition, given a multidimensional framework (panel data analysis), it would perform better in dealing with either endogeneity or volatility issues.
The first column denotes the h-step-ahead predictive density forecasts evaluated at (t − 1, T), and the remaining five columns refer to the (theoretical) WMSFE ij,h estimates computed for every supposed model. The significance codes stand for: (*) significance at 10%; (**) significance at 5%; and (***) significance at 1%.
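The simulation design and the accuracy measure can be sketched in a few lines. The sketch rests on two assumptions that are not confirmed by the text: the WMSFE is taken as the ratio between the summed weighted squared forecast errors of a model and those of the benchmark (so that values below one favour the model), and the weights w |φ| enter as a diagonal matrix. The function names, the seeds, and the toy forecast errors are hypothetical, and the data are generated for a single unit only.

```python
import random


def simulate_set(n_obs, n_vars, seed):
    """Draw n_obs observations of n_vars independent standard normal variables."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_obs)]


def weighted_error(e, w):
    """Weighted squared forecast error e' diag(w) e for one forecast origin."""
    return sum(w_k * e_k * e_k for e_k, w_k in zip(e, w))


def wmsfe(model_errors, benchmark_errors, weights):
    """Ratio of summed weighted squared errors: model over benchmark (assumed form)."""
    num = sum(weighted_error(e, weights) for e in model_errors)
    den = sum(weighted_error(e, weights) for e in benchmark_errors)
    return num / den


# Training and prediction sets: 120 variables, 100 observations each.
training = simulate_set(100, 120, seed=0)
prediction = simulate_set(100, 120, seed=1)

# Toy forecast errors for two series over two forecast origins (hypothetical).
model_errors = [[1.0, 0.0], [0.0, 1.0]]
benchmark_errors = [[2.0, 0.0], [0.0, 2.0]]
weights = [0.5, 0.5]   # stand-in for the PIP-based weights
print(wmsfe(model_errors, benchmark_errors, weights))  # 0.25
```

Under this reading, an entry below one in Table 2 would indicate that the corresponding model produces smaller weighted forecast errors than the VAR(1) benchmark at that horizon.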

Concluding Remarks
This paper improves the Bayesian compressed regression literature concerning VAR models with time-varying parameters and stochastic volatilities. The proposed methodology is obtained by combining the underlying logic of Pacifico (2020b)'s analysis, which highlights the need to select the best subset of predictors through MCMC-based Posterior Model Probabilities rather than random draws, with the estimating procedure used in Pacifico (2021), by jointly modelling high dimensional parameter and model spaces. Multivariate Conjugate Informative Proper Mixture priors are used to select the best model solution (or combination of predictors) fitting the data, acting as a strong model selection in large model classes. Finally, MCMC algorithms are used to construct exact posterior distributions and jointly shrink VAR parameters and volatility elements in order to perform accurate cross-country forecasts and policy analysis.
An empirical application is developed by accounting for a large set of macroeconomic-financial and socioeconomic-demographic variables to highlight the performance of the methodology proposed in this study. Thus, conditional density forecasts and strategic policy measures investigating the impact of either the COVID-19 pandemic or real/financial shocks on economic activity are performed.
A simulated experiment, compared with related works, is also addressed to discuss theoretical properties and forecasting accuracy through Monte Carlo simulations. The findings show that the hierarchical (structural) prior specification strategy and shrinking process yield lower weighted forecast errors and thus better conditional density forecasts when studying large sets of time-varying data with policy shifts and volatility changes.
Acknowledgments: I would like to thank Gary Koop, Dimitris Korobilis, and Davide Pettenuzzo for making available the codes to run a BCVAR model and thus helping me understand its underlying logic. I also gratefully thank the two anonymous referees for their useful suggestions, which improved this study.

Conflicts of Interest:
The author declares no conflict of interest.

1 In econometrics, predetermined variables denote covariates uncorrelated with contemporaneous errors, but not with their past and future values.
2 These can be easily added in a straightforward fashion with a N M · 1 vector of intercepts and an identity matrix of size N M in the vector X t . In the empirical and simulated applications, time-varying coefficients that multiply constant terms are added anyway.
3 In Bayesian analysis, posterior consistency ensures that the posterior probability (PMP) concentrates on the true model.

Taveeapiradeecharoen, Paponpat, and Nattapol Aunsri. 2020. A time-varying Bayesian compressed vector autoregression for macroeconomic forecasting. IEEE Access 8: 192777-86.
Taveeapiradeecharoen, Paponpat, Chamnongthai Kosin, and Nattapol Aunsri. 2019. Bayesian compressed vector autoregression for financial time-series analysis and forecasting. IEEE Access 7: 16777-86.