The first subsection introduces the data set used in the estimations. The second describes the BMA estimation framework, including the reported statistics, the choice of model priors and g priors, and the jointness measures.

#### 3.1. Data and Measurement

The independent variables comprise data on political institutions in the three categories described above (institutions of power legitimacy, of power relations, and of the budgetary process), obtained from different sources, as well as several economic and social control variables. Because of the large number of these variables, their detailed description and sources are given in Appendix A. The most important variables designed to test the hypotheses posed in Section 3.1 are presented in Table 1.

The dependent variable used in the analysis (COFOG_tot) is the sum of central government expenditures, excluding social security contributions, by functions of government (COFOG) as a share of GDP, compiled by the IMF for the Government Finance Statistics database (COFOG expenditures are divided into the following ten functions: general public services; defense; public order and safety; economic affairs; environmental protection; housing and community amenities; health; recreation, culture and religion; education; and social protection. Data on central government expenditures by function include transfers between the different levels of government). Although it differs slightly from other measures of central government spending, its advantages are completeness for the countries included in the dataset and, owing to the common methodology, comparability with COFOG expenditure data obtained from other sources (e.g., OECD, Eurostat). A variable for budgetary expenditure at the central level was chosen deliberately. Since most of the institutions we study operate at the level of the central government, they may not be well suited to explaining the expenditure of local and regional governments. We therefore chose a variable that explicitly excludes these expenditures. Such an approach should make expenditure levels more sensitive to changes in institutions. In fact, the variance of spending both between individual countries and across years is considerable (see Appendix F). This should make it possible to recognize more theories as compatible with each other.

The dataset includes the following 25 countries (members of the OECD and Bulgaria) for which we were able to obtain complete data for all the variables over the years 2001–2012 (the choice of this particular timespan is informed by data availability): Australia, Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Estonia, Finland, France, Germany, Hungary, Ireland, Israel, Italy, Latvia, Luxembourg, the Netherlands, Norway, Poland, Portugal, Slovenia, Spain, Sweden, United Kingdom, and United States. We decided to use this particular sample of countries due to their broadly similar level of development. However, we wanted to use as much data on emerging countries as possible to test whether the theories fare well in somewhat different circumstances than those normally assessed. Some countries (e.g., Romania) were left out due to lack of data (excluding Bulgaria from the dataset does not alter the results in any qualitative manner; results are available upon request). Overall, the panel comprises 300 observations. All 39 variables were tested for a common unit root using the first-generation test of [60], for individual unit roots using [61], and using the second-generation test proposed by Pesaran [62]. As confirmed by these tests, all variables used in the estimations are weakly stationary (results are not reported here for brevity and are available upon request). Descriptive statistics and means for all variables in every country are reported in Appendix D.

#### 3.2. BMA—Bayesian Model Averaging

The theoretical literature contains a long list of potential institutional determinants of budgetary expenditures. Thus far, researchers seeking to verify hypotheses about them have focused on only a few variables, representing either the institutions they were interested in or those associated with a given strand of theory. This approach completely disregards uncertainty about the specification of the model being tested. The issue is amplified by open-endedness, the idea that the validity of one causal theory does not imply the falsification of another [63]. Given the vast theoretical and empirical literature on the subject, assessing which theories are correct becomes infeasible because of the bulk of inconsistent or even conflicting results that cannot be compared.

Accordingly, in order to assess which institutions in fact determine budgetary expenditure, the analytical framework needs to allow different models to be compared and assessed on empirical grounds. Bayesian model averaging possesses these qualities and is therefore used in the present paper to assess the robustness of the most prominent institutional determinants of budget expenditure among the candidates proposed in the research to date.

The data comprise a panel of 25 countries over the 2001–2012 period, with one dependent variable and 38 regressors. In the literature, country heterogeneity in the data is dealt with using random or fixed effects models. Those models are well suited to testing a single theory at a time, with random and fixed effects serving as a way of covering up ignorance about the sources of heterogeneity [64]. BMA, on the other hand, deals with heterogeneity directly by finding the combination of regressors that accounts for it to the greatest extent within a conditioning set of information. Consequently, BMA appears to be well suited to finding robust determinants of budgetary expenditure. Within the set of regressors, the research aims to identify the variables whose influence on budgetary expenditures finds the most substantial support in the data. BMA assumes the following general form of the model:

$$y_j = \alpha_j + X_j \beta_j + \epsilon_j$$

where $j = 1, 2, \dots, m$ denotes the number of the model, $y_j$ is a vector $((n \ast t) \times 1)$ of the values of the dependent variable, $\alpha_j$ is a vector of intercepts, $\beta_j$ is a vector $(K \times 1)$ of unknown parameters, $X_j$ is a matrix $((n \ast t) \times K)$ of explanatory variables, and $\epsilon_j$ is a vector of residuals, which are assumed to be normally distributed and conditionally homoscedastic, $\epsilon_j \sim N(0, \sigma^2 I)$. $n \ast t$ denotes the number of observations (300), and $K$ is the total number of regressors (38).

For the space of all models that can be estimated with the 38 regressors at hand, the unconditional posterior distribution of the coefficient $\beta$ is given by:

$$P(\beta \mid y) = \sum_{j=1}^{2^K} P(\beta \mid M_j, y) \ast P(M_j \mid y)$$

where $y$ denotes the data, $j$ $(j = 1, 2, \dots, m)$ is the number of the model, $K$ is the total number of potential regressors, $P(\beta \mid M_j, y)$ is the conditional distribution of the coefficient $\beta$ for a given model $M_j$, and $P(M_j \mid y)$ is the posterior probability of the model. Using Bayes' theorem, the posterior model probability (PMP) $P(M_j \mid y)$ can be rendered as:

$$P(M_j \mid y) = \frac{L(y \mid M_j) \ast P(M_j)}{P(y)}$$

where PMP is proportional to the product of the model-specific marginal likelihood $L(y \mid M_j)$ and the model-specific prior probability $P(M_j)$, which can be written as $P(M_j \mid y) \propto L(y \mid M_j) \ast P(M_j)$. Moreover, because $P(y) = \sum_{j=1}^{2^K} L(y \mid M_j) \ast P(M_j)$, the weights of individual models can be transformed into probabilities through normalization over the space of all $2^K$ models:

$$P(M_j \mid y) = \frac{L(y \mid M_j) \ast P(M_j)}{\sum_{j=1}^{2^K} L(y \mid M_j) \ast P(M_j)}$$
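The normalization of model weights into posterior model probabilities can be illustrated with a minimal sketch. The marginal likelihood values below are hypothetical and chosen only for illustration, the toy model space uses three regressors rather than 38, and the function name is an assumption of this sketch rather than part of any estimation software used in the paper.

```python
from itertools import product


def posterior_model_probabilities(marginal_likelihoods, priors):
    """Normalize L(y|M_j) * P(M_j) over the whole model space."""
    weights = [l * p for l, p in zip(marginal_likelihoods, priors)]
    total = sum(weights)  # plays the role of P(y)
    return [w / total for w in weights]


K = 3  # toy number of candidate regressors (the paper uses K = 38)
models = list(product([0, 1], repeat=K))  # all 2^K inclusion patterns

# Hypothetical marginal likelihoods, one per model, for illustration only.
marginal_likelihoods = [1.0, 2.0, 0.5, 4.0, 0.25, 1.5, 0.75, 6.0]
uniform_prior = [1.0 / len(models)] * len(models)

pmp = posterior_model_probabilities(marginal_likelihoods, uniform_prior)
best_model = models[pmp.index(max(pmp))]

# Size of the model space with 38 regressors, far too large to enumerate.
full_model_space = 2 ** 38
```

Under a uniform model prior the prior terms cancel, so the posterior model probabilities are simply the marginal likelihoods rescaled to sum to one.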

Applying BMA requires specifying the prior structure of the model. The coefficients $\beta$ are assumed to follow a normal distribution with zero mean and variance $\sigma^2 V_{0j}$, hence:

$$P(\beta \mid \sigma^2, M_j) \sim N(0, \sigma^2 V_{0j})$$

It is assumed that the prior variance matrix $V_{0j}$ is proportional to the covariance in the sample, ${\left(g{X}_{j}^{\prime}{X}_{j}\right)}^{-1}$, where $g$ is the proportionality coefficient. The g prior parameter was put forward by [65] and is widely used in BMA applications. In all the estimations presented in this paper, two versions of the g prior recommended by [66] in their seminal paper were used, namely the unit information prior (UIP) [67] and the risk inflation criterion (RIC) [68]. For further discussion of g priors, see [69,70,71,72].
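As a rough illustration of how the two g priors differ, the sketch below computes $g$ for the sample dimensions reported in this paper. It assumes the common convention for a prior variance of the form $(g X_j' X_j)^{-1}$, namely $g = 1/N$ for UIP and $g = 1/K^2$ for RIC; these values are not stated in the text and are an assumption of the sketch, as are the function names. Under that convention the conditional posterior mean of the slopes equals the OLS estimate multiplied by $1/(1+g)$.

```python
def g_uip(n_obs):
    """Unit information prior: g = 1/N (assumed convention)."""
    return 1.0 / n_obs


def g_ric(n_regressors):
    """Risk inflation criterion: g = 1/K^2 (assumed convention)."""
    return 1.0 / n_regressors ** 2


def shrinkage_factor(g):
    """Factor multiplying the OLS estimate in the conditional posterior mean."""
    return 1.0 / (1.0 + g)


N, K = 300, 38  # observations and regressors reported in the paper
g_values = {"UIP": g_uip(N), "RIC": g_ric(K)}
shrinkage = {name: shrinkage_factor(g) for name, g in g_values.items()}
```

With $N = 300$ and $K = 38$, $K^2 = 1444$ exceeds $N$, so under these assumptions RIC implies a more diffuse coefficient prior and slightly less shrinkage than UIP.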

Besides the specification of the g prior, applying BMA requires determining the prior model distribution. In the main results, the uniform model prior [73] was used, under which all models receive equal prior probability $(P(M_j) \propto 1)$. Under the uniform model prior, the prior probability of including a variable in a model amounts to 0.5. The main estimation results presented in this paper are based on a combination of the uniform model prior and the unit information g prior. This combination of priors is recommended by [72]. To ensure the robustness of the results, other prior structures were used as well. First, the risk inflation prior advocated by [66] was combined with the binomial-beta model prior [69]. Under the binomial-beta distribution, each model size has the same prior probability $\left(\frac{1}{K+1}\right)$. Thus, the prior probability of including a variable in the model amounts to 0.5 for both the binomial and the binomial-beta prior with expected model size $Em = K/2$. In order to account for potential multicollinearity between regressors, a dilution prior was also used: the uniform model prior is supplemented with a function accounting for multicollinearity [74] to obtain prior model probabilities:

where $K = 38$ is the number of covariates, while $|R_j|$ is the determinant of the correlation matrix for all the regressors in model $j$. The uniform model prior implies equal probabilities assigned to all the models, so the $|R_j|$ component of the dilution prior (13) determines the distribution of the prior probability mass. The higher the multicollinearity between the variables, the closer the value of $|R_j|$ is to 0 and the lower the prior ascribed to a given model.

With 38 covariates, the entire model space consists of around 275 billion possible models ($2^{38}$), too many to evaluate analytically. Accordingly, the model space is reduced with the $MC^3$ (Markov Chain Monte Carlo Model Composition) sampler [75]. The convergence of the chain is assessed by the correlation coefficient between the analytical and $MC^3$ posterior model probabilities for the best 10,000 models. Using the posterior model probabilities as weights allows calculation of the unconditional posterior mean and standard deviation of the coefficient $\beta_i$. The posterior mean (PM) of the coefficient $\beta_i$, independent of the space of the models, is then given by the following formula:

$$PM = E(\beta_i \mid y) = \sum_{j=1}^{2^K} P(M_j \mid y) \ast \widehat{\beta}_{i,j}$$

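where $\widehat{\beta}_{i,j} = E(\beta_i \mid y, M_j)$ is the value of the coefficient $\beta_i$ estimated for the model $M_j$. The posterior standard deviation (PSD) is equal to:

$$PSD = \sqrt{\sum_{j=1}^{2^K} P(M_j \mid y) \ast V(\beta_{i,j} \mid y, M_j) + \sum_{j=1}^{2^K} P(M_j \mid y) \ast \left[\widehat{\beta}_{i,j} - E(\beta_i \mid y)\right]^2}$$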

where $V(\beta_{i,j} \mid y, M_j)$ signifies the conditional variance of the parameter for the model $M_j$. To better capture the relative impact of the determinants on government expenditure, standardized coefficients were calculated, and the BMA statistics were also computed from their values. SPM denotes the standardized posterior mean, while SPSD denotes the standardized posterior standard deviation (see [76] for elaboration).
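The averaging of model-specific estimates into PM and PSD can be sketched as follows. The sketch assumes the usual BMA decomposition of the posterior variance into a within-model and a between-model component; the three-model example, its numbers, and the function names are hypothetical and serve only as an illustration.

```python
from math import sqrt


def posterior_mean(pmp, beta_hat):
    """PM: PMP-weighted average of the model-specific estimates."""
    return sum(p * b for p, b in zip(pmp, beta_hat))


def posterior_sd(pmp, beta_hat, cond_var):
    """PSD: square root of within-model plus between-model variance."""
    pm = posterior_mean(pmp, beta_hat)
    within = sum(p * v for p, v in zip(pmp, cond_var))
    between = sum(p * (b - pm) ** 2 for p, b in zip(pmp, beta_hat))
    return sqrt(within + between)


# Hypothetical values for three models; the coefficient is 0 where excluded.
pmp = [0.5, 0.3, 0.2]
beta_hat = [0.8, 1.0, 0.0]
cond_var = [0.04, 0.09, 0.0]

pm = posterior_mean(pmp, beta_hat)
psd = posterior_sd(pmp, beta_hat, cond_var)
```

In this toy case most of the posterior variance comes from the between-model term, which reflects disagreement across models about the size of the coefficient rather than estimation noise within any single model.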

The posterior inclusion probability (PIP) of a regressor is the sum of the posterior probabilities of all models that include it:

$$PIP = P(x_i = 1 \mid y) = \sum_{j=1}^{2^K} 1(x_i = 1 \mid y, M_j) \ast P(M_j \mid y)$$

where $x_i = 1$ signifies including the variable $x_i$ in the model. Under the uniform and binomial-beta model priors, the prior inclusion probability is equal to 0.5 and can serve as a point of reference in assessing robustness. Following [67,77], the robustness of a regressor is weak, positive, strong, or decisive if its posterior inclusion probability (PIP) lies in the range 0.5–0.75, 0.75–0.95, 0.95–0.99, or 0.99–1, respectively. In the case of the dilution prior, the exact value of the prior inclusion probability is difficult to set. As the method combines a uniform model prior with a function penalizing multicollinearity, the exact prior distribution is not known before the calculations. As explained above, computing the entire model space, and hence all the values of $|R_j|$, is infeasible with a large number of regressors; consequently, the same is true for the prior inclusion probability. On the other hand, $|R_j|$ takes lower values for bigger models by construction, so the expected model size is lower than under the uniform distribution and the prior inclusion probability is lower than 0.5 (moreover, prior inclusion probabilities are lower for variables characterized by a higher degree of multicollinearity). In this setting, the critical values proposed by [67,77] can serve as very strict criteria for asserting the robustness of the variables under consideration.
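The effect of a determinant-based penalty on prior inclusion probabilities can be checked on a model space small enough to enumerate. The sketch below assumes prior model probabilities directly proportional to $|R_j|$, which may differ from the exact weighting function used in the paper, and treats the empty model as having $|R_j| = 1$. The correlation matrix is hypothetical, and the label "not robust" for PIP below 0.5, like the treatment of interval boundaries, is a choice made for this sketch.

```python
from itertools import combinations


def det(matrix):
    """Determinant by Laplace expansion; fine for the tiny matrices used here."""
    if not matrix:
        return 1.0  # empty model: treated as |R_j| = 1 (assumption)
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, value in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * value * det(minor)
    return total


def prior_inclusion_probabilities(corr):
    """Prior inclusion probabilities when P(M_j) is proportional to |R_j|."""
    k = len(corr)
    weights = {}
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            sub = [[corr[a][b] for b in subset] for a in subset]
            weights[subset] = det(sub)
    total = sum(weights.values())
    return [sum(w for s, w in weights.items() if i in s) / total
            for i in range(k)]


def robustness(pip):
    """Label a PIP using the intervals quoted in the text."""
    if pip >= 0.99:
        return "decisive"
    if pip >= 0.95:
        return "strong"
    if pip >= 0.75:
        return "positive"
    if pip >= 0.5:
        return "weak"
    return "not robust"  # label below 0.5 is an assumption of this sketch


# Hypothetical correlations: x0 and x1 are highly collinear, x2 only mildly.
corr = [[1.0, 0.9, 0.3],
        [0.9, 1.0, 0.3],
        [0.3, 0.3, 1.0]]
prior_pip = prior_inclusion_probabilities(corr)
```

In this toy example every prior inclusion probability falls below 0.5, and the two highly collinear regressors receive lower values than the mildly correlated one, in line with the argument above.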

Additionally, the researcher may be interested in the sign of the estimated parameter, given that it is included in the model. The posterior probability of a positive sign of the coefficient in the model, $P(+)$, is calculated in the following way:

where CDF signifies a cumulative distribution function, while ${t}_{ij}\equiv \widehat{{\beta}_{i}}/\widehat{S{D}_{i}}|{M}_{j}$.

Within BMA, it is possible to assess the nature of the relationships between regressors using jointness measures. The jointness measure of [76] is defined as:

$$J_{DW(ih)} = \ln\left[\frac{P(i \cap h \mid y) \ast P(\bar{i} \cap \bar{h} \mid y)}{P(i \cap \bar{h} \mid y) \ast P(\bar{i} \cap h \mid y)}\right]$$

where $i$ and $h$ represent two regressors in the model and an overbar denotes the exclusion of a regressor. One of the biggest drawbacks of JDW is that, by construction, there are circumstances in which it cannot be calculated (for example, when a given variable has a PIP very close to 0, computing JDW requires the division 0/0, which is undefined, or NaN, not a number). Accordingly, in order to obtain more reliable information about jointness, the measure of [78] is calculated as:

$$J_{LS(ih)} = \ln\left[\frac{P(i \cap h \mid y)}{P(i \cap \bar{h} \mid y) + P(\bar{i} \cap h \mid y)}\right]$$

For both jointness measures $(J)$, the same critical values can be applied. When $J > 2$, two variables are referred to as strong complements, $2 > J > 1$ as significant complements, $1 > J > -1$ as unrelated, $-1 > J > -2$ as significant substitutes, while $-2 > J$ signifies strong substitutes [79]. As demonstrated in [80,81], JLS generally outperforms JDW. Accordingly, interpretations of jointness measures in the results are mainly based on JLS (more on jointness measures can be found in [82]).
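The two jointness measures and their critical values can be illustrated with a short sketch. It assumes the standard logarithmic forms of JDW (a cross-product ratio of joint inclusion and exclusion probabilities) and JLS (the odds of joint against separate inclusion). The posterior probabilities are hypothetical, the function names are introduced for this sketch only, and the treatment of values falling exactly on a threshold is a choice made here.

```python
from math import log


def jointness_dw(p_both, p_only_i, p_only_h, p_neither):
    """JDW: log cross-product ratio of the four inclusion outcomes."""
    return log((p_both * p_neither) / (p_only_i * p_only_h))


def jointness_ls(p_both, p_only_i, p_only_h):
    """JLS: log odds of joint inclusion against separate inclusion."""
    return log(p_both / (p_only_i + p_only_h))


def classify(j):
    """Map a jointness value onto the categories quoted in the text."""
    if j > 2:
        return "strong complements"
    if j > 1:
        return "significant complements"
    if j > -1:
        return "unrelated"
    if j > -2:
        return "significant substitutes"
    return "strong substitutes"


# Hypothetical posterior probabilities for a pair of regressors (sum to 1).
p_both, p_only_i, p_only_h, p_neither = 0.6, 0.05, 0.05, 0.3

jdw = jointness_dw(p_both, p_only_i, p_only_h, p_neither)
jls = jointness_ls(p_both, p_only_i, p_only_h)
labels = (classify(jdw), classify(jls))
```

The same pair of regressors can fall into different categories under the two measures, as here, because JDW also rewards the joint exclusion of both variables while JLS does not.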