Sovereign Risk Indices and Bayesian Theory Averaging

In economic applications, model averaging has found principal use in examining the validity of various theories related to observed heterogeneity in outcomes such as growth, development, and trade. Though often easy to articulate, these theories are imperfectly captured quantitatively. A number of different proxies are often collected for a given theory, and the uneven nature of this collection requires care when employing model averaging. Furthermore, if valid, these theories ought to be relevant outside of any single narrowly focused outcome equation. We propose a methodology which treats theories as represented by latent indices, with these latent processes controlled by model averaging on the proxy level. To achieve generalizability of the theory index, our framework assumes a collection of outcome equations. We accommodate a flexible set of generalized additive models, enabling non-Gaussian outcomes to be included. Furthermore, selection of relevant theories also occurs on the outcome level, allowing theories to be differentially valid. Our focus is on creating a set of theory-based indices directed at understanding a country's potential risk of macroeconomic collapse. These Sovereign Risk Indices are calibrated across a set of different "collapse" criteria, including default on sovereign debt, heightened potential for high unemployment or inflation, and dramatic swings in foreign exchange values. The goal of this exercise is to render a portable set of country/year theory indices which can find more general use in the research community.


Introduction
In economic applications, Bayesian Model Averaging (BMA) has proven a useful tool to assess theories related to the potentials and risks of economic expansion; see Steel (2019) for a comprehensive review. All economic theories are in some sense qualitative, and no single empirical observation can encapsulate a theory's essence perfectly. To address this, a group of variables, self-evidently correlated, is often collected to proxy each theory.
Not accounting for the uneven manner in which different variables may be available for each theory can lead to inappropriate conclusions regarding overall theory validity. Standard approaches to BMA can be modified, especially through the model prior, to account for these characteristics, but still consider the direct effect of the collected variables on the single response in question. One example is Chen et al. (2017), which considers the determinants of the 2008 crisis. They use a hierarchical formulation that allows for simultaneous selection of both theories and relevant variables.
We propose an entirely separate approach to testing theories, relative both to standard BMA and to Chen et al. (2017), through a new model averaging approach. We assume each observation has a number of latent features encoding values for these theories. This requires the researcher to pre-specify which theory a given empirical observation is meant to proxy, a task which is often straightforward and frequently performed in practice. The outcome of this modeling exercise is a set of theory indices associated with each observation, as well as the model parameters necessary to derive these indices for observations not included in training. Our second innovation is to link the embedding of empirical factors to theory indices across a number of correlated outcome variables. This is driven by a motivation for theory index consistency. Ideally, an index which assesses the strength of a government's institutions should be roughly the same when using the index to predict the potential for economic growth as when predicting the susceptibility to economic collapse, for example.
Indeed, an ideal encoding would allow the theory index to be trained on one set of outcome variables and be immediately useful as a standalone feature in modeling separate but related economic activity. We therefore construct a framework by which theory-level modeling occurs on a latent level and is tuned to addressing a theory's role in explaining the variability of a number of economic outcome variables simultaneously. Brock et al. (2003) recommend considering both theory uncertainty (many theories can explain a phenomenon) and variable uncertainty (which empirical proxies should be used to explain each theory). Following this recommendation, model averaging in our Bayesian Theory Averaging (BTA) approach occurs on two separate levels. On the theory level, a standard BMA formulation is used to determine which proxies for a given theory have the greatest relevance. Our modeling is across multiple different outcome variables, and a given theory may only be relevant for a subset of these outcomes. Thus, we also perform theory averaging on the outcome level, allowing theories to selectively enter into each outcome under consideration.
Outcomes in economics can be quantified in a variety of ways, and thus our framework is formulated to entertain a broader family of outcome sampling distributions than the Gaussian context to which most economic BMA applications have adhered (Steel, 2019).
Our motivating example concerns developing useful theory-based indices for quantifying the potential for significant negative economic outcomes in macroeconomies, which we term Sovereign Risk Indices (SRIs). These outcomes range across default on sovereign debt, the potential for high levels of inflation or unemployment, and heightened risk of instability in foreign exchange. Useful introductions to sovereign default are Roubini and Manasse (2005) and Savona and Vezzoli (2015). Each of these outcomes has a number of theories which explain its variability. These theories encapsulate institutional and financial characteristics of each country and overall aspects of the global economy at the time, and are proxied by a large number of potential variables. By modeling these outcomes jointly, we can construct a set of theory indices that are relevant for general research into macroeconomic extremes.
Our goal is to create a broad database of SRIs that can then be made available to the general research population, where each index has a clearly defined construction and encodes a well-articulated theory regarding economic well-being. Our data combines the data in Savona and Vezzoli (2015) with new data sources, as explained in Section 3.1.
The structure of the article is as follows. Section 2 outlines BTA. The specifics of the algorithm that performs posterior inference for BTA are rather involved and are relegated to Appendix A. Section 3 contains our analysis of the data which constructs the SRIs, while Section 4 concludes.

Bayesian Theory Averaging
Let $Y_i$ be an $R$-dimensional response vector for observation $i$ and $D = \{Y_1, \ldots, Y_n\}$ be a collection of $n$ such observations. Each variate $Y_{ir}$ in the vector $Y_i$ is assumed to belong to a general field $\mathcal{F}_r$. In this paper we consider examples where $\mathcal{F}_r$ is $\{0,1\}$, $\mathbb{R}$, or $\mathbb{R}^+$, though others, such as $\mathbb{N}$, could easily be entertained. We associate $Y_{ir}$ with an outcome distribution $Y_{ir} \sim g_r(\theta_{ir})$, where $g_r$ is a general probability density or mass function and $\theta_{ir}$ is a set of parameters, assumed to have dimension $q_r$, which controls $g_r$. A given parameter $\theta_{irj}$, $j \in \{1, \ldots, q_r\}$, is either assumed to be a global parameter, so that $\theta_{irj} = \theta_{krj}$ for all $i, k \in \{1, \ldots, n\}$, or to have a linear form related to a $T$-dimensional set of indices $I_i = (I_{i1}, \ldots, I_{iT})$ through the relationship
$$\theta_{irj} = \alpha_r + \sum_{t=1}^{T} \gamma_{rt} I_{it}. \qquad (1)$$
In this formulation γ rt can either be 0 or γ rt ∈ R. By convention if several γ rt are nonzero for a given index t then one of these non-zero γ rt is set to 1 to avoid issues related to identification. This matter is discussed at length in Appendix A.
The variable $I_{it}$ is then referred to as the theory-$t$ index for observation $i$. We further assume that $I_{it}$ depends on a set of $p_t$ theory proxies $X_{it}$ according to the linear model
$$I_{it} = X_{it}^{\top} \beta_t + \varepsilon_{it}, \qquad \varepsilon_{it} \sim N(0, \nu_t).$$
Due to issues of identification, the precision term $\nu_t^{-1}$ is considered fixed; see Appendix A for a more detailed discussion of this term and its role in balancing the system when theory transitions are made.
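As a concrete illustration of this two-level structure, the following sketch simulates a single theory index and its contribution to one outcome parameter. All dimensions, coefficient values, and the noise scale are invented purely for illustration; they are not taken from the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p_t = 200, 4                # observations and proxies for one theory (invented)
beta_t = np.array([0.8, -0.5, 0.0, 1.2])   # zeros mimic proxies excluded by M_t
noise_sd = 0.5                 # fixed index-equation noise scale (stand-in for the nu_t term)

X_t = rng.normal(size=(n, p_t))            # theory proxies
# latent theory index: linear in the proxies plus Gaussian noise
I_t = X_t @ beta_t + rng.normal(scale=noise_sd, size=n)

# outcome-level parameter: theta_ir = alpha_r + gamma_rt * I_it
alpha_r, gamma_rt = -1.0, 1.0  # gamma_rt fixed to 1 for identification
theta_r = alpha_r + gamma_rt * I_t
assert theta_r.shape == (n,)
```

The point of the sketch is only the shape of the hierarchy: proxies feed a latent index, and the index, not the proxies, enters the outcome parameter.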
Associated with the parameter $\beta_t$ is a model $M_t \subset \{1, \ldots, p_t\}$ such that $\beta_{it} = 0$ when $i \notin M_t$, a standard BMA formulation. As the "null" model can be controlled by the $\gamma_{rt}$ parameters, we exclude $M_t = \emptyset$ from our consideration; see Kourtellos et al. (2019) for a motivation of this structure. Writing $\beta_{M_t}$ to represent the subvector of $\beta_t$ not constrained to zero, we assume a mean-zero Gaussian prior over the $p_{M_t}$-dimensional vector $\beta_{M_t}$, scaled by the term $\nu_t$, where $p_{M_t}$ is the size of model $M_t$. Alternative priors for this model could have been considered; see our discussion in the Conclusions section. Again, the term $\nu_t$ adjusts the $\beta_{M_t}$ parameters and their associated prior to aid in identification matters when the theory scalings $\gamma_t$ are adjusted. Finally, the model $M_t$ can have a number of priors; see Ley and Steel (2009) for an overview of potential issues to consider when selecting this prior. We follow Kourtellos et al. (2019), which builds on Durlauf et al. (2012), by choosing a dilution model prior for which
$$pr(M_t) \propto |C_{M_t}|,$$
where $C_{M_t}$ is the correlation matrix defined by the variables in $M_t$. As in Kourtellos et al. (2019), we avoid the awkward need to account for the null model encountered in Durlauf et al. (2012) through our restriction that $M_t \neq \emptyset$, handling this side case through the $\gamma_t$ vector.
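The dilution idea behind this prior, down-weighting models whose included proxies are nearly collinear via the determinant of their correlation matrix, can be sketched numerically. The data and proxy sets below are fabricated, and the normalizing constant is suppressed:

```python
import numpy as np

def dilution_weight(X, model):
    """Unnormalized dilution factor |C_M| for the proxies indexed by `model`.

    |C_M| equals 1 for orthogonal proxies and tends to 0 as they become
    collinear, so models built from redundant proxies are penalized.
    """
    C = np.corrcoef(X[:, sorted(model)], rowvar=False)
    return float(np.linalg.det(np.atleast_2d(C)))

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
# proxy 1 is nearly a copy of proxy 0; proxy 2 is unrelated
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=500), rng.normal(size=500)])

w_collinear = dilution_weight(X, {0, 1})   # nearly duplicate proxies
w_diverse = dilution_weight(X, {0, 2})     # unrelated proxies
assert w_collinear < w_diverse             # the collinear model is diluted
```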
The system outlined above then serves as the core latent process which drives the subsequent outcome variables. The models $M_t$ thus investigate which proxies best encode a theory quantitatively, while also accounting for the obvious model uncertainty in this formulation. The $\gamma_t$ term serves two purposes. First, by examining its non-zero elements, we see for which response equations a given theory is relevant. Secondly, by requiring the first non-zero $\gamma_{rt}$ to equal 1 and allowing all others to lie in $\mathbb{R}$, the $\gamma_t$ term scales the latent indices, allowing them to enter into the model parameters differentially and indeed in opposite directions.
Finally, the latent theory indices I it are potentially of greatest interest, as they are meant to encapsulate the way that the theory proxies affect the outcome equations of interest.
Again, as outlined in the Appendix, these terms suffer from potential identification issues when combined with the restrictions placed on a given $\gamma_t$. The $\nu_t$ term accommodates this under-identification, and therefore final interest focuses on the scale-free term $\tilde{I}_{it} = \nu_t I_{it}$.
Choices for the families $g_r$ that control the outcome variables are considerable. In our application, we focus on three models. The first is logistic regression, in which case $Y_{ir} \in \{0, 1\}$, $q_r = 1$, and
$$pr(Y_{ir} = 1 \mid \theta_{ir1}) = \frac{\exp(\theta_{ir1})}{1 + \exp(\theta_{ir1})}.$$
In our application we then assume that $\theta_{ir1}$ takes the linear form in (1). We use this logistic regression to model the probability that a country will default on its sovereign debt based on the theory indices.
The second family considered corresponds to non-central asymmetric Laplace variates. In particular, $Y_{ir} \in \mathbb{R}$, $q_r = 2$, and, up to an additive constant, we write
$$\log pr(Y_{ir} \mid \theta_{ir1}, \theta_{ir2}) = \theta_{ir2} - e^{\theta_{ir2}} \rho_\tau(Y_{ir} - \theta_{ir1}),$$
where $\rho_\tau(u) = u(\tau - \mathbb{1}\{u < 0\})$ is the so-called pin-ball loss. In our application we assume that the log-precision parameter $\theta_{ir2}$ is constant across observations and thus write $\theta_{ir2} = \kappa_r$, while $\theta_{ir1}$ has the form in (1). This model is often referred to as Bayesian quantile regression, since its posterior mode is related to the quantile regression estimate under the pin-ball loss $\rho_\tau$. We employ this model for two separate variates, the inflation and unemployment rates, and set $\tau = .9$ for both, thus focusing on the 90th percentile of the respective distributions.
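The link between the asymmetric Laplace likelihood and quantile estimation can be checked directly: minimizing the average pin-ball loss over a constant recovers the empirical τ-quantile. A small numerical check on synthetic data:

```python
import numpy as np

def pinball(u, tau):
    """Pin-ball (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

rng = np.random.default_rng(2)
y = rng.gamma(shape=2.0, scale=1.5, size=5000)
tau = 0.9

# minimize the average pin-ball loss over a grid of candidate constants mu
grid = np.linspace(y.min(), y.max(), 2000)
losses = [pinball(y - mu, tau).mean() for mu in grid]
mu_hat = grid[int(np.argmin(losses))]

# the minimizer sits at the empirical 90th percentile (up to grid resolution)
assert abs(mu_hat - np.quantile(y, tau)) < 0.05
```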
Finally, we consider the Generalized Extreme Value (GEV) model, with $Y_{ir} \in \mathbb{R}$, $q_r = 3$, and distribution function
$$pr(Y_{ir} \le y) = \exp\left[-\left\{1 + \theta_{ir3}\, e^{\theta_{ir2}}\,(y - \theta_{ir1})\right\}^{-1/\theta_{ir3}}\right],$$
with location $\theta_{ir1}$, log-precision $\theta_{ir2}$, and shape $\theta_{ir3}$. The GEV model is used to model block maxima and hence to understand the nature of extreme behavior. In our case we use it to model the largest daily percentage jump in a country's exchange rate (relative to USD) seen over the course of a year. We currently fix both the log-precision $\theta_{ir2}$ and shape $\theta_{ir3}$ parameters to be constant and write these as $\kappa$ and $\xi$ respectively.
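The block-maxima construction underlying the GEV outcome can be sketched as follows, with synthetic daily log changes standing in for actual exchange-rate data:

```python
import numpy as np

rng = np.random.default_rng(3)

# roughly 250 trading days of synthetic daily log changes for 5 "years"
daily = rng.normal(loc=0.0, scale=0.01, size=(5, 250))

# block maxima: the largest single-day log change within each year,
# which is the quantity the GEV outcome equation models
annual_max = daily.max(axis=1)

assert annual_max.shape == (5,)
```

Fitting a GEV to such annual maxima is what motivates fixing the scale and shape across observations when, as here, only a handful of block maxima are available per country.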
Based on $D$ we then conduct posterior inference on the full parameter set, which includes the global parameters in $\theta_r$, the theory-level models $M_t$, and the theory-inclusion and scaling parameters $\gamma_{rt}$. Reversible jump methods (Green, 1995) alternate $\gamma_{rt}$ between being 0 and lying in $\mathbb{R}$, with a modicum of bookkeeping to ensure that at least one $\gamma_{rt}$ is set to 1 when theory $t$ is represented in more than one dependent equation $r$, again to ensure identification of the system.
In general, mixing of the algorithm appears relatively satisfactory, especially when using conditional Bayes factors (CBFs) to compare non-neighboring models $M_t$. Run-time is on the order of hours for our focus dataset, as opposed to days or weeks. More details are provided in Appendix A.
Using BTA to construct Sovereign Risk Indices

Data Outline
Our dataset for constructing SRIs is based on the data used in Savona and Vezzoli (2015). Table 1 provides an overview of the 27 covariates considered and the theory to which Savona and Vezzoli (2015) associate them. For a given year, most covariates are "lagged" (except for contagion, the dummy for oil, and the dummy for international capital markets), in that their values would be available at the start of a given year, as opposed to co-occurring with the default event. In total, these data correspond to 1998 country/year pairs. Covariate missingness is prevalent; we use the imputed values derived from the methodology outlined in Savona and Vezzoli (2015). In the Conclusions section we discuss alternative approaches that could have been incorporated directly into our methodology to handle covariate missingness.

Savona and Vezzoli (2015) is concerned with predictive models of sovereign default and therefore focuses solely on this binary outcome. We augment the default binary with three other measures that can also indicate a macroeconomy in a state of collapse. First, the country's lagged (i.e. one-year-behind) inflation rate was originally included in the Macroeconomic factors group of covariates in Savona and Vezzoli (2015). We instead treat (non-lagged) inflation as another dependent variable and note that doing so has no effect on the Default outcome; a run of BTA solely on Default with inflation included in the Macroeconomic factors gave this variable 0 inclusion probability. In addition, we collected unemployment data from the IMF website. These data were only available for a subset (897 country/year pairs) of the total data. We note that this dependent-variable missingness poses no substantive problem in terms of the derivation of SRIs: the BTA approach simply omits the missing likelihood contributions when updating the associated latent theory indices.

Finally, we collected foreign exchange rate data from the IMF website. For each country/year pair, we first computed the daily log rate changes relative to the US dollar and then took the annual maximum of these log changes. This variable therefore captures the largest single-day weakening of a currency relative to the US dollar in the course of a year, since a positive log change means one can buy more of the currency for the same amount of US dollars. We avoided commercial sources of foreign exchange data and therefore only had these values for 272 country/year pairs. See our discussion in the Conclusions section regarding expanding these data.
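The construction just described can be sketched as follows. The column names and the tiny panel are hypothetical; the actual IMF extract is larger and messier:

```python
import numpy as np
import pandas as pd

# hypothetical daily exchange-rate panel: units of local currency per USD
fx = pd.DataFrame({
    "country": ["A"] * 4 + ["B"] * 4,
    "year":    [2000] * 8,
    "rate":    [10.0, 10.5, 10.4, 11.0, 2.0, 1.9, 2.1, 2.05],
})

# daily log rate change relative to USD; positive = currency weakened
fx["dlog"] = fx.groupby(["country", "year"])["rate"].transform(
    lambda r: np.log(r).diff()
)

# annual maximum single-day weakening for each country/year pair
devaluation = fx.groupby(["country", "year"])["dlog"].max()
assert np.isclose(devaluation.loc[("B", 2000)], np.log(2.1) - np.log(1.9))
```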
A country is in default if it is classified as such by Standard & Poor's (S&P) or if it receives a large nonconcessional loan from the IMF. A nonconcessional loan carries the IMF's standard market-related interest rate, while a concessional loan has a lower interest rate. The loans we consider from the IMF must be in excess of 100 percent of quota.
Each member country has a quota, with the initial quota set when the country joins the IMF.
The quota determines, among other things, the country's access to IMF loans and its voting power. By augmenting the data from S&P with data from the IMF, we also capture near-defaults or debt restructurings avoided through loan packages from the IMF. We consider Stand-By Arrangements (SBA) and the Extended Fund Facility (EFF).
Our posterior inference is performed after running the BTA algorithm for 400 thousand iterations over these data. To verify convergence, 30 separate runs of the algorithm were executed simultaneously and the resulting output was inspected to confirm that posterior inference from each individual chain was nearly identical. Runtime on a 32-thread machine (dual 8-core 3.4 GHz AMD Ryzen processors with hyperthreading) with 128 GB of RAM was roughly 7.5 hours.

Results
We begin our discussion of the SRI results by investigating outcomes on the theory level. Table 2 shows the theory inclusion probability (i.e. the proportion of MCMC iterations in which γ rt was not constrained to zero) for each theory, across the four outcome variables. Given that the original dataset was constructed to model the default variable, it is unsurprising that all theories achieve inclusion probabilities of one for this outcome. The inflation outcome is interesting in that it suggests that proxies measuring a country's political stability best explain the upper tail of the inflation distribution. All other theories are also relevant to inflation, achieving probabilities between .38 and .41, but nowhere near as strongly as the Political factors.
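Inclusion probabilities and conditional means of this kind are simple functionals of the MCMC output; with posterior draws of a single γ rt in hand (fabricated below purely for illustration), they reduce to:

```python
import numpy as np

rng = np.random.default_rng(4)

# fabricated chain of gamma_rt draws: exactly zero when the theory is excluded
included = rng.random(10000) < 0.75
draws = np.where(included, rng.normal(size=10000), 0.0)

inclusion_prob = np.mean(draws != 0.0)          # Pr(gamma_rt != 0 | D)
conditional_mean = draws[draws != 0.0].mean()   # E[gamma_rt | inclusion, D]

assert 0.7 < inclusion_prob < 0.8
```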
The results for unemployment in Table 2 suggest a more bimodal inclusion pattern. The Insolvency and Systemic theories barely enter into the final outcome, while the Illiquidity, Macroeconomic and Political factors achieve probabilities of one. Finally, we obtain relatively low inclusion probabilities for all theories for the devaluation outcome. This is likely due in part to the relatively small amount of data that was available using public sources; see our discussion in Section 4. However, we feel this result highlights a useful feature of BTA: including this outcome variable and having the system set its theory-inclusion probabilities to zero meant there was no subsequent effect on the calculation of the theory indices.

Table 3 shows the mean value (conditional on inclusion) of the parameter γ rt for each theory and outcome pair. Since Default was ordered first in our system and achieves inclusion probabilities of 1 for all theories, this outcome serves to orient the rest of the outcomes.
In particular, a positive γ rt for the outcomes indicates that the directionality of this theory on the outcome is similar to that of default. The conditional means on the inflation outcome reflect even more clearly the importance of the Political theory relative to all others.
The value its conditional mean achieves (1.905) is more than ten times the next highest conditional mean (.124 for the Macroeconomic factor). Since the Political Theory achieved substantially higher inclusion probabilities in Table 2 this implies that the unconditional effect of the political theory index is the main driver of the upper tail of inflation.
Recalling again from Table 2 that the unemployment outcome was driven by the Illiquidity, Macroeconomic and Political factors, the results in Table 3 are interesting. They show that Illiquidity has a very strong, positive effect on the unemployment outcome. We note that "positive" is in the sense of working in the same direction as Default. The Macroeconomic and Political factors then balance this behavior; they are negatively oriented relative to the impact these factors have on Default, though of smaller magnitude than the Illiquidity theory.
Finally, as noted in Table 2, there appears to be a negligible effect of the theory indices on the devaluation outcome.

Table 4 shows the inclusion probabilities and conditional posterior means for each proxy contained in the Insolvency theory group. Five factors achieve probabilities of one. These include two factors that measure the strength of the country's balance sheet, namely the country's current account (CAY) and reserves growth (ResG). In addition, two factors that measure a country's trade balance, namely total exports balance and import growth (WX and MG respectively), are also included in the Insolvency theory with probability one, along with a measure of the country's foreign obligations (public external debt to GDP, PEDY). Table 5 shows the corresponding results for the Illiquidity group. In some sense, we find these results appealing, as a more technical measure of money supply (M2R) cannot in itself indicate whether illiquidity events are likely to arise. Furthermore, it is interesting to note that short-term debt to reserves (STDR) is given a weight of zero when included alongside the factor focused on longer-term debt servicing (DSER). Table 6 shows the inclusion probabilities for the proxies in the Macroeconomic grouping.
We see that measures related to inflation dynamics (RGRWT) and exchange rate fluctuations (OVER) are given low inclusion probabilities. Instead, a measurement of whether a country is dependent on Oil proceeds (DOil) and a technical factor related to global interest rates (UST) are the main two constituents of this theory with inclusion probabilities of .6 and 1 respectively.
Tables 7 and 8 show the inclusion probabilities for the proxies of the Political and Systemic theories respectively. Each theory has only two proxies, and all four receive high inclusion probabilities, ranging from 0.7 to 1. The Political theory is thus a blend of its two proxies. Likewise, a measure of global contagion (Cont tot) as well as local factors (Cont area) combine to form the Systemic theory, with slightly less weight (.76) placed on the local proxy.
We conclude by investigating detailed results for two of the theories, namely Insolvency and Illiquidity. Table 9 reports the five highest and lowest country/year pairs according to the insolvency index. Two of the five (i.e. 40%) of the highest pairs experience a default, substantially higher than the 6% average over all the data, showing the degree to which this feature is positively associated with default. Table 10 shows the five highest and lowest country/year pairs according to the illiquidity index. Burundi in 1991 (i.e. two years before the start of the civil war that ran between 1993 and 2005) receives the lowest Illiquidity index, followed by countries in South Asia. On the highest end, we see both Jamaica and Lesotho represented twice. In addition, Gabon in 2002 is present, a year in which the country defaulted on its sovereign debt. This contrast with Table 9 is illuminating, as it shows the trade-off between potential for insolvency and risks of illiquidity in precipitating sovereign default. We note again that two of the top five country/year pairs record a default, similar to the results of Table 9. However, when inspecting the unemployment result, we also see high levels of unemployment for four of the five top countries (and a missing value for Gabon 2002, the remaining country). Simultaneously, the countries with the lowest illiquidity indices have negligible unemployment rates.
This lines up with the results presented in Table 3, where the illiquidity index had a large, positive effect on the unemployment outcome equation. Figure 1 shows the average across all countries of each index by year. A few things stand out. First, all indices except the Political index appear to achieve their highest levels in the early-to-mid eighties, a high-interest-rate period with substantial propensity for default, inflation and high unemployment. Indeed, all four of these risk indices appear, in the aggregate, to decline as we move towards 2010. The result in Figure 1(e), which shows a consistently rising mean Political risk index, would at first seem quite compelling, implying some increasing risk due to political factors. Unfortunately, this mainly exposes a failing of the History proxy. This proxy measures the total number of defaults a country has experienced in the past and is therefore consistently increasing. See our discussion in Section 4 for potential avenues to create a more robust set of Political proxies.
Finally, we address an issue related to theory index portability. In the theoretical construction of these indices we specified an independence structure between indices. However, nothing enforces this condition in posterior estimation. If the theory indices were correlated in the posterior, this would be acceptable, but it would imply that the indices would need to be included as a set when attempting to model other phenomena.

Conclusions
We have constructed a system whose purpose is to create indices representing various theories believed to drive heterogeneity in economic outcomes. When constructing an index, interpretability is an important feature to retain, primarily because an interpretable index allows additional proxies to be found when deficiencies become apparent and allows specific results to be explained directly. Our BTA approach then forms a natural means of incorporating and resolving the obvious model uncertainty present in such a specification.
Furthermore, our focus on modeling multiple outcomes coupled with the ability to entertain a broad set of outcome sampling distributions lends our system both generalizability and flexibility.
There is considerable additional work to be done, both on the technical, algorithmic sides of BTA and also related to the specific goal of modeling an economy's potential for collapse.
One key point has been the assumption that the multiple outcome variables are independent from one another. In practice, this did not seem to be overly critical, as seen by the fact that the inflation outcome was not present in the posterior when BTA was run on default using this feature alongside the others shown above. However, incorporating outcome variable dependence should be relatively straightforward using the Gaussian copula approach of Hoff (2007). Indeed uncertainty over these conditional independence assumptions could also be model averaged using the copula Gaussian graphical model approach of Dobra and Lenkoski (2011).
In this current system, outcome equations had a linear dependence on theory indices.
While it will always be necessary to orient the indices for reasons of identification (i.e. the requirement that γ rt = 1 for at least one r with non-zero γ rt), expanded linear forms such as spline models (Wood, 2017) are entirely feasible. Indeed, a third layer of model selection would be to test between linearity and the expanded forms offered by spline modeling.
The MCMC algorithm necessary to resolve BTA was nontrivial, though by no means exceptionally complex.
As outlined as early as Rue (2001), block updates of parameters in hierarchical generalized models are often advantageous. We have in general avoided block updates at present, but such a sampling regime could speed up convergence and also reduce algorithm run-time.
One difficulty we experienced when implementing the quantile regression was the null second derivative of the asymmetric Laplace log-density. This, in turn, makes intelligent updates of the parameters of this distribution somewhat harder, since there is less information regarding posterior curvature, and thus proposals have a tendency to move too far along the posterior density surface. This feature has already been investigated in some detail in related contexts. One potential route to improved mixing would be to follow Fasiolo et al. (2017), who propose a smooth version of the pin-ball loss to aid the fitting of qgam models.
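One common way to smooth the pin-ball loss, in the spirit of, though not identical to, the loss used by Fasiolo et al. (2017), replaces the kink at zero with a softplus term; the smoothing parameter lam below is a hypothetical tuning choice:

```python
import numpy as np

def pinball(u, tau):
    """Kinked check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def smooth_pinball(u, tau, lam=0.1):
    """Softplus-smoothed check loss (tau - 1) * u + lam * log(1 + exp(u / lam)).

    As lam -> 0 this recovers rho_tau, but it is twice differentiable
    everywhere, restoring curvature information for MCMC proposals.
    """
    u = np.asarray(u, dtype=float)
    softplus = np.logaddexp(0.0, u / lam)   # numerically stable log(1 + exp(x))
    return (tau - 1.0) * u + lam * softplus

u = np.linspace(-3, 3, 601)
gap = np.max(np.abs(smooth_pinball(u, 0.9, lam=1e-3) - pinball(u, 0.9)))
assert gap < 1e-2  # the smoothed loss tracks the kinked one closely
```

The maximum discrepancy is lam * log(2), attained at u = 0, so the approximation can be made as tight as desired at the cost of steeper curvature near the kink.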
Finally, our reversible jump proposals were in some sense the least inspired part of the current system. Though mixing appeared acceptable, more focused jumps could have been constructed, by following much the same Laplacian formulations as the other model parameters.
As the great expansionary period following the global financial crisis enters its second decade, it is clear that we can expect a contractionary phase of the global business cycle sometime in the near future. Our applied interest has been to begin building a monitoring, forecasting and inferential toolset that can prepare us for this period. While we believe the current version of the SRI estimation system is encouraging, considerable work remains to be done.
First and foremost, the current dataset runs only through 2010. We intend to continue building this system to include all available years up to the present. We are broadly happy with the proxies collected to model insolvency and illiquidity in an economy. Macroeconomic and Systemic features could likely be expanded in a number of obvious ways. For instance, including information on global financial markets or personal and industrial bankruptcy information could expand the Systemic theory proxies.
However, we are convinced that the Political risk proxies can be expanded in several important ways. First, the History proxy is a useful concept, since it captures the propensity of a given country (e.g. Argentina) to repeatedly default on its debt; however, as pointed out in Figure 1, its current construction needs to be adjusted to account for the fact that it is presently monotonic in time. Secondly, aspects related to political regimes are likely to affect the potential for economic collapse. Merging our data with the regime change dataset of Reich (2002) could be one avenue to account for the effect of differing regimes and overall regime uncertainty.
Finally, it has been our hope to use only publicly available data sources to aid the reproducibility of our index construction. While we are convinced that devaluation matters should be included in our set of outcome equations, the necessary currency data has been hard to find publicly. We will continue to investigate open and public sources of currency exchange data to increase the coverage of this variable. In doing so, we hope the relatively inconclusive results relating theories to sudden devaluations can be resolved.

A Full Algorithm Details
Based on data $D$ we use MCMC to obtain a sample $\{\varsigma^{(1)}, \ldots, \varsigma^{(S)}\}$ from the posterior distribution of the full parameter set. The sampler combines four classes of updates:

• Gibbs sampling updates of the proxy coefficients $\beta_t$.

• Conditional Bayes factor updates of the theory-level models $M_t$.

• Metropolis-Hastings updates, via Laplacian expansions, of the remaining fixed-form parameters.

• Reversible jump methods for alternating $\gamma_{rt}$ between being 0 and lying in $\mathbb{R}$. Note that the moves here become especially detailed, though primarily in the sense of bookkeeping, when $\gamma_{rt}$ is currently set to 1, or if $\gamma_{rt}$ is currently zero and $r$ is smaller than all other indices with non-zero $\gamma_{rt}$. Finally, this becomes a joint reversible jump move when the model move will either turn on or shut off the theory entirely, as both $\gamma_{rt}$ and $I_t$ will be affected.
The sections below detail each of these approaches individually.

A.1 Gibbs Sampling Update of β t

To resample $\beta_t$ we note that its conditional posterior distribution satisfies
$$pr(\beta_{M_t} \mid I_t, M_t, \cdot) \propto pr(I_t \mid \beta_{M_t}, M_t)\, pr(\beta_{M_t} \mid M_t),$$
where $\beta_{M_t}$ and $X_{M_t}$ indicate the restriction to those elements and columns of $\beta_t$ and $X_t$, respectively, associated with the variables in model $M_t$. Using standard results of Bayesian regression (see e.g. Hoff 2009), we have that
$$\beta_{M_t} \mid I_t, M_t, \cdot \sim N(m_t, V_t), \qquad V_t = \left(\nu_t^{-1} X_{M_t}^{\top} X_{M_t} + \Sigma_0^{-1}\right)^{-1}, \quad m_t = V_t\, \nu_t^{-1} X_{M_t}^{\top} I_t,$$
where $\Sigma_0$ denotes the prior covariance of $\beta_{M_t}$. We therefore resample $\beta_{M_t}$ from the multivariate normal distribution above.
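A generic sketch of this conjugate update, written for an arbitrary Gaussian prior covariance Sigma0 standing in for the paper's ν_t-scaled prior (the data below are simulated, not the paper's):

```python
import numpy as np

def gibbs_beta_update(I_t, X_M, noise_var, Sigma0_inv, rng):
    """One Gibbs draw of beta_{M_t} | I_t for I = X beta + N(0, noise_var).

    Conjugate posterior: N(m, V) with
      V = (X'X / noise_var + Sigma0^{-1})^{-1},  m = V X'I / noise_var.
    """
    V = np.linalg.inv(X_M.T @ X_M / noise_var + Sigma0_inv)
    m = V @ (X_M.T @ I_t / noise_var)
    return rng.multivariate_normal(m, V)

rng = np.random.default_rng(5)
n, p = 2000, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
noise_var = 0.25
I_t = X @ beta_true + rng.normal(scale=noise_var ** 0.5, size=n)

draw = gibbs_beta_update(I_t, X, noise_var, np.eye(p), rng)
assert np.allclose(draw, beta_true, atol=0.2)  # posterior is tight at n = 2000
```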

A.2 Conditional Bayes Factors to update M t
Conditional Bayes Factors compare integrated likelihoods for models $M_t$ and a new proposal model $M_t'$, conditioning on the latent indices $I_t$. This conditioning separates the Gaussian regression components on which the models operate from the larger non-Gaussian components in the response equations, leading to an efficient sampling regime. This efficiency is present both in the availability of closed-form calculations to compare models and in the relative parsimony of the approach's exposition.
In particular, note that $pr(M_t \mid D, \cdot) \propto pr(I_t \mid M_t)\, pr(M_t)$, which implies that the latent theory indices $I_t$ separate the conditional posterior of the model $M_t$ from the data $D$ and the associated non-integrable likelihoods. The first term can then be represented by
$$pr(I_t \mid M_t) = \int pr(I_t \mid \beta_{M_t}, M_t)\, pr(\beta_{M_t} \mid M_t)\, d\beta_{M_t}.$$
The integrand above is the product of two Gaussian densities, and the integral is therefore available in closed form.
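Conditional on I_t, the comparison thus reduces to Gaussian marginal likelihoods. The following sketch uses a generic N(0, Sigma0) prior on the included coefficients as a stand-in for the paper's exact prior, and simulated indices:

```python
import numpy as np

def log_marginal(I_t, X_M, noise_var, Sigma0):
    """log pr(I_t | M_t) when I_t = X_M beta + eps, beta ~ N(0, Sigma0),
    eps ~ N(0, noise_var * Id): marginally I_t ~ N(0, X Sigma0 X' + noise_var * Id)."""
    n = len(I_t)
    cov = X_M @ Sigma0 @ X_M.T + noise_var * np.eye(n)
    sign, logdet = np.linalg.slogdet(cov)
    quad = I_t @ np.linalg.solve(cov, I_t)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

rng = np.random.default_rng(6)
n = 300
X = rng.normal(size=(n, 3))
noise_var = 0.5
# indices generated using only the first two proxies
I_t = X[:, :2] @ np.array([1.0, -1.0]) + rng.normal(scale=noise_var ** 0.5, size=n)

# conditional Bayes factor: data-generating proxy set versus a rival
lm_good = log_marginal(I_t, X[:, :2], noise_var, np.eye(2))
lm_bad = log_marginal(I_t, X[:, 2:], noise_var, np.eye(1))
assert lm_good > lm_bad  # the CBF favors the correct proxy set
```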

A.3 Metropolis-Hastings Updates via Laplacian Expansions
The two sections above dealt with parameters that could effectively be "conditioned" away from the sampling model of the dependent variables, in both cases by conditioning on the latent variables I t . This, in turn, led to updates that were straightforward to calculate as in both cases they relied on well-known results for integrals over the Gaussian distribution.
However, when conditional posterior distributions do not have a form amenable to integration or Gibbs sampling, Metropolis-Hastings algorithms provide an obvious alternative. This section therefore details all proposal distributions and acceptance ratios necessary to update these parameters.
In all cases, we follow a standard approach to creating Gaussian proposals which requires no pre-specified tuning parameters and instead adapts proposals to the local curvature of the log posterior density; see, e.g., Chapter 4 of Rue and Held (2005). Writing f(τ) for the log conditional posterior and expanding to second order about the current state τ of the MCMC chain gives a quadratic approximation with linear coefficient b = f′(τ) − f″(τ)τ and curvature c = −f″(τ). The posterior distribution pr(τ | ·) can therefore be approximated by the density of the Gaussian distribution N(b/c, c⁻¹), which we use as our proposal distribution.
This formulation relieves the user of specifying a large number of sampling tuning parameters and achieves high acceptance proportions.
The following subsections outline the specific forms of f, f′, and f″ for all variates that are updated in this manner. Since the I_it depend on all r equations, they are handled in a final, separate subsection.
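The generic step can be sketched as follows; the helper names are illustrative, and the scheme assumes f″(τ) < 0 at the evaluated states so that the proposal variance is positive.

```python
import numpy as np

def taylor_mh_step(tau, f, df, ddf, rng=None):
    """One Metropolis-Hastings step with a second-order Gaussian proposal.

    Expands the log posterior f around the current state tau:
    b = df(tau) - ddf(tau)*tau, c = -ddf(tau), proposal N(b/c, 1/c).
    The acceptance ratio accounts for the state-dependent proposal.
    """
    rng = np.random.default_rng() if rng is None else rng

    def proposal_params(x):
        b = df(x) - ddf(x) * x
        c = -ddf(x)
        return b / c, 1.0 / c  # mean, variance

    def log_q(x_to, x_from):
        m, v = proposal_params(x_from)
        return -0.5 * (np.log(2 * np.pi * v) + (x_to - m) ** 2 / v)

    m, v = proposal_params(tau)
    tau_new = m + np.sqrt(v) * rng.standard_normal()
    log_alpha = (f(tau_new) - f(tau)) + (log_q(tau, tau_new) - log_q(tau_new, tau))
    if np.log(rng.uniform()) < log_alpha:
        return tau_new
    return tau
```

When the log posterior is exactly quadratic the proposal coincides with the target and every move is accepted, which is why acceptance proportions stay high near Gaussian conditionals.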

A.3.1 Logistic Regression
If equation r is a logistic model then it has the form Y_ir ∼ Bernoulli(exp(μ_ir)/(1 + exp(μ_ir))), and thus the formulas for α_r, γ_rt require derivation (as noted above we leave I_it to a final subsection). First, note

log pr(Y_ir) = Y_ir μ_ir − log(1 + exp(μ_ir)).

Then for the global parameter α_r with prior distribution α_r ∼ N(0, 1) we have

f′(α_r) = Σ_i [Y_ir − exp(μ_ir)/(1 + exp(μ_ir))] − α_r,  f″(α_r) = −Σ_i exp(μ_ir)/(1 + exp(μ_ir))² − 1.

Similarly, for γ_rt not constrained to be 0 or 1 we assume γ_rt ∼ N(0, 1) and have

f′(γ_rt) = Σ_i I_it [Y_ir − exp(μ_ir)/(1 + exp(μ_ir))] − γ_rt,  f″(γ_rt) = −Σ_i I_it² exp(μ_ir)/(1 + exp(μ_ir))² − 1.

Finally, as it will be important in derivations for the updates of I_it, we write l(Y_ir, I_it) = Y_ir μ_ir − log(1 + exp(μ_ir)), with l̇(Y_ir, I_it) = γ_rt [Y_ir − exp(μ_ir)/(1 + exp(μ_ir))] and l̈(Y_ir, I_it) = −γ_rt² exp(μ_ir)/(1 + exp(μ_ir))².

A.3.2 Quantile Regression

Let equation r be a Bayesian Quantile Regression, i.e. Y_ir is considered asymmetric Laplace distributed with log-precision parameter κ and quantile level q, so that

log pr(Y_ir | μ_ir, κ, q) = κ − e^κ ρ_q(Y_ir − μ_ir),

where ρ_q(u) = u(q − 1{u < 0}) is the usual check-loss function. We therefore need to derive the relevant formulas for α_r, γ_rt and likelihood derivatives for I_it. We note

∂/∂μ_ir log pr(Y_ir | μ_ir, κ, q) = e^κ (q − 1{Y_ir < μ_ir}),

with second derivative zero almost everywhere. Therefore, for α_r with N(0, 1) prior we have f′(α_r) = e^κ Σ_i (q − 1{Y_ir < μ_ir}) − α_r and f″(α_r) = −1. Similarly, when γ_rt is not constrained to 0 or 1 we set γ_rt ∼ N(0, 1) and have f′(γ_rt) = e^κ Σ_i I_it (q − 1{Y_ir < μ_ir}) − γ_rt and f″(γ_rt) = −1. Likewise, we note that ∂/∂κ log pr(Y_ir | ·) = 1 − e^κ ρ_q(Y_ir − μ_ir), and thus if κ ∼ N(0, 1) in the prior, then f′(κ) = Σ_i [1 − e^κ ρ_q(Y_ir − μ_ir)] − κ and f″(κ) = −Σ_i e^κ ρ_q(Y_ir − μ_ir) − 1. Finally, for I_it we have l̇(Y_ir, I_it) = e^κ γ_rt (q − 1{Y_ir < μ_ir}) and l̈(Y_ir, I_it) = 0.
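As a numerical sanity check of the logistic-equation derivatives for α_r, the following sketch evaluates f′ and f″ under the N(0, 1) prior; the helper name and the `offset` argument (collecting Σ_t γ_rt I_it) are illustrative.

```python
import numpy as np

def logistic_alpha_derivs(alpha, Y, offset):
    """First and second derivatives of the log posterior of alpha_r
    for a logistic equation with mu_ir = alpha + offset_i and a
    N(0, 1) prior on alpha_r (illustrative sketch)."""
    mu = alpha + offset
    p = 1.0 / (1.0 + np.exp(-mu))        # exp(mu)/(1 + exp(mu))
    d1 = np.sum(Y - p) - alpha           # sum_i [Y - p] minus prior gradient
    d2 = -np.sum(p * (1.0 - p)) - 1.0    # p(1-p) = exp(mu)/(1 + exp(mu))^2
    return d1, d2
```

The derivatives agree with central finite differences of the log posterior, confirming the closed forms above.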
Therefore, to update α_r ∼ N(0, 1) we proceed exactly as above, and likewise to update any γ_rt not constrained to 0 or 1. For the term I_it we note l(Y_ir, I_it) = a(Y_ir), l̇(Y_ir, I_it) = ȧ(Y_ir) γ_rt, and l̈(Y_ir, I_it) = ä(Y_ir) γ_rt².
Now focusing on the global log-precision term κ ∼ N(0, 1), we have the analogous first and second derivatives. The calculations for the shape parameter ξ are somewhat more involved. Letting g₁ and g₂ denote the two ξ-dependent terms of the log-likelihood, we obtain their derivatives ġ₁ and ġ₂, from which it follows that ∂/∂ξ log pr(Y_ir | ·) = −ġ₁ − ġ₂.
For the second derivative, similar calculations return the analogous expression in g̈₁ and g̈₂.

A.3.4 Updating Theory Indices
We now consider updating the theory indices I_it. Noting that the conditional posterior of I_it combines its Gaussian index model with the likelihood contributions of every outcome equation, we have that f′(I_it) and f″(I_it) are obtained by summing the l̇_r and l̈_r terms across equations, where the l_r, l̇_r and l̈_r terms are those discussed in the sections above for each respective outcome equation r in the system.
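The accumulation across equations can be sketched as below; the Gaussian term stands in for the index-level model pr(I_it | M_t, β_t), whose exact mean and variance are placeholders here.

```python
def index_derivs(I, eq_derivs, prior_mean=0.0, prior_var=1.0):
    """Accumulate f' and f'' for a single theory index I_it.

    eq_derivs: list of (ldot, lddot) callables, one per outcome equation r,
    returning the first/second derivative of l_r(Y_ir, I_it) w.r.t. I_it.
    The Gaussian prior term is an illustrative stand-in for the
    index-level model pr(I_it | M_t, beta_t).
    """
    d1 = -(I - prior_mean) / prior_var   # Gaussian index-model gradient
    d2 = -1.0 / prior_var                # Gaussian index-model curvature
    for ldot, lddot in eq_derivs:
        d1 += ldot(I)                    # sum l_r-dot across equations
        d2 += lddot(I)                   # sum l_r-double-dot across equations
    return d1, d2
```

The returned pair plugs directly into the second-order proposal of Section A.3.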

A.4 Updating Theory Inclusion Parameters via Reversible Jump
Suppose now that γ_rt = 0 in the current state of the chain. In the relatively straightforward case in which there is an r′ < r for which γ_r′t = 1 (so that including γ_rt will not affect identification), we may attempt to make γ_rt non-zero by proposing γ′_rt ∼ N(0, 1). We thus transition from γ_t, where γ_rt = 0, to γ′_t with (γ′_t)_r = γ′_rt and (γ′_t)_s = (γ_t)_s for all other s ≠ r, a transformation with Jacobian 1. Letting μ_ir = α_r + Σ_{t=1}^T γ_rt I_it and μ′_ir = α_r + Σ_{t=1}^T γ′_rt I_it, and since our prior sets all γ_rs ∼ N(0, 1) so that the auxiliary density cancels with the larger prior, we have that log pr(γ′_t | ·) is, up to a constant, the log-likelihood l_r for equation r evaluated at μ′_ir. This gives the necessary log densities for comparing γ_rt ∈ ℝ and γ_rt = 0. See our discussion in the Conclusions section regarding more focused proposals of γ′_rt, which could aid in mixing and would also make the expressions above slightly more involved.
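This straightforward case (an earlier equation already identifies the theory, so no rescaling is needed) can be sketched as below. It assumes equal prior inclusion probabilities, so the N(0, 1) auxiliary and prior densities cancel and the acceptance ratio reduces to a likelihood ratio; `loglik` is a caller-supplied stand-in for l_r as a function of γ_rt.

```python
import numpy as np

def rj_gamma_move(gamma_rt, loglik, rng=None):
    """Reversible-jump move toggling gamma_rt between 0 and a real value.

    Sketch under illustrative assumptions: the N(0, 1) auxiliary density
    cancels with the N(0, 1) prior, the Jacobian is 1, and inclusion and
    exclusion have equal prior probability, so only the equation-r
    log-likelihood ratio enters the acceptance step.
    """
    rng = np.random.default_rng() if rng is None else rng
    if gamma_rt == 0.0:
        proposal = rng.standard_normal()          # birth: gamma'_rt ~ N(0, 1)
        log_alpha = loglik(proposal) - loglik(0.0)
    else:
        proposal = 0.0                            # death: set gamma'_rt = 0
        log_alpha = loglik(0.0) - loglik(gamma_rt)
    if np.log(rng.uniform()) < log_alpha:
        return proposal
    return gamma_rt
```

With a flat likelihood every proposed toggle is accepted, which makes the bookkeeping of the move easy to check in isolation.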
When γ_rt = 0 and γ_st = 1 for some s > r, additional bookkeeping is necessary to adjust the system.
In particular, we sample α ∼ N(0, 1) and use it to create a suitably rescaled new vector γ′_t. Similarly, we move from I_it to I′_it by setting I′_it = I_it/α, β′_t = β_t/α and ν′_t = αν_t. We therefore note that while we have changed all γ_st values and the associated theory indices I_it, only the likelihood for the dependent variable r is affected, and comparisons can then be performed as discussed above.