1. | A useful discussion of the inherent trade-off between theoretical and empirical coherence can be found in Pagan (2003). |

2. | |

3. | See also An and Schorfheide (2007) for a survey of Bayesian methods used to evaluate DSGE models and an extensive list of related references. |

4. | In the present paper, we follow Pagan (2003) by using an unrestricted VAR as a standard benchmark to assess the empirical relevance of our proposed model. Potential extensions to Bayesian VARs are left for future research (though imposing a DSGE-type prior density on a VAR in order to improve its theoretical relevance could negatively impact its empirical performance). |

5. | A similar message was delivered by Jerome Powell at his swearing-in ceremony as the new Chair of the Federal Reserve: “The success of our institution is really the result of the way all of us carry out our responsibilities. We approach every issue through a rigorous evaluation of the facts, theory, empirical analysis and relevant research. We consider a range of external and internal views; our unique institutional structure, with a Board of Governors in Washington and 12 Reserve Banks around the country, ensures that we will have a diversity of perspectives at all times. We explain our actions to the public. We listen to feedback and give serious consideration to the possibility that we might be getting something wrong. There is great value in having thoughtful, well-informed critics.” (See https://www.federalreserve.gov/newsevents/speech/powell20180213a.htm for the complete speech, given at the ceremonial swearing-in on February 13, 2018.) |

6. | |

7. | |

8. | By doing so, we avoid producing “series with spurious dynamic relations that have no basis in the underlying data-generating process” (Hamilton 2018) as well as “mistaken inferences about the strength and dynamic patterns of relationships” (Wallis 1974). |

9. | There is no evidence that seasonality plays a determining role in recessions and recoveries. Therefore, without loss of generality, we rely upon seasonally adjusted data instead of substantially increasing the number of model parameters by inserting quarterly dummies, potentially in every equation of the state VAR and/or ECM processes. |

10. | NBER recession dating is based upon GDP growth, not per capita GDP growth. However, our objective is not that of dating recessions, for which there exists an extensive and expanding literature. Instead, our objective is that of tracking macroeconomic aggregates at times of rapid changes, and for that purpose per capita data can be used without loss of generality. Note that, if needed, per capita projections can be back-transformed ex post into global projections. |

11. | Since we rely upon real data, it is apparent that the great ratios vary considerably over time. Most importantly, their long-term dynamics appear to be largely synchronized with business cycles, providing a solid basis for our main objective of tracking recessions. |

12. | It is sometimes argued that in order to be interpreted as structural and/or to be instrumental for policy analysis, a parameter needs to be time-invariant. We find such a narrow definition to be unnecessarily restrictive and often counterproductive. The very fact that some key structural parameters are found to vary over time in ways that are linked to business cycles and can be inferred from a state VAR process paves the way for policy interventions on these variables, which might not be available under the more restricted interpretation of structural parameters. An example is provided in Section 5.7. |

13. | Potential exogenous variables are omitted for ease of notation. |

14. | It is also meant to be parsimonious in the sense that the number of state variables in ${s}_{t}$ has to be less than the number of equations. |

15. | The benchmark VAR process for $\Delta {x}_{t}$ is given by $\Delta {x}_{t}={Q}_{0}+{Q}_{1}\Delta {x}_{t-1}+{Q}_{2}{x}_{t-2}+{w}_{t}$. |
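
As an illustration only, the benchmark equation in note 15 could be fitted by equation-by-equation OLS along the following lines. This is a minimal sketch, not the authors' code: the function name `fit_benchmark_var`, the simulated series, and all numerical values are hypothetical, and OLS is assumed as the estimator because the VAR is described as unrestricted.

```python
import numpy as np

def fit_benchmark_var(x):
    """OLS fit of dx_t = Q0 + Q1 dx_{t-1} + Q2 x_{t-2} + w_t.

    x: array of shape (T, k) holding the levels x_t (hypothetical input).
    Returns (Q0, Q1, Q2, residuals).
    """
    x = np.asarray(x, dtype=float)
    k = x.shape[1]
    dx = np.diff(x, axis=0)          # dx[i] = x[i+1] - x[i]
    y = dx[1:]                       # left-hand side dx_t
    lag_dx = dx[:-1]                 # regressor dx_{t-1}
    lag_x = x[:-2]                   # regressor x_{t-2}
    Z = np.hstack([np.ones((y.shape[0], 1)), lag_dx, lag_x])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    Q0 = coef[0]
    Q1 = coef[1:1 + k].T
    Q2 = coef[1 + k:].T
    return Q0, Q1, Q2, y - Z @ coef

# Simulated check with made-up coefficients (assumption, not from the paper):
# a stable VAR(2) in levels, x_t = c + A1 x_{t-1} + A2 x_{t-2} + w_t,
# implies Q0 = c, Q1 = A1 - I and Q2 = A1 + A2 - I.
rng = np.random.default_rng(0)
c = np.array([0.1, -0.2])
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
T = 5000
x = np.zeros((T, 2))
for t in range(2, T):
    x[t] = c + A1 @ x[t - 1] + A2 @ x[t - 2] + rng.normal(size=2)

Q0_hat, Q1_hat, Q2_hat, resid = fit_benchmark_var(x)
```

The comment in the simulation also shows why a lagged level appears alongside a lagged difference: the equation is a reparametrization of a second-order VAR in levels.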

16. | See Appendix A for the full description of the data. |

17. | |

18. | The risk aversion parameter $\varphi $ could also be considered, except for the fact that it is loosely identified to the extent that letting $\varphi $ vary over time serves no useful purpose, and worse, can negatively impact the subsequent recursive invariance of the model. |

19. | For ease of interpretation, the second component of ${r}_{t}^{\ast}$ is redefined as the sum of the original great ratios in Equation (11). |

20. | It follows that standard cointegration rank tests are not applicable in this context. Bierens and Martins (2010) propose a vector ECM likelihood ratio test for time-invariant cointegration against time-varying cointegration. However, it is not applicable as such to our two-stage model and, more importantly, Figure 2 offers clear empirical evidence in favor of time-varying cointegration. |

21. | Individual elimination would be undermined by the fact that the estimated residual covariance matrix ${\widehat{\mathrm{\Sigma}}}_{D}$ is ill-conditioned, with condition numbers of the order of $2.4\times {10}^{-5}$, which raises concerns about the validity of asymptotic critical values for system test statistics. One advantage of the sequential system elimination is that we can rely upon standard single-equation t- and F-test statistics. |

22. | Both eliminations appear to be meaningful. First, equilibrium adjustments in ${n}_{t}$ are undoubtedly impeded by factors beyond agents' control. Second, the elimination of $\Delta {\widehat{d}}_{t}$ is likely driven by the fact that the quarterly variations of ${\widehat{d}}_{t}$ are too small to have a significant impact on $\Delta {x}_{t}^{o}\left(\lambda \right)$. |

23. | We conduct the simulations with auxiliary draws from the error terms in Equations (3) and (4), based on recursive estimates of ${\mathrm{\Sigma}}_{A}$ and ${\mathrm{\Sigma}}_{D}$. |

24. | We investigated a number of alternative time windows and arrived at similar qualitative results. |

25. | |

26. | It is important to note that the MAE and RMSE have inherent shortcomings because they measure a single variable’s forecast properties at a single horizon (see Clements and Hendry 1993). While measures do exist for assessing forecast accuracy for multiple series across multiple horizons, we believe that they would not impact our conclusions in view of the evidence provided further below (tables, figures, and hedgehog graphs). |
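
To make the limitation in note 26 concrete, the sketch below computes the MAE and RMSE for one variable at one horizon. The function name `mae_rmse` and the numbers are purely illustrative assumptions. Each call summarizes a single series at a single horizon, so a multi-series, multi-horizon evaluation needs one call per pair.

```python
import numpy as np

def mae_rmse(forecasts, actuals):
    """Return (MAE, RMSE) for one variable at one forecast horizon."""
    err = np.asarray(forecasts, dtype=float) - np.asarray(actuals, dtype=float)
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return mae, rmse

# Made-up forecasts and outcomes; forecast errors are 0, -1 and 2.
mae, rmse = mae_rmse([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
```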

27. | Analogous figures for all other coefficients of the VAR-ECM model and of the VAR benchmark are presented in Figures S2–S4 of the Online Supplementary Material and confirm the overall recursive invariance of our estimates and those of the VAR benchmark. |

28. | |

29. | Note that the 95 percent confidence intervals are those of the 1000 individual MC draws. The mean forecasts are much more accurate, since their standard deviations equal those of the individual draws divided by the square root of 1000. |
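
The distinction drawn in note 29 can be sketched numerically as follows. The draws are simulated from a normal distribution with made-up location and scale, which is an assumption for illustration only and not the paper's predictive distribution. Independence of the draws is also assumed, since the square-root rule relies on it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 1000
# Hypothetical MC forecast draws for one variable at one horizon.
draws = rng.normal(loc=2.0, scale=0.5, size=n_draws)

# 95 percent interval of the individual draws.
lower, upper = np.percentile(draws, [2.5, 97.5])

# The mean forecast and its standard error: the standard deviation of the
# draws divided by the square root of the number of draws.
mean_forecast = draws.mean()
sd_draws = draws.std(ddof=1)
se_mean = sd_draws / np.sqrt(n_draws)
```

With 1000 draws the standard error of the mean is roughly 32 times smaller than the standard deviation of an individual draw.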

30. | The average CRPS is given by $\mathrm{CRPS}\left(j,i;\widehat{\lambda}\right)=\frac{1}{{N}_{j}}{\sum}_{{T}_{\ast}\in {W}_{j}}{\int}_{\mathbb{R}}{\left[{\widehat{F}}_{m}\left({\widehat{x}}_{{T}_{\ast}+i}\right)-\mathbf{1}\left({\widehat{x}}_{{T}_{\ast}+i}\ge {x}_{{T}_{\ast}+i}\right)\right]}^{2}\mathrm{d}{\widehat{x}}_{{T}_{\ast}+i}$, where ${\widehat{x}}_{{T}_{\ast}+i}$ stands for ${\widehat{x}}_{{T}_{\ast}+i}\left({T}_{\ast};\widehat{\lambda}\right)$ and ${\widehat{F}}_{m}$ denotes the predictive CDF. See Grimit et al. (2006, Formula (3)) for the discrete version of the CRPS. |
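
A sample-based evaluation of the integral in note 30 can be sketched as below, under the assumption that ${\widehat{F}}_{m}$ is the empirical CDF of the MC draws. In that case the integral equals the mean absolute error of the draws minus half their mean absolute pairwise difference, a standard identity for ensemble forecasts. Whether this coincides term by term with the discrete formula cited in the note is not verified here, and the function names and inputs are hypothetical.

```python
import numpy as np

def crps_ensemble(draws, actual):
    """CRPS of the empirical CDF of `draws` against the realized value."""
    draws = np.asarray(draws, dtype=float)
    # Mean absolute distance between the draws and the outcome.
    term_obs = np.mean(np.abs(draws - actual))
    # Half the mean absolute difference over all pairs of draws.
    term_spread = 0.5 * np.mean(np.abs(draws[:, None] - draws[None, :]))
    return float(term_obs - term_spread)

def average_crps(draws_by_origin, actuals):
    """Average the CRPS over the forecast origins of one window."""
    scores = [crps_ensemble(d, a) for d, a in zip(draws_by_origin, actuals)]
    return float(np.mean(scores))

# Two made-up forecast origins: draws {0, 2} against outcome 1, and a
# degenerate forecast at 1 against outcome 3.
score_a = crps_ensemble([0.0, 2.0], 1.0)
score_b = crps_ensemble([1.0, 1.0, 1.0], 3.0)
avg = average_crps([[0.0, 2.0], [1.0, 1.0, 1.0]], [1.0, 3.0])
```

For a degenerate forecast the score reduces to the absolute forecast error, which is why the CRPS is often described as generalizing the absolute error to full predictive distributions.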

31. | The CRPS accounts for the full predictive CDF and as such was not used as one of the calibration criteria for $\widehat{\lambda}$ since our objective is that of producing mean rather than point forecasts. |

32. | A four-quarter lag allows us to produce four-step-ahead forecasts without ex-ante forecasting any of the auxiliary series added to the VAR-ECM baseline model. Four-step-ahead forecasts are available upon request; they were not included in the paper as they only further confirm the ex-ante forecasting delays already illustrated in Figures A1, A2 and A3. |

33. | The history of earlier postwar recessions unambiguously suggests that even if such series were available for the entire postwar period, they would likely fail to explain earlier recessions and would, therefore, be irrelevant at those times. Hence, we believe that any potential bias resulting from the missing data would also be insignificant. This is confirmed further by the fact that the auxiliary series incorporated into the VAR component of the VAR-ECM model turn out to be largely insignificant for the Great Recession, even though they are directly related to its cause. |