# Evaluating Forecasts, Narratives and Policy Using a Test of Invariance

## Abstract

## 1. Introduction

In his speech on the Report, the Governor stressed:

> CPI inflation looked set to rise sharply in the near term. Further out, downward pressure from the persistent margin of spare capacity was likely to bear down on inflation for some time to come.

> It is more likely than not that later this year I will need to write a letter to the Chancellor to explain why inflation has fallen more than 1 percentage point below the target (of 2%).

> The stimulus to demand, combined with a turnaround in the stock cycle and the effects of the depreciation in sterling, is likely to drive a recovery in activity.

We term the published numerical forecast a ‘direct forecast’, whereas one constructed from the narrative, as in, e.g., Ericsson (2016) for the USA, is called a ‘derived forecast’. Taken together, we call the joint production of the numerical forecast and the accompanying narrative a ‘forediction’, intended to convey a forecast made alongside a story (**diction**) that describes the forecast verbally.^{2} In this paper, we investigate whether a close link between the direct and derived forecasts sustains an evaluation of the resulting foredictions and their associated policies.

## 2. Forediction: Linking Forecasts, Narratives, and Policies

> ...forecasting does not simply amount to producing a set of figures: rather, it aims at assembling a fully-fledged view—one may call it a “story behind the figures”—of what could happen: a story that has to be internally consistent, whose logical plausibility can be assessed, whose structure is sufficiently articulated to allow one to make a systematic comparison with the wealth of information that accumulates as time goes by.

Such an approach is what we term forediction: any claims as to its success need to be evaluated in the light of the widespread forecast failure precipitated by the 2008–9 Financial Crisis. As we show below, closely tying narratives and forecasts may actually achieve the opposite of what those authors seem to infer, by rejecting both the narratives and associated policies when forecasts go wrong.

#### 2.1. Do Narratives or Forecasts Come First?

Add factors are not the only way to match forecasts to a narrative: if policy makers had a suite of forecasting models at their disposal, the weightings on different models could also be adjusted to match their pooled forecast to the narrative, or both could be modified in an iterative process. Genberg and Martinez (2014) show the link between narratives and forecasts at the International Monetary Fund (IMF), where forecasts are generated on a continuous basis through the use of a spreadsheet framework that is occasionally supplemented by satellite models, as described in Independent Evaluation Office (2014). Such forecasts ‘form the basis of the analysis [...] and of the [IMF’s] view of the outlook for the world economy’ (p. 1). Thus, adjustments to forecasts and changes to the associated narratives tend to go hand-in-hand at many major institutions.

> The projections for growth and inflation are underpinned by four key judgements.
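The reweighting of a model suite to match a narrative can be illustrated with a two-model pool. This is only a minimal sketch under stated assumptions: the function names `pooling_weight` and `pooled_forecast`, the two model forecasts, and the narrative target are all hypothetical and are not taken from any institution's procedure.

```python
def pooling_weight(forecast_a: float, forecast_b: float, narrative_target: float) -> float:
    """Weight w on model A such that w*A + (1 - w)*B equals the narrative target."""
    if forecast_a == forecast_b:
        raise ValueError("identical forecasts: reweighting cannot move the pooled forecast")
    return (narrative_target - forecast_b) / (forecast_a - forecast_b)


def pooled_forecast(weight_a: float, forecast_a: float, forecast_b: float) -> float:
    """Linear pool of two forecasts with weights weight_a and 1 - weight_a."""
    return weight_a * forecast_a + (1.0 - weight_a) * forecast_b


# Hypothetical inflation forecasts (percent) from two models, and a narrative view of 2.5%.
w = pooling_weight(forecast_a=3.0, forecast_b=1.0, narrative_target=2.5)
pooled = pooled_forecast(w, 3.0, 1.0)
print(w, pooled)
```

A weight outside the unit interval would indicate that the narrative lies outside the range spanned by the models, so no convex pool could reproduce it.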

#### 2.2. Is There a Link between Forecasts and Policy?

## 3. Forecast Failure and Forediction Failure

**by itself** is not a sufficient condition for rejecting an underlying theory or any associated forecasting model, see, e.g., Castle and Hendry (2011). Furthermore, forecasting success may, but need not, ‘corroborate’ the forecasting model and its supporting theory: see Clements and Hendry (2005). Indeed, Hendry (2006) demonstrates a robust forecasting device that can outperform the forecasts from an estimated in-sample DGP after a location shift. Nevertheless, when several rival explanations exist, forecast failure can play an important role in distinguishing between them as discussed by Spanos (2007). Moreover, systematic forecast failure almost always rejects any narrative associated with the failed forecasts and any policy implications therefrom, so inevitably results in forediction failure.
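The point that a robust device can beat the in-sample DGP after a location shift can be seen in a small simulation. The sketch below is an illustration under assumptions, not the device analysed in Hendry (2006): it uses a simple last-observation forecast as a stand-in for a robust device, and the sample sizes, means, and error standard deviation are hypothetical.

```python
import math
import random


def simulate_location_shift(n_pre, n_post, mean_pre, mean_post, sd, seed=0):
    """Series whose mean shifts from mean_pre to mean_post after n_pre observations."""
    rng = random.Random(seed)
    means = [mean_pre] * n_pre + [mean_post] * n_post
    return [m + rng.gauss(0.0, sd) for m in means]


def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def compare_devices(series, n_pre):
    """1-step forecast RMSEs over the post-shift period for two devices."""
    # Device 1: the in-sample mean, estimated before the shift and never updated.
    in_sample_mean = sum(series[:n_pre]) / n_pre
    errors_mean = [series[t] - in_sample_mean for t in range(n_pre, len(series))]
    # Device 2 (assumed stand-in for a robust device): forecast equals the last observation.
    errors_naive = [series[t] - series[t - 1] for t in range(n_pre, len(series))]
    return rmse(errors_mean), rmse(errors_naive)


series = simulate_location_shift(n_pre=100, n_post=20, mean_pre=1.0, mean_post=3.0, sd=0.1)
rmse_mean, rmse_naive = compare_devices(series, n_pre=100)
print(f"in-sample mean RMSE: {rmse_mean:.3f}, last-observation RMSE: {rmse_naive:.3f}")
```

The in-sample mean misses by roughly the size of the shift in every period, whereas the last-observation device makes one large error at the shift and then adapts, which is why forecast failure of the first device says little about whether the pre-shift model was well specified.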

#### 3.1. A Simple Policy Data Generation Process

#### 3.1.1. The DGP Shifts Unexpectedly

#### 3.1.2. The Policy Change

#### 3.1.3. The Role of Mis-Specification

#### 3.1.4. The Sources of Forecast Failure

#### 3.2. What Can Be Learned from Systematic Forecast Failure?

**formulation** of a forecasting model, or the theory from which it was derived, as appropriate estimation would avoid such problems. For example, when the forecast origin has been mis-measured by a statistical agency resulting in a large forecast error in iii(a), one should not reject either the forecasting model or the underlying theory. Nevertheless, following systematic forecast failure, any policy conclusions that had been drawn and their accompanying narrative should be rejected as incorrect, at least until more accurate data are produced.

#### 3.3. A Numerical Illustration

#### 3.4. Implications of Forediction Failure for Inter-Temporal Theories

## 4. Step-Indicator Saturation (SIS) Based Test of Invariance

#### 4.1. Potency of SIS at Stage 1

#### 4.2. Null Rejection Frequency of the SIS Test for Invariance at Stage 2

**without selection**, using an $\mathsf{F}$-test, denoted ${\mathsf{F}}_{\mathsf{Inv}}$, at significance level ${\alpha}_{2}$ which rejects when ${\mathsf{F}}_{\mathsf{Inv}\left(\mathit{\tau}=\mathbf{0}\right)}>{c}_{{\alpha}_{2}}$. Under the null of invariance, this ${\mathsf{F}}_{\mathsf{Inv}}$-test should have an approximate F-distribution, and thereby allow an appropriately sized test. Under the alternative that $\mathit{\tau}\ne \mathbf{0}$, ${\mathsf{F}}_{\mathsf{Inv}}$ will have power, as discussed in Section 4.4.
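The second-stage test can be sketched as an ordinary F-test for adding the retained step indicators to the conditional model. The code below is a minimal illustration under assumptions: the stage 1 selection is skipped and a single known step indicator is used, the static conditional model of $y_t$ on a constant and $x_t$ is hypothetical, the helper names are invented for this sketch, and the critical value is an approximate tabulated figure supplied by hand.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares from OLS of y on the given regressor columns."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(beta[i] * columns[i][t] for i in range(k)) for t in range(len(y))]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))


def f_inv(y, x, indicators):
    """F-statistic for adding the step indicators to the regression of y on (1, x)."""
    n = len(y)
    base = [[1.0] * n, x]
    rss_restricted = rss(base, y)
    rss_unrestricted = rss(base + indicators, y)
    k = len(indicators)
    dof = n - len(base) - k
    return ((rss_restricted - rss_unrestricted) / k) / (rss_unrestricted / dof)


# Hypothetical design: T = 100 with a location shift in x at T1 = 80.
rng = random.Random(1)
T, T1 = 100, 80
step = [1.0 if t >= T1 else 0.0 for t in range(T)]
x = [2.0 * s + rng.gauss(0.0, 1.0) for s in step]
eps = [rng.gauss(0.0, 0.3) for _ in range(T)]
y_null = [1.0 + 2.0 * xt + e for xt, e in zip(x, eps)]   # invariance holds
y_fail = [yt + 1.0 * s for yt, s in zip(y_null, step)]    # shift enters y directly
f_null = f_inv(y_null, x, [step])
f_fail = f_inv(y_fail, x, [step])
CRIT_1PCT = 6.9  # approximate 1% critical value of F(1, 97), assumed from tables
print(f"F_Inv under the null: {f_null:.2f}; under failure: {f_fail:.2f}; c = {CRIT_1PCT}")
```

In practice the indicators would come from selection in the marginal model, and the critical value $c_{\alpha_2}$ would be computed from the F-distribution with the appropriate degrees of freedom rather than typed in.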

#### 4.3. Monte Carlo Evidence on the Null Rejection Frequency

#### 4.3.1. Constant Marginal

#### 4.3.2. Location Shifts in $\left\{{x}_{t}\right\}$

#### 4.3.3. Variance Shifts in $\left\{{x}_{t}\right\}$

#### 4.4. Failure of Invariance

#### 4.5. Second-Stage Test

#### 4.6. Mis-Timed Indicator Selection in the Static Bivariate Case

## 5. Simulating the Potencies of the SIS Invariance Test

#### 5.1. Optimal Infeasible Indicator-Based $\mathsf{F}$-Test

#### 5.2. Potency of the SIS-Based Test

## 6. Application to the Small Artificial-Data Policy Model

#### 6.1. Multiplicative Indicator Saturation

## 7. Forecast Error Taxonomy and Associated Tests

## 8. How to Improve Future Forecasts and Foredictions

## 9. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

- Akram, Q. Farooq, and Ragnar Nymoen. 2009. Model Selection for Monetary Policy Analysis: How Important is Empirical Validity? Oxford Bulletin of Economics and Statistics 71: 35–68.
- Bank of England. 2015. Inflation Report, August, 2015. London: Bank of England Monetary Policy Committee.
- Cartwright, Nancy. 1989. Nature’s Capacities and their Measurement. Oxford: Clarendon Press.
- Castle, Jennifer L., Jurgen A. Doornik, David F. Hendry, and Felix Pretis. 2015. Detecting Location Shifts During Model Selection by Step-Indicator Saturation. Econometrics 3: 240–64.
- Castle, Jennifer L., Jurgen A. Doornik, and David F. Hendry. 2016. Robustness and Model Selection. Unpublished paper. Oxford, UK: Economics Department, University of Oxford.
- Castle, Jennifer L., and David F. Hendry. 2011. On Not Evaluating Economic Models by Forecast Outcomes. Istanbul University Journal of the School of Business Administration 40: 1–21.
- Clements, Michael P., and David F. Hendry. 1998. Forecasting Economic Time Series. Cambridge: Cambridge University Press.
- Clements, Michael P., and David F. Hendry. 2005. Evaluating a Model by Forecast Performance. Oxford Bulletin of Economics and Statistics 67: 931–56.
- Clements, Michael P., and J. James Reade. 2016. Forecasting and Forecast Narratives: The Bank of England Inflation Reports. Discussion paper. Reading, UK: Economics Department, University of Reading.
- Doornik, Jurgen A. 2007. Econometric Model Selection With More Variables Than Observations. Working paper. Oxford, UK: Economics Department, University of Oxford.
- Doornik, Jurgen A. 2009. Autometrics. In The Methodology and Practice of Econometrics. Edited by Jennifer L. Castle and Neil Shephard. Oxford: Oxford University Press, pp. 88–121.
- Doornik, Jurgen A. 2013. OxMetrics: An Interface to Empirical Modelling, 7th ed. London: Timberlake Consultants Press.
- Doornik, Jurgen A., and David F. Hendry. 2013. Empirical Econometric Modelling using PcGive: Volume I., 7th ed. London: Timberlake Consultants Press.
- Ellison, Martin, and Thomas J. Sargent. 2012. A Defense of the FOMC. International Economic Review 53: 1047–65.
- Engle, Robert F., and David F. Hendry. 1993. Testing Super Exogeneity and Invariance in Regression Models. Journal of Econometrics 56: 119–39.
- Engle, Robert F., David F. Hendry, and Jean-Francois Richard. 1983. Exogeneity. Econometrica 51: 277–304.
- Ericsson, Neil R. 2012. Detecting Crises, Jumps, and Changes in Regime. Working paper. Washington, D.C., USA: Federal Reserve Board of Governors.
- Ericsson, Neil R. 2016. Eliciting GDP Forecasts from the FOMC’s Minutes Around the Financial Crisis. International Journal of Forecasting 32: 571–83.
- Ericsson, Neil R. 2017. Economic Forecasting in Theory and Practice: An Interview with David F. Hendry. International Journal of Forecasting 33: 523–42.
- Ericsson, Neil R., and Erica L. Reisman. 2012. Evaluating a Global Vector Autoregression for Forecasting. International Advances in Economic Research 18: 247–58.
- Favero, Carlo, and David F. Hendry. 1992. Testing the Lucas Critique: A Review. Econometric Reviews 11: 265–306.
- Genberg, Hans, and Andrew B. Martinez. 2014. On the Accuracy and Efficiency of IMF forecasts: A Survey and some Extensions. IEO Background Paper BP/14/04. Washington, D.C., USA: Independent Evaluation Office of the International Monetary Fund.
- Hendry, David F. 1988. The Encompassing Implications of Feedback versus Feedforward Mechanisms in Econometrics. Oxford Economic Papers 40: 132–49.
- Hendry, David F. 2001. How Economists Forecast. In Understanding Economic Forecasts. Edited by David F. Hendry and Neil R. Ericsson. Cambridge: MIT Press, pp. 15–41.
- Hendry, David F. 2004. Causality and Exogeneity in Non-stationary Economic Time Series. In New Directions in Macromodelling. Edited by A. Welfe. Amsterdam: North Holland, pp. 21–48.
- Hendry, David F. 2006. Robustifying Forecasts from Equilibrium-Correction Models. Journal of Econometrics 135: 399–426.
- Hendry, David F., and Søren Johansen. 2015. Model Discovery and Trygve Haavelmo’s Legacy. Econometric Theory 31: 93–114.
- Hendry, David F., Søren Johansen, and Carlos Santos. 2008. Automatic Selection of Indicators in a Fully Saturated Regression. Computational Statistics 33: 317–35, Erratum, 337–39.
- Hendry, David F., and Hans-Martin Krolzig. 2005. The Properties of Automatic Gets Modelling. Economic Journal 115: C32–C61.
- Hendry, David F., and Michael Massmann. 2007. Co-breaking: Recent Advances and a Synopsis of the Literature. Journal of Business and Economic Statistics 25: 33–51.
- Hendry, David F., and Grayham E. Mizon. 2011. Econometric Modelling of Time Series with Outlying Observations. Journal of Time Series Econometrics 3: 1–26.
- Hendry, David F., and Grayham E. Mizon. 2012. Open-model Forecast-error Taxonomies. In Recent Advances and Future Directions in Causality, Prediction, and Specification Analysis. Edited by Xiaohong Chen and Norman R. Swanson. New York: Springer, pp. 219–40.
- Hendry, David F., and Grayham E. Mizon. 2014. Unpredictability in Economic Analysis, Econometric Modeling and Forecasting. Journal of Econometrics 182: 186–95.
- Hendry, David F., and Felix Pretis. 2016. Quantifying the Uncertainty around Break Dates in Models using Indicator Saturation. Working paper. Oxford, UK: Economics Department, Oxford University.
- Hendry, David F., and Carlos Santos. 2010. An Automatic Test of Super Exogeneity. In Volatility and Time Series Econometrics. Edited by Mark W. Watson, Tim Bollerslev and Jeffrey Russell. Oxford: Oxford University Press, pp. 164–93.
- Independent Evaluation Office. 2014. IMF Forecasts: Process, Quality, and Country Perspectives. Technical report. Washington: International Monetary Fund.
- Jansen, Eilev S., and Timo Teräsvirta. 1996. Testing Parameter Constancy and Super Exogeneity in Econometric Equations. Oxford Bulletin of Economics and Statistics 58: 735–63.
- Johansen, Søren, and Bent Nielsen. 2009. An Analysis of the Indicator Saturation Estimator as a Robust Regression Estimator. In The Methodology and Practice of Econometrics. Edited by Jennifer L. Castle and Neil Shephard. Oxford: Oxford University Press, pp. 1–36.
- Johansen, Søren, and Bent Nielsen. 2016. Asymptotic Theory of Outlier Detection Algorithms for Linear Time Series Regression Models. Scandinavian Journal of Statistics 43: 321–48.
- Kitov, Oleg I., and Morten N. Tabor. 2015. Detecting Structural Changes in Linear Models: A Variable Selection Approach using Multiplicative Indicator Saturation. Unpublished paper. Oxford, UK: University of Oxford.
- Krolzig, Hans-Martin, and Juan Toro. 2002. Testing for Super-exogeneity in the Presence of Common Deterministic Shifts. Annales d’Economie et de Statistique 67/68: 41–71.
- Pagan, Adrian R. 2003. Report on Modelling and Forecasting at the Bank of England. Bank of England Quarterly Bulletin Spring. Available online: http://www.bankofengland.co.uk/archive/Documents/historicpubs/qb/2003/qb030106.pdf (accessed on 5 October 2015).
- Psaradakis, Zacharias, and Martin Sola. 1996. On the Power of Tests for Superexogeneity and Structural Invariance. Journal of Econometrics 72: 151–75.
- Pretis, Felix, James Reade, and Genaro Sucarrat. 2016. General-to-Specific (GETS) Modelling And Indicator Saturation With The R Package Gets. Working paper, 794. Oxford, UK: Economics Department, Oxford University.
- Romer, Christina D., and David H. Romer. 2008. The FOMC versus the Staff: Where can Monetary Policymakers add Value? American Economic Review 98: 230–35.
- Sinclair, Tara M., Pao-Lin Tien, and Edward Gamber. 2016. Do Fed Forecast Errors Matter? CAMA Working Paper No. 47/2016. Canberra, Australia: Centre for Applied Macroeconomic Analysis (CAMA).
- Siviero, Stefano, and Daniele Terlizzese. 2001. Macroeconomic Forecasting: Debunking a Few Old Wives’ Tales. Discussion paper 395. Rome, Italy: Research Department, Banca d’Italia.
- Spanos, Aris. 2007. Curve-Fitting, the Reliability of Inductive Inference and the Error-Statistical Approach. Philosophy of Science 74: 1046–66.
- Stekler, Herman O., and Hilary Symington. 2016. Evaluating Qualitative Forecasts: The FOMC Minutes, 2006–2010. International Journal of Forecasting 32: 559–70.
- Stenner, Alfred J. 1964. On Predicting our Future. Journal of Philosophy 16: 415–28.
- Zhang, Kun, Jiji Zhang, and Bernhard Schölkopf. 2015. Distinguishing Cause from Effect Based on Exogeneity. Available online: http://arxiv.org/abs/1504.05651 (accessed on 10 October 2015).

1 | Bank of England Inflation Report, November 2009, p.47. See http://www.bankofengland.co.uk/publications/Documents/inflationreport/ir09nov.pdf. |

2 | |

3 | Bank of England Minutes of the Monetary Policy Committee, November 2009, p.8. See http://www.bankofengland.co.uk/publications/minutes/Documents/mpc/pdf/2009/mpc0911.pdf. |

4 | |

5 | All of these indicator saturation methods are implemented in the Autometrics algorithm in PcGive: see Doornik (2009) and Doornik and Hendry (2013), which can handle more variables than observations using block path searches with both expanding and contracting phases as in Hendry and Krolzig (2005), and Doornik (2007). An R version is available at https://cran.r-project.org/web/packages/gets/index.html: see Pretis et al. (2016). |

6 | As noted above, the lagged impact of the policy change causes ${x}_{191}$ to overshoot, so ${\tilde{x}}_{i,t}$ is somewhat above ${x}_{t}$ over the forecast horizon, albeit a dramatic improvement over Figure 3 (I): using 2-periods to estimate the IC solves that. |

**Figure 1.** CPI inflation and 4-quarter real GDP growth forecasts, based on market interest rate expectations and £200 billion asset purchases: November 2009 Bank of England Inflation Report, with October 2011 outturns.

**Figure 2.** (**I**, **II**) both forediction and policy failures from the forecast failure following a changed DGP with a location shift and a policy change combined; (**III**, **IV**) both forediction and policy failures after a changed DGP without a location shift but with a policy change.

**Figure 3.** (**I**) Forecast failure for ${x}_{t}$ by ${\tilde{x}}_{t}$ even with SIS; (**II**) Forecast failure in ${\tilde{y}}_{t}$ even augmented by the SIS indicators selected from the margin model for ${x}_{t}$; (**III**) Smaller forecast failure from ${\widehat{y}}_{t}$ based on the in-sample DGP; (**IV**) Least forecast failure from ${\widehat{y}}_{t}$ based on the in-sample DGP with SIS.

**Figure 4.** (**I**) Forecasts for ${y}_{t}$ by ${\tilde{y}}_{t}^{\ast}$, just using SIS${}^{se}$ for the first policy-induced shift and SIS${}_{IC}^{se}$ at observation 191; (**II**) Forecasts for ${y}_{t}$ by ${\tilde{y}}_{t}^{\ast}$ also with SIS in-sample; (**III**) Forecasts from $T=192$ for ${y}_{t}$ by ${\tilde{y}}_{i,t}$ with a 1-observation IC but without SIS; (**IV**) Forecasts for ${y}_{t}$ by ${\tilde{y}}_{i,t}$ also with SIS in-sample.

Component | Source: Mis-Estimation | Source: Mis-Specification | Source: Change |
---|---|---|---|
Equilibrium mean (Reject) | $i(a)$: FN?, PV? | $i(b)$: FM, FN, PV | $i(c)$: FM, FN, PV |
Slope parameter (Reject if $\delta \ne 0$) | $ii(a)$: FN?, PV? | $ii(b)$: FM, FN, PV | $ii(c)$: FM, FN, PV |
Unobserved terms (Reject) | $iii(a)$ [forecast origin]: FN, PV | $iii(b)$ [omitted variable]: FM?, FN?, PV? | $iii(c)$ [innovation error]: FN?, PV? |

Forecast | Narrative | Policy |
---|---|---|
The outlook for inflation is that it will be $\mathbf{6.25}\%$ this year followed by a moderate decline to $\mathbf{5.25}\%$ next year. | Core inflation has been elevated in recent months. High levels of resource utilization and high prices of energy and commodities have the potential to sustain inflationary pressures but should moderate going forward. | The Policy Committee today decided to raise its target interest rate by $\mathbf{100}$ basis points due to ongoing concerns about inflation pressures. |

In-sample | ${\gamma}_{0}$ | ${\gamma}_{1}$ | ${\gamma}_{2}$ | ${\sigma}_{\epsilon}^{2}$ | ${\beta}_{0}$ | ${\beta}_{1}$ | ${\beta}_{2}$ | ${\sigma}_{\nu}^{2}$ | $\delta $ | ${\mu}_{z}$ | ${\mu}_{{w}_{1}}$ | ${\mu}_{{w}_{2}}$ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Value | $1.0$ | $1.0$ | $1.0$ | $0.1$ | $1.0$ | $-1.0$ | $1.0$ | $0.1$ | $0$ | $0$ | $2$ | $2$ |

Out-of-sample | ${\gamma}_{0}^{\ast}$ | ${\gamma}_{1}^{\ast}$ | ${\gamma}_{2}^{\ast}$ | ${\sigma}_{\epsilon}^{2}$ | ${\beta}_{0}^{\ast}$ | ${\beta}_{1}^{\ast}$ | ${\beta}_{2}^{\ast}$ | ${\sigma}_{\nu}^{2}$ | $\delta $ | ${\mu}_{z}^{\ast}$ | ${\mu}_{{w}_{1}}$ | ${\mu}_{{w}_{2}}$ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Value | $2.0$ | $0.5$ | $2.0$ | $0.1$ | $2.0$ | $-1.5$ | $2.0$ | $0.1$ | $1$ | $1$ | $2$ | $2$ |

**Table 4.** SIS simulations under the null of super exogeneity for a constant marginal process. ‘% no indicators’ records the percentage of replications in which no step indicators are retained, so stage 2 is redundant. Stage 2 gauge records the probability of the ${\mathsf{F}}_{\mathsf{Inv}}$-test falsely rejecting for the included step indicators at ${\alpha}_{2}=0.01$.

${\alpha}_{1}=0.01={\alpha}_{2}$ | $T=50$ | $T=100$ | $T=200$ |
---|---|---|---|
Stage 1 gauge | 0.035 | 0.033 | 0.044 |
% no indicators | 0.287 | 0.098 | 0.019 |
Stage 2 gauge | 0.006 | 0.009 | 0.009 |

**Table 5.** SIS simulations under the null of super exogeneity for a location shift in the marginal process. Stage 1 gauge is for retained step indicators at times with no shifts; and stage 1 potency is for when the exact indicators matching step shifts are retained, with no allowance for mis-timing.

${\alpha}_{1}=0.01$ | $\pi=2$, $T=50$ | $\pi=2$, $T=100$ | $\pi=2$, $T=200$ | $\pi=10$, $T=50$ | $\pi=10$, $T=100$ | $\pi=10$, $T=200$ |
---|---|---|---|---|---|---|
Stage 1 gauge | 0.034 | 0.027 | 0.043 | 0.018 | 0.018 | 0.035 |
Stage 1 potency | 0.191 | 0.186 | 0.205 | 0.957 | 0.962 | 0.965 |
% no indicators | 0.067 | 0.022 | 0.000 | 0.000 | 0.000 | 0.000 |
Stage 2 gauge | 0.009 | 0.010 | 0.011 | 0.010 | 0.009 | 0.010 |
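The stage 1 gauge and potency reported in these tables can be computed from the sets of retained indicators across replications. The following is a minimal sketch, assuming gauge is the average share of irrelevant candidate indicators retained and potency is the average share of true shift dates retained exactly; the function name, the toy retention sets, and the count of candidate indicators are hypothetical.

```python
def gauge_and_potency(retained_by_replication, true_shift_dates, n_candidates):
    """Average retention rates across Monte Carlo replications.

    gauge:   share of irrelevant candidate indicators that were retained
    potency: share of true shift dates retained exactly (no allowance for mis-timing)
    """
    true_set = set(true_shift_dates)
    n_irrelevant = n_candidates - len(true_set)
    gauge_sum = 0.0
    potency_sum = 0.0
    for retained in retained_by_replication:
        retained = set(retained)
        gauge_sum += len(retained - true_set) / n_irrelevant
        if true_set:
            potency_sum += len(retained & true_set) / len(true_set)
    m = len(retained_by_replication)
    potency = potency_sum / m if true_set else None
    return gauge_sum / m, potency


# Toy example: 4 replications, 100 candidate step indicators, one true shift at t = 80.
retained_sets = [{80}, {80, 12}, set(), {79}]
gauge, potency = gauge_and_potency(retained_sets, true_shift_dates=[80], n_candidates=100)
print(f"gauge = {gauge:.4f}, potency = {potency:.2f}")
```

Note that the indicator retained at $t=79$ counts towards gauge rather than potency here, which mirrors the caption's statement that no allowance is made for mis-timing.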

**Table 6.** SIS simulations under the null of super exogeneity for a variance shift in the marginal process. Legend as for Table 5.

${\alpha}_{1}=0.01$ | $\theta=2$, $T=50$ | $\theta=2$, $T=100$ | $\theta=2$, $T=200$ | $\theta=10$, $T=50$ | $\theta=10$, $T=100$ | $\theta=10$, $T=200$ |
---|---|---|---|---|---|---|
Stage 1 gauge | 0.042 | 0.051 | 0.067 | 0.060 | 0.083 | 0.113 |
Stage 1 potency | 0.030 | 0.030 | 0.035 | 0.041 | 0.043 | 0.071 |
% no indicators | 0.381 | 0.135 | 0.015 | 0.342 | 0.091 | 0.005 |
Stage 2 gauge | 0.006 | 0.008 | 0.009 | 0.006 | 0.009 | 0.010 |

**Table 7.** Power of the optimal infeasible $\mathsf{F}$-test for a failure of invariance using a known step indicator for ${\alpha}_{2}=0.01$ at ${T}_{1}=80$, $T=100$, $M=1000$.

$d:\rho$ | 0.75 | 1 | 1.5 | 1.75 |
---|---|---|---|---|
1 | 1.000 | 1.000 | 0.886 | 0.270 |
2 | 1.000 | 1.000 | 1.000 | 0.768 |
2.5 | 1.000 | 1.000 | 1.000 | 0.855 |
3 | 1.000 | 1.000 | 1.000 | 0.879 |
4 | 1.000 | 1.000 | 1.000 | 0.931 |

**Table 8.** Stage 1 gauge and potency at ${\alpha}_{1}=0.01$ for ${T}_{1}=80$, $T=100$, $M=1000$ and $\beta =2$.

$d:\rho$ | Gauge, 0.75 | Gauge, 1 | Gauge, 1.5 | Gauge, 1.75 | Potency, 0.75 | Potency, 1 | Potency, 1.5 | Potency, 1.75 |
---|---|---|---|---|---|---|---|---|
1 | 0.038 | 0.040 | 0.041 | 0.039 | 0.231 | 0.223 | 0.227 | 0.204 |
2 | 0.029 | 0.028 | 0.030 | 0.030 | 0.587 | 0.575 | 0.603 | 0.590 |
2.5 | 0.026 | 0.026 | 0.025 | 0.025 | 0.713 | 0.737 | 0.730 | 0.708 |
3 | 0.023 | 0.023 | 0.024 | 0.025 | 0.820 | 0.813 | 0.803 | 0.817 |
4 | 0.020 | 0.021 | 0.020 | 0.022 | 0.930 | 0.930 | 0.922 | 0.929 |

**Table 9.** Stage 2 potency for a failure of invariance at ${\alpha}_{2}=0.01$, ${T}_{1}=80$, $T=100$, and $M=1000$.

$d:\rho$ | ${\alpha}_{1}=0.025$, 0.75 | ${\alpha}_{1}=0.025$, 1 | ${\alpha}_{1}=0.025$, 1.5 | ${\alpha}_{1}=0.025$, 1.75 | ${\alpha}_{1}=0.01$, 0.75 | ${\alpha}_{1}=0.01$, 1 | ${\alpha}_{1}=0.01$, 1.5 | ${\alpha}_{1}=0.01$, 1.75 | ${\alpha}_{1}=0.005$, 0.75 | ${\alpha}_{1}=0.005$, 1 | ${\alpha}_{1}=0.005$, 1.5 | ${\alpha}_{1}=0.005$, 1.75 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.994 | 0.970 | 0.378 | 0.051 | 0.918 | 0.908 | 0.567 | 0.127 | 0.806 | 0.798 | 0.535 | 0.123 |
2 | 1.000 | 1.000 | 0.966 | 0.355 | 1.000 | 1.000 | 0.994 | 0.604 | 0.999 | 0.999 | 0.997 | 0.653 |
2.5 | 1.000 | 1.000 | 0.995 | 0.499 | 1.000 | 1.000 | 0.999 | 0.744 | 1.000 | 0.999 | 0.998 | 0.786 |
3 | 1.000 | 1.000 | 0.999 | 0.594 | 1.000 | 1.000 | 1.000 | 0.821 | 0.999 | 0.999 | 1.000 | 0.861 |
4 | 1.000 | 1.000 | 0.998 | 0.712 | 1.000 | 1.000 | 0.999 | 0.912 | 0.999 | 1.000 | 0.999 | 0.942 |

Component | Problem: Mis-Estimation | Problem: Mis-Specification | Problem: Change |
---|---|---|---|
Equilibrium mean | $i(a)$ [uncertainty]. Source: $\left({\mu}_{y,e}-{\tilde{\mu}}_{y}\right)$. Test: SIS, IIS | $i(b)$ [inconsistent]. Source: $+\left({\mu}_{y}-{\mu}_{y,e}\right)$. Test: SIS, IIS | $i(c)$ [shift]. Source: $+\left({\mu}_{y}^{\ast}-{\mu}_{y}\right)$. Test: SIS, IIS, SIS${}^{se}$ |
Slope parameter ($\delta \ne 0$) | $ii(a)$ [uncertainty]. Source: $+\left({\left({\lambda}_{1}{\theta}_{1}\right)}_{e}-{\tilde{\lambda}}_{1}{\tilde{\theta}}_{1}\right)\times \left({z}_{T}-{\mu}_{z}+\delta \right)$. Test: SIS | $ii(b)$ [inconsistent]. Source: $+\left({\gamma}_{1}{\beta}_{1}-{\left({\lambda}_{1}{\theta}_{1}\right)}_{e}\right)\times \left({z}_{T}-{\mu}_{z}+\delta \right)$. Test: SIS | $ii(c)$ [break]. Source: $+\left({\gamma}_{1}^{\ast}{\beta}_{1}^{\ast}-{\gamma}_{1}{\beta}_{1}\right)\times \left({z}_{T}-{\mu}_{z}+\delta \right)$. Test: MIS, SIS, IIS, SIS${}^{se}$ |
Unobserved terms | $iii(a)$ [forecast origin]. Source: $-\left({\gamma}_{1}^{\ast}{\beta}_{1}^{\ast}-{\left({\lambda}_{1}{\theta}_{1}\right)}_{e}\right)\times \left({z}_{T}-{\tilde{z}}_{T}\right)$. Test: SIS, IIS | $iii(b)$ [omitted variable]. Source: $+{\gamma}_{2}^{\ast}\left({w}_{1,T+1}-{\mu}_{{w}_{1}}\right)+{\gamma}_{1}^{\ast}{\beta}_{2}^{\ast}\left({w}_{2,T+1}-{\mu}_{{w}_{2}}\right)$. Test: IIS, SIS | $iii(c)$ [innovation error]. Source: $+{\epsilon}_{T+1}+{\gamma}_{1}^{\ast}{\nu}_{T+1}$. Test: IIS |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Castle, J.L.; Hendry, D.F.; Martinez, A.B. Evaluating Forecasts, Narratives and Policy Using a Test of Invariance. *Econometrics* **2017**, *5*, 39.
https://doi.org/10.3390/econometrics5030039
