# Detecting Location Shifts during Model Selection by Step-Indicator Saturation


## Abstract


## 1. Introduction

## 2. Step-Indicator Saturation

## 3. Null Retention Frequency (Gauge) of Step-Indicator Selection

**Figure 1.** Illustrating split-half one-cut step-indicator saturation (SIS) under the null of no shift in Equation (4).

#### Retention of Step Indicators Under the Null Hypothesis of No Shift

| α | Gauge: overall | Gauge: ${D}_{1}$ | Gauge: ${D}_{2}$ |
|---|---|---|---|
| 0.001 | 0.0018 | 0.0018 | 0.0018 |
| 0.01 | 0.013 | 0.013 | 0.013 |
| 0.05 | 0.056 | 0.057 | 0.054 |
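The gauge figures above can be approximated by a small Monte Carlo experiment. The following is a minimal sketch, not the authors' implementation. It assumes a null DGP of i.i.d. standard normal $y_t$ with an always-retained intercept, step indicators of the form $1_{\{t\le j\}}$, a single one-cut decision per block followed by one re-selection over the union, and a normal approximation to the critical value. The function names, sample size, and replication count are illustrative choices.

```python
import numpy as np
from statistics import NormalDist


def step_indicators(T):
    """Columns j = 1..T-1 of 1{t <= j}; the all-ones column (j = T) is the intercept."""
    t = np.arange(1, T + 1)[:, None]
    j = np.arange(1, T)[None, :]
    return (t <= j).astype(float)


def t_ratios(y, X):
    """OLS t-ratios for every column of X."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])
    return beta / np.sqrt(s2 * np.diag(xtx_inv))


def retained(y, S, cols, crit):
    """One-cut selection: keep indicators in `cols` with |t| > crit (intercept forced)."""
    if len(cols) == 0:
        return np.array([], dtype=int)
    X = np.column_stack([np.ones(len(y)), S[:, cols]])
    tr = t_ratios(y, X)[1:]  # drop the intercept's t-ratio
    return cols[np.abs(tr) > crit]


def split_half_gauge(T=100, alpha=0.01, reps=200, seed=0):
    """Share of irrelevant step indicators retained under the null of no shift."""
    rng = np.random.default_rng(seed)
    S = step_indicators(T)
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # assumption: normal critical value
    first = np.arange(0, T // 2)        # indicators j = 1..T/2
    second = np.arange(T // 2, T - 1)   # indicators j = T/2+1..T-1
    kept = 0
    for _ in range(reps):
        y = rng.standard_normal(T)      # null DGP: no location shift
        union = np.concatenate([retained(y, S, first, crit),
                                retained(y, S, second, crit)])
        kept += len(retained(y, S, union, crit))  # re-select over the union
    return kept / (reps * (T - 1))
```

Calling `split_half_gauge(alpha=0.01)` returns the fraction of the $T-1$ candidate indicators retained across replications. Under these assumptions it should lie close to the nominal significance level.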

## 4. Analytical Power of a Step-Indicator Test for a Known Mean Shift

**Theorem 1.** Let $\psi_{\lambda_1}^{*}=\sqrt{T^{*}}\lambda_1/\sigma_{\epsilon}$ be the non-centrality of the t-test of $\mathsf{H}_0$: $\delta_{T_1}=0$ in Equation (7) for the DGP in Equation (6), where $T^{*}=T_1(T-T_1)/T$; then:

**Proof.** As $\sum_{t=1}^{T}1_{\{t\le T_1\}}=\sum_{t=1}^{T_1}1_{\{t\le T_1\}}$, estimating (7) delivers:
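The non-centrality in Theorem 1 is easy to evaluate numerically. The sketch below computes $\sqrt{T^{*}}\lambda_1/\sigma_\epsilon$ and compares it with the average simulated t-ratio on a correctly timed step indicator. Equations (6) and (7) are not reproduced here, so the code assumes a DGP of the form $y_t=\lambda_1 1_{\{t\le T_1\}}+\epsilon_t$ with Gaussian errors and a regression on an intercept and that single step. The function names and parameter values are hypothetical.

```python
import numpy as np


def noncentrality(T, T1, lam, sigma):
    """sqrt(T*) * lambda_1 / sigma_eps with T* = T1 (T - T1) / T, as in Theorem 1."""
    t_star = T1 * (T - T1) / T
    return np.sqrt(t_star) * lam / sigma


def mean_t_ratio(T, T1, lam, sigma, reps=2000, seed=0):
    """Average t-ratio on the step 1{t <= T1} when its timing is known (assumed DGP)."""
    rng = np.random.default_rng(seed)
    step = (np.arange(1, T + 1) <= T1).astype(float)
    X = np.column_stack([np.ones(T), step])
    xtx_inv = np.linalg.inv(X.T @ X)
    t_vals = np.empty(reps)
    for r in range(reps):
        y = lam * step + sigma * rng.standard_normal(T)
        beta = xtx_inv @ X.T @ y
        resid = y - X @ beta
        s2 = resid @ resid / (T - 2)
        t_vals[r] = beta[1] / np.sqrt(s2 * xtx_inv[1, 1])
    return t_vals.mean()


# Example: T = 100, T1 = 20, shift of two error standard deviations.
psi = noncentrality(100, 20, 2.0, 1.0)  # sqrt(16) * 2 = 8.0
```

With $T=100$ and $T_1=20$, $T^{*}=16$, so a shift of $2\sigma_\epsilon$ gives a non-centrality of 8. The simulated mean t-ratio from `mean_t_ratio(100, 20, 2.0, 1.0)` should be close to this value under the assumed DGP.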

## 5. Potency of SIS for an Unknown Location Shift

#### 5.1. Unknown Shift Period Matched by a Single Step Indicator

**Theorem 2.** The distribution of the least-squares estimator of ${\gamma}_{\left(1\right)}$ in Equation (13) is:

**r** is a $T/2\times 1$ vector with unity in the ${T}_{1}$-th position, and zeroes elsewhere.

**Proof.** From Equation (11):

**r** is a $T/2\times 1$ vector with unity in the ${T}_{1}$-th position and zeroes elsewhere, so:

**Theorem 3.** The distribution of the least-squares estimator of ${\gamma}_{\left(2\right)}$ in Equation (27) is:

**c** is a $T/2\times 1$ vector of ones, and ${\mathbf{j}}^{\prime}$ is a $T/2\times 1$ vector of zeroes other than unity in its first position, so only the first element of ${\widehat{\gamma}}_{\left(2\right)}$ depends on ${\lambda}_{1}$.

**Proof.** From (11), estimation yields:

#### 5.2. Simulating an Unknown Shift Period Matched by a Single Indicator

**Table 2.** Retention frequency of ${\iota}_{{T}_{1}}$ for varying shift lengths $l$ and magnitudes ${\lambda}_{1}$, at $\alpha =0.01$.

| Algorithm | ${\lambda}_{1}$ | $l=1$ | $l=5$ | $l=10$ | $l=20$ | $l=35$ |
|---|---|---|---|---|---|---|
| Known shift | $2{\sigma}_{\epsilon}$ | 0.56 (2.77) | 0.98 (4.72) | 0.99 (6.27) | 1.00 (8.17) | 1.00 (9.65) |
| | $4{\sigma}_{\epsilon}$ | 0.99 (5.59) | 1.00 (9.50) | 1.00 (12.57) | 1.00 (16.36) | 1.00 (19.31) |
| Split-half one-cut | $2{\sigma}_{\epsilon}$ | 0.15 (1.43) | 0.12 (1.42) | 0.13 (1.47) | 0.14 (1.44) | 0.16 (1.47) |
| | $4{\sigma}_{\epsilon}$ | 0.61 (2.88) | 0.61 (2.88) | 0.63 (2.92) | 0.59 (2.88) | 0.60 (2.92) |
| Split-half sequential | $2{\sigma}_{\epsilon}$ | 0.17 (3.01) | 0.50 (3.68) | 0.57 (4.63) | 0.56 (5.86) | 0.56 (6.92) |
| | $4{\sigma}_{\epsilon}$ | 0.89 (4.10) | 0.93 (6.81) | 0.93 (8.96) | 0.92 (11.62) | 0.93 (13.70) |
| Multi-path | $2{\sigma}_{\epsilon}$ | 0.41 (3.89) | 0.57 (5.24) | 0.57 (6.33) | 0.58 (7.78) | 0.55 (8.65) |
| | $4{\sigma}_{\epsilon}$ | 0.95 (5.89) | 0.93 (9.45) | 0.95 (11.73) | 0.93 (14.79) | 0.92 (16.57) |

**Figure 3.** Comparing SIS on a single shift without (open) and with (shaded) sequential selection. (a) shows the time series ${y}_{t}$ with a location shift; (b) and (c) the simulated estimator and test-statistic densities for split-half and sequential selection; and (d) their simulated variances.

#### 5.3. Misspecified Indicator Timing

**Figure 4.** Fitted and actual values for four step-indicator specifications to a location shift at $t=21$. (a): known shift with t-value and non-centrality ${\varphi}_{r}$; (b): shift approximated by a step one period late; (c): shift approximated by a step five periods late, both with non-centralities ${\varphi}_{s}$; and (d): SIS selection.

**Table 3.** Potency of $1_{\{t\le T_1\}}$ for varying break lengths ${T}_{1}$ and accuracy of timing using Autometrics.

| $\lambda =2{\sigma}_{\epsilon}$ | ${T}_{1}=4$ | ${T}_{1}=5$ | ${T}_{1}=10$ | ${T}_{1}=15$ | ${T}_{1}=20$ |
|---|---|---|---|---|---|
| ${T}_{1}$ | 0.58 | 0.55 | 0.59 | 0.59 | 0.59 |
| ${T}_{1}\pm 1$ | 0.76 | 0.77 | 0.79 | 0.83 | 0.81 |
| ${T}_{1}\pm 2$ | 0.84 | 0.87 | 0.86 | 0.90 | 0.89 |
| ${T}_{1}\pm 3$ | 0.89 | 0.91 | 0.92 | 0.93 | 0.92 |

#### 5.4. Unknown Shift Requiring a Two-Step Indicator in One-Half Sample

**r** replaced by **s**, so selecting indicators in the latter half of the sample should remain as before.

**Table 4.** Split-half sequential selection: gauge and retention frequencies for a shift with two indicators.

| ${\lambda}_{1}$ | Gauge: ${\mathbf{D}}_{1}$ | Gauge: ${\mathbf{D}}_{2}$ | Retention: ${T}_{1}$ step | Retention: ${T}_{2}$ step |
|---|---|---|---|---|
| $2{\sigma}_{\epsilon}$ | 0.020 | 0.011 | 0.52 | 0.55 |
| $4{\sigma}_{\epsilon}$ | 0.004 | 0.017 | 0.91 | 0.94 |

#### 5.5. Unknown Opposite-Signed Shifts in Each Split Half

| ${\lambda}_{1,2}$ | Gauge: ${\mathbf{D}}_{1}$ | Gauge: ${\mathbf{D}}_{2}$ | Retention: ${T}_{1}$ step | Retention: ${T}_{2}$ step | Retention: ${T}_{3}$ step | Retention: ${T}_{4}$ step |
|---|---|---|---|---|---|---|
| $2{\sigma}_{\epsilon}$ | 0.021 | 0.045 | 0.52 | 0.55 | 0.57 | 0.56 |
| $4{\sigma}_{\epsilon}$ | 0.004 | 0.030 | 0.91 | 0.94 | 0.93 | 0.93 |

#### 5.6. Unknown Equal Shifts in Each Split Half

**Table 6.** Split-half sequential selection and multi-path: unknown equal shifts in each half, $\alpha =0.01$.

| Algorithm | ${\lambda}_{1}$ | Gauge: ${\mathbf{D}}_{1}$ | Gauge: ${\mathbf{D}}_{2}$ | Gauge: ${\mathbf{D}}_{1}$ & ${\mathbf{D}}_{2}$ | Retention: ${T}_{1}$ step | Retention: ${T}_{2}$ step | Retention: ${T}_{3}$ step | Retention: ${T}_{4}$ step |
|---|---|---|---|---|---|---|---|---|
| Sequential | $2{\sigma}_{\epsilon}$ | 0.021 | 0.044 | - | 0.52 | 0.55 | 0.59 | 0.60 |
| | $4{\sigma}_{\epsilon}$ | 0.005 | 0.030 | - | 0.91 | 0.94 | 0.94 | 0.94 |
| Multi-path | $2{\sigma}_{\epsilon}$ | - | - | 0.038 | 0.53 | 0.48 | 0.55 | 0.55 |
| | $4{\sigma}_{\epsilon}$ | - | - | 0.018 | 0.87 | 0.91 | 0.94 | 0.92 |

#### 5.7. Unknown Shift Period Spanning Both Splits

**Table 7.** Split-half sequential selection and multi-path: shift spanning both splits, $\alpha =0.01$.

| Algorithm | ${\lambda}_{1}$ | Gauge: ${\mathbf{D}}_{1}$ | Gauge: ${\mathbf{D}}_{2}$ | Gauge: ${\mathbf{D}}_{1}$ & ${\mathbf{D}}_{2}$ | Retention: ${T}_{1}$ step | Retention: ${T}_{2}$ step | Retention: ${T}_{3}$ step | Retention: ${T}_{4}$ step |
|---|---|---|---|---|---|---|---|---|
| Sequential | $2{\sigma}_{\epsilon}$ | 0.011 | 0.039 | - | 0.58 | 0.001 | 0.0 | 0.56 |
| | $4{\sigma}_{\epsilon}$ | 0.002 | 0.02 | - | 0.94 | 0.0 | 0.0 | 0.93 |
| Multi-path | $2{\sigma}_{\epsilon}$ | - | - | 0.029 | 0.57 | 0.01 | 0.01 | 0.55 |
| | $4{\sigma}_{\epsilon}$ | - | - | 0.019 | 0.94 | 0.02 | 0.02 | 0.96 |

#### 5.8. Summary of the Simulation Results

## 6. Generalization to Retained Regressors

**Z** as independent variables. For a single-step shift with unknown timing requiring two indicators, the DGP is then given by:

| ${\lambda}_{1}$ | Gauge | Retention: ${T}_{1}$ step | Retention: ${T}_{2}$ step |
|---|---|---|---|
| $2{\sigma}_{\epsilon}$ | 0.035 | 0.50 | 0.62 |
| $4{\sigma}_{\epsilon}$ | 0.024 | 0.91 | 0.94 |

## 7. Comparisons with Least Angle Regression

LARS Step | 1 | 2 | 3 | 4 | 5 | Cross-Validated |
---|---|---|---|---|---|---|

Gauge | 0.029 | 0.047 | 0.061 | 0.073 | 0.084 | 0.017 |

**Table 10.** Potency and gauge of cross-validated LARS for single shifts of lengths $l$ and magnitudes ${\lambda}_{1}$.

| | ${\lambda}_{1}$ | $l=1$ | $l=5$ | $l=10$ | $l=20$ | $l=35$ |
|---|---|---|---|---|---|---|
| Potency | $2{\sigma}_{\epsilon}$ | 0.298 | 0.800 | 0.844 | 0.853 | 0.854 |
| Potency | $4{\sigma}_{\epsilon}$ | 0.780 | 0.990 | 0.990 | 0.996 | 1.00 |
| Gauge | $2{\sigma}_{\epsilon}$ | 0.018 | 0.050 | 0.054 | 0.056 | 0.058 |
| Gauge | $4{\sigma}_{\epsilon}$ | 0.020 | 0.052 | 0.055 | 0.058 | 0.059 |

**Table 11.** Potency and gauge of cross-validated LARS for same-sign, equal-magnitude shifts over ${T}_{1}=25$ to ${T}_{2}=35$ and ${T}_{3}=75$ to ${T}_{4}=85$.

| ${\lambda}_{1}$ | Gauge | ${T}_{1}$ potency | ${T}_{2}$ potency | ${T}_{3}$ potency | ${T}_{4}$ potency |
|---|---|---|---|---|---|
| $2{\sigma}_{\epsilon}$ | 0.167 | 0.829 | 0.839 | 0.871 | 0.818 |
| $4{\sigma}_{\epsilon}$ | 0.166 | 0.997 | 1.00 | 1.00 | 0.992 |

**Table 12.** Potency and gauge of cross-validated LARS for equal-magnitude, opposite-signed shifts over ${T}_{1}=25$ to ${T}_{2}=35$ and ${T}_{3}=75$ to ${T}_{4}=85$.

| ${\lambda}_{1}$ | Gauge | ${T}_{1}$ potency | ${T}_{2}$ potency | ${T}_{3}$ potency | ${T}_{4}$ potency |
|---|---|---|---|---|---|
| $2{\sigma}_{\epsilon}$ | 0.164 | 0.836 | 0.837 | 0.857 | 0.867 |
| $4{\sigma}_{\epsilon}$ | 0.160 | 0.997 | 1.00 | 0.998 | 0.997 |

## 8. Non-Linearity and SIS

- (1) the DGP includes non-linear variables, but no shifts, and SIS is applied;
- (2) there is a shift in the dependent variable, but no non-linearity, and SIS is not applied; and
- (3) setting (2), but SIS is applied.

**Table 13.** The effect of using SIS in Equation (42) on the rejection frequencies when the ${x}_{t}^{k}$ are always retained and retention frequencies when they are selected over (both at 1%).

| Variable | Null rejection at 1% ($x$s retained): without SIS | Null rejection at 1% ($x$s retained): with SIS | Retention at 1% ($x$s selected over): without SIS | Retention at 1% ($x$s selected over): with SIS |
|---|---|---|---|---|
| $x$, $\psi =3$ | 0.65 | 0.68 | 0.68 | 0.63 |
| ${x}^{2}$, $\psi =3$ | 0.64 | 0.67 | 0.68 | 0.56 |
| ${x}^{3}$, $\psi =3$ | 0.65 | 0.68 | 0.67 | 0.62 |
| SIS gauge | - | 0.02 | - | 0.03 |

**Table 14.** The impact of SIS in Equation (42) on the null rejection frequencies of ${x}_{t}^{k}$, which are always retained, and the retention frequencies when they are selected over, with and without SIS in the presence of a step-shift ${\lambda}_{1}$ at ${T}_{1}=35$ (all at 1%).

| Variable | Null rejection ($x$s retained), ${\lambda}_{1}=2{\sigma}_{\epsilon}$: no SIS | Null rejection, ${\lambda}_{1}=2{\sigma}_{\epsilon}$: with SIS | Null rejection, ${\lambda}_{1}=4{\sigma}_{\epsilon}$: no SIS | Null rejection, ${\lambda}_{1}=4{\sigma}_{\epsilon}$: with SIS | Retention ($x$s selected over), ${\lambda}_{1}=2{\sigma}_{\epsilon}$: no SIS | Retention, ${\lambda}_{1}=2{\sigma}_{\epsilon}$: with SIS | Retention, ${\lambda}_{1}=4{\sigma}_{\epsilon}$: no SIS | Retention, ${\lambda}_{1}=4{\sigma}_{\epsilon}$: with SIS |
|---|---|---|---|---|---|---|---|---|
| $x$, $\psi =0$ | 0.22 | 0.06 | 0.73 | 0.03 | 0.41 | 0.02 | 0.83 | 0.02 |
| ${x}^{2}$, $\psi =0$ | 0.29 | 0.06 | 0.87 | 0.03 | 0.66 | 0.02 | 0.96 | 0.02 |
| ${x}^{3}$, $\psi =0$ | 0.26 | 0.06 | 0.80 | 0.02 | 0.30 | 0.02 | 0.83 | 0.01 |
| ${T}_{1}$ step | - | 0.51 | - | 0.93 | - | 0.62 | - | 0.94 |
| SIS gauge | - | 0.02 | - | 0.02 | - | 0.02 | - | 0.02 |

## 9. Conclusion

## Acknowledgements

## Author Contributions

## Conflicts of Interest


^{1} We are currently investigating a range of designed break functions, including interactions with regressors to detect parameter changes.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Castle, J.L.; Doornik, J.A.; Hendry, D.F.; Pretis, F.
Detecting Location Shifts during Model Selection by Step-Indicator Saturation. *Econometrics* **2015**, *3*, 240-264.
https://doi.org/10.3390/econometrics3020240
