Article

Detecting Location Shifts during Model Selection by Step-Indicator Saturation

1
Magdalen College and Institute for New Economic Thinking, Oxford Martin School, Oxford University, Eagle House, Walton Well Road, Oxford OX2 6ED, UK
2
Economics Department and Institute for New Economic Thinking, Oxford Martin School, Oxford University, Eagle House, Walton Well Road, Oxford OX2 6ED, UK
*
Author to whom correspondence should be addressed.
Academic Editor: Kerry Patterson
Econometrics 2015, 3(2), 240-264; https://doi.org/10.3390/econometrics3020240
Received: 16 February 2015 / Accepted: 26 March 2015 / Published: 14 April 2015

Abstract

To capture location shifts in the context of model selection, we propose selecting significant step indicators from a saturating set added to the union of all of the candidate variables. The null retention frequency and approximate non-centrality of a selection test are derived using a ‘split-half’ analysis, the simplest specialization of a multiple-path block-search algorithm. Monte Carlo simulations, extended to sequential reduction, confirm the accuracy of nominal significance levels under the null and show retentions when location shifts occur, improving the non-null retention frequency compared to the corresponding impulse-indicator saturation (IIS)-based method and the lasso.
Keywords: structural breaks; model selection; Monte Carlo; indicator saturation; Autometrics

1. Introduction

Unmodelled location shifts (changes in the previous unconditional means of data) can have pernicious effects on the constancy of models and on forecast performance. In sample, an unmodelled location shift entails that empirical models will be misspecified, potentially affecting which variables, lags and non-linear functions are selected, distorting parameter estimation and inference, as well as inducing non-constancy: see [1]. Out of sample, an unanticipated location shift at or near the forecast origin can lead to forecast failure: see [2]. Consequently, we consider step-indicator saturation (SIS) to detect location shifts as part of a model selection strategy, building on the developments of impulse-indicator saturation (IIS). Hendry et al. [3] derive the null distribution of IIS for independent, identically distributed (IID) data, and [4] generalize that analysis to dynamic regression models (possibly with unit roots). Hendry and Santos [5] propose an IIS-based test of super exogeneity, building on [6]; and [7,8,9,10,11,12] provide empirical applications of IIS.
Indicator saturation methods (such as IIS and SIS) are feasible because software, like Autometrics, can handle more candidate variables $N$ than observations $T$ during model selection using a combination of expanding and contracting multiple block searches, as described in [13], [14] (Chapter 19) and [15]. In this selection context, the null retention frequency of indicators is called the gauge by [16], akin to the size of a test denoting its (false) null rejection frequency, but taking into account that indicators that are insignificant on a pre-assigned criterion may nevertheless be retained to offset what would otherwise be a significant misspecification test. Johansen and Nielsen [4] establish that when small nominal significance levels $\alpha$ (e.g., $\alpha \le 0.01$) are used for selection in IIS, on average $\alpha T$ indicators are retained despite testing $T$ indicators, so the gauge is approximately $\alpha$; see also [17] for a discussion of outlier detection algorithms. The non-null retention frequency when selecting indicators is called its potency, akin to a similar test's power for rejecting a false null hypothesis: see [10] for simulations of IIS under the alternative. The analytic derivations below use the split-half one-cut approach in [3] for establishing the distribution under the null of no shifts, checked by simulations that also compare split-half sequential selection outcomes and a multi-path search algorithm, extended to non-null alternatives.
When the locations, durations, magnitudes and signs of multiple shifts are unknown, we show that selection by SIS can be beneficial. There are many extant tests for multiple breaks, such as those proposed by [18,19,20]. Our interest is in joint selection over candidate variables, dynamic reactions and possible non-linearities, investigated in detail by [14], and since location shifts can approximate non-linearities and vice versa (see, e.g., [21]), we do not explore the pre-testing route. Importantly, SIS requires knowledge of neither the locations of breaks nor the maximum number of shifts, and it does not impose a minimum break length, so it allows shifts to occur at the start and/or end of the sample. Alternative model selection methods include the lasso and least angle regression (LARS), as described by [22,23] respectively. The lasso and LARS work well for a single step shift, but once multiple breaks occur, the forward selection approach they adopt means that selection over multiple step functions can fail to detect shifts, as no single step function is highly correlated with any of the multiple breaks.
The structure of this paper is as follows. Section 2 describes step-indicator saturation, then Section 3 derives its gauge in a split-half one-cut analysis. Monte Carlo experiments are reported following each theory section, so the gauge of SIS is simulated in Subsection 3. The numerical results are based on simulating in Ox and Autometrics: see [24,25]. Section 4 investigates the power of a step indicator to detect a known mean shift and relates that to well-known procedures, such as the test of [26]. We next consider selecting indicators for unknown shifts in Section 5, first Subsection 5.1, which develops the basic analytical tools, and is simulated in Subsection 5.2. Subsection 5.3 considers the effects of misspecifying the timing of an indicator. The basic setting is generalized in Subsection 5.4 to unknown shifts requiring a two-step (off-on-off) indicator; two opposite-signed shifts where one lies in each half in Subsection 5.5, with the same-signed shifts in each half in Subsection 5.6; then, to an unknown shift spanning both splits in Subsection 5.7, with a summary of the simulation results in Subsection 5.8. A generalization to retained regressors is noted in Section 6. Section 7 provides comparisons with LARS, and Section 8 investigates the impact on selecting non-linearity when SIS is applied. Section 9 concludes.

2. Step-Indicator Saturation

Consider adding a complete set of step indicators, $S_1 = \{1_{\{t \le j\}},\ j = 1, \ldots, T\}$, to a regression model, where $1_{\{t \le j\}} = 1$ for observations up to $j$, and zero otherwise. Step indicators are the cumulation of impulse indicators up to each next observation. As whole-sample vectors, step indicators take the form $\boldsymbol{\iota}_1' = (1, 0, 0, \ldots, 0)$, $\boldsymbol{\iota}_2' = (1, 1, 0, \ldots, 0)$, …, $\boldsymbol{\iota}_T' = (1, 1, 1, \ldots, 1)$, the last of which is the intercept dummy. As in [27], $n$ valid conditioning variables $\mathbf{z}_t$ can be retained without selection for $n < T/2$. Under the null of no shifts, the initial specification is:
$$y_t = \beta_0 + \boldsymbol{\beta}_1' \mathbf{z}_t + u_t \quad \text{where} \quad u_t \sim \mathsf{IN}[0, \sigma_u^2] \tag{1}$$
where $\mathsf{IN}[0, \sigma_u^2]$ denotes an independent normal distribution with mean zero and variance $\sigma_u^2$. Adding the saturating set of step indicators $S_1$ to Equation (1) creates:
$$y_t = \beta_0 + \boldsymbol{\beta}_1' \mathbf{z}_t + \sum_{j=1}^{T-1} \delta_j 1_{\{t \le j\}} + u_t \tag{2}$$
It is infeasible to estimate (2) directly, as it has more regressors than observations, but the split-half one-cut approach used to understand IIS also applies to SIS. First consider $n = 0$. Add the first $k = T/2$ indicators to Equation (1) and record which have significant coefficients at significance level $\alpha$. As $k < T$ and the indicators are deterministic, conventional inference applies. Drop the first half and add the second block of $T/2$ indicators to the original model (1), again recording which are significant in that subset. Finally, combine the recorded variables (if any) from the two stages, and select again at significance level $\alpha$. Under the null, setting $\alpha = 1/T$, on average $\alpha T/2$ indicators (namely half an indicator) will be retained by chance at each sub-step, so on average $\alpha T = 1$ indicator will be retained from the combined stage, and one degree of freedom is lost on average. Hendry et al. [3] show that other splits, such as using $r$ splits of size $T/r$, or unequal splits, do not affect the gauge, or simulation-based distributions. When $n \neq 0$ and $n + k > T$, divide the total set of $N = n + T$ candidate variables into smaller sub-blocks, setting $\alpha = 1/N$ overall.
When m indicators are selected in a congruent representation at significance level α:
$$y_t = \beta_0 + \boldsymbol{\beta}_1' \mathbf{z}_t + \sum_{i=1}^{m} \phi_{i,\alpha} 1_{\{t \le T_i\}} + v_t \quad \text{where} \quad v_t \sim \mathsf{IN}[0, \sigma_v^2] \tag{3}$$
and the coefficients of significant indicators are denoted $\phi_{i,\alpha}$. Despite some similarities between the procedures in IIS and SIS, there are important differences necessitating a new analysis. First, while impulse indicators are mutually orthogonal, step indicators overlap increasingly as their second index increases. Second, for an impulse, or more generally, a location shift that is not at either end, say from $T_1$ to $T_2$, two indicators are required to characterize it: $1_{\{t \le T_2\}} - 1_{\{t < T_1\}}$. Third, the ease of detection may be affected by whether location shifts occur with similar, or opposite, signs and magnitudes. Although we use a split-half one-cut approach for analysis, in practice, the combination of expanding and contracting multiple block searches as implemented in Autometrics should be applied. Bergamelli and Urga [28] undertake extensive simulations of IIS, SIS and their extensions to trend breaks, as well as comparisons with the sequential break tests in [29] and, using different block partitions, find similar results to those reported below.
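The definitions above can be illustrated with a minimal sketch in plain Python. The function names and the small sample size $T = 8$ are assumptions made purely for illustration; they are not part of the paper's Ox/Autometrics code.

```python
def impulse(T, j):
    """Impulse indicator: 1 at observation j (1-based), 0 elsewhere."""
    return [1 if t == j else 0 for t in range(1, T + 1)]

def step(T, j):
    """Step indicator 1{t <= j}: 1 up to and including observation j."""
    return [1 if t <= j else 0 for t in range(1, T + 1)]

def cumulate_impulses(T, j):
    """Sum the impulse indicators for observations 1..j."""
    total = [0] * T
    for i in range(1, j + 1):
        total = [a + b for a, b in zip(total, impulse(T, i))]
    return total

T = 8  # illustrative sample size
# Each step indicator is the cumulation of impulses up to its end date.
assert all(step(T, j) == cumulate_impulses(T, j) for j in range(1, T + 1))
# The last step indicator is the intercept.
assert step(T, T) == [1] * T

# A shift lasting from T1 to T2 (inclusive) needs two step indicators:
# 1{t <= T2} - 1{t < T1}, where 1{t < T1} is the step ending at T1 - 1.
T1, T2 = 3, 5
shift = [a - b for a, b in zip(step(T, T2), step(T, T1 - 1))]
print(shift)  # [0, 0, 1, 1, 1, 0, 0, 0]
```

The overlap between neighbouring step indicators is visible directly: `step(T, 4)` and `step(T, 5)` differ in a single observation, whereas two impulse indicators never share a non-zero entry.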
There are many possible specifications of step indicators, but the choice should have little impact on the detection of location shifts, corroborated by simulation comparisons. More generally, the shape of shift indicators can be specified in light of subject-matter considerations. For example, to detect the impact of volcanic eruptions on tree-ring measures of temperature, [30] designed a ν-shaped formulation that proved successful; an ogive shape could represent a slower step shift as in the logistic smooth-transition approach of [31]; or a neural network: see, e.g., [32].

3. Null Retention Frequency (Gauge) of Step-Indicator Selection

To investigate the gauge of SIS, we consider the simplest constant-parameter data generation process (DGP):
$$y_t = \mu + \epsilon_t \quad \text{where} \quad \epsilon_t \sim \mathsf{IN}[0, \sigma_\epsilon^2] \tag{4}$$
As with IIS, we use the split-half one-cut approach under the null, so there are $T/2$ indicators for the first half:
$$y_t = \mu + \sum_{j=1}^{T/2} \delta_j 1_{\{t \le j\}} + u_t \tag{5}$$
which can be estimated directly, indicators being retained when their estimated coefficients $\hat{\delta}_j$ satisfy $|t_{\hat{\delta}_j}| > c_\alpha$, where $c_\alpha$ is the critical value for significance level $\alpha$. Under the null, on average $\alpha T/2$ of the indicators will be retained by chance. Their locations are recorded; all of those indicators are dropped, and the second set is then investigated in a similar way, now including indicators $1_{\{t \le j\}}$ for $j = T/2 + 1, \ldots, T - 1$. Add the step indicators selected in each half to Equation (4), and re-select, keeping only significant indicators: an F-test of their joint significance could be conducted.
Figure 1 illustrates the split-half one-cut approach to SIS for Equation (4) when $\mu = 10$, $\sigma_\epsilon^2 = 1$ and $T = 100$. The three rows correspond to the three stages: add the first half of the indicators, the second half, then the selected indicators combined. The three columns report the indicators entered, the indicators retained and the fitted and actual values of the selected model. Fifty indicators are added, and two are retained in Row 1. When the second half is entered (Row 2), none is retained. Selecting over the two retained indicators again retains them. Different and multiple splits or unequal divisions entered into Equation (4) should not affect the retention probability under the null: a step shift should only be retained when it is present, here by chance due to a run of same-signed $\epsilon_t$ of sufficient magnitude over a sub-sample. In practice, multiple block searches, as in Autometrics, will be needed, as the null may be false.
Figure 1. Illustrating split-half one-cut step-indicator saturation (SIS) under the null of no shift in Equation (4).
When n is small, adding strongly exogenous regressors to the baseline Equation (4) as in Equation (1) will still allow unrestricted estimation of each half, so the above analysis is unaffected.
Non-normal, but continuous symmetric distributions $f(\cdot)$, with at least eight finite moments, entail using the appropriate critical value $c_\alpha$, where $\alpha = 1 - \int_{-c_{\alpha/2}}^{c_{\alpha/2}} f(u)\,\mathrm{d}u$. If conventional critical values are used for selection, [10] show that, e.g., IIS will retain indicators corresponding to what are judged 'outliers' relative to the normal when the error distribution is fat-tailed, and we anticipate that SIS will do likewise.
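To make the role of the critical value concrete, the following sketch solves $\alpha = 1 - \int_{-c}^{c} f(u)\,\mathrm{d}u$ for $c$ under two symmetric densities with unit variance. The Laplace distribution is an assumption chosen only because its tail probability has a closed form and all its moments are finite; it is not a case examined in the paper.

```python
import math
from statistics import NormalDist

def critical_value_normal(alpha):
    """c such that P(|Z| > c) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def critical_value_laplace(alpha):
    """c such that P(|X| > c) = alpha for a Laplace variable with unit variance.

    A Laplace distribution with scale b has variance 2 b^2 and
    P(|X| > c) = exp(-c / b), so unit variance means b = 1 / sqrt(2).
    """
    b = 1 / math.sqrt(2)
    return -b * math.log(alpha)

def laplace_tail_at(c):
    """Two-sided tail probability of the unit-variance Laplace beyond +/- c."""
    return math.exp(-c * math.sqrt(2))

alpha = 0.01
c_norm = critical_value_normal(alpha)
c_lap = critical_value_laplace(alpha)
print(round(c_norm, 3), round(c_lap, 3))
# Rejection frequency when the normal critical value is applied to Laplace errors.
print(round(laplace_tail_at(c_norm), 4))
```

In this illustration, applying the normal critical value to the fatter-tailed errors rejects more often than the nominal $\alpha$, which is the mechanism behind retaining indicators for 'outliers' relative to the normal.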

Retention of Step Indicators Under the Null Hypothesis of No Shift

We investigated the properties of SIS using simulations coded in the Ox programming language and replicated $M = 1000$ times. First, the model in (4) is estimated using the split-half one-cut approach where $\mu = 0$, $\epsilon_t \sim \mathsf{IN}[0, \sigma_\epsilon^2]$ and $\sigma_\epsilon^2 = 1$, for a sample size $T = 100$ and various values of $\alpha$. Table 1 records the retention frequency of irrelevant indicators (gauge) overall, as well as for the first $T/2$ ($D_1$) and second $T/2$ ($D_2$) sets of indicators. The overall retention frequency of irrelevant indicators is close to $\alpha$, so on average, $\alpha T$ irrelevant step indicators are retained under the null that no shifts occur.
Table 1. Proportion of irrelevant retained indicators under the null of no shift.

| $\alpha$ | Gauge: overall | Gauge: $D_1$ | Gauge: $D_2$ |
|---|---|---|---|
| 0.001 | 0.0018 | 0.0018 | 0.0018 |
| 0.01 | 0.013 | 0.013 | 0.013 |
| 0.05 | 0.056 | 0.057 | 0.054 |
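A rough feel for the gauge can be obtained with a small Monte Carlo in plain Python. This is only a sketch of the first-block, one-cut stage, not the paper's Ox code: it relies on the closed form that holds when an intercept and the first $T/2$ step indicators are fitted (the first half is fitted exactly and the intercept equals the second-half mean), and it assumes a fixed critical value of 2.7, the approximate 1% value quoted in Subsection 5.1. The function names and the seed are illustrative.

```python
import math
import random

def first_block_t_stats(y):
    """t-statistics for step indicators 1..T/2 added to y_t = mu + u_t.

    Uses the closed form for this saturated block: the first half is fitted
    exactly, mu is estimated by the second-half mean, delta_j = y_j - y_{j+1}
    for j < T/2, and delta_{T/2} = y_{T/2} - (second-half mean).
    """
    T = len(y)
    h = T // 2
    second = y[h:]
    mean2 = sum(second) / len(second)
    # Residuals are non-zero only in the second half; T/2 + 1 parameters are fitted.
    s2 = sum((v - mean2) ** 2 for v in second) / (T - h - 1)
    s = math.sqrt(s2)
    # Forward differences have variance 2 * sigma^2.
    t_stats = [(y[j] - y[j + 1]) / (s * math.sqrt(2)) for j in range(h - 1)]
    # The last indicator's coefficient has variance sigma^2 * (1 + 2 / T).
    t_stats.append((y[h - 1] - mean2) / (s * math.sqrt(1 + 1 / len(second))))
    return t_stats

def simulate_gauge(T=100, reps=1000, c_alpha=2.7, mu=0.0, seed=12345):
    """Share of first-block step indicators with |t| > c_alpha under the null."""
    rng = random.Random(seed)
    retained = 0
    for _ in range(reps):
        y = [mu + rng.gauss(0.0, 1.0) for _ in range(T)]
        retained += sum(abs(t) > c_alpha for t in first_block_t_stats(y))
    return retained / (reps * (T // 2))

gauge = simulate_gauge()
print(f"first-block gauge at c_alpha = 2.7: {gauge:.4f}")
```

The sketch omits the second block and the final re-selection stage, so it should be read as a check on the order of magnitude of the retention rate rather than as a replication of Table 1.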

4. Analytical Power of a Step-Indicator Test for a Known Mean Shift

We next investigate the power of a step indicator to detect a known mean shift from $\lambda_1 \neq 0$ to $\lambda_1 = 0$ at time $0 < T_1 < T/2$ in the DGP:
$$y_t = \mu + \lambda_1 1_{\{t \le T_1\}} + \epsilon_t \quad \text{where} \quad \epsilon_t \sim \mathsf{IN}[0, \sigma_\epsilon^2] \tag{6}$$
where $\lambda_1 \neq 0$, so the shift is from $\mu + \lambda_1$ to $\mu$. To determine the power of a step-indicator test to detect the shift in Equation (6), the nesting model when the shift is known is:
$$y_t = \varphi + \delta_{T_1} 1_{\{t \le T_1\}} + v_t \tag{7}$$
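Theorem 1. Let $\psi_{\lambda_1}^* = \sqrt{T^*}\lambda_1/\sigma_\epsilon$ be the non-centrality of the t-test of $\mathsf{H}_0\colon \delta_{T_1} = 0$ in Equation (7) for the DGP in Equation (6), where $T^* = T_1(T - T_1)/T$, then:
$$t_{\hat{\delta}_{T_1}} = \frac{\sqrt{T^*}\,\hat{\delta}_{T_1}}{\hat{\sigma}_\epsilon} \approx \frac{\sqrt{T^*}\,(\hat{\delta}_{T_1} - \lambda_1)}{\sigma_\epsilon} + \frac{\sqrt{T^*}\,\lambda_1}{\sigma_\epsilon} \sim \mathsf{N}\left[\psi_{\lambda_1}^*, 1\right] \tag{8}$$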
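Proof. As $\sum_{t=1}^{T} 1_{\{t \le T_1\}} = \sum_{t=1}^{T_1} 1_{\{t \le T_1\}} = T_1$, estimating (7) delivers:
$$\begin{pmatrix} \hat{\varphi} - \mu \\ \hat{\delta}_{T_1} - \lambda_1 \end{pmatrix} = \begin{pmatrix} T & T_1 \\ T_1 & T_1 \end{pmatrix}^{-1} \begin{pmatrix} \sum_{t=1}^{T} \epsilon_t \\ \sum_{t=1}^{T_1} \epsilon_t \end{pmatrix} = \begin{pmatrix} \bar{\epsilon}_2 \\ \bar{\epsilon}_1 - \bar{\epsilon}_2 \end{pmatrix}$$
where $\bar{\epsilon}_1 = T_1^{-1} \sum_{t=1}^{T_1} \epsilon_t$, etc., and:
$$\mathsf{V}\begin{bmatrix} \hat{\varphi} - \mu \\ \hat{\delta}_{T_1} - \lambda_1 \end{bmatrix} = \sigma_\epsilon^2 (T - T_1)^{-1} \begin{pmatrix} 1 & -1 \\ -1 & T_1^{-1}(T - T_1) + 1 \end{pmatrix}.$$
For the DGP in Equation (6):
$$\sqrt{T^*}\left(\hat{\delta}_{T_1} - \lambda_1\right) \sim \mathsf{N}\left[0, \sigma_\epsilon^2\right] \tag{9}$$
Hence, neglecting the estimation uncertainty in $\hat{\sigma}_\epsilon^2$:
$$t_{\hat{\delta}_{T_1}} = \frac{\sqrt{T^*}\,\hat{\delta}_{T_1}}{\hat{\sigma}_\epsilon} \approx \frac{\sqrt{T^*}\,(\hat{\delta}_{T_1} - \lambda_1)}{\sigma_\epsilon} + \frac{\sqrt{T^*}\,\lambda_1}{\sigma_\epsilon} \sim \mathsf{N}\left[\psi_{\lambda_1}^*, 1\right] \tag{10}$$
where $T^* = T_1$ when there is no intercept. ☐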
Then, $\psi_{\lambda_1}^*$ in Equation (10) is $\sqrt{T^*}$ times the corresponding non-centrality for an individual impulse indicator in [5], so $t_{\hat{\delta}_{T_1}}$ will have considerable power when $\lambda_1 \neq 0$.
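Theorem 1 translates directly into a small calculator. The sketch below computes $T^*$ and $\psi_{\lambda_1}^*$ and then an approximate rejection probability from the $\mathsf{N}[\psi_{\lambda_1}^*, 1]$ approximation; the function names are hypothetical, and the values $T = 100$, $\lambda_1 = 2\sigma_\epsilon$ and $c_\alpha = 2.625$ are chosen to resemble the settings used later in the paper. Because the approximation ignores the estimation of $\sigma_\epsilon$, its output should not be expected to match simulated retention rates exactly.

```python
import math
from statistics import NormalDist

def t_star(T, T1, intercept=True):
    """Effective sample size T* = T1 (T - T1) / T, or T1 without an intercept."""
    return T1 * (T - T1) / T if intercept else float(T1)

def noncentrality(T, T1, lam, sigma=1.0, intercept=True):
    """psi* = sqrt(T*) * lambda_1 / sigma_eps from Theorem 1."""
    return math.sqrt(t_star(T, T1, intercept)) * lam / sigma

def approx_power(psi, c_alpha):
    """P(|t| > c_alpha) when t ~ N[psi, 1], ignoring estimation of sigma."""
    z = NormalDist()
    return (1 - z.cdf(c_alpha - psi)) + z.cdf(-c_alpha - psi)

T, lam, c_alpha = 100, 2.0, 2.625
for T1 in (1, 5, 10, 20, 35):
    psi = noncentrality(T, T1, lam)
    print(T1, round(psi, 2), round(approx_power(psi, c_alpha), 3))
```

Since $\psi_{\lambda_1}^*$ grows with $\sqrt{T^*}$, the approximate power of the known-shift test rises quickly with the length of the shift, in contrast to a single impulse indicator.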

5. Potency of SIS for an Unknown Location Shift

We now develop the basic analytical tools for a shift that can be matched by a single step indicator in Subsection 5.1 and check the outcomes by simulation in Subsection 5.2. Subsection 5.3 considers the effects of misspecifying the timing of an indicator. The basic setting is generalized in Subsection 5.4 to an unknown shift period requiring a two-step indicator. Subsection 5.5 and Subsection 5.6 consider the occurrence of two shifts where one lies in each half, first when opposite-signed, then when they are equal magnitudes, signs and durations. Subsection 5.7 then considers an unknown shift spanning both halves where multi-path search across several splits is likely to outperform split-half one-cut; see [33] for more detailed simulation evidence and comparisons with IIS.

5.1. Unknown Shift Period Matched by a Single Step Indicator

We first show that detection of a single location shift falling entirely within a half-sample of the data ($0 < T_1 < T/2$) as in Equation (6) is feasible using the split-half one-cut analysis of step-indicator saturation. In matrix notation, let $\boldsymbol{\iota}_{T_1}$ denote a $T \times 1$ vector with elements of unity till $T_1$ and zeroes thereafter, so the DGP is:
$$\mathbf{y} = \lambda_1 \boldsymbol{\iota}_{T_1} + \boldsymbol{\epsilon} \tag{11}$$
As before, add the first half of the step indicators, assuming $T$ is even, so the model becomes:
$$y_t = \sum_{j=1}^{T/2} \gamma_j 1_{\{t \le j\}} + v_t \tag{12}$$
We assume an intercept of zero in Equation (12) to highlight the main aspects of the algebra, written in matrix form as:
$$\mathbf{y} = \mathbf{D}_1 \boldsymbol{\gamma}_1 + \mathbf{v} \tag{13}$$
where $\boldsymbol{\gamma}_1' = (\gamma_1 \ldots \gamma_{T/2})$ and $\mathbf{D}_1 = (\boldsymbol{\iota}_1 \ldots \boldsymbol{\iota}_{T/2})$.
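Theorem 2. The distribution of the least-squares estimator of $\boldsymbol{\gamma}_1$ in Equation (13) is:
$$\left(\hat{\boldsymbol{\gamma}}_1 - \lambda_1 \mathbf{r}\right) \underset{app}{\sim} \mathsf{N}\left[\mathbf{0}, \sigma_\epsilon^2 \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\right] \tag{14}$$
where $\mathbf{r}$ is a $T/2 \times 1$ vector with unity in the $T_1$-th position, and zeroes elsewhere.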
Proof. From Equation (11):
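$$\hat{\boldsymbol{\gamma}}_1 = \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\mathbf{y} = \lambda_1 \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\boldsymbol{\iota}_{T_1} + \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\boldsymbol{\epsilon} \tag{15}$$
where $\mathbf{D}_1'\mathbf{D}_1$ is:
$$\begin{pmatrix} \boldsymbol{\iota}_1' \\ \boldsymbol{\iota}_2' \\ \vdots \\ \boldsymbol{\iota}_{T/2-1}' \\ \boldsymbol{\iota}_{T/2}' \end{pmatrix} \begin{pmatrix} \boldsymbol{\iota}_1 & \boldsymbol{\iota}_2 & \cdots & \boldsymbol{\iota}_{T/2-1} & \boldsymbol{\iota}_{T/2} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 & 1 \\ 1 & 2 & 2 & \cdots & 2 & 2 \\ 1 & 2 & 3 & \cdots & 3 & 3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 1 & 2 & 3 & \cdots & T/2 - 1 & T/2 - 1 \\ 1 & 2 & 3 & \cdots & T/2 - 1 & T/2 \end{pmatrix}$$
The inverse of $\mathbf{D}_1'\mathbf{D}_1$ is the 'double difference' matrix:
$$\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1} = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 & 0 \\ -1 & 2 & -1 & \cdots & 0 & 0 \\ 0 & -1 & 2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 2 & -1 \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{pmatrix}$$
Therefore:
$$\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1' = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix}$$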
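which is the forward-difference matrix over the first $T/2$ observations (its columns for $t > T/2$ are zero). Consequently, letting $\Delta\epsilon_t = \epsilon_t - \epsilon_{t+1}$, from Equation (15):
$$\hat{\boldsymbol{\gamma}}_1 = \lambda_1 \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\boldsymbol{\iota}_{T_1} + \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\boldsymbol{\epsilon} = \lambda_1 \mathbf{r} + \Delta\boldsymbol{\epsilon}_{(1)}$$
where $\mathbf{r}$ is a $T/2 \times 1$ vector with unity in the $T_1$-th position and zeroes elsewhere, so:
$$\hat{\boldsymbol{\gamma}}_1 - \lambda_1 \mathbf{r} = \Delta\boldsymbol{\epsilon}_{(1)}$$
where the $(T/2 \times 1)$ vector $\Delta\boldsymbol{\epsilon}_{(1)} = (\Delta\epsilon_1, \Delta\epsilon_2, \ldots, \Delta\epsilon_{T/2-1}, \epsilon_{T/2})'$. All the elements of $\hat{\boldsymbol{\gamma}}_1$ other than the $T_1$-th should be near zero, being distributed around zero as $\Delta\boldsymbol{\epsilon}_{(1)}$, and only the $T_1$-th reflects $\lambda_1$, corresponding to the location shift. Thus:
$$\hat{\gamma}_{T_1} = \lambda_1 + \Delta\epsilon_{T_1} \tag{18}$$
Furthermore:
$$\mathsf{E}\left[\Delta\boldsymbol{\epsilon}_{(1)} \Delta\boldsymbol{\epsilon}_{(1)}'\right] = \sigma_\epsilon^2 \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}$$
Therefore:
$$\left(\hat{\boldsymbol{\gamma}}_1 - \lambda_1 \mathbf{r}\right) \underset{app}{\sim} \mathsf{N}\left[\mathbf{0}, \sigma_\epsilon^2 \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\right]$$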
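The matrix algebra in this proof can be checked numerically for a small sample. The sketch below uses plain Python lists with integer arithmetic; the helper names, the choice $T = 10$ and the data vector are assumptions for illustration only.

```python
def step_matrix(T, first, last):
    """T x k matrix whose columns are step indicators 1{t <= j}, j = first..last."""
    return [[1 if t <= j else 0 for j in range(first, last + 1)]
            for t in range(1, T + 1)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def double_difference(k):
    """Tridiagonal matrix: 2 on the diagonal (1 in the last cell), -1 beside it."""
    M = [[0] * k for _ in range(k)]
    for i in range(k):
        M[i][i] = 2 if i < k - 1 else 1
        if i > 0:
            M[i][i - 1] = M[i - 1][i] = -1
    return M

T = 10  # illustrative sample size
k = T // 2
D1 = step_matrix(T, 1, k)
DtD = matmul(transpose(D1), D1)

# D1'D1 has (i, j) element min(i, j).
assert DtD == [[min(i, j) for j in range(1, k + 1)] for i in range(1, k + 1)]

# The double-difference matrix is its inverse.
inv = double_difference(k)
identity = [[1 if i == j else 0 for j in range(k)] for i in range(k)]
assert matmul(inv, DtD) == identity

# (D1'D1)^{-1} D1' takes forward differences over the first half of y.
P = matmul(inv, transpose(D1))
y = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # arbitrary illustrative data
gamma_hat = [sum(p * v for p, v in zip(row, y)) for row in P]
expected = [y[t] - y[t + 1] for t in range(k - 1)] + [y[k - 1]]
assert gamma_hat == expected
print(gamma_hat)  # [2, -3, 3, -4, 5]
```

Each coefficient is therefore a difference of just two adjacent observations, which is why its variance is $2\sigma_\epsilon^2$ regardless of the sample size.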
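Effectively, (18) shows that only the value of $\lambda_1$ at the shift is being picked up, so the incremental information is equivalent to an impulse indicator for $T_1$. Further, letting $\boldsymbol{\epsilon}_{(1)}^* = (\boldsymbol{\epsilon}_{(1)}' : \mathbf{0}_{T/2}')'$, so:
$$\hat{\mathbf{y}} = \mathbf{D}_1 \hat{\boldsymbol{\gamma}}_1 = \lambda_1 \mathbf{D}_1 \mathbf{r} + \mathbf{D}_1 \Delta\boldsymbol{\epsilon}_{(1)} = \lambda_1 \boldsymbol{\iota}_{T_1} + \boldsymbol{\epsilon}_{(1)}^*$$
as $\mathbf{D}_1 \mathbf{r} = \boldsymbol{\iota}_{T_1}$ and $\mathbf{D}_1 \Delta\boldsymbol{\epsilon}_{(1)} = \boldsymbol{\epsilon}_{(1)}^*$, then for $\boldsymbol{\epsilon}_{(2)}^* = (\mathbf{0}_{T/2}' : \boldsymbol{\epsilon}_{(2)}')'$:
$$\mathbf{y} - \hat{\mathbf{y}} = \lambda_1 \boldsymbol{\iota}_{T_1} - \lambda_1 \boldsymbol{\iota}_{T_1} + \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_{(1)}^* = \boldsymbol{\epsilon}_{(2)}^*$$
Thus, the estimated error variance, adjusted for degrees of freedom, based on the second half:
$$\hat{\sigma}_\epsilon^2 = \frac{2}{T} \sum_{t=T/2+1}^{T} \left(y_t - \hat{y}_t\right)^2$$
will be an unbiased estimator of $\sigma_\epsilon^2$.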
However, for IID errors:
$$\mathsf{V}\left[\hat{\gamma}_{T_1}\right] = 2\sigma_\epsilon^2 \tag{23}$$
Consequently, estimating (12) leads to the test statistic:
$$t_{\hat{\gamma}_{T_1}} = \frac{\hat{\gamma}_{T_1}}{\sqrt{2}\,\hat{\sigma}_\epsilon} \approx \frac{\hat{\gamma}_{T_1} - \lambda_1}{\sqrt{2}\,\sigma_\epsilon} + \frac{\lambda_1}{\sqrt{2}\,\sigma_\epsilon} \sim \mathsf{N}\left[\frac{\psi_{\lambda_1}}{\sqrt{2}}, 1\right] \tag{24}$$
where $\psi_{\lambda_1}/\sqrt{2}$ is the non-centrality, with $\psi_{\lambda_1} = \lambda_1/\sigma_\epsilon$. In IIS, one-cut selection was feasible given the orthogonality of the impulse indicators. The high collinearity between the step indicators entails that there is little information accrual at the level of Equation (24), so sequential selection eliminating the least significant indicators, or multi-path search, is essential for SIS. At 1%, $c_\alpha \approx 2.7$, so normalizing on $\sigma_\epsilon = 1$ requires $\lambda_1 > 3.8$ for even a 50% chance of being significant before simplification. It is unlikely that the smallest $t_{\hat{\gamma}_j}$ occurs at $T_1$, and when the least significant indicators are deleted from the model, $\mathsf{V}[\hat{\gamma}_{T_1}]$ will fall rapidly from Equation (23). For irrelevant step indicators:
$$t_{\hat{\gamma}_{j \neq T_1}} \approx \frac{\hat{\gamma}_{j \neq T_1}}{\sqrt{2}\,\sigma_\epsilon} \sim \mathsf{N}\left[0, 1\right] \tag{25}$$
Therefore, on average, $100\alpha/2\%$ of the irrelevant step indicators will be adventitiously significant during selection, as found under the null. If all irrelevant step indicators were eliminated correctly, just $\boldsymbol{\iota}_{T_1}$ would remain, and the non-centrality would become $\psi_1 = \sqrt{T^*}\lambda_1/\sigma_\epsilon$, which is $\sqrt{2T^*}$ times larger than the non-centrality before selection. We assume sequential simplification or multi-path search will be used, so it will approximate that outcome, as the simulations below confirm.
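The arithmetic in this paragraph can be reproduced in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the value $T^* = 22.75$ corresponds to the assumed case $T_1 = 35$, $T = 100$ with an intercept.

```python
import math

def lambda_for_even_odds(c_alpha, sigma=1.0):
    """Shift size at which the one-cut t-statistic N[lambda/(sqrt(2) sigma), 1]
    is centred exactly on c_alpha, giving roughly a 50% retention chance."""
    return c_alpha * math.sqrt(2) * sigma

def noncentrality_before(lam, sigma=1.0):
    """psi / sqrt(2) with psi = lambda_1 / sigma, as in Equation (24)."""
    return lam / (math.sqrt(2) * sigma)

def noncentrality_after(lam, T_star, sigma=1.0):
    """sqrt(T*) lambda_1 / sigma once only the relevant indicator remains."""
    return math.sqrt(T_star) * lam / sigma

print(round(lambda_for_even_odds(2.7), 2))  # 3.82

lam, T_star = 4.0, 22.75  # assumed: T1 = 35, T = 100, intercept included
before = noncentrality_before(lam)
after = noncentrality_after(lam, T_star)
print(round(before, 2), round(after, 2), round(after / before, 2))
```

The ratio of the two non-centralities depends only on $T^*$, so the gain from eliminating irrelevant indicators is larger the longer the shift.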
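Having completed the selection of indicators from the first half, these are eliminated, and the second half of the step indicators, $\mathbf{D}_2 = (\boldsymbol{\iota}_{T/2+1} \ldots \boldsymbol{\iota}_T)$, are added, noting that $\boldsymbol{\iota}_T$ is the intercept. Now, the model becomes:
$$y_t = \sum_{j=T/2+1}^{T} \gamma_j 1_{\{t \le j\}} + v_t \tag{26}$$
written as:
$$\mathbf{y} = \mathbf{D}_2 \boldsymbol{\gamma}_2 + \mathbf{v} \tag{27}$$
where $\boldsymbol{\gamma}_2' = (\gamma_{T/2+1} \ldots \gamma_T)$ and $\mathbf{D}_2 = (\boldsymbol{\iota}_{T/2+1} \ldots \boldsymbol{\iota}_T)$.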
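Theorem 3. The distribution of the least-squares estimator of $\boldsymbol{\gamma}_2$ in Equation (27) is:
$$\hat{\boldsymbol{\gamma}}_2 = \lambda_1 T_1 \left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1} \mathbf{j} + \left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1}\mathbf{D}_2'\boldsymbol{\epsilon} \tag{28}$$
where $\mathbf{c}$ is a $T/2 \times 1$ vector of ones, and $\mathbf{j}$ is a $T/2 \times 1$ vector of zeroes other than unity in its first position, so only the first element of $\hat{\boldsymbol{\gamma}}_2$ depends on $\lambda_1$.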
Proof. From (11), estimation yields:
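$$\hat{\boldsymbol{\gamma}}_2 = \left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1}\mathbf{D}_2'\mathbf{y} = \lambda_1 \left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1}\mathbf{D}_2'\boldsymbol{\iota}_{T_1} + \left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1}\mathbf{D}_2'\boldsymbol{\epsilon}$$
where $\mathbf{D}_2'\mathbf{D}_2$ is:
$$\begin{pmatrix} T/2 + 1 & T/2 + 1 & T/2 + 1 & \cdots & T/2 + 1 & T/2 + 1 \\ T/2 + 1 & T/2 + 2 & T/2 + 2 & \cdots & T/2 + 2 & T/2 + 2 \\ T/2 + 1 & T/2 + 2 & T/2 + 3 & \cdots & T/2 + 3 & T/2 + 3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ T/2 + 1 & T/2 + 2 & T/2 + 3 & \cdots & T - 1 & T - 1 \\ T/2 + 1 & T/2 + 2 & T/2 + 3 & \cdots & T - 1 & T \end{pmatrix}$$
which is:
$$\mathbf{D}_2'\mathbf{D}_2 = \mathbf{D}_1'\mathbf{D}_1 + \tfrac{T}{2}\mathbf{c}\mathbf{c}'$$
Therefore:
$$\left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1} = \left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1}\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}$$
as:
$$\begin{aligned} \left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1} &= \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1} - \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\tfrac{T}{2}\mathbf{c}\mathbf{c}'\left(\mathbf{I}_{T/2} + \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\tfrac{T}{2}\mathbf{c}\mathbf{c}'\right)^{-1}\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1} \\ &= \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1} - \tfrac{T}{2}\mathbf{j}\mathbf{c}'\left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1}\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1} \\ &= \left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1}\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1} \end{aligned}$$
since:
$$\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\tfrac{T}{2}\mathbf{c}\mathbf{c}' = \tfrac{T}{2}\begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \end{pmatrix} = \tfrac{T}{2}\mathbf{j}\mathbf{c}' \quad \text{where} \quad \mathbf{j} = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}$$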
Next:
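$$\mathbf{D}_2'\boldsymbol{\iota}_{T_1} = \begin{pmatrix} \boldsymbol{\iota}_{T/2+1}' \\ \boldsymbol{\iota}_{T/2+2}' \\ \vdots \\ \boldsymbol{\iota}_{T-1}' \\ \boldsymbol{\iota}_T' \end{pmatrix} \boldsymbol{\iota}_{T_1} = T_1 \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \\ 1 \end{pmatrix} = T_1 \mathbf{c}$$
so that:
$$\hat{\boldsymbol{\gamma}}_2 = \lambda_1 T_1 \left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1}\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{c} + \left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1}\mathbf{D}_2'\boldsymbol{\epsilon}$$
and as:
$$\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{c} = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 & 0 \\ -1 & 2 & -1 & \cdots & 0 & 0 \\ 0 & -1 & 2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 2 & -1 \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix} = \mathbf{j}$$
then:
$$\hat{\boldsymbol{\gamma}}_2 = \lambda_1 T_1 \left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1} \mathbf{j} + \left(\mathbf{D}_2'\mathbf{D}_2\right)^{-1}\mathbf{D}_2'\boldsymbol{\epsilon}$$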
Thus, the indicator nearest to the shift is most likely to be retained when the relevant indicator is not ‘carried forward’, as simulations confirm. If the shift is in the second half, the last indicator in the first-half will be kept.
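As a worked step, the inverse appearing in Theorem 3 can be evaluated explicitly. The derivation below applies the Sherman–Morrison identity $(\mathbf{I} + \mathbf{u}\mathbf{v}')^{-1} = \mathbf{I} - \mathbf{u}\mathbf{v}'/(1 + \mathbf{v}'\mathbf{u})$, which is a standard result brought in here for illustration rather than a step taken in the proof, with $\mathbf{u} = \tfrac{T}{2}\mathbf{j}$, $\mathbf{v} = \mathbf{c}$ and $\mathbf{c}'\mathbf{j} = 1$:

```latex
\begin{aligned}
\left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1}\mathbf{j}
  &= \left(\mathbf{I}_{T/2} - \frac{\tfrac{T}{2}\mathbf{j}\mathbf{c}'}{1 + \tfrac{T}{2}\mathbf{c}'\mathbf{j}}\right)\mathbf{j}
   = \mathbf{j} - \frac{T/2}{1 + T/2}\,\mathbf{j}
   = \frac{1}{1 + T/2}\,\mathbf{j}, \\
\lambda_1 T_1 \left(\mathbf{I}_{T/2} + \tfrac{T}{2}\mathbf{j}\mathbf{c}'\right)^{-1}\mathbf{j}
  &= \frac{\lambda_1 T_1}{T/2 + 1}\,\mathbf{j}.
\end{aligned}
```

Under these steps, the coefficient on the first second-half indicator, $\boldsymbol{\iota}_{T/2+1}$, is centred on $\lambda_1 T_1/(T/2 + 1)$, which is the average of the shift over the first $T/2 + 1$ observations. For the illustrative values $T = 100$ and $T_1 = 35$, this is $35\lambda_1/51 \approx 0.69\lambda_1$.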
Finally, combine the selected step indicators in a model and reselect. When all irrelevant indicators are removed and the relevant one retained:
$$y_t = \gamma_{T_1} 1_{\{t \le T_1\}} + v_t$$
The distribution resulting in the case of a perfect selection must coincide with Equation (9); any irrelevant indicators retained by chance would reduce the degrees of freedom and increase variances from collinearity.
Figure 2 illustrates SIS for a location shift over the last 25 observations in the DGP:
$$y_t = \mu + \lambda_1 1_{\{t \ge 76\}} + \epsilon_t = 10 - 10 \times 1_{\{t \ge 76\}} + \epsilon_t \tag{34}$$
where $\epsilon_t \sim \mathsf{IN}[0, 1]$. Initially, the last step indicator captures the mean shift drop (Row 1) matching the above analysis, then the location shift is found in Row 2, so the now redundant indicator is eliminated in Row 3. Thus, the outcome here coincides with the optimal test for a known location shift, namely a t-test in Equation (34) at $t = 76$ onwards, without requiring knowledge: (1) that it was a location shift; (2) of the shift timing; (3) that it was the only shift; and (4) that the same magnitude of shift continued thereafter.
Figure 2. Illustrating split-half one-cut SIS for the shift in Equation (34).

5.2. Simulating an Unknown Shift Period Matched by a Single Indicator

We now simulate the properties of SIS for a single location shift during the first half of the sample, using the DGP in Equation (11), where the timing of the location shift is set to $T_1 = 35$: varying shift lengths are investigated in Table 2. The shift magnitude $\lambda_1$ is set equal to $2\sigma_\epsilon$ and $4\sigma_\epsilon$, with selection at $\alpha = 0.01$.
For the split-half one-cut approach outlined in Subsection 5.1, the open histograms and their densities in Figure 3b show that, while the density of $\hat{\gamma}_{T_1}$ is centred around the true value of $\lambda_1$, in the case of IID errors, the variance of the estimator is twice that of the error (d), in line with Equation (23). The associated t-statistic density overlaps zero (c).
The retention frequencies of the step indicator $1_{\{t \le T_1\}}$ for varying lengths of shift and two levels of $\lambda_1$ without and with sequential selection of indicators are provided in Table 2. Given the relatively low retention frequency in the simple split-half one-cut approach, sequential selection of step indicators is essential. Iterative elimination of the least significant indicators leads to a rapid fall in the variance of the estimator $\mathsf{V}[\tilde{\gamma}_{T_1}]$ (Figure 3b, d), an increase in the retention frequency of the correct step indicators in (c) and a reduction in the number of incorrectly retained indicators. As the shaded histograms and their densities in Figure 3 and the lower section of Table 2 both show, sequential selection or multi-path search as in Autometrics dramatically improves the outcomes of SIS in the single-shift experiment. For a step shift of $4\sigma_\epsilon$, sequential selection increases the retention frequency on average to 0.93 from 0.59 with split-half one-cut.
Table 2. Retention frequency of $\boldsymbol{\iota}_{T_1}$ for varying shift lengths $l$ and magnitudes, $\lambda_1$, at $\alpha = 0.01$.

| Algorithm | $\lambda_1$ | $l = 1$ | $l = 5$ | $l = 10$ | $l = 20$ | $l = 35$ |
|---|---|---|---|---|---|---|
| Known shift | $2\sigma_\epsilon$ | 0.56 (2.77) | 0.98 (4.72) | 0.99 (6.27) | 1.00 (8.17) | 1.00 (9.65) |
| | $4\sigma_\epsilon$ | 0.99 (5.59) | 1.00 (9.50) | 1.00 (12.57) | 1.00 (16.36) | 1.00 (19.31) |
| Split-half one-cut | $2\sigma_\epsilon$ | 0.15 (1.43) | 0.12 (1.42) | 0.13 (1.47) | 0.14 (1.44) | 0.16 (1.47) |
| | $4\sigma_\epsilon$ | 0.61 (2.88) | 0.61 (2.88) | 0.63 (2.92) | 0.59 (2.88) | 0.60 (2.92) |
| Split-half sequential | $2\sigma_\epsilon$ | 0.17 (3.01) | 0.50 (3.68) | 0.57 (4.63) | 0.56 (5.86) | 0.56 (6.92) |
| | $4\sigma_\epsilon$ | 0.89 (4.10) | 0.93 (6.81) | 0.93 (8.96) | 0.92 (11.62) | 0.93 (13.70) |
| Multi-path | $2\sigma_\epsilon$ | 0.41 (3.89) | 0.57 (5.24) | 0.57 (6.33) | 0.58 (7.78) | 0.55 (8.65) |
| | $4\sigma_\epsilon$ | 0.95 (5.89) | 0.93 (9.45) | 0.95 (11.73) | 0.93 (14.79) | 0.92 (16.57) |

Average t-values for retained indicators shown in parentheses.
Figure 3. Comparing SIS on a single shift without (open) and with (shaded) sequential selection. (a) shows the time series y t with a location shift; (b) and (c) the simulated estimator and test-statistic densities for split-half and sequential selection; and (d) their simulated variances.
Varying the shift length at the start of the sample appears to have little impact on the retention frequencies of the shift indicator, except in the case of a single impulse. Using split-half sequential selection, a shift of $2\sigma_\epsilon$ is retained on average around 50% of the time, with an increase to around 90% for a shift of $4\sigma_\epsilon$ (at $T = 100$, $c_{0.01} = 2.625$).
These simulations are consistent with the analysis in Subsection 5.1: the t-statistics for split-half one-cut $\hat{\gamma}_{T_1}$ in SIS are close to half those from IIS at the equivalent $\lambda_1$, but, with sequential simplification or multi-path search, converge to those for a known indicator. There is only a slight drop in retention frequency of the correct step at $4\sigma_\epsilon$ despite searching over $T$ indicators, though a rather larger drop at $2\sigma_\epsilon$. Although the t-values of retained indicators increase with the shift length $l$, the retention probability remains relatively constant in all cases for $l > 1$, possibly because we only record the retention of $\boldsymbol{\iota}_{T_1}$, although a neighbouring indicator may have been found instead. Since the predictive failure test of [26] is based on IIS, as shown by [34], SIS should dominate the Chow test, yet not require knowledge of the shift point. IIS can already dominate [18], as shown in [10], so SIS multi-path should be a useful method for detecting and modelling location shifts.

5.3. Misspecified Indicator Timing

A step indicator selected in the marginal process may not exactly match the period characterizing a location shift because: (1) it ends after (or starts before); (2) it is a subset (so it starts after and ends before); and (3) it starts after and ends after (or starts before and ends before). Case (1) is representative of the likely costs of misspecification, so we consider it in the special case just involving a shift and no other parameters. Let $1_{\{t \le T_0\}}$ denote the indicator where $T_0 > T_1$, so that the marginal DGP is:
$$y_t = \gamma_2 + \lambda 1_{\{t \le T_1\}} + v_t \quad \text{where} \quad v_t \sim \mathsf{N}_{n_2}\left[0, \omega_v^2\right]$$
but now approximated by the incorrect model:
$$y_t = \mu + \theta 1_{\{t \le T_0\}} + u_t$$
with $\tau = T_1/T_0 < 1$. Autometrics selects congruent representations, so its step-indicator saturation algorithm will be directed away from non-overlapping indicators, like $1_{\{t \le T_0\}}$, when shifts are large or $T_0 - T_1$ is long. Moreover, the costs of not matching dates precisely decline rapidly for small shifts. Similar analyses apply to Cases (2) and (3), with additional costs when parts of a shift are also not captured.
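The role of $\tau$ can be seen in a noise-free sketch. With an intercept and a single dummy $1_{\{t \le T_0\}}$, least squares reduces to two sub-sample means, so for $T_0 > T_1$ the coefficient on the misdated indicator equals $\lambda\tau$. The function names and all numerical values below (including $T = 50$) are assumptions for illustration; the sketch is not taken from the paper's simulations.

```python
def fit_step(y, T0):
    """OLS of y on an intercept and 1{t <= T0}: returns (mu_hat, theta_hat).

    With a single dummy, the intercept is the mean of the observations after
    T0 and theta is the difference between the two sub-sample means.
    Requires 0 < T0 < len(y).
    """
    first, rest = y[:T0], y[T0:]
    mu_hat = sum(rest) / len(rest)
    theta_hat = sum(first) / len(first) - mu_hat
    return mu_hat, theta_hat

def noise_free_dgp(T, T1, gamma2, lam):
    """y_t = gamma2 + lam * 1{t <= T1} without the error term."""
    return [gamma2 + (lam if t <= T1 else 0.0) for t in range(1, T + 1)]

T, T1, gamma2, lam = 50, 21, 10.0, -2.0  # illustrative values
y = noise_free_dgp(T, T1, gamma2, lam)
for T0 in (21, 22, 26):
    mu_hat, theta_hat = fit_step(y, T0)
    tau = T1 / T0
    # theta_hat matches lam * tau: the shift is diluted over T0 > T1 periods.
    print(T0, round(theta_hat, 3), round(lam * tau, 3))
```

A one-period mismatch shrinks the coefficient only slightly, while the shift observations that are pooled with unshifted ones add to the residual variance, which is where the cost of misdating arises.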
The choice of the period selected by SIS for a step indicator is because it has the largest t-statistic, so serious mismatches that leave large residuals are unlikely. To illustrate this, consider $y_t = 10 - 2 \times 1_{\{t \le 21\}} + \epsilon_t$ where $\sigma_\epsilon = 1.0$, estimated for a known indicator as:
$$\hat{y}_t = \underset{(0.19)}{10.2} - \underset{(0.29)}{1.97}\, 1_{\{t \le 21\}} \qquad \hat{\sigma}_\epsilon = 1.012 \quad \mathsf{F}_{\mathsf{ar}}(2, 45) = 0.16 \quad \mathsf{F}_{\mathsf{het}}(1, 47) = 1.39 \quad \chi^2_{\mathsf{nd}}(2) = 1.29$$
The four panels in Figure 4 show the match to the data in one replication when the indicator is: (a) correct; (b) too long by one (so $T_0 = T_1 + 1$); (c) too long by five; and (d) selected by SIS, which picked $1_{\{t \le 23\}}$, but had the most significant outcome. The t-values are close to the theoretical non-centralities, $\phi$, as recorded in the figure. Moreover, although the SIS selection 'misses' by two periods, that is precisely because that is when the shift is shown most clearly, and the residuals for $t = 21, 22$ are not unusually large.
Figure 4. Fitted and actual values for four step-indicator specifications to a location shift at $t = 21$. (a): known shift with t-value and non-centrality $\phi_r$; (b): shift approximated by a step one period late and (c): shift approximated by a step 5 periods late, both with non-centralities $\phi_s$; and (d): SIS selection.
That the slow increase in potency shown in Table 2 is primarily due to slight mistiming rather than not detecting the shift is shown in Table 3. Potency increases monotonically down all columns, and even for $\lambda = 2\sigma_\epsilon$ and relatively short breaks, is 0.9 or higher by $T_1 \pm 3$. However, unlike the $\mathsf{F}^1_{T-1}$-test non-centrality of $T r(1 - r)\lambda^2/\sigma^2$ from analytic power calculations for a known break point exactly matched by the correct step function in a static regression, where $r$ is the break-length fraction, potency does not increase much with $T r(1 - r)$. The results for $\lambda = 4\sigma_\epsilon$ are similar, but are not reported, as potency is near unity for all break lengths using $T_1 \pm 1$.
Overall, SIS has relatively high potency for detecting a single location shift, albeit within a few periods on either side of its ending.
Table 3. Potency of $1_{\{t \le T_1\}}$ for varying break lengths $T_1$ and accuracy of timing using Autometrics.

| $\lambda = 2\sigma_\epsilon$ | $T_1 = 4$ | $T_1 = 5$ | $T_1 = 10$ | $T_1 = 15$ | $T_1 = 20$ |
|---|---|---|---|---|---|
| $T_1$ | 0.58 | 0.55 | 0.59 | 0.59 | 0.59 |
| $T_1 \pm 1$ | 0.76 | 0.77 | 0.79 | 0.83 | 0.81 |
| $T_1 \pm 2$ | 0.84 | 0.87 | 0.86 | 0.90 | 0.89 |
| $T_1 \pm 3$ | 0.89 | 0.91 | 0.92 | 0.93 | 0.92 |
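
To put the contrast with the known-break non-centrality $T r(1-r)\lambda^2/\sigma^2$ in numbers, the following minimal Python sketch evaluates that expression for the break lengths of Table 3. The sample size $T = 100$ is an assumption taken from the simulation designs reported elsewhere in the paper, and the function name is purely illustrative.

```python
import math

def known_break_noncentrality(T, break_length, shift_in_sd):
    """Non-centrality T * r * (1 - r) * (lambda / sigma)^2 for a known break
    matched exactly by the correct step function (static regression)."""
    r = break_length / T  # break-length fraction
    return T * r * (1.0 - r) * shift_in_sd ** 2

T = 100  # assumed sample size
for length in (4, 5, 10, 15, 20):
    nc = known_break_noncentrality(T, length, 2.0)  # lambda = 2 sigma
    print(f"T1 = {length:2d}: non-centrality = {nc:6.2f}, sqrt = {math.sqrt(nc):.2f}")
```

Under these assumptions, the analytic non-centrality rises roughly fourfold (from 15.36 to 64) as the break lengthens from 4 to 20 periods, whereas the exact-timing potency in the first row of Table 3 stays between 0.55 and 0.59.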

5.4. Unknown Shift Requiring a Two-Step Indicator in One-Half Sample

An unknown location shift may require a two-step indicator, as in the following DGP:
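$$y_t = \lambda\left(1_{\{t \le T_2\}} - 1_{\{t \le T_1\}}\right) + \epsilon_t \quad \text{where } \epsilon_t \sim \mathsf{IN}\left[0, \sigma_\epsilon^2\right] \tag{35}$$
where $\lambda \neq 0$, and $T_1 < T_2 < T/2$, so as in Subsection 5.1, the shift is entirely within one-half of the sample. The model for the first-half split is:
$$\mathbf{y} = \mathbf{D}_1 \boldsymbol{\gamma}_1 + \mathbf{v} \tag{36}$$
where $\boldsymbol{\gamma}_1 = (\gamma_1 \ldots \gamma_{T/2})'$ and $\mathbf{D}_1 = (\boldsymbol{\iota}_1 \ldots \boldsymbol{\iota}_{T/2})$. For Equation (36) estimated on data from Equation (35):
$$\widetilde{\boldsymbol{\gamma}}_1 = \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\mathbf{y} = \lambda\left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\left(\boldsymbol{\iota}_{T_2} - \boldsymbol{\iota}_{T_1}\right) + \left(\mathbf{D}_1'\mathbf{D}_1\right)^{-1}\mathbf{D}_1'\boldsymbol{\epsilon} = \lambda \mathbf{s} + \boldsymbol{\epsilon}_{(1)} \tag{37}$$
where $\mathbf{s}$ is a $T/2 \times 1$ selection vector with unity in the $T_2$-th position, $-1$ in the $T_1$-th position and zeroes elsewhere. Thus, a similar analysis to Subsection 5.1 holds for two relevant indicators, with $\mathbf{r}$ replaced by $\mathbf{s}$, so selecting indicators in the latter half of the sample should remain as before.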
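
The selection-vector result can be checked numerically. The sketch below is illustrative only: it assumes step indicators of the form $1_{\{t \le j\}}$, a half-sample of $T/2 = 50$, a noise-free shift of $\lambda = 4$ over $T_1 = 25$ to $T_2 = 35$, and hypothetical helper names. Because the saturating first-half matrix is square and upper triangular with unit entries, least squares reduces to solving the system exactly by back-substitution.

```python
def step_matrix(n):
    """n x n matrix whose column j is the step indicator 1{t <= j}, t, j = 1..n."""
    return [[1.0 if t <= j else 0.0 for j in range(1, n + 1)]
            for t in range(1, n + 1)]

def saturated_step_coefficients(y):
    """Solve D gamma = y for the square step matrix D by back-substitution.

    D is upper triangular with unit entries, so gamma_j = y_j - y_{j+1}
    (taking y_{n+1} = 0). With as many indicators as observations, this
    exact solution coincides with the least-squares estimate.
    """
    n = len(y)
    return [y[j] - (y[j + 1] if j + 1 < n else 0.0) for j in range(n)]

half, T1, T2, lam = 50, 25, 35, 4.0  # assumed design: T = 100, so T/2 = 50
# Noise-free first-half data: lam * (1{t <= T2} - 1{t <= T1})
y = [lam * ((t <= T2) - (t <= T1)) for t in range(1, half + 1)]
gamma = saturated_step_coefficients(y)

# The solution should reproduce y and equal lam * s
D = step_matrix(half)
fitted = [sum(D[t][j] * gamma[j] for j in range(half)) for t in range(half)]
nonzero = {j + 1: g for j, g in enumerate(gamma) if g != 0.0}
print(nonzero)  # positions (1-based) of the non-zero coefficients
```

In this noise-free case the only non-zero coefficients are $-\lambda$ at position $T_1$ and $+\lambda$ at position $T_2$, which is the pattern $\lambda \mathbf{s}$ described above.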
To simulate an unknown shift period requiring a two-step indicator in the first half, we set $T_1 = 25$ and $T_2 = 35$, and only consider sequentially selected indicators here, retaining the selected indicators from $\mathbf{D}_1$ and $\mathbf{D}_2$ at $\alpha = 0.01$. Table 4 shows the simulation results for sequential split-half selection with two step indicators: retention frequencies are close to those for a single shift.
Table 4. Split-half sequential selection: gauge and retention frequencies for a shift with two indicators.

| $\lambda_1$ | Gauge $D_1$ | Gauge $D_2$ | Retention frequency $T_1$ step | Retention frequency $T_2$ step |
|---|---|---|---|---|
| $2\sigma_\epsilon$ | 0.020 | 0.011 | 0.52 | 0.55 |
| $4\sigma_\epsilon$ | 0.004 | 0.017 | 0.91 | 0.94 |

5.5. Unknown Opposite-Signed Shifts in Each Split Half

If shifts in each half of the sample have opposite signs, or perhaps very different magnitudes, then both can be detected even in a split-half sequential selection approach. Consider the DGP:
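$$y_t = \lambda_1\left(1_{\{t \le T_2\}} - 1_{\{t \le T_1\}}\right) + \lambda_2\left(1_{\{t \le T_4\}} - 1_{\{t \le T_3\}}\right) + \epsilon_t$$
where $\epsilon_t \sim \mathsf{IN}\left[0, \sigma_\epsilon^2\right]$ as before, and $T_1 < T_2 \le T/2$, whereas $T/2 \le T_3 < T_4$ with $\lambda_1 \lambda_2 < 0$. To remove the mean effect of the other location shift, since:
$$\frac{1}{T}\sum_{t=1}^{T} \lambda_2\left(1_{\{t \le T_4\}} - 1_{\{t \le T_3\}}\right) = \lambda_2\,\frac{T_4 - T_3}{T} = \phi_2,$$
the intercept must be retained without selection.
The formula in (37) still applies, with appropriate adjustments for estimating the intercepts, but even if the first shift is correctly modelled, the equation in (22) for the residuals becomes:
$$\mathbf{y} - \widehat{\mathbf{y}} = \lambda_2\left(\boldsymbol{\iota}_{T_4} - \boldsymbol{\iota}_{T_3}\right) - \phi_2 \boldsymbol{\iota} + \boldsymbol{\epsilon}_2^{*} = \widehat{\mathbf{v}}_2$$
which has a larger estimated error variance than in the previous cases, because:
$$\begin{aligned}
\frac{2}{T}\,\mathsf{E}\left[\widehat{\mathbf{v}}_2'\widehat{\mathbf{v}}_2\right] &= \frac{2}{T}\,\mathsf{E}\left[(\boldsymbol{\epsilon}_2^{*})'\boldsymbol{\epsilon}_2^{*}\right] + \frac{2}{T}\left(\lambda_2\left(\boldsymbol{\iota}_{T_4} - \boldsymbol{\iota}_{T_3}\right) - \phi_2\boldsymbol{\iota}\right)'\left(\lambda_2\left(\boldsymbol{\iota}_{T_4} - \boldsymbol{\iota}_{T_3}\right) - \phi_2\boldsymbol{\iota}\right) \\
&= \sigma_\epsilon^2 + \lambda_2^2\,\frac{2\left(T_4 - T_3\right)}{T} - 2\lambda_2\phi_2\,\frac{2\left(T_4 - T_3\right)}{T} + 2\phi_2^2 \\
&= \sigma_\epsilon^2 + 2\lambda_2^2\,\frac{T_4 - T_3}{T}\left(1 - \frac{T_4 - T_3}{T}\right)
\end{aligned}$$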
To compensate for the equivalent effect of Equation (40), when searching for a second shift, step indicators found in the first half should be included in the second-half selection.
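
As a rough numerical illustration of how much an unmodelled second shift inflates the error variance, the sketch below evaluates both the expanded and the closed-form expressions above. The values $T = 100$, $T_3 = 75$, $T_4 = 85$ and $\sigma_\epsilon^2 = 1$ are assumptions chosen to match the simulation design, and the function name is hypothetical.

```python
def inflated_error_variance(sigma2, lam2, T, T3, T4):
    """Return (expanded, closed_form) versions of the inflated error variance
    when a second-half shift of size lam2 over (T3, T4] is left unmodelled."""
    k = (T4 - T3) / T   # shift length as a fraction of the sample
    phi2 = lam2 * k     # mean effect of the unmodelled shift
    expanded = sigma2 + lam2 ** 2 * 2 * k - 2 * lam2 * phi2 * 2 * k + 2 * phi2 ** 2
    closed_form = sigma2 + 2 * lam2 ** 2 * k * (1 - k)
    return expanded, closed_form

for lam2 in (2.0, 4.0):  # shifts of 2 and 4 standard deviations, sigma = 1
    expanded, closed_form = inflated_error_variance(1.0, lam2, 100, 75, 85)
    print(f"lambda_2 = {lam2}: expanded = {expanded:.2f}, closed form = {closed_form:.2f}")
```

With these assumed values the error variance rises from 1 to about 1.72 for a $2\sigma_\epsilon$ shift and to about 3.88 for a $4\sigma_\epsilon$ shift, which suggests why carrying forward the indicators found in the first half matters.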
To simulate unknown opposite-signed shifts in each half, $\lambda_1$ and $\lambda_2$ are chosen such that $\lambda_1 = -\lambda_2$, where the shift timing is given by $T_1 = 25$ to $T_2 = 35$ and $T_3 = 75$ to $T_4 = 85$. Table 5 shows that even with shifts falling in the middle of each half, SIS can be successful in identifying the shift points.
Table 5. Split-half sequential selection: opposite-signed shifts in each half, $\alpha = 0.01$.

| $\lambda_{1,2}$ | Gauge $D_1$ | Gauge $D_2$ | Retention frequency $T_1$ step | Retention frequency $T_2$ step | Retention frequency $T_3$ step | Retention frequency $T_4$ step |
|---|---|---|---|---|---|---|
| $2\sigma_\epsilon$ | 0.021 | 0.045 | 0.52 | 0.55 | 0.57 | 0.56 |
| $4\sigma_\epsilon$ | 0.004 | 0.030 | 0.91 | 0.94 | 0.93 | 0.93 |

5.6. Unknown Equal Shifts in Each Split Half

Shifts with relatively equal magnitudes, durations and the same signs in each half, so they are roughly evenly distributed between the two halves, could well appear as just a larger error variance, rendering the simplest split-half one-cut approach ineffective. Nevertheless, when T is sufficiently large, both shifts can be detected using a modified split-half approach. First, saturate the second half by impulse indicators, then the first half can be tackled by a split-half approach, so quarters are examined, without any additional cost under the null. That procedure is then reversed for the first half. This is a variant of super saturation, where IIS is also undertaken with SIS as in [15], but here limiting IIS to the alternate half and not using the information it reveals about outliers and shifts.
Under the alternative, by eliminating the shift in the second half, the first half comes under the above analysis for a single shift, which is then detectable provided it is not evenly split between the quarters. In practice, Autometrics uses multiple block searches, and this has proven effective for IIS in detecting multiple shifts. Blocks would need to span most of the length of a location shift to detect it using SIS, but that may be less essential for super saturation.
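
The bookkeeping of the modified split-half approach can be sketched as follows. This is only an illustration of which observations are impulse-saturated and which quarter supplies the step-indicator candidates at each stage; it performs no estimation or selection, and the function and key names are hypothetical.

```python
def modified_split_half_stages(T):
    """List the four stages of the modified split-half search for a sample 1..T.

    Each stage pairs the half that is saturated by impulse indicators with
    the quarter whose step indicators 1{t <= j} are the selection candidates.
    """
    half, quarter = T // 2, T // 4
    first_half = range(1, half + 1)
    second_half = range(half + 1, T + 1)
    quarters = [
        range(1, quarter + 1),
        range(quarter + 1, half + 1),
        range(half + 1, half + quarter + 1),
        range(half + quarter + 1, T + 1),
    ]
    stages = []
    for i, block in enumerate(quarters):
        # Quarters in the first half are searched while the second half is
        # impulse-saturated; the roles are then reversed.
        impulse_half = second_half if i < 2 else first_half
        stages.append({"impulse_obs": impulse_half, "step_candidates": block})
    return stages

for stage in modified_split_half_stages(100):
    imp, steps = stage["impulse_obs"], stage["step_candidates"]
    print(f"impulses for t = {imp[0]}..{imp[-1]}, steps for j = {steps[0]}..{steps[-1]}")
```

For an assumed $T = 100$, each stage adds 50 impulse indicators and 25 step indicators, so the number of candidates stays below the number of observations.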
Unknown shifts of equal magnitude are assessed by setting $\lambda_1 = \lambda_2$, with $T_1 = 25$ to $T_2 = 35$ and $T_3 = 75$ to $T_4 = 85$, so there are two same-sign step shifts of equal magnitude and length in each half. We consider both the split-half sequential selection approach and multi-path selection using Autometrics (where a single gauge value is reported for $D_1$ and $D_2$). Table 6 provides summary results showing little difference in retention frequencies.
Table 6. Split-half sequential selection and multi-path: unknown equal shifts in each half, $\alpha = 0.01$.

| Algorithm | $\lambda_1$ | Gauge $D_1$ | Gauge $D_2$ | Gauge $D_1$ & $D_2$ | $T_1$ step | $T_2$ step | $T_3$ step | $T_4$ step |
|---|---|---|---|---|---|---|---|---|
| Sequential | $2\sigma_\epsilon$ | 0.021 | 0.044 | - | 0.52 | 0.55 | 0.59 | 0.60 |
| Sequential | $4\sigma_\epsilon$ | 0.005 | 0.030 | - | 0.91 | 0.94 | 0.94 | 0.94 |
| Multi-path | $2\sigma_\epsilon$ | - | - | 0.038 | 0.53 | 0.48 | 0.55 | 0.55 |
| Multi-path | $4\sigma_\epsilon$ | - | - | 0.018 | 0.87 | 0.91 | 0.94 | 0.92 |

The step columns report retention frequencies.

5.7. Unknown Shift Period Spanning Both Splits

The analysis in Subsection 5.6 may be effective in capturing a location shift spanning the initial halves, as then the shift will almost always lie entirely within a quarter of the sample. This follows because, within the first half of length $T/2$, the shift necessarily lies towards the end, since it spans into the second half; if it were longer than $T/4$, SIS would find the shorter segment as if it were the shift, and similarly for the second half.
To simulate a shift period spanning both splits, the shift timing is set such that the end of the first shift occurs just as the second shift starts, i.e., $T_2 = T/2$ and $T_3 = T/2 + 1$ with $T_1 = 35$ and $T_4 = 65$, leading to a single step shift of a length of 30 periods spanning both halves. Table 7 presents the results when using split-half sequential selection, as well as multi-path. Both correctly show the absence of shifts at $T_2$ and $T_3$ (as the shift spans the two halves), and as before, there is little difference between these, exhibiting retention frequencies at $T_1$ and $T_4$ of 0.93 to 0.96 for a step shift of $4\sigma_\epsilon$.
Table 7. Split-half sequential selection and multi-path: shift spanning both splits, $\alpha = 0.01$.

| Algorithm | $\lambda_1$ | Gauge $D_1$ | Gauge $D_2$ | Gauge $D_1$ & $D_2$ | $T_1$ step | $T_2$ step | $T_3$ step | $T_4$ step |
|---|---|---|---|---|---|---|---|---|
| Sequential | $2\sigma_\epsilon$ | 0.011 | 0.039 | - | 0.58 | 0.001 | 0.0 | 0.56 |
| Sequential | $4\sigma_\epsilon$ | 0.002 | 0.02 | - | 0.94 | 0.0 | 0.0 | 0.93 |
| Multi-path | $2\sigma_\epsilon$ | - | - | 0.029 | 0.57 | 0.01 | 0.01 | 0.55 |
| Multi-path | $4\sigma_\epsilon$ | - | - | 0.019 | 0.94 | 0.02 | 0.02 | 0.96 |

The step columns report retention frequencies.

5.8. Summary of the Simulation Results

The Monte Carlo experiments provide evidence for the feasibility of detecting location shifts using SIS. In the case of static DGPs with specific location shifts, the step indicators exhibit high retention frequencies: around 50% in the case of a shift equal to two standard deviations and around 90% for shifts of four standard deviations. These results hold across single shifts, multiple shifts of the same or opposite signs and shifts spanning both halves. Sequential selection of step shifts is crucial to ensure high potency when using split-half selection, which, in turn, requires any selected indicators to be carried forward into the second set of indicators. Multi-path search using Autometrics without the condition of carrying relevant indicators forward yields similar results to split-half sequential simplification. Overall, the results match the theoretical analyses of gauge and potency for a single shift, noting that gauge is higher for SIS than IIS as two steps are needed to characterize a single outlier.

6. Generalization to Retained Regressors

Following the theoretical findings in [4] for IIS, we use simulations to assess SIS with $n < T/2$ general regressors by including the $T \times n$ matrix $\mathbf{Z}$ as independent variables. For a single-step shift with unknown timing requiring two indicators, the DGP is then given by:
$$y_t = \boldsymbol{\beta}_1'\mathbf{z}_t + \lambda\left(1_{\{t \le T_2\}} - 1_{\{t \le T_1\}}\right) + \epsilon_t \quad \text{where } \epsilon_t \sim \mathsf{IN}\left[0, \sigma_\epsilon^2\right]$$
For the present simulation, we set $\sigma_\epsilon^2 = 1$ and $n = 10$. For each of the $i = 1, \ldots, n$ IID regressors, the associated non-centralities are set to $\mathsf{E}[\mathsf{t}_i] = \psi_i = 4$. The individual $z_i$ are orthogonal in expectation and not selected over (see [27]), so they are present in every selection iteration of the step indicators. The shift timing is set as before to $T_1 = 25$ and $T_2 = 35$.
Table 8 displays the simulation outcomes and properties of the step indicators. With the inclusion of 10 relevant independent variables, the densities of the two shift estimators are centred around the true value of $\lambda_1 = 4\sigma_\epsilon$. The potency of SIS seems unaffected by the presence of additional fixed regressors, with retention frequencies close to those in experiments without regressors.
Table 8. SIS with regressors.

| $\lambda_1$ | Gauge | Retention frequency $T_1$ step | Retention frequency $T_2$ step |
|---|---|---|---|
| $2\sigma_\epsilon$ | 0.035 | 0.50 | 0.62 |
| $4\sigma_\epsilon$ | 0.024 | 0.91 | 0.94 |

7. Comparisons with Least Angle Regression

Simulation results for LARS (see [23]) use the same DGPs as for SIS in Subsection 3, based on $M = 1000$ replications at $T = 100$. Findings under the null of no shifts are reported in Table 9, whereas Table 1 recorded the gauge of SIS.
Table 9. Null retention frequency of least angle regression (LARS) under the null of no shifts.

| LARS step | 1 | 2 | 3 | 4 | 5 | Cross-validated |
|---|---|---|---|---|---|---|
| Gauge | 0.029 | 0.047 | 0.061 | 0.073 | 0.084 | 0.017 |
Under the null, the gauge is difficult to control for different steps in the LARS algorithm and quickly exceeds the gauge of SIS, as seen in Table 9. However, cross-validation under the null yields a gauge close to SIS at 1%.
Under the alternative, the gauge can vary drastically. Table 10 applies the cross-validated LARS step for single shifts with varying lengths, whereas Table 11 and Table 12 consider multiple shifts. For a single shift, the potency for exact detection in Table 10 is high, as expected for a single-step forward search procedure. Like SIS, there is little apparent potency increase with the length of shift, but that probably reflects mistiming rather than missing the shift, as occurred for SIS in Table 3. The gauge for a single shift, between 2% and 6%, is higher than under the null.
Overall, while LARS exhibits high potency, this is the result of a high gauge that is difficult to control, varying from 1.7% under the null to 16.7% when facing multiple shifts. This makes it difficult to use LARS in practice, as the number of shifts is not generally known a priori.
Table 10. Potency and gauge of cross-validated LARS for single shifts of lengths $l$ and magnitudes $\lambda_1$.

| Measure | $\lambda_1$ | $l = 1$ | $l = 5$ | $l = 10$ | $l = 20$ | $l = 35$ |
|---|---|---|---|---|---|---|
| Potency | $2\sigma_\epsilon$ | 0.298 | 0.800 | 0.844 | 0.853 | 0.854 |
| Potency | $4\sigma_\epsilon$ | 0.780 | 0.990 | 0.990 | 0.996 | 1.00 |
| Gauge | $2\sigma_\epsilon$ | 0.018 | 0.050 | 0.054 | 0.056 | 0.058 |
| Gauge | $4\sigma_\epsilon$ | 0.020 | 0.052 | 0.055 | 0.058 | 0.059 |
Table 11. Potency and gauge of cross-validated LARS for same-sign, equal-magnitude shifts over $T_1 = 25$ to $T_2 = 35$ and $T_3 = 75$ to $T_4 = 85$.

| $\lambda_1$ | Gauge | $T_1$ potency | $T_2$ potency | $T_3$ potency | $T_4$ potency |
|---|---|---|---|---|---|
| $2\sigma_\epsilon$ | 0.167 | 0.829 | 0.839 | 0.871 | 0.818 |
| $4\sigma_\epsilon$ | 0.166 | 0.997 | 1.00 | 1.00 | 0.992 |
Table 12. Potency and gauge of cross-validated LARS for equal-magnitude, opposite-signed shifts over $T_1 = 25$ to $T_2 = 35$ and $T_3 = 75$ to $T_4 = 85$.

| $\lambda_1$ | Gauge | $T_1$ potency | $T_2$ potency | $T_3$ potency | $T_4$ potency |
|---|---|---|---|---|---|
| $2\sigma_\epsilon$ | 0.164 | 0.836 | 0.837 | 0.857 | 0.867 |
| $4\sigma_\epsilon$ | 0.160 | 0.997 | 1.00 | 0.998 | 0.997 |

8. Non-Linearity and SIS

When the DGP, or models thereof, include non-linear transformations of variables, the issue arises as to whether also applying SIS will affect the selection of the correct formulation. We consider three possibilities when selecting non-linearity:
(1) the DGP includes non-linear variables, but no shifts, and SIS is applied;
(2) there is a shift in the dependent variable, but no non-linearity, and SIS is not applied; and
(3) setting (2), but SIS is applied.
First, consider a simple static DGP without shifts, but containing non-linear variables, as given in Equation (42):
$$y_t = \mu + \beta_1 x_t + \beta_2 x_t^2 + \beta_3 x_t^3 + \epsilon_t \quad \text{where } \epsilon_t \sim \mathsf{IN}[0, 1] \tag{42}$$
and for convenience, $x_t \sim \mathsf{IN}[0, 1]$, when the non-centralities associated with the three explanatory variables are $\psi_i = 3$ for $i = 1, 2, 3$. The model is also given by Equation (42), where all variables are first retained without SIS, then subject to selection jointly with SIS.
In Table 13, we report the effect on the null rejection frequencies of t-tests for the correctly included non-linear variables when not selected over, as well as the retention frequency when the relevant variables are selected over jointly with SIS. Simulations are reported for a sample of $T = 100$ using $M = 1{,}000$ replications, with variables being selected at $\alpha = 0.01$. Inclusion of a saturating set of step indicators in the candidates has little effect on the null retention frequencies of indicators and non-null retention frequencies of relevant non-linear variables when there is no shift. When $x_t^k$, $k = 1, 2, 3$, are always retained, the null rejection frequencies at 1% are close to the theory level of 0.66, though somewhat higher when SIS is used due to slight under-estimation of the error variance. When the $x_t^k$ are jointly selected with SIS, the retention drops marginally relative to the no-SIS case.
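
The theory level of 0.66 quoted above can be approximated with a short calculation. The sketch below assumes a two-sided test whose statistic is approximately normal with mean $\psi$ and unit variance (a large-sample simplification of the t-distribution); the function name is illustrative.

```python
from statistics import NormalDist

def rejection_probability(psi, alpha):
    """Probability that |t| exceeds the two-sided critical value at level alpha
    when t is approximately N(psi, 1) (normal approximation, assumed here)."""
    z = NormalDist()
    c = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    return (1 - z.cdf(c - psi)) + z.cdf(-c - psi)

p = rejection_probability(3.0, 0.01)
print(round(p, 2))
```

Under this approximation, a non-centrality of $\psi = 3$ tested at $\alpha = 0.01$ gives a rejection probability of roughly 0.66, in line with the value cited in the text; with $\psi = 0$ the same function returns the nominal level.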
Table 13. The effect of using SIS in Equation (42) on the rejection frequencies when the $x_t^k$ are always retained and retention frequencies when they are selected over (both at 1%).

| Variable | Null rejection at 1% ($x$s retained), without SIS | Null rejection at 1% ($x$s retained), with SIS | Retention at 1% ($x$s selected over), without SIS | Retention at 1% ($x$s selected over), with SIS |
|---|---|---|---|---|
| $x$, $\psi = 3$ | 0.65 | 0.68 | 0.68 | 0.63 |
| $x^2$, $\psi = 3$ | 0.64 | 0.67 | 0.68 | 0.56 |
| $x^3$, $\psi = 3$ | 0.65 | 0.68 | 0.67 | 0.62 |
| SIS gauge | - | 0.02 | - | 0.03 |
Second, we study the effect of SIS when there is a shift in the dependent variable $y_t$, such that a non-linear transformation of the explanatory variables may spuriously appear significant by approximating this break. The example considered here is for the DGP given in Equation (43):
$$y_t = \mu + \lambda 1_{\{t \le 35\}} + \epsilon_t \tag{43}$$
The estimated model is identical to that given in Equation (42), except that now the $x_t$ variable is generated such that a non-linear transformation spuriously approximates the structural break in Equation (43). For illustration, $x_t$ is generated with the functional form $x_t = \left|\gamma\left(1 + \kappa e^{-rt}\right)^{-1} + v_t\right|^{1/3}$, where $\gamma = 10$, $\kappa = 70$, $r = 0.1$ and $v_t \sim \mathsf{IN}[0, 1]$. The results are recorded in Table 14, both when $x_t^k$ are always retained and when selected over jointly with SIS.
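
To make the construction of $x_t$ concrete, the following sketch generates one draw of the regressor from the functional form above. It assumes that $t$ runs from 1 to $T = 100$ and uses a seeded standard-library random generator; the function name and the `noise_sd` argument are illustrative additions, not part of the original design.

```python
import math
import random

def logistic_cube_root_regressor(T, gamma=10.0, kappa=70.0, r=0.1,
                                 noise_sd=1.0, rng=None):
    """Generate x_t = |gamma / (1 + kappa * exp(-r t)) + v_t| ** (1/3), t = 1..T,
    with v_t drawn as independent N(0, noise_sd^2)."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility (assumption)
    x = []
    for t in range(1, T + 1):
        logistic = gamma / (1.0 + kappa * math.exp(-r * t))
        x.append(abs(logistic + rng.gauss(0.0, noise_sd)) ** (1.0 / 3.0))
    return x

x = logistic_cube_root_regressor(100)
x_smooth = logistic_cube_root_regressor(100, noise_sd=0.0)  # noise-free path
print(f"noise-free x_1 = {x_smooth[0]:.3f}, x_100 = {x_smooth[-1]:.3f}")
```

The noise-free path is a smooth, monotonically increasing S-curve, so powers of $x_t$ can mimic a level change in $y_t$ even though $x_t$ does not enter the DGP.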
Table 14. The impact of SIS in Equation (42) on the null rejection frequencies of $x_t^k$, which are always retained, and the retention frequencies when they are selected over, with and without SIS in the presence of a step-shift $\lambda_1$ at $T_1 = 35$ (all at 1%).

| Variable | Null rejection ($x$s retained): $\lambda_1 = 2\sigma_\epsilon$, no SIS | $\lambda_1 = 2\sigma_\epsilon$, with SIS | $\lambda_1 = 4\sigma_\epsilon$, no SIS | $\lambda_1 = 4\sigma_\epsilon$, with SIS | Retention ($x$s selected over): $\lambda_1 = 2\sigma_\epsilon$, no SIS | $\lambda_1 = 2\sigma_\epsilon$, with SIS | $\lambda_1 = 4\sigma_\epsilon$, no SIS | $\lambda_1 = 4\sigma_\epsilon$, with SIS |
|---|---|---|---|---|---|---|---|---|
| $x$, $\psi = 0$ | 0.22 | 0.06 | 0.73 | 0.03 | 0.41 | 0.02 | 0.83 | 0.02 |
| $x^2$, $\psi = 0$ | 0.29 | 0.06 | 0.87 | 0.03 | 0.66 | 0.02 | 0.96 | 0.02 |
| $x^3$, $\psi = 0$ | 0.26 | 0.06 | 0.80 | 0.02 | 0.30 | 0.02 | 0.83 | 0.01 |
| $T_1$ step | - | 0.51 | - | 0.93 | - | 0.62 | - | 0.94 |
| SIS gauge | - | 0.02 | - | 0.02 | - | 0.02 | - | 0.02 |

The first four data columns are null rejection frequencies with the $x$s always retained; the last four are retention frequencies with the $x$s selected over.
When there is a shift in the dependent variable that renders non-linear variables significant, SIS is able to pick up the shift that is otherwise attributed to non-linearity in the model. By detecting the shift, the null rejection (and retention) frequencies of the irrelevant non-linear variables are moved much closer to the nominal level, and for a $4\sigma_\epsilon$ shift, using SIS reduces the null rejection frequency for the irrelevant $x_t^k$ from around 0.8 to 0.02–0.03 when testing at 1%. This result is robust to whether the $x_t^k$ are always retained or are selected over. The potency of detecting the shift using SIS is also not affected by the presence of non-linear covariates and is close to that found in previous sections, namely 0.5 for a $2\sigma_\epsilon$ shift and 0.9 for $4\sigma_\epsilon$.
Thus, SIS can act as an insurance mechanism in non-linear models. If there are non-linearities present in the DGP, employing SIS has little effect on the null rejection and retention frequencies, and the gauge is close to the nominal significance level. If there are unknown shifts in the dependent variable that may otherwise be attributed to non-linearities, SIS exhibits high potency in identifying the shift and restores the null rejection and retention frequencies of the irrelevant non-linear variables close to the nominal value.

9. Conclusion

Detecting location shifts by step-indicator saturation has the correct null retention frequency in constant conditional models for a nominal selection size of α. The approximate alternative retention frequency function was derived analytically for simple models and helps explain the simulation outcomes. Although only one and two shifts were considered in detail, the general nature of the approach makes it applicable when there are multiple shifts. There have already been a number of applications of SIS to empirical problems, including commodity price shifts in [35], location shifts in U.K. real wage determination in [21] and detecting crises in [15], as well as variants to measure the impacts of volcanic eruptions on temperature in [30]. Non-linearities that are relevant are not ‘lost’ by using SIS, whereas irrelevant, but spuriously significant ones can be eliminated by SIS.
While all of the derivations and Monte Carlo experiments here have been for simple static equations and specific location shifts, the principles seem general and should apply to dynamic equations (although with approximate null-retention frequencies) and to conditional systems. Generalizations to non-stationary settings would need to extend the analysis in [4]. Other important new analyses for SIS are checking for location shifts at the forecast origin, which would otherwise be pernicious for forecast accuracy, and testing super exogeneity, where the IIS-based test in [5] has relatively low power for long shifts.

Acknowledgements

Financial support from the Open Society Foundations and the Oxford Martin School is gratefully acknowledged. We are indebted to Neil R. Ericsson, Søren Johansen, Bent Nielsen and Giovanni Urga for helpful comments on an earlier version.

Author Contributions

All authors made equal contributions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. J.L. Castle, and D.F. Hendry. “Model selection in under-specified equations with breaks.” J. Econ. 178 (2014): 286–293.
  2. M.P. Clements, and D.F. Hendry. Forecasting Economic Time Series. Cambridge, UK: Cambridge University Press, 1998.
  3. D.F. Hendry, S. Johansen, and C. Santos. “Automatic selection of indicators in a fully saturated regression.” Comput. Stat. 33 (2008): 317–335. Erratum, 337–339.
  4. S. Johansen, and B. Nielsen. “An analysis of the indicator saturation estimator as a robust regression estimator.” In The Methodology and Practice of Econometrics. Edited by J.L. Castle and N. Shephard. Oxford, UK: Oxford University Press, 2009, pp. 1–36.
  5. D.F. Hendry, and C. Santos. “An automatic test of super exogeneity.” In Volatility and Time Series Econometrics. Edited by M.W. Watson, T. Bollerslev and J. Russell. Oxford, UK: Oxford University Press, 2010, pp. 164–193.
  6. R.F. Engle, D.F. Hendry, and J.F. Richard. “Exogeneity.” Econometrica 51 (1983): 277–304.
  7. D.F. Hendry, and G.E. Mizon. “Econometric modelling of time series with outlying observations.” J. Time Series Econ. 3 (2011).
  8. J.J. Reade, and U. Volz. “From the general to the specific: Modelling inflation in China.” Appl. Econ. Q. 7 (2011): 27–44.
  9. N.R. Ericsson, and E.L. Reisman. “Evaluating a global vector autoregression for forecasting.” Int. Adv. Econ. Res. 18 (2012): 247–258.
  10. J.L. Castle, J.A. Doornik, and D.F. Hendry. “Model selection when there are multiple breaks.” J. Econ. 169 (2012): 239–246.
  11. D.F. Hendry, and F. Pretis. “Anthropogenic influences on atmospheric CO2.” In Handbook on Energy and Climate Change. Edited by R. Fouquet. Cheltenham, UK: Edward Elgar, 2013, pp. 287–326.
  12. N.R. Ericsson. “How biased are U.S. Government Forecasts of the Federal Debt?” Int. J. Forecast. in press.
  13. J.A. Doornik. “Autometrics.” In The Methodology and Practice of Econometrics. Edited by J.L. Castle and N. Shephard. Oxford, UK: Oxford University Press, 2009, pp. 88–121.
  14. D.F. Hendry, and J.A. Doornik. Empirical Model Discovery and Theory Evaluation. Cambridge, MA, USA: MIT Press, 2014.
  15. N.R. Ericsson. “Detecting Crises, Jumps, and Changes in Regime.” Working paper. Washington, DC, USA: Federal Reserve Board of Governors, 2012.
  16. J.L. Castle, J.A. Doornik, and D.F. Hendry. “Evaluating automatic model selection.” J. Time Series Econ. 3 (2011).
  17. S. Johansen, and B. Nielsen. “Outlier detection in regression using an iterated one-step approximation to the Huber-Skip estimator.” Econometrics 1 (2013): 53–70.
  18. J. Bai, and P. Perron. “Estimating and testing linear models with multiple structural changes.” Econometrica 66 (1998): 47–78.
  19. J. Bai, and P. Perron. “Computation and analysis of multiple structural change models.” J. Appl. Econ. 18 (2003): 1–22.
  20. P. Perron, and T. Yabu. “Testing for shifts in trend with an integrated or stationary noise component.” J. Bus. Econ. Stat. 27 (2009): 369–396.
  21. J.L. Castle, and D.F. Hendry. “Semi-automatic non-linear model selection.” In Essays in Nonlinear Time Series Econometrics. Edited by N. Haldrup, M. Meitz and P. Saikkonen. Oxford, UK: Oxford University Press, 2014, pp. 163–197.
  22. R. Tibshirani. “Regression shrinkage and selection via the lasso.” J. R. Stat. Soc. B 58 (1996): 267–288.
  23. B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. “Least angle regression.” Ann. Stat. 32 (2004): 407–499.
  24. J.A. Doornik. Object-Oriented Matrix Programming using Ox, 7th ed. London, UK: Timberlake Consultants Press, 2009.
  25. J.A. Doornik, and D.F. Hendry. Empirical Econometric Modelling using PcGive: Volume I, 7th ed. London, UK: Timberlake Consultants Press, 2013.
  26. G.C. Chow. “Tests of equality between sets of coefficients in two linear regressions.” Econometrica 28 (1960): 591–605.
  27. D.F. Hendry, and S. Johansen. “Model discovery and Trygve Haavelmo’s legacy.” Econ. Theory 31 (2014): 93–114.
  28. M. Bergamelli, and G. Urga. “Detecting Multiple Structural Breaks: A Monte Carlo Study and an Application to the Fisher Equation for US.” Discussion paper. London, UK: Cass Business School, 2013.
  29. C. De Peretti, and G. Urga. “Stopping Tests in the Sequential Estimation of Multiple Structural Breaks.” Discussion paper. London, UK: Cass Business School, 2005.
  30. F. Pretis, L. Schneider, and J.E. Smerdon. “Detection of Breaks by Designed Functions Applied to Volcanic Impacts on Hemispheric Surface Temperatures.” Working paper. Oxford, UK: Economics Department, Oxford University, 2014.
  31. C.W.J. Granger, and T. Teräsvirta. Modelling Nonlinear Economic Relationships. Oxford, UK: Oxford University Press, 1993.
  32. H. White. Artificial Neural Networks: Approximation and Learning Theory. Oxford, UK: Oxford University Press, 1992.
  33. J.L. Castle, J.A. Doornik, D.F. Hendry, and F. Pretis. “Detecting Location Shifts by Step-Indicator Saturation.” Working paper. Oxford, UK: Economics Department, Oxford University, 2013.
  34. D.S. Salkever. “The use of dummy variables to compute predictions, prediction errors and confidence intervals.” J. Econ. 4 (1976): 393–397.
  35. R. Mariscal, and A. Powell. “Commodity Price Booms and Breaks: Detection, Magnitude and Implications for Developing Countries.” Discussion paper. Washington DC, USA: Research Department, Inter American Development Bank, 2014.
  • 1 We are currently investigating a range of designed break functions, including interactions with regressors to detect parameter changes.