Fractional Unit Root Tests Allowing for a Structural Change in Trend under Both the Null and Alternative Hypotheses

This paper considers testing procedures for the null hypothesis of a unit root process against the alternative of a fractional process, called a fractional unit root test. We extend the Lagrange Multiplier (LM) tests of Robinson (1994) and Tanaka (1999), which are locally best invariant and uniformly most powerful, to allow for a slope change in trend with or without a concurrent level shift under both the null and alternative hypotheses. We show that the limit distribution of the proposed LM tests is standard normal. Finite sample simulation experiments show that the tests have good size and power. As an empirical analysis, we apply the tests to the Consumer Price Indices of the G7 countries.


Introduction
Non-stationarity is a pervasive feature of economic time series. To carry out proper inference, it is important to identify the exact features that lead to this non-stationarity. A unit root process is a well-known example of a non-stationary process, and testing for a unit root against stationarity has been a topic of substantial interest from both theoretical and empirical perspectives. Perron (1989) [1], however, showed that the Dickey and Fuller (1979) [2] type unit root test is biased in favor of non-rejection of the unit root null hypothesis when the process is trend stationary with a structural change in slope. Perron (1989, 1990) [1,3] proposed testing procedures in which a structural break is allowed under both the null and alternative hypotheses. Later, Christiano (1992) [4] and Zivot and Andrews (1992) [5] criticized the assumption that the date of the structural break is known a priori. In succeeding research, Zivot and Andrews (1992) [5], Perron (1997) [6], and Vogelsang and Perron (1998) [7] treated the break date as unknown and proposed testing procedures for a unit root. In much of this work, especially that of Zivot and Andrews (1992) [5], it was common to allow for a structural break only under the alternative hypothesis, not under the null hypothesis of a unit root. This is very restrictive, and can lead to misleading results. Recent advances in testing for and estimating a structural break in a trend function have made possible the development of unit root tests that allow for a change in trend under both the null and alternative hypotheses. Perron and Zhu (2005) [8] established the consistency, rate of convergence, and limiting distribution of the parameter estimates when there is a break in a trend function with or without a concurrent level shift.
Perron and Yabu (2009) [9] suggested a testing procedure for structural changes in the trend function of a time series without any prior knowledge of whether the noise component is stationary or has an autoregressive unit root. Building on this work, Kim and Perron (2009) [10] proposed unit root testing procedures which allow for a structural change under both the null and alternative hypotheses; see also Carrion-i-Silvestre et al. (2009) [11] for an extension to the case with multiple changes.
Fractional processes with the order of integration d ≥ 0.5 are also non-stationary. Standard unit root tests often reject the null hypothesis when the true process is fractionally integrated with d ∈ (0.5, 1). This can lead to the misleading conclusion that the process of interest is stationary. This motivated researchers to introduce unit root tests which are powerful against the alternative hypothesis of a fractional process. Robinson (1991) [12] derived a Lagrange Multiplier (LM) test for fractional white noise disturbances in a linear regression, while Robinson (1994) [13] proposed tests for unit root, and actually any real values of d in both the frequency and the time domain. Tanaka (1999) [14] suggested an LM test in the time domain, and showed that it is locally best invariant and uniformly most powerful. Dolado et al. (2002) [15] introduced a Wald-type unit root test against the alternative of fractional integration. This test is based on the Dickey and Fuller (1979) [2] type test using an auxiliary regression with a consistent estimate of the integration order. Lobato and Velasco (2007) [16] established a Wald-type test which is more efficient and is asymptotically equivalent to the LM test. Recently, Cho et al. (2015) [17] suggested combining the test of Kwiatkowski et al. (1992) [18] and a unit root test to test the null of integer integration, i.e., I(0) or I(1) against the alternative of fractional integration, i.e., I(d), d ∈ (0, 1). In this line of work, the process of interest has been limited to either a random walk or a purely fractional process. Lobato and Velasco (2007) [16] considered short-run dynamics in the process. Dolado et al. (2008) [19] extended the work of Dolado et al. (2002) [15] and Lobato and Velasco (2007) [16] to incorporate some deterministic components; for instance, a constant and a linear trend function.
Our main contribution is to extend the LM test for a fractional unit root to allow for a structural change in a trend function under both the null and alternative hypotheses. This extension has some advantages, as follows: (i) it imposes a symmetric treatment of the nature of the deterministic trend under both the null and alternative hypotheses; (ii) it does not require long memory to be distinguished from structural change; 1 (iii) the power of fractional unit root tests can be substantially improved when a break is actually present. We consider linear trend models in which a structural change in slope occurs with or without a concurrent level shift.
The rest of this paper is organized as follows. In Section 2, we first introduce fractional processes and the Lagrange Multiplier test of Tanaka (1999) [14] along with preliminary results to be used subsequently. In Section 3, the LM tests are generalized to allow for a structural break in trend under both the null and alternative hypotheses. Extensions to processes with short-run dynamics are discussed in Section 4. The results of simulation experiments about the size and power of the tests are presented in Section 5. As an empirical application, we test for a fractional unit root in the Consumer Price Indices (CPIs) of the G7 countries in Section 6. Concluding remarks are provided in Section 7. All mathematical proofs are relegated to Appendix A.

Lagrange Multiplier Test
For an integer $d = 1, 2, \ldots$, the operator $\Delta^d = (1 - L)^d$ denotes the differencing operator with the usual lag operator $L$; i.e.,

$$\Delta y_t = y_t - y_{t-1}, \qquad \Delta^2 y_t = y_t - 2y_{t-1} + y_{t-2},$$

and so on. For a non-integer real number $d > -1$, the difference operator $\Delta^d = (1 - L)^d$ is defined by means of the binomial expansion

$$\Delta^d = \sum_{k=0}^{\infty} \pi_k(d) L^k, \qquad \pi_k(d) = \frac{\Gamma(k - d)}{\Gamma(k + 1)\,\Gamma(-d)},$$

with $\Gamma(\cdot)$ the gamma function, so that $\pi_k(d) = \binom{k-d-1}{k}$ and $\pi_0(d) = 1$. Recall that $x! \equiv \Gamma(x + 1)$, $x = 0, 1, \ldots$

To define a fractional process, we use the notation of Robinson (2005) [29]. Let $\{\eta_t, t = 0, \pm 1, \ldots\}$ be a short-memory zero-mean covariance stationary process, with spectral density that is bounded and bounded away from zero. For $d \in (-0.5, 0.5)$,

$$\zeta_t = \Delta^{-d}\eta_t, \qquad t = 0, \pm 1, \ldots, \tag{1}$$

is covariance stationary and invertible. The truncated version of $\zeta_t$ is defined as

$$\zeta_t^{\#} = \zeta_t 1_{\{t \geq 1\}}, \tag{2}$$

where $1_A$ is the indicator function for the event $A$. For an integer $m \geq 0$,

$$u_t = \Delta^{-m}\zeta_t^{\#} \tag{3}$$

is a type I $I(m + d)$ process, with order of integration $d_0 = m + d$. Marinucci and Robinson (1999) [30] defined type I and type II fractional Brownian motions with $d \in (-0.5, 0.5)$ on $D[0, 1]$ as functionals of the standard Brownian motion $B(\cdot)$.

1 Given that unit root and long memory processes share similar features, distinguishing between long memory processes and short memory processes with structural changes has been an important topic in econometrics and financial economics. Along the lines of Perron (1989) [1], it is well known that short memory processes with level shifts exhibit properties that lead standard tools to conclude that long memory is present (e.g., Diebold and Inoue (2001) [20], Granger and Hyung (2004) [21], Lu and Perron (2010) [22], Perron and Qu (2010) [23], Qu and Perron (2013) [24], Xu and Perron (2014) [25], and Varneskov and Perron (2016) [26], among many others). On the other hand, it has been also documented that long memory processes induce a rejection of the null hypothesis of no structural change when using conventional structural change tests (see Wright (1998) [27] and Krämer and Sibbertsen (2002) [28]).
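As a minimal sketch of the binomial expansion, the weights $\pi_k(d)$ can be computed from the recursion $\pi_0(d) = 1$, $\pi_k(d) = \pi_{k-1}(d)\,(k - 1 - d)/k$, which follows from $\pi_k(d) = \binom{k-d-1}{k}$. The function names below are hypothetical, and setting pre-sample values to zero in the filter is an assumption of the sketch, not a statement about how the paper's computations were carried out.

```python
def frac_diff_weights(d, n):
    """First n coefficients pi_k(d) of (1 - L)^d, via pi_k = pi_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(x, d):
    """Apply (1 - L)^d to the list x, treating values before the sample as zero (assumption)."""
    w = frac_diff_weights(d, len(x))
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]


if __name__ == "__main__":
    print(frac_diff_weights(0.5, 4))       # weights of (1 - L)^0.5
    print(frac_diff([1.0, 2.0, 4.0], 1))   # d = 1 gives first differences after the first value
```

For $d = 1$ the weights reduce to $(1, -1, 0, 0, \ldots)$, so the filter returns ordinary first differences, which is a convenient sanity check.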
Furthermore, Robinson (2005) [29] and Davidson and Hashimzade (2009) [31] pointed out that asymptotic results vary depending on the definition of fractional Brownian motion considered, which requires one to design simulation experiments in accordance with the particular type used. Now we consider a fractional unit root test. Under the null hypothesis, $\{u_t\}$ is a unit root process; that is, $H_0: d_0 = 1$ (i.e., $d = 0$ and $m = 1$). The alternative hypothesis can be either one-sided ($H_1: d_0 > 1$ or $H_1: d_0 < 1$) or two-sided ($H_1: d_0 \neq 1$). Robinson (1994) [13] and Tanaka (1999) [14] considered the Lagrange Multiplier test in the frequency and time domain. It is well known that the LM test is locally best invariant. Further, Tanaka (1999) [14] showed that the LM test is locally uniformly most powerful invariant because it achieves the power envelope of all the invariant tests against local alternatives. The test statistic suggested in Tanaka (1999) [14] is

$$LM = \sqrt{T}\,\sqrt{\frac{6}{\pi^2}}\,\sum_{k=1}^{T-1}\frac{1}{k}\hat{\rho}_k,$$

where $\hat{\rho}_k = \sum_{t=k+1}^{T}\Delta u_t\,\Delta u_{t-k} \big/ \sum_{t=1}^{T}(\Delta u_t)^2$ is the kth order autocorrelation of the residuals $\Delta u_t$. Local alternatives to the null hypothesis are often considered in the literature, with the integration order defined as $d_0 = 1 + \delta T^{-1/2}$ with $\delta$ fixed, often referred to as Pitman drifts. We state the limiting distribution of the LM test under local alternatives, as it will be relevant for subsequent derivations.

Lemma 1 (Theorem 3.1 in Tanaka (1999) [14]). Under the assumption that $u_t$ is generated by (3) with $d_0 = 1 + \delta T^{-1/2}$, it holds that, as $T \to \infty$, $LM \xrightarrow{d} N\big(\delta\sqrt{\pi^2/6},\, 1\big)$.

2 The restriction that $d_0 \neq 0.5$ is standard in the long memory literature. Tanaka (1999) [14] showed that the case with $d_0 = 0.5$ needs to be treated separately from the case with $d_0 \neq 0.5$.
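The time-domain LM statistic can be sketched in a few lines. The code below assumes the statistic takes the form $\sqrt{T}\sqrt{6/\pi^2}\sum_{k=1}^{T-1}\hat{\rho}_k/k$, with $\hat{\rho}_k$ the kth order sample autocorrelation of the differenced series; this form is our reading of Tanaka (1999) and should be treated as an assumption. The function name `lm_statistic` is hypothetical.

```python
import math


def lm_statistic(e):
    """LM-type statistic sqrt(T) * sqrt(6 / pi^2) * sum_{k=1}^{T-1} rho_k / k.

    e is the differenced series (for example, first differences of the data
    under the unit root null); rho_k is its k-th order sample autocorrelation,
    computed without mean correction.
    """
    T = len(e)
    denom = sum(v * v for v in e)
    total = 0.0
    for k in range(1, T):
        num = sum(e[t] * e[t - k] for t in range(k, T))
        total += (num / denom) / k
    return math.sqrt(T) * math.sqrt(6.0) / math.pi * total


if __name__ == "__main__":
    # An alternating series has strongly negative first-order autocorrelation,
    # so the statistic is negative.
    print(lm_statistic([1.0, -1.0, 1.0, -1.0]))
```

Under the null the statistic would be compared with standard normal critical values; negative values point toward $d_0 < 1$ and positive values toward $d_0 > 1$.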

Deterministic Components Allowing for a Structural Change
In this section, we extend the LM test for a fractional unit root to allow for a structural change in trend with or without a concurrent level shift. We consider the time series of interest $y_t$ as consisting of a deterministic component ($f_t$) and fractionally integrated errors. The data-generating process (DGP) is specified as

$$y_t = f_t + u_t.$$

For $u_t$, we impose $E(u_t) = 0$ and the following assumption.
Assumption 1. u t is a type I I(m + d) process which is defined in (1)- (3). Moreover, the short-memory zero-mean covariance stationary process η t is assumed to be independent and identically distributed (i.i.d.) with zero mean and finite variance.
The i.i.d. assumption on the short-memory process η t will be relaxed later to allow for short-run dynamics. The unit root null hypothesis corresponds to the case with m = 1 and d = 0, which implies that u t is a weighted sum of η t .

Change in Mean
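We first consider the case where $y_t$ experiences a level shift at an unknown time $T_b$. The DGP is specified as

$$y_t = \mu_1 + \mu_b C_t + u_t, \tag{4}$$

where $C_t$ is a dummy variable for a level shift defined by

$$C_t = \begin{cases} 1 & \text{if } t > T_b, \\ 0 & \text{otherwise,} \end{cases}$$

and $T_b = [\lambda_b T]$ is the true break date with the corresponding true break fraction $\lambda_b \in (0, 1)$.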

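Theorem 1 (Change in Mean). Under Assumption 1, suppose that the process $\{y_t\}$ is generated under the null hypothesis of (4). Consider the Lagrange Multiplier test $LM_M$ defined by:

$$LM_M = \sqrt{T}\,\sqrt{\frac{6}{\pi^2}}\,\sum_{k=1}^{T-1}\frac{1}{k}\hat{\rho}_k,$$

where $\hat{\rho}_k$ is the kth order autocorrelation of $\Delta y_t$. Under the null hypothesis $H_0: d_0 = 1$, it holds that, as $T \to \infty$, $LM_M \xrightarrow{d} N(0, 1)$.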
Theorem 1 implies that Tanaka's (1999) [14] LM test is robust to the presence of a level shift. In the following subsection, we consider the LM test in the context of trending series.

Slope and Intercept Change in Trending Series
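We now introduce a deterministic time trend in the models. We follow the notation in Kim and Perron (2009) [10] (henceforth KP), from which we will use some relevant results. The DGPs are specified as

1. Model A0: (Deterministic time trend without a structural change)

$$y_t = \mu_1 + \beta_1 t + u_t \tag{5}$$

2. Model A1: (Level shift)

$$y_t = \mu_1 + \beta_1 t + \mu_b C_t + u_t \tag{6}$$

3. Model A2: (Joint broken trend)

$$y_t = \mu_1 + \beta_1 t + \beta_b B_t + u_t \tag{7}$$

where $B_t$ is a dummy variable for a slope change in trend given by $B_t = (t - T_b)\,1_{\{t > T_b\}}$.

4. Model A3: (Locally disjoint broken trend)

$$y_t = \mu_1 + \beta_1 t + \mu_b C_t + \beta_b B_t + u_t$$

Following KP, we can rewrite Models A1-A3 as follows:

$$y_t = z(T_b)_t'\phi + u_t = z_{t,1}'\phi_1 + z(T_b)_{t,2}'\phi_2 + u_t,$$

where $z_{t,1} = (1, t)'$, $\phi_1 = (\mu_1, \beta_1)'$, and $z(T_b)_{t,2} = C_t$, $\phi_2 = \mu_b$ for Model A1; $z(T_b)_{t,2} = B_t$, $\phi_2 = \beta_b$ for Model A2; and $z(T_b)_{t,2} = (C_t, B_t)'$, $\phi_2 = (\mu_b, \beta_b)'$ for Model A3.

In matrix notation, the models defined previously can be specified as

$$Y = Z(T_b)\phi + U,$$

with $Y = (y_1, \ldots, y_T)'$, $U = (u_1, \ldots, u_T)'$, and $\phi = (\phi_1', \phi_2')'$. Consider first Model A0, where no structural change is allowed. By taking first differences, we can rewrite (5) as follows:

$$\Delta y_t = \beta_1 + \Delta u_t. \tag{8}$$

The ordinary least squares (OLS) estimate of $\beta_1$ is $\hat{\beta}_1 = T^{-1}\sum_{t=1}^{T}\Delta y_t$, which is consistent under both $H_0$ and $H_1$.$^3$ We define $\Delta\tilde{y}_t = \Delta y_t - \hat{\beta}_1$, the OLS residuals from the regression model (8).

Theorem 2 (Linear Trend). Under Assumption 1, suppose that the process $\{y_t\}$ is generated under the null hypothesis of (5). Consider the Lagrange Multiplier test $LM_T$ defined by:

$$LM_T = \sqrt{T}\,\sqrt{\frac{6}{\pi^2}}\,\sum_{k=1}^{T-1}\frac{1}{k}\hat{\rho}_k, \qquad \hat{\rho}_k = \frac{\sum_{t=k+1}^{T}\Delta\tilde{y}_t\,\Delta\tilde{y}_{t-k}}{\sum_{t=1}^{T}(\Delta\tilde{y}_t)^2}.$$

Under the null hypothesis $H_0: d_0 = 1$, it holds that, as $T \to \infty$, $LM_T \xrightarrow{d} N(0, 1)$.

In what follows, the aim is to devise Lagrange Multiplier tests allowing for a slope change in trend with or without a concurrent level shift. The following assumption is essential to that effect.

Assumption 2. $\beta_b \neq 0$ and $\lambda_b \in (\pi, 1 - \pi)$ for some $\pi \in (0, 1/2)$.

Assumption 2 ensures that there is a single slope change in trend, and that the pre- and post-break samples are not asymptotically negligible, which is a standard assumption needed to derive useful asymptotic results. Model A1 (level shift only) will be revisited later.

3 Under $H_0$, $\hat{\beta}_1$ is a $T^{1/2}$-consistent estimator of the slope coefficient $\beta_1$. Hosking (1996) [32] considered a stationary ARFIMA(p, d, q) process $\{y_t\}$ and showed the weak convergence of the sample mean for $d \in (-0.5, 0.5)$. It is not difficult to generalize the result to the case where $d \in (0.5, 1)$, for which $\hat{\beta}_1$ is a $T^{3/2-d}$-consistent estimator.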
The break date can be estimated by using a global least-squares criterion:

$$\hat{T}_b = \arg\min_{T_1} Y'(I - P_{T_1})Y,$$

where $P_{T_1}$ is the matrix that projects on the range space of $Z_{T_1}$; i.e., $P_{T_1} = Z_{T_1}(Z_{T_1}'Z_{T_1})^{-1}Z_{T_1}'$, and the resulting sum of squared residuals is, for an estimated break fraction $\hat{\lambda}_s = \hat{T}_b/T$ (the subscript s refers to the fact that we consider a static regression; a dynamic regression with lagged dependent variables will be considered later):

$$SSR(\hat{\lambda}_s) = Y'(I - P_{\hat{T}_b})Y.$$

The rate of convergence of $\hat{\lambda}_s$ for Models A2 and A3 is $\hat{\lambda}_s - \lambda_b = O_p(T^{-1/2})$ with I(1) errors (see Theorem 3 in Perron and Zhu (2005) [8], henceforth PZ). Chang and Perron (2016) [33] derived the consistency and rate of convergence of $\hat{\lambda}_s$ when the noise component is a fractional process with the differencing parameter $d_0 \in (-0.5, 0.5) \cup (0.5, 1.5)$. For Models A2 and A3, we can construct the detrended process $\{\tilde{y}_t\}$ using $\hat{\lambda}_s$, and the Lagrange Multiplier test statistic $LM_{T,\hat{\lambda}_s}$ is given by

$$LM_{T,\hat{\lambda}_s} = \sqrt{T}\,\sqrt{\frac{6}{\pi^2}}\,\sum_{k=1}^{T-1}\frac{1}{k}\hat{\rho}_k,$$

where $\hat{\rho}_k$ is the kth order autocorrelation of $\Delta\tilde{y}_t$.

The convergence rate of the estimate $\hat{\lambda}_s$ is not fast enough to guarantee that $LM_{T,\hat{\lambda}_s}$ has the standard normal limit under $H_0$. KP faced a similar issue in dealing with unit root tests. They introduced a heuristic explanation of the issue involved, which we briefly review. Let $\hat{\lambda} = \hat{T}_b/T$ denote an estimate of the break fraction. The detrended series can be written as

$$\tilde{Y} = \hat{M}_z Y = \hat{M}_z Z(T_b)_2\phi_2 + \hat{M}_z U,$$

where $Z(T_b)$ and $Z(T_b)_2$ are matrices stacking $\{z(T_b)_t\}$ and $\{z(T_b)_{t,2}\}$, respectively, and the idempotent matrix $\hat{M}_z = I - Z(\hat{T}_b)\big(Z(\hat{T}_b)'Z(\hat{T}_b)\big)^{-1}Z(\hat{T}_b)'$. In finite samples, $\hat{\lambda} \neq \lambda_b$ in general; thereby, $\hat{M}_z Z(T_b)_2\phi_2$ will not be zero. It turns out that a fast rate of convergence for the estimate of the break date is needed for the effect of $\hat{M}_z Z(T_b)_2\phi_2$ on the Lagrange Multiplier test to become negligible asymptotically. The following proposition provides a sufficient condition under which $LM_{T,\hat{\lambda}} \xrightarrow{d} N(0, 1)$ under $H_0$.
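The global least-squares criterion amounts to a grid search: for each candidate break date, regress $y_t$ on the trend regressors and keep the date with the smallest sum of squared residuals. The sketch below does this for Model A2 with regressors $(1, t, B_t)$, where $B_t = \max(t - T_b, 0)$. The function names, the pure-Python OLS solver, and the default trimming of 15% at each end of the sample are assumptions made for illustration; they are not taken from the paper.

```python
def ols_ssr(X, y):
    """Sum of squared OLS residuals from regressing y on the columns of X."""
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    M = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, p):
            f = M[r][c] / M[c][c]
            for j in range(c, p + 1):
                M[r][j] -= f * M[c][j]
    # Back substitution for the coefficient vector.
    beta = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(M[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (M[i][p] - tail) / M[i][i]
    return sum((v - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, v in zip(X, y))


def estimate_break_date(y, trim=0.15):
    """Grid search for the break date in Model A2; y[0] corresponds to t = 1."""
    T = len(y)
    best_date, best_ssr = None, float("inf")
    for Tb in range(int(trim * T), int((1 - trim) * T) + 1):
        X = [[1.0, float(t), float(max(t - Tb, 0))] for t in range(1, T + 1)]
        ssr = ols_ssr(X, y)
        if ssr < best_ssr:
            best_date, best_ssr = Tb, ssr
    return best_date, best_ssr


if __name__ == "__main__":
    # Noise-free broken trend with a slope change at t = 40.
    y = [1.72 + 0.03 * t + 1.0 * max(t - 40, 0) for t in range(1, 101)]
    print(estimate_break_date(y))
```

The estimated break fraction is then the returned date divided by the sample size. Model A3 would add the level-shift dummy $C_t$ as a fourth regressor.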

Proposition 1. Suppose that the process $\{y_t\}$ is generated under the null hypothesis of Model A2 or A3, and that Assumptions 1 and 2 hold. Then, it holds that, as $T \to \infty$, $LM_{T,\hat{\lambda}} \xrightarrow{d} N(0, 1)$, provided that $\hat{\lambda} - \lambda_b = o_p(T^{-1/2})$.

Proposition 1 implies that the estimate of the break fraction should converge at a rate faster than $T^{1/2}$. As shown above, $\hat{\lambda}_s$ does not satisfy this condition. Hence, we need to consider alternative ways to accelerate the rate of convergence of the estimate of the break fraction. KP suggested two possible approaches. The first is based on minimizing the sum of squared residuals (SSR) of a dynamic regression model. This method is similar to that in Hatanaka and Yamada (1999) [34]. The relevant dynamic regressions are specified separately for Model A2 and for Models A1 and A3; the latter include a one-time dummy $D(T_b)_t$, where $D(T_b)_t = 1$ for $t = T_b + 1$ and 0 otherwise. Under the null hypothesis, we obtain an estimate of the break fraction $\hat{\lambda}_d$ which has a faster rate of convergence. It is worth noting that, as discussed by Hatanaka and Yamada (1999) [34] and KP, the estimate $\hat{\lambda}_d$ has a negative bias in finite samples, especially for Model A3. As we shall see, this will affect the finite sample properties of the tests.

The second approach is to use a trimmed data set using a window whose length depends on the sample size and which contains the estimated break date. The trimmed series then consists of the original one with the data points in the window excluded. KP showed that the rate of convergence of $\hat{\lambda}_s$ can be increased with the trimmed data set. Suppose that the estimate of the break fraction satisfies $\hat{\lambda} - \lambda_b = O_p(T^{-a})$ for some $0 < a < 1$, and the trimming window has length $2w(T)$ with $w(T) \equiv c_1 T^{\delta}$, $c_1 > 0$, and $-1 < -a < \delta < 0$. With this specification, the length of the window is negligible in the limit compared to the sample size $T$, but is still large enough to include the true break date asymptotically. Following KP's suggestion, one proceeds as follows:

• Estimate the break fraction $\hat{\lambda}_s$ from the original data set and form a window that ranges from $T_l + 1$ to $T_h$, where $T_l = [(\hat{\lambda}_s - w(T))T]$ and $T_h = [(\hat{\lambda}_s + w(T))T]$.

• A trimmed data set is constructed by removing the original data from $T_l + 1$ to $T_h$ and then shifting down the data after the window by $D(T) = y_{T_h} - y_{T_l}$. After the trimming and connecting procedures, we now have a new series $\{y_t^*\}$, for $t = 1, \ldots, T^*$ ($\equiv T - 2w(T)T$), defined by:

$$y_t^* = \begin{cases} y_t & \text{if } t \leq T_l, \\ y_{t + T_h - T_l} - D(T) & \text{if } t > T_l. \end{cases}$$

• Test the null hypothesis $H_0: d_0 = 1$ using $T_l$ as the break date (i.e., $\hat{\lambda}_{tr} = T_l/T^*$). The Lagrange Multiplier test statistic is then given by

$$LM_{T,\hat{\lambda}_{tr}} = \sqrt{T^*}\,\sqrt{\frac{6}{\pi^2}}\,\sum_{k=1}^{T^*-1}\frac{1}{k}\hat{\rho}_k^*,$$

where $\hat{\rho}_k^*$ is the kth order autocorrelation of $\Delta\tilde{y}_t^*$, and $\tilde{y}_t^*$ is the detrended version of $y_t^*$ using the estimate of the break date $T_l$ (or break fraction $\hat{\lambda}_{tr}$).
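The trimming and connecting steps above can be sketched as follows. The function name `trim_series` is hypothetical, and the half-width `w` is expressed directly in observations (that is, $w(T) \cdot T$ rounded to an integer), which is a simplifying assumption of this sketch. Observations from $T_l + 1$ to $T_h$ are removed, and the remaining post-window data are shifted down by $D(T) = y_{T_h} - y_{T_l}$.

```python
def trim_series(y, break_date, w):
    """Remove a window around the estimated break date and reconnect the series.

    y          : list with y[0] corresponding to t = 1
    break_date : estimated break date (1-based)
    w          : half-width of the window in observations (assumed integer)

    Returns the trimmed series and T_l, the break date to use on it.
    """
    T = len(y)
    Tl, Th = break_date - w, break_date + w
    if Tl < 1 or Th >= T:
        # The window reaches an end of the sample; in that case the trimmed
        # series has no break and a no-break statistic would be used instead.
        raise ValueError("window contains an end of the sample")
    shift = y[Th - 1] - y[Tl - 1]            # D(T) = y_{T_h} - y_{T_l}
    head = y[:Tl]                            # y_1, ..., y_{T_l}
    tail = [v - shift for v in y[Th:]]       # y_{T_h + 1}, ..., y_T shifted down
    return head + tail, Tl


if __name__ == "__main__":
    y = [float(t) for t in range(1, 21)]     # y_t = t, T = 20
    trimmed, Tl = trim_series(y, break_date=10, w=3)
    print(Tl, len(trimmed), trimmed)
```

For the pure linear trend in the usage example, the reconnected series is again a linear trend of length $T - 2w$, which illustrates that the shift removes the jump created by deleting the window.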

Remark 2.
If the window contains either end of the data, then the process {y * t } turns out to be Model A0 (no structural break), and the statistic in Theorem 2 should be applied to the trimmed data {y * t }.
The trimmed process {y * t } will satisfy the properties of Model A2 regardless of the specification of the original data {y t }, which implies that we can use a common limit distribution. The following proposition states the limiting distribution of the Lagrange Multiplier test based on the trimmed data, which is the same as would be obtained if the break date was known in Model A2.

Proposition 2.
Suppose that the process {y t } is generated under the null hypothesis of Model A2 or A3, and that Assumptions 1 and 2 hold. Then, it holds that as T → ∞, LM T,λ tr d → N (0, 1).
As shown in KP, under the null hypothesis of a unit root, the estimate of the break fraction λ tr = T l /T * converges in probability to the true break fraction at some rate greater than T. Hence, the sufficient condition in Proposition 1 is satisfied, so that the proof of Proposition 2 is trivial and omitted.
In concluding this section, we consider the case where there is a change in mean; that is, $\mu_b \neq 0$, as in Models A1 and A3. In Model A1, we assume that there is a level shift only; that is, $\mu_b \neq 0$ and $\beta_b = 0$. Under the null hypothesis, the stochastic trend generated by the I(1) error process tends to dominate a level shift. Hence, we cannot estimate the break fraction $\lambda_b$ consistently, because the magnitude of the level shift is asymptotically negligible. In finite samples, we can ignore the level shift if the magnitude of the break is small. Then, Model A1 can be treated as Model A0, and we can follow the testing procedure pertaining to Theorem 2. However, a loss of power is inevitable if a large change in mean is ignored.
On the other hand, the level shift can be specified as an increasing function of the sample size; i.e., µ b = c 2 T 1/2+α for some c 2 > 0 and α > 0. As addressed in Harvey et al. (2001) [35], PZ, and KP, this specification provides better approximations of the properties of the tests in finite samples when the level shifts are not very small. The models with µ b = c 2 T 1/2+α are labeled as Models A1b and A3b, respectively. Proposition 3. Suppose that the process {y t } is generated under the null hypothesis of Model A1b or A3b. Then, LM T,λ diverges as T → ∞.
Although the rate of convergence of the estimate of the break fraction is faster than in the case of a change in slope (see Proposition 7 in KP), Proposition 3 states that the LM tests cannot obtain the standard normal limiting distribution. Hence, the LM test LM T,λ , using the critical values from the standard normal distribution, suffers from some liberal size distortions, even when |µ b | is large. 4

Using a Pre-Test for a Break in Slope
The results of Theorem 2 and Proposition 2 show that the limit distribution of the test is the same whether or not a break in slope is introduced as a regressor, even when the DGP specifies that no break is present. Hence, unlike the case of testing for a unit root as in KP, theoretically there is no need to carry out a pre-test to improve the power of the test. However, Chang and Perron (2016) [33] considered Models A2 and A3 with fractionally integrated errors and showed that the so-called spurious break issue occurs with the order of fractional integration $d_0 \in (0, 0.5) \cup (0.5, 1.5)$. This extended the results of Nunes et al. (1995) [36], who considered the unit root case. This means that under both the null and alternative hypotheses, if a break in slope is not present and one is allowed in the regression, the fitted model will with large probability suggest the presence of a break. This could have an effect on both the size and power of the test. On one hand, the slope change regressor may induce added liberal size distortions in finite samples because of the overfit. On the other hand, since it is a superfluous regressor when no break exists in the DGP, power may be reduced. Hence, it may be the case that in finite samples it is beneficial to use a pre-test for a change in slope and try to choose between models (5) and (7). Since a test for a change in mean will be inconsistent, there is no point in trying to distinguish between models (5) and (6) or between model (4) and the corresponding one without the change in mean. Iacone et al. (2013) [37] suggested a sup-Wald type test (SW) for Model A2. In particular, it is robust to any order of fractional integration $d_0$ located in the interval [0, 1.5), excluding the boundary case 0.5.
More precisely, given their recommended choice for the bandwidth when constructing the local Whittle estimate of $d_0$, their test is consistent for values of $d_0$ in the interval [0, 1.32], though we believe the proof can be modified to allow the interval [0, 1.5). It follows the generalized least squares approach to construct the test statistic for a structural change in trend by taking $d_0$-differences of the data. To make the test feasible, the fully extended local Whittle (FELW) estimator $\hat{d}_{FELW}$ of Abadir (2007) [38] is considered. While the FELW estimator is constructed under the null hypothesis of no structural change, Iacone et al. (2013) [37] showed that it also satisfies the necessary condition for consistency, even with a local break in trend. Since the true break date is unknown a priori, the final statistic SW uses the sup functional of Andrews (1993) [39] across all admissible break dates. This test is asymptotically size controlled for all values of $d_0$ in the prescribed range. Using this pre-test, we can then define the alternative estimate of the break fraction $\tilde{\lambda} = \hat{\lambda} \cdot 1_{\{SW > \tau\}}$, where $\tau$ is the critical value for the SW test with a nominal size p%. Given that SW is a consistent test, $\operatorname{plim}_{T\to\infty}\tilde{\lambda} = \lambda_b$ if $\hat{\lambda}$ is a consistent estimate of $\lambda_b$. If there is no break in the DGP, we can expect that p% of the estimates $\tilde{\lambda}$ are nonzero. In order to obtain a consistent estimate of $\lambda_b$ under the null of no structural break, we assume that the critical value $\tau$ is a function of the sample size $T$. Since $SW = O_p(T^{\epsilon})$, $\epsilon > 0$, with a local break, let $\tau = cT^{\bar{\epsilon}}$ for $0 < \bar{\epsilon} < \epsilon$. This specification, introduced in KP, is useful because it does not have any effect on the consistency of the test SW and does guarantee that $\operatorname{plim}_{T\to\infty}\tilde{\lambda} = 0$ when no break is present. Hence, based on the consistency of $\tilde{\lambda}$, it is recommended to use $LM_T$ if $\tilde{\lambda} = 0$ and $LM_{T,\hat{\lambda}}$ if $\tilde{\lambda} \neq 0$. The LM test statistics with the pre-test are denoted by $LM^{p}_{T,\hat{\lambda}}$.
Whether using a pre-test is beneficial will be assessed later via simulation experiments about the size and power of the tests.

Short-Run Dynamics
We now relax Assumption 1 to introduce short-run dynamics in the noise component. A zero-mean short-memory covariance stationary process $\eta_t$ can be represented as a one-sided moving average, $\eta_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j}$, where the $\varepsilon_t$ are random variables with mean zero. A special case of interest is an autoregressive moving average (ARMA(p, q)) process given by $\phi(L)\eta_t = \theta(L)\varepsilon_t$. In order to implement the Lagrange Multiplier test, we first estimate the parameters $\Psi = (\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q)$ consistently. Then, under the null hypothesis, we can construct $\hat{\varepsilon}_t = \hat{\phi}(L)\hat{\theta}(L)^{-1}(1 - L)^{d_0}\hat{u}_t$, where the $\hat{u}_t$ are the OLS residuals from the model considered, whereas $\hat{\phi}(L)$ and $\hat{\theta}(L)$ are estimated from $\phi(L)(1 - L)^{d_0}\hat{u}_t = \theta(L)\varepsilon_t$, using $d_0 = 1$. With short-run dynamics in the noise component, we consider the following test statistic:

$$\widetilde{LM} = \sqrt{T}\,\sum_{k=1}^{T-1}\frac{1}{k}\hat{\rho}_k,$$

where $\hat{\rho}_k$ is the kth order autocorrelation of the residuals $\hat{\varepsilon}_1, \ldots, \hat{\varepsilon}_T$. Tanaka (1999) [14] derived an important result related to this statistic when no break is present.

Lemma 2 (Theorem 3.3 in Tanaka (1999) [14]). Under local alternatives-that is, d
with g j and h j the coefficients of L j in the expansion of 1/φ(L) and 1/θ(L), respectively, and Φ the Fisher information matrix for φ and θ.
Remark 3. Note that ω 2 < π 2 /6; hence comparing Lemmas 1 and 2, the LM test has lower local asymptotic power in the presence of short-run dynamics of any kind. As will be shown via simulations, the loss in power can be substantial. It remains, nevertheless, inevitable.
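With the maximum likelihood estimate $\hat{\omega}$, we show that $\widetilde{LM}/\hat{\omega} \xrightarrow{d} N(0, 1)$ as $T \to \infty$ under the null hypothesis. In particular, when p = 1 or q = 1, it is easy to compute $\hat{\omega}$. Since $v_j = \varsigma v_{j-1} + \varepsilon_j$ in both cases, $g_j = \varsigma^j$. Hence, we have

$$\omega^2 = \frac{\pi^2}{6} - \frac{1 - \varsigma^2}{\varsigma^2}\big(\log(1 - \varsigma)\big)^2,$$

and $\hat{\omega}$ can be computed using $\hat{\varsigma}$. All these results remain valid for all trending models with a break considered. The relevant correction needed is a simple scaling by $\hat{\omega}$ so that the test becomes $LM^*_{T,\hat{\lambda}} \equiv LM_{T,\hat{\lambda}}/\hat{\omega}$.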

Proposition 4.
Suppose that the process $\{y_t\}$ is generated under the null hypothesis of Model A2 or A3, and that Assumptions 1 and 2 hold with $\eta_t$ being an ARMA(p, q) process. Then, it holds that, as $T \to \infty$, $LM^*_{T,\hat{\lambda}} \xrightarrow{d} N(0, 1)$, provided that $\hat{\lambda} - \lambda_b = o_p(T^{-1/2})$.

The sufficient conditions in Proposition 4 follow from Lemma 2 and Proposition 1; hence, the proof is omitted. The finite sample performance of $LM^*_{T,\hat{\lambda}}$ with $\hat{\lambda} \in \{\hat{\lambda}_s, \hat{\lambda}_d, \hat{\lambda}_{tr}\}$, allowing for a structural break under both the null and alternative hypotheses, will be examined in the next section.

Simulation Experiments
In this section, we present results from simulation experiments to illustrate the various theoretical results. Throughout the simulations, the true break fraction is set to $\lambda_b = 0.5$.$^5$ In the DGP, the noise component is $u_t = \Delta^{-1}\zeta_t^{\#} = \Delta^{-1}\eta_t 1_{\{t \geq 1\}}$, $t = 0, \pm 1, \pm 2, \ldots$, where $\eta_t$ is a short-memory zero-mean covariance stationary process that will be specified below. We set some parameters as follows: $\mu_1 = 1.72$, $\beta_1 = 0.03$, $\mu_b = 1$, and $\beta_b = 1$. The configurations are the same as those in PZ, chosen to obtain distributions that easily reveal the main features of interest. In all cases, the results are obtained via 10,000 replications. Additionally, 5% nominal size tests are considered.

5 Unreported simulation results with $\lambda_b \in \{0.3, 0.7\}$ are qualitatively similar to those with $\lambda_b = 0.5$.
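A size experiment of this kind can be sketched for the no-break case (a linear trend plus a random walk with i.i.d. N(0, 1) increments). The code below is an illustrative sketch only: the form of the statistic, $\sqrt{T}\sqrt{6/\pi^2}\sum_k \hat{\rho}_k/k$ computed from demeaned first differences, and the two-sided 5% rejection rule $|LM| > 1.96$ are assumptions, the function names are hypothetical, and the default number of replications is far smaller than the 10,000 used in the paper.

```python
import math
import random


def lm_statistic(e):
    """sqrt(T) * sqrt(6 / pi^2) * sum_{k=1}^{T-1} rho_k / k for the series e."""
    T = len(e)
    denom = sum(v * v for v in e)
    total = 0.0
    for k in range(1, T):
        total += sum(e[t] * e[t - k] for t in range(k, T)) / denom / k
    return math.sqrt(T) * math.sqrt(6.0) / math.pi * total


def simulate_size(T=100, reps=200, mu1=1.72, beta1=0.03, seed=0):
    """Rejection frequency of the two-sided 5% test under the unit root null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # y_t = mu1 + beta1 * t + u_t for t = 0, ..., T, with u_0 = 0 and
        # u_t a random walk driven by i.i.d. N(0, 1) innovations.
        u, y = 0.0, []
        for t in range(T + 1):
            if t > 0:
                u += rng.gauss(0.0, 1.0)
            y.append(mu1 + beta1 * t + u)
        dy = [y[t] - y[t - 1] for t in range(1, T + 1)]
        slope = sum(dy) / T                  # OLS estimate of beta1 from first differences
        resid = [v - slope for v in dy]
        if abs(lm_statistic(resid)) > 1.96:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print(simulate_size())
```

Increasing `reps` and `T` brings the experiment closer to the design described in the text, at the cost of run time, since each statistic requires on the order of $T^2$ operations in this plain implementation.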
First, to illustrate the effect of a structural break on the power of the fractional unit root test, we consider two different models when a structural change in slope is allowed in the DGP: (i) Model A0 (which ignores a relevant slope change); and (ii) Model A2 (which is well specified). The results are provided in Table 1. It is clear that the power of $LM_T$ is much lower than that of $LM_{T,\hat{\lambda}_d}$, which supports the fact that a structural break in the DGP should be allowed when testing for a fractional unit root. Tables 2-5 present the rejection probabilities of the tests $LM_T$ and $LM_{T,\hat{\lambda}}$ at the 5% significance level when $\eta_t \sim$ i.i.d. $N(0, 1)$. In Table 2, no structural change is allowed in the DGP (Model A0); i.e., $y_t = \mu_1 + \beta_1 t + u_t$. The size of $LM_T$ is well controlled: it is 0.05 and 0.06 with sample sizes T = 150 and 500, respectively. Table 3 reports the results for Model A1. The break fraction is not estimated consistently, because the level shift is negligible compared to the stochastic trend induced by the I(1) errors. Hence, $LM_{T,\hat{\lambda}_s}$ and $LM_{T,\hat{\lambda}_d}$ suffer from severe size distortion, while $LM_T$ maintains size close to the nominal level of 5%. Table 4 presents the results pertaining to Model A2. We also consider the test based on trimmed data, $LM_{T,\hat{\lambda}_{tr}}$. The test $LM_{T,\hat{\lambda}_d}$ is size-controlled, while the others show minor size distortion. However, the power of $LM_{T,\hat{\lambda}_d}$ is always lower than that of the other two tests. Table 5 presents the results pertaining to Model A3. Here, we set $\mu_b = 0$ to consider the effect of an irrelevant level shift. Notice that $LM_{T,\hat{\lambda}_s}$ exhibits liberal size distortion and $LM_{T,\hat{\lambda}_d}$ also shows considerable size distortion. As noted by Chang and Perron (2016) [33], the estimate of the break date shows a pattern of bi-modality when an irrelevant level shift is introduced. This phenomenon is referred to as the "contamination" effect, because the irrelevant level shift can make the estimate of the true break date less precise.
By construction, the contamination effect is marginal on $LM_{T,\hat{\lambda}_{tr}}$, whose exact size is 6.7% when T = 500. In Tables 6-8, we provide simulation results when the errors have short-run dynamics; i.e., $\eta_t = \rho\eta_{t-1} + \varepsilon_t$ with $\varepsilon_t \sim$ i.i.d. $N(0, 1)$. We set the value of the autoregressive (AR) parameter at $\rho \in \{-0.5, 0, 0.3, 0.6, 0.8\}$. When $\rho = 0$, we can compare the loss of power caused by allowing for dynamics when none is present. The other parameters remain unchanged. Table 6 reports the size and power of the Lagrange multiplier tests pertaining to Model A1, $LM_T$. It is well size-controlled with less persistent AR parameters $\rho \in \{0, 0.3\}$, but it is very conservative with a higher AR coefficient $\rho \in \{0.6, 0.8\}$, while it shows liberal size distortions with $\rho = -0.5$. We find some interesting features in terms of power. First, power is higher when the AR parameter $\rho$ is negative (in part due to the liberal size distortions). Second, as $\rho$ becomes positive and large, power shrinks considerably. In particular, the loss of power is substantial when the AR parameter $\rho$ increases from 0.6 to 0.8. This implies that a sufficiently large time span is needed to distinguish fractional integration from weak dependence. Comparing Table 3 with Table 6, for the $\rho = 0$ case, it is obvious that power is substantially lower when an irrelevant AR parameter is introduced. This result suggests that selecting the number of lags in the noise component is crucial to obtain good power. Lastly, with a persistent AR parameter $\rho = 0.8$, the LM tests have non-monotonic power; that is, power does not increase when the order of integration $d_0$ moves away from the null of a unit root. As also discussed in Lobato and Velasco (2007) [16], it is difficult to distinguish fractional integration from a highly persistent stationary short-memory process. Table 7 reports the results pertaining to Model A2. They show similar patterns as for Model A1.
It is noticeable that LM * T,λ d performs well in terms of size across all cases, while its power is always lower than that of the other tests. Table 8

The Size and Power When a Pre-Test Is Used
In Figures 1-4, we present the size and power of the LM tests as the slope change parameter ($\beta_b$) varies in Models A2 and A3, with and without the use of a pre-test. As a pre-test, we use the SW test of Iacone et al. (2013) [37] at the nominal 5% level. We only consider the versions of the LM statistics based on the trimmed estimate of the break fraction, denoted $LM_{T,\hat\lambda_{tr}}$ and $LM^{p}_{T,\hat\lambda_{tr}}$ when no short-run dynamics are allowed, and $LM^{*}_{T,\hat\lambda_{tr}}$ and $LM^{*p}_{T,\hat\lambda_{tr}}$ when an AR(1) structure is allowed. To assess the extent of the differences in size distortion and power, we also report the infeasible LM tests based on the true value of the break fraction, denoted $LM_{T,\lambda_b}$ and $LM^{*}_{T,\lambda_b}$. The results are presented in Figure 1 for Model A2 and Figure 2 for Model A3 (no short-run dynamics), and in Figures 3 and 4 for the cases with serial correlation in the DGP.

The results for Model A2 (presented in Figure 1) show first that the version without the pre-test exhibits some liberal size distortions when $\beta_b$ is near 0; these shrink as $T$ increases but remain noticeable even with $T = 300$. On the other hand, the version with the pre-test exhibits conservative size distortions when $\beta_b$ is near but not equal to 0, which again shrink but remain noticeable when $T = 300$. The most drastic differences occur when considering the power of the tests. The power of the version without the pre-test is slightly below but close to the power of the version with the true break fraction when $T = 150$ for all values of $\beta_b$. When $T = 300$, the power functions are nearly the same. Things are very different when the version with the pre-test is used. When $\beta_b$ is near but not equal to zero, the power falls drastically, creating pronounced power valleys. This reduction in power is somewhat alleviated when $T = 300$, but remains important. The reason is that, for low values of $\beta_b$, the SW test of Iacone et al. (2013) [37] is not powerful enough, so a change-in-slope regressor is not included.
Yet, the magnitude of the change in slope is large enough to induce a considerable loss in power. This is akin to the problem faced by the Kim and Perron (2009) [10] test in the context of testing for a unit root.
The results for Model A3 (presented in Figure 2) show a similar picture. This is also the case when considering the tests $LM^{*}_{T,\hat\lambda_{tr}}$ and $LM^{*p}_{T,\hat\lambda_{tr}}$ with serial correlation in the DGP (Figures 3 and 4).

Based on the simulation results, we recommend using the $LM_{T,\hat\lambda_{tr}}$ or $LM^{*}_{T,\hat\lambda_{tr}}$ tests without the pre-test. In our view, the reduction in power when using a pre-test considerably outweighs the differences in size distortions. The SW test of Iacone et al. (2013) [37] is nevertheless still useful to assess the presence of large breaks.
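For readers who wish to experiment with size and power patterns of this kind, the following minimal sketch computes a time-domain LM-type statistic of the Tanaka (1999) form, $\sqrt{T}\,\sqrt{6/\pi^2}\,\sum_{k} \hat\rho_k / k$, from the first differences of a series. It is a simplified stand-in rather than the statistics studied here: it assumes no trend break, no pre-test, and no AR correction, it removes only a constant drift, and the function name `tanaka_lm` and its `demean` argument are hypothetical.

```python
import numpy as np

def tanaka_lm(y, demean=True):
    """LM-type statistic for H0: d = 1, built from autocorrelations of the differenced series."""
    e = np.diff(np.asarray(y, dtype=float))   # residuals under the unit root null
    if demean:
        e = e - e.mean()                      # assumption: only a constant drift is removed
    n = e.size
    denom = np.dot(e, e)
    lags = np.arange(1, n)
    # sample autocorrelations rho_k = sum_t e_t e_{t-k} / sum_t e_t^2
    rho = np.array([np.dot(e[k:], e[:-k]) for k in lags]) / denom
    # weights 1/k come from the expansion -log(1 - L) = L + L^2/2 + L^3/3 + ...
    return np.sqrt(n) * np.sqrt(6.0) / np.pi * np.sum(rho / lags)

rng = np.random.default_rng(123)
stat_unit_root = tanaka_lm(np.cumsum(rng.standard_normal(500)))  # d = 1
stat_i0 = tanaka_lm(rng.standard_normal(500))                    # d = 0, far below the null
```

Under the null the statistic is compared with standard normal critical values; large negative values point towards $d < 1$ and large positive values towards $d > 1$.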

An Empirical Application
We analyze the aggregate price indices of the G7 countries. Monthly seasonally-adjusted CPI series were obtained from the OECD Main Economic Indicators. All series are analyzed in logarithms and are plotted in Figure 5, where the vertical line is the break date estimated by minimizing the sum of squared residuals from Model A2. The results are presented in Table 9. We only consider Model A2, and use the test for a slope change of Iacone et al. (2013) [37]. Based on the simulation results, we recommend using the LM tests with $\hat\lambda_{tr}$. We present results with and without short-run dynamics. When dynamics are allowed, an AR(1) specification is used. The time span is from January 1969 to December 2007 ($T = 468$).

First, the augmented Dickey-Fuller type test (ADF) cannot reject the null of a unit root against the alternative of trend stationarity for any of the G7 countries. On the other hand, the SW test of Iacone et al. (2013) [37] detects a change in the slope of the trend. Allowing for a structural change in trend, the fractional unit root tests $LM_{T,\hat\lambda_{tr}}$ and $LM^{*}_{T,\hat\lambda_{tr}}$ lead to a rejection of the unit root in favor of fractional integration. Specifically, the test results indicate that the order of fractional integration is greater than unity for all G7 countries. We apply the two-step feasible exact local Whittle estimator $\hat d_{ELW}$ of Shimotsu (2010) [40] to the residuals from the fitted trend equipped with the estimate of the break date $\hat T_b$. This result is compatible with that in Gil-Alana (2008) [41], who estimated the order of fractional integration for the U.S. CPI and found that the confidence intervals were located above unity. Hassler and Wolters (1995) [42] considered the inflation rates for various countries, and found that the order of fractional integration was located in the interval (0, 0.5); that is, the inflation rate is a long-memory process.

(3) The numbers in · are the estimates of the AR coefficient in the noise component.
*, **, and *** denote a statistic significant at the 10%, 5%, and 1% level, respectively.
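The break-date estimate used for Figure 5, obtained by minimizing the sum of squared residuals, can be illustrated with a simple grid search. The sketch below is not the code used for Table 9. It assumes that Model A2 is a joined broken trend, $y_t = \mu + \beta t + \beta_b (t - T_b)\,1(t > T_b) + u_t$, and it restricts the search to the central part of the sample through a hypothetical `edge` argument (which is not the trimmed estimator $\hat\lambda_{tr}$); the function name `estimate_break_date` is likewise illustrative.

```python
import numpy as np

def estimate_break_date(y, edge=0.15):
    """Grid search for the break date minimizing the OLS sum of squared residuals."""
    y = np.asarray(y, dtype=float)
    T = y.size
    t = np.arange(1, T + 1, dtype=float)
    lo = int(np.floor(edge * T))
    hi = int(np.ceil((1 - edge) * T))
    best_tb, best_ssr = None, np.inf
    for tb in range(lo, hi + 1):
        # regressors: constant, linear trend, slope-change term (t - tb)^+
        X = np.column_stack([np.ones(T), t, np.maximum(t - tb, 0.0)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        ssr = float(resid @ resid)
        if ssr < best_ssr:
            best_tb, best_ssr = tb, ssr
    return best_tb, best_ssr

# Illustrative data only: a trend whose slope rises by 0.5 after observation 120
rng = np.random.default_rng(42)
t = np.arange(1, 201, dtype=float)
y = 1.0 + 0.1 * t + 0.5 * np.maximum(t - 120, 0.0) + rng.standard_normal(200)
tb_hat, ssr_hat = estimate_break_date(y)
```

The estimated break fraction is then `tb_hat / len(y)`. With fractionally integrated errors the precision of this estimate depends on the order of integration, which is why the rate of convergence of the break fraction estimate matters for the limit theory.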

Conclusions
We established testing procedures for a fractional unit root, allowing for a structural change under both the null and alternative hypotheses. Following Robinson (1994) [13], Tanaka (1999) [14] derived a Lagrange multiplier test in the time domain, and Dolado et al. (2002, 2008) [15,19] and Lobato and Velasco (2007) [16] considered Wald-type tests for a unit root null hypothesis against fractional integration. Although Dolado et al. (2008) [19] introduced deterministic components, the case with a structural break in trend has not been considered in the literature. In contrast to the large amount of work on testing the null hypothesis of long memory against the alternative of stationarity with level shifts, and vice versa, work on fractional unit root tests allowing for a structural break in trend is scarce. To the best of our knowledge, this paper is the first to address testing for a fractional process allowing for a structural break under both the null and alternative hypotheses.
Fractional unit root tests allowing for a structural break under both the null and alternative hypotheses have some desirable features: (i) given that economic variables are often subject to structural changes, our approach imposes a symmetric treatment of the change under both the null and alternative hypotheses; (ii) it is not required to distinguish long memory from structural change; (iii) the power of fractional unit root tests can be substantially improved when a break is actually present. Under some conditions, the proposed LM test statistics have the standard normal limit under the null hypothesis. Simulation experiments confirmed that the tests have good size and power. Hence, we believe that our procedures offer useful complements to existing tests and should be used in practical applications.
An extension of practical interest is to allow $I(d_0)$ processes under the null hypothesis, where $I(1)$ processes are included as a special case. The sufficient condition for the LM test statistic to have the standard normal limit may be different from that in Proposition 1. Recently, Chang and Perron (2016) [33] extended PZ's analysis to cover the more general case of fractionally integrated errors for values of $d_0$ in the interval $(-0.5, 1.5)$, excluding the boundary case 0.5. In particular, they established the rate of convergence of $\hat\lambda_s$ from the static regression [33] (Theorem 2). It is also important to examine the performance of $\hat\lambda_d$ and $\hat\lambda_{tr}$ under the null of $I(d_0)$ processes. Such investigations, and others, are the subject of ongoing research.

covariance stationary process, it is straightforward to show that $T^{-1}\sum_{k=1}^{T-(T_b+1)} \eta^2_{T_1+1+k} = O_p(1)$. By the continuous mapping theorem, $|A_2| = o_p(1)$, which implies that $A_2 = o_p(1)$. Similarly, where we use the expansion $-\log\Delta = L + \frac{1}{2}L^2 + \frac{1}{3}L^3 + \cdots$. We show that $B_i = o_p(1)$ for $i = 1, 2, 3$.
where $\gamma$ is the Euler-Mascheroni constant and $\zeta_T \sim 1/(2T)$, which approaches 0 as $T \to \infty$. The results for $B_2$ and $B_3$ follow from the arguments used in the proof of Theorem 1 for the terms $A_2$ and $A_3$. Hence, Then, under $H_0$, This completes the proof.
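The remainder $\zeta_T \sim 1/(2T)$ mentioned in the proof presumably refers to the standard expansion of the harmonic sum, $\sum_{k=1}^{T} 1/k = \log T + \gamma + \zeta_T$; the displayed equation is not available here, so this reading is an assumption. Under that assumption, the following short numerical check shows how quickly the remainder approaches $1/(2T)$. The names `harmonic` and `zeta_T` are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def harmonic(T):
    """Harmonic sum 1 + 1/2 + ... + 1/T, accumulated with math.fsum for accuracy."""
    return math.fsum(1.0 / k for k in range(1, T + 1))

def zeta_T(T):
    """Remainder in harmonic(T) = log(T) + gamma + zeta_T."""
    return harmonic(T) - math.log(T) - EULER_GAMMA

for T in (10, 100, 1000):
    # the last column tends to 1 as T grows, consistent with zeta_T ~ 1/(2T)
    print(T, zeta_T(T), zeta_T(T) * 2 * T)
```

The weights $1/k$ in the harmonic sum are the same weights that appear in the expansion of $-\log\Delta$, which is why a logarithmic term in $T$ arises in these bounds.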
Proof of Propositions A1 and A3. When the estimate of the break date is consistent, the proof is trivial and omitted. Here we focus on the case where the break fraction is estimated consistently at some rate $T^{\kappa}$ for $0 < \kappa \le 1$. Specifically, suppose that the estimate of the break fraction $\hat\lambda$ satisfies $\hat\lambda - \lambda_b = o_p(T^{-\kappa})$ for $0 < \kappa \le 1$. For all Models A1-A3, we have a detrended sequence $\{\tilde y_t\}$ based on the OLS method in PZ. The Lagrange Multiplier test statistic is given by:
We show that the stochastic orders of the terms associated with the deterministic time trend are smaller than those of the terms associated with the error process. We can write (9) as: and $\Delta\tilde Y = \Delta\tilde M + \Delta\tilde U$. Since $LM_{T,\hat\lambda}$ is a functional of $\Delta\tilde Y$ and subvectors of $\Delta\tilde Y$, it suffices to consider the inner product $\Delta\tilde Y'\Delta\tilde Y$, that is, $\Delta\tilde Y'\Delta\tilde Y = \Delta\tilde M'\Delta\tilde M + 2\Delta\tilde M'\Delta\tilde U + \Delta\tilde U'\Delta\tilde U$.
Note that we only need to check the stochastic order of $\Delta\tilde Y'\Delta\tilde Y$ because the lag order $k$ is controlled to be small relative to the sample size $T$. We want to show that the term $\Delta\tilde U'\Delta\tilde U$ dominates the others. It is straightforward to show that $\Delta\tilde U'\Delta\tilde U = O_p(T)$ uniformly over all admissible break dates $T_b \in \{\pi T, (1-\pi)T\}$ for some $\pi \in (0, 1/2)$ in all models. When $\Delta\tilde M'\Delta\tilde M$ has a smaller order of magnitude than $\Delta\tilde U'\Delta\tilde U$, so does $\Delta\tilde M'\Delta\tilde U$ by the Cauchy-Schwarz inequality. Further, the order of magnitude of $\Delta\tilde M'\Delta\tilde M$ cannot be greater than that of $\tilde M'\Delta\tilde M$. The order of magnitude of $\tilde M'\Delta\tilde M$ is $O_p(T^{2-2\kappa})$ for Models A2 and A3, which implies that the break fraction should be estimated consistently at some rate greater than $T^{1/2}$. On the other hand, for Model A1b, the stochastic order of $\tilde M'\Delta\tilde M$ is $O_p(T^{1+2\alpha})$. Hence, in this case the orders of magnitude of the terms associated with the deterministic trend are greater than those of the terms associated with the error process, so that the LM statistic diverges as $T \to \infty$.
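The rate requirement stated above can be spelled out in a few lines. The following is only a worked restatement of the order comparison, using the bounds quoted in the proof; the condition $\alpha > 0$ for Model A1b is an assumption made here for the divergence claim, since the definition of $\alpha$ is not restated in this part of the text.

```latex
% Models A2 and A3: deterministic term relative to the error term
\frac{\Delta\tilde M'\Delta\tilde M}{\Delta\tilde U'\Delta\tilde U}
  = \frac{O_p(T^{2-2\kappa})}{O_p(T)}
  = O_p\!\left(T^{1-2\kappa}\right)
  = o_p(1)
  \quad\Longleftrightarrow\quad \kappa > \tfrac{1}{2}.

% Cross term, by the Cauchy-Schwarz inequality
\left|\Delta\tilde M'\Delta\tilde U\right|
  \le \left(\Delta\tilde M'\Delta\tilde M\right)^{1/2}
      \left(\Delta\tilde U'\Delta\tilde U\right)^{1/2}
  = O_p\!\left(T^{(3-2\kappa)/2}\right),
\qquad
\frac{T^{(3-2\kappa)/2}}{T} = T^{(1-2\kappa)/2} \to 0
  \quad\Longleftrightarrow\quad \kappa > \tfrac{1}{2}.

% Model A1b (assuming \alpha > 0): the deterministic term dominates
\frac{O_p(T^{1+2\alpha})}{O_p(T)} = O_p\!\left(T^{2\alpha}\right) \to \infty .
```

Both the squared deterministic term and the cross term are therefore negligible relative to $\Delta\tilde U'\Delta\tilde U$ exactly when $\kappa > 1/2$, which is the sense in which the break fraction must be estimated at a rate faster than $T^{1/2}$.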