1. Introduction
Most panel tests for the null hypothesis of (no) cointegration rely on single-equation methods, notable exceptions being Larsson et al. [1], Groen and Kleibergen [2], Breitung [3] and Karaman Örsal and Droge [4], who proposed panel system approaches. In particular, the more recent paper by Miller [5], building on nonlinear instrumental variable likelihood-based rank tests, allows for cross-correlation between the units. Similarly, recent single-equation tests by Chang and Nguyen [6] or Demetrescu et al. [7] also rely on nonlinear instrumental variable estimation, while the vast majority of such panel tests builds on ordinary, fully modified, or dynamic least squares (LS). Here, we study exactly this class of LS-based single-equation panel tests for the null of either cointegration or no cointegration.
We focus on the situation where the test statistics are computed from regressions with an intercept only, and with at least one of the integrated regressors displaying a linear time trend on top of the stochastic trend. Such a constellation is often met in practical applications, see for instance Coe and Helpman [8] and Westerlund [9] on R&D spillovers (total factor productivity and capital stock), Larsson et al. [1] on log real consumption and income (per capita), or Hanck [10] on prices and exchange rate series testing the weak purchasing power parity (PPP). The relevance of a linear trend in panel data has been addressed in Hansen and King [11] when commenting on the link between health care expenditure and GDP, see McCoskey and Selden [12]; consequently, Blomqvist and Carter [13], Gerdtham and Löthgren [14] or Westerlund [15] worked (partly) with detrended series, i.e., they included time as an explanatory variable in their panel regressions. Hansen ([16], p. 103), however, argues that “it seems reasonable that excess detrending will reduce the test’s power”. Therefore, we study the empirically relevant case where test statistics are computed from regressions with intercept only (i.e., without detrending) when at least one of the I(1) regressors displays a linear time trend.
Before becoming more technical, we want to outline our findings as a rule for empirical applications. Let
denote a generic panel cointegration statistic computed from a regression with intercept only involving
I(1) variables. The least squares regression may be static in levels,
where
is assumed to be I(1) in the case of no cointegration, or I(0) under the null hypothesis of cointegration, see Remarks 1 and 3 below, respectively. Alternatively,
may be from the error-correction regression
1,
where contemporaneous differences
or additional lags of differences may be required as additional regressors to render
free of serial correlation, see Remark 2 below. The test statistic may be constructed from pooling the data or from averaging individual statistics, see e.g., Pedroni [
18,
19] or Westerlund [
15]. Much of the nonstationary panel literature relies on sequential limit theory where
is followed by
, such that limiting normality can be established under the assumption that none of the I(1) regressors follows a deterministic time trend:
The constants
and
required for appropriate normalization are typically tabulated for a selected number of values of
m, see again Remarks 1 through 3. A different set of such moments
and
is also typically given for detrended regressions, where the test statistic
stems from regressions of the type (
)
or
We call such regressions “detrended” because, in a single-equation framework, the resulting parameter estimators are equivalent to what one obtains from a two-step procedure: first, regress all variables on a linear time trend, and, second, regress the individually detrended residuals on each other. This equivalence is sometimes called the Frisch-Waugh-Lovell Theorem, see e.g., Greene ([20], Theorem 3.2). For generic
from, e.g., the tests mentioned in Remarks 1 through 3, it holds, irrespective of a possible linear trend in the data, that
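This detrending equivalence is easy to verify numerically. The following sketch (our own illustration with numpy; the variable names and parameter values are hypothetical, not taken from the paper) compares the one-step and the two-step estimator on simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
trend = np.arange(1.0, T + 1)

# an I(1) regressor with drift and a dependent variable built from it
x = 0.5 * trend + np.cumsum(rng.standard_normal(T))
y = 1.0 + 2.0 * x + rng.standard_normal(T)

# one-step: regress y on intercept, trend and x jointly
X = np.column_stack([np.ones(T), trend, x])
b_onestep = np.linalg.lstsq(X, y, rcond=None)[0][2]

# two-step: detrend y and x individually, then regress the residuals
D = np.column_stack([np.ones(T), trend])
detrend = lambda v: v - D @ np.linalg.lstsq(D, v, rcond=None)[0]
b_twostep = float(np.linalg.lstsq(detrend(x)[:, None], detrend(y), rcond=None)[0][0])

assert abs(b_onestep - b_twostep) < 1e-8  # identical by Frisch-Waugh-Lovell
```

Both estimators agree up to floating-point error, which is exactly the Frisch-Waugh-Lovell statement.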
Our main contribution is twofold for the case that at least one of the I(1) regressors has a linear time trend and the regressions are run with intercept only (without detrending). First, it is shown that the normalization with
and
and the resulting critical values for
from the regression “with intercept only” are not correct in the presence of linear time trends in the data. It is analytically (Proposition 1) and numerically demonstrated that their usage results in size distortions growing with the panel size N. Second, we characterize the appropriate limiting distributions by showing that normalization of
with
and
results in a standard normal limit, such that the size of the tests can be controlled (Theorem 1). Put differently, Theorem 1 means in non-technical terms: the limiting distribution arising from a regression on k I(1) variables with drift and an intercept amounts to the limiting distribution in the case of a regression on I(1) variables and an intercept plus a linear time trend. Such a rule is known in a pure time series context for the special case of the residual-based Phillips-Ouliaris test for no cointegration from Hansen ([16], p. 103): “[...] deterministic trends in the data affect the limiting distribution of the test statistics whether or not we detrend the data”; see also the expositions in Hamilton ([21], pp. 596–597) and Hassler ([22], Proposition 16.6). It is even more relevant in our panel framework since we illustrate numerically and analytically that the size distortions of an inappropriate normalization grow with the panel size N (either to zero or one, depending on the specific test). Moreover, we compare our proposal to account for linear trends in the data with the more traditional method of detrending the regression. By simulation, we show that the power gains of our new strategy according to Theorem 1 over detrending may be considerable. We hence recommend this strategy as superior to detrending.
The rest of the paper is organized as follows. The next section sets some notation and assumptions.
Section 3 establishes and discusses our asymptotic results and illustrates them with numerical evidence. It also compares our suggestion to account for linear trends with the conventional method of detrending. The last section discusses consequences for applied work. Mathematical proofs are relegated to the
Appendix A.
2. Notation and Assumptions
Restricting our attention to the single-equation framework, we partition the m-vector of observables into a scalar and a k-element vector , , . As usual, the index i stands for the cross-section, , while t denotes time, . Each sequence , , is assumed to be integrated of order 1, I(1), where we allow for a non-zero drift, and assume for simplicity a negligible starting value, . While may be cointegrated or not, depending on the respective null hypothesis, we rule out cointegration among . Technically, these assumptions translate as follows, where denotes an m-dimensional standard Wiener process, stands for the integer part of a number x, and ⇒ is the symbol for weak convergence.
Assumption 1. With obvious partitioning according to , we assume ()
The stochastic zero-mean process is integrated of order 0 in that it satisfies
with
where and is positive definite.
Now, we turn to assumptions with respect to the tests. Let
and
stand again for generic test statistics computed from individual single-equation least squares regressions with “intercept only” and “intercept plus linear trend”, respectively. The superscript
stands for the dimension of the I(1) vector entering the equations. One route to panel testing relies on so-called group statistics averaging individual statistics. We denote them as follows:
Similarly, panel statistics rely on pooling the data across the dimension within, i.e., summing over terms showing up in the numerator and denominator separately,
A typical example for the mapping g is
in the case of t-type statistics. Here, it is assumed that the generic
and
or
and
are computed from individually demeaned or detrended regressions, respectively. We allow for group and panel statistics by introducing the generic notation
and
, and maintain for the panel the joint null hypothesis
A distinction between the individual null hypotheses of cointegration or absence of cointegration is not required, and both cases are treated in the generic assumption as follows.
Assumption 2. Consider linear single-equation least squares regressions (, )
or
where contemporaneous differences or lags of , , may be required as additional regressors in (3) to ensure residuals free of serial correlation. Let and stand for group statistics and or for panel statistics and computed from regressions with “intercept only” and “intercept plus linear trend”, respectively. We assume under the null hypothesis (1) that
as followed by .
Tests, e.g., by Kao [23], Pedroni [18,19], Westerlund [9,24] or Westerlund [15] meet Assumption 2 under different sets of restrictions, and they will be considered in the next section, see Remarks 1 through 3. In particular, these authors tabulate values of
and
,
. Our assumption of a single equation approach is motivated by the fact that much of applied work relies on this. However, such an assumption comes at a price: In (
2), we have to assume that the regressors
alone are not cointegrated (
is positive definite according to Assumption 1), and, in (
3), we have to assume under the alternative of cointegration that
adjust to deviations from the long-run equilibrium, and not
.
Much of the earlier panel cointegration literature assumed independent units, invoking a central limit theorem to establish Assumption 2; see, e.g., Pedroni [18,19] and Kao [23]. Cross-sectional independence, however, is not maintained in our Assumption 2. Westerlund [15,24], e.g., allows for cross-correlation driven by a common factor. To account for this, he suggests replacing
and
by the cross-sectionally demeaned time series,
This way, he establishes that the limiting results maintained under Assumption 2 are met under cross-sectional correlation (subject to some restrictions).
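In array terms, this cross-sectional demeaning is a one-liner. The sketch below (illustrative only, with a single common random-walk factor of our own choosing) shows that the common component is removed exactly, since it is identical across units:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 10, 100

# panel of random walks sharing one common stochastic factor,
# which induces cross-sectional correlation
factor = np.cumsum(rng.standard_normal(T))
y = factor + np.cumsum(rng.standard_normal((N, T)), axis=1)  # shape (N, T)

# cross-sectional demeaning: subtract the cross-section average at each t;
# the common factor drops out because it is common to all units
y_demeaned = y - y.mean(axis=0)

assert np.allclose(y_demeaned.mean(axis=0), 0.0)
```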
3. Results
3.1. Asymptotic Theory
The first paper allowing for linear time trends in a panel cointegration context was by Kao [23]. He considers a residual-based unit root test for the null hypothesis of no cointegration in the tradition of Phillips and Ouliaris [25]. His test builds on pooling the data while allowing for an individual-specific intercept. Kao [23] does not consider regressions containing a linear time trend as additional regressors, but allows for a linear drift in the data when performing a regression with a fixed-effect intercept. In the case of the regressor (i.e., ), Kao ([23], Equation (15)) observed that the linear time trend dominates the I(1) component; hence, the limiting distribution amounts to that of the panel unit root test by Levin et al. [26] upon detrending. To be precise: let and denote the normalizing constants provided by Levin et al. [26] for detrended panel unit root tests; then, one should use them for the pooled residual-based panel cointegration statistic in a bivariate regression if the regressor is I(1) with drift, see Kao ([23], Theorem 4):
In Theorem 1, we extend Kao’s result for any panel or group statistics from static or dynamic regressions with computed from regressions with intercept only in the presence of linear time trends.
Theorem 1. Let the data satisfy Assumption 1, and let the generic test statistic meet Assumption 2 for . Furthermore, assume that for all . Under the null hypothesis (1), it then holds true that
as , where are from Assumption 2.
Note that Assumption 2 does not impose any restriction on . As is shown in the proof, Theorem 1 holds irrespective of whether displays a linear trend or not ( or ).
Two research strategies can be employed in the presence of linear time trends when dealing with statistics resulting from regressions with intercept only. The first one simply ignores the linear time trends in the data and standardizes with and . The second strategy accounts for the drift in the data according to Theorem 1; in other words, it applies upon standardizing with and . We summarize as follows:
Strategy SI:
When is computed from panel regressions without detrending, then compare with quantiles from the standard normal distribution, i.e., ignore the presence of linear trends in the data.
Strategy SA:
When is computed from panel regressions without detrending, then compare with quantiles from the standard normal distribution, i.e., account for the presence of linear trends in the data.
For the rest of the paper, we assume that an applied econometrician is able to distinguish between the two cases, whether a linear time trend underlies the variables (e.g., log income or log prices) or not (e.g., interest or inflation rates). Hence, we maintain the assumption behind Theorem 1: the researcher knows that at least one regressor is I(1) with drift (). We assume that strategy is only employed when linear trends are truly present and thus refrain from the discussion of misspecification: what happens if there are no linear time trends in the data, but one erroneously accounts for trends.
The situation analyzed in Theorem 1 has not been considered in the previous panel cointegration literature, with the notable exception of Kao [23]. Consequently, all applied papers that we are aware of standardize , with and ignoring the effect of deterministic trends in the series, which amounts to strategy . The effect of strategy under linear time trends is discussed for growing N in the following proposition. The resulting size distortions depend on whether the test is right-tailed or left-tailed (the null hypothesis is rejected for too large or too small values, respectively).
Proposition 1. Let the assumptions from Theorem 1 hold true. Furthermore, assume
Under the null hypothesis, one has the following results for strategy :
- (a) For a left-tailed test, the probability to reject according to strategy increases with growing N to 1;
- (b) for a right-tailed test, the probability to reject according to strategy decreases with growing N to 0.
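The intuition behind this divergence can be mimicked with a stylized simulation in which the individual statistics are replaced by plain normal draws; the moment values below are invented for illustration and do not belong to any actual cointegration test. When the true mean lies below the mean used for normalization, the left-tailed rejection rate grows with N:

```python
import numpy as np

# Stylized illustration of Proposition 1(a): individual statistics are drawn
# with a (hypothetical) true mean mu_star below the tabulated mean mu used
# for normalization, so the normalized statistic drifts to minus infinity.
rng = np.random.default_rng(2)
mu, sigma = 0.0, 1.0             # moments used for (incorrect) normalization
mu_star, sigma_star = -0.3, 1.0  # true moments under linear trends (invented)
crit = -1.645                    # 5% left-tailed critical value of N(0, 1)

reps = 5000
for N in (10, 50, 200):
    # cross-sectional average of N individual statistics, reps times
    S_bar = mu_star + sigma_star * rng.standard_normal((reps, N)).mean(axis=1)
    Z = np.sqrt(N) * (S_bar - mu) / sigma
    print(N, (Z < crit).mean())  # rejection rate increases toward 1 with N
```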
We now discuss a couple of panel tests satisfying Assumption 2 and (5), such that Theorem 1 and Proposition 1 apply.
Remark 1. The residual-based unit root tests for the null hypothesis of no cointegration proposed by Pedroni [18,19] build on static regressions as in (2). The null hypothesis (1) is rejected for too negative values of the test statistic (of in our generic notation). The expected values and standard deviations and showing up in Assumption 2 are available from Pedroni ([18], Table 2) for and from Pedroni ([19], Corollary 1) for . In order to apply Theorem 1 (strategy ) for , one requires and . These values stem from the detrended Dickey-Fuller distribution in the case of group statistics and have been tabulated by Nabeya ([27], Table 4): and . Throughout, we observe . Hence, Proposition 1(a) applies. If strategy is employed under linear trends, then the probability to reject the true null hypothesis converges to one with growing panel size N. Alternatively, Westerlund [24] suggested group and panel variance ratio type tests along the lines of Breitung [28]. The null hypothesis of no cointegration is rejected again for too small values of the variance ratio statistic, and and showing up in Assumption 2 are given in Westerlund ([24], Table 1) for . To apply Theorem 1 with , we need and . For the detrended Breitung distribution, we obtain by simulation and , which are the values corresponding to the case of group statistics. Again, we observe , so that (5) holds. Consequently, Proposition 1(a) applies, and the probability to reject the true null hypothesis under strategy grows with N as long as there is a linear trend in the data. To sum up: in the case of residual-based tests for no cointegration, strategy results in massive size distortions; numerical evidence for finite N is given in Table 1 below.

Remark 2. The error-correction test by Westerlund [15] relies on regressions of type (3). It is again a left-tailed test: the null hypothesis of no cointegration is rejected for too negative t-values associated with γ.
Values of and are tabulated in Westerlund ([15], Table 1) for . In the case of (i.e., no on the right-hand side), the limiting distributions are of the usual Dickey-Fuller type. Hence, and for group statistics are again from detrended Dickey-Fuller-type distributions and given in Nabeya ([27], Table 4) (see above). Comparing with , we find meeting (5) again. Consequently, strategy is increasingly liberal in the presence of linear time trends, and the probability to reject the true null hypothesis approaches 1 in the limit as long as the series display a linear time trend. For numerical evidence, see Table 2 below.

Remark 3. We now flip the null and the alternative hypotheses. Westerlund [9] suggested testing the null hypothesis of cointegration. He proposed a CUSUM group test statistic for this null hypothesis to be applied with tabulated values and , . To apply Theorem 1 for , we provide as moments of the univariate, detrended distribution by simulation: and 2. This test is right-tailed and in accordance with Westerlund ([9], Table 1) . Thus, this time Proposition 1(b) comes into play. Under strategy in the presence of linear trends, the test is increasingly undersized with growing N. Such a conservative behaviour implies low power under the alternative hypothesis.

3.2. Numerical Evidence
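Several of the moments quoted in Remarks 1 through 3 are obtained by simulating detrended distributions. A minimal sketch of such a Monte Carlo exercise for the detrended Dickey-Fuller t-statistic is given below; the sample length and replication count are illustrative choices of our own, not the settings behind the tabulated values:

```python
import numpy as np

rng = np.random.default_rng(3)
T, reps = 500, 2000                  # illustrative, not the paper's settings
trend = np.arange(1.0, T + 1)
D = np.column_stack([np.ones(T), trend])
stats = np.empty(reps)

for r in range(reps):
    y = np.cumsum(rng.standard_normal(T))               # driftless random walk
    y_d = y - D @ np.linalg.lstsq(D, y, rcond=None)[0]  # detrend the series
    dy, lag = np.diff(y_d), y_d[:-1]
    rho = (lag @ dy) / (lag @ lag)                      # Dickey-Fuller regression
    resid = dy - rho * lag
    se = np.sqrt(resid @ resid / (len(dy) - 1) / (lag @ lag))
    stats[r] = rho / se

# sample mean and standard deviation of the detrended DF t-statistic
print(stats.mean(), stats.std())
```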
The statements obtained from Proposition 1 may be quantified more precisely by means of Equations (A2) and (A3) given in Appendix A. These rejection probabilities apply approximately (for large N) under the null hypothesis at nominal significance level α. We report results for the group t-tests by Pedroni [18,19] and by Westerlund [15] in Table 1 and Table 2.
Generally, the size distortions in Table 1 and Table 2 grow with N, while decreasing with at the same time. The fact that is too liberal is characteristic of these tests where we reject for too negative values (of in our generic notation). Overrejection is not the general case, however, as we see when reversing the null and alternative hypotheses. To quantify distortions for the CUSUM test discussed in Remark 3, we use Equation (A3) from Appendix A. When evaluating under , we observe rejection probabilities equal to zero up to three digits for ; this strongly supports the limiting result from Proposition 1(b).
3.3. Regressions with a Linear Time Trend
For regressions with intercept only, strategy
has been used in the literature and applied with the tests mentioned in the remarks above. We have illustrated its failure to control size under the null hypothesis in the presence of a linear time trend. In practice, one may use two strategies to account for linear time trends. The first one is the new
according to Theorem 1 from regressions without detrending. The second one consists of detrending the series, or equivalently running detrended regressions, i.e., including
and
in (2) and (3), respectively. The empirical strategy then becomes the following:
Strategy SD:
Compute from detrended panel regressions and compare the normalization with quantiles from the standard normal distribution.
By Assumption 2, this strategy will provide asymptotically correct size. However, tests from detrended regressions will be prone to power losses relative to strategy , which is more parsimonious. For this reason, we next investigate the price of strategy relative to in terms of power.
In Monte Carlo experiments, we study in particular the error-correction test (group t-statistic) by Westerlund [15]. Before turning to a power analysis, we make sure that size is under control. For the data-generating process (DGP), we hence consider the null hypothesis of no cointegration under linear time trends:
where
are normal iid sequences,
, independent of each other. Finally,
is an independent random walk entering (6). The DGP under the alternative of cointegration becomes
where
and
are generated as before. Using the regression
we computed the group t-statistic proposed by Westerlund [15]. Strategy
is employed with
All reported rejection frequencies rely on 10,000 replications.
The leading case consists of the following parameterization, where only the first component of the regressors
is driven by a linear time trend:
This mimics with
or
a typical macro panel with monthly data and e.g., income, interest rates and inflation rates as regressors.
Table 3 reports the frequencies of rejection for different values of
from (6), and rejection is based on strategy
according to Theorem 1. It illustrates how well the rule of Theorem 1 works: the experimental sizes are close to the nominal ones. This is particularly true for
, while the test is mildly conservative for
, and a bit more conservative for
, in particular for
N large relative to
. Next, we consider strategy
with the same data. The rejection frequencies are given in
Table 4. We observe that the experimental size from detrended regressions is close to the nominal one under the null hypothesis of no cointegration, irrespective of
.
Since strategies
and
both hold the nominal size, the question of which one is more powerful naturally arises. The results contained in Table 5 are very clear: first, the power increases with
; second, strategy
always outperforms
considerably, and has, e.g., rejection frequencies more than twice as large for
or
. In particular, detrending becomes all the more costly; relative to strategy
, the larger
N is, which is intuitively clear: including a linear time trend in a regression requires the estimation of an additional parameter; in a panel of
N units, detrending thus involves the estimation of
N additional parameters compared to strategy
. At the same time, these estimated trends can be spuriously correlated with the stochastic trends in the data, and, therefore, incorrectly lead to support for cointegration, in particular when the time dimension is relatively short.
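The spurious-trend point can be illustrated in isolation (a sketch of our own, separate from the reported experiments): fitting a linear trend to a driftless random walk and judging it by the usual iid-error t-ratio rejects the no-trend hypothesis far more often than the nominal 5% level suggests.

```python
import numpy as np

rng = np.random.default_rng(5)
T, reps = 100, 2000
trend = np.arange(1.0, T + 1)
D = np.column_stack([np.ones(T), trend])
DtD_inv = np.linalg.inv(D.T @ D)

rejections = 0
for _ in range(reps):
    y = np.cumsum(rng.standard_normal(T))  # driftless random walk: no trend
    beta = np.linalg.lstsq(D, y, rcond=None)[0]
    resid = y - D @ beta
    s2 = resid @ resid / (T - 2)
    t_stat = beta[1] / np.sqrt(s2 * DtD_inv[1, 1])
    rejections += abs(t_stat) > 1.96       # naive 5% two-sided test

print(rejections / reps)  # far above 0.05: the fitted trend is spuriously significant
```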
We have varied the leading case with the parameterization from (11). First, we allowed for more and stronger trends in the regressors,
with all other parameters fixed. This corrects the mild undersizedness of strategy
reported in
Table 3, yielding empirical sizes very close to the nominal one. At the same time, power relative to
Table 5 is increased, with strategy
still dominating
. Second, we have increased the magnitude of the random walks, namely
, while the other parameters are from (11) and
(see Table 6). Here, the linear trends are less pronounced, such that
results in slightly more conservative tests (compared to the first panel in Table 3), and similarly, power is reduced (compared to the first panel in Table 5). Still,
clearly dominates
in Table 6. Third, we simulated shorter panels,
. This makes both strategies,
and
, conservative under
, which is accompanied by a loss of power.
4. Conclusions
In time series econometrics, it has been known for a long time that “the deterministic trends in the data affect the limiting distributions of the test statistics whether or not we detrend the data” (Hansen [16], p. 103). This has been shown for the residual-based Phillips-Ouliaris (or Engle-Granger) cointegration test by Hansen [16], see also the exposition in Hamilton ([21], pp. 596–597). Analogous results have been given for other cointegration tests by Hassler [30,31], see also the summary by Hassler ([
22], Proposition 16.6). In this paper, these findings are carried over to the panel framework, and they are shown to continue to hold for single-equation tests relying on least squares, no matter whether the null hypothesis is absence or presence of cointegration. In a regression involving
variables, much of the panel cointegration theory relies on normalization with suitable constants
and
and letting the panel dimension N go to infinity to obtain a standard normal distribution. The numbers
and
are tabulated for the case of regressions with intercept only. Different figures
and
are tabulated for regressions with intercept and linear time trend. We show the following: when statistics are computed from regressions with m integrated variables with intercept only, but one of the integrated regressors is dominated by a linear time trend, then normalization with
and
is required to achieve asymptotically valid inference under the null hypothesis (Theorem 1). Normalization with
and
, however, which has been the conventional strategy so far, results in a loss of size control under the null hypothesis. In fact, employing
and
in the presence of linear time trends gives rejection probabilities converging with N to 1 or 0, depending on whether the null hypothesis is no cointegration or cointegration, respectively (see Proposition 1). To avoid such size distortions, one may employ the strategy following Theorem 1, or one may work with detrended regressions. Detrending, however, comes at a price: a regression with intercept only will provide more powerful tests (see, e.g., Hamilton [21], p. 598); according to our simulations, power gains of our new strategy over detrending may be considerable and growing with N, and this also holds true if there is a linear trend superimposing the level relation. Our Monte Carlo evidence, however, is limited to the case of testing the null hypothesis of no cointegration.
Hence, we propose the following empirical strategy if at least one of the integrated regressors is driven by a linear time trend when testing for no cointegration. First, test the null hypothesis of no cointegration with our new strategy from Theorem 1, since it is more powerful than tests relying on detrending. If the null hypothesis of no cointegration is rejected according to Theorem 1, then one may test, in a second step, whether a linear time trend is present, superimposing the level relation between and . If strategy does not reject the null hypothesis of no cointegration, then one may, of course, try a test building on detrending, although it will tend to be less powerful, since it requires the estimation of N additional parameters.