2. The Model
We start the discussion from the VAR(1) model
or, in compact notation,
where
. The representation for the univariate components, the so-called “final equations” in [
5], can be obtained following [
6] . Considering the lag polynomial matrix
, its determinant
and the adjoint matrix
the “final equations” for the VAR(1) model are given by
It follows that the univariate processes evolve as an ARMA(2,1) model with a common AR component and two distinct MA components.
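To make the algebra behind the "final equations" concrete, the following sketch computes the coefficients of the determinant and of the adjoint of the lag polynomial matrix I − AL for a generic bivariate VAR(1). The coefficient names a11, a12, a21, a22 and the numerical values are assumptions introduced here for illustration; they are not taken from the original notation or from Table 1.

```python
def final_equation_polynomials(a11, a12, a21, a22):
    """Lag-polynomial coefficients for a bivariate VAR(1) with matrix A.

    Polynomials are stored as lists of coefficients in increasing powers of L.
    Names a11..a22 are hypothetical labels for the entries of A.
    """
    # det(I - A L) = 1 - tr(A) L + det(A) L^2: the common AR(2) polynomial
    det = [1.0, -(a11 + a22), a11 * a22 - a12 * a21]
    # adj(I - A L) = [[1 - a22 L, a12 L], [a21 L, 1 - a11 L]]
    adj = [
        [[1.0, -a22], [0.0, a12]],
        [[0.0, a21], [1.0, -a11]],
    ]
    return det, adj


# Hypothetical coefficient values, chosen only for illustration.
det, adj = final_equation_polynomials(0.3, 0.7, 0.7, 0.3)
print("common AR polynomial:", det)
print("MA part of equation 1 (own shock, other shock):", adj[0])
print("MA part of equation 2 (other shock, own shock):", adj[1])
```

Each row of the adjoint combines an MA(1) polynomial in the own shock with a lagged term in the other shock, which is why the two univariate models share the AR(2) polynomial but have distinct MA(1) components.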
The magnitude and sign of the roots of the characteristic equation
determine both the stationarity or nonstationarity of the univariate time series
and
and the existence of a cointegrating relationship between them. A necessary condition for cointegration is that the roots of the characteristic equation satisfy
and
. From this unit root constraint, we obtain the restriction
which can be used to obtain the VECM representation
where
,
and
.
The second restriction
,
i.e.,
guarantees the stationarity of the error correction mechanism. In fact, it is easy to show that
Imposing the constraints
and
, the “final equations”,
i.e., the univariate models, become
Thus, if the DGP is a bivariate VAR(1) with one unit root and one cointegrating relationship, the marginal processes for the level processes follow an ARIMA(1,1,1) model and the marginal processes for the first-difference stationary processes are ARMA(1,1) processes.
It follows that the autocorrelation structure of the implied marginal processes, induced by the interaction of the AR and MA roots, is bound to affect the finite-sample size and power properties of unit root tests in any simulation study where the DGP is multivariate.
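The two restrictions can be checked numerically. The sketch below assumes a generic coefficient matrix A with hypothetical entries a11, a12, a21, a22: a unit root requires det(I − A) = 0, in which case the common AR polynomial factors as (1 − L)(1 − ρL) with ρ = a11 + a22 − 1, and the error correction mechanism is stationary when |ρ| < 1. The function name and the example values are illustrative assumptions, not the parameterizations of Table 1.

```python
def cointegration_check(a11, a12, a21, a22, tol=1e-10):
    """Check the unit root and stationarity restrictions for a bivariate VAR(1).

    Returns (unit_root, rho, cointegrated), where rho is the second root
    implied by the unit root restriction: det(A) = tr(A) - 1 = rho.
    """
    # Unit root restriction: det(I - A) = (1 - a11)(1 - a22) - a12 * a21 = 0
    unit_root = abs((1.0 - a11) * (1.0 - a22) - a12 * a21) < tol
    # With one root equal to 1, the remaining root is tr(A) - 1
    rho = a11 + a22 - 1.0
    cointegrated = unit_root and abs(rho) < 1.0
    return unit_root, rho, cointegrated


# Hypothetical matrix whose characteristic roots are 1 and -0.4.
print(cointegration_check(0.3, 0.7, 0.7, 0.3))
# A stationary VAR(1) with no unit root fails the first restriction.
print(cointegration_check(0.5, 0.1, 0.1, 0.5))
```

When both restrictions hold, the level series share the factor (1 − L) in their AR polynomial, so their first differences follow ARMA(1,1) models with AR coefficient ρ.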
Considering the right-hand side of (5), we see how the aggregate error term for each component of
is the sum of an MA(1) process and a lagged white noise process
It is easy to show that both aggregate error terms on the right-hand side have the autocorrelation function of an MA(1) process so that we can write
where
and
are white noise processes. By setting the first-order autocorrelation coefficient of each marginal process on the left-hand side of (
6) equal to the first-order autocorrelation coefficient of
,
i.e.,
we can find the moving average coefficients of the polynomials
by choosing the invertible solution of the previous second degree equation.
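The matching step can be sketched as follows. The code assumes an aggregate error of the form u_t = e_t − c_own·e_{t−1} + c_cross·v_{t−1}, where e and v are the two VAR innovations, and an MA(1) written as w_t + θ·w_{t−1}, so that its first-order autocorrelation is θ/(1 + θ²). The argument names and the sign convention are assumptions made here for illustration.

```python
import math


def aggregate_error_autocorr(c_own, c_cross, var_own, var_cross, cov):
    """First-order autocorrelation of u_t = e_t - c_own*e_{t-1} + c_cross*v_{t-1}."""
    gamma0 = ((1.0 + c_own ** 2) * var_own
              + c_cross ** 2 * var_cross
              - 2.0 * c_own * c_cross * cov)
    gamma1 = -c_own * var_own + c_cross * cov
    return gamma1 / gamma0


def ma1_from_autocorr(r1):
    """Invertible root of r1*theta**2 - theta + r1 = 0, i.e. theta/(1+theta**2) = r1."""
    if r1 == 0.0:
        return 0.0
    if abs(r1) > 0.5:
        raise ValueError("an MA(1) process requires |r1| <= 0.5")
    # The other root, (1 + sqrt(.)) / (2*r1), is its reciprocal and non-invertible.
    return (1.0 - math.sqrt(1.0 - 4.0 * r1 * r1)) / (2.0 * r1)


# Hypothetical values: unit innovation variances, uncorrelated innovations.
r1 = aggregate_error_autocorr(0.3, 0.7, 1.0, 1.0, 0.0)
theta = ma1_from_autocorr(r1)
print(r1, theta)
```

The two solutions of the second-degree equation are reciprocals of each other, so exactly one of them lies inside the unit interval whenever |r1| < 0.5.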
For example, let us consider DGP 4 in
Table 1,
i.e.,
The roots of the characteristic equation of the reduced form VAR are given by 1 and −0.4 and the marginal processes are given by
which, after some algebra, can be written as
Thus, the MA component of the process
has a large negative root, which explains the enormous size distortion, as reported in many simulation studies since Schwert [
1].
3. A Simulation Experiment
In a small simulation study, we assess the size distortion of a number of unit root tests when the data come from a cointegrated DGP. We consider the classical ADF test by Said and Dickey [
10], the
and
tests proposed by Phillips and Perron [
11], the modified
and
by Stock [
12] and Perron and Ng [
2], the modified Sargan-Bhargava
test proposed by Stock [
12], the point optimal test
of Elliott
et al. [
13] and its modification
proposed by Ng and Perron [
14], and, finally the DF-GLS test by Elliott
et al. [
13]. We always estimate the spectral density at frequency zero of the error term using the autoregressive spectral density estimator as in Perron and Ng [
2] and for the
tests we consider both OLS detrending and GLS detrending, as in Ng and Perron [
14].
For the selection of the lag length, we do not follow a rule based just on the sample size but consider the Modified Akaike Information Criterion (MAIC) developed by Ng and Perron [
14], where an upper bound to the lag length is set to the integer part of
. Given the better performance of this information criterion compared with the BIC, we do not consider the latter in our simulations. Further, we follow the suggestion by Perron and Qu [
15] and present results for
obtained using OLS detrending instead of GLS detrending in the MAIC. They show that this simple modification produces tests with effective size closer to the nominal size.
We also consider two bootstrap unit-root tests. Palm
et al. [
16] carried out extensive simulation experiments on the size and power of bootstrap unit root tests, considering ADF sieve and block bootstrap tests based on first-differenced series or on residuals. Their findings suggest that, in terms of both size and power, the ADF sieve test as in Chang and Park [
17] or its residual-based version perform best. In the following, we shall consider these two versions of the ADF sieve unit root test and implement the tests following the procedure set forth in Palm
et al. [
16].
Finally, to allow a comparison with a widely used test for the presence of unit roots and cointegration in multivariate time series, we also consider Johansen’s trace statistic, for which, under the null, we should reject no cointegration and fail to reject the presence of one cointegrating vector.
In the simulation experiment, we consider two different ways of formulating the DGP. Firstly, as in Reed [
3], we consider the VAR model in (
1) subject to the constraints (
3)–(
4) which guarantee cointegration between
and
. Secondly, we also consider a DGP widely used in cointegration analysis (see, among others, [
18]):
From this parameterization, we can obtain the implied VAR(1) as
where the unit root constraint
is satisfied and under the condition
we have cointegration. The VECM representation is, then, given by
From the above VAR and VECM representations, we can obtain the DGPs considered in Reed [
3] in terms of the parameters of (
7) and
vice versa.
In the simulation study, we consider DGPs parameterized, as in (
7), following Gonzalo [
18], and as in (
1), following Reed [
3]. The two sets of parameters are defined as follows:
- MC(a):
the values taken by
and
in Reed [
3], reported in
Table 1 together with the implied values of
ρ,
β, the roots of the MA component in the univariate representations and the unconditional contemporaneous autocorrelation;
- MC(b):
as in Gonzalo [
18], we set
,
and
and consider the following values for the remaining parameters:
,
and
, for a total of 18 experiments. The root of the common autoregressive component (one root is always equal to 1) and the coefficients of the distinct MA components of the univariate models implied by the multivariate DGP are reported in
Table 2.
All results are based on a sample size
, on 1000 simulations of the DGP and, for the bootstrap tests, on 500 bootstrap replications as in Psaradakis [
19].
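The structure of such an experiment can be sketched in a few lines. The code below is only an illustration under stated assumptions: it simulates a bivariate VAR(1) with a hypothetical coefficient matrix, applies a plain Dickey-Fuller regression with a constant and no lagged differences (not one of the tests studied here), and uses the asymptotic 5% critical value of −2.86. The sample size T = 100, the innovation distribution and all function names are placeholders.

```python
import random

DF_CRIT_5PCT = -2.86  # asymptotic 5% critical value, constant, no trend


def simulate_var1(A, T, rng, burn=50):
    """Simulate T observations of y_t = A y_{t-1} + e_t with N(0, I) shocks."""
    y1 = y2 = 0.0
    path = []
    for t in range(T + burn):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1, y2 = (A[0][0] * y1 + A[0][1] * y2 + e1,
                  A[1][0] * y1 + A[1][1] * y2 + e2)
        if t >= burn:
            path.append((y1, y2))
    return path


def df_tstat(x):
    """t-statistic on b in the OLS regression dx_t = a + b * x_{t-1} + u_t."""
    lag = x[:-1]
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]
    n = len(dx)
    m_lag, m_dx = sum(lag) / n, sum(dx) / n
    sxx = sum((v - m_lag) ** 2 for v in lag)
    sxy = sum((v - m_lag) * (d - m_dx) for v, d in zip(lag, dx))
    b = sxy / sxx
    a = m_dx - b * m_lag
    rss = sum((d - a - b * v) ** 2 for v, d in zip(lag, dx))
    return b / (rss / (n - 2) / sxx) ** 0.5


def empirical_size(A, T=100, reps=1000, seed=0):
    """Rejection frequencies of the DF test applied to each component."""
    rng = random.Random(seed)
    rejections = [0, 0]
    for _ in range(reps):
        path = simulate_var1(A, T, rng)
        for i in range(2):
            if df_tstat([obs[i] for obs in path]) < DF_CRIT_5PCT:
                rejections[i] += 1
    return [r / reps for r in rejections]


# Hypothetical cointegrated VAR(1) with characteristic roots 1 and -0.4.
print(empirical_size([[0.3, 0.7], [0.7, 0.3]], T=100, reps=200, seed=1))
```

Because both components contain a unit root under this design, each rejection frequency is an estimate of the effective size of the test for that component.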
We notice that, for many values of the parameters, the univariate models are characterized by a large negative MA coefficient, which is exactly the circumstance in which unit root tests have low power and great size distortion even in moderately large sample sizes. For the set of parameters in MC(b), this always occurs when and when and , while for the set of parameters in MC(a), this occurs only in half of the cases.
Some additional remarks are in order. First of all,
MC(a) is able to generate, at least for the parameter values considered here, greater heterogeneity in the MA roots of the univariate first-difference processes than that generated by
MC(b). In fact, in
MC(a), we observe a large negative MA root for
associated with a small or medium-sized negative MA root for
or a positive MA root for
associated with a negative one for
. On the contrary, in
MC(b), looking, for instance, at the upper panel of
Table 2 (the case in which
), we may see how the coefficients of the MA components are almost identical for the two processes
and
and that, to a lesser extent, the same applies to the lower panel (
). When this occurs, the univariate representations of unit root processes resemble each other, and this may explain the similar behavior of unit root tests applied to
and
, separately.
Furthermore, when
, the coefficients of the MA component in
and
are not only very similar among themselves but also quite close to the root of the autoregressive component. This implies the presence of a near common factor in the univariate models for the first differenced series so that the AR and MA roots would almost cancel out. To our knowledge, this feature of the parameterization
MC(b), used by Gonzalo [
18] and many others, has not been noticed so far in simulation studies on unit root or cointegration tests. As a consequence of this near common root, the lagged unconditional correlation of
and
will tend to be small. For instance, for
,
and
, the first-order unconditional autocorrelations of
and
are equal to
and
, respectively, and the first-order unconditional cross-correlation is equal to
. Similar results are obtained when
and for different values for
σ and
η. The first-order unconditional autocorrelations increase very slowly as
ρ decreases; for instance, when
, we have 0.05, −0.11 and −0.16 for the first-order unconditional autocorrelation of
, of
and the first-order unconditional cross-correlation, respectively.
The empirical size, at a 5% nominal level, for the unit root tests, is reported in
Table 3,
Table 4,
Table 5 and
Table 6 for the estimated regression without a trend.
For each DGP, we test for a unit root both in
and in
. For each set of parameters in
MC(b), in each table, we report the effective size for a fixed value of the “signal-to-noise” ratio and different values of the remaining parameters; for the experiments in
MC(a), we report the effective size for the four different parameterizations.
The first general and striking result, common to both parameterizations MC(b) and MC(a), concerns the presence of important differences in the effective size according to whether or are tested for the presence of a unit root. Considering MC(b), the empirical size increases with η when testing for a unit root in and, on the other hand, it decreases when testing is carried out on . This finding is remarkable and unexpected since the univariate ARMA representations of and share the same AR component and have very similar MA components for most parameterizations. The differences in size can be fully appreciated when or . For instance, in the case , for most tests based on GLS detrending (but for the ), the effective size is close to the nominal one when the unit root test is applied to , but it doubles or almost triples when unit root tests are applied to . In addition, the same applies to bootstrap unit root tests. The distortion in the effective size on and is reversed when but and to a lesser extent for smaller values of the AR root such as . Notice also that only in the case in which , do these differences tend to be negligible.
For the parameterization in
MC(a), in
Table 6, we continue to observe substantial differences in the effective size according to whether
or
are being tested for a unit root. These differences are even more pronounced than those in
Table 3,
Table 4 and
Table 5, perhaps because of the greater range and heterogeneity of the MA component obtained from the parameter values under
MC(a). Furthermore, there are noticeable differences among tests and across DGPs: for instance, looking at DGP1, both the DF-GLS and bootstrap tests have reasonable effective sizes for
while the effective size more than doubles for the bootstrap tests applied to
, but it does not change substantially for the DF-GLS test. Again, for DGP4, the size of the DF-GLS more than doubles when
is tested while the size of the bootstrap tests is more than four times larger for
than for
.
Considering the parameterization in
MC(b), from
Table 3,
Table 4 and
Table 5, we notice that when
, the empirical size is, in general, close to the nominal one for most tests and it is so, in particular, for the ADF sieve bootstrap unit root tests. Specifically, the empirical size in both versions of the ADF bootstrap test seems to be more stable and closer to the nominal size than the empirical size of the
,
, and DF-GLS tests. However, for
, and to a larger extent when
, the empirical size of these tests tends to differ more and more from the nominal 5% level. In fact, the effective size increases with
σ ranging in the interval
when
to the interval
when
. Thus, as the variance of the random walk component in
and
increases, the size of the unit root tests increases, leading to greater size distortion, and the size distortion itself is quite sizeable for
, irrespective of
ρ and
η. In general, GLS detrending tends to increase the empirical size and this exerts a beneficial effect when
σ is small, but, on the contrary, it is detrimental for the size when
σ is large.
Considering the four parameterizations under
MC(a), the behavior of unit root tests is more heterogeneous, as is the pattern of AR and MA roots. Under OLS detrending, the
and
tests have large size distortion for all parameterizations. The modifications suggested by Perron and Ng [
2] are somewhat effective in reducing the distortion, but the behavior of the
M tests is not stable across DGPs, and the same remark applies to the ADF test. Under GLS detrending, the DF-GLS test by Elliott
et al. [
13] has the best performance, showing an effective size very close to the nominal one in all cases but for
in DGP2 and DGP4 where, in fact, the MA component has a coefficient close to
. The bootstrap unit root tests have greater size distortion than the DF-GLS test, except for
in DGP2 and DGP4, which are exactly the cases where the DF-GLS test does not perform well.
Finally, we consider Johansen’s trace test under the null of one cointegrating vector. For parameters in
MC(b), the test statistic is severely biased when
, irrespective of the values taken by
σ and
η, while it has a size close to the nominal one for
and its performance, in the latter case, is superior to those of the standard and bootstrap unit root tests. For the parameterizations in
MC(a), Johansen’s trace test also behaves very well since, about 5% of the time, we reject no cointegration in favor of stationarity of the VAR in (
1) in all cases. However, we should bear in mind that the AR root is rather small now and that, from the previous results based on
MC(b), Johansen’s test is adversely affected by large values of
ρ.
We do not examine the power properties of the unit root tests considered here. Ng and Perron [
14] provide simulation evidence that the DF-GLS has better power than the
M tests, even though the latter have better size properties. For the bootstrap unit root tests, we take the results of the extensive simulation study by Palm
et al. [
16], who found that the ADF sieve bootstrap test performs better under a variety of DGPs with and without an MA component. An extensive simulation study on the power properties of the tests considered here for univariate time series generated by a cointegrated VAR is left for further investigation.