# Improved Inference on Cointegrating Vectors in the Presence of a near Unit Root Using Adjusted Quantiles


Department of Statistical Sciences, Sapienza University of Rome, P.le A. Moro 5, 00198 Rome, Italy

Department of Economics, University of Copenhagen, Øster Farimagsgade 5 Building 26, 1353 Copenhagen K, Denmark

CREATES, Department of Economics and Business, Aarhus University, Building 1322, DK-8000 Aarhus C, Denmark

Author to whom correspondence should be addressed.

Academic Editor: Katarina Juselius

Received: 20 April 2017 / Revised: 25 May 2017 / Accepted: 7 June 2017 / Published: 14 June 2017

(This article belongs to the Special Issue Recent Developments in Cointegration)

It is well known that inference on the cointegrating relations in a vector autoregression (CVAR) is difficult in the presence of a near unit root. The test for a given cointegration vector can have rejection probabilities under the null that vary from the nominal size to more than 90%. This paper formulates a CVAR model allowing for multiple near unit roots and analyses the asymptotic properties of the Gaussian maximum likelihood estimator. Two critical value adjustments suggested by McCloskey (2017) for the test on the cointegrating relations are then implemented for the model with a single near unit root, and simulations show that they eliminate the serious size distortions while retaining reasonable power for moderate values of the near unit root parameter. The findings are illustrated with an analysis of a number of different bivariate DGPs.

Elliott (1998) and Cavanagh et al. (1995) investigated the test on a coefficient of a cointegrating relation in the presence of a near unit root in a bivariate cointegrating regression. They show convincingly that when inference on the coefficient is performed as if the process has a unit root, the size distortion is serious; see the top panel of Figure A1 for a reproduction of their results. This paper analyses the p-dimensional cointegrated VAR model with r cointegrating relations under local alternatives

$$\Delta {y}_{t}=(\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}){y}_{t-1}+{\epsilon}_{t},\phantom{\rule{2.em}{0ex}}t=1,\cdots ,T,$$

where $\alpha ,\beta $ are $p\times r$ and ${\epsilon}_{t}$ is i.i.d. ${N}_{p}(0,\Omega )$. It is assumed that ${\alpha}_{1}$ and ${\beta}_{1}$ are known $p\times (p-r)$ matrices of rank $p-r,$ and that c is an unknown $(p-r)\times (p-r)$ parameter, such that the model allows for a whole matrix, $c,$ of near unit roots. We consider below the likelihood ratio test, ${Q}_{\beta},$ for a given value of $\beta ,$ calculated as if $c=0,$ that is, as if we have a CVAR with rank $r.$ The properties of the test ${Q}_{\beta}$ can be very bad when the actual data generating process (DGP) is a slight perturbation of the process generated by the model specified by $\alpha {\beta}^{\prime}$. The matrix $\alpha {\beta}^{\prime}$ describes a surface of dimension ${p}^{2}-{(p-r)}^{2}$ in the space of $p\times p$ matrices. Therefore a model is formulated that, in some particular “directions” given by the matrix ${\alpha}_{1}c{\beta}_{1}^{\prime},$ has a small perturbation of the order of ${T}^{-1}$ and ${(p-r)}^{2}$ extra parameters, c, which are used to describe the near unit roots.
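To make the model concrete, the following minimal simulation sketch generates data from model (1) for $p=2$, $r=1$. The parameter directions and the error construction follow the Elliott-type DGP analysed in Section 3; the function name and all numerical values are illustrative, not part of the paper.

```python
import numpy as np

def simulate_cvar(T=100, c=5.0, gamma=0.0, rho=0.5, seed=0):
    """Simulate model (1): dy_t = (ab' + a1 c b1'/T) y_{t-1} + eps_t.

    Directions follow the Elliott-type DGP (p=2, r=1):
    alpha=(0,1)', beta=(gamma,-1)', alpha1=(-1,-gamma)', beta1=(1,0)'.
    """
    rng = np.random.default_rng(seed)
    alpha = np.array([[0.0], [1.0]])
    beta = np.array([[gamma], [-1.0]])
    a1 = np.array([[-1.0], [-gamma]])
    b1 = np.array([[1.0], [0.0]])
    Pi = alpha @ beta.T + (c / T) * a1 @ b1.T
    # errors: eps1 = u1, eps2 = u2 + gamma*u1, with corr(u1, u2) = rho
    Lu = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
    y = np.zeros((T + 1, 2))
    for t in range(1, T + 1):
        u = Lu @ rng.standard_normal(2)
        eps = np.array([u[0], u[1] + gamma * u[0]])
        y[t] = y[t - 1] + Pi @ y[t - 1] + eps
    return y[1:]
```

For $c>0$ the largest root of ${I}_{p}+\Pi $ is $1-c/T,$ so the simulated process drifts slowly back to equilibrium in the direction of the near unit root.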

A similar model could be suggested for near unit roots in the $I\left(2\right)$ model, see Di Iorio et al. (2016), but this will not be attempted here.

The model (1) contains as a special case the DGP used for the simulations in Elliott (1998), where the errors are i.i.d. Gaussian and no deterministic components are present. The likelihood ratio test, ${Q}_{\beta},$ for $\beta $ equal to a given value, is derived assuming that $c=0$ and analyzed when in fact near unit roots are present, $c\ne 0$. The parameters $\alpha ,\beta ,$ and $\Omega $ can be estimated consistently, but c cannot, and this is what causes the bad behaviour of ${Q}_{\beta}.$

The matrix $\Pi (\alpha ,\beta ,c)=\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}$ is an invertible function of the ${p}^{2}$ parameters $(\alpha ,\beta ,c),$ see Lemma 1, so that the Gaussian maximum likelihood estimator in model (1) is least squares, and the limit distributions are found in Theorem 2. The main contribution of this paper, however, is a simulation study for the bivariate VAR with $p=2$, $r=1.$ It is shown that two of the methods introduced by McCloskey (2017, Theorems Bonf and Bonf-Adj), for allowing the critical value for ${Q}_{\beta}$ to depend on the estimator of $c,$ give a much better solution to inference on $\beta ,$ in the case of a near unit root. The results of McCloskey (2017) also allow for multivariate parameters and for more complex adjustments, but in the present paper we focus for the simulations on the case with $p=2$ and $r=1,$ so there is only one parameter in c. In case $r=1,$ the matrix ${I}_{p}+\Pi $ is linear in $c\in \mathbb{R},$ and for $c=0,$ it has an extra unit root. Therefore there is a near unit root for $c\ne 0$, and we choose the vector ${\alpha}_{1}$ such that $c\ge 0$ corresponds to the non-explosive near unit roots of interest.

The assumption that ${\alpha}_{1}$ and ${\beta}_{1}$ are known is satisfied under the null, in the DGP analyzed by Elliott, see (15) and (16). This is of course convenient, because ${\alpha}_{1}$ and ${\beta}_{1},$ as free parameters, are not estimable.

Let $\theta $ denote the parameters $\alpha ,$ $\beta $ and $\Omega $ and let $\widehat{\theta}$ and $\widehat{c}$ denote the maximum likelihood estimators in model (1). For a given $\eta $ (here $5\%$ or $10\%),$ the quantile ${c}_{\theta ,\eta}\left(c\right)$ is defined by ${P}_{c,\theta}\{\widehat{c}\le {c}_{\theta ,\eta}\left(c\right)\}=\eta $. Simulations show that the quantile is increasing in $c,$ and solving the inequality for $c,$ a $1-\eta $ confidence interval, $[0,{c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)],$ is defined for c. For given $\xi $ (here $90\%$ or $95\%)$ the quantile ${q}_{\theta ,\xi}\left(c\right)$ is defined by ${P}_{c,\theta}\{{Q}_{\beta}\le {q}_{\theta ,\xi}\left(c\right)\}=\xi ,$ and McCloskey (2017) suggests replacing the critical value ${q}_{\theta ,\xi}\left(c\right)$ by the stochastic critical value ${q}_{\theta ,\xi}\left({c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right),$ or introducing the optimal $\xi $ by solving the equation

$$\underset{0\le c<\infty}{max}{P}_{c,\theta}\left\{{Q}_{\beta}>{q}_{\theta ,\xi}\left({c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right)\right\}=\upsilon ,$$

for a given nominal size $\upsilon $ (here 10%).

These methods are explained and implemented by a simulation study, and it is shown that they offer a solution to the problem of inference on $\beta $ in the presence of a near unit root.

The model is given by (1) and the following standard $I\left(1\right)$ assumptions are made.

It is assumed that $r<p,$ c is $(p-r)\times (p-r),$ and that the equation

$$det\left({I}_{p}(1-z)-\alpha {\beta}^{\prime}z\right)=0$$

has $p-r$ roots equal to one and the remaining roots outside the unit circle, such that $\left|\mathrm{eigen}\left({I}_{r}+{\beta}^{\prime}\alpha \right)\right|<1$. Moreover $\Pi =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}$ has rank p and

$$det\left({I}_{p}(1-z)-\alpha {\beta}^{\prime}z-{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}z\right)=0,$$

has all roots outside the unit circle for all $T\ge {T}_{0}$.

For the asymptotic analysis we need condition (2) to hold for T tending to ∞, and for the simulations, we need it to hold for $T=100.$ In model (1) with cointegrating rank r and ${\alpha}_{1}$ and ${\beta}_{1}$ known, the number of free parameters in $\alpha $ and $\beta $ is $2pr-{r}^{2}={p}^{2}-{(p-r)}^{2}.$ The next result shows how the parameters $\alpha ,\beta ,c$ are calculated from $\Pi .$ For any $p\times m$ matrix of rank $m<p,$ we use the notation ${a}_{\perp}$ to indicate a $p\times (p-m)$ matrix of rank $p-m,$ for which ${a}_{\perp}^{\prime}a=0,$ and the notation $\overline{a}=a{\left({a}^{\prime}a\right)}^{-1}.$

Let $\Pi =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}$ and let Assumption 1 be satisfied. Then, for β normalized as ${\beta}^{\prime}b={I}_{r},$

$$\begin{array}{cc}\hfill \alpha & =\Pi {\beta}_{1\perp}{({\alpha}_{1\perp}^{\prime}\Pi {\beta}_{1\perp})}^{-1}{\alpha}_{1\perp}^{\prime}\Pi b,\hfill \end{array}$$

$$\begin{array}{cc}\hfill {\beta}^{\prime}& ={({\alpha}_{1\perp}^{\prime}\Pi b)}^{-1}{\alpha}_{1\perp}^{\prime}\Pi ,\hfill \end{array}$$

$$\begin{array}{cc}\hfill c& =T{\left({\beta}_{1}^{\prime}{\Pi}^{-1}{\alpha}_{1}\right)}^{-1}.\hfill \end{array}$$

To discuss the estimation we introduce the product moments of $\Delta {y}_{t}$ and ${y}_{t-1}$

$${S}_{00}={T}^{-1}\sum _{t=1}^{T}\Delta {y}_{t}\Delta {y}_{t}^{\prime},\phantom{\rule{2.em}{0ex}}{S}_{11}={T}^{-1}\sum _{t=1}^{T}{y}_{t-1}{y}_{t-1}^{\prime},\phantom{\rule{2.em}{0ex}}{S}_{10}={S}_{01}^{\prime}={T}^{-1}\sum _{t=1}^{T}{y}_{t-1}\Delta {y}_{t}^{\prime}.$$

In model (1) with ${\alpha}_{1}$ and ${\beta}_{1}$ known, the Gaussian maximum likelihood estimator of $\Pi =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}$ is the coefficient in a least squares regression of $\Delta {y}_{t}$ on ${y}_{t-1}.$ For β normalized on some $p\times r$ matrix b, ${\beta}^{\prime}b={I}_{r},$ the maximum likelihood estimators of $(\alpha ,\beta ,c)$ are given in (3)–(5) by inserting $\widehat{\Pi}.$

For $c=0,$ such that the rank of Π is $r,$ the likelihood ratio test for a given value of β is

$${Q}_{\beta}=Tlog\frac{det\left({S}_{00}-{S}_{01}\beta {\left({\beta}^{\prime}{S}_{11}\beta \right)}^{-1}{\beta}^{\prime}{S}_{10}\right)}{det\left({S}_{00}-{S}_{01}\stackrel{\u02d8}{\beta}{\left({\stackrel{\u02d8}{\beta}}^{\prime}{S}_{11}\stackrel{\u02d8}{\beta}\right)}^{-1}{\stackrel{\u02d8}{\beta}}^{\prime}{S}_{10}\right)},$$

where the maximum likelihood estimator $\stackrel{\u02d8}{\beta}$ is determined by reduced rank regression assuming the rank is r.
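A direct implementation of the statistic is a sketch of how (6) can be computed: the product moments are formed from the data, the denominator uses $\stackrel{\u02d8}{\beta}$ from the reduced rank (eigenvalue) regression, and the numerator fixes β at the hypothesized value. This assumes the no-deterministics, one-lag case of model (1); the function and variable names are our own.

```python
import numpy as np

def lr_test_beta(y, beta):
    """LR test Q_beta for a given cointegrating vector beta (r=1),
    computed as in (6) under the assumption c = 0."""
    dy = np.diff(y, axis=0)   # Delta y_t
    ylag = y[:-1]             # y_{t-1}
    T = dy.shape[0]
    S00 = dy.T @ dy / T
    S11 = ylag.T @ ylag / T
    S01 = dy.T @ ylag / T
    S10 = S01.T

    def resid_det(b):
        # det of residual covariance from regressing dy_t on b'y_{t-1}
        M = S00 - S01 @ b @ np.linalg.inv(b.T @ S11 @ b) @ b.T @ S10
        return np.linalg.det(M)

    # reduced rank regression: solve S11^{-1} S10 S00^{-1} S01 v = lambda v
    # and take the eigenvector of the largest eigenvalue as beta-breve
    A = np.linalg.solve(S11, S10 @ np.linalg.solve(S00, S01))
    eigval, eigvec = np.linalg.eig(A)
    beta_hat = eigvec[:, np.argsort(-eigval.real)[:1]].real
    return T * np.log(resid_det(beta) / resid_det(beta_hat))
```

Since $\stackrel{\u02d8}{\beta}$ maximizes the likelihood, the statistic is nonnegative by construction (up to rounding error).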

The basic asymptotic result for the analysis of the estimators and the test statistic is that ${\alpha}_{\perp}^{\prime}{y}_{t}$ converges to an Ornstein-Uhlenbeck process. This technique was developed by Phillips (1988); Johansen (1996, chp. 14) is used as a reference for details related to the CVAR. The results for the test statistic can be found in Elliott (1998).

Under Assumption 1, the process given by (1) satisfies

$${T}^{-1/2}{\alpha}_{\perp}^{\prime}{y}_{\left[Tu\right]}\stackrel{D}{\to}K\left(u\right),$$

where $K$ is the Ornstein-Uhlenbeck process

$$K\left(u\right)={\alpha}_{\perp}^{\prime}{\int}_{0}^{u}exp\left\{{\alpha}_{1}c{\beta}_{1}^{\prime}C(u-s)\right\}d{W}_{\epsilon}\left(s\right),$$

$C={\beta}_{\perp}{\left({\alpha}_{\perp}^{\prime}{\beta}_{\perp}\right)}^{-1}{\alpha}_{\perp}^{\prime}$ and ${W}_{\epsilon}$ is the Brownian motion generated by the cumulated ${\epsilon}_{t}.$

The test ${Q}_{\beta}$ for a given value of $\beta ,$ derived assuming $c=0,$ see (6), satisfies

$${Q}_{\beta}\stackrel{D}{\to}{\chi}_{(p-r)r}^{2}+B,$$

where the stochastic noncentrality parameter

$$B=tr\left\{{\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}{\beta}_{\perp}{\left({\alpha}_{\perp}^{\prime}{\beta}_{\perp}\right)}^{-1}\left({\int}_{0}^{1}K{K}^{\prime}du\right){\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}\right\},$$

is independent of the ${\chi}^{2}$ distribution and has expectation

$$E\left(B\right)=tr\left\{{\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}C\left({\int}_{0}^{1}(1-v)exp\left(v\tau C\right)\Omega exp\left(v{C}^{\prime}{\tau}^{\prime}\right)dv\right){C}^{\prime}\right\}.$$

Here $\zeta ={\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}{\alpha}^{\prime}{\Omega}^{-1}{\alpha}_{1}$ and $\tau ={\alpha}_{1}c{\beta}_{1}^{\prime},$ so it follows that $E\left(B\right)=0$ if and only if ${\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha =0,$ in which case ${Q}_{\beta}\stackrel{D}{\to}{\chi}_{(p-r)r}^{2}.$

Let β be normalized as ${\beta}^{\prime}{\beta}_{1\perp}={I}_{r}.$ The asymptotic distributions of the estimators $\widehat{\alpha}$, $\widehat{\beta}$, $\widehat{c}$, see (3)–(5), are given as

$$\begin{array}{cc}\hfill {T}^{1/2}(\widehat{\alpha}-\alpha )\stackrel{\mathrm{D}}{\to}& {N}_{p\times r}(0,{\mathsf{\Sigma}}_{\beta \beta}^{-1}\otimes \Omega ),\hfill \end{array}$$

$$\begin{array}{cc}\hfill T{(\widehat{\beta}-\beta )}^{\prime}{\beta}_{\perp}\stackrel{\mathrm{D}}{\to}& {\left({\alpha}_{1\perp}^{\prime}\alpha \right)}^{-1}{\alpha}_{1\perp}^{\prime}{\int}_{0}^{1}\left(d{W}_{\epsilon}\right){K}^{\prime}{\left({\int}_{0}^{1}K{K}^{\prime}du\right)}^{-1}{\alpha}_{\perp}^{\prime}{\beta}_{\perp},\hfill \end{array}$$

$$\begin{array}{cc}\hfill \widehat{c}-c\stackrel{\mathrm{D}}{\to}& {\left({\alpha}_{\perp}^{\prime}{\alpha}_{1}\right)}^{-1}{\alpha}_{\perp}^{\prime}{\int}_{0}^{1}\left(d{W}_{\epsilon}\right){K}^{\prime}{\left({\int}_{0}^{1}K{K}^{\prime}du\right)}^{-1}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}{\left({\beta}_{1}^{\prime}{\beta}_{\perp}\right)}^{-1}.\hfill \end{array}$$

Note that the asymptotic distributions of $\widehat{\beta}$ and $\widehat{c}$ given in (11) and (12) are not mixed Gaussian, because ${\alpha}_{1\perp}^{\prime}{W}_{\epsilon}\left(u\right)$ and ${\alpha}_{\perp}^{\prime}{W}_{\epsilon}\left(u\right)$ are not independent of $K\left(u\right),$ which is generated by ${\alpha}_{\perp}^{\prime}{\epsilon}_{t}.$

In the special case where $r=p-1,$ we choose ${\alpha}_{1}$ so that $c\ge 0,$ and find

$$E\left(B\right)=\frac{{e}^{2\delta c}-1-2\delta c}{{\left(2\delta \right)}^{2}}\kappa \zeta ,$$

where

$$\delta ={\beta}_{1}^{\prime}C{\alpha}_{1},\phantom{\rule{4pt}{0ex}}\phantom{\rule{4pt}{0ex}}\phantom{\rule{4pt}{0ex}}\kappa ={\beta}_{1}^{\prime}C\Omega {C}^{\prime}{\beta}_{1},\phantom{\rule{4pt}{0ex}}\phantom{\rule{4pt}{0ex}}\phantom{\rule{4pt}{0ex}}\zeta ={\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}{\alpha}^{\prime}{\Omega}^{-1}{\alpha}_{1}.$$
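The closed form is easy to evaluate numerically. The small helper below (names are ours) computes $E\left(B\right)$ from $(c,\delta ,\kappa ,\zeta )$; a Taylor expansion of the exponential shows that the limit as $\delta \to 0$ is ${c}^{2}\kappa \zeta /2,$ which is included as the degenerate case.

```python
import numpy as np

def expected_B(c, delta, kappa, zeta):
    """E(B) for r = p-1:
    E(B) = (exp(2*delta*c) - 1 - 2*delta*c) / (2*delta)**2 * kappa * zeta,
    with the delta -> 0 limit c**2/2 * kappa * zeta."""
    if delta == 0.0:
        return 0.5 * c**2 * kappa * zeta
    return (np.exp(2 * delta * c) - 1 - 2 * delta * c) / (2 * delta) ** 2 * kappa * zeta
```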

In this section the method of McCloskey (2017, Theorem Bonf) is illustrated by a number of simulation experiments. The simulations are performed with data generated by a bivariate model (1), where $p=2$ and $r=1.$ The direction ${\alpha}_{1}$ is chosen such that $c\ge 0.$ The test ${Q}_{\beta}$ for a given value of $\beta $ is calculated assuming $c=0$, see (6). The simulations of Elliott (1998), see Section 3.3, show that there may be serious size distortions of the test, depending on the value of c and $\rho $, if the test is based on the quantiles from the asymptotic ${\chi}^{2}\left(1\right)$ distribution.

The method of McCloskey (2017) consists in this case of replacing the ${\chi}^{2}\left(1\right)$ critical value with a stochastic critical value depending on $\widehat{c},$ in order to control the rejection probability under the null hypothesis.

Let $\theta =(\alpha ,\beta ,\Omega )$ and let ${P}_{c,\theta}$ denote the probability measure corresponding to the parameters $c,\theta .$ The method consists of finding the $\eta $ quantile of $\widehat{c},$ see (5) with $\Pi $ replaced by $\widehat{\Pi}$, as defined by

$${P}_{c,\theta}\left(\widehat{c}\le {c}_{\theta ,\eta}\left(c\right)\right)=\eta ,$$

for $\eta =5\%$ or $10\%,$ say, and the $\xi $ quantile ${q}_{\theta ,\xi}\left(c\right)$ of ${Q}_{\beta}$ as defined by

$${P}_{c,\theta}\left({Q}_{\beta}\le {q}_{\theta ,\xi}\left(c\right)\right)=\xi ,$$

for $\xi =90\%$ or $95\%,$ say.

By simulation for given $\theta $ and a grid of values $c\in ({c}_{1},\cdots ,{c}_{n}),$ the quantiles ${c}_{\theta ,\eta}\left({c}_{i}\right)$ and ${q}_{\theta ,\xi}\left({c}_{i}\right)$ are determined. It turns out that both ${c}_{\theta ,\eta}\left(c\right)$ and ${q}_{\theta ,\xi}\left(c\right)$ are increasing in $c,$ see Figure A2. Therefore, a solution ${c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)$ can be found such that

$${P}_{c,\theta}\left\{\widehat{c}>{c}_{\theta ,\eta}\left(c\right)\right\}={P}_{c,\theta}\left\{c\le {c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right\}=1-\eta .$$

This gives a $1-\eta $ confidence interval $[0,{c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)]$ for c, based on the estimator $\widehat{c}$. Note that for $c\le {c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)$ it holds by monotonicity of ${q}_{\theta ,\xi}(\cdot )$ that ${q}_{\theta ,\xi}\left(c\right)\le {q}_{\theta ,\xi}\left({c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right),$ such that

$${P}_{c,\theta}\left[{Q}_{\beta}>{q}_{\theta ,\xi}\left\{{c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right\}\phantom{\rule{4.pt}{0ex}}\mathrm{and}\phantom{\rule{4.pt}{0ex}}c\le {c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right]\le {P}_{c,\theta}\left\{{Q}_{\beta}>{q}_{\theta ,\xi}\left(c\right)\right\}\le 1-\xi ,$$

but we also have

$${P}_{c,\theta}\left[{Q}_{\beta}>{q}_{\theta ,\xi}\left\{{c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right\}\phantom{\rule{4.pt}{0ex}}\mathrm{and}\phantom{\rule{4.pt}{0ex}}c>{c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right]\le {P}_{c,\theta}\left[c>{c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right]=\eta ,$$

such that

$${P}_{c,\theta}\left[{Q}_{\beta}>{q}_{\theta ,\xi}\left\{{c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right\}\right]\le 1-\xi +\eta .$$

McCloskey (2017) proves, under suitable conditions, the much stronger result

$$1-\xi \le \underset{T\to \infty}{limsup}\underset{0\le c<\infty}{sup}{P}_{c,\theta}\left[{Q}_{\beta}>{q}_{\widehat{\theta},\xi}\left\{{c}_{\widehat{\theta},\eta}^{-1}\left(\widehat{c}\right)\right\}\right]\le 1-\xi +\eta .$$

Thus, the limiting rejection probability, for given $\theta ,$ of the test on $\beta ,$ calculated as if $c=0,$ but replacing the ${\chi}_{\xi}^{2}\left(1\right)$ quantile by the estimated stochastic quantile ${q}_{\widehat{\theta},\xi}\left({c}_{\widehat{\theta},\eta}^{-1}\left(\widehat{c}\right)\right),$ lies between $1-\xi $ and $1-\xi +\eta .$ In the simulations we set $\eta =0.05$ and $\xi =0.95$, so that the limiting rejection probability is bounded by $10\%.$

Note that $\theta $ is replaced by the consistent estimator $\widehat{\theta}.$ It obviously simplifies matters that in all the examples we simulate, it turns out that ${c}_{\theta ,\eta}\left(c\right)$ is approximately linear and increasing in $c,$ and ${q}_{\theta ,\xi}\left(c\right)$ is approximately quadratic and increasing in c for the relevant values of c, see Figure A2.

McCloskey (2017, Theorem Bonf-Adj) suggests determining by simulation, on a grid of values of c and $\xi ,$ the quantity

$${\overline{P}}_{\theta ,\eta}\left(\xi \right)=\underset{0\le c<\infty}{max}{P}_{c,\theta}\left({Q}_{\beta}>{q}_{\theta ,\xi}\left({c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right)\right).$$

It turns out that ${\overline{P}}_{\theta ,\eta}\left(\xi \right)$ is monotone in $\xi ,$ and we can determine for a given nominal size $\upsilon $ (here 10%)

$${\xi}_{opt}={\overline{P}}_{\theta ,\eta}^{-1}\left(\upsilon \right).$$

The adjusted Bonferroni quantile is then

$${q}_{\theta ,{\xi}_{opt}}\left({c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right),$$

and we find

$${P}_{c,\theta}\left({Q}_{\beta}>{q}_{\theta ,{\xi}_{opt}}\left({c}_{\theta ,\eta}^{-1}\left(\widehat{c}\right)\right)\right)\le \upsilon .$$

The result of McCloskey (2017, Theorem Bonf-Adj) is that under suitable assumptions

$$\underset{T\to \infty}{limsup}\underset{0\le c<\infty}{sup}{P}_{c,\theta}\left[{Q}_{\beta}>{q}_{\widehat{\theta},{\xi}_{opt}}\left\{{c}_{\widehat{\theta},\eta}^{-1}\left(\widehat{c}\right)\right\}\right]=\upsilon ,$$

where we illustrate the upper bound.

The DGP is defined by the equations,

$$\begin{array}{cc}\hfill {y}_{1t}& =\left(1-\frac{c}{T}\right){y}_{1t-1}+{u}_{1t},\hfill \end{array}$$

$$\begin{array}{cc}\hfill {y}_{2t}& =\gamma {y}_{1t}+{u}_{2t}.\hfill \end{array}$$

It is assumed that ${u}_{t}={({u}_{1t},{u}_{2t})}^{\prime}$ are i.i.d. ${N}_{2}(0,{\Omega}_{u})$ with

$${\Omega}_{u}=\left(\begin{array}{cc}1& \rho \\ \rho & 1\end{array}\right),$$

and the initial values are ${y}_{10}={y}_{20}=0.$ The data ${y}_{1},\cdots ,{y}_{T}$ are generated from (15) and (16), and the test statistic ${Q}_{\beta}$ for the hypothesis $\gamma =0$ is calculated using (6).

The DGP defined by (15) and (16) is contained in model (1) for $p=2$. Note that ${y}_{2t}=\gamma (1-c/T){y}_{1t-1}+\gamma {u}_{1t}+{u}_{2t},$ such that

$$\alpha =\left(\begin{array}{c}0\\ 1\end{array}\right),\phantom{\rule{4.pt}{0ex}}\beta =\left(\begin{array}{c}\hfill \gamma \\ \hfill -1\end{array}\right),\phantom{\rule{4.pt}{0ex}}{\alpha}_{1}=\left(\begin{array}{c}\hfill -1\\ \hfill -\gamma \end{array}\right),\phantom{\rule{4.pt}{0ex}}{\beta}_{1}=\left(\begin{array}{c}1\\ 0\end{array}\right),$$

where the sign on ${\alpha}_{1}$ has been chosen such that $c\ge 0.$ Finally ${\epsilon}_{1t}={u}_{1t}$ and ${\epsilon}_{2t}={u}_{2t}+\gamma {u}_{1t},$ and therefore

$$\Omega =\left(\begin{array}{cc}1& \rho +\gamma \\ \rho +\gamma & 1+{\gamma}^{2}+2\gamma \rho \end{array}\right).$$
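The mapping from the DGP parameters $(\gamma ,\rho )$ to the parameters of model (1) can be checked numerically: writing ${\epsilon}_{t}=A{u}_{t}$ with A lower triangular reproduces the Ω above. The helper name is ours; the matrices are those stated in the text.

```python
import numpy as np

def model_parameters(gamma, rho):
    """Map the DGP (15)-(16) into model (1): returns alpha, beta,
    alpha1, beta1 and the implied error covariance Omega."""
    alpha = np.array([[0.0], [1.0]])
    beta = np.array([[gamma], [-1.0]])
    alpha1 = np.array([[-1.0], [-gamma]])
    beta1 = np.array([[1.0], [0.0]])
    A = np.array([[1.0, 0.0], [gamma, 1.0]])        # eps_t = A u_t
    Omega_u = np.array([[1.0, rho], [rho, 1.0]])
    Omega = A @ Omega_u @ A.T                       # = A Omega_u A'
    return alpha, beta, alpha1, beta1, Omega
```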

For $c=0,$ the process ${y}_{t}={({y}_{1t},{y}_{2t})}^{\prime}$ is $I\left(1\right)$ and $\gamma {y}_{1t}-{y}_{2t}$ is stationary, and if $c/T$ is close to zero, ${y}_{t}$ has a near unit root.

Applying Corollary 1 to the DGP (15) and (16), the expectation of the test statistic ${Q}_{\beta}$ is found to be

$$E\left({Q}_{\beta}\right)=p-1+\frac{{e}^{-2c}-1+2c}{4}\frac{{\rho}^{2}}{1-{\rho}^{2}},$$

which increases approximately linearly in $c.$
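The formula can be evaluated directly; for large c the term ${e}^{-2c}$ vanishes, so the expectation grows with slope ${\rho}^{2}/(2(1-{\rho}^{2}))$ in c, which is the approximate linearity noted above. The function name is ours.

```python
import numpy as np

def expected_Q(c, rho, p=2):
    """E(Q_beta) for the Elliott DGP:
    p - 1 + (exp(-2c) - 1 + 2c)/4 * rho^2 / (1 - rho^2)."""
    return p - 1 + (np.exp(-2 * c) - 1 + 2 * c) / 4 * rho**2 / (1 - rho**2)
```

At $c=0$ the expectation reduces to $p-1,$ the mean of the ${\chi}^{2}\left(1\right)$ limit for $p=2$.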

Based on $N=1000$ simulations of errors ${u}_{1},\cdots ,{u}_{T}$, $T=100$, the data ${y}_{1},\cdots ,{y}_{T}$ are constructed from the DGP for each combination of the parameters

$$(\gamma ,c,\rho )\in [-0.5:(0.01):0.5]\times [1:(1):20]\times [-0.9:(0.1):0.9],$$

where $[a:(b):c]$ indicates the interval from a to c with step b. Based on each simulation, $\widehat{c}$ and the test ${Q}_{\beta}$ for $\gamma =0$ are calculated.

The top panel of Figure A1 shows the rejection probabilities of the test ${Q}_{\beta}$ as a function of $(c,\rho )$, using the asymptotic critical value, ${\chi}_{0.90}^{2}\left(1\right)=2.71,$ for a nominal rejection probability of $10\%$. The rejection probability increases with $\left|\rho \right|$ and with c. When $c=10$ (corresponding to an autoregressive coefficient of $1-c/T=0.9$) and $\left|\rho \right|=0.7$, the size of the test ${Q}_{\beta}$ is around $50\%$, as found in Elliott (1998). The results are analogous across models with an unrestricted constant term, or with a constant restricted to the cointegrating space. In the paper by Elliott (1998) a number of tests are analyzed, and it was found that they were quite similar in their performance, and similar to the above likelihood ratio test ${Q}_{\beta}$ from the CVAR with rank equal to 1.

Data are simulated as above, and first the rank test statistic, ${Q}_{r}$, see Johansen (1996, chp. 11), for rank equal to 1 is calculated. The rejection probabilities for a 5% test using ${Q}_{r}$ are given in the bottom panel of Figure A1, and they show that for $c=20$ the hypothesis that the rank is 1 is practically certain to be rejected. If $c=8,$ the probability of rejecting that the rank is 1 is around $50\%$, so plotting the rejection probabilities for $0\le c\le 10$ covers the relevant values, see Figure A3.

For $\eta =5\%$ and $10\%,$ the quantiles ${c}_{\eta}\left(c\right)$ of $\widehat{c}$ are reported in Figure A2 as a function of c. The quantiles ${c}_{\eta}\left(c\right)$ are nearly linear in $c,$ and they are approximated by

$${\tilde{c}}_{\eta}\left(c\right)={a}_{\eta}+{b}_{\eta}c,$$

where the coefficients $({a}_{\eta},{b}_{\eta})$ depend on $\eta .$ This is used to construct the upper confidence limit in (14) as

$${\tilde{c}}_{\eta}^{-1}\left(\widehat{c}\right)=(\widehat{c}-{a}_{\eta}){b}_{\eta}^{-1}.$$
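The fitting-and-inversion step can be sketched as follows: on a grid of true c values, the $\eta $-quantile of $\widehat{c}$ is computed per grid point, a line ${a}_{\eta}+{b}_{\eta}c$ is fitted, and the inverse map gives the upper confidence limit. Function and argument names are ours.

```python
import numpy as np

def fit_c_quantile(c_grid, chat_samples, eta=0.05):
    """Fit the near-linear eta-quantile c_eta(c) = a + b*c of chat on a
    grid of true c values, and return (a, b) plus the inverse map giving
    the upper confidence limit (chat - a)/b."""
    q = np.quantile(chat_samples, eta, axis=1)   # eta-quantile per grid point
    b, a = np.polyfit(c_grid, q, 1)              # slope b, intercept a
    upper = lambda chat: (chat - a) / b
    return a, b, upper
```

Here `chat_samples` is an array with one row of simulated $\widehat{c}$ draws per grid value of c.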

For $\xi =90\%$ and $95\%,$ the quantiles ${q}_{\rho ,\xi}\left(c\right)$ of ${Q}_{\beta}$ are reported in Figure A2 as functions of c for four values of $\rho $. It is seen that for given $\rho $, the quantiles ${q}_{\rho ,\xi}\left(c\right)$ are monotone and quadratic in c for the relevant values of $c,$ and hence they can be approximated by

$${\tilde{q}}_{\rho ,\xi}\left(c\right)={f}_{\rho ,\xi}+{g}_{\rho ,\xi}c+{h}_{\rho ,\xi}{c}^{2},$$

where the coefficients $({f}_{\rho ,\xi},{g}_{\rho ,\xi},{h}_{\rho ,\xi})$ depend on $\rho $ and $\xi $. The modified critical value is then constructed replacing $(c,\rho )$ by $({\tilde{c}}_{\eta}^{-1}\left(\widehat{c}\right),\widehat{\rho})$ in (19), and thus one finds the adjusted critical value

$${\tilde{q}}_{\widehat{\rho},\xi ,\eta}\left(\widehat{c}\right)={f}_{\widehat{\rho},\xi}+{g}_{\widehat{\rho},\xi}(\widehat{c}-{a}_{\eta}){b}_{\eta}^{-1}+{h}_{\widehat{\rho},\xi}{\left((\widehat{c}-{a}_{\eta}){b}_{\eta}^{-1}\right)}^{2},$$

which depends on estimated values, $\widehat{c}$ and $\widehat{\rho}$, and on discretionary values, $\xi $ and $\eta $.
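Assembling the Bonferroni critical value from the fitted pieces is then a one-liner: plug the upper confidence limit $(\widehat{c}-{a}_{\eta}){b}_{\eta}^{-1}$ into the fitted quadratic. The coefficient values in the test below are placeholders standing in for output of the fitting step, not values from the paper.

```python
def adjusted_critical_value(chat, coef, a_eta, b_eta):
    """Bonferroni critical value (20): evaluate the fitted quadratic
    quantile of Q_beta at the upper confidence limit (chat - a_eta)/b_eta.
    `coef` holds the (f, g, h) coefficients fitted at rho_hat."""
    f, g, h = coef
    cbar = (chat - a_eta) / b_eta
    return f + g * cbar + h * cbar**2
```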

The adjusted Bonferroni quantile is explained in Section 3.2. Simulations show that ${\overline{P}}_{\theta ,\eta}\left(\xi \right)$ is linear in $\xi ,$ and the solution of the equation

$${\overline{P}}_{\theta ,\eta}\left(\xi \right)=\upsilon ,$$

where $\upsilon =0.10$ is the nominal size of the test, determines ${\xi}_{opt}$; the adjusted Bonferroni quantile is then found like (20) as

$${\tilde{q}}_{\widehat{\rho},{\xi}_{opt},\eta}\left(\widehat{c}\right)={f}_{\widehat{\rho},{\xi}_{opt}}+{g}_{\widehat{\rho},{\xi}_{opt}}(\widehat{c}-{a}_{\eta}){b}_{\eta}^{-1}+{h}_{\widehat{\rho},{\xi}_{opt}}{\left((\widehat{c}-{a}_{\eta}){b}_{\eta}^{-1}\right)}^{2},$$

where $\eta =0.05$.
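Since ${\overline{P}}_{\theta ,\eta}$ is near-linear and decreasing in $\xi $ (a higher quantile rejects less often), solving ${\overline{P}}_{\theta ,\eta}\left(\xi \right)=\upsilon $ on a simulated grid reduces to linear interpolation. The grid values in the test are illustrative placeholders, not the paper's simulated quantities.

```python
import numpy as np

def solve_xi_opt(xi_grid, pbar, upsilon=0.10):
    """Solve Pbar_{theta,eta}(xi) = upsilon by linear interpolation,
    exploiting that Pbar is near-linear and decreasing in xi."""
    # np.interp needs increasing x-values; Pbar decreases in xi, so reverse
    return float(np.interp(upsilon, pbar[::-1], xi_grid[::-1]))
```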

The rejection frequency of ${Q}_{\beta},$ the test for $\gamma =0,$ calculated using the ${\chi}_{0.90}^{2}\left(1\right)$ quantile, the Bonferroni quantile in (20) for $\xi =95\%$ and $\eta =5\%$, and the adjusted Bonferroni quantile in (21) for $\eta =5\%$, is reported as a function of c for four values of $\rho $ in Figure A3. For both corrections the rejection frequency is below the nominal size of $10\%$; hence both procedures are able to eliminate the serious size distortions of the ${\chi}^{2}$ test. While the Bonferroni adjustment leads to a rather conservative test with rejection frequency well below the nominal size, the adjusted Bonferroni procedure is closer to the nominal value. The power of the two procedures is shown in Figure A4 and Figure A5 for values of $\left|\gamma \right|\le 1/2.$ It is seen that the better rejection probabilities in Figure A3 are achieved together with a reasonable power for $c\le 5,$ where the probability of rejecting the hypothesis of $r=1$ is around $30\%$, see bottom panel of Figure A1. Notice that both tests become slightly biased, that is, the power functions are not flat around the null $\gamma =0$.

In conclusion, the simulations indicate that the adjusted Bonferroni procedure works better than the simple Bonferroni, the reason being that the former relies on the joint distribution of ${Q}_{\beta}$ and $\widehat{c}$.

Four other data generating processes are defined in Table 1, to investigate the role of different choices of ${\alpha}_{1}$ and ${\beta}_{1}$ in improving the rejection probabilities of the test on $\beta $ under the null and the alternative. The DGPs all have $\alpha =-\beta ={(-1,1)}^{\prime}/2$. The vectors ${\alpha}_{1}$ and ${\beta}_{1}$ are chosen to investigate different positions of the near unit root in the DGP.

The choice of DGP turns out to be important also for the test, ${Q}_{r},$ for $r=1.$ In fact the probability of rejecting $r=1$ is around $50\%$ for DGP 1 if $c=4$, for DGP 2 if $c=20$, whereas for DGP 3 and 4 the $50\%$ value is $c=8$.

The rejection probabilities in Figure A6 are plotted for $0\le c\le 10,$ to cover the most relevant values.

The results are summarized in Figure A6, Figure A7 and Figure A8. It is seen that the conclusions from the study of the DGP analyzed by Elliott seem to be valid also for other DGPs. For moderate values of $c,$ using the Bonferroni quantiles gives a rather conservative test, while the adjusted Bonferroni procedure is closer to the nominal size, and the power curves look reasonable for $c\le 5,$ although the tests are slightly biased, except for DGP 1. For this DGP, ${\alpha}_{1}={\beta}_{1}={(1,1)}^{\prime}$ and $\Omega ={I}_{2},$ such that ${\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha =0,$ which means that the asymptotic distribution of ${Q}_{\beta}$ is ${\chi}^{2}\left(1\right),$ see Theorem 3, despite the near unit root. It is seen from Figure A6 that there is only moderate distortion of the rejection probability in this case, and in Figure A7 and Figure A8 the power curves are symmetric around $\gamma =0,$ so the tests are approximately unbiased.

It has been demonstrated that for the DGP analyzed by Elliott (1998), it is possible to apply the methods of McCloskey (2017) to adjust the critical value in such a way that the rejection probabilities of the test for $\beta $ are very close to the nominal values. By simulating the power of the test for $\beta ,$ it is seen that for $c\le 5,$ the test has a reasonable power. Some other DGPs have been investigated and similar results have been found.

The first author gratefully acknowledges partial financial support from MIUR PRIN grant 2010J3LZEN and thanks Stefano Fachin for discussions. The second author is grateful to CREATES - Center for Research in Econometric Analysis of Time Series (DNRF78), funded by the Danish National Research Foundation, and to Peter Boswijk and Adam McCloskey for discussions.

Both authors contributed equally to the paper.

The authors declare no conflict of interest.

Multiplying $\Pi =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}$ by ${\beta}_{1\perp},$ we find

$$\Pi {\beta}_{1\perp}=\alpha {\beta}^{\prime}{\beta}_{1\perp}.$$

Multiplying $\Pi $ by ${\alpha}_{1\perp}^{\prime}$ we find

$${\alpha}_{1\perp}^{\prime}\Pi ={\alpha}_{1\perp}^{\prime}\alpha {\beta}^{\prime}={\alpha}_{1\perp}^{\prime}\Pi {\beta}_{1\perp}{\left({\beta}^{\prime}{\beta}_{1\perp}\right)}^{-1}{\beta}^{\prime}.$$

Multiplying by b we find

$${\left({\beta}^{\prime}{\beta}_{1\perp}\right)}^{-1}={({\alpha}_{1\perp}^{\prime}\Pi {\beta}_{1\perp})}^{-1}{\alpha}_{1\perp}^{\prime}\Pi b.$$

It follows from (A2) that

$${\beta}^{\prime}={\beta}^{\prime}{\beta}_{1\perp}{({\alpha}_{1\perp}^{\prime}\Pi {\beta}_{1\perp})}^{-1}{\alpha}_{1\perp}^{\prime}\Pi ={({\alpha}_{1\perp}^{\prime}\Pi b)}^{-1}{\alpha}_{1\perp}^{\prime}\Pi ,$$

and from (A1) that

$$\alpha =\Pi {\beta}_{1\perp}{\left({\beta}^{\prime}{\beta}_{1\perp}\right)}^{-1}=\Pi {\beta}_{1\perp}{({\alpha}_{1\perp}^{\prime}\Pi {\beta}_{1\perp})}^{-1}{\alpha}_{1\perp}^{\prime}\Pi b,$$

which proves (3) and (4).

Inserting these results in the expression for $\Pi ,$ we find, using ${\alpha}_{1\perp}^{\prime}\Pi b{\beta}^{\prime}={\alpha}_{1\perp}^{\prime}\alpha {\beta}^{\prime}b{\beta}^{\prime}={\alpha}_{1\perp}^{\prime}\Pi ,$ that

$$\Pi =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}=\Pi {\beta}_{1\perp}{({\alpha}_{1\perp}^{\prime}\Pi {\beta}_{1\perp})}^{-1}{\alpha}_{1\perp}^{\prime}\Pi +{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}.$$

Next $\Pi $ is decomposed using
which is proved by premultiplying (A4) by ${\alpha}_{1\perp}^{\prime}$ and ${\beta}_{1}^{\prime}{\Pi}^{-1}.$ Subtracting (A3) and (A4) and multiplying by ${\overline{\alpha}}_{1}^{\prime}$ and ${\overline{\beta}}_{1},$ it is seen that
☐

$$\Pi =\Pi {\beta}_{1\perp}{({\alpha}_{1\perp}^{\prime}\Pi {\beta}_{1\perp})}^{-1}{\alpha}_{1\perp}^{\prime}\Pi +{\alpha}_{1}{\left({\beta}_{1}^{\prime}{\Pi}^{-1}{\alpha}_{1}\right)}^{-1}{\beta}_{1}^{\prime},$$

$${\left({\beta}_{1}^{\prime}{\Pi}^{-1}{\alpha}_{1}\right)}^{-1}=c/T.$$
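Because the identities of Lemma 1 are exact matrix algebra, they can be checked numerically. The following sketch (with illustrative dimensions $p=3,$ $r=2,$ a single near-unit-root direction, and hypothetical random parameter values) verifies the expressions for ${\beta}^{\prime}$ and $\alpha ,$ the decomposition (A4), and the identity ${\left({\beta}_{1}^{\prime}{\Pi}^{-1}{\alpha}_{1}\right)}^{-1}=c/T$:

```python
import numpy as np

rng = np.random.default_rng(0)
p, r, T, c = 3, 2, 100, 2.0
alpha, beta = rng.standard_normal((p, r)), rng.standard_normal((p, r))
alpha1, beta1 = rng.standard_normal((p, 1)), rng.standard_normal((p, 1))
Pi = alpha @ beta.T + (c / T) * alpha1 @ beta1.T

def perp(m):
    # Orthogonal complement of the column space of m, via the full SVD
    return np.linalg.svd(m, full_matrices=True)[0][:, m.shape[1]:]

a1perp, b1perp = perp(alpha1), perp(beta1)
b = beta @ np.linalg.inv(beta.T @ beta)          # any b with beta' b = I_r

# beta' = (alpha1_perp' Pi b)^{-1} alpha1_perp' Pi
beta_rec = np.linalg.solve(a1perp.T @ Pi @ b, a1perp.T @ Pi)
# alpha = Pi beta1_perp (alpha1_perp' Pi beta1_perp)^{-1} alpha1_perp' Pi b
alpha_rec = Pi @ b1perp @ np.linalg.solve(a1perp.T @ Pi @ b1perp, a1perp.T @ Pi @ b)
# Decomposition (A4): first term plus alpha1 (beta1' Pi^{-1} alpha1)^{-1} beta1'
Pi_dec = (Pi @ b1perp @ np.linalg.solve(a1perp.T @ Pi @ b1perp, a1perp.T @ Pi)
          + alpha1 @ np.linalg.inv(beta1.T @ np.linalg.inv(Pi) @ alpha1) @ beta1.T)

ok_beta = np.allclose(beta_rec, beta.T)
ok_alpha = np.allclose(alpha_rec, alpha)
ok_dec = np.allclose(Pi_dec, Pi)
ok_c = np.allclose(np.linalg.inv(beta1.T @ np.linalg.inv(Pi) @ alpha1), c / T)
print(ok_beta, ok_alpha, ok_dec, ok_c)
```

All four checks hold to machine precision for any generic draw, since no approximation is involved at this stage.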

The unrestricted maximum likelihood estimator of $\Pi $ is $\widehat{\Pi}={S}_{01}{S}_{11}^{-1},$ with $\widehat{\Omega}={S}_{00}-{S}_{01}{S}_{11}^{-1}{S}_{10},$ and the results for $\widehat{\alpha},\widehat{\beta},\widehat{c}$ follow from Lemma 1. If $c=0,$ the maximum likelihood estimator $\stackrel{\u02d8}{\beta}$ can be determined by reduced rank regression, see Johansen (1996, chp. 6). ☐

Proof of (7) and (8): The limit results for the product moments are given first, using the normalization matrix ${C}_{T}=(\beta ,{T}^{-1/2}{\alpha}_{\perp})$ and the notation ${S}_{1\epsilon}={T}^{-1}{\sum}_{t=1}^{T}{y}_{t-1}{\epsilon}_{t}^{\prime}$,

$$\begin{array}{cc}\hfill {C}_{T}^{\prime}{S}_{11}{C}_{T}& =\left(\begin{array}{cc}{\beta}^{\prime}{S}_{11}\beta & {T}^{-1/2}{\beta}^{\prime}{S}_{11}{\alpha}_{\perp}\\ {T}^{-1/2}{\alpha}_{\perp}^{\prime}{S}_{11}\beta & {T}^{-1}{\alpha}_{\perp}^{\prime}{S}_{11}{\alpha}_{\perp}\end{array}\right)\stackrel{\mathrm{D}}{\to}\left(\begin{array}{cc}{\mathsf{\Sigma}}_{\beta \beta}& 0\\ 0& {\int}_{0}^{1}K{K}^{\prime}du\end{array}\right),\hfill \end{array}$$

$$\begin{array}{cc}\hfill {T}^{1/2}{C}_{T}^{\prime}{S}_{1\epsilon}& =\left(\begin{array}{c}{T}^{1/2}{\beta}^{\prime}{S}_{1\epsilon}\\ {\alpha}_{\perp}^{\prime}{S}_{1\epsilon}\end{array}\right)\stackrel{\mathrm{D}}{\to}\left(\begin{array}{c}{N}_{r\times p}(0,\Omega \otimes {\Sigma}_{\beta \beta})\\ {\int}_{0}^{1}K{\left(d{W}_{\epsilon}\right)}^{\prime}\end{array}\right).\hfill \end{array}$$

The test for a known value of $\beta $ is given in (6). For the derivation of the limit distribution of ${Q}_{\beta},$ it is convenient to normalize $\stackrel{\u02d8}{\beta}$ on the matrix $\alpha {\left({\beta}^{\prime}\alpha \right)}^{-1},$ such that ${\stackrel{\u02d8}{\beta}}^{\prime}\alpha {\left({\beta}^{\prime}\alpha \right)}^{-1}={I}_{r},$ and to define $\stackrel{\u02d8}{\theta}={\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}(\stackrel{\u02d8}{\beta}-\beta ).$ This gives the representation
A proof under much weaker conditions can be found in Elliott (1998); it is only sketched here. The estimator for $\theta ,$ for known $\alpha ,\Omega $ and $c=0,$ is given by the equation
where ${\alpha}_{\Omega}={\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}.$ The limit distribution of $T\stackrel{\u02d8}{\theta}$ follows from (A5) and (A6) as follows. Because ${T}^{-1}{\alpha}_{\perp}^{\prime}{S}_{11}\beta \stackrel{\mathrm{P}}{\to}0$ it follows that
and from ${\alpha}_{\perp}^{\prime}{S}_{1\epsilon}\stackrel{\mathrm{D}}{\to}{\int}_{0}^{1}K{\left(d{W}_{\epsilon}\right)}^{\prime},$ it is seen that
say. Conditional on $K,$ the distribution of U is Gaussian with variance ${\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}\otimes {\left({\int}_{0}^{1}K{K}^{\prime}du\right)}^{-1}$ and mean ${\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}{\beta}_{1}c{\alpha}_{1}^{\prime}{\alpha}_{\Omega}.$ The information about $\theta $ satisfies
and inserting U for $\left(d\theta \right)$ determines the asymptotic distribution of ${Q}_{\beta}.$ Conditional on K, this has a noncentral ${\chi}^{2}\left((p-r)r\right)$ distribution with noncentrality parameter
where $\zeta ={\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}{\alpha}^{\prime}{\Omega}^{-1}{\alpha}_{1},$ which proves (8). The marginal distribution is therefore a noncentral ${\chi}^{2}$ distribution with a stochastic noncentrality parameter, which is independent of the ${\chi}^{2}$ distribution, as shown by Elliott (1998).

$$\stackrel{\u02d8}{\beta}-\beta ={\alpha}_{\perp}{\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}(\stackrel{\u02d8}{\beta}-\beta )+\beta {\left({\alpha}^{\prime}\beta \right)}^{-1}{\alpha}^{\prime}(\stackrel{\u02d8}{\beta}-\beta )={\alpha}_{\perp}\stackrel{\u02d8}{\theta}.$$

$$T\stackrel{\u02d8}{\theta}={\left({\alpha}_{\perp}^{\prime}{T}^{-1}{S}_{11}{\alpha}_{\perp}\right)}^{-1}({\alpha}_{\perp}^{\prime}{S}_{1\epsilon}+{\alpha}_{\perp}^{\prime}{T}^{-1}{S}_{11}{\beta}_{1}c{\alpha}_{1}^{\prime}){\alpha}_{\Omega},$$

$$\begin{array}{cc}\hfill {\alpha}_{\perp}^{\prime}{T}^{-1}{S}_{11}{\beta}_{1}c{\alpha}_{1}^{\prime}& ={\alpha}_{\perp}^{\prime}{T}^{-1}{S}_{11}\left({\alpha}_{\perp}{\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}+\beta {\left({\alpha}^{\prime}\beta \right)}^{-1}{\alpha}^{\prime}\right){\beta}_{1}c{\alpha}_{1}^{\prime}\hfill \\ & \stackrel{\mathrm{D}}{\to}\left({\int}_{0}^{1}K{K}^{\prime}du\right){\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}{\beta}_{1}c{\alpha}_{1}^{\prime},\hfill \end{array}$$

$$T\stackrel{\u02d8}{\theta}\stackrel{\mathrm{D}}{\to}{\left({\int}_{0}^{1}K{K}^{\prime}du\right)}^{-1}\left({\int}_{0}^{1}K{\left(d{W}_{\epsilon}\right)}^{\prime}+\left({\int}_{0}^{1}K{K}^{\prime}du\right){\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}{\beta}_{1}c{\alpha}_{1}^{\prime}\right){\alpha}_{\Omega}=U,$$

$${T}^{-2}{I}_{\theta \theta}=tr\left\{{\Omega}^{-1}\alpha {\left(d\theta \right)}^{\prime}\left({T}^{-1}{\alpha}_{\perp}^{\prime}{S}_{11}{\alpha}_{\perp}\right)\left(d\theta \right){\alpha}^{\prime}\right\}\stackrel{\mathrm{D}}{\to}tr\left\{{\alpha}^{\prime}{\Omega}^{-1}\alpha {\left(d\theta \right)}^{\prime}{\int}_{0}^{1}K{K}^{\prime}du\left(d\theta \right)\right\},$$

$$B=tr\left\{{\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}{\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}{\beta}_{\perp}{\left({\alpha}_{\perp}^{\prime}{\beta}_{\perp}\right)}^{-1}{\int}_{0}^{1}K{K}^{\prime}du\right\},$$

Proof of (9): For $\tau ={\alpha}_{1}c{\beta}_{1}^{\prime}$ it is seen that
which proves (9). Note that this expression is zero if and only if $\zeta =0,$ that is, ${\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha =0,$ in which case the asymptotic distribution of ${Q}_{\beta}$ is ${\chi}^{2}.$

$$\begin{array}{c}Etr\left\{{\left({\beta}_{\perp}^{\prime}{\alpha}_{\perp}\right)}^{-1}{\beta}_{\perp}^{\prime}{\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}{\beta}_{\perp}{\left({\alpha}_{\perp}^{\prime}{\beta}_{\perp}\right)}^{-1}{\int}_{0}^{1}K{K}^{\prime}du\right\}\hfill \\ =Etr\left\{{\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}C{\int}_{0}^{1}\left({\int}_{0}^{u}exp\left(\tau C(u-s)\right)dW\left(s\right)\right)\left({\int}_{0}^{u}dW{\left(t\right)}^{\prime}exp\left({C}^{\prime}{\tau}^{\prime}(u-t)\right)\right)du{C}^{\prime}\right\}\hfill \\ =tr\left\{{\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}C{\int}_{0}^{1}\left({\int}_{0}^{u}exp\left(\tau C(u-s)\right)\Omega exp\left({C}^{\prime}{\tau}^{\prime}(u-s)\right)ds\right)du{C}^{\prime}\right\}\hfill \\ =tr\left\{{\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}C\left({\int}_{0}^{1}(1-v)exp\left(v\tau C\right)\Omega exp\left(v{C}^{\prime}{\tau}^{\prime}\right)dv\right){C}^{\prime}\right\},\hfill \end{array}$$
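The final equality above is the deterministic change of variable $v=u-s,$ which reduces ${\int}_{0}^{1}{\int}_{0}^{u}f(u-s)\,ds\,du$ to ${\int}_{0}^{1}(1-v)f\left(v\right)dv.$ This step can be checked by quadrature for a scalar stand-in $f$ (the exponent $a=1.3$ below is an arbitrary illustrative choice playing the role of the matrix exponent):

```python
import numpy as np

def trap(y, x):
    # Simple trapezoidal rule (avoids version differences around np.trapz)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

a = 1.3                        # arbitrary scalar standing in for the matrix exponent
f = lambda v: np.exp(a * v)

u = np.linspace(0.0, 1.0, 2001)
# Left side: integrate f(u - s) over the triangle 0 <= s <= u <= 1
inner = np.array([trap(f(ui - u[u <= ui]), u[u <= ui]) for ui in u])
lhs = trap(inner, u)
# Right side: single integral with weight (1 - v)
rhs = trap((1.0 - u) * f(u), u)
print(abs(lhs - rhs))  # agreement up to quadrature error
```

The two sides agree up to the quadrature error of the grid, confirming the reduction used in the display.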

Proof of (10) and (11):

It follows that $\widehat{\Pi}={S}_{01}{S}_{11}^{-1}$ can be expressed as
where, using (A5) and (A6),
From ${\widehat{\beta}}^{\prime}={\left({\alpha}_{1\perp}^{\prime}\widehat{\Pi}b\right)}^{-1}{\alpha}_{1\perp}^{\prime}\widehat{\Pi},$ it follows that
where ${T}^{1/2}{M}_{1T}{\beta}^{\prime}{\beta}_{\perp}=0,$ ${\alpha}_{1\perp}^{\prime}{\alpha}_{1}c{\beta}_{1}^{\prime}=0$ and ${\beta}^{\prime}b={I}_{r}.$ This proves (11).

$$\begin{array}{cc}\hfill \widehat{\Pi}& =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}+{S}_{\epsilon 1}{S}_{11}^{-1}\hfill \\ & =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}+{T}^{-1/2}\left({T}^{1/2}{S}_{\epsilon 1}{C}_{T}\right){\left({C}_{T}^{\prime}{S}_{11}{C}_{T}\right)}^{-1}{\left(\beta ,{T}^{-1/2}{\alpha}_{\perp}\right)}^{\prime}\hfill \\ & =\alpha {\beta}^{\prime}+{T}^{-1}{\alpha}_{1}c{\beta}_{1}^{\prime}+{T}^{-1/2}{M}_{1T}{\beta}^{\prime}+{T}^{-1}{M}_{2T}{\alpha}_{\perp}^{\prime},\hfill \end{array}$$

$$\begin{array}{cc}\hfill {M}_{1T}\stackrel{\mathrm{D}}{\to}{M}_{1}& ={N}_{p\times r}(0,{\mathsf{\Sigma}}_{\beta \beta}^{-1}\otimes \Omega ),\hfill \end{array}$$

$$\begin{array}{cc}\hfill {M}_{2T}\stackrel{\mathrm{D}}{\to}{M}_{2}& ={\int}_{0}^{1}d{W}_{\epsilon}{K}^{\prime}{\left({\int}_{0}^{1}K{K}^{\prime}du\right)}^{-1}.\hfill \end{array}$$

$$\begin{array}{cc}\hfill T{(\widehat{\beta}-\beta )}^{\prime}{\beta}_{\perp}& =T{\left({\alpha}_{1\perp}^{\prime}\widehat{\Pi}b\right)}^{-1}{\alpha}_{1\perp}^{\prime}(\widehat{\Pi}-\alpha {\beta}^{\prime}){\beta}_{\perp}\hfill \\ & ={\left({\alpha}_{1\perp}^{\prime}\widehat{\Pi}b\right)}^{-1}{\alpha}_{1\perp}^{\prime}({T}^{1/2}{M}_{1T}{\beta}^{\prime}+{M}_{2T}{\alpha}_{\perp}^{\prime}){\beta}_{\perp}\hfill \\ & \stackrel{\mathrm{D}}{\to}{\left({\alpha}_{1\perp}^{\prime}\alpha {\beta}^{\prime}b\right)}^{-1}{\alpha}_{1\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}={\left({\alpha}_{1\perp}^{\prime}\alpha \right)}^{-1}{\alpha}_{1\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp},\hfill \end{array}$$

From the normalization ${\widehat{\beta}}^{\prime}b={I}_{r},$ and replacing $\widehat{\beta}$ by $\beta ,$ we find
which proves (10).

$$\begin{array}{cc}\hfill {T}^{1/2}(\widehat{\alpha}-\alpha )& ={T}^{1/2}(\widehat{\Pi}{\beta}_{1\perp}{\left({\widehat{\beta}}^{\prime}{\beta}_{1\perp}\right)}^{-1}-\Pi {\beta}_{1\perp}{\left({\beta}^{\prime}{\beta}_{1\perp}\right)}^{-1})\hfill \\ & ={T}^{1/2}({T}^{-1/2}{M}_{1T}+{T}^{-1}{M}_{2T}{\alpha}_{\perp}^{\prime}{\beta}_{1\perp}{\left({\beta}^{\prime}{\beta}_{1\perp}\right)}^{-1})+{o}_{P}\left(1\right)\hfill \\ & ={M}_{1T}+{T}^{-1/2}{M}_{2T}{\alpha}_{\perp}^{\prime}{\beta}_{1\perp}{\left({\beta}^{\prime}{\beta}_{1\perp}\right)}^{-1}+{o}_{P}\left(1\right)\stackrel{\mathrm{D}}{\to}{M}_{1},\hfill \end{array}$$
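The algebra behind (11) can be illustrated numerically: plugging fixed draws in place of the limits ${M}_{1},{M}_{2}$ into the expansion of $\widehat{\Pi},$ the normalized estimator ${\widehat{\beta}}^{\prime}={\left({\alpha}_{1\perp}^{\prime}\widehat{\Pi}b\right)}^{-1}{\alpha}_{1\perp}^{\prime}\widehat{\Pi}$ satisfies $T{(\widehat{\beta}-\beta )}^{\prime}{\beta}_{\perp}\approx {\left({\alpha}_{1\perp}^{\prime}\alpha \right)}^{-1}{\alpha}_{1\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}$ for large $T.$ A sketch with hypothetical dimensions $p=3,$ $r=2$ (the discrepancy is of order ${T}^{-1/2}$):

```python
import numpy as np

rng = np.random.default_rng(1)
p, r, c, T = 3, 2, 2.0, 10**8

alpha, beta = rng.standard_normal((p, r)), rng.standard_normal((p, r))
alpha1, beta1 = rng.standard_normal((p, 1)), rng.standard_normal((p, 1))
M1, M2 = rng.standard_normal((p, r)), rng.standard_normal((p, p - r))

def perp(m):
    # Orthogonal complement of the column space of m
    return np.linalg.svd(m, full_matrices=True)[0][:, m.shape[1]:]

a1perp, aperp, bperp = perp(alpha1), perp(alpha), perp(beta)
b = beta @ np.linalg.inv(beta.T @ beta)        # normalization beta' b = I_r

# Expansion of Pi-hat, with fixed draws M1, M2 in place of M_1T, M_2T
Pi_hat = (alpha @ beta.T + (c / T) * alpha1 @ beta1.T
          + T ** -0.5 * M1 @ beta.T + (1.0 / T) * M2 @ aperp.T)

beta_hat = np.linalg.solve(a1perp.T @ Pi_hat @ b, a1perp.T @ Pi_hat).T
lhs = T * (beta_hat - beta).T @ bperp
rhs = np.linalg.solve(a1perp.T @ alpha, a1perp.T @ M2 @ aperp.T @ bperp)
print(np.abs(lhs - rhs).max())  # small, of order T**-0.5
```

Note that the ${M}_{1}$ term drops out exactly because ${\beta}^{\prime}{\beta}_{\perp}=0$; only the replacement of ${\alpha}_{1\perp}^{\prime}\widehat{\Pi}b$ by ${\alpha}_{1\perp}^{\prime}\alpha $ is approximate.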

Proof of (12): To analyse the limit distribution of $\widehat{c}$, define
and write
The expansion (A7), and the limits (A8) and (A9) are then applied to give the limit results
Thus
Multiplying by ${\beta}_{1}^{\prime}$ and ${\alpha}_{1}$ and inverting, it is seen that, because ${\beta}_{1}^{\prime}{\beta}_{\perp}$ and ${\alpha}_{\perp}^{\prime}{\alpha}_{1}$ are $(p-r)\times (p-r)$ matrices of full rank,
which proves (12). ☐

$${A}_{T}=({T}^{-1/2}\overline{\alpha},{\alpha}_{\perp})\phantom{\rule{4.pt}{0ex}}\mathrm{and}\phantom{\rule{4.pt}{0ex}}{B}_{T}=({T}^{-1/2}\overline{\beta},{\beta}_{\perp}),$$

$$\widehat{c}=T{\left({\beta}_{1}^{\prime}{\widehat{\Pi}}^{-1}{\alpha}_{1}\right)}^{-1}={\left({\beta}_{1}^{\prime}{B}_{T}{\left({A}_{T}^{\prime}T\widehat{\Pi}{B}_{T}\right)}^{-1}{A}_{T}^{\prime}{\alpha}_{1}\right)}^{-1}.$$

$$\begin{array}{cc}\hfill {T}^{-1/2}{\overline{\alpha}}^{\prime}\left(T\widehat{\Pi}\right){T}^{-1/2}\overline{\beta}& ={I}_{r}+O\left({T}^{-1}\right)+{O}_{P}\left({T}^{-1}\right),\hfill \\ \hfill {T}^{-1/2}{\overline{\alpha}}^{\prime}\left(T\widehat{\Pi}\right){\beta}_{\perp}& =0+O\left({T}^{-1/2}\right)+{O}_{P}\left({T}^{-1/2}\right),\hfill \\ \hfill {\alpha}_{\perp}^{\prime}\left(T\widehat{\Pi}\right)\overline{\beta}{T}^{-1/2}& =0+O\left({T}^{-1/2}\right)+{\alpha}_{\perp}^{\prime}{M}_{1T},\hfill \\ \hfill {\alpha}_{\perp}^{\prime}\left(T\widehat{\Pi}\right){\beta}_{\perp}& =0+{\alpha}_{\perp}^{\prime}{\alpha}_{1}c{\beta}_{1}^{\prime}{\beta}_{\perp}+{\alpha}_{\perp}^{\prime}{M}_{2T}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}.\hfill \end{array}$$

$$\begin{array}{c}{A}_{T}^{\prime}\left(T\widehat{\Pi}\right){B}_{T}\stackrel{\mathrm{D}}{\to}\left(\begin{array}{cc}{I}_{r}& 0\\ {\alpha}_{\perp}^{\prime}{M}_{1}& {\alpha}_{\perp}^{\prime}{\alpha}_{1}c{\beta}_{1}^{\prime}{\beta}_{\perp}+{\alpha}_{\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}\end{array}\right),\hfill \\ {B}_{T}{\left({A}_{T}^{\prime}\left(T\widehat{\Pi}\right){B}_{T}\right)}^{-1}{A}_{T}^{\prime}\stackrel{\mathrm{D}}{\to}(0,{\beta}_{\perp}){\left(\begin{array}{cc}{I}_{r}& 0\\ {\alpha}_{\perp}^{\prime}{M}_{1}& {\alpha}_{\perp}^{\prime}{\alpha}_{1}c{\beta}_{1}^{\prime}{\beta}_{\perp}+{\alpha}_{\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}\end{array}\right)}^{-1}{(0,{\alpha}_{\perp})}^{\prime}\hfill \\ ={\beta}_{\perp}{({\alpha}_{\perp}^{\prime}{\alpha}_{1}c{\beta}_{1}^{\prime}{\beta}_{\perp}+{\alpha}_{\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp})}^{-1}{\alpha}_{\perp}^{\prime}.\hfill \end{array}$$

$$\begin{array}{cc}\hfill \widehat{c}={\left({\beta}_{1}^{\prime}{B}_{T}{\left({A}_{T}^{\prime}T\widehat{\Pi}{B}_{T}\right)}^{-1}{A}_{T}^{\prime}{\alpha}_{1}\right)}^{-1}\stackrel{\mathrm{D}}{\to}& \phantom{\rule{4pt}{0ex}}{\left[{\beta}_{1}^{\prime}{\beta}_{\perp}{({\alpha}_{\perp}^{\prime}{\alpha}_{1}c{\beta}_{1}^{\prime}{\beta}_{\perp}+{\alpha}_{\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp})}^{-1}{\alpha}_{\perp}^{\prime}{\alpha}_{1}\right]}^{-1}\hfill \\ \hfill =& \phantom{\rule{4pt}{0ex}}{\left({\alpha}_{\perp}^{\prime}{\alpha}_{1}\right)}^{-1}({\alpha}_{\perp}^{\prime}{\alpha}_{1}c{\beta}_{1}^{\prime}{\beta}_{\perp}+{\alpha}_{\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}){\left({\beta}_{1}^{\prime}{\beta}_{\perp}\right)}^{-1}\hfill \\ \hfill =& \phantom{\rule{4pt}{0ex}}c+{\left({\alpha}_{\perp}^{\prime}{\alpha}_{1}\right)}^{-1}{\alpha}_{\perp}^{\prime}{M}_{2}{\alpha}_{\perp}^{\prime}{\beta}_{\perp}{\left({\beta}_{1}^{\prime}{\beta}_{\perp}\right)}^{-1},\hfill \end{array}$$
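The final reduction for $\widehat{c}$ is exact matrix algebra and can be checked with random matrices. In the sketch below (hypothetical dimensions $p=3,$ $r=1,$ so the near-unit-root block has dimension $m=p-r=2$), `aperp` and `bperp` are arbitrary stand-ins for ${\alpha}_{\perp},{\beta}_{\perp},$ since orthogonality is not needed for this step:

```python
import numpy as np

rng = np.random.default_rng(2)
p, r = 3, 1
m = p - r                                    # dimension of the near-unit-root block

alpha1, beta1 = rng.standard_normal((p, m)), rng.standard_normal((p, m))
aperp, bperp = rng.standard_normal((p, m)), rng.standard_normal((p, m))
c = rng.standard_normal((m, m))
M2 = rng.standard_normal((p, m))

inner = aperp.T @ alpha1 @ c @ beta1.T @ bperp + aperp.T @ M2 @ aperp.T @ bperp
# Left side of the last display: [beta1' beta_perp inner^{-1} alpha_perp' alpha1]^{-1}
lhs = np.linalg.inv(beta1.T @ bperp @ np.linalg.inv(inner) @ aperp.T @ alpha1)
# Right side: c + (alpha_perp' alpha1)^{-1} alpha_perp' M2 alpha_perp' beta_perp (beta1' beta_perp)^{-1}
rhs = c + (np.linalg.solve(aperp.T @ alpha1, aperp.T @ M2 @ aperp.T @ bperp)
           @ np.linalg.inv(beta1.T @ bperp))
print(np.allclose(lhs, rhs))
```

The check holds exactly for any generic draw, illustrating that the limit of $\widehat{c}$ is $c$ plus a stochastic term driven by ${M}_{2}.$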

Proof of (13): If $r=p-1$, the expression (9) can be reduced as follows. For $\tau ={\alpha}_{1}c{\beta}_{1}^{\prime}$
for $\delta ={\beta}_{1}^{\prime}C{\alpha}_{1},$ and in general for $n\ge 0,$ it is seen that
Therefore, using ${\beta}_{1}{c}^{\prime}\zeta c{\beta}_{1}^{\prime}={\tau}^{\prime}{\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}{\alpha}^{\prime}{\Omega}^{-1}\tau ,$
The integral can be calculated by the expansion
where $\kappa ={\beta}_{1}^{\prime}C\Omega {C}^{\prime}{\beta}_{1}.$ The integral then reduces to
Therefore
where $\zeta ={\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}{\alpha}^{\prime}{\Omega}^{-1}{\alpha}_{1}.$ ☐

$${\left(\tau C\right)}^{2}=c{\alpha}_{1}c\left({\beta}_{1}^{\prime}C{\alpha}_{1}\right){\beta}_{1}^{\prime}C=c\left({\beta}_{1}^{\prime}C{\alpha}_{1}\right){\alpha}_{1}c{\beta}_{1}^{\prime}C=c\delta \tau C,$$

$${\left(\tau C\right)}^{n+1}={\left(c\delta \right)}^{n}\tau C.$$

$$E\left(B\right)=tr\left\{{\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}{\alpha}^{\prime}{\Omega}^{-1}\tau C\left({\int}_{0}^{1}(1-v)exp\left(v\tau C\right)\Omega exp\left(v{C}^{\prime}{\tau}^{\prime}\right)dv\right){C}^{\prime}{\tau}^{\prime}\right\}.$$

$$\begin{array}{c}\tau Cexp\left(v\tau C\right)\Omega exp\left(v{C}^{\prime}{\tau}^{\prime}\right){C}^{\prime}{\tau}^{\prime}=\sum _{n,m=0}^{\infty}\frac{{v}^{n}}{n!}{\left(\tau C\right)}^{n+1}\Omega {\left({C}^{\prime}{\tau}^{\prime}\right)}^{m+1}\frac{{v}^{m}}{m!}\hfill \\ =\sum _{n,m=0}^{\infty}\frac{{\left(vc\delta \right)}^{n+m}}{n!m!}\tau C\Omega {C}^{\prime}{\tau}^{\prime}=exp\left(2vc\delta \right){c}^{2}\kappa {\alpha}_{1}{\alpha}_{1}^{\prime},\hfill \end{array}$$

$$\begin{array}{c}\tau C\left({\int}_{0}^{1}(1-v)exp\left(v\tau C\right)\Omega exp\left(v{C}^{\prime}{\tau}^{\prime}\right)dv\right){C}^{\prime}{\tau}^{\prime}\hfill \\ =\left({\int}_{0}^{1}(1-v)exp\left(2vc\delta \right)dv\right){c}^{2}\kappa {\alpha}_{1}{\alpha}_{1}^{\prime}=\frac{{e}^{2\delta c}-1-2c\delta}{{\left(2\delta c\right)}^{2}}{c}^{2}\kappa {\alpha}_{1}{\alpha}_{1}^{\prime}.\hfill \end{array}$$

$$E\left(B\right)=\frac{{e}^{2c\delta}-1-2c\delta}{{\left(2\delta \right)}^{2}}\kappa {\alpha}_{1}^{\prime}{\Omega}^{-1}\alpha {\left({\alpha}^{\prime}{\Omega}^{-1}\alpha \right)}^{-1}{\alpha}^{\prime}{\Omega}^{-1}{\alpha}_{1}=({e}^{2c\delta}-1-2c\delta )\frac{\kappa \zeta}{{\left(2\delta \right)}^{2}},$$
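Two computational steps in this proof can be verified directly: the power identity ${\left(\tau C\right)}^{n+1}={\left(c\delta \right)}^{n}\tau C$ and the closed form of the weighted exponential integral. The sketch below uses hypothetical random values for $\tau $ and $C,$ and a fixed exponent $a=0.8$ standing in for $2c\delta $ (the integral identity holds for any value):

```python
import numpy as np

rng = np.random.default_rng(3)
p, c = 4, 1.7
alpha1, beta1 = rng.standard_normal((p, 1)), rng.standard_normal((p, 1))
C = rng.standard_normal((p, p))

tau = c * alpha1 @ beta1.T                    # tau = alpha1 c beta1' (c scalar, r = p - 1)
delta = float(beta1.T @ C @ alpha1)

# (tau C)^(n+1) = (c delta)^n tau C, checked for n = 2
tC = tau @ C
ok_power = np.allclose(tC @ tC @ tC, (c * delta) ** 2 * tC)

# int_0^1 (1 - v) e^{a v} dv = (e^a - 1 - a) / a^2, with a standing in for 2 c delta
a = 0.8
v = np.linspace(0.0, 1.0, 200001)
y = (1.0 - v) * np.exp(a * v)
quad = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(v)))
closed = (np.exp(a) - 1.0 - a) / a ** 2
print(ok_power, abs(quad - closed))
```

The rank-one structure of $\tau C$ is what collapses the double exponential series to the scalar factor $exp\left(2vc\delta \right)$ in the proof.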

- Cavanagh, Christopher L., Graham Elliott, and James H. Stock. 1995. Inference in Models with Nearly Integrated Regressors. Econometric Theory 11: 1131–47.
- Di Iorio, Francesca, Stefano Fachin, and Riccardo Lucchetti. 2016. Can you do the wrong thing and still be right? Hypothesis testing in I(2) and near-I(2) cointegrated VARs. Applied Economics 48: 3665–78.
- Elliott, Graham. 1998. On the robustness of cointegration methods when regressors almost have unit roots. Econometrica 66: 149–58.
- Johansen, Søren. 1996. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press.
- McCloskey, Adam. 2017. Bonferroni-based size-correction for nonstandard testing problems. Journal of Econometrics, in press. Available online: http://www.sciencedirect.com/science/article/pii/S0304407617300556 (accessed on 12 June 2017).
- Phillips, Peter C. B. 1988. Regression theory for near integrated time series. Econometrica 56: 1021–44.

Four DGPs Allowing for near Unit Roots, $\Omega ={\mathit{I}}_{2}$

| DGP | $\Pi $ | DGP | $\Pi $ |
|---|---|---|---|
| 1 | $\left(\begin{array}{cc}-\frac{1}{4}-c/T& \frac{1}{4}-c/T\\ \frac{1}{4}-c/T& -\frac{1}{4}-c/T\end{array}\right)$ | 2 | $\left(\begin{array}{cc}-\frac{1}{4}& \frac{1}{4}\\ \frac{1}{4}& -\frac{1}{4}-c/T\end{array}\right)$ |
| 3 | $\left(\begin{array}{cc}-\frac{1}{4}& \frac{1}{4}\\ \frac{1}{4}-c/T& -\frac{1}{4}-c/T\end{array}\right)$ | 4 | $\left(\begin{array}{cc}-\frac{1}{4}-c/T& \frac{1}{4}-c/T\\ \frac{1}{4}& -\frac{1}{4}\end{array}\right)$ |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).