Likelihood Ratio Tests of Restrictions on Common Trends Loading Matrices in I(2) VAR Systems

Boswijk, H. Peter; Paruolo, Paolo

doi:10.3390/econometrics5030028


by H. Peter Boswijk ¹ and Paolo Paruolo ²,*

¹ Amsterdam School of Economics and Tinbergen Institute, University of Amsterdam, 1001 NJ Amsterdam, The Netherlands

² Joint Research Centre, European Commission, 21027 Ispra (VA), Italy

* Author to whom correspondence should be addressed.

Econometrics 2017, 5(3), 28; https://doi.org/10.3390/econometrics5030028

Submission received: 28 February 2017 / Revised: 30 May 2017 / Accepted: 21 June 2017 / Published: 29 June 2017

(This article belongs to the Special Issue Recent Developments in Cointegration)


Abstract

Likelihood ratio tests of over-identifying restrictions on the common trends loading matrices in I(2) VAR systems are discussed. It is shown how hypotheses on the common trends loading matrices can be translated into hypotheses on the cointegration parameters. Algorithms for (constrained) maximum likelihood estimation are presented, and asymptotic properties sketched. The techniques are illustrated using the analysis of the PPP and UIP between Switzerland and the US.

Keywords: cointegration; common trends; identification; VAR; I(2)

JEL Classification: C32

1. Introduction

The duality between the common trends representation and the vector equilibrium-correction model-form (VECM) in cointegrated systems allows researchers to formulate hypotheses of economic interest on either of the two. The VECM is centered on the adjustment with respect to disequilibria in the system; in this way it facilitates the interpretation of cointegrating relations as (deviations from) equilibria.

The common trends representation instead highlights how variables in the system are pushed around by common stochastic trends, which are often interpreted as the main persistent economic factors influencing the long term. Both representations provide insights on the economic system under scrutiny. Examples of both perspectives are given in Juselius (2017a, 2017b).

The common trends and VECM representations are connected through representation results such as the Granger Representation Theorem, in the case of I(1) systems, see Engle and Granger (1987) and Johansen (1991), and the Johansen Representation Theorem, for the case of I(2) systems, see Johansen (1992). In particular, both representation theorems show that the loading matrix of the common stochastic trends of highest order is a basis of the orthogonal complement of the matrix of cointegrating relations. Because of this property, these two matrices are linked, and any one of them can be written as a function of the other one.

This paper focuses on I(2) vector autoregressive (VAR) systems, and it considers the situation where (possibly over-identifying) economic hypotheses are entertained for the factor loading matrix of the I(2) trends. It is shown how they can then be translated into hypotheses on the cointegrating relations, which appear in the VECM representation; the latter forms the basis for maximum likelihood (ML) estimation of I(2) VAR models. In this way, constrained ML estimators are obtained and the associated likelihood ratio (LR) tests of these hypotheses can be defined. These tests are discussed in the present paper; Wald tests on just-identified loading matrices of the I(1) and I(2) common trends have already been proposed by Paruolo (1997, 2002).

The running example of the paper is taken from Juselius and Assenmacher (2015), which is the working paper version of Juselius and Assenmacher (2017). The following notation is used: for a full column-rank matrix $a$, $\operatorname{col} a$ denotes the space spanned by the columns of $a$, and $a_{⊥}$ indicates a basis of the orthogonal complement of $\operatorname{col} a$. For a matrix $b$ of the same dimensions as $a$, and for which $b' a$ is full rank, let $b_{a} := b (a' b)^{-1}$; a special case is when $a = b$, for which $\bar{a} := a_{a} = a (a' a)^{-1}$. Let also $P_{a} := a (a' a)^{-1} a'$ indicate the orthogonal projection matrix onto $\operatorname{col} a$, and let the matrix $P_{a_{⊥}} = I - P_{a}$ denote the orthogonal projection matrix onto its orthogonal complement. Finally, $e_{j}$ is used to indicate the $j$-th column of an identity matrix of appropriate dimension.
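The notation above maps directly onto a few small matrix helpers. The following is a minimal NumPy sketch; the function names (`orth_complement`, `sub`, `bar`, `proj`) and the numerical matrix are illustrative assumptions, not part of the paper.

```python
import numpy as np

def orth_complement(a, tol=1e-10):
    """Orthonormal basis of the orthogonal complement of col(a), via the SVD."""
    u, s, _ = np.linalg.svd(a, full_matrices=True)
    rank = int(np.sum(s > tol))
    return u[:, rank:]

def sub(b, a):
    """b_a := b (a' b)^{-1}; requires a' b to be square and nonsingular."""
    return b @ np.linalg.inv(a.T @ b)

def bar(a):
    """a-bar := a_a = a (a' a)^{-1}."""
    return sub(a, a)

def proj(a):
    """P_a := a (a' a)^{-1} a', the orthogonal projection onto col(a)."""
    return a @ np.linalg.inv(a.T @ a) @ a.T

# Hypothetical 3 x 2 matrix of full column rank.
a = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
a_perp = orth_complement(a)          # 3 x 1 basis of the orthogonal complement
P_a, P_a_perp = proj(a), proj(a_perp)
```

Any other basis of the orthogonal complement would serve equally well; the SVD is used here only because it is numerically convenient.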

The rest of this paper is organized as follows: Section 2 contains the motivation and the definition of the problem considered in the paper. The identification of the I(2) common trends loading matrix under linear restrictions is analysed in Section 3. The relationship between the identified parametrization of I(2) common trends loading matrix and an identified version of the cointegration matrix is also discussed. Section 4 considers a parametrization of the VECM, and discusses its identification. ML estimation of this model is discussed in Section 5; the asymptotic distributions of the resulting ML estimator of the I(2) loading matrix and the LR statistic of the over-identifying restrictions are sketched in Section 6. Section 7 reports an illustration of the techniques developed in the paper on a system of US and Swiss prices, interest rates and exchange rate. Section 8 concludes, while two appendices report additional technical material.

2. Common Trends Representation for I(2) Systems

This section introduces quantities of interest and presents the motivation of the paper. Consider a $p$-variate VAR($k$) process $X_{t}$:

$$X_{t} = A_{1} X_{t-1} + \dots + A_{k} X_{t-k} + μ_{0} + μ_{1} t + ε_{t}, \tag{1}$$

where $A_{i}$, $i = 1, \dots, k$, are $p \times p$ matrices, $μ_{0}$ and $μ_{1}$ are $p \times 1$ vectors, and $ε_{t}$ is a $p \times 1$ i.i.d. $N(0, Ω)$ vector, with $Ω$ positive definite. Under the conditions of the Johansen Representation Theorem, see Appendix A, called the I(2) conditions, $X_{t}$ admits a common trends I(2) representation of the form

$$X_{t} = C_{2} S_{2t} + C_{1} S_{1t} + Y_{t} + v_{0} + v_{1} t, \tag{2}$$

where $S_{2t} := \sum_{i=1}^{t} \sum_{s=1}^{i} ε_{s}$ are the I(2) stochastic trends (cumulated random walks), $S_{1t} := Δ S_{2t} = \sum_{i=1}^{t} ε_{i}$ is a random walk component, and $Y_{t}$ is an I(0) linear process.

Cointegration occurs when the matrix $C_{2}$ has reduced rank $r_{2} < p$, such that $C_{2} = a b'$, where $a$ and $b$ are $p \times r_{2}$ and of full column rank. This observation lends itself to the following interpretation: $b' S_{2t}$ defines the $r_{2}$ common I(2) trends, while $a$ acts as the loading matrix of $X_{t}$ on the I(2) trends. The reduced rank of $C_{2}$ implies that there exist $m := p - r_{2}$ linearly independent cointegrating vectors, collected in a $p \times m$ matrix $τ$, satisfying $τ' C_{2} = 0$; hence $τ' X_{t}$ is I(1). Combining this with $C_{2} = a b'$, it is clear that $a = τ_{⊥}$, i.e., the columns of the loading matrix span the orthogonal complement of the cointegration space $\operatorname{col} τ$. The interest of this paper lies in hypotheses on $a = τ_{⊥}$.¹

Observe that $C_{2} = a b'$ is invariant to the choice of basis of $\operatorname{col} a$ and $\operatorname{col} b$. In fact, $(a, b)$ can be replaced by $(a Q, b Q'^{-1})$ with $Q$ square and nonsingular without affecting $C_{2}$. One way to resolve this identification problem is to impose restrictions on the entries of $a = τ_{⊥}$; enough restrictions of this kind would make the choice of $τ_{⊥}$ unique. Such an approach to identification is common in confirmatory factor analysis in the statistics literature, see Jöreskog et al. (2016).
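The lack of identification of $(a, b)$ can be checked numerically. The sketch below uses randomly drawn $a$ and $b$ and an arbitrary nonsingular $Q$; all numerical values are illustrative assumptions.

```python
import numpy as np

p, r2 = 5, 2
rng = np.random.default_rng(0)
a = rng.standard_normal((p, r2))          # loading matrix (hypothetical values)
b = rng.standard_normal((p, r2))
C2 = a @ b.T                              # reduced-rank C_2 = a b'

Q = np.array([[2.0, 1.0], [0.0, 1.0]])    # any nonsingular r2 x r2 matrix
a_star = a @ Q
b_star = b @ np.linalg.inv(Q).T           # b Q'^{-1}
C2_star = a_star @ b_star.T               # same C_2 from a different pair (a, b)

# tau spans the orthogonal complement of col(a):
# take the last p - r2 left singular vectors of a.
u, _, _ = np.linalg.svd(a, full_matrices=True)
tau = u[:, r2:]                           # p x m with m = p - r2
```

Both pairs reproduce the same $C_{2}$, and $τ' C_{2} = 0$ holds for either choice of basis.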

If more restrictions are imposed than needed for identification, they are over-identifying. Such over-identifying restrictions on $τ_{⊥}$ usually correspond to (similarly over-identifying) restrictions on $τ$, see Section 3 below. Although economic hypotheses may directly imply restrictions on the cointegrating vectors in $τ$, in some cases it is more natural to formulate restrictions on the I(2) loading matrix $τ_{⊥}$. This is illustrated by the two following examples.

2.1. Example 1

Kongsted (2005) considers a model for $X_{t} = (m_{t} : y_{t}^{n} : p_{t})'$, where $m_{t}$, $y_{t}^{n}$ and $p_{t}$ denote the nominal money stock, nominal income and the price level, respectively (all variables in logs); here ‘:’ indicates horizontal concatenation. He assumes that the system is I(2), with $r_{2} = 1$. Given the definition of the variables, Kongsted (2005) considers the natural question of whether real money $m_{t} - p_{t}$ and real income $y_{t}^{n} - p_{t}$ are at most I(1). This corresponds to an (over-identified) cointegrating matrix $τ$ and loading vector $τ_{⊥}$ of the form

$$τ = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -1 & -1 \end{pmatrix}, \qquad τ_{⊥} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

The form of $τ$ corresponds to the fact that the I(1) linear combinations $τ' X_{t}$ are (linear combinations of) $((m_{t} - p_{t}) : (y_{t}^{n} - p_{t}))'$, as required. On the other hand, the restriction on $τ_{⊥}$ says that each of the three series has exactly the same I(2) trend, with the same scale factor. Both formulations are easily interpretable.

Note that the hypothesis on $τ_{⊥}$ involves two over-identifying restrictions (the second and third components are equal to the first component), in addition to a normalization (the first component equals 1). Similarly, the restriction that the matrix consisting of the first two rows of $τ$ equals $I_{2}$ is a normalization; the two over-identifying restrictions are that the entries in both columns sum to 0.

As this first example shows, knowing $τ$ is the same as knowing $τ_{⊥}$ and vice versa.²
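Example 1 can be illustrated with a small simulation. In the sketch below, a single simulated I(2) trend is loaded equally on the three series; the array `lower` is a placeholder standing in for the lower-order components of (2), and the sample size and seed are arbitrary assumptions.

```python
import numpy as np

tau = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
tau_perp = np.ones((3, 1))

rng = np.random.default_rng(1)
T = 200
S2 = np.cumsum(np.cumsum(rng.standard_normal(T)))   # one I(2) stochastic trend
lower = rng.standard_normal((T, 3))                 # placeholder for lower-order terms
X = S2[:, None] * tau_perp.T + lower                # row t holds X_t' = (m_t, y_t^n, p_t)

# Row t of `combos` holds tau' X_t = (m_t - p_t, y_t^n - p_t).
combos = X @ tau
```

Because $τ' τ_{⊥} = 0$, the I(2) trend drops out of $τ' X_{t}$, leaving only the contribution of the placeholder terms.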

2.2. Example 2

Juselius and Assenmacher (2015) consider a 7-dimensional VAR with $X_{t} = (p_{1t} : p_{2t} : e_{12t} : b_{1t} : b_{2t} : s_{1t} : s_{2t})'$ and $r_{2} = 2$, where $p_{it}$, $b_{it}$, $s_{it}$ are, respectively, the (log of the) price index, the long and the short interest rate of country $i$ at time $t$, and $e_{12t}$ is the log of the exchange rate between country 1 (Switzerland) and 2 (the US) at time $t$. They expect the common trends representation to have a loading matrix $τ_{⊥}$ of the form

$$τ_{⊥} = \begin{pmatrix} ϕ_{11} & 0 \\ ϕ_{21} & ϕ_{22} \\ ϕ_{31} & ϕ_{32} \\ 0 & ϕ_{42} \\ 0 & ϕ_{52} \\ 0 & ϕ_{62} \\ 0 & ϕ_{72} \end{pmatrix}, \tag{3}$$

where $ϕ_{ij}$ indicates an entry not restricted to 0.

The second I(2) trend is loaded on the interest rates $b_{1t}$, $b_{2t}$, $s_{1t}$, $s_{2t}$, as well as on US prices $p_{2t}$ and the exchange rate $e_{12t}$; this can be interpreted as a financial (or ‘speculative’) trend affecting world prices. The first I(2) trend, instead, is only loaded on $p_{1t}$, $p_{2t}$, $e_{12t}$ and embodies a ‘relative price’ I(2) trend; it can be interpreted as the Swiss contribution to the trend in prices.

The cointegrating matrix $τ$ in this example is of dimension $7 \times 5$. It is not obvious what type of restrictions on $τ$ correspond to the structure in (3). However, it is $τ$ rather than $τ_{⊥}$ that enters the likelihood function (as will be analyzed in Section 4). The rest of the paper shows that the restrictions in (3) are over-identifying, how they can be translated into hypotheses on $τ$, and how they can be tested via LR tests.

3. Hypothesis on the Common Trends Loadings

This section discusses linear hypotheses on $τ_{⊥}$ and their relation to $τ$. First, attention is focused on the case of linear hypotheses on the normalized version $τ_{⊥ c_{⊥}} := τ_{⊥} (c_{⊥}' τ_{⊥})^{-1}$ of $τ_{⊥}$. Here $c_{⊥}$ is a full-column-rank matrix of the same dimensions as $τ_{⊥}$ such that $c_{⊥}' τ_{⊥}$ is square and nonsingular.³ This normalization was introduced by Johansen (1991) in the context of the I(1) model in order to isolate the (just-)identified parameters in the cointegration matrix.

Later, linear hypotheses formulated directly on $τ_{⊥}$ are discussed. The main result of this section is the fact that the parameters of interest appear linearly both in $τ_{⊥ c_{⊥}}$ and in $τ_{c}$ in the first case; this is not necessarily true in the second case.

The central relation employed in this section (for both cases) is the following identity:

$$τ_{c} := τ (c' τ)^{-1} = (I - c_{⊥} (τ_{⊥}' c_{⊥})^{-1} τ_{⊥}') \bar{c} = (I - c_{⊥} τ_{⊥ c_{⊥}}') \bar{c}, \tag{4}$$

where $\bar{c} := c (c' c)^{-1}$. This identity readily follows from the oblique projections identity

$$I = τ (c' τ)^{-1} c' + c_{⊥} (τ_{⊥}' c_{⊥})^{-1} τ_{⊥}',$$

see e.g. Srivastava and Kathri (1979, p. 19), by post-multiplication by $\bar{c}$, because $c' \bar{c} = I_{m}$.
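Identity (4) and the oblique projections identity can be verified numerically. The following sketch uses an arbitrary, non-normalized $τ_{⊥}$ with $p = 5$ and $m = 3$; the numerical entries and variable names are illustrative assumptions.

```python
import numpy as np

p, m = 5, 3
r2 = p - m
c = np.vstack([np.eye(m), np.zeros((r2, m))])
c_perp = np.vstack([np.zeros((m, r2)), np.eye(r2)])

# Hypothetical loading matrix tau_perp, deliberately not normalized.
tau_perp = np.array([[0.5, -1.0], [2.0, 0.3], [-0.7, 1.2], [1.0, 0.4], [0.2, 1.5]])
u, _, _ = np.linalg.svd(tau_perp, full_matrices=True)
tau = u[:, r2:]                       # a basis of the orthogonal complement of col(tau_perp)

inv = np.linalg.inv
bar_c = c @ inv(c.T @ c)

# Oblique projections identity: the two terms should sum to the identity matrix.
oblique_sum = tau @ inv(c.T @ tau) @ c.T + c_perp @ inv(tau_perp.T @ c_perp) @ tau_perp.T

# Both sides of identity (4).
tau_c = tau @ inv(c.T @ tau)
rhs = (np.eye(p) - c_perp @ inv(tau_perp.T @ c_perp) @ tau_perp.T) @ bar_c
```

The right-hand side of (4) is computed from $τ_{⊥}$ alone, which is what allows restrictions on the loadings to be carried over to $τ_{c}$.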

3.1. Linear Hypotheses on $τ_{⊥ c_{⊥}}$

Johansen (1991) noted that the function $a_{b} := a (b' a)^{-1}$ is invariant with respect to the choice of basis of the space spanned by $a$. In fact, consider in the present context any alternative basis $τ_{⊥}^{⋆}$ of the space spanned by $τ_{⊥}$; this has representation $τ_{⊥}^{⋆} = τ_{⊥} Q$ for $Q$ square and full rank. Inserting $τ_{⊥}^{⋆}$ in place of $τ_{⊥}$ in the definition of $τ_{⊥ c_{⊥}} := τ_{⊥} (c_{⊥}' τ_{⊥})^{-1}$, one finds

$$τ_{⊥ c_{⊥}}^{⋆} = τ_{⊥}^{⋆} (c_{⊥}' τ_{⊥}^{⋆})^{-1} = τ_{⊥} Q (c_{⊥}' τ_{⊥} Q)^{-1} = τ_{⊥ c_{⊥}}.$$

Hence $τ_{⊥ c_{⊥}}$, similarly to the cointegration matrix in the I(1) model in Johansen (1991), is (just-)identified.

To facilitate stating hypotheses on the unconstrained elements of $τ_{⊥ c_{⊥}}$, the following representation of $τ_{⊥ c_{⊥}}$ appears useful:

$$τ_{⊥ c_{⊥}} = \bar{c}_{⊥} + c ϑ, \tag{5}$$

where $ϑ$ is an $m \times r_{2}$ matrix of free coefficients in $τ_{⊥}$.⁴ For example, one may have

$$c_{⊥} = \begin{pmatrix} 0_{3 \times 2} \\ I_{2} \end{pmatrix}, \quad c = \begin{pmatrix} I_{3} \\ 0_{2 \times 3} \end{pmatrix}, \quad τ_{⊥ c_{⊥}} = \bar{c}_{⊥} + c \begin{pmatrix} ϑ_{11} & ϑ_{12} \\ ϑ_{21} & ϑ_{22} \\ ϑ_{31} & ϑ_{32} \end{pmatrix} = \begin{pmatrix} ϑ_{11} & ϑ_{12} \\ ϑ_{21} & ϑ_{22} \\ ϑ_{31} & ϑ_{32} \\ 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{6}$$

with $p = 5$, $m = 3$, $r_{2} = 2$.
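The representation (5) and the invariance of $τ_{⊥ c_{⊥}}$ to the choice of basis can be sketched numerically for the configuration in (6). The values in `theta` and `Q` are illustrative assumptions.

```python
import numpy as np

inv = np.linalg.inv
c_perp = np.vstack([np.zeros((3, 2)), np.eye(2)])
c = np.vstack([np.eye(3), np.zeros((2, 3))])
bar_c = c @ inv(c.T @ c)
bar_c_perp = c_perp @ inv(c_perp.T @ c_perp)

theta = np.array([[0.5, -1.0], [2.0, 0.3], [-0.7, 1.2]])   # hypothetical free coefficients
tau_perp_norm = bar_c_perp + c @ theta                       # equation (5)

# Change the basis of col(tau_perp) and normalize again on c_perp.
Q = np.array([[2.0, 1.0], [0.0, 1.0]])
tau_perp_star = tau_perp_norm @ Q
renormalized = tau_perp_star @ inv(c_perp.T @ tau_perp_star)

# The free coefficients are recovered as bar_c' tau_perp_norm.
theta_back = bar_c.T @ renormalized
```

Whatever nonsingular `Q` is used, the renormalized matrix coincides with the original one, so `theta` is a well-defined function of the space spanned by $τ_{⊥}$.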

Consider over-identifying linear restrictions on the columns of $ϑ$ in (5). Typically, such restrictions will come in the form of zero (exclusion) restrictions or unit restrictions, where the latter would indicate equal loadings of a specific variable and the variable on which the column of $τ_{⊥ c_{⊥}}$ has been normalized. The general formulation of such restrictions is

$$ϑ_{i} = k_{i} + K_{i} ϕ_{i}, \quad i = 1, \dots, r_{2}, \tag{7}$$

where $ϑ_{i}$ is the $i$-th column vector of $ϑ$, $k_{i}$ and $K_{i}$ are conformable vectors and matrices, and $ϕ_{i}$ contains the remaining unknown parameters in $ϑ_{i}$. If only zero restrictions are imposed, then $k_{i} = 0_{m}$.

The formulation in (7) includes several notable special cases. For instance, if all $K_{i} = K$ and $k_{i} = 0_{m}$, one obtains the hypothesis that $ϑ$ is contained in a given linear space, $ϑ = K ϕ$. Another example is given by the case where one column $ϑ_{1}$ is known, $ϑ = (k_{1} : ϕ)$; this corresponds to the choice $ϑ_{1} = k_{1}$ with $K_{1}$ and $ϕ_{1}$ void and $k_{2} = \dots = k_{r_{2}} = 0$, $K_{2} = \dots = K_{r_{2}} = I$.

The restrictions in (7) may be summarized as

$$\operatorname{vec} ϑ = k + K ϕ, \tag{8}$$

where $k = (k_{1}' : \dots : k_{r_{2}}')'$, $K = \operatorname{blkdiag}(K_{1}, \dots, K_{r_{2}})$ and $ϕ = (ϕ_{1}' : \dots : ϕ_{r_{2}}')'$. Here $\operatorname{blkdiag}(B_{1}, B_{2}, \dots, B_{n})$ indicates a matrix with the (not necessarily square) blocks $B_{1}, B_{2}, \dots, B_{n}$ along the main diagonal. Formulation (8) generalises (7).
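As an illustration of how (7) stacks into (8), the sketch below builds $k$ and $K$ for a hypothetical set of restrictions in the setting of (6): an exclusion restriction $ϑ_{31} = 0$ in the first column and a unit restriction $ϑ_{12} = 1$ in the second. The helper `blkdiag` and all numerical values are assumptions made for the example.

```python
import numpy as np

def blkdiag(*blocks):
    """Block-diagonal matrix built from (not necessarily square) 2-D blocks."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

m, r2 = 3, 2
# Column 1: theta_31 = 0 (exclusion); column 2: theta_12 = 1 (unit restriction).
k1, K1 = np.zeros(m), np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
k2, K2 = np.array([1.0, 0.0, 0.0]), np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

k = np.concatenate([k1, k2])
K = blkdiag(K1, K2)
phi = np.array([0.5, 2.0, 0.3, 1.2])        # hypothetical stacked free parameters

vec_theta = k + K @ phi                      # equation (8)
theta = vec_theta.reshape((m, r2), order="F")   # undo the column-major vec
```

With these values, `theta` has first column $(0.5, 2.0, 0)'$ and second column $(1, 0.3, 1.2)'$.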

The main result of this section is stated in the next theorem.

Theorem 1 (Hypotheses on $τ_{⊥ c_{⊥}}$).

Assume that $ϑ$ satisfies linear restrictions of the type (8); then these restrictions are translated into a linear hypothesis on $\operatorname{vec} τ_{c}$ via

$$\operatorname{vec} τ_{c} = (\operatorname{vec} \bar{c} - (I_{m} \otimes c_{⊥}) K_{m, r_{2}} k) - (I_{m} \otimes c_{⊥}) K_{m, r_{2}} K ϕ, \tag{9}$$

where $K_{m, n}$ is the commutation matrix satisfying $K_{m, n} \operatorname{vec} A = \operatorname{vec} A'$, with $A$ of dimensions $m \times n$, see Magnus and Neudecker (2007).

Proof.

Substitute (8) into (4) and vectorize using standard properties of the $\operatorname{vec}$ operator, see Magnus and Neudecker (2007). ☐

The previous theorem shows that, when one can express a linear hypothesis on the coefficients in $ϑ$ that are unrestricted in $τ_{⊥ c_{⊥}}$, then the same linear hypothesis is translated into a restriction on $\operatorname{vec} τ_{c}$. Note that the proof simply exploits (4).
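The translation in Theorem 1 can be checked numerically by computing $τ_{c}$ in two ways: directly from (4) and (5), and through the vectorized formula (9). The sketch below does so for a hypothetical restriction matrix; the helper names `commutation` and `vec` and all numerical values are assumptions.

```python
import numpy as np

def commutation(m, n):
    """K_{m,n} such that K_{m,n} vec(A) = vec(A') for any m x n matrix A."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] sits at position j*m + i in vec(A) and i*n + j in vec(A').
            K[i * n + j, j * m + i] = 1.0
    return K

def vec(A):
    """Column-major vectorization."""
    return A.reshape(-1, order="F")

p, m, r2 = 5, 3, 2
c = np.vstack([np.eye(m), np.zeros((r2, m))])
c_perp = np.vstack([np.zeros((m, r2)), np.eye(r2)])
bar_c = c @ np.linalg.inv(c.T @ c)
bar_c_perp = c_perp @ np.linalg.inv(c_perp.T @ c_perp)

# Hypothetical restrictions vec(theta) = k + K phi with 4 free parameters.
k = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0])
K = np.zeros((6, 4))
K[0, 0] = K[1, 1] = K[4, 2] = K[5, 3] = 1.0
phi = np.array([0.5, 2.0, 0.3, 1.2])
theta = (k + K @ phi).reshape((m, r2), order="F")

# Direct route: equations (5) and (4).
tau_perp_norm = bar_c_perp + c @ theta
tau_c_direct = (np.eye(p) - c_perp @ tau_perp_norm.T) @ bar_c

# Theorem 1 route: equation (9), written as g + G phi.
lead = np.kron(np.eye(m), c_perp) @ commutation(m, r2)
g = vec(bar_c) - lead @ k
G = -lead @ K
vec_tau_c = g + G @ phi
```

The two routes agree, and the resulting $τ_{c}$ satisfies both $c' τ_{c} = I_{m}$ and $τ_{c}' τ_{⊥ c_{⊥}} = 0$.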

Identification of the restricted coefficients $ϕ$ under these hypotheses can be addressed in a straightforward way. In fact, the parameters in $ϑ$ are identified; hence $ϕ$ is identified provided that the matrix $K$ is of full column rank, which in turn will imply that the Jacobian matrix $\partial \operatorname{vec} τ_{c} / \partial ϕ' = - (I_{m} \otimes c_{⊥}) K_{m, r_{2}} K$ in (9) has full column rank.

Because, in practice, econometricians may explore the form of $τ_{⊥}$ via unrestricted estimates of $τ_{⊥ c_{⊥}}$, see Paruolo (2002), before formulating restrictions on $τ_{⊥}$, using hypotheses on the unrestricted coefficients in $τ_{⊥ c_{⊥}}$ appears a natural sequential step.

The next subsection discusses the alternative approach of specifying hypotheses directly on $τ_{⊥}$.

3.2. Linear Hypotheses on $τ_{⊥}$

In case placing restrictions on the unrestricted coefficients in $τ_{⊥ c_{⊥}}$ is not what the econometrician wants, this subsection considers linear hypotheses on $τ_{⊥}$ directly. It is shown that sometimes it is possible to translate linear hypotheses on $τ_{⊥}$ into linear hypotheses on $τ_{⊥ c_{⊥}}$ for some $c_{⊥}$. It is also shown that this is always possible for $r_{2} = 2$, for which a constructive proof is provided.

Analogously to (7), consider linear hypotheses on the columns of $τ_{⊥}$, of the following type:

$$τ_{⊥, i} = h_{i} + H_{i} ϕ_{i}, \quad i = 1, \dots, r_{2}, \tag{10}$$

summarized as

$$\operatorname{vec} τ_{⊥} = h + H ϕ. \tag{11}$$

In this case, non-zero vectors $h_{i}$ represent normalizations of the columns of the loading matrix, and as before, $ϕ_{i}$ collects the unknown parameters in $τ_{⊥, i}$.

Theorem 2 (Hypotheses on τ_⊥).

Assume that $τ_{⊥} = τ_{⊥}(ϕ)$ satisfies linear restrictions of the type (11); then these restrictions are translated in general into a non-linear hypothesis on $\operatorname{vec} τ_{c}$ via

$$τ_{c} = (I - c_{⊥} (τ_{⊥}(ϕ)' c_{⊥})^{-1} τ_{⊥}(ϕ)') \bar{c} \tag{12}$$

and the Jacobian of the transformation from $ϕ$ to $\operatorname{vec} τ_{c}$ is

$$J(\cdot) := \frac{\partial \operatorname{vec} τ_{c}(\cdot)}{\partial ϕ'} = - (τ_{c}(\cdot)' \otimes c_{⊥} (τ_{⊥}(\cdot)' c_{⊥})^{-1}) K_{p, r_{2}} H. \tag{13}$$

This parametrization is smooth on an open set in the parameter space $Φ$ of $ϕ$ where $c_{⊥}' τ_{⊥}$ is of full rank.

Proof.

Equation (12) is a re-statement of (4). Differentiation of (12) delivers (13). ☐
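The analytic Jacobian (13) can be compared with a finite-difference approximation of the map $ϕ \mapsto \operatorname{vec} τ_{c}(ϕ)$ in (12). The sketch below assumes $h = 0$ and a hypothetical pair of exclusion patterns $H_{1}$, $H_{2}$ chosen so that $c_{⊥}' τ_{⊥}$ is nonsingular; the evaluation point and step size are also assumptions.

```python
import numpy as np

def commutation(m, n):
    """K_{m,n} such that K_{m,n} vec(A) = vec(A') for any m x n matrix A."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[i * n + j, j * m + i] = 1.0
    return K

def vec(A):
    return A.reshape(-1, order="F")

p, r2 = 5, 2
m = p - r2
c = np.vstack([np.eye(m), np.zeros((r2, m))])
c_perp = np.vstack([np.zeros((m, r2)), np.eye(r2)])
bar_c = c @ np.linalg.inv(c.T @ c)

# Hypothetical exclusion patterns: column 1 loads on rows 1, 2, 4; column 2 on rows 2, 3, 5.
I5 = np.eye(p)
H1, H2 = I5[:, [0, 1, 3]], I5[:, [1, 2, 4]]
H = np.zeros((p * r2, 6))
H[:p, :3], H[p:, 3:] = H1, H2

def tau_perp_of(phi):
    return (H @ phi).reshape((p, r2), order="F")        # (11) with h = 0

def tau_c_of(phi):
    tp = tau_perp_of(phi)
    return (np.eye(p) - c_perp @ np.linalg.inv(tp.T @ c_perp) @ tp.T) @ bar_c   # (12)

def jacobian(phi):
    tp = tau_perp_of(phi)
    left = np.kron(tau_c_of(phi).T, c_perp @ np.linalg.inv(tp.T @ c_perp))
    return -left @ commutation(p, r2) @ H                # (13)

phi0 = np.array([0.5, 2.0, 1.5, -1.0, 0.7, 0.8])
step = 1e-6
# Central differences, one column per element of phi.
J_num = np.column_stack([
    (vec(tau_c_of(phi0 + step * e)) - vec(tau_c_of(phi0 - step * e))) / (2 * step)
    for e in np.eye(6)
])
J = jacobian(phi0)
```

Agreement between `J` and `J_num` at a given point is only a local check; the column rank of `J` at that point is what the local identification argument relies on.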

One can note that the Jacobian matrix in (13) can be used to check local identification using the results in Rothenberg (1971).

The result of Theorem 2 is in contrast with the result of Theorem 1, because the latter delivers a linear hypothesis for $τ_{c}$ while Theorem 2 gives in general non-linear restrictions on $τ_{c}$. One may hence ask the following question: when is it possible to reduce the more general linear hypothesis on $τ_{⊥}$ given in (11) to the simpler linear hypothesis on $ϑ$ given in (8)?

In the special case of $r_{2} = 2$, the following theorem states that this can always be achieved. This applies for instance to the motivating example (3), where one can choose some $c_{⊥}$ so that $τ_{⊥}' c_{⊥}$ is equal to the identity, as shown below. Consider the formulation (10) with $r_{2} = 2$, and assume that no normalizations have been imposed yet, such that $h_{1} = h_{2} = 0$. It is assumed that $τ_{⊥}$, under the equation-by-equation restrictions, satisfies the usual rank conditions for identification, see Johansen (1995, Theorem 1):

$$\operatorname{rank} R_{i}' τ_{⊥} = 1 \quad \text{for } i = 1, 2, \tag{14}$$

where $R_{i} = H_{i, ⊥}$.

Theorem 3 (Case r₂ = 2).

Let $τ_{⊥}$ obey the restrictions $τ_{⊥} = (H_{1} ϕ_{1} : H_{2} ϕ_{2})$ satisfying the rank conditions (14); then one can choose normalization conditions on $ϕ_{1}$ and $ϕ_{2}$ so that there exists a matrix $c_{⊥}$ such that $c_{⊥}' τ_{⊥} = I$. This implies that hypotheses on $τ_{⊥}$ can be stated in terms of $ϑ$ in (5), and, by Theorem 1, a linear hypothesis on $\operatorname{vec} ϑ$ corresponds to a linear hypothesis on $\operatorname{vec} τ_{c}$.

Proof.

Because $R_{1}' τ_{⊥} = (0 : R_{1}' H_{2} ϕ_{2})$ has rank 1, one can select (at least) one linear combination of the columns of $R_{1}$, $R_{1} a_{1}$ say, so that $ϕ_{2}$ is normalized to be one in the direction $b_{2}' := a_{1}' R_{1}' H_{2}$, i.e., $b_{2}' ϕ_{2} = 1$. Similarly, $R_{2}' τ_{⊥} = (R_{2}' H_{1} ϕ_{1} : 0)$ has rank 1, and one can select (at least) one linear combination of the columns of $R_{2}$, $R_{2} a_{2}$ say, so that $ϕ_{1}$ is normalized to be one in the direction $b_{1}' := a_{2}' R_{2}' H_{1}$, i.e., $b_{1}' ϕ_{1} = 1$. Next define $c_{⊥} = (R_{2} a_{2} : R_{1} a_{1})$, which by construction satisfies $c_{⊥}' τ_{⊥} = I_{2}$. ☐

The proof of the previous theorem provides a way to construct $c_{⊥}$ when $r_{2} = 2$ and the usual rank condition for identification (14) holds. The rest of the paper focuses attention on the case of linear restrictions on $ϑ$ in (8), which can be translated linearly into restrictions on $τ_{c}$ as shown in Theorem 1.

3.3. Example 2 Continued

Consider (3); this hypothesis is of type $τ_{⊥} = (H_{1} ϕ_{1} : H_{2} ϕ_{2})$ with

$$H_{1} = \begin{pmatrix} I_{3} \\ 0_{4 \times 3} \end{pmatrix}, \qquad H_{2} = \begin{pmatrix} 0_{1 \times 6} \\ I_{6} \end{pmatrix},$$

and hence, since $R_{i} = H_{i, ⊥}$, one can take $R_{1}' = (0_{4 \times 3} : I_{4})$ and $R_{2}' = (1 : 0_{1 \times 6})$. In this case one can define $c = (e_{2} : e_{3} : e_{5} : e_{6} : e_{7})$ and $c_{⊥} = (e_{1} : e_{4})$, where $e_{j}$ is the $j$-th column of $I_{7}$.

It is simple to verify that, under the additional normalization restrictions $ϕ_{11} = 1$ and $ϕ_{42} = 1$, $τ_{⊥}$ in (3) satisfies $c_{⊥}' τ_{⊥} = I_{2}$. Therefore, define $τ_{⊥ c_{⊥}}$ as (3) under these normalization restrictions. Using formula (4) one can see that

$$τ_{c} = (I - c_{⊥} τ_{⊥ c_{⊥}}') \bar{c} = \begin{pmatrix} -ϕ_{21} & -ϕ_{31} & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ -ϕ_{22} & -ϕ_{32} & -ϕ_{52} & -ϕ_{62} & -ϕ_{72} \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \tag{15}$$

so that $\operatorname{vec} τ_{c}$ is linear in $ϕ$, as predicted by Theorem 3.
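The construction in the proof of Theorem 3 can be traced for this example. The sketch below takes $R_{i}$ as a basis of the orthogonal complement of $\operatorname{col} H_{i}$, builds $c_{⊥} = (R_{2} a_{2} : R_{1} a_{1})$ and evaluates (4); the numerical loadings in `phi1` and `phi2` are hypothetical, chosen with $ϕ_{11} = ϕ_{42} = 1$.

```python
import numpy as np

p = 7
I7 = np.eye(p)
H1, H2 = I7[:, :3], I7[:, 1:]        # column 1 loads on rows 1-3, column 2 on rows 2-7
R1, R2 = I7[:, 3:], I7[:, :1]        # bases of the orthogonal complements of col(H1), col(H2)

# Hypothetical loadings, normalized so that phi_11 = 1 and phi_42 = 1.
phi1 = np.array([1.0, 0.4, -0.6])                   # (phi_11, phi_21, phi_31)
phi2 = np.array([0.3, 0.9, 1.0, 0.8, 1.1, 0.7])     # (phi_22, phi_32, phi_42, ..., phi_72)
tau_perp = np.column_stack([H1 @ phi1, H2 @ phi2])

# Linear combinations selected as in the proof of Theorem 3.
a1 = np.eye(4)[:, 0]                 # R1 a1 = e_4
a2 = np.array([1.0])                 # R2 a2 = e_1
c_perp = np.column_stack([R2 @ a2, R1 @ a1])
c = I7[:, [1, 2, 4, 5, 6]]           # c'c = I, so bar_c = c

tau_c = (np.eye(p) - c_perp @ np.linalg.inv(tau_perp.T @ c_perp) @ tau_perp.T) @ c
```

Rows 1 and 4 of `tau_c` carry the negated free loadings, and the remaining rows form an identity pattern, matching the layout in (15).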

4. The VECM Parametrization

This section describes the I(2) parametrization employed in the statistical analysis of the paper. Consider the following $τ$-parametrization ($τ$-par) of the VECM for I(2) VAR systems,⁵ see Mosconi and Paruolo (2017):

$$Δ^{2} X_{t} = α (ρ' τ' X_{t-1} + ψ' Δ X_{t-1}) + λ τ' Δ X_{t-1} + Υ Δ^{2} X_{t-1} + ε_{t}, \tag{16}$$

where $Υ Δ^{2} X_{t-1}$ is shorthand for $\sum_{j=1}^{k-2} Υ_{j} Δ^{2} X_{t-j}$. Recall that $m = p - r_{2}$ is the total number of cointegrating relations, i.e., the number of I(1) linear combinations $τ' X_{t}$. The number of linear combinations of $τ' X_{t}$ that cointegrate with $Δ X_{t}$ to I(0), i.e., the number of I(0) linear combinations $ρ' τ' X_{t} + ψ' Δ X_{t}$, is indicated⁶ by $r \leq m$. Here $α$ is $p \times r$, $τ$ is $p \times m$ and the other parameter matrices are conformable; the parameters are $α$, $ρ$, $τ$, $ψ$, $λ$, $Υ$, $Ω$, all freely varying, and $Ω$ is assumed to be positive definite. When $λ$ is restricted as $λ = Ω α_{⊥} (α_{⊥}' Ω α_{⊥})^{-1} κ'$ with $κ'$ a $(p - r) \times m$ matrix of freely varying parameters, the $τ$-par reduces to the parametrization of Johansen (1997); this restriction on $λ$ is not imposed here.

4.1. Identification of $τ$

The parameters in the $τ$-par (16) are not identified; in particular $τ'$ can be replaced by $B τ'$ with $B$ square and nonsingular, provided $ρ$ and $λ$ are simultaneously replaced by $(B')^{-1} ρ$ and $λ B^{-1}$. This is because $τ$ enters the likelihood via (16) only in the products $ρ' τ' = ρ' B^{-1} B τ'$ and $λ τ' = (λ B^{-1}) (B τ')$. The transformation that generates observationally equivalent parameters, i.e., the post-multiplication of $τ$ by a square and invertible matrix $B'$, is the same type of transformation that induces observational equivalence in the classical system of simultaneous equations, see Sargan (1988), or in the set of cointegrating equations in I(1) systems, see Johansen (1995). This leads to the following result.

Theorem 4 (Identification of τ in the τ-par).

Assume that $τ_{c}$ is specified as the restricted $τ_{c}$ in (9), which is implied by the general linear hypothesis (8) on $τ_{⊥ c_{⊥}}$; then the restricted $τ_{c}$ is identified within the $τ$-par if and only if

$$\operatorname{rank}(R_{τ}' (I_{m} \otimes τ)) = m^{2}, \qquad \underset{m p \times m_{τ}}{R_{τ}} = G_{⊥}, \qquad G := - (I_{m} \otimes c_{⊥}) K_{m, r_{2}} K \tag{17}$$

(rank condition), where $m_{τ} = m p - \dim ϕ$. The corresponding order condition is $m_{τ} \geq m^{2}$, or equivalently $m r_{2} \geq \dim ϕ$.

Alternatively, consider the general linear hypothesis (11) on $τ_{⊥}$; then the constrained $τ_{c}$ in (12) is identified in a neighborhood of the point $ϕ = ϕ^{⋆}$ provided the Jacobian $J(ϕ^{⋆}) := \partial \operatorname{vec} τ_{c}(ϕ^{⋆}) / \partial ϕ'$ in (13) is of full rank.

Proof.

The rank condition follows from Sargan (1988), given that the class of transformations that induce observational equivalence is the same as the classical one for systems of simultaneous equations. The local identification condition follows from Rothenberg (1971). ☐
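The rank and order conditions in (17) can be evaluated numerically for the restrictions of Example 2, with $c = (e_{2} : e_{3} : e_{5} : e_{6} : e_{7})$ and $c_{⊥} = (e_{1} : e_{4})$. In the sketch below $R_{τ}$ is computed as an orthonormal basis of the orthogonal complement of $\operatorname{col} G$ via the SVD, and $τ_{c}$ is computed as $c - c_{⊥} ϑ'$, which is what (4) and (5) reduce to when $c' c = I$; the parameter values and the tolerance are assumptions.

```python
import numpy as np

def commutation(m, n):
    """K_{m,n} such that K_{m,n} vec(A) = vec(A') for any m x n matrix A."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[i * n + j, j * m + i] = 1.0
    return K

p, m, r2 = 7, 5, 2
I7 = np.eye(p)
c, c_perp = I7[:, [1, 2, 4, 5, 6]], I7[:, [0, 3]]

# theta = c' tau_perp: column 1 is free in its first two rows only, column 2 is unrestricted.
K1, K2 = np.eye(m)[:, :2], np.eye(m)
K = np.zeros((m * r2, 7))
K[:m, :2], K[m:, 2:] = K1, K2

G = -np.kron(np.eye(m), c_perp) @ commutation(m, r2) @ K       # m p x dim(phi)
u, s, _ = np.linalg.svd(G, full_matrices=True)
R_tau = u[:, int(np.sum(s > 1e-10)):]                          # orthogonal complement of col(G)

phi = np.array([0.4, -0.6, 0.3, 0.9, 0.8, 1.1, 0.7])           # hypothetical values
theta = (K @ phi).reshape((m, r2), order="F")
tau_c = c - c_perp @ theta.T                                    # (4)-(5), using c'c = I

m_tau = R_tau.shape[1]
rank = np.linalg.matrix_rank(R_tau.T @ np.kron(np.eye(m), tau_c))
```

Here `m_tau` equals $m p - \dim ϕ = 28$, which exceeds $m^{2} = 25$ by 3, the number of zero entries imposed on the first column of $ϑ$.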

4.2. The Identification of Remaining Parameters

This subsection discusses conditions for the remaining parameters of the $τ$-par to be identified, when $τ$ is identified as in Theorem 4. These additional conditions are used in the discussion of the ML algorithms of the next section.

The VECM can be rewritten as

$$Δ^{2} X_{t} = ν ς' \begin{pmatrix} τ' X_{t-1} \\ Δ X_{t-1} \end{pmatrix} + Υ Δ^{2} X_{t-1} + ε_{t}, \quad \text{with} \quad ς' := \begin{pmatrix} ρ' & ψ' \\ 0 & τ' \end{pmatrix}, \quad ν := (α : λ).$$

One can see that the equilibrium correction terms $ν ς' ((τ' X_{t-1})' : Δ X_{t-1}')'$ may be replaced by $ν^{\circ} ς^{\circ\prime} ((τ^{\circ\prime} X_{t-1})' : Δ X_{t-1}')'$ without changing the likelihood, where $ν^{\circ} := ν Q^{-1} = (α A^{-1} : λ B^{-1} - α A^{-1} C)$ and

$$Q := \begin{pmatrix} A & C B \\ 0 & B \end{pmatrix}, \quad W := \begin{pmatrix} B & 0 \\ 0 & I_{p} \end{pmatrix}, \quad ς^{\circ\prime} := Q ς' W^{-1} = \begin{pmatrix} A ρ' B^{-1} & A ψ' + C B τ' \\ 0 & B τ' \end{pmatrix};$$

here $A$ and $B$ are square nonsingular matrices, and $C$ is a generic matrix. Hence one observes that $(α, ρ, τ, ψ, λ, Υ, Ω)$ is observationally equivalent to $(α^{\circ}, ρ^{\circ}, τ^{\circ}, ψ^{\circ}, λ^{\circ}, Υ, Ω)$. $A$, $B$ and $C$ define the class of observationally equivalent transformations in the $τ$-par for all parameters, including $τ$. When $τ$ is identified one has $B = I_{m}$ in the above formulae.
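The observational equivalence can be confirmed numerically by evaluating the equilibrium correction term under both sets of parameters. The sketch below uses randomly drawn parameter matrices and hypothetical choices of $A$, $B$ and $C$; dimensions and values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m, r = 4, 3, 2
alpha, rho = rng.standard_normal((p, r)), rng.standard_normal((m, r))
tau, psi, lam = (rng.standard_normal((p, m)), rng.standard_normal((p, r)),
                 rng.standard_normal((p, m)))

# Hypothetical transformation matrices: A and B nonsingular, C arbitrary.
A = np.array([[2.0, 1.0], [0.0, 1.0]])
B = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 2.0]])
C = rng.standard_normal((r, m))
inv = np.linalg.inv

nu = np.hstack([alpha, lam])                                         # (alpha : lambda)
varsigma_t = np.block([[rho.T, psi.T], [np.zeros((m, m)), tau.T]])   # varsigma'
Q = np.block([[A, C @ B], [np.zeros((m, r)), B]])
W = np.block([[B, np.zeros((m, p))], [np.zeros((p, m)), np.eye(p)]])

nu_o = nu @ inv(Q)
varsigma_o_t = Q @ varsigma_t @ inv(W)
tau_o = tau @ B.T                                                    # transposed form of B tau'

x_lag, dx_lag = rng.standard_normal(p), rng.standard_normal(p)       # stand-ins for X_{t-1}, Delta X_{t-1}
term = nu @ varsigma_t @ np.concatenate([tau.T @ x_lag, dx_lag])
term_o = nu_o @ varsigma_o_t @ np.concatenate([tau_o.T @ x_lag, dx_lag])
```

The two terms coincide for any admissible $A$, $B$, $C$, which is why additional restrictions are needed to pin down the individual parameter matrices.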

Consider additional restrictions on $φ$ of the type:

$$\underset{m_{φ} \times f_{φ}}{R_{φ}'} \operatorname{vec} φ' = q_{φ}, \qquad φ' := (ρ' : ψ'), \tag{18}$$

where $f_{φ} = r (p + m)$. The next theorem states rank conditions for these restrictions to identify the remaining parameters.

Theorem 5 (Identification of other parameters in the τ-par).

Assume that $τ$ is identified as in Theorem 4; the restrictions (18) identify $φ$ and all other parameters in the $τ$-par if and only if (rank condition)

$$\operatorname{rank} R_{φ}' (ς \otimes I_{r}) = r (r + m). \tag{19}$$

A necessary but not sufficient condition (order condition) for this is that

$$m_{φ} \geq r (r + m). \tag{20}$$

Proof.

Because $τ$ is identified, one has $B = I_{m}$ in $Q$. For the identification of $φ$, observe that $ς - ς^{\circ} = ς (I - Q')$. One finds $φ - φ^{\circ} = (ς - ς^{\circ}) (I_{r} : 0)' = ς (I_{m+r} - Q') (I_{r} : 0)'$. Because both $φ$ and $φ^{\circ}$ satisfy (18), one has $0 = R_{φ}' \operatorname{vec}(φ' - φ^{\circ\prime}) = R_{φ}' (ς \otimes I_{r}) \operatorname{vec}((I_{r} : 0) (I_{r+m} - Q))$. This implies that $(I_{r} : 0) (I_{m+r} - Q) = 0$, i.e., that both $A = I_{r}$ and $C = 0_{r \times m}$, and that $φ$ is identified, if and only if $\operatorname{rank} R_{φ}' (ς \otimes I_{r}) = r (r + m)$. This completes the proof. ☐

Observe that the identification properties of the $τ$-par differ from the ones of the parametrization of Johansen (1997), where $λ = Ω α_{⊥} (α_{⊥}' Ω α_{⊥})^{-1} κ'$ is restricted, and hence the adding-and-subtracting associated with $C$ above is not permitted.

4.3. Deterministic Terms

The $τ$-par in (16) does not involve deterministic terms. Allowing a constant and a trend to enter the VAR Equation (1) in a way that rules out quadratic trends, one obtains the following equilibrium correction I(2) model, for simplicity still called the $τ$-par below:

$$Δ^{2} X_{t} = α (ρ' τ^{⋆\prime} X_{t-1}^{⋆} + ψ^{⋆\prime} Δ X_{t-1}^{⋆}) + λ τ^{⋆\prime} Δ X_{t-1}^{⋆} + Υ Δ^{2} X_{t-1} + ε_{t}. \tag{21}$$

Here $X_{t-1}^{⋆} = (X_{t-1}' : t)'$, so that $Δ X_{t-1}^{⋆} = (Δ X_{t-1}' : 1)'$, while $τ^{⋆} = (τ' : τ_{1})'$ and $ψ^{⋆} = (ψ' : ψ_{0})'$.

This parametrization satisfies the conditions of the Johansen Representation Theorem and it generates deterministic trends up to first order, as shown in Appendix A. This is the I(2) model used in the application, with the addition of unrestricted dummy variables.

5. Likelihood Maximization

This section discusses likelihood maximization of the $τ$-par of the I(2) model (16) under linear, possibly over-identifying, restrictions on $τ_{⊥ c_{⊥}}$, i.e., on $ϑ$ in (5). The same treatment applies to (21), replacing ($X_{t-1}$, $Δ X_{t-1}$) with ($X_{t-1}^{⋆}$, $Δ X_{t-1}^{⋆}$), and ($τ$, $ψ$) with ($τ^{⋆}$, $ψ^{⋆}$). The formulation (16) is preferred here for simplicity in exposition.

The alternating maximization procedure proposed here is closely related, but not identical, to the algorithms proposed by Doornik (2017b); related algorithms were discussed in Paruolo (2000b). Restricted ML estimation in the I(1) model was discussed in Boswijk and Doornik (2004).

5.1. Normalizations

Consider restrictions (8), which are translated into linear hypotheses on $τ_{c}$ in (9) as follows

$$\operatorname{vec} τ_{c} = (\operatorname{vec} \bar{c} - (I_{m} \otimes c_{⊥}) K_{m, r_{2}} k) - (I_{m} \otimes c_{⊥}) K_{m, r_{2}} K ϕ =: g + G ϕ,$$

where by construction $g$ and $G$ satisfy $(I_{m} \otimes c') g = \operatorname{vec} I_{m}$ and $(I_{m} \otimes c') G = 0$, such that $c' τ_{c} = I_{m}$.

Next, consider just-identifying restrictions on the remaining parameters. For $ψ$, the linear combinations of first differences entering the multicointegration relations, one can consider

$$c' ψ = 0 \Leftrightarrow ψ = c_{⊥} δ', \tag{22}$$

where $δ$ is the $r \times r_{2}$ matrix of multicointegration parameters. This restriction differs from the restriction $ψ = τ_{⊥} δ'$ which is considered e.g., in Juselius (2017a, 2017b), and it was proposed and analysed by Boswijk (2000).

Furthermore, the $m \times r$ matrix $ρ$ can be normalized as follows

$$d' ρ = I_{r} \Leftrightarrow ρ = \bar{d} + d_{⊥} ϱ, \tag{23}$$

where $d$ is some known $m \times r$ matrix, and where $ϱ$, of dimension $(m - r) \times r$, contains freely varying parameters.

It can be shown that restrictions (22) and (23) identify the remaining parameters using Theorem 5. In fact, (22) and (23) can be written as

φ^{'} V = v

where

V : = blkdiag (d, c)

and

v : = (I_{r} : 0_{r \times m})

. Vectorizing, one obtains an equation

R_{φ}^{'} vec φ^{'} = q_{φ}

of the form (18) with

R_{φ} = (V \otimes I_{r})

and

q_{φ} = vec v

. The rank condition (19) is satisfied, since

R_{φ}^{'} (ς \otimes I_{r}) = (V^{'} ς \otimes I_{r}) = I_{r (m + r)}

because

V^{'} ς = (\begin{matrix} d^{'} ρ & 0 \\ c^{'} ψ & c^{'} τ \end{matrix}) = (\begin{matrix} I_{r} & 0 \\ 0 & I_{m} \end{matrix}),

where the last equality follows from (22) and (23) and

τ = τ_{c}

.
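As a small numerical sketch of the normalizations (22) and (23), the snippet below builds $ψ = c_{⊥} δ^{'}$ and $ρ = \bar{d} + d_{⊥} ϱ$ from freely chosen $δ$ and $ϱ$ and checks that $c^{'} ψ = 0$ and $d^{'} ρ = I_{r}$. The dimensions, the random matrices and the helper names `orth_complement` and `bar` are illustrative assumptions; the orthogonal complements are taken orthonormal here, which is only one possible choice.

```python
import numpy as np

def orth_complement(a):
    """Orthonormal basis of the orthogonal complement of col(a); a must have full column rank."""
    k = a.shape[1]
    q, _ = np.linalg.qr(a, mode="complete")  # the last n - k columns of Q are orthogonal to col(a)
    return q[:, k:]

def bar(a):
    """a_bar = a (a'a)^{-1}, so that a' a_bar = I."""
    return a @ np.linalg.inv(a.T @ a)

rng = np.random.default_rng(0)
p, m, r = 4, 3, 2            # hypothetical dimensions, with r2 = p - m
r2 = p - m

c = rng.standard_normal((p, m))   # known p x m matrix c (hypothetical values)
d = rng.standard_normal((m, r))   # known m x r matrix d (hypothetical values)
c_perp, d_perp = orth_complement(c), orth_complement(d)

delta = rng.standard_normal((r, r2))       # free multicointegration parameters
varrho = rng.standard_normal((m - r, r))   # free parameters in rho

psi = c_perp @ delta.T             # restriction (22)
rho = bar(d) + d_perp @ varrho     # normalization (23)
```

Any values of `delta` and `varrho` satisfy the two restrictions by construction, which is what makes them freely varying parameters.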

5.2. The Concentrated Likelihood Function

The model (16), after concentrating out the unrestricted parameter matrix

Υ

, can be represented by the equations

Z_{0 t} = α (ρ^{'} τ^{'} Z_{2 t} + ψ^{'} Z_{1 t}) + λ τ^{'} Z_{1 t} + ε_{t} (ξ),

(24)

where

ξ

indicates the vector of free parameters in

(α, ϱ, ϕ, δ, λ)

,

Z_{0 t}

,

Z_{1 t}

and

Z_{2 t}

are residual vectors of regressions of

Δ^{2} X_{t}

,

Δ X_{t - 1}

and

X_{t - 1}

, respectively, on

X_{t - 1}

;⁷ the derivation is similar to that in Chapter 6.1 of Johansen (1996). The associated log-likelihood function, concentrated with respect to

Υ

, is given by

ℓ (ξ, Ω) = - \frac{T}{2} log |Ω| - \frac{1}{2} \sum_{t = 1}^{T} ε_{t} {(ξ)}^{'} Ω^{- 1} ε_{t} (ξ) .

In the rest of this section,

ε_{t}

is used as shorthand for

ε_{t} (ξ)

.
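The log-likelihood above can be evaluated directly from a $T \times p$ matrix of residuals. The following is a minimal sketch; the function names and the simulated residuals are assumptions made for illustration. It also uses the fact that, at $Ω = T^{- 1} \sum_{t} ε_{t} ε_{t}^{'}$, the quadratic term equals $T p$.

```python
import numpy as np

def concentrated_loglik(eps, omega):
    """Evaluate -T/2 log|Omega| - 1/2 sum_t eps_t' Omega^{-1} eps_t for a T x p residual matrix."""
    T = eps.shape[0]
    _, logdet = np.linalg.slogdet(omega)
    quad = np.einsum("ti,ij,tj->", eps, np.linalg.inv(omega), eps)
    return -0.5 * T * logdet - 0.5 * quad

def omega_hat(eps):
    """Residual covariance matrix T^{-1} sum_t eps_t eps_t'."""
    return eps.T @ eps / eps.shape[0]

rng = np.random.default_rng(0)
eps = rng.standard_normal((50, 3))      # hypothetical residuals
om = omega_hat(eps)
ll_at_hat = concentrated_loglik(eps, om)
ll_elsewhere = concentrated_loglik(eps, 2.0 * om)
```

For given residuals, `omega_hat` maximizes the function over $Ω$, so `ll_at_hat` is at least as large as the value at any other positive definite matrix.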

Algorithms for the maximization of the concentrated log-likelihood function

ℓ (ξ, Ω)

are proposed below. The first one, called al1, considers the alternating maximization of

ℓ (ξ, Ω)

over

(α, ϱ, δ, λ, Ω)

for a fixed value of

ϕ

(called the

α

-step), and over

(ϕ, δ)

for a given value of

(α, ϱ, λ, Ω)

(called the

τ

-step).

A variant of this algorithm, called al2, can be defined fixing

δ

in the

τ

-step to the value of

δ

obtained in the

α

-step. It can be shown that the increase in

ℓ (ξ, Ω)

obtained in one combination of

α

-step and

τ

-step of al1 is greater than or equal to the one obtained by al2. The proof of this result is reported in Proposition A1 in Appendix B. Because of this property, and because al2 may converge very slowly in practice, al1 is implemented in the illustration below.

The rest of this section presents algorithms al1 and al2, defining first the

τ

-step, then the

α

-step and finally discussing the starting values, a line search and normalizations.

5.2.1. $τ$ Step

Taking differentials, one has

d ℓ = - \sum_{t = 1}^{T} ε_{t}^{'} Ω^{- 1} d ε_{t}

. Keeping

(α, ϱ, λ)

fixed, one finds

\begin{aligned} - d ε_{t} &= d (α ρ^{'} τ^{'} Z_{2 t} + α ψ^{'} Z_{1 t} + λ τ^{'} Z_{1 t}) \\ &= ((Z_{2 t}^{'} \otimes α ρ^{'}) + (Z_{1 t}^{'} \otimes λ)) \, d \, vec τ^{'} + (Z_{1 t}^{'} \otimes α) \, d \, vec ψ^{'} \\ &= ((Z_{2 t}^{'} \otimes α ρ^{'}) + (Z_{1 t}^{'} \otimes λ)) K_{m, r_{1}} G \, d ϕ + (Z_{1 t}^{'} c_{⊥} \otimes α) \, d \, vec δ . \end{aligned}

Writing

ε_{t}

in terms of

ϕ

and

vec δ

, i.e.,

ε_{t} = Z_{0 t} - ((Z_{2 t}^{'} \otimes α ρ^{'}) + (Z_{1 t}^{'} \otimes λ)) K_{m, r_{1}} (G ϕ + g) - (Z_{1 t}^{'} c_{⊥} \otimes α) vec δ

, the first-order conditions

\partial ℓ / \partial ϕ = 0

and

\partial ℓ / \partial vec δ = 0

are solved by

\begin{aligned} (\begin{matrix} \hat{ϕ} \\ vec {\hat{δ}}^{'} \end{matrix}) = {} & {(\begin{matrix} G^{'} U_{1}^{'} (Ω^{- 1} \otimes I_{T}) U_{1} G & G^{'} U_{1}^{'} (Ω^{- 1} \otimes I_{T}) U_{2} \\ U_{2}^{'} (Ω^{- 1} \otimes I_{T}) U_{1} G & U_{2}^{'} (Ω^{- 1} \otimes I_{T}) U_{2} \end{matrix})}^{- 1} \\ & \cdot (\begin{matrix} G^{'} U_{1}^{'} (Ω^{- 1} \otimes I_{T}) \\ U_{2}^{'} (Ω^{- 1} \otimes I_{T}) \end{matrix}) (vec Z_{0} - U_{1} g), \end{aligned}

(25)

where

Z_{j} = {(Z_{j 1} : \dots : Z_{j T})}^{'}

,

j = 0, 1, 2

, and where

U_{1} = (α ρ^{'} \otimes Z_{2}) + (λ \otimes Z_{1})

, and

U_{2} = (α \otimes Z_{1} c_{⊥})

. Note that (25) is the GLS estimator in a regression of

vec Z_{0} - U_{1} g

on

(U_{1} G : U_{2})

. This defines the

τ

-step for al1.
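The GLS form of (25) can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name `tau_step_al1`, the dimensions, and the randomly drawn $G$, $g$ and data are hypothetical (in particular, this $G$ and $g$ do not encode any actual restriction (8)), and the dense weight matrix $Ω^{- 1} \otimes I_{T}$ is formed explicitly only because the example is small.

```python
import numpy as np

def vec(a):
    """Column-stacking vec operator."""
    return a.reshape(-1, order="F")

def tau_step_al1(Z0, Z1, Z2, alpha, rho, lam, omega, c_perp, G, g):
    """GLS update of (phi, delta) as in (25), holding (alpha, rho, lam, omega) fixed."""
    T = Z0.shape[0]
    r, r2 = alpha.shape[1], c_perp.shape[1]
    U1 = np.kron(alpha @ rho.T, Z2) + np.kron(lam, Z1)   # multiplies vec(tau)
    U2 = np.kron(alpha, Z1 @ c_perp)                      # multiplies vec(delta')
    X = np.hstack([U1 @ G, U2])
    y = vec(Z0) - U1 @ g
    W = np.kron(np.linalg.inv(omega), np.eye(T))          # dense weights; small T only
    coef = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    n_phi = G.shape[1]
    phi = coef[:n_phi]
    delta = coef[n_phi:].reshape(r2, r, order="F").T      # un-vec delta', then transpose
    return phi, delta

# Noise-free toy data (all values hypothetical) to check that the step recovers (phi, delta).
rng = np.random.default_rng(0)
T, p, m, r = 40, 3, 2, 1
r2 = p - m
Z1, Z2 = rng.standard_normal((T, p)), rng.standard_normal((T, p))
alpha, rho, lam = (rng.standard_normal(s) for s in [(p, r), (m, r), (p, m)])
c_perp = rng.standard_normal((p, r2))
G, g = rng.standard_normal((p * m, 2)), rng.standard_normal(p * m)
phi0, delta0 = np.array([0.7, -1.3]), rng.standard_normal((r, r2))
tau0 = (g + G @ phi0).reshape(p, m, order="F")            # vec(tau) = g + G phi
psi0 = c_perp @ delta0.T                                  # psi = c_perp delta'
Z0 = Z2 @ tau0 @ rho @ alpha.T + Z1 @ psi0 @ alpha.T + Z1 @ tau0 @ lam.T
phi_hat, delta_hat = tau_step_al1(Z0, Z1, Z2, alpha, rho, lam, np.eye(p), c_perp, G, g)
```

With stacked data $Z_{0} = Z_{2} τ ρ α^{'} + Z_{1} ψ α^{'} + Z_{1} τ λ^{'}$ and no noise, the step returns the generating values exactly, up to rounding.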

The

τ

-step for al2 is defined similarly, but keeping

δ

fixed. In this case it is simple to see that

\hat{ϕ} = {(G^{'} U_{1}^{'} (Ω^{- 1} \otimes I_{T}) U_{1} G)}^{- 1} G^{'} U_{1}^{'} (Ω^{- 1} \otimes I_{T}) (vec Z_{0} - U_{1} g - vec (Z_{1} ψ α^{'})) .

5.2.2. $α$ Step

When

ϕ

is fixed (and hence

τ

is fixed), one can construct

Z_{3 t} = τ^{'} Z_{1 t}

and

Z_{4 t} = (\begin{matrix} {\bar{d}}^{'} τ^{'} Z_{2 t} \\ d_{⊥}^{'} τ^{'} Z_{2 t} \\ c_{⊥}^{'} Z_{1 t} \end{matrix}), γ = (\begin{matrix} I_{r} \\ ϱ \\ δ^{'} \end{matrix}) .

The concentrated model (24) can then be written as a reduced rank regression:

Z_{0 t} = α γ^{'} Z_{4 t} + λ Z_{3 t} + ε_{t},

for which the Gaussian ML estimator for

α

,

γ

,

λ

has a closed-form solution, see Johansen (1996). Specifically, let

M_{i j} : = T^{- 1} \sum_{t = 1}^{T} Z_{i t} Z_{j t}^{'}

,

i, j = 0, 3, 4

and

S_{i j} : = M_{i j} - M_{i 3} M_{33}^{- 1} M_{3 j}

,

i, j = 0, 4

. If

v_{i}

,

i = 1, \dots, r

, are the eigenvectors corresponding to the largest r eigenvalues of the problem

(μ S_{44} - S_{40} S_{00}^{- 1} S_{04}) v = 0,

and

v = (v_{1}, \dots, v_{r})

is the matrix of the corresponding eigenvectors, then the optimal solutions for

ϱ

,

δ

,

α

,

λ

are given by

\hat{γ} = (\begin{matrix} I_{r} \\ \hat{ϱ} \\ {\hat{δ}}^{'} \end{matrix}) = v {(e^{'} v)}^{- 1}, \hat{α} = S_{04} \hat{γ} {({\hat{γ}}^{'} S_{44} \hat{γ})}^{- 1}, \hat{λ} = (M_{03} - \hat{α} {\hat{γ}}^{'} M_{43}) M_{33}^{- 1},

where

e^{'} = (I_{r} : 0)

. Optimization with respect to

Ω

is performed using

Ω (ξ) = T^{- 1} \sum_{t = 1}^{T} ε_{t} (ξ) ε_{t} {(ξ)}^{'}

replacing

ξ

with

\hat{ξ}

formed from the previous expressions, namely taking

(α, ϱ, δ, λ)

equal to

(\hat{α}, \hat{ϱ}, \hat{δ}, \hat{λ})

in the above display and

ϕ = \hat{ϕ}

from the

τ

-step. Using the

S_{i j}

matrices, one can also compute

\hat{Ω}

directly as

\hat{Ω} = S_{00} - S_{04} \hat{γ} {({\hat{γ}}^{'} S_{44} \hat{γ})}^{- 1} {\hat{γ}}^{'} S_{40}

. This completes the definition of the

α

-step.
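A compact sketch of this reduced rank regression step is given below. It solves the eigenvalue problem through a Cholesky factor of $S_{44}$, which is one of several equivalent numerical routes; the function name `alpha_step`, the simulated data and all numerical values are assumptions made for illustration.

```python
import numpy as np

def alpha_step(Z0, Z3, Z4, r):
    """Reduced rank regression of Z0 on Z4, corrected for Z3; rows of the Z arrays are observations."""
    T = Z0.shape[0]

    def M(A, B):
        return A.T @ B / T

    M33_inv = np.linalg.inv(M(Z3, Z3))

    def S(A, B):
        return M(A, B) - M(A, Z3) @ M33_inv @ M(Z3, B)

    S00, S04, S44 = S(Z0, Z0), S(Z0, Z4), S(Z4, Z4)
    S40 = S04.T
    # (mu S44 - S40 S00^{-1} S04) v = 0, rewritten as a symmetric problem with S44 = L L'.
    L_inv = np.linalg.inv(np.linalg.cholesky(S44))
    A = L_inv @ S40 @ np.linalg.solve(S00, S04) @ L_inv.T
    eigval, U = np.linalg.eigh((A + A.T) / 2)
    top = np.argsort(eigval)[::-1][:r]            # indices of the r largest eigenvalues
    v = L_inv.T @ U[:, top]
    gamma = v @ np.linalg.inv(v[:r, :])           # normalization: first r rows equal I_r
    gSg_inv = np.linalg.inv(gamma.T @ S44 @ gamma)
    alpha = S04 @ gamma @ gSg_inv
    lam = (M(Z0, Z3) - alpha @ gamma.T @ M(Z4, Z3)) @ M33_inv
    omega = S00 - S04 @ gamma @ gSg_inv @ gamma.T @ S40
    return alpha, gamma, lam, omega

# Simulated example (hypothetical values) with small noise.
rng = np.random.default_rng(1)
T, p, r, n4, n3 = 400, 3, 1, 4, 2
Z3, Z4 = rng.standard_normal((T, n3)), rng.standard_normal((T, n4))
alpha0 = np.array([[1.0], [-0.5], [0.8]])
gamma0 = np.array([[1.0], [0.5], [-0.3], [0.8]])
lam0 = rng.standard_normal((p, n3))
Z0 = Z4 @ gamma0 @ alpha0.T + Z3 @ lam0.T + 0.01 * rng.standard_normal((T, p))
alpha_hat, gamma_hat, lam_hat, omega_hat = alpha_step(Z0, Z3, Z4, r)
resid = Z0 - Z4 @ gamma_hat @ alpha_hat.T - Z3 @ lam_hat.T
```

The residual covariance `resid.T @ resid / T` coincides with the direct expression for $\hat{Ω}$ in terms of the $S_{i j}$ matrices, which gives a convenient check on an implementation.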

5.2.3. Starting Values and Line Search

If the system is just-identified, consistent starting values for all parameters can be obtained by imposing the identifying restrictions on the two-stage estimator for the I(2) model (2SI2), see Johansen (1995) and Paruolo (2000a). In case of over-identification, this method may be used to produce starting values for

(α, ϱ, λ)

, which may then be used as input for the first

τ

-step to obtain starting values for

ϕ

and

δ

.

Let

η

be the vector containing all free parameters in

(α, ϱ, δ, λ)

, and let

ξ : = {(ϕ^{'} : η^{'})}^{'}

. Denote by

ξ_{j - 1} = {(ϕ_{j - 1}^{'} : η_{j - 1}^{'})}^{'}

the value of

ξ

in iteration

(j - 1)

of the algorithms. Denote by

{\hat{ξ}}_{j} = {({\hat{ϕ}}_{j}^{'} : {\hat{η}}_{j}^{'})}^{'}

the value of

ξ

obtained by the application of a

τ

-step and

α

-step of algorithms al1 and al2 at iteration j starting from

ξ_{j - 1}

. In an I(1) context, Doornik (2017a) found that better convergence properties can be obtained if a line search is added. For this purpose, define the final value of the j-th iteration as

ξ_{j} (ω) = ξ_{j - 1} + ω ({\hat{ξ}}_{j} - ξ_{j - 1})

where

ω

is chosen in

R_{+} = (0, \infty)

using a line search; note that values of

ω

greater than 1 are admissible. A simple (albeit admittedly sub-optimal) implementation of the line search is employed in Doornik (2017a); it consists of evaluating the log-likelihood function

ℓ (ξ, Ω (ξ))

with

Ω (ξ) = T^{- 1} \sum_{t = 1}^{T} ε_{t} (ξ) ε_{t} {(ξ)}^{'}

setting

ξ

equal to

ξ_{j} (ω)

for

ω \in \{1.2, 2, 4, 8\}

, and choosing the value of

ω

with the highest log-likelihood ℓ. This simple choice of line search is used in the empirical illustration.
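The grid-based line search described above can be sketched as follows. The function name `line_search`, the toy objective and the starting values are hypothetical; the grid is passed as an argument because the text does not say whether the unit step $ω = 1$ is also compared, so a user may wish to add it.

```python
import numpy as np

def line_search(xi_prev, xi_hat, loglik, omegas=(1.2, 2.0, 4.0, 8.0)):
    """Evaluate loglik at xi_prev + w (xi_hat - xi_prev) for each w in omegas; return the best."""
    best_w, best_val = None, -np.inf
    for w in omegas:
        val = loglik(xi_prev + w * (xi_hat - xi_prev))
        if val > best_val:
            best_w, best_val = w, val
    return best_w, xi_prev + best_w * (xi_hat - xi_prev), best_val

# Toy concave objective (hypothetical): the switching step covers half the way to the optimum.
target = np.array([2.0, -4.0])

def toy_loglik(xi):
    return -np.sum((xi - target) ** 2)

xi_prev = np.zeros(2)
xi_hat = np.array([1.0, -2.0])
w_best, xi_new, val_best = line_search(xi_prev, xi_hat, toy_loglik)
```

In this toy case doubling the step reaches the maximum, so the search selects $ω = 2$.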

5.3. Standard Errors

The asymptotic variance matrix of the ML estimators may be obtained from the inverse observed (concentrated) information matrix as usual. Writing (24) as

Z_{0 t} = Π Z_{2 t} + Γ Z_{1 t} + ε_{t}

, and letting

θ = {(vec {(Π^{'})}^{'} : vec {(Γ^{'})}^{'})}^{'}

, the observed concentrated information matrix for the reduced-form parameter vector

θ

is obtained from

I_{θ} = - \frac{\partial^{2} ℓ (θ)}{\partial θ \partial θ^{'}} = (\begin{matrix} Ω^{- 1} \otimes Z_{2}^{'} Z_{2} & Ω^{- 1} \otimes Z_{2}^{'} Z_{1} \\ Ω^{- 1} \otimes Z_{1}^{'} Z_{2} & Ω^{- 1} \otimes Z_{1}^{'} Z_{1} \end{matrix}) .

This leads to the following information matrix in terms of the parameters

(ϕ, η)

:

I_{ϕ, η} = (\begin{matrix} J_{ϕ}^{'} \\ J_{η}^{'} \end{matrix}) I_{θ} (\begin{matrix} J_{ϕ} & J_{η} \end{matrix}),

where

J_{ϕ} = \partial θ / \partial ϕ^{'}

and

J_{η} = \partial θ / \partial η^{'}

. From

Π = α ρ^{'} τ^{'}

and

Γ = α ψ^{'} + λ τ^{'}

, one obtains

J_{ϕ} = (\begin{matrix} α ρ^{'} \otimes I_{p} \\ λ \otimes I_{p} \end{matrix}) G .

Define

η = {(vec {(α^{'})}^{'} : vec {(ϱ)}^{'} : vec {(δ^{'})}^{'} : vec {(λ^{'})}^{'})}^{'}

, so that

J_{η} = [J_{α} : J_{ϱ} : J_{δ} : J_{λ}]

, with

J_{α} = (\begin{matrix} I_{p} \otimes τ ρ \\ I_{p} \otimes ψ \end{matrix}), J_{ϱ} = (\begin{matrix} α \otimes τ d_{⊥} \\ 0 \end{matrix}), J_{δ} = (\begin{matrix} 0 \\ α \otimes c_{⊥} \end{matrix}), J_{λ} = (\begin{matrix} 0 \\ I_{p} \otimes τ \end{matrix}) .

With these ingredients, one finds

\hat{var} (\hat{ϕ}) = {({\hat{J}}_{ϕ}^{'} {\hat{I}}_{θ} {\hat{J}}_{ϕ} - {\hat{J}}_{ϕ}^{'} {\hat{I}}_{θ} {\hat{J}}_{η} {({\hat{J}}_{η}^{'} {\hat{I}}_{θ} {\hat{J}}_{η})}^{- 1} {\hat{J}}_{η}^{'} {\hat{I}}_{θ} {\hat{J}}_{ϕ})}^{- 1},

where

{\hat{I}}_{θ}

,

{\hat{J}}_{ϕ}

and

{\hat{J}}_{η}

are the expressions given above, evaluated at the ML estimators. Standard errors of individual parameter estimates are obtained as the square roots of the diagonal elements of

\hat{var} (\hat{ϕ})

. Asymptotic normality of resulting t-statistics (under the null hypothesis), and

χ^{2}

asymptotic null distributions of likelihood ratio test statistics for the over-identifying restrictions, depend on conditions for asymptotic mixed normality being satisfied; this is discussed next.
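The variance expression for $\hat{ϕ}$ is the inverse of a Schur complement, i.e., the $ϕ$-block of the inverse of the full information matrix $I_{ϕ, η}$. The sketch below illustrates this numerically; the function names, dimensions and random Jacobians are assumptions, and the information matrix is built from $vec Z_{0} = (I_{p} \otimes Z_{2}) vec Π^{'} + (I_{p} \otimes Z_{1}) vec Γ^{'} + vec ε$.

```python
import numpy as np

def info_theta(omega, Z2, Z1):
    """Information matrix for theta = (vec(Pi')', vec(Gamma')')' in Z0 = Z2 Pi' + Z1 Gamma' + E."""
    oi = np.linalg.inv(omega)
    return np.block([
        [np.kron(oi, Z2.T @ Z2), np.kron(oi, Z2.T @ Z1)],
        [np.kron(oi, Z1.T @ Z2), np.kron(oi, Z1.T @ Z1)],
    ])

def var_phi(J_phi, J_eta, I_theta):
    """Inverse of the Schur complement of the eta-block in the information matrix for (phi, eta)."""
    A = J_phi.T @ I_theta @ J_phi
    B = J_phi.T @ I_theta @ J_eta
    C = J_eta.T @ I_theta @ J_eta
    return np.linalg.inv(A - B @ np.linalg.solve(C, B.T))

# Hypothetical inputs: p = 2, so theta has 2 p^2 = 8 elements; phi has 2 and eta has 3 elements.
rng = np.random.default_rng(2)
T, p = 30, 2
Z1, Z2 = rng.standard_normal((T, p)), rng.standard_normal((T, p))
I_theta = info_theta(np.eye(p), Z2, Z1)
J_phi, J_eta = rng.standard_normal((8, 2)), rng.standard_normal((8, 3))

V_phi = var_phi(J_phi, J_eta, I_theta)
se_phi = np.sqrt(np.diag(V_phi))          # standard errors of the elements of phi

J = np.hstack([J_phi, J_eta])
V_full = np.linalg.inv(J.T @ I_theta @ J)  # inverse of the full (phi, eta) information matrix
```

The leading block of `V_full` equals `V_phi`, so either route can be used; the Schur complement form avoids inverting the full matrix when only $ϕ$ is of interest.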

6. Asymptotics

The asymptotic distribution of the ML estimator in the I(2) model has been discussed in Johansen (1997, 2006). As shown there and discussed in Boswijk (2000), the limit distribution of the ML estimator is not jointly mixed normal as in the I(1) case. As a consequence, the limit distribution of LR test statistics of generic hypotheses need not be

χ^{2}

under the null hypothesis.

In some special cases, the limit distribution of the just-identified ML estimator of the cointegration parameters can be shown to be mixed normal. Consider the case

r_{1} = 0

(i.e.,

r = m

), and assume as before that no deterministic terms are included in the model. In this case, the limit distribution of the cointegration parameters in Theorem 4 in Johansen (2006), J06 hereafter, can be described in terms of the estimated parameters

{\hat{B}}_{0} : = {\bar{τ}}_{⊥}^{'} (\hat{ψ} - ψ)

and

{\hat{B}}_{2} : = {\bar{τ}}_{⊥}^{'} (\hat{τ} - τ)

, where

\hat{τ}

is identified as

τ_{c}

with

c = τ

. Note that the components C and

B_{1}

in the above theorem do not appear here, because

r_{1} = 0

. One has

(\begin{matrix} T {\hat{B}}_{0} \\ T^{2} {\hat{B}}_{2} \end{matrix}) \overset{w}{\to} B^{\infty} : = {(\int_{0}^{1} H_{*} (s) H_{*} {(s)}^{'} d s)}^{- 1} \int_{0}^{1} H_{*} (s) d W_{1} (s)

with

H_{*} (u) : = {(H_{0} {(u)}^{'} : H_{2} {(u)}^{'})}^{'}

,

H_{2} (u) : = \int_{0}^{u} H_{0} (s) d s, H_{0} (u) : = τ_{⊥}^{'} C_{2} W (u), W_{1} (u) : = {(α^{'} Ω^{- 1} α)}^{- 1} α^{'} Ω^{- 1} W (u),

and where

T^{- \frac{1}{2}} \sum_{i = 1}^{⌊ T u ⌋} ε_{i} \overset{w}{\to} W (u)

, a vector Brownian motion with covariance matrix

Ω

.⁸

As noted in J06,

B^{\infty}

has a mixed normal distribution with mean 0, because

H_{*} (u)

is a function of

α_{⊥}^{'} W (u)

, which is independent of

W_{1} (u)

. Moreover in the case

r_{1} = 0

, the

C^{\infty}

component of the ML limit distribution does not appear, so that the whole limit distribution of the cointegration parameters is jointly mixed normal, unlike in the case

r_{1} > 0

.

One can see that hypothesis (8) defines a smooth restriction of the

B_{2}

parameters.⁹ More precisely,

B_{2}

depends smoothly only on

ϕ_{2}

,

B_{2} = B_{2} (ϕ_{2})

, where

ϕ_{2}

contains the

ϕ

parameters in (8). Note also that

B_{0}

depends on the parameters in

ψ

, which are unrestricted by (8); hence

B_{0}

depends only on

ϕ_{1}

,

B_{0} = B_{0} (ϕ_{1})

, where

ϕ_{1}

contains the parameters in

δ

in (22).

The conditions of Theorem 5 in J06 are next shown to be verified, and hence the LR test of the hypothesis (8) is asymptotically

χ^{2}

with degrees of freedom equal to the number of constraints, in case

r_{1} = 0

. In fact,

B_{0} (ϕ_{1})

,

B_{2} (ϕ_{2})

are smoothly parametrized by the continuously identified parameters

ϕ_{1}

and

ϕ_{2}

. Because

B_{2}

does not depend on

ϕ_{1}

, one easily deduces

\partial B_{2} / \partial ϕ_{1} = \partial^{2} B_{2} / \partial ϕ_{1}^{2} = 0

in (37) of J06. Similarly, one has

ϕ_{1} = ϕ_{1 B}

with

\partial B_{0} / \partial ϕ_{1}

and

\partial B_{2} / \partial ϕ_{2}

of full rank; hence (38) of J06 is satisfied. This shows that the LR statistic is asymptotically

χ^{2}

under the null, for

r_{1} = 0

.

In case

r_{1} = (m - r) > 0

, the asymptotic distribution of

\hat{τ}

is defined in terms of

(B^{\infty}, C^{\infty})

in J06 p. 92, which is not jointly mixed normal. In such cases, Boswijk (2000) showed that inference is mixed normal if the restrictions on

{\hat{τ}}_{c}

can be asymptotically linearized in

(B^{\infty}, C^{\infty})

, and separated into two sets of restrictions, the first group involving

B^{\infty}

only, and the second group involving

C^{\infty}

only. Because the conditions of Theorem 5 in J06 cannot be easily verified for general linear hypotheses of the form (8) in this case, they will need to be checked case by case. The authors intend to develop more readily verifiable conditions for

χ^{2}

inference on

τ

in their future research.

7. Illustration

Following Juselius and Assenmacher (2015), consider a 7-dimensional VAR with

X_{t} = {(p_{1 t} : p_{2 t} : e_{12 t} : b_{1 t} : b_{2 t} : s_{1 t} : s_{2 t})}^{'},

where

p_{i t}

,

b_{i t}

,

s_{i t}

are the (log of the) price index, the long and the short interest rate of country i at time t, respectively, and

e_{12 t}

is the log of the exchange rate between country 1 (Switzerland) and 2 (the US) at time t. The results are based on quarterly data over the period 1975:1–2013:3. The model has two lags, a restricted linear trend as in (21), which appears in the equilibrium correction only appended to the vector of lagged levels, and a number of dummy variables; see Juselius and Assenmacher (2017), which is an updated version of Juselius and Assenmacher (2015), for further details on the empirical model. The data set used here is taken from Juselius and Assenmacher (2017).

Specification (3) is based on the prediction that

r_{2} = 2

. Based on I(2) cointegration tests, Juselius and Assenmacher (2017) choose a model with

r = m = 5

, which indeed implies

r_{2} = 2

, but also

r_{1} = m - r = 0

; arguably, however, the test results in Table 1 of their paper also support the hypothesis

(r, r_{1}) = (4, 1)

, which has the same number

r_{2} = 2

of common

I (2)

trends. The latter model would be selected applying the sequential procedure in Nielsen and Rahbek (2007) using a

5 %

or

10 %

significance level in each test in the sequence.

Consider the case

(r, r_{1}) = (5, 0)

. The over-identifying restrictions on

τ_{⊥}

implied by (3) are incorporated in the parametrization (3), with normalizations

ϕ_{11} = ϕ_{42} = 1

, which in turn leads to the over-identified structure for

τ_{c}

in (15), to be estimated by ML. The restricted ML estimate of

τ_{⊥ c_{⊥}}

is (standard errors in parentheses):

{\hat{τ}}_{⊥ c_{⊥}} = (\begin{matrix} 1 & 0 \\ \underset{(0.11)}{1.49} & \underset{(5.23)}{- 25.14} \\ \underset{(0.72)}{- 1.88} & \underset{(29.81)}{- 35.70} \\ 0 & 1 \\ 0 & \underset{(0.53)}{- 1.91} \\ 0 & \underset{(0.29)}{1.23} \\ 0 & \underset{(0.95)}{- 3.02} \end{matrix}) .

The LR statistic for the 3 over-identifying restrictions equals

16.11

. Using the

χ^{2} (3)

asymptotic limit distribution, one finds an asymptotic p-value of

0.001

, and hence a rejection of the null hypothesis. This indicates that the hypothesized structure on

τ_{⊥}

is rejected.

For comparison, consider also the case

(r, r_{1}) = (4, 1)

, for which the LR test for cointegration ranks has a p-value of

0.13

. The resulting restricted estimate of

τ_{⊥ c_{⊥}}

is:

{\hat{τ}}_{⊥ c_{⊥}} = (\begin{matrix} 1 & 0 \\ \underset{(0.09)}{1.38} & \underset{(5.22)}{- 24.67} \\ \underset{(0.56)}{- 1.07} & \underset{(22.42)}{- 30.10} \\ 0 & 1 \\ 0 & \underset{(0.52)}{- 1.75} \\ 0 & \underset{(0.28)}{1.20} \\ 0 & \underset{(1.02)}{- 2.97} \end{matrix}) .

The estimates and standard errors are similar to those obtained under the hypothesis

(r, r_{1}) = (5, 0)

. The LR statistic for the over-identifying restrictions now equals

10.08

. If one conjectured that the limit distribution of the LR test is also

χ^{2} (3)

in this case, one would obtain an asymptotic p-value of

0.018

, so the evidence against the hypothesized structure of

τ

appears slightly weaker in this model.
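The two asymptotic p-values quoted in this section can be reproduced from the reported LR statistics with the closed-form upper tail of a $χ^{2} (3)$ variable, $P (χ_{3}^{2} > x) = erfc (\sqrt{x / 2}) + \sqrt{2 x / π} \, e^{- x / 2}$. The snippet below is only an arithmetic check; the function and variable names are illustrative, and for the model $(r, r_{1}) = (4, 1)$ the $χ^{2} (3)$ reference distribution is the conjecture stated in the text, not an established result.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

p_value_50 = chi2_sf_3df(16.11)   # LR statistic reported for (r, r1) = (5, 0)
p_value_41 = chi2_sf_3df(10.08)   # LR statistic reported for (r, r1) = (4, 1)
print(round(p_value_50, 3), round(p_value_41, 3))
```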

The results for both model

(r, r_{1}) = (5, 0)

and model

(r, r_{1}) = (4, 1)

are in line with the preferred specification of Juselius and Assenmacher (2017), who select an over-identified structure for

τ

, which is not nested in (15), and therefore implies a different impact of the common I(2) trends.

8. Conclusions

Hypotheses on the loading matrix of I(2) common trends are of economic interest. They are shown to be related to the cointegration relations. This link is explicitly discussed in this paper, also for hypotheses that are over-identifying. Likelihood maximization algorithms are proposed and discussed, along with LR tests of the hypotheses.

The application of these LR tests to a system of prices, exchange rates and interest rates for Switzerland and the US shows support for the existence of two I(2) common trends. These may represent a ‘speculative’ trend and a ‘relative prices’ trend, but there is little empirical support for the corresponding exclusion restrictions in the loading matrix.

Acknowledgments

Helpful comments and suggestions from two anonymous referees and the Academic Editor, Katarina Juselius, are gratefully acknowledged.

Author Contributions

Both authors contributed equally to the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Theorem A1 (Johansen Representation Theorem).

Let the vector process

X_{t}

satisfy

A (L) X_{t} = μ_{0} + μ_{1} t + ε_{t}

, where

A (L) : = I_{p} - \sum_{i = 1}^{k} A_{i} L^{i}

, a matrix lag polynomial of degree k, and where

ε_{t}

is an i.i.d.

(0, Ω)

sequence. Assume that

A (z)

is of full rank for all

|z| < 1 + c

,

c > 0

, with the exception of

z = 1

. Let A,

\dot{A}

and

\ddot{A}

denote

A (1)

, the first and second derivative of

A (z)

with respect to z, evaluated at

z = 1

; finally define

Γ = \dot{A} - A

. Then

X_{t}

is I(2) if and only if the following conditions hold:

(i): $A = - α β^{'}$ where $α$, $β$ are $p \times r$ matrices of full column rank $r < p$,
(ii): $P_{α_{⊥}} Γ P_{β_{⊥}} = α_{1} β_{1}^{'}$ where $α_{1}$, $β_{1}$ are $p \times r_{1}$ matrices of full column rank $r_{1} < p - r$,
(iii): $α_{2}^{'} Θ β_{2}$ is of full rank $r_{2} : = p - r - r_{1}$, where $Θ : = \frac{1}{2} \ddot{A} + \dot{A} \bar{β} {\bar{α}}^{'} \dot{A}$, $α_{2} : = {(α, α_{1})}_{⊥}$ and $β_{2} : = {(β, β_{1})}_{⊥}$,
(iv): $μ_{1} = α β_{D}$ for some $β_{D}$,
(v): $α_{2}^{'} μ_{0} = α_{2}^{'} Γ \bar{β} β_{D}$.

Under these conditions,

X_{t}

admits a common trends I(2) representation of the form

X_{t} = C_{2} \sum_{i = 1}^{t} \sum_{s = 1}^{i} ε_{s} + C_{1} \sum_{i = 1}^{t} ε_{i} + C^{⋆} (L) ε_{t} + v_{0} + v_{1} t,

(A1)

where

C_{2} = β_{2} {(α_{2}^{'} Θ β_{2})}^{- 1} α_{2}^{'},

(A2)

C^{⋆} (L) ε_{t}

is an I(0) linear process, and

v_{0}

and

v_{1}

depend on the VAR coefficients and on the initial values of the process.

Proof.

See Johansen (1992), Johansen (2009) and Rahbek et al. (1999), which also contain expressions for

C_{1}

,

C^{⋆} (L)

and

(v_{0}, v_{1})

. ☐

It is next shown that conditions (iv) and (v) are satisfied by the

τ

-par (21). In fact, condition (iv) holds for

β_{D} = ρ^{'} τ_{1}

. Note that

Γ = α ψ^{'} + λ τ^{'}

,

β = τ ρ

and

P_{α_{⊥}} Γ P_{β_{⊥}} = P_{α_{⊥}} λ τ^{'} P_{β_{⊥}} = α_{1} β_{1}^{'}

. The l.h.s. of (v) is

α_{2}^{'} μ_{0} = α_{2}^{'} λ τ_{1} .

(A3)

Next write the r.h.s. of (v) using

τ^{'} τ ρ {(ρ^{'} τ^{'} τ ρ)}^{- 1} ρ^{'} = I - ρ_{⊥} {(ρ_{⊥}^{'} {(τ^{'} τ)}^{- 1} ρ_{⊥})}^{- 1} ρ_{⊥}^{'} {(τ^{'} τ)}^{- 1}

by oblique projections; one finds

\begin{aligned} α_{2}^{'} Γ \bar{β} β_{D} &= α_{2}^{'} λ τ^{'} τ ρ {(ρ^{'} τ^{'} τ ρ)}^{- 1} ρ^{'} τ_{1} \\ &= α_{2}^{'} λ τ_{1} - α_{2}^{'} λ ρ_{⊥} {(ρ_{⊥}^{'} {(τ^{'} τ)}^{- 1} ρ_{⊥})}^{- 1} ρ_{⊥}^{'} {(τ^{'} τ)}^{- 1} τ_{1} = α_{2}^{'} λ τ_{1}, \end{aligned}

(A4)

where the last equality holds because

α_{2}^{'} λ ρ_{⊥} = 0

, as shown below. Note in fact that

β_{1} = \bar{τ} ρ_{⊥}

lies in

col β_{⊥}

and

α_{2}

lies in

col α_{⊥}

; hence one can write

α_{2}^{'} λ ρ_{⊥} = α_{2}^{'} λ τ^{'} \bar{τ} ρ_{⊥} = α_{2}^{'} P_{α_{⊥}} λ τ^{'} P_{β_{⊥}} β_{1} = α_{2}^{'} P_{α_{⊥}} Γ P_{β_{⊥}} β_{1} = α_{2}^{'} α_{1} β_{1}^{'} β_{1} = 0 .

Hence, because (A3) equals (A4), condition (v) is satisfied.
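The oblique projection identity used in this argument, $τ^{'} τ ρ {(ρ^{'} τ^{'} τ ρ)}^{- 1} ρ^{'} = I - ρ_{⊥} {(ρ_{⊥}^{'} {(τ^{'} τ)}^{- 1} ρ_{⊥})}^{- 1} ρ_{⊥}^{'} {(τ^{'} τ)}^{- 1}$, is easy to confirm numerically. The sketch below does so for randomly drawn $τ$ and $ρ$; the dimensions and values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
p, m, r = 5, 3, 2                      # hypothetical dimensions
tau = rng.standard_normal((p, m))
rho = rng.standard_normal((m, r))

# Orthonormal basis of the orthogonal complement of col(rho).
q, _ = np.linalg.qr(rho, mode="complete")
rho_perp = q[:, r:]

Mtt = tau.T @ tau
Mtt_inv = np.linalg.inv(Mtt)

lhs = Mtt @ rho @ np.linalg.inv(rho.T @ Mtt @ rho) @ rho.T
rhs = np.eye(m) - rho_perp @ np.linalg.inv(rho_perp.T @ Mtt_inv @ rho_perp) @ rho_perp.T @ Mtt_inv
```

Both sides are the projection onto the column space of $τ^{'} τ ρ$ along that of $ρ_{⊥}$, so `lhs` and `rhs` agree up to rounding.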

Appendix B

This Appendix contains a proof that the increase in ℓ in one combination of

α

-step and

τ

-step of al1 is greater than or equal to the one obtained by al2. In order to state the argument in somewhat greater generality, define a parameter vector

θ

partitioned in 3 components, denoted

(θ_{1}, θ_{2}, θ_{3})

, where each

θ_{j}

represents a subvector of parameters, respectively of dimensions

n_{1}

,

n_{2}

,

n_{3}

. Let

ℓ (θ)

be the log-likelihood function. Define also the following switching algorithms, both starting at the same initial value

(θ_{1}^{(j - 1)}, θ_{2}^{(j - 1)}, θ_{3}^{(j - 1)})

:

Definition A1.

algo1 (3 way switching)

Step 1: for fixed $θ_{1}$, maximize ℓ with respect to $(θ_{2}, θ_{3})$;
Step 2: for fixed $θ_{2}$, maximize ℓ with respect to $(θ_{1}, θ_{3})$.

Let

ℓ (θ^{(1, j)})

be the value of ℓ corresponding to the application of step 1 and 2 of algo1.

Definition A2.

algo2 (Pure switching)

Step 1: for fixed $θ_{1}$, maximize ℓ with respect to $(θ_{2}, θ_{3})$;
Step 2: for fixed $(θ_{2}, θ_{3})$, maximize ℓ with respect to $θ_{1}$.

Let

ℓ (θ^{(2, j)})

be the value of ℓ corresponding to the application of step 1 and 2 of algo2.

Proposition A1 (Pure versus 3-way switching).

One has

ℓ (θ^{(1, j)}) \geq ℓ (θ^{(2, j)})

.

Proof.

In order to see this, let

(θ_{2}^{⋆}, θ_{3}^{⋆}) = \underset{θ_{2}, θ_{3}}{arg max} ℓ (θ_{1}^{(j - 1)}, θ_{2}, θ_{3}) .

Step 1 is the same for algo1 and algo2. In the second step of algo1 one considers

ℓ (θ^{(1, j)}) = max_{θ_{1}, θ_{3}} ℓ (θ_{1}, θ_{2}^{⋆}, θ_{3}),

(A5)

while for algo2 one considers

ℓ (θ^{(2, j)}) = max_{θ_{1}} ℓ (θ_{1}, θ_{2}^{⋆}, θ_{3}^{⋆}) .

(A6)

The conclusion that

ℓ (θ^{(1, j)}) \geq ℓ (θ^{(2, j)})

follows from the fact that the maximization problem (A6) is a constrained version of (A5) under

θ_{3} = θ_{3}^{⋆}

. ☐

It is simple to observe that the argument of the proof implies that the larger the dimension $n_{3}$ of $θ_{3}$, the better.
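Proposition A1 can be illustrated on a toy concave quadratic objective, for which each block maximization has a closed form. The sketch below is not the I(2) likelihood: the objective, the block partition and the names `toy_loglik` and `maximize_block` are assumptions chosen only to show the ordering of the two switching schemes after one pass.

```python
import numpy as np

def toy_loglik(theta, A, mu):
    """Concave quadratic -1/2 (theta - mu)' A (theta - mu), with A positive definite."""
    dev = theta - mu
    return -0.5 * dev @ A @ dev

def maximize_block(theta, idx, A, mu):
    """Maximize toy_loglik over theta[idx], holding the remaining elements fixed."""
    idx = np.asarray(idx)
    rest = np.setdiff1d(np.arange(theta.size), idx)
    out = theta.copy()
    cross = A[np.ix_(idx, rest)] @ (theta[rest] - mu[rest])
    out[idx] = mu[idx] - np.linalg.solve(A[np.ix_(idx, idx)], cross)
    return out

rng = np.random.default_rng(4)
n = 5
X = rng.standard_normal((20, n))
A, mu = X.T @ X + np.eye(n), rng.standard_normal(n)
b1, b2, b3 = [0, 1], [2, 3], [4]          # index sets of theta_1, theta_2, theta_3

theta_start = np.zeros(n)
step1 = maximize_block(theta_start, b2 + b3, A, mu)   # step 1, common to both algorithms
theta_algo1 = maximize_block(step1, b1 + b3, A, mu)   # algo1, step 2: theta_2 fixed
theta_algo2 = maximize_block(step1, b1, A, mu)        # algo2, step 2: (theta_2, theta_3) fixed

ll_1 = toy_loglik(theta_algo1, A, mu)
ll_2 = toy_loglik(theta_algo2, A, mu)
```

Here `ll_1` is never smaller than `ll_2`, because the second step of algo2 is the second step of algo1 with the additional constraint that $θ_{3}$ stays at its step 1 value.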

References

Boswijk, H. Peter. 2000. Mixed normality and ancillarity in I(2) systems. Econometric Theory 16: 878–904.
Boswijk, H. Peter, and Jurgen A. Doornik. 2004. Identifying, estimating and testing restricted cointegrated systems: An overview. Statistica Neerlandica 58: 440–65.
Doornik, Jurgen A. 2017a. Accelerated estimation of switching algorithms: The cointegrated VAR model and other applications. Working paper, University of Oxford, Oxford, UK.
Doornik, Jurgen A. 2017b. Maximum likelihood estimation of the I(2) model under linear restrictions. Econometrics 5: 19.
Engle, Robert F., and Clive W. J. Granger. 1987. Co-integration and error correction: Representation, estimation, and testing. Econometrica 55: 251–76.
Johansen, Søren. 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59: 1551–80.
Johansen, Søren. 1992. A representation of vector autoregressive processes integrated of order 2. Econometric Theory 8: 188–202.
Johansen, Søren. 1995. Identifying restrictions of linear equations with applications to simultaneous equations and cointegration. Journal of Econometrics 69: 111–32.
Johansen, Søren. 1996. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. 2nd printing. Oxford: Oxford University Press.
Johansen, Søren. 1997. A likelihood analysis of the I(2) model. Scandinavian Journal of Statistics 24: 433–62.
Johansen, Søren. 2006. Statistical analysis of hypotheses on the cointegrating relations in the I(2) model. Journal of Econometrics 132: 81–115.
Johansen, Søren. 2009. Representation of cointegrated autoregressive processes with application to fractional processes. Econometric Reviews 28: 121–45.
Jöreskog, Karl G., Ulf H. Olsson, and Fan Y. Wallentin. 2016. Multivariate Analysis with LISREL. Basel: Springer International Publishing.
Juselius, Katarina. 2017a. A theory consistent CVAR scenario for a standard monetary model using data-generated expectations. Working paper, University of Copenhagen, Copenhagen, Denmark.
Juselius, Katarina. 2017b. Using a theory-consistent CVAR scenario to test an exchange rate model based on imperfect knowledge. Working paper, University of Copenhagen, Copenhagen, Denmark.
Juselius, Katarina, and Katrin Assenmacher. 2015. Real exchange rate persistence: The case of the Swiss franc-US dollar rate. SNB Working paper 2015-03, Swiss National Bank, Zürich, Switzerland.
Juselius, Katarina, and Katrin Assenmacher. 2017. Real Exchange Rate Persistence and the Excess Return Puzzle: The Case of Switzerland Versus the US. Journal of Applied Econometrics. forthcoming.
Kongsted, Hans Christian. 2005. Testing the nominal-to-real transformation. Journal of Econometrics 124: 205–25.
Magnus, Jan R., and Heinz Neudecker. 2007. Matrix Differential Calculus with Applications in Statistics and Econometrics, 3rd ed. New York: Wiley.
Mosconi, Rocco, and Paolo Paruolo. 2017. Cointegration and error correction in I(2) vector autoregressive models: Identification, estimation and testing. Mimeo, Politecnico di Milano, Milano, Italy.
Nielsen, Heino Bohn, and Anders Rahbek. 2007. The likelihood ratio test for cointegration ranks in the I(2) model. Econometric Theory 23: 615–37.
Paruolo, Paolo. 1997. Asymptotic inference on the moving average impact matrix in cointegrated I(1) VAR systems. Econometric Theory 13: 79–118.
Paruolo, Paolo. 2000a. Asymptotic efficiency of the two stage estimator in I(2) systems. Econometric Theory 16: 524–50.
Paruolo, Paolo. 2000b. On likelihood-maximizing algorithms for I(2) VAR models. Mimeo, University of Insubria, Varese, Italy.
Paruolo, Paolo. 2002. Asymptotic inference on the moving average impact matrix in cointegrated I(2) VAR systems. Econometric Theory 18: 673–90.
Rahbek, Anders, Hans Christian Kongsted, and Clara Jørgensen. 1999. Trend-stationarity in the I(2) cointegration model. Journal of Econometrics 90: 265–89.
Rothenberg, Thomas J. 1971. Identification in parametric models. Econometrica 39: 577–91.
Sargan, J. Denis. 1988. Lectures on Advanced Econometric Theory. Edited by Meghnad Desai. Oxford: Basil Blackwell.
Srivastava, Muni S., and C. G. Khatri. 1979. An Introduction to Multivariate Statistics. New York: North Holland.

1.	In the I(2) cointegration literature, $τ_{⊥}$ is also referred to as $β_{2}$ , see the Johansen Representation Theorem in Appendix A.
2.	Up to normalizations, see below.
3.	When $c_{⊥}^{'} τ_{⊥}$ is square and nonsingular, then one can prove that also $c^{'} τ$ is square and nonsingular, see e.g., Johansen (1996, Exercise 3.7).
4.	This equation is obtained by using orthogonal projection of $τ_{⊥ c_{⊥}}$ on the columns spaces of c and $c_{⊥}$ , and applying the equality $c_{⊥}^{'} τ_{⊥ c_{⊥}} = I_{r_{2}}$ which follows by definition.
5.	In the general VAR(k) model (1), $ε_{t}$ in (16) is replaced by $μ_{0} + μ_{1} t + ε_{t}$ ; see Section 4.3 below.
6.	The difference $m - r = p - r - r_{2}$ is referred to as either s or $r_{1}$ in the I(2) cointegration literature, see Appendix A.
7.	If a restricted constant and linear trend are included in the model, as in (21), then $Z_{1 t}$ and $Z_{2 t}$ are defined as the residual vectors of regressions of $Δ X_{t - 1}^{⋆}$ and $X_{t - 1}^{⋆}$ , respectively, on $X_{t - 1}$ .
8.	Here $\overset{w}{\to}$ indicates weak convergence and $⌊ \cdot ⌋$ denotes the integer part (floor).
9.	In the rest of this section the notation $ϕ_{1}, ϕ_{2}$ and $\partial B_{i} / \partial ϕ_{j}$ is used in accordance with the notation in J06.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

