1. Introduction
The duality between the common trends representation and the vector equilibrium-correction model (VECM) form in cointegrated systems allows researchers to formulate hypotheses of economic interest on either of the two. The VECM is centered on the adjustment with respect to disequilibria in the system; in this way it facilitates the interpretation of cointegrating relations as (deviations from) equilibria.
The common trends representation instead highlights how variables in the system are pushed around by common stochastic trends, which are often interpreted as the main persistent economic factors influencing the long term. Both representations provide insights into the economic system under scrutiny. Examples of both perspectives are given in Juselius (2017a, 2017b).
The common trends and VECM representations are connected through representation results such as the Granger Representation Theorem, in the case of I(1) systems, see Engle and Granger (1987) and Johansen (1991), and the Johansen Representation Theorem, for the case of I(2) systems, see Johansen (1992). In particular, both representation theorems show that the loading matrix of the common stochastic trends of highest order is a basis of the orthogonal complement of the matrix of cointegrating relations. Because of this property, these two matrices are linked, and any one of them can be written as a function of the other one.
This paper focuses on I(2) vector autoregressive (VAR) systems, and it considers the situation where (possibly over-identifying) economic hypotheses are entertained for the factor loading matrix of the I(2) trends. It is shown how they can then be translated into hypotheses on the cointegrating relations, which appear in the VECM representation; the latter forms the basis for maximum likelihood (ML) estimation of I(2) VAR models. In this way, constrained ML estimators are obtained and the associated likelihood ratio (LR) tests of these hypotheses can be defined. These tests are discussed in the present paper; Wald tests on just-identified loading matrices of the I(1) and I(2) common trends have already been proposed by
Paruolo (1997, 2002).
The running example of the paper is taken from Juselius and Assenmacher (2015), which is the working paper version of Juselius and Assenmacher (2017). The following notation is used: for a full column-rank matrix $a$, $\operatorname{col}(a)$ denotes the space spanned by the columns of $a$ and $a_\perp$ indicates a basis of the orthogonal complement of $\operatorname{col}(a)$. For a matrix $b$ of the same dimensions as $a$, and for which $b'a$ is full rank, let $a_b := a(b'a)^{-1}$; a special case is when $b = a$, for which $\bar{a} := a_a = a(a'a)^{-1}$. Let also $P_a := a(a'a)^{-1}a'$ indicate the orthogonal projection matrix onto $\operatorname{col}(a)$, and let the matrix $M_a := I - P_a$ denote the orthogonal projection matrix on its orthogonal complement. Finally $e_j$ is used to indicate the $j$-th column of an identity matrix of appropriate dimension.
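The notation above maps directly onto a few lines of linear algebra. The following is a minimal sketch, assuming NumPy; the helper names `perp`, `normalized` and `proj` are hypothetical, and the oblique normalization is taken to be $a_b = a(b'a)^{-1}$, which is an assumption inferred from the surrounding definitions.

```python
import numpy as np

def perp(a):
    """Orthonormal basis of the orthogonal complement of col(a); a is p x r with full column rank."""
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, a.shape[1]:]  # the last p - r left singular vectors span the complement

def normalized(a, b):
    """Assumed form of a_b: a (b'a)^{-1}, which requires b'a to be nonsingular."""
    return a @ np.linalg.inv(b.T @ a)

def proj(a):
    """Orthogonal projection matrix onto col(a)."""
    return a @ np.linalg.solve(a.T @ a, a.T)

rng = np.random.default_rng(0)
a = rng.standard_normal((5, 2))
b = rng.standard_normal((5, 2))

a_perp = perp(a)              # basis of the orthogonal complement
a_b = normalized(a, b)        # satisfies b' a_b = I
a_bar = normalized(a, a)      # special case b = a
P = proj(a)
M = np.eye(5) - P             # projection on the orthogonal complement
```

The SVD is used here only because it is available in plain NumPy; any other null-space routine would serve equally well.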
The rest of this paper is organized as follows: Section 2 contains the motivation and the definition of the problem considered in the paper. The identification of the I(2) common trends loading matrix under linear restrictions is analysed in Section 3. The relationship between the identified parametrization of the I(2) common trends loading matrix and an identified version of the cointegration matrix is also discussed. Section 4 considers a parametrization of the VECM, and discusses its identification. ML estimation of this model is discussed in Section 5; the asymptotic distributions of the resulting ML estimator of the I(2) loading matrix and the LR statistic of the over-identifying restrictions are sketched in Section 6. Section 7 reports an illustration of the techniques developed in the paper on a system of US and Swiss prices, interest rates and exchange rate. Section 8 concludes, while two appendices report additional technical material.
2. Common Trends Representation for I(2) Systems
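This section introduces quantities of interest and presents the motivation of the paper. Consider a $p$-variate VAR($k$) process $X_t$:
$$X_t = \sum_{i=1}^{k} A_i X_{t-i} + \mu_0 + \mu_1 t + \varepsilon_t, \qquad (1)$$
where $A_i$ are $p \times p$ matrices, $\mu_0$ and $\mu_1$ are $p \times 1$ vectors, and $\varepsilon_t$ is a $p \times 1$ i.i.d. $N(0, \Omega)$ vector, with $\Omega$ positive definite. Under the conditions of the Johansen Representation Theorem, see Appendix A, called the I(2) conditions, $X_t$ admits a common trends I(2) representation of the form
$$X_t = C_2 \sum_{i=1}^{t} \sum_{s=1}^{i} \varepsilon_s + C_1 \sum_{i=1}^{t} \varepsilon_i + Y_t + v_0 + v_1 t, \qquad (2)$$
where $\sum_{i=1}^{t} \sum_{s=1}^{i} \varepsilon_s$ are the I(2) stochastic trends (cumulated random walks), $\sum_{i=1}^{t} \varepsilon_i$ is a random walk component, $Y_t$ is an I(0) linear process, and $v_0 + v_1 t$ collects the deterministic terms.
Cointegration occurs when the matrix $C_2$ has reduced rank $r_2 < p$, such that $C_2 = ab'$, where $a$ and $b$ are $p \times r_2$ and of full column rank. This observation lends itself to the following interpretation: $b' \sum_{i=1}^{t} \sum_{s=1}^{i} \varepsilon_s$ defines the $r_2$ common I(2) trends, while $a$ acts as the loading matrix of $X_t$ on the I(2) trends. The reduced rank of $C_2$ implies that there exist $p - r_2$ linearly independent cointegrating vectors, collected in a $p \times (p - r_2)$ matrix $\tau$, satisfying $\tau' C_2 = 0$; hence $\tau' X_t$ is I(1). Combining this with $C_2 = ab'$, it is clear that $\operatorname{col}(a) = \operatorname{col}(\tau_\perp)$, i.e., the columns of the loading matrix span the orthogonal complement of the cointegration space $\operatorname{col}(\tau)$. Interest in this paper is on hypotheses on $\tau_\perp$.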
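To make the roles of the I(2) trends and of the cointegrating vectors concrete, the following sketch simulates cumulated random walks and checks that vectors orthogonal to the loading matrix remove the I(2) component. It assumes NumPy and, purely for illustration, a factorization $C_2 = ab'$ with $a = (1, 1, 1)'$, $C_1 = I$ and $Y_t = \varepsilon_t$; these choices are assumptions and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
T, p = 200, 3

eps = rng.standard_normal((T, p))
S1 = np.cumsum(eps, axis=0)   # random walks: I(1)
S2 = np.cumsum(S1, axis=0)    # cumulated random walks: I(2)

a = np.ones((p, 1))                # illustrative loading vector on the single I(2) trend
b = rng.standard_normal((p, 1))    # illustrative weights defining the common trend
C2 = a @ b.T                       # reduced-rank matrix, rank r2 = 1

# Illustrative data-generating process: C1 = I and Y_t = eps_t (assumptions)
X = S2 @ C2.T + S1 + eps

# Two vectors orthogonal to a: they annihilate C2 and hence the I(2) trend
tau = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [-1.0, -1.0]])
Z = X @ tau                        # tau' X_t, free of the I(2) component
```

Differencing `S2` twice returns the original innovations, which is the sense in which the cumulated random walks are integrated of order two.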
Observe that $C_2 = ab'$ is invariant to the choice of basis of either $\operatorname{col}(a)$ or $\operatorname{col}(b)$. In fact, $(a, b)$ can be replaced by $(aQ, bQ'^{-1})$ with $Q$ square and nonsingular without affecting $C_2$. One way to resolve this identification problem is to impose restrictions on the entries of $\tau_\perp$; enough restrictions of this kind would make the choice of $\tau_\perp$ unique. Such an approach to identification is common in confirmatory factor analysis in the statistics literature, see Jöreskog et al. (2016).
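The lack of identification described above can be checked numerically. The sketch below assumes NumPy and assumes that the reduced-rank matrix factorizes as $C_2 = ab'$; the rotation matrix `Q` is drawn at random, so it is nonsingular with probability one.

```python
import numpy as np

rng = np.random.default_rng(1)
p, r2 = 4, 2

a = rng.standard_normal((p, r2))   # loading matrix (one possible basis)
b = rng.standard_normal((p, r2))   # weights defining the common trends
Q = rng.standard_normal((r2, r2))  # arbitrary square matrix, nonsingular almost surely

C2 = a @ b.T

# Rotate both factors: a -> a Q and b -> b Q'^{-1}
a_rot = a @ Q
b_rot = b @ np.linalg.inv(Q).T
C2_rot = a_rot @ b_rot.T           # equals a Q Q^{-1} b' = a b'
```

Since `C2` and `C2_rot` coincide, the data cannot distinguish `a` from `a_rot` unless restrictions are placed on the entries of the loading matrix.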
If more restrictions are imposed than needed for identification, they are over-identifying. Such over-identifying restrictions on $\tau_\perp$ usually correspond to (similarly over-identifying) restrictions on $\tau$, see Section 3 below. Although economic hypotheses may directly imply restrictions on the cointegrating vectors in $\tau$, in some cases it is more natural to formulate restrictions on the I(2) loading matrix $\tau_\perp$. This is illustrated by the two following examples.
2.1. Example 1
Kongsted (2005) considers a model for $X_t = (m_t : y_t : p_t)'$, where $m_t$, $y_t$ and $p_t$ denote the nominal money stock, nominal income and the price level, respectively (all variables in logs); here ‘:’ indicates horizontal concatenation. He assumes that the system is I(2), with $r_2 = 1$. Given the definition of the variables, Kongsted (2005) considers the natural question of whether real money $m_t - p_t$ and real income $y_t - p_t$ are at most I(1). This corresponds to an (over-identified) cointegrating matrix $\tau$ and loading vector $\tau_\perp$ of the form
$$\tau = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -1 & -1 \end{pmatrix}, \qquad \tau_\perp = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$
The form of $\tau$ corresponds to the fact that the I(1) linear combinations $\tau' X_t$ are (linear combinations of) $(m_t - p_t : y_t - p_t)'$, as required. On the other hand, the restriction on $\tau_\perp$ says that each of the three series has exactly the same I(2) trend, with the same scale factor. Both formulations are easily interpretable.
Note that the hypothesis on $\tau_\perp$ involves two over-identifying restrictions (the second and third components are equal to the first component), in addition to a normalization (the first component equals 1). Similarly, the restriction that the matrix consisting of the first two rows of $\tau$ equals $I_2$ is a normalization; the two over-identifying restrictions are that the entries in both columns sum to 0.
As this first example shows, knowing $\tau_\perp$ is the same as knowing $\tau$ and vice versa.
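The equivalence in this example can be verified in a few lines. The sketch below assumes NumPy and encodes the restrictions described in the text: the first two rows of the cointegrating matrix form an identity, both columns sum to zero, and the loading vector has all entries equal to one.

```python
import numpy as np

# Cointegrating matrix implied by real money and real income being at most I(1)
tau = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [-1.0, -1.0]])

# Loading vector: the three series share the same I(2) trend with equal scale
tau_perp = np.array([[1.0], [1.0], [1.0]])

orth = tau.T @ tau_perp            # should be a 2 x 1 vector of zeros

# Going the other way: recover the loading vector from tau alone,
# as the null space of tau', normalized on its first component
_, _, vt = np.linalg.svd(tau.T)
null_vec = vt[-1]
recovered = null_vec / null_vec[0]
```

The normalization on the first component mirrors the one described in the text; any other non-zero component could be used instead.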
2.2. Example 2
Juselius and Assenmacher (
2015) consider a 7-dimensional VAR with
with
, where
,
,
are the (log of the) price index, the long and the short interest rate of country
i at time
t respectively, and
is the log of the exchange rate between country 1 (Switzerland) and 2 (the US) at time
t. They expect the common trends representation to have a loading matrix
of the form:
where
indicates an entry not restricted to 0.
The second I(2) trend is loaded on the four interest rates (the long and the short rate of each country), as well as on US prices and the exchange rate; this can be interpreted as a financial (or ‘speculative’) trend affecting world prices. The first I(2) trend, instead, is loaded only on the variables with unrestricted entries in the first column of (3), and embodies a ‘relative price’ I(2) trend; it can be interpreted as the Swiss contribution to the trend in prices.
The cointegrating matrix $\tau$ in this example is of dimension $7 \times 5$. It is not obvious what type of restrictions on $\tau$ correspond to the structure in (3). However, it is $\tau$ rather than $\tau_\perp$ that enters the likelihood function (as will be analyzed in Section 4). The rest of the paper shows that the restrictions in (3) are over-identifying, how they can be translated into hypotheses on $\tau$, and how they can be tested via LR tests.
3. Hypothesis on the Common Trends Loadings
This section discusses linear hypotheses on $\tau_\perp$ and their relation to $\tau$. First, attention is focused on the case of linear hypotheses on the normalized version $\tau_{\perp c} := \tau_\perp (c'\tau_\perp)^{-1}$ of $\tau_\perp$. Here $c$ is a full-column-rank matrix of the same dimension as $\tau_\perp$ such that $c'\tau_\perp$ is square and nonsingular. This normalization was introduced by Johansen (1991) in the context of the I(1) model in order to isolate the (just-)identified parameters in the cointegration matrix.
Later, linear hypotheses formulated directly on $\tau_\perp$ are discussed. The main result of this section is the fact that the parameters of interest appear linearly both in $\tau_{\perp c}$ and in $\tau$ in the first case; this is not necessarily true in the second case.
The central relation employed in this section (for both cases), is the following identity:
where
. This identity readily follows from the oblique projections identity
see e.g.
Srivastava and Khatri (
1979, p. 19), by post-multiplication by
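The displayed identities are not reproduced in the text above, so the following sketch checks one standard form of the oblique projection identity and the relation it implies between the two normalized matrices. It assumes NumPy; the specific form $I = c\,\tau_{\perp c}' + \tau_{c_\perp} c_\perp'$, with $\tau_{\perp c} = \tau_\perp(c'\tau_\perp)^{-1}$ and $\tau_{c_\perp} = \tau(c_\perp'\tau)^{-1}$, is an assumption about what the identity looks like and not a quotation of equation (4).

```python
import numpy as np

def perp(a):
    """Orthonormal basis of the orthogonal complement of col(a)."""
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, a.shape[1]:]

rng = np.random.default_rng(2)
p, r2 = 5, 2

tau_perp = rng.standard_normal((p, r2))   # loading matrix
tau = perp(tau_perp)                      # cointegrating matrix, p x (p - r2)
c = rng.standard_normal((p, r2))          # normalization matrix, c' tau_perp nonsingular a.s.
c_perp = perp(c)

tau_perp_c = tau_perp @ np.linalg.inv(c.T @ tau_perp)   # normalized loadings
tau_cperp = tau @ np.linalg.inv(c_perp.T @ tau)         # normalized cointegrating matrix

# Assumed oblique projection identity: the two oblique projections sum to I
identity = c @ tau_perp_c.T + tau_cperp @ c_perp.T

# Post-multiplying by c_perp (c_perp' c_perp)^{-1} expresses tau_cperp through tau_perp_c
c_perp_bar = c_perp @ np.linalg.inv(c_perp.T @ c_perp)
rhs = (np.eye(p) - c @ tau_perp_c.T) @ c_perp_bar
```

The last line shows why a linear structure in the normalized loadings carries over to the normalized cointegrating matrix: `rhs` depends on `tau_perp_c` only through a linear term.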
3.1. Linear hypotheses on
Johansen (
1991) noted that the function
is invariant with respect to the choice of basis of the space spanned by
a. In fact, consider in the present context any alternative basis
of the space spanned by
; this has representation
for
Q square and full rank. Inserting
in place of
in the definition of
, one finds
Hence
, similarly to the cointegration matrix in the I(1) model in
Johansen (
1991), is (just-)identified.
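The invariance argument can be illustrated numerically. The sketch below assumes NumPy and takes the normalization to be $a \mapsto a(c'a)^{-1}$, with the illustrative choice $c = (I : 0)'$; both are assumptions made for the example.

```python
import numpy as np

def normalize(a, c):
    """Assumed normalization a (c'a)^{-1}; requires c'a nonsingular."""
    return a @ np.linalg.inv(c.T @ a)

rng = np.random.default_rng(3)
p, r2 = 6, 2

tau_perp = rng.standard_normal((p, r2))
c = np.vstack([np.eye(r2), np.zeros((p - r2, r2))])  # picks the first r2 rows
Q = rng.standard_normal((r2, r2))                    # change of basis, nonsingular a.s.

n1 = normalize(tau_perp, c)        # normalized version of the original basis
n2 = normalize(tau_perp @ Q, c)    # normalized version of the rotated basis
```

Both normalized matrices coincide and have an identity block in the rows selected by `c`, which is the sense in which the normalized matrix is just-identified.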
To facilitate stating hypotheses on the unconstrained elements of
, the following representation of
appears useful:
where
is a $(p - r_2) \times r_2$ matrix of free coefficients in $\tau_{\perp c}$. For example, one may have
with
,
,
.
Consider over-identifying linear restrictions on the columns of
in (
5). Typically, such restrictions will come in the form of zero (exclusion) restrictions or unit restrictions, where the latter would indicate equal loadings of a specific variable and the variable on which the column of
has been normalized. The general formulation of such restrictions is
where
is the
i-th column vector of
,
and
are conformable vectors and matrices, and
contains the remaining unknown parameters in
. If only zero restrictions are imposed, then
.
The formulation in (
7) includes several notable special cases. For instance, if all
and
, one obtains the hypothesis that
is contained in a given linear space,
. Another example is given by the case where one column
is known,
; this corresponds to the choice
with
and
void and
,
.
The restrictions in (
7) may be summarized as
where
,
and
. Here
indicates a matrix with the (not necessarily square) blocks
along the main diagonal. Formulation (
8) generalises (
7).
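A small numerical example may help to fix ideas on the stacked formulation. The sketch assumes NumPy and assumes that the column-wise restrictions take the form $\vartheta_i = k_i + K_i \varphi_i$, stacked as $\operatorname{vec}\vartheta = k + K\varphi$ with $K$ block diagonal; the matrices below, and the helper `blkdiag`, are hypothetical.

```python
import numpy as np

def blkdiag(blocks):
    """Block-diagonal matrix from a list of (not necessarily square) blocks."""
    rows = sum(blk.shape[0] for blk in blocks)
    cols = sum(blk.shape[1] for blk in blocks)
    out = np.zeros((rows, cols))
    i = j = 0
    for blk in blocks:
        out[i:i + blk.shape[0], j:j + blk.shape[1]] = blk
        i += blk.shape[0]
        j += blk.shape[1]
    return out

# Hypothetical 3 x 2 matrix theta.
# Column 1: zero restriction on the first entry, two free parameters.
k1 = np.zeros(3)
K1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Column 2: unit restriction on the last entry, two free parameters.
k2 = np.array([0.0, 0.0, 1.0])
K2 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

k = np.concatenate([k1, k2])
K = blkdiag([K1, K2])

phi = np.array([0.5, -0.3, 2.0, 1.5])           # free parameters
theta = (k + K @ phi).reshape((3, 2), order="F")  # undo the vec operator (column stacking)
```

With these choices the first column of `theta` has a zero in its first row and the second column has a one in its last row, while `K` has full column rank.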
The main result of this section is stated in the next theorem.
Theorem 1 (Hypotheses on $\tau_{\perp c}$). Assume that ϑ satisfies linear restrictions of the type (8); then these restrictions are translated into a linear hypothesis on $\tau$ via (9), where $K_{m,n}$ is the commutation matrix satisfying $K_{m,n} \operatorname{vec}(A) = \operatorname{vec}(A')$, with $A$ of dimensions $m \times n$, see Magnus and Neudecker (2007).
The previous theorem shows that, when one can express a linear hypothesis on the coefficients ϑ that are unrestricted in $\tau_{\perp c}$, then the same linear hypothesis is translated into a restriction on $\tau$. Note that the proof simply exploits (4).
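The role of the commutation matrix can be illustrated as follows. The sketch assumes NumPy, the column-stacking vec operator, and the illustrative normalization $c = (I : 0)'$, under which the normalized matrices are assumed to take the form $\tau_{\perp c} = c + c_\perp \vartheta$ and $\tau = c_\perp - c\,\vartheta'$; the helper names are hypothetical and the exact form of equation (9) is not taken from the text.

```python
import numpy as np

def vec(A):
    """Column-stacking vec operator."""
    return A.reshape(-1, order="F")

def commutation_matrix(m, n):
    """K such that K @ vec(A) == vec(A.T) for any A of shape (m, n)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # A[i, j] is element j*m + i of vec(A) and element i*n + j of vec(A')
            K[i * n + j, j * m + i] = 1.0
    return K

rng = np.random.default_rng(6)
p, r2 = 5, 2
r = p - r2

c = np.vstack([np.eye(r2), np.zeros((r, r2))])
c_perp = np.vstack([np.zeros((r2, r)), np.eye(r)])
theta = rng.standard_normal((r, r2))

tau_perp_c = c + c_perp @ theta     # assumed normalized loading matrix
tau = c_perp - c @ theta.T          # assumed normalized cointegrating matrix

# vec(tau) is an affine function of vec(theta)
K = commutation_matrix(r, r2)
vec_tau = vec(c_perp) - np.kron(np.eye(r), c) @ K @ vec(theta)
```

Because `vec_tau` is affine in `vec(theta)`, any linear restriction on `theta` of the stacked form translates into a linear restriction on `vec(tau)`.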
Identification of the restricted coefficients
under these hypotheses can be addressed in a straightforward way. In fact, the parameters in
are identified; hence
is identified provided that the matrix
K is of full column rank, which in turn will imply that the Jacobian matrix
in (
9) has full column rank.
Because, in practice, econometricians may explore the form of
via unrestricted estimates of
, see
Paruolo (
2002), before formulating restrictions on
, using hypotheses on the unrestricted coefficients in
appears a natural sequential step.
The next subsection discusses the alternative approach of specifying hypotheses directly on $\tau_\perp$.
3.2. Linear Hypotheses on $\tau_\perp$
In case placing restrictions on the unrestricted coefficients in $\tau_{\perp c}$ is not what the econometrician wants, this subsection considers linear hypotheses on $\tau_\perp$ directly. It is shown that sometimes it is possible to translate linear hypotheses on $\tau_\perp$ into linear hypotheses on $\tau_{\perp c}$ for some $c$. It is also shown that this is always possible for $r_2 = 2$, for which a constructive proof is provided.
Analogously to (
7), consider linear hypotheses on the columns of
, of the following type:
summarized as
In this case, non-zero vectors
represent normalizations of the columns of the loading matrix, and as before,
collects the unknown parameters in
.
Theorem 2 (Hypotheses on $\tau_\perp$). Assume that $\tau_\perp$ satisfies linear restrictions of the type (11); then these restrictions are translated in general into a non-linear hypothesis on $\tau$ via (12), and the Jacobian of the transformation from ϕ to $\tau$ is given in (13). This parametrization is smooth on an open set in the parameter space Φ of ϕ where $c'\tau_\perp$ is of full rank.
Proof. Equation (12) is a re-statement of (4). Differentiation of (12) delivers (13). ☐
One can note that the Jacobian matrix in (
13) can be used to check local identification using the results in
Rothenberg (
1971).
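A rank check of this kind can be carried out numerically. The sketch below assumes NumPy and uses a hypothetical $4 \times 2$ restricted loading matrix with one unit normalization and one zero restriction per column; the Jacobian is approximated by central finite differences instead of the analytical expression in (13), and the map from the loadings to the cointegrating matrix uses the assumed normalization with $c = (I : 0)'$.

```python
import numpy as np

p, r2 = 4, 2
c = np.vstack([np.eye(r2), np.zeros((p - r2, r2))])
c_perp = np.vstack([np.zeros((r2, p - r2)), np.eye(p - r2)])

def tau_perp(phi):
    """Hypothetical restricted loading matrix, linear in the free parameters phi."""
    return np.array([[1.0, 0.0],
                     [phi[0], 1.0],
                     [0.0, phi[2]],
                     [phi[1], phi[3]]])

def vec_tau(phi):
    """vec of the normalized cointegrating matrix implied by phi (non-linear in phi)."""
    tp = tau_perp(phi)
    tp_c = tp @ np.linalg.inv(c.T @ tp)
    tau = (np.eye(p) - c @ tp_c.T) @ c_perp
    return tau.reshape(-1, order="F")

def jacobian(f, x, h=1e-6):
    """Central finite-difference Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        cols.append((f(x + e) - f(x - e)) / (2 * h))
    return np.column_stack(cols)

# Full column rank at a generic point: locally identified
rank_regular = np.linalg.matrix_rank(jacobian(vec_tau, [0.5, 1.0, 2.0, -1.0]), tol=1e-6)
# Rank deficient when phi[2] = 0: the second column then also satisfies
# the zero restriction imposed on the first column
rank_singular = np.linalg.matrix_rank(jacobian(vec_tau, [0.5, 1.0, 0.0, -1.0]), tol=1e-6)
```

The explicit tolerance in `matrix_rank` is a design choice: it keeps finite-difference rounding noise from masking a genuine rank deficiency.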
The result of Theorem 2 is in contrast with the result of Theorem 1, because the latter delivers a linear hypothesis for
while Theorem 2 gives in general non-linear restrictions on
. One may hence ask the following question: when is it possible to reduce the more general linear hypothesis on
given in (
11) to the simpler linear hypothesis on
given in (
8)?
In the special case of
, the following theorem states that this can be always obtained. This applies for instance to the motivating example (
3), where one can choose some
so that
is equal to the identity, as shown below. Consider the formulation (
10) with
, and assume that no normalizations have been imposed yet, such that
. It is assumed that
, under the equation-by-equation restrictions, satisfies the usual rank conditions for identification, see
Johansen (
1995, Theorem 1):
where
.
Theorem 3 (Case
r2 = 2)
. Let obey the restrictions satisfying the rank conditions (
14)
; then one can choose normalization conditions on and so that there exists a matrix such that . This implies that a hypothesis on can be stated in terms of ϑ in (
5)
, and, by Theorem 1, a linear hypothesis on corresponds to a linear hypothesis on . Proof. Because has rank 1, one can select (at least) one linear combination of , say, so that is normalized to be one in the direction , i.e., . Similarly, has rank 1, and one can select (at least) one linear combination of , say, so that is normalized to be one in the direction , i.e., . Next define which by construction satisfies . ☐
The proof of the previous theorem provides a way to construct
when
and the usual rank condition for identification (
14) holds. The rest of the paper focuses attention on the case of linear restrictions on
in (
8), which can be translated linearly into restrictions on
as shown in Theorem 1.
3.3. Example 2 Continued
Consider (
3); this hypothesis is of type
with
and hence
and
. In this case one can define
and
where
is the
j-th column of
.
It is simple to verify that, under the additional normalization restrictions
and
,
in (
3) satisfies
. Therefore, define
as (
3) under these normalization restrictions. Using formula (
4) one can see that
so that
is linear in
, as predicted by Theorem 3.
4. The VECM Parametrization
This section describes the I(2) parametrization employed in the statistical analysis of the paper. Consider the following $\tau$-parametrization ($\tau$-par) of the VECM for I(2) VAR systems, see Mosconi and Paruolo (2017):
with
. Recall that
is the total number of cointegrating relations, i.e., the number of I(1) linear combinations
. The number of linear combinations of
that cointegrate with
to I(0), i.e., the number of I(0) linear combinations
, is indicated
by
. Here
is
,
is
and the other parameter matrices are conformable; the parameters are
,
,
,
,
,
,
, all freely varying, and
is assumed to be positive definite. When
is restricted as
with
a
matrix of freely varying parameters, the
-par reduces to the parametrization of
Johansen (
1997); this restriction on
is not imposed here.
4.1. Identification of $\tau$
The parameters in the
-par (
16) are not identified; in particular
can be replaced by
with
B square and nonsingular, provided
and
are simultaneously replaced by
and
. This is because
enters the likelihood only via (
16) in the products
and
. The transformation that generates observationally equivalent parameters, i.e., the post multiplication of
by a square and invertible matrix
, is the same type of transformation that induces observational equivalence in the classical system of simultaneous equations, see
Sargan (
1988), or in the set of cointegrating equations in I(1) systems, see
Johansen (
1995). This leads to the following result.
Theorem 4 (Identification of
τ in the
τ-par)
. Assume that is specified as the restricted in (
9)
, which is implied by the general linear hypothesis (
8)
on ; then the restricted is identified within the τ-par if and only if (rank condition), where . The corresponding order condition is , or equivalently . Alternatively, consider the general linear hypothesis (
11)
on ; then the constrained in (
12)
is identified in a neighborhood of the point provided the Jacobian in (13)
is of full rank. Proof. The rank condition follows from
Sargan (
1988), given that the class of transformations that induce observational equivalence is the same as the classical one for systems of simultaneous equations. The local identification condition follows from
Rothenberg (
1971). ☐
4.2. The Identification of Remaining Parameters
This subsection discusses conditions for the remaining parameters of the $\tau$-par to be identified, when $\tau$ is identified as in Theorem 4. These additional conditions are used in the discussion of the ML algorithms of the next section.
The VECM can be rewritten as
One can see that the equilibrium correction terms
may be replaced by
without changing the likelihood, where
,
and
here
A and
B are square nonsingular matrices, and
C is a generic matrix. Hence one observes that
,
,
,
,
,
,
is observationally equivalent to
,
,
,
,
,
,
.
A,
B and
C define the class of observationally equivalent transformations in the
-par for all parameters, including
. When
is identified one has
in the above formulae.
Consider additional restrictions on
of the type:
where
. The next theorem states rank conditions for these restrictions to identify the remaining parameters.
Theorem 5 (Identification of other parameters in the
τ-par)
. Assume that τ is identified as in Theorem 4; the restrictions (18)
identify φ and all other parameters in the τ-par if and only if the rank condition (19) holds. A necessary but not sufficient condition for this is the corresponding order condition.
Proof. Because τ is identified, one has in Q. For the identification of , observe that . One finds . Because both and satisfy (18), one has . This implies that , i.e., that both and , and that is identified, if and only if . This completes the proof. ☐
Observe that the identification properties of the
-par differ from the ones of the parametrization of
Johansen (
1997), where
is restricted, and hence the adding-and-subtracting associated with
C above is not permitted.
4.3. Deterministic Terms
The
-par in (
16) does not involve deterministic terms. Allowing a constant and a trend to enter the VAR Equation (
1) in a way that rules out quadratic trends, one obtains the following equilibrium correction I(2) model—for simplicity still called the
-par below:
Here
so that
; and
and
.
This parametrization satisfies the conditions of the Johansen Representation Theorem and it generates deterministic trends up to first order, as shown in
Appendix A. This is the I(2) model used in the application, with the addition of unrestricted dummy variables.
5. Likelihood Maximization
This section discusses likelihood maximization of the
-par of the I(2) model (
16) under linear, possibly over-identifying, restrictions on
, i.e., on
in (
5). The same treatment applies to (
21) replacing (
,
) with (
,
), and (
,
), with (
,
). The formulation (
16) is preferred here for simplicity in exposition.
The alternating maximization procedure proposed here is closely related, but not identical, to the algorithms proposed by
Doornik (
2017b); related algorithms were discussed in
Paruolo (
2000b). Restricted ML estimation in the I(1) model was discussed in
Boswijk and Doornik (
2004).
5.1. Normalizations
Consider restrictions (
8), which are translated into linear hypotheses on
in (
9) as follows
where by construction
g and
G satisfy
and
such that
.
Next, consider just-identifying restrictions on the remaining parameters. For
, the linear combinations of first differences entering the multicointegration relations, one can consider
where
is the
matrix of multicointegration parameters. This restriction differs from the restriction
which is considered e.g., in
Juselius (
2017a,
2017b), and it was proposed and analysed by
Boswijk (
2000).
Furthermore, the
matrix
can be normalized as follows
where
d is some known
matrix, and where
, of dimension
, contains freely varying parameters.
It can be shown that restrictions (
22) and (
23) identify the remaining parameters using Theorem 5. In fact, (
22) and (
23) can be written as
where
and
. Vectorizing, one obtains an equation
of the form (18) with
and
. The rank condition (
19) is satisfied, since
because
where the last equality follows from (
22) and (
23) and
.
5.2. The Concentrated Likelihood Function
The model (
16), after concentrating out the unrestricted parameter matrix
, can be represented by the equations
where
indicates the vector of free parameters in
,
,
and
are residual vectors of regressions of
,
and
, respectively, on
;
this derivation follows similarly to Chapter 6.1 in
Johansen (
1996). The associated log-likelihood function, concentrated with respect to
, is given by
In the rest of this section,
is used as shorthand for
.
Algorithms for the maximization of the concentrated log-likelihood function are proposed below. The first one, called al1, considers the alternating maximization of over for a fixed value of (called the -step), and over for a given value of (called the -step).
A variant of this algorithm, called
al2, can be defined fixing
in the
-step to the value of
obtained in the
-step. It can be shown that the increase in
obtained in one combination of
-step and
-step of
al1 is greater than or equal to the one obtained by
al2. The proof of this result is reported in Proposition A1 in
Appendix B. Because of this property, and because
al2 may display very slow convergence properties in practice,
al1 is implemented in the illustration below.
The rest of this section presents algorithms al1 and al2, defining first the -step, then the -step and finally discussing the starting values, a line search and normalizations.
5.2.1. Step
Taking differentials, one has
. Keeping
fixed, one finds
Writing
in terms of
and
, i.e.,
, the first-order conditions
and
are solved by
where
,
, and where
, and
. Note that (
25) is the GLS estimator in a regression of
on
. This defines the
-step for
al1.
The -step for al2 is defined similarly, but keeping fixed. In this case it is simple to see that
5.2.2. Step
When
is fixed (and hence
is fixed), one can construct
and
The concentrated model (
24) can then be written as a reduced rank regression:
for which the Gaussian ML estimator for
,
,
has a closed-form solution, see
Johansen (
1996). Specifically, let
,
and
,
. If
,
, are the eigenvectors corresponding to the largest
r eigenvalues of the problem
and
is the matrix of the corresponding eigenvectors, then the optimal solutions for
,
,
,
are given by
where
. Optimization with respect to
is performed using
replacing
with
formed from the previous expressions, namely taking
equal to
in the above display and
from the
-step. Using the
matrices, one can also compute
directly as
. This completes the definition of the
-step.
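The closed-form solution of a reduced rank regression via an eigenvalue problem can be sketched generically as follows. This is not the paper's specific step: the data matrices `R0` and `R1`, the function name and the simulated example are hypothetical, and the sketch assumes NumPy and the standard Gaussian reduced rank regression of `R0` on `R1` with coefficient matrix of rank `r`.

```python
import numpy as np

def reduced_rank_regression(R0, R1, r):
    """Gaussian ML for R0 = R1 beta alpha' + error, with alpha beta' of rank r.

    Solves |lambda S11 - S10 S00^{-1} S01| = 0 and normalizes beta' S11 beta = I.
    """
    T = R0.shape[0]
    S00 = R0.T @ R0 / T
    S01 = R0.T @ R1 / T
    S11 = R1.T @ R1 / T
    # Reduce the generalized problem to a symmetric one using S11 = L L'
    L = np.linalg.cholesky(S11)
    Linv = np.linalg.inv(L)
    A = Linv @ S01.T @ np.linalg.solve(S00, S01) @ Linv.T
    A = (A + A.T) / 2                      # guard against rounding asymmetry
    lam, W = np.linalg.eigh(A)
    order = np.argsort(lam)[::-1]          # largest eigenvalues first
    lam, W = lam[order], W[:, order]
    beta = Linv.T @ W[:, :r]               # eigenvectors of the original problem
    alpha = S01 @ beta                     # since beta' S11 beta = I
    omega = S00 - alpha @ alpha.T          # residual covariance estimate
    return alpha, beta, omega, lam, S00, S11

# Simulated example with a rank-one coefficient matrix
rng = np.random.default_rng(5)
T, p, q, r = 500, 3, 4, 1
a_true = rng.standard_normal((p, 1))
b_true = rng.standard_normal((q, 1))
R1 = rng.standard_normal((T, q))
R0 = R1 @ b_true @ a_true.T + rng.standard_normal((T, p))

alpha, beta, omega, lam, S00, S11 = reduced_rank_regression(R0, R1, r)
```

The determinant of the residual covariance equals that of `S00` times the product of one minus the `r` largest eigenvalues, which is what makes the maximized likelihood available in closed form.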
5.2.3. Starting Values and Line Search
If the system is just-identified, consistent starting values for all parameters can be obtained by imposing the identifying restrictions on the two-stage estimator for the I(2) model (2SI2), see
Johansen (
1995) and
Paruolo (
2000a). In case of over-identification, this method may be used to produce starting values for
, which may then be used as input for the first
-step to obtain starting values for
and
.
Let
be the vector containing all free parameters in
, and let
. Denote by
the value of
in iteration
of algorithms. Denote as
the value of
obtained by the application of a
-step and
-step of algorithms
al1 and
al2 at iteration
j starting from
. In an I(1) context,
Doornik (
2017a) found that better convergence properties can be obtained if a line search is added. For this purpose, define the final value of the
j-th iteration as
where
is chosen in
using a line search; note that values of
greater than 1 are admissible. A simple (albeit admittedly sub-optimal) implementation of the line search is employed in
Doornik (
2017a); it consists of evaluating the log-likelihood function
with
setting
equal to
for
, and in choosing the value of
with the highest log-likelihood
ℓ. This simple choice of line search is used in the empirical illustration.
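A grid-based line search of the kind described above might look as follows. The sketch assumes NumPy; the function name, the toy log-likelihood and in particular the grid of step lengths are hypothetical, since the grid values used in the paper are not reproduced in the text. The step length 1 is included in the grid so that the update is never worse than the plain switching step.

```python
import numpy as np

def line_search_update(loglik, theta_prev, theta_tilde, grid=(1.0, 1.2, 2.0, 4.0, 8.0)):
    """Pick the step length on a grid that maximizes loglik along the update direction.

    The candidate is theta_prev + lam * (theta_tilde - theta_prev); lam > 1 extrapolates.
    """
    best_val, best_lam, best_theta = -np.inf, None, None
    for lam in grid:
        cand = theta_prev + lam * (theta_tilde - theta_prev)
        val = loglik(cand)
        if val > best_val:
            best_val, best_lam, best_theta = val, lam, cand
    return best_theta, best_lam, best_val

# Toy concave objective with maximum at theta_star (illustration only)
theta_star = np.array([1.0, -2.0])

def loglik(theta):
    return -0.5 * float(np.sum((theta - theta_star) ** 2))

theta_prev = np.zeros(2)
# A slow switching step that covers only a quarter of the distance to the maximum
theta_tilde = theta_prev + 0.25 * (theta_star - theta_prev)

theta_new, lam_best, val_best = line_search_update(loglik, theta_prev, theta_tilde)
```

In this toy case the step length 4 lands exactly on the maximum, which illustrates why extrapolation can speed up a slowly converging switching algorithm.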
5.3. Standard Errors
The asymptotic variance matrix of the ML estimators may be obtained from the inverse observed (concentrated) information matrix as usual. Writing (
24) as
, and letting
, the observed concentrated information matrix for the reduced-form parameter vector
is obtained from
This leads to the following information matrix in terms of the parameters
:
where
and
. From
and
, one obtains
Define
, so that
, with
With these ingredients, one finds
where
,
and
are the expressions given above, evaluated at the ML estimators. Standard errors of individual parameter estimates are obtained as the square roots of the diagonal elements of
. Asymptotic normality of resulting
t-statistics (under the null hypothesis), and
asymptotic null distributions of likelihood ratio test statistics for the over-identifying restrictions, depend on conditions for asymptotic mixed normality being satisfied; this is discussed next.
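The mechanics of turning an information matrix into standard errors can be sketched generically. The example assumes NumPy; the information matrix `J_theta` and the Jacobian `D` are hypothetical numbers, and the chain rule $J_\varphi = D' J_\theta D$ is the standard one for a parameter vector that depends on fewer free parameters. It is not a reproduction of the expressions in this section.

```python
import numpy as np

# Hypothetical observed information for a reduced-form parameter vector theta
J_theta = np.array([[4.0, 1.0],
                    [1.0, 3.0]])

# Standard errors of the unrestricted estimates: square roots of diag(J^{-1})
se_theta = np.sqrt(np.diag(np.linalg.inv(J_theta)))

# Restriction theta = h + D phi with a single free parameter phi (hypothetical)
D = np.array([[1.0],
              [1.0]])

# Chain rule: information in terms of the free parameter
J_phi = D.T @ J_theta @ D
se_phi = np.sqrt(np.diag(np.linalg.inv(J_phi)))
```

Imposing the restriction raises the information about the remaining free parameter, so its standard error is smaller than either unrestricted one in this example.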
6. Asymptotics
The asymptotic distribution of the ML estimator in the I(2) model has been discussed in
Johansen (
1997,
2006). As shown there and discussed in
Boswijk (
2000), the limit distribution of the ML estimator is not jointly mixed normal as in the I(1) case. As a consequence, the limit distribution of LR test statistics of generic hypotheses need not be
under the null hypothesis.
In some special cases, the asymptotic distribution of the just-identified ML estimator of the cointegration parameters can be shown to be asymptotically mixed normal. Consider the case
(i.e.,
), and assume as before that no deterministic terms are included in the model. In this case, the limit distribution of the cointegration parameters in Theorem 4 in
Johansen (
2006), J06 hereafter, can be described in terms of the estimated parameters
and
, where
is identified as
with
. Note that the components
C and
in the above theorem do not appear here, because
. One has
with
,
and where
, a vector Brownian motion with covariance matrix
.
As noted in J06, has a mixed normal distribution with mean 0, because is a function of , which is independent of . Moreover in the case , the component of the ML limit distribution does not appear, so that the whole limit distribution of the cointegration parameters is jointly mixed normal, unlike in the case .
One can see that hypothesis (
8) defines a smooth restriction of the
parameters
. More precisely
depends smoothly only on
,
, where
contains the
parameters in (
8). Note also that
depends on the parameters in
, which are unrestricted by (
8); hence
depends only on
,
, where
contains the parameters in
in (
22).
The conditions of Theorem 5 in J06 are next shown to be verified, and hence the LR test of the hypothesis (
8) is asymptotically
with degrees of freedom equal to the number of constraints, in case
. In fact,
,
are smoothly parametrized by the continuously identified parameters
and
. Because
does not depend on
, one easily deduces
in (37) of J06. Similarly, one has
with
and
of full rank; hence (38) of J06 is satisfied. This shows that the LR statistic is asymptotically
under the null, for
.
In case
, the asymptotic distribution of
is defined in terms of
in J06 p. 92, which is not jointly mixed normal. In such cases,
Boswijk (
2000) showed that inference is mixed normal if the restrictions on
can be asymptotically linearized in
, and separated into two sets of restrictions, the first group involving
only, and the second group involving
only. Because the conditions of Theorem 5 in J06 cannot be easily verified for general linear hypotheses of the form (
8) in this case, they will need to be checked case by case. The authors intend to develop more readily verifiable conditions for
inference on
in their future research.
7. Illustration
Following
Juselius and Assenmacher (
2015), consider a 7-dimensional VAR with
where
,
,
are the (log of the) price index, the long and the short interest rate of country
i at time
t respectively, and
is the log of the exchange rate between country 1 (Switzerland) and 2 (the US) at time
t. The results are based on quarterly data over the period 1975:1–2013:3. The model has two lags, a restricted linear trend as in (
21), which appears in the equilibrium correction only appended to the vector of lagged levels, and a number of dummy variables; see
Juselius and Assenmacher (
2017), which is an updated version of
Juselius and Assenmacher (
2015), for further details on the empirical model. The data set used here is taken from
Juselius and Assenmacher (
2017).
Specification (
3) is based on the prediction that
. Based on I(2) cointegration tests,
Juselius and Assenmacher (
2017) choose a model with
, which indeed implies
, but also
; arguably, however, the test results in Table 1 of their paper also support the hypothesis
, which has the same number
of common
trends. The latter model would be selected applying the sequential procedure in
Nielsen and Rahbek (
2007) using a
or
significance level in each test in the sequence.
Consider the case
. The over-identifying restrictions on
implied by (
3) are incorporated in the parametrization (
3), with normalizations
, which in turn leads to the over-identified structure for
in (
15), to be estimated by ML. The restricted ML estimate of
is (standard errors in parentheses):
The LR statistic for the 3 over-identifying restrictions equals
. Using the
asymptotic limit distribution, one finds an asymptotic
p-value of
, and hence a rejection of the null hypothesis. This indicates that the hypothesized structure on
is rejected.
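The asymptotic p-value of an LR statistic under a chi-square limit can be computed with the standard library alone. The sketch below uses a series expansion of the regularized lower incomplete gamma function; the function name `chi2_sf` is hypothetical, and the statistic `lr_stat` is a made-up value for illustration, since the value reported in the paper is not reproduced in the text.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability P(chi2_df > x), via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # gamma(a, z) = z^a e^{-z} * sum_{n>=0} z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

lr_stat = 12.0          # hypothetical LR statistic, not the paper's value
n_restrictions = 3      # number of over-identifying restrictions
p_value = chi2_sf(lr_stat, n_restrictions)
reject_at_5pct = p_value < 0.05
```

The series converges for any positive argument, although for very large statistics a continued-fraction expansion would be numerically preferable.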
For comparison, consider also the case
, for which the LR test for cointegration ranks has a
p-value of
. The resulting restricted estimate of
is:
The estimates and standard errors are similar to those obtained under the hypothesis
. The LR statistic for the over-identifying restrictions now equals
. If one conjectured that the limit distribution of the LR test is also
in this case, one would obtain an asymptotic
p-value of
, so the evidence against the hypothesized structure of
appears slightly weaker in this model.
The results for both model
and for model
are in line with the preferred specification of
Juselius and Assenmacher (
2017), who select an over-identified structure for
, which is not nested in (
15), and therefore implies a different impact of the common I(2) trends.
8. Conclusions
Hypotheses on the loading matrix of I(2) common trends are of economic interest. They are shown to be related to the cointegration relations. This link is explicitly discussed in this paper, also for hypotheses that are over-identifying. Likelihood maximization algorithms are proposed and discussed, along with LR tests of the hypotheses.
The application of these LR tests to a system of prices, exchange rates and interest rates for Switzerland and the US shows support for the existence of two I(2) common trends. These may represent a ‘speculative’ trend and a ‘relative prices’ trend, but there is little empirical support for the corresponding exclusion restrictions in the loading matrix.